Ministry of Science and ICT signboard [Photo: Ministry of Science and ICT]

The government will launch a full survey of artificial intelligence (AI) training data scattered across the public sector.

The Ministry of Science and ICT and the National Information Society Agency (NIA) said they will conduct a cross-government “AI training data status survey” starting April 10. It is the first full inventory intended to systematically identify AI training data held by ministries and public agencies and to lay the groundwork for securing usable, high-quality data.

With the recent spread of generative AI, demand for data is rising rapidly across industry and the public sector. But public data are managed in a fragmented way by each institution, making it difficult to grasp the overall scale and usability at a glance. Critics have said this also limits AI companies’ ability to link and use the data as training data.

The ministry will conduct the survey across all ministries based on the AI Basic Act. It will systematically assess data assets, identify 100 datasets with high potential for use, and link and provide them through an “integrated AI training data provision system”. A distinguishing feature of the survey is that it covers not only data already held but also data that could become usable after future processing. Survey items focus on factors directly tied to usability for AI training, including data type and structure, the purpose for which the data were built and the scope of possible provision.

The final 100 datasets will be provided after additional processing, including quality improvements and de-identification. Data that are difficult to disclose online will be provided through “Data Safe Zones”, secure spaces with physical and technical safeguards that allow safe analysis of non-open data. Currently, 14 zones are operated at 11 institutions.

The ministry is also working in parallel to upgrade the existing “AI Hub” into the integrated AI training data provision system. It plans to create a virtuous cycle running from data discovery through acquisition to use, and to establish a system that promotes trading in AI training data.

Kim Kyung-man (김경만), director-general for AI Policy at the ministry, said, “The core of AI performance and quality lies in usable data.” He added, “We will systematically identify public data assets and continue to develop the integrated basis for providing AI training data.”

Copyright © DigitalToday. All rights reserved. Unauthorized reproduction and redistribution are prohibited.