Ryu Je-myung, second vice minister at the Ministry of Science and ICT, briefs reporters on Jan. 15 about the first evaluation results for the homegrown foundation model project.

In the first evaluation of the five elite teams participating in the government-led homegrown foundation model development project, LG AI Research, SK Telecom and Upstage advanced to the second stage.

NC AI and Naver Cloud were eliminated: NC AI for failing to clear a benchmark threshold, and Naver Cloud for failing to meet the government’s originality standard.

In Naver Cloud’s case, the Ministry of Science and ICT judged that HyperCLOVA X Seed 32B Sync, presented as one of its homegrown foundation models, did not meet the originality standard because it fine-tuned and used some of the weights from Alibaba’s Qwen 2.5 Vision Encoder.

The ministry had planned to select 4 companies in the first evaluation, but after 2 companies were unexpectedly eliminated, it plans to select 1 more as soon as possible. The ministry explained that the decision reflects the project’s aim of providing AI development experience to as many domestic companies as possible.

Companies eliminated in the first evaluation, as well as companies that participated in the call for proposals earlier or did not participate at all, can apply for the additional selection. Ryu Je-myung, the ministry’s second vice minister, said, "We will select 1 additional team within the shortest possible time so that it can compete on equal terms with the 3 companies that advanced to the second stage."

The following is a question-and-answer session with Vice Minister Ryu after he briefed reporters on the first evaluation of the homegrown foundation model project.

-I would like to know the selection criteria and timing for recruiting an additional elite team.

Since an unexpected vacancy has opened up as a result of the first evaluation, we will complete the administrative procedures and select an additional team as quickly as possible. Companies eliminated in the first evaluation, the 10 consortiums that participated in the preliminary review, and other companies can all apply for the additional selection. We want to provide opportunities to every company with the capacity to form a consortium. We will issue a notice shortly.

-Ahead of the first evaluation there was controversy over originality. In Naver Cloud’s case, was the problem the use of an open-source encoder?

We included basic requirements for a homegrown AI foundation model in the call-for-proposals guide. From that perspective, even if an open-source model was used, the weights should have been cleanly emptied and then filled with independently secured data. Naver Cloud had no licensing issues, but the evaluation panel judged that using the weights as they were was a problem. It is true that using open source is a global trend. But what the homegrown AI foundation model project essentially aims for is gaining experience by designing a model from the start. I would like you to view the decision on Naver Cloud from that perspective.

-What is the schedule for the second-stage evaluation and the plan to select 1 additional team?

We have notified the companies that did not pass the first stage so that they can raise objections. After receiving objections over a 10-day period, we will complete the first evaluation. Because the 3 companies that advanced to the second stage should not be kept waiting on account of the additional team, we plan to complete the additional selection process promptly. The additional participant will work under the same project period and government-supported GPUs as the 3 companies that advanced to the second stage.

-The goal of this project is to select 2 companies in the end. Now that 2 companies were eliminated in the first evaluation, contrary to the plan, is there really a need to add 1 more? There could be controversy over fairness.

The final goal of running the project as a compressed competition among a small number of participants is not simply to select 2 companies. Rather, it was designed to create the most competitive environment possible and a structure in which as many domestic companies as possible can deliver results in a short period. The priority is to let as many companies as possible use GPUs and participate in technology development. Even companies that do not advance to the second stage can gain a lot through participation. The additional selection is not something rushed through to favor specific companies; it is the approach that best reflects the project’s purpose.

-There was controversy over the standards. It seems necessary to provide clearer guidelines on how originality will be judged ahead of the additional selection or the second-stage evaluation.

It is fair to say there is no company, including the global frontier companies, that does not use open source. Everyone uses the transformer, the open-source technology that is the starting point of generative AI, and it is taken for granted that global big tech companies use open source. We absolutely do not view the use of open source as negative in itself. But since the homegrown foundation model project is meant to give domestic companies as much experience as possible, we viewed bringing in another company’s work and using it as is as a problem.

-Besides Naver Cloud, Upstage and SK Telecom also faced controversy over originality ahead of the first evaluation, did they not?

As for training data and weights, the evaluators raised no issues with the 4 companies other than Naver Cloud. There was criticism of Upstage over a reference mention issue, and there was also some criticism of SK Telecom, but we did not see either as deviating far enough to fall outside the homegrown foundation model standards.

-Did Naver Cloud make an inquiry in advance regarding the use of the encoder?

As far as we have confirmed, there was none. After the controversy erupted, Naver Cloud sent an explanatory statement, but the evaluation was already under way, so we did not reflect it in the evaluation; we judged that doing so would pose procedural problems.

-The first evaluation was conducted through benchmarks as well as expert and user evaluations. How will the second evaluation be conducted?

The evaluation criteria were made in consultation with the participating companies. They consist largely of 3 parts: benchmarks, expert evaluation and real-user evaluation. Expert evaluation is ultimately an objective performance evaluation; it focuses on technical originality and technological capability, including how well a company can prepare for what comes next. In the user evaluation, people in the field who actually use AI assess how useful it is. AI is not necessarily good just because it has many parameters; even a small model can be used efficiently at industrial sites, and such usability is also very important. There will be no major changes to the criteria in the second evaluation. However, we plan to further specify matters such as the from-scratch requirement by gathering more opinions from academic and industry experts. There was some criticism related to the evaluation, but I want to emphasize that the criteria were made under a shared understanding with the participating companies. The benchmark evaluation also adopted methods that all participating companies accepted and agreed to.
