AI technology company CrowdWorks said on Monday it has joined the Motif Technologies consortium as a key data-supply partner and will take part in the government’s “independent AI foundation model” project.
CrowdWorks will be responsible for building training data for a 300-billion-parameter, reasoning-focused large language model being developed by the Motif consortium. It said it will focus on creating training datasets specialised in step-by-step reasoning, known as chain of thought, to strengthen the model's logical reasoning beyond simple data processing.
It will also deploy the Alpy Knowledge Compiler, its in-house solution for preprocessing unstructured document data. The company said the tool will precisely parse unstructured documents containing complex elements such as tables and charts and convert them into a form AI can understand.
A CrowdWorks official said, "The intelligence of a reasoning-focused LLM depends on how logical and sophisticated the data it learns from is." The official added, "We will bring together data refinement technologies validated through collaboration with domestic big tech companies to support the development of an independent AI model."