Finance AI's bottleneck is data, not model performance, and ontology is the fix

Han Tae-kyung, head of the AI stock market data vendor business division at Weaker Korea. [Photo: courtesy of the subject]

[DigitalToday reporter Seulgi Son] "The bottleneck in artificial intelligence (AI) is in data design, not model performance."

Han Tae-kyung (한태경), head of the AI stock market data vendor business division at Weaker Korea, said finance AI is failing to advance because it lacks an ontology that links practitioners' expertise in a form large language models (LLMs) can use.

Han, a quant trader and developer with 18 years of experience, majored in computer engineering and statistics at Korea University and earned a master's degree in financial engineering from KAIST. He has worked at Samsung Asset Management, Dumulmuri and MotAI, among others. He is now developing an AI stock market analysis service based on a finance ontology at Weaker Korea.

He said the advent of generative AI has greatly improved productivity in financial data analysis. AI now handles tasks that once required manual work, such as analyzing corporate earnings announcements, extracting news topics and modeling data, and this has made the work dozens of times more efficient, he said.

However, Han said data matters more to finance AI than gains in model performance.

Han said, "In the field, a model with adequate performance that is connected to the latest data is far more useful than a high-performance model without the latest data." He added, "More important than AI itself is designing what to feed AI."

He said LLMs conduct financial analysis by searching, which makes their limits clear. For example, when asked "Which Korean stock has the highest return on equity (ROE)," a typical LLM lists Korean stocks that are known for having high ROE. But an LLM connected to an appropriate financial data engine can name the Korean stocks that actually have high ROE, he said.
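The difference can be shown with a minimal sketch, assuming a small in-memory table stands in for the financial data engine. The `Financials` type, the `rank_by_roe` function, and the tickers and figures are all hypothetical placeholders, not real company data. Instead of recalling stocks reputed to have high ROE, the function computes ROE (net income divided by shareholders' equity) from the connected figures and ranks by the result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Financials:
    ticker: str        # hypothetical ticker, not a real listing
    net_income: float  # same currency unit for every row
    equity: float      # shareholders' equity


def rank_by_roe(rows, top_n=3):
    """Return (ticker, roe) pairs sorted by ROE, highest first.

    Rows with non-positive equity are skipped because ROE is not
    meaningful for them.
    """
    scored = [(r.ticker, r.net_income / r.equity) for r in rows if r.equity > 0]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Placeholder rows standing in for a live data feed (assumption).
SAMPLE = [
    Financials("AAA", net_income=120.0, equity=1000.0),
    Financials("BBB", net_income=90.0, equity=300.0),
    Financials("CCC", net_income=50.0, equity=-20.0),
    Financials("DDD", net_income=40.0, equity=400.0),
]

if __name__ == "__main__":
    for ticker, roe in rank_by_roe(SAMPLE):
        print(f"{ticker}: ROE {roe:.1%}")
```

In a setup like this, the LLM's job would shrink to turning the question into a call such as `rank_by_roe(...)` and explaining the result, while the numbers themselves come from the data source.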

He said, "In finance AI, the keys are data that is not published on the web and professional analysis tools, and in the end, practitioners must design which data to connect to which questions."

Han explained that the core of a finance ontology is defining the relationships and context among financial data, such as corporate earnings and stock prices or macroeconomic indicators and statistical models, so that AI can use them. But he pointed out that most of the relevant knowledge is tacit, and that the industry has little awareness of the need to build ontologies.
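One way to picture such an ontology is as a set of typed links between financial concepts. The sketch below is illustrative only: the `Ontology` class, the concept names and the relation labels (`influenced_by`, `reported_in`, `modeled_by`) are assumptions made for this example and do not describe Weaker Korea's actual design.

```python
from collections import defaultdict


class Ontology:
    """A tiny store of (subject, relation, object) links."""

    def __init__(self):
        self._out = defaultdict(list)

    def relate(self, subject, relation, obj):
        self._out[subject].append((relation, obj))

    def related(self, subject, relation=None):
        """Objects linked from `subject`, optionally filtered by relation."""
        return [
            obj
            for rel, obj in self._out.get(subject, [])
            if relation is None or rel == relation
        ]

    def context_for(self, concept, depth=2):
        """Collect every concept reachable from `concept` within `depth` hops."""
        seen, frontier = {concept}, [concept]
        for _ in range(depth):
            next_frontier = []
            for node in frontier:
                for _, obj in self._out.get(node, []):
                    if obj not in seen:
                        seen.add(obj)
                        next_frontier.append(obj)
            frontier = next_frontier
        seen.discard(concept)
        return sorted(seen)


def build_example():
    """Build a toy graph; the relation labels are hypothetical."""
    onto = Ontology()
    onto.relate("stock_price", "influenced_by", "corporate_earnings")
    onto.relate("stock_price", "influenced_by", "macro_indicator")
    onto.relate("corporate_earnings", "reported_in", "earnings_announcement")
    onto.relate("macro_indicator", "modeled_by", "statistical_model")
    return onto


if __name__ == "__main__":
    print(build_example().context_for("stock_price"))
```

A question about a stock price could then be expanded into the related data an LLM should be given, here earnings, announcements, macro indicators and the models built on them. Deciding which links belong in the graph is the part that depends on expert knowledge.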

He said, "Since quants and investment professionals are already recognized as having high market value, they have little incentive to systematize their know-how as AI training data." He added, "As a result, finance AI does not adequately reflect the tacit knowledge of people working in the field."

For instance, a high debt-to-equity ratio is a textbook risk signal for investors, but real-world judgment can differ. Han said, "Apple's debt-to-equity ratio is about 500 percent, but nobody thinks it will go bankrupt." He added, "You need to look at a company's cash flow, fundraising capacity and industry structure together, but arranging and defining the data that creates that context is left to each individual expert, and much of it remains personal know-how."
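The gap between the textbook rule and contextual judgment can be sketched in a few lines. Everything here is a hypothetical illustration: the `CompanySnapshot` fields, the two thresholds and the sample figures are invented for the example and describe no real company. Industry structure, which Han also mentions, is left out to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompanySnapshot:
    debt_to_equity: float       # 5.0 means 500 percent
    operating_cash_flow: float  # per year
    interest_expense: float     # per year, same unit as cash flow
    can_raise_capital: bool     # judgment supplied by an expert


# Illustrative thresholds only; in practice an expert would choose them.
HIGH_LEVERAGE = 2.0
COMFORTABLE_COVERAGE = 5.0


def textbook_flag(c):
    """Single-ratio rule: high leverage alone is treated as risky."""
    return c.debt_to_equity > HIGH_LEVERAGE


def contextual_flag(c):
    """Flag high leverage only when cash flow or funding access is weak."""
    if c.debt_to_equity <= HIGH_LEVERAGE:
        return False
    if c.interest_expense > 0:
        coverage = c.operating_cash_flow / c.interest_expense
    else:
        # No interest to cover: fine if cash flow is positive, weak otherwise.
        coverage = float("inf") if c.operating_cash_flow > 0 else 0.0
    return not (coverage >= COMFORTABLE_COVERAGE and c.can_raise_capital)


if __name__ == "__main__":
    cash_rich = CompanySnapshot(5.0, 1000.0, 30.0, True)
    strained = CompanySnapshot(5.0, 40.0, 30.0, False)
    for name, c in (("cash_rich", cash_rich), ("strained", strained)):
        print(name, textbook_flag(c), contextual_flag(c))
```

Both sample companies trip the textbook rule, but only the one with thin cash flow and no access to funding is flagged once context is considered. Choosing which context fields to include and where to set the thresholds is the kind of know-how Han says remains with individual experts.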

He said finance AI will remain limited unless such tacit field knowledge is connected to LLMs. Han said, "Only people who know the work can orchestrate which tools and data an LLM calls." He added, "As AI advances, the role and importance of domain experts with this know-how will grow further."

Seulgi Son sageson@d-today.co.kr
