Rising costs raise questions about the large-model-centric AI landscape

As corporate spending on AI climbs rapidly, the AI industry's structure, long centered on large models, is beginning to shift.

TechCrunch reported on June 9 that AI model developers have so far kept inference prices relatively low by drawing on the funds they raised. That allowed companies to use advanced models at prices below their actual cost.

But the mood is changing as AI adoption spreads among companies and spending on AI rises sharply in the process. More companies say they feel burdened by AI costs.

Brian Armstrong, CEO of U.S. cryptocurrency exchange Coinbase, forecast that within 12 to 18 months, 80 percent of work will move to models that are 99 percent cheaper. He said demand for AI is virtually infinite, but tasks that truly need the latest large models account for only the remaining 20 percent.

TechCrunch said that if such a forecast becomes reality, it could fundamentally change the economic structure of the AI industry.
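
To put Armstrong's forecast in rough numbers, the short sketch below works through the arithmetic. It rests on assumptions that are not in the reporting: that "work" is proportional to current spending, that total usage stays constant, and that the remaining 20 percent continues to be billed at today's prices. The function name `blended_cost` and its parameters are hypothetical, chosen only for illustration.

```python
def blended_cost(share_moved: float, discount: float, baseline: float = 1.0) -> float:
    """Relative spend after moving part of the workload to a cheaper model.

    share_moved: fraction of work shifted to the cheaper model (assumed
                 proportional to spend, which the article does not state).
    discount:    how much cheaper that model is (0.99 means 99 percent cheaper).
    baseline:    current spend, normalized to 1.0.
    """
    kept = (1.0 - share_moved) * baseline             # work that stays on the large model
    moved = share_moved * baseline * (1.0 - discount)  # work billed at the lower price
    return kept + moved


cost = blended_cost(share_moved=0.80, discount=0.99)
print(f"Relative spend: {cost:.3f}")  # 0.208
print(f"Savings: {1 - cost:.1%}")     # 79.2%
```

Under these assumptions, total spending would fall to about 21 percent of today's level, and nearly all of what remains would go to the 20 percent of tasks that still need a frontier model. If cheaper prices drive usage up, as Armstrong's comment on demand suggests they might, the actual savings would be smaller.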

So far, AI service companies have mainly used cutting-edge models, focusing on competition over quality. If low-cost small models can deliver the same quality, the cost savings could hurt profitability at AI companies such as OpenAI and Anthropic. Both OpenAI and Anthropic are nearing an IPO, making the impact potentially larger. If it is confirmed that most work can be handled with small models, it becomes difficult to justify spending hundreds of billions of dollars to train frontier models, TechCrunch reported.

Companies are also starting to report results from shifting more work to low-cost small models. AI legal tech company Harvey, working with inference platform Fireworks AI, combined Anthropic's Claude Opus with GLM 5.1, an open-source model developed by Chinese AI company Zhipu AI, and cut inference costs by a factor of three without a drop in quality, TechCrunch reported.

Gabe Pereyra, a Harvey co-founder, said, "The definition of quality is changing from using the most powerful model for every task to using the model that produces the correct answer most efficiently."

Many view the current situation as a contest between proprietary and open-source models, but the key issue is whether a model is large or small, TechCrunch reported. Switching from GPT-5.5 to DeepSeek V4 Flash cuts costs, but so does switching to GPT-5.4-Mini, TechCrunch added.

No-code AI agent platform Lindy is one example: it switched its base model from Anthropic's models to DeepSeek v4.

The New Stack reported that Flo Crivello, Lindy's founder and CEO, wrote on social media platform X (formerly Twitter), "We moved 100 percent of Lindy traffic to DeepSeek v4. We saved millions of dollars, and performance actually improved in core use cases. A transformative change for the business."

Crivello said in April that AI inference had become Lindy's biggest cost item, surpassing labor costs. Lindy then selected DeepSeek v4 after evaluating open-source models for six to nine months. The transition was not easy and proved far more complex than expected, the company said. Crivello said, "It required 100 times more work than expected," adding, "Evaluation work to verify model performance in a real work environment and rewriting prompts were the main challenges." Lindy also said the situation could change again. Crivello said, "If Anthropic sharply cuts prices in its next model, we could go back."

As concerns grow over AI costs and security, PCs are also gaining prominence as hardware for running AI. Semiconductor companies are increasing spending on AI chips for PCs, and PC makers are moving quickly as well. AI agents that run on PCs are emerging one after another.

Chi-gyu Hwang delight@d-today.co.kr
