Gartner scenario outlook for generative AI inference costs. [Photo: Gartner]

Companies' actual AI costs are unlikely to fall even if the unit price of AI tokens drops sharply, according to a new forecast.

Global IT research firm Gartner said in a report on Monday that inference costs for large language models (LLMs) with 1 trillion parameters will fall by more than 90 percent by 2030 compared with 2025. It also said cost efficiency could improve by as much as 100 times compared with early 2022 models of the same scale.

It cited improved efficiency in semiconductors and infrastructure, innovation in model design, wider use of inference-specialised chips and broader adoption on edge devices as drivers of the unit price decline.

But Gartner said falling token prices do not necessarily lead to lower corporate AI costs. AI agents consume 5 to 30 times more tokens per task than existing chatbots, it said, and usage is rising faster than token prices are falling.
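
As a rough illustration of that arithmetic, the sketch below multiplies the remaining token price by a token multiplier to get a per-task cost relative to today. It pairs Gartner's two headline figures (a 90 percent price decline and 5 to 30 times more tokens per task) in a way the report itself may not, and the function names and the baseline of 1.0 are assumptions made for illustration.

```python
def relative_task_cost(price_drop: float, token_multiplier: float) -> float:
    """Per-task cost relative to a baseline of 1.0 (hypothetical helper).

    price_drop: fractional fall in the unit token price, e.g. 0.90 for 90 percent.
    token_multiplier: how many times more tokens the new workload uses per task.
    """
    return (1.0 - price_drop) * token_multiplier


def break_even_multiplier(price_drop: float) -> float:
    """Token multiplier at which the per-task cost equals the baseline."""
    return 1.0 / (1.0 - price_drop)


if __name__ == "__main__":
    # Assumed scenario: a 90 percent price decline combined with
    # agent workloads that use 5, 10, or 30 times more tokens per task.
    for multiplier in (5, 10, 30):
        ratio = relative_task_cost(0.90, multiplier)
        print(f"{multiplier}x tokens -> {ratio:.2f}x baseline cost per task")
```

Under these assumptions, a task that uses 5 times more tokens costs about half as much as before, one that uses 10 times more roughly breaks even, and one that uses 30 times more costs about three times as much, before counting any growth in the number of tasks.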

Gartner proposed multi-model orchestration as a response. It said repetitive, high-frequency tasks should be handled by small models or domain-specific language models, while frontier-grade model inference should be reserved for high-value, complex tasks.
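
The routing rule described above could be sketched along these lines. This is a minimal illustration, not Gartner's method: the `Task` fields, the tier names, and the rule that a task must be both high-value and complex to reach the frontier tier are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A unit of work to route (hypothetical structure)."""
    name: str
    high_value: bool
    complex_reasoning: bool


def choose_tier(task: Task) -> str:
    """Pick a model tier for a task.

    Assumed rule: only tasks that are both high-value and complex go to
    the frontier tier; everything else goes to a small or domain-specific model.
    """
    if task.high_value and task.complex_reasoning:
        return "frontier"
    return "small"


if __name__ == "__main__":
    # Example tasks are invented for illustration.
    tasks = [
        Task("classify support ticket", high_value=False, complex_reasoning=False),
        Task("summarize meeting notes", high_value=True, complex_reasoning=False),
        Task("draft merger risk analysis", high_value=True, complex_reasoning=True),
    ]
    for task in tasks:
        print(f"{task.name}: {choose_tier(task)}")
```

In practice the two boolean flags would have to come from somewhere, such as task metadata or a lightweight classifier, and that choice is left open here.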

Will Sommer, a senior director analyst at Gartner, said, "Basic AI functions are becoming close to zero cost, but the computing resources and systems that support advanced inference remain scarce."

Copyright © DigitalToday. All rights reserved. Unauthorized reproduction and redistribution are prohibited.