[DigitalToday reporter Chi-gyu Hwang (황치규)] The cost of using generative AI apps and services is rising, fuelling concerns that companies' expense burdens could grow.

Next-generation GPUs and AI accelerators are expected to lower inference costs, but some say the benefits are unlikely to be passed on to users.

According to a recent report by The Register, executives who expected AI to replace employees at lower cost are about to face a harsh reality.

AI coding tools such as Claude Code, Codex and GitHub Copilot have become the biggest success story in the AI sector. The problem is that data centres built to train AI models were not designed to serve many users at the same time. Training and inference are completely different tasks.

That is also why Nvidia acquired AI chip startup Groq for $20 billion. AMD, AWS, Intel and Google are all redesigning GPUs and AI accelerators to lower per-token costs. But most of that hardware is scheduled to be released in the second half of this year, and large-scale deployment will not be possible until early to mid-2027, The Register said.

Against this backdrop, model developers appear to be testing whether users are hooked enough on AI to keep using it even if prices rise.

OpenAI doubled token prices with the release of GPT-5.5. The price is $5 per 1 million input tokens and $30 per 1 million output tokens. Google has followed suit. Its newly released Gemini Flash 3.5 is 3 to 6 times more expensive than the previous model.
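
To put the per-token prices in concrete terms, the rough sketch below works out what a single request would cost at the GPT-5.5 rates quoted above. The function name `request_cost` and the token counts in the example are hypothetical and chosen only for illustration; they do not come from The Register's report or from OpenAI.

```python
# Prices quoted in the article for GPT-5.5, in US dollars per 1 million tokens.
INPUT_PRICE_PER_MILLION = 5.00
OUTPUT_PRICE_PER_MILLION = 30.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one request at the quoted per-token prices."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload (an assumption, not a figure from the report):
# 200,000 input tokens and 50,000 output tokens.
example = request_cost(200_000, 50_000)
print(f"${example:.2f}")
```

Because output tokens cost six times as much as input tokens at these rates, a workload that generates a lot of text is billed mostly for its output.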

Agent tools consume tokens dozens of times faster than general-purpose chatbots, compounding the burden of price increases, The Register reported.

Microsoft scrapped seat-based billing for GitHub Copilot and switched to usage-based pricing. Anthropic is also reviewing its pricing model.

The Register forecast: "Having AI do work costs about $30 an hour. Hiring a person costs $40 an hour plus benefits. AI companies can keep prices high with the logic that it is 'cheaper than people'. Before long, AI prices may be shown not in token units but as the 'cost of replacing one full-time employee'."

Nor is competition likely to push prices down. Major model developers are all running at a loss, leaving no margin to cut, The Register reported.

Copyright © DigitalToday. All rights reserved. Unauthorized reproduction and redistribution are prohibited.