Coinbase said it is routing prompts to cheaper models so that its costs stay roughly flat even as its use of artificial intelligence (AI) tokens grows exponentially.
Business Insider reported on June 8 that Coinbase Chief Executive Brian Armstrong said in a post on X, formerly Twitter, that the company is keeping costs steady by distributing prompts appropriately across models.
The remarks come as debate over cost efficiency in the AI industry has picked up again. New models such as Opus 4.8 and GPT-5.5 are rated as leading on performance, but they can consume more tokens. When Anthropic released Opus 4.7, some users also flagged usage limits. That suggests that, apart from the trend of putting high-performance models front and center, controlling costs by deciding which model to assign to which task has become more important in running real services.
Armstrong said he expects the cost structure to be reshaped more quickly going forward. "Within 12 to 18 months, 80 percent of work will run on models that are 99 percent cheaper," he forecast. He also suggested that the newest models are unlikely to be general-purpose tools everyone uses all the time and may instead focus on certain tasks requiring high-level reasoning. He cited situations such as scientific innovation or agent orchestration as cases where such models are needed.
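As a rough illustration of what that forecast would imply for a total bill, the arithmetic can be sketched as follows. The sketch assumes, hypothetically, that "work" is measured in tokens, that a task uses the same number of tokens on either model, and that the frontier model's price is normalized to 1.0. None of these assumptions come from Armstrong's post, and the function name is illustrative.

```python
def blended_cost(cheap_share, cheap_discount, frontier_price=1.0):
    """Return the average price per unit of work when a share of it
    moves to a cheaper model.

    cheap_share    -- fraction of work sent to the cheaper model (assumed 0..1)
    cheap_discount -- how much cheaper that model is (0.99 = 99 percent cheaper)
    frontier_price -- price of the frontier model, normalized to 1.0 here
    """
    cheap_price = frontier_price * (1.0 - cheap_discount)
    return cheap_share * cheap_price + (1.0 - cheap_share) * frontier_price


# Armstrong's figures: 80 percent of work on models that are 99 percent cheaper.
cost = blended_cost(cheap_share=0.80, cheap_discount=0.99)
savings = 1.0 - cost

print(f"blended cost: {cost:.3f} of the all-frontier baseline")
print(f"savings: {savings:.1%}")
```

Under those assumptions the bill falls to roughly 21 percent of the all-frontier baseline, and almost all of what remains comes from the 20 percent of work that still runs on the most expensive models.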
Market reaction was mixed, but there was broad agreement that mixing models is the right direction. Venture investor Marc Andreessen called Armstrong's remarks "interesting." Box CEO Aaron Levie said the figures Armstrong presented were "somewhat extreme," but he also saw a strong likelihood that future AI use will split between high-performance work and bulk processing. That would mean leading models handle difficult tasks, while low-cost models take on large-scale repetitive work.
The attitude that tokens should not be conserved also appears to be fading. Some in the tech industry had embraced a culture of showing off high token bills or heavy use of the latest models. In the startup sector in particular, many argued that tokens should not be spared. Y Combinator CEO Garry Tan advised founders to use tokens aggressively, and startup founder Lance Yant said in April that it is foolish to save tokens.
Still, the mood has recently shifted toward allocating models according to cost and the nature of the work. Glean co-founder Tony Gentilcore said Armstrong's view was correct. He added that "everyone in tech already knows this," and said only financial markets oversimplify by treating Opus pricing as infinite.
As companies move into the stage of building AI into real services, deciding which work to handle under which cost structure is emerging as a key operating variable, one as important as the competition among the latest models.