AI & Enterprise
OpenAI shared internal method to cut inference costs; could lower ChatGPT operating expenses
OpenAI engineers have reportedly shared internally an optimisation technique that can cut AI inference costs to less than half. Foreign media reports said the method was mentioned inside OpenAI in early June and has already been applied to some processing for ChatGPT guest users. Inference costs are ongoing expenses incurred whenever an AI model generates responses. Analysts have cited OpenAI’s cost burden, including an estimate that it spent more than $5 billion on inference in the first half of 2025.