AI demand debate clouds memory order visibility

HBM-driven memory expansion phase [Photo: SK]

Whether demand for artificial intelligence (AI) tokens is translating into real economic output has emerged as a precondition for forecasting supply and demand for DRAM and high-bandwidth memory (HBM). Amid criticism that the output generated by AI tokens is not captured in statistics, Computex 2026 produced a series of forecasts that token demand will rise 40-fold by 2030. The industry sees limits to visibility on memory orders until the reality of demand is confirmed.

The strength of demand forecasts was evident at Computex, held on its biggest-ever scale under the theme "AI Together". Cristiano Amon, Qualcomm's chief executive, said in a keynote speech in Taipei, Taiwan on June 1 that global AI token consumption, currently about 31.7 billion every 10 seconds, will rise about 40-fold to 1.27 trillion by 2030. He cited estimates that conversational AI uses about 10,000 tokens per task, reasoning AI about 100,000, and agentic AI, which carries out tasks on its own, about 1 million. His logic is that token demand rises structurally as agents run continuously at machine speed rather than human speed.
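The arithmetic behind the keynote figures can be checked with a short Python sketch. It uses only the numbers quoted above; the 2026 start year for the growth window is an assumption made here for illustration, since the keynote gave only the 2030 end point.

```python
# Figures quoted in the keynote, as reported above.
current_tokens = 31.7e9    # tokens consumed worldwide per 10 seconds today
growth_multiple = 40       # projected increase by 2030

# Assumption (not stated in the keynote): growth runs from 2026 to 2030.
years = 2030 - 2026

projected_tokens = current_tokens * growth_multiple
implied_annual_growth = growth_multiple ** (1 / years)

# Estimated tokens per task by type of AI, as cited in the keynote.
tokens_per_task = {
    "conversational": 10_000,
    "reasoning": 100_000,
    "agentic": 1_000_000,
}
agentic_vs_chat = tokens_per_task["agentic"] // tokens_per_task["conversational"]

print(f"Projected: {projected_tokens / 1e12:.2f} trillion tokens per 10 seconds")
print(f"Implied growth: about {implied_annual_growth:.2f}x per year over {years} years")
print(f"One agentic task uses as many tokens as {agentic_vs_chat} conversational tasks")
```

Under that assumed four-year window, a 40-fold rise works out to roughly 2.5-fold growth every year, which gives a sense of how steep the projection is.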

The supply side pointed in the same direction. Chey Tae-won (최태원), chairman of SK Group, said at Computex on June 2 that the memory bottleneck is expected to continue until 2030, and that SK hynix will double its wafer production capacity within 5 years. Chey said that the more caching is needed, the more memory is required, and that memory demand is rising further as companies worldwide invest heavily in AI data centres and as AI PCs emerge. He also cited supply constraints, saying it takes at least 3 years to build a new fab and more than 5 years for a greenfield site. Sumit Sadana, Micron's chief business officer, likewise said AI context length is increasing 30-fold a year and memory per server has doubled over the past 3 years, stressing a shift toward a memory-centric architecture.

The problem is that there are no statistics to verify such demand forecasts. While 37 percent of total token use is concentrated in computing and mathematics, one analysis finds that U.S. software investment, as counted in GDP, has not deviated from its existing trend. If more than one-third of tokens are used for coding and mathematical computation, traces should appear somewhere in software production or investment statistics, but no such change is visible in macro indicators. Input indicators such as token consumption and output indicators such as economic statistics are telling different stories.

The asymmetry between costs and output is also clear. Data centre power is tallied in watts and capital spending in dollars, so input costs are measured precisely. By contrast, much of the output produced by AI sits in a statistical "dark area" outside official data. That is because value absorbed as internal productivity gains or offered as free services does not pass through market transactions and is not reflected in GDP. This asymmetry makes it hard to tell whether current token demand is structural new demand or temporary use during adoption experiments. Capacity expansion decisions are being made with only the input side visible.

◆New demand or temporary use... conditions for order visibility

This distinction is directly linked to investment decisions in the memory industry. If token demand is new demand, it can explain the current race to expand capacity, but if temporary use accounts for a large share, a downturn in orders could arrive without warning. Even if demand persists, efficiency technologies could change the composition of demand.

According to iM Securities, a compression technology called "TurboQuant", which Google is applying in part to Gemini 3.0, shrinks the capacity needed for the KV cache, a model's short-term memory store, by a factor of more than six, allowing six times more users to be served with the same HBM capacity. Intel's unveiling at Computex of a data centre GPU called "Crescent Island", equipped with 480GB of LPDDR5X and no HBM, is also seen as a workaround strategy in the same vein.
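The link between KV cache compression and users served per unit of HBM can be illustrated with a rough Python sketch. The model dimensions, context length and 96 GiB cache budget below are hypothetical values chosen for illustration, not figures from iM Securities or Google, and the sketch assumes the whole budget is available for the cache. The sixfold ratio is the lower bound reported above, and nothing here reflects how TurboQuant itself works.

```python
GIB = 2**30

def kv_cache_bytes(context_tokens, layers, kv_heads, head_dim, bytes_per_value):
    """Uncompressed KV cache size for one user's context.

    Keys and values are both stored for every layer, hence the factor of 2.
    """
    return 2 * layers * kv_heads * head_dim * bytes_per_value * context_tokens

def users_served(cache_budget_bytes, per_user_bytes, compression_ratio=1):
    """Whole users that fit when each cache shrinks by compression_ratio."""
    return cache_budget_bytes * compression_ratio // per_user_bytes

# Hypothetical model and deployment, for illustration only.
per_user = kv_cache_bytes(
    context_tokens=32_768, layers=64, kv_heads=8, head_dim=128, bytes_per_value=2
)
cache_budget = 96 * GIB  # assumed HBM set aside for KV cache

baseline_users = users_served(cache_budget, per_user)
compressed_users = users_served(cache_budget, per_user, compression_ratio=6)

print(f"KV cache per user: {per_user / GIB:.0f} GiB")
print(f"Users without compression: {baseline_users}")
print(f"Users with 6x compression: {compressed_users}")
```

Because users served scale linearly with the compression ratio when the cache is the binding constraint, a sixfold reduction translates directly into six times the users on the same HBM, which is why such technologies could change the composition of memory demand even if token demand persists.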

Order signals are solid for now. Nvidia CEO Jensen Huang introduced the supply chain for its next-generation Vera Rubin platform in a keynote speech and named SK hynix as an HBM4 supplier, while Samsung Electronics responded with Song Jae-hyuk (송재혁), president and chief technology officer of its semiconductor division, unveiling an eighth-generation HBM prototype for the first time. Still, these orders rest on customers' demand forecasts, which is a separate matter from whether end demand will persist, and that remains to be verified.

The key is when the persistence of token demand is confirmed in statistics. Until an output measurement system is in place, the gap between suppliers' forecasts and macro indicators is likely to persist, and the gap itself is likely to become a source of debate over the industry cycle. In the meantime, order visibility for Samsung Electronics and SK hynix is expected to remain dependent on customer forecasts. An industry official said, "The moment when token demand is proven through output will be a turning point for the memory industry cycle."

Dae-geon Seok d2dg@d-today.co.kr
