[DigitalToday reporter Chi-gyu Hwang] "Nvidia is now a heterogeneous AI infrastructure platform company."
That is the conclusion well-known tech analyst Patrick Moorhead drew from Nvidia's annual GTC 2026 conference, held on March 16 (local time).
At GTC 2026, Nvidia unveiled its new AI infrastructure platform, Vera Rubin. It presented seven types of chips and five types of rack systems, all ready for simultaneous mass production, and the announcement has been assessed as the most complete architecture unveiling in GTC history.
At the heart of the Vera Rubin platform is the NVL72 rack, which integrates the Rubin GPU and the Vera CPU using NVLink6, a high-speed interconnect technology between GPUs. Nvidia said Vera Rubin achieved 10 times higher inference throughput per watt and one-tenth the cost per token compared with its previous model, Blackwell. Microsoft Chief Executive Satya Nadella confirmed Vera Rubin is already running on Azure.
What caught Moorhead's attention in the announcement was the integration of Groq, an AI inference chip company. The Groq 3 LPX rack introduced by Nvidia has 256 LPU processors, 128 GB of on-chip SRAM and bandwidth of 640 terabytes per second. Chief Executive Jensen Huang said combining Groq with Vera Rubin increases inference throughput per megawatt by 35 times. Samsung Electronics will produce the LP30 chip, with shipments planned for the second half of 2026.
The Vera CPU also drew attention. Huang said, "I didn't think we'd sell this many CPUs on their own. It's already certain it will become a multibillion-dollar business."
He said the Vera CPU targets AI agents. Tasks in which AI agents call tools and compile code are handled on the CPU. If the CPU is slow, the GPU stops, and the Vera CPU is designed to resolve that bottleneck.
Alibaba, ByteDance, Meta and Oracle Cloud will cooperate on deploying the Vera CPU, while Dell, HPE, Lenovo and Supermicro will handle manufacturing.
At the event, Nvidia's software strategy also took clearer shape. Dynamo 1.0, an open-source inference operating system for AI factories, was officially launched. Major cloud companies including AWS, Microsoft, Google Cloud and Oracle Cloud, as well as PayPal, Pinterest and ByteDance, adopted it.
Nvidia also announced the NeMo stack, which strengthens security on OpenClaw, an open-source AI agent tool. Huang likened OpenClaw to Windows and Mac and said, "It is an operating system for personal AI, an innovation as important as HTML and as important as Linux." He added that Adobe, Atlassian, SAP, Salesforce, ServiceNow, CrowdStrike and Siemens are adopting it.
The physical AI ecosystem that Nvidia is pushing is also expanding faster than expected, by some assessments. Industrial robot companies including ABB, Fanuc, Kuka and Yaskawa announced that they have adopted Nvidia Omniverse and the Isaac simulation platform.
The four companies have a combined installed base of more than 2 million robots. BYD, Geely and Nissan have decided to use Nvidia DRIVE Hyperion for Level 4 autonomous driving. Uber plans to expand an Nvidia-based robotaxi service to 28 cities from 2027.
Unresolved challenges remain, however. Moorhead said, "Five types of racks, seven types of chips and multiple interconnects are still complex for ordinary enterprises, as opposed to hyperscalers, to operate. Energy constraints are the same. DSX, Nvidia's dynamic power supply software, is only an optimisation tool and does not increase power itself. The performance figures claimed for the Groq integration also need to be verified at customer sites."