[DigitalToday reporter Jinju Hong (홍진주)] Nvidia and Google Cloud are moving to build hyperscale infrastructure targeting next-generation agentic AI and physical AI.
On April 23 (local time), online media outlet Gigazine reported that the two companies plan to expand the Google Cloud AI Hypercomputer and provide enterprises with Nvidia Vera Rubin-based A5X bare-metal instances.
A5X is a dedicated physical server based on Nvidia’s Vera Rubin NVL72 rack-scale system. Unlike virtual servers shared by multiple users, its resources are reserved for a single customer. It is optimised for large-scale AI training and inference and for high-performance simulation.
The key is scalability. A5X can scale to as many as 80,000 Rubin GPUs at a single site and up to 960,000 Rubin GPUs in a multi-site cluster. To reach that scale, it combines Nvidia ConnectX-9 SuperNICs with Google’s network technology to form hyperscale AI clusters. Nvidia described it as infrastructure aimed at demand for “AI factories.”
Performance and cost metrics were also presented. Nvidia said A5X cuts inference cost per token to as little as one-tenth of the previous generation’s and boosts throughput per watt by up to 10 times. For companies, that means either running more AI workloads on the same infrastructure or significantly reducing costs.
The two companies are expanding the collaboration beyond server supply to Google Cloud’s broader AI services. Google is preparing a preview of “Gemini” on Google Distributed Cloud running on Nvidia Blackwell and Blackwell Ultra GPUs, and plans to also provide confidential virtual machines equipped with Nvidia Blackwell GPUs.
In agentic AI, the companies will integrate Nvidia Nemotron and the Nvidia NeMo framework with Google’s enterprise AI platform to support multimodal reasoning, large-scale data processing, and robotics and physical AI simulation. The strategy targets industrial AI operating environments that go beyond basic generative AI.
The collaboration shows a shift in the direction of AI infrastructure competition. Beyond individual GPU performance, the ability to build hyperscale clusters, power efficiency, network integration and ties to enterprise AI platforms are emerging as key competitive strengths.