Google Cloud launches first Blackwell AI GPU-powered instances — GB200 NVL72 systems with 72 B200 GPUs and 36 Grace CPUs

Dell servers based on Nvidia GB200. Image is for illustrative purposes only.
(Image credit: CoreWeave)

Google Cloud has introduced its A4X virtual machines, which are powered by Nvidia's GB200 NVL72 rack-scale systems featuring 72 B200 GPUs and 36 Grace CPUs. According to Google, the new VMs are designed for large-scale AI workloads, such as large language models with long context windows, reasoning models, and scenarios that require massive concurrency. Google also offers A4 VMs for general AI training and development.

Google's A4X VMs leverage Nvidia's GB200 NVL72 systems, which pair 72 B200 GPUs with 36 72-core Grace CPUs (2,592 Armv9-based Neoverse V2 cores in total), all interconnected with NVLink. This enables seamless memory sharing across all 72 GPUs, improving response times and inference accuracy. The system also supports concurrent inference requests, making it suitable for multimodal AI applications.
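For reference, the per-rack totals cited above follow from simple arithmetic; the minimal sketch below (Python, using only the figures given in this article) multiplies out the GPU and CPU core counts.

```python
# Back-of-envelope check of the GB200 NVL72 totals cited above.
GPUS_PER_RACK = 72          # B200 GPUs per NVL72 system
CPUS_PER_RACK = 36          # Grace CPUs per NVL72 system
CORES_PER_GRACE_CPU = 72    # Armv9 Neoverse V2 cores per Grace CPU

total_cpu_cores = CPUS_PER_RACK * CORES_PER_GRACE_CPU
print(f"GPUs per NVLink domain: {GPUS_PER_RACK}")
print(f"Grace cores per rack: {total_cpu_cores:,}")  # 2,592
```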

Performance-wise, A4X VMs deliver four times the training efficiency of the previous A3 VMs, which were based on Nvidia's H100 GPUs. In particular, Google Cloud promises 'over 1 ExaFLOPS' of computing power per GB200 NVL72 system, a figure in line with Nvidia's rating of roughly 1,440 PetaFLOPS of FP4 compute (and about 720 PetaFLOPS at FP8/FP6, with sparsity) for the rack, suitable for training and inference with concurrent workloads.
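The 'over 1 ExaFLOPS' claim can be roughly reconciled with per-GPU numbers as sketched below; the figure of about 20 PFLOPS of sparse FP4 per B200 is a commonly cited Nvidia specification and is treated here as an assumption rather than a Google-published number.

```python
# Rough check: per-GPU Blackwell throughput scaled to a 72-GPU NVL72 rack.
GPUS_PER_RACK = 72
FP4_SPARSE_PFLOPS_PER_B200 = 20.0   # assumed per-GPU figure, not from Google

rack_pflops = GPUS_PER_RACK * FP4_SPARSE_PFLOPS_PER_B200
print(f"Rack-level FP4 throughput: {rack_pflops:,.0f} PFLOPS "
      f"(~{rack_pflops / 1000:.2f} ExaFLOPS)")  # ~1,440 PFLOPS, i.e. 'over 1 ExaFLOPS'
```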

A4X VMs also feature Titanium ML network adapters built on Nvidia's ConnectX-7 NICs, which provide fast, secure, and scalable ML networking: 28.8 terabits per second (72 × 400 Gbps) of uninterrupted, low-latency GPU-to-GPU traffic using RoCE. Google Cloud's Jupiter network fabric connects NVL72 domains to one another, enabling seamless scaling to tens of thousands of Blackwell GPUs in a non-blocking cluster. AI teams can deploy A4X VMs via Google Kubernetes Engine (GKE), which supports clusters of up to 65,000 nodes, and Google also touts advanced sharing and pipelining techniques to maximize GPU utilization for large deployments.
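The aggregate bandwidth figure follows directly from the per-NIC line rate; the short sketch below converts 72 NICs at 400 Gbps each into the totals quoted above, using only the numbers given in the article.

```python
# Aggregate RoCE bandwidth across an NVL72 rack's Titanium ML adapters.
NICS_PER_RACK = 72
GBPS_PER_NIC = 400          # ConnectX-7 per-NIC line rate cited above

total_gbps = NICS_PER_RACK * GBPS_PER_NIC
print(f"Aggregate GPU-to-GPU bandwidth: {total_gbps / 1000:.1f} Tbps")   # 28.8 Tbps
print(f"That is roughly {total_gbps / 8 / 1000:.1f} TB/s of raw throughput")  # ~3.6 TB/s
```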

A4X VMs also seamlessly integrate with Google Cloud services. Google supports Cloud Storage FUSE, which improves training data throughput by 2.9 times, while Hyperdisk ML accelerates model load times by 11.9 times.

Google Cloud now offers both A4 and A4X VMs, each optimized for different AI workloads. A4X, with GB200 NVL72 systems, is aimed at large-scale AI, long-context language models, and high-concurrency applications, while A4, powered by B200 GPUs and unspecified host processors, is better suited for general AI training and fine-tuning. Google has not disclosed pricing for either A4X or A4.

Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers, and from modern process technologies and the latest fab tools to high-tech industry trends.
