AMD preps rack-scale Instinct MI450X IF128 with 128 GPUs to challenge Nvidia's VR200 NVL144 in 2026
AMD's venture carries significant complexity and deployment risks.

AMD plans to launch its first two rack-scale Instinct accelerators in 2026 to compete with Nvidia's VR200 NVL144, reports SemiAnalysis. The two machines, the Instinct MI450X IF64 and Instinct MI450X IF128, are both designed for AI deployments. If they prove successful, they could change the AI hardware landscape over time.
While AMD's Instinct MI300-series AI and HPC GPUs are very powerful on paper, they cannot compete against Nvidia's GB200 NVL72 rack-scale solution in terms of performance scalability, as their maximum scale-up world size is eight processors.
But next year, things are set to change: AMD plans to release its Instinct MI450X IF64 and Instinct MI450X IF128 solutions, with 64 and 128 GPU packages respectively, to compete against Nvidia's VR200 NVL144 (with 72 GPU packages).
Theoretically, AMD's MI450X IF128 can have an edge over Nvidia's VR200 NVL144. However, its complexity and technical challenges may limit its initial success.
The Instinct MI450X IF128 will be AMD's first system to support multiple AI processors across two racks using Infinity Fabric extended over Ethernet. The machine will rely on 16 1U servers per rack (32 in total), each running one AMD EPYC 'Venice' CPU and four Instinct MI450X GPUs, each equipped with its own LPDDR memory pool and a PCIe x4 SSD. Each of the 128 GPUs will have over 1.8 TB/s of unidirectional internal bandwidth for inter-GPU communication within the same scaling domain, thus enabling significantly larger compute clusters than AMD has supported so far.
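The reported topology can be sanity-checked with quick arithmetic. This is a minimal sketch; the per-rack server count is an assumption inferred from the 128-GPU total, not a figure AMD has confirmed:

```python
# Back-of-the-envelope check of the MI450X IF128 topology described above.
# Assumption (not stated by AMD): the 16 1U servers are per rack.

RACKS = 2              # IF128 spans two racks
SERVERS_PER_RACK = 16  # 1U servers, one EPYC 'Venice' CPU each
GPUS_PER_SERVER = 4    # MI450X GPUs per server

total_gpus = RACKS * SERVERS_PER_RACK * GPUS_PER_SERVER
print(total_gpus)  # 128
```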
For scale-out communication outside the local group of GPUs (i.e., between MI450X IF128 machines), the system will include up to three 800GbE Pensando network cards per GPU, for a total outbound network bandwidth of 2.4 Tb/s per device (via PCIe). A secondary configuration will also be available, in which each GPU uses two 800GbE network cards connected over PCIe. However, this version will not be able to exploit the interfaces' full bandwidth, as the PCIe 5.0 links cannot fully feed two high-speed network cards.
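The bandwidth figures above can be checked with a short calculation. This is a rough sketch under stated assumptions: PCIe 5.0 signals at 32 GT/s per lane with 128b/130b encoding, and the lane count per NIC link is illustrative, since the article does not specify it:

```python
# Rough check of the scale-out bandwidth numbers.
# Assumptions (not from AMD): PCIe 5.0 runs at 32 GT/s per lane with
# 128b/130b encoding; the x16 lane count per link is illustrative.

def nic_aggregate_tbps(cards, speed_gbe=800):
    """Total outbound bandwidth in Tb/s for `cards` NICs of `speed_gbe` GbE."""
    return cards * speed_gbe / 1000

def pcie5_gbps(lanes=16):
    """Usable per-direction PCIe 5.0 bandwidth in Gb/s (128b/130b encoding)."""
    return 32 * (128 / 130) * lanes

print(nic_aggregate_tbps(3))   # 2.4 Tb/s, matching the article's figure
print(round(pcie5_gbps(16)))   # ~504 Gb/s, short of even one 800GbE NIC
```

The second line illustrates why the two-card configuration is bandwidth-limited: a single PCIe 5.0 x16 link delivers roughly 504 Gb/s per direction, well under the 1.6 Tb/s that two 800GbE cards could carry.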
Unlike Nvidia's GB200-series systems, which use active optical cables with embedded components to connect racks, AMD will employ a simpler passive copper wiring approach. This strategy may help reduce system cost and power consumption, but could be limited by signal integrity or cable length constraints.
Also, due to the system's complexity, manufacturing and deployment may face delays or technical issues. To address this risk, AMD is preparing a smaller version of the same architecture called MI450X IF64. This variant will be confined to a single rack and use a simplified interconnect design, which promises to enable a more predictable rollout.
If AMD manages to execute this architecture successfully, it could improve its position in the AI compute market, particularly AI inference systems. Whether it will be able to challenge Nvidia is something that remains to be seen, though.

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers, and from modern process technologies and the latest fab tools to high-tech industry trends.