Nvidia announced its next-generation Ampere A100 GPU last month, but that was the SXM4 version of the card, aimed at scientific and data center applications. Now Nvidia has announced the PCI-Express 4.0 variant of the data center GPU, which comes with the same GPU and memory, but that's where the similarities end.
Of course, the most notable difference is its connectivity. The PCIe 4.0 interface on this graphics card makes it much more flexible than the SXM variant, as it can be slotted into most PCs instead of only purpose-built systems. This allows for easy upgrading of existing server infrastructure, or installation into workstations for scientific use. It also lets many system integrators easily build the A100 into their systems.
“Adoption of NVIDIA A100 GPUs into leading server manufacturers’ offerings is outpacing anything we’ve previously seen,” said Ian Buck, VP and GM of Accelerated Computing at NVIDIA. “The sheer breadth of NVIDIA A100 servers coming from our partners ensures that customers can choose the very best options to accelerate their data centers for high utilization and low total cost of ownership.”
But this flexibility comes at a cost. The PCIe 4.0 variant of the Ampere A100 GPU has a significantly lower TDP of 250W, versus 400W for the SXM variant. Nvidia says this leads to a 10% performance penalty in single-GPU workloads. Multi-GPU scaling is also more limited: the PCIe card can link just two GPUs via NVLink, whereas the SXM variant supports up to eight GPUs via NVSwitch. For distributed workloads, Nvidia says the performance penalty (looking at eight GPUs) is up to 33%.
Given the performance difference of only 10% for a single GPU, it appears much of the SXM variant's additional power budget goes to inter-GPU communication. 400W is also the maximum power limit, so the GPU may not actually use that much in some workloads. Note that power delivery is also much easier with SXM, as the power comes through the mezzanine connector. The PCIe models, by contrast, get only up to 75W of power from the x16 slot, with additional power coming via 8-pin or 6-pin PEG connectors.
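To put the TDP and performance figures above side by side, here is a rough back-of-the-envelope sketch in Python. It assumes each card actually draws its full TDP, which, as noted, may not hold in every workload, and it treats Nvidia's quoted penalties as exact. The function name and structure are purely illustrative.

```python
# Figures quoted in the article. Treating TDP as real power draw is an
# assumption made only for this rough comparison.
SXM_TDP_W = 400
PCIE_TDP_W = 250


def relative_perf_per_watt(perf_penalty, pcie_tdp=PCIE_TDP_W, sxm_tdp=SXM_TDP_W):
    """Return PCIe performance-per-watt divided by SXM performance-per-watt.

    perf_penalty is the fraction of SXM performance the PCIe card loses
    (0.10 for the quoted single-GPU case).
    """
    pcie_perf = 1.0 - perf_penalty  # SXM performance is normalized to 1.0
    return (pcie_perf / pcie_tdp) / (1.0 / sxm_tdp)


single_gpu = relative_perf_per_watt(0.10)  # quoted single-GPU penalty
eight_gpu = relative_perf_per_watt(0.33)   # quoted worst case with eight GPUs

print(f"Single GPU: {single_gpu:.2f}x the SXM perf per watt")
print(f"Eight GPUs: {eight_gpu:.2f}x the SXM perf per watt")
```

Under those assumptions, the PCIe card comes out at roughly 1.44x the SXM card's performance per watt in single-GPU workloads, and still slightly ahead (about 1.07x) in the worst eight-GPU case, though real-world power draw will shift these numbers.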
Power and TDP aside, the graphics cards are nearly identical. Both use the same Ampere GPU with 6912 CUDA cores, a chip that measures an impressive 826 mm² despite being manufactured on a 7nm process. The GPU is also still flanked by six HBM2 stacks that offer up a whopping 40 GB of ultra-fast memory.
How Long Do Gamers Have to Wait?
Gamers will have to wait a little longer to get access to Nvidia's PCI-Express 4.0 goodness. We do expect the consumer Ampere graphics cards to feature the new connectivity standard, but don't expect to see Ampere GeForce RTX 3000 cards until around September. Given that the leaks are rolling in faster than we can write about them, we'll surely be seeing new GPUs by the end of the year.