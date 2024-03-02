Dell, one of the world's largest server makers, has spilled the beans on Nvidia's upcoming AI GPUs, codenamed Blackwell. Apparently, these processors will consume up to 1000W, requiring Dell to use its engineering know-how to cool them. The comments by Dell may also hint at some architectural peculiarities of Nvidia's upcoming compute GPUs.

"Obviously, any line of sight to changes that we are excited about what's happening with the H200 and its performance improvement," said Yvonne McGill, Dell's chief financial officer. "We are excited about what happens at the B100 and the B200, and we think that's where there's actually another opportunity to distinguish engineering confidence. Our characterization in the thermal side, you really don't need direct liquid cooling to get to the energy density of 1,000 watts per GPU."

Not being privy to Nvidia's plans for its Blackwell architecture, we can only refer to a common rule of thumb for heat dissipation, which says that it typically tops out at around 1W per square millimeter of die area.

This is where it gets interesting from a chip manufacturing point of view. Nvidia's H100, built on a custom performance-enhanced TSMC 4nm-class process technology, already dissipates around 700W, albeit with the power of its HBM memory included, and its die measures 814 mm². Nvidia's next-generation GPU will probably be built on another performance-enhanced process technology, and we can only guess at what exactly it will be, possibly a 3nm-class process technology.
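
To make the arithmetic behind these figures explicit, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (about 1W per square millimeter, 700W and 814 square millimeters for the current GPU, 1000W for Blackwell). The function and constant names are ours, and the sketch assumes the rule of thumb applies uniformly across the die, which is a simplification.

```python
# Back-of-the-envelope power density check using figures quoted in the article.
# Assumption: the ~1 W/mm^2 rule of thumb applies uniformly across the die.

RULE_OF_THUMB_W_PER_MM2 = 1.0   # approximate ceiling cited in the article
H100_POWER_W = 700.0            # includes HBM memory power, per the article
H100_DIE_AREA_MM2 = 814.0       # die area of the current-generation GPU
BLACKWELL_POWER_W = 1000.0      # per-GPU figure mentioned by Dell


def power_density(power_w: float, area_mm2: float) -> float:
    """Return the average power density in W/mm^2."""
    return power_w / area_mm2


def min_area_for_power(power_w: float, density_w_per_mm2: float) -> float:
    """Return the silicon area (mm^2) needed to stay at a given density."""
    return power_w / density_w_per_mm2


h100_density = power_density(H100_POWER_W, H100_DIE_AREA_MM2)
implied_area = min_area_for_power(BLACKWELL_POWER_W, RULE_OF_THUMB_W_PER_MM2)

print(f"Current GPU average density: {h100_density:.2f} W/mm^2")
print(f"Area implied by 1000 W at 1 W/mm^2: {implied_area:.0f} mm^2")
```

Under these assumptions, the current GPU sits slightly below the rule-of-thumb ceiling (and lower still for the die alone, since the 700W figure includes memory), while a 1000W part would imply roughly 1000 square millimeters of silicon at the same density. That would mean either more silicon area than today's die or a higher power density. Which of the two Nvidia has chosen is not something the quoted comments reveal.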

"That happens next year with the B200," said McGill, referring to Nvidia's next AI and HPC GPU. "The opportunity for us really to showcase our engineering and how fast we can move and the work that we've done as an industry leader to bring our expertise to make liquid cooling perform at scale, whether that's things in fluid chemistry and performance, our interconnect work, the telemetry we are doing, the power management work we're doing, it really allows us to be prepared to bring that to the marketplace at scale to take advantage of this incredible computational capacity or intensity or capability that will exist in the marketplace."

When it comes to high-performance AI and HPC applications, we tend to focus on the number of floating-point operations per second (FLOPS) and then on the power it takes to achieve those FLOPS and to cool the hardware delivering them. What matters for software developers is how to use those FLOPS efficiently. What matters for hardware developers is how to cool the processors producing those FLOPS. This is where Dell's technologies are poised to outdo those of the company's rivals, which is exactly why Dell's CFO spoke out about Nvidia's next-generation Blackwell GPUs.