If you happen to need eight Tesla V100 GPUs with a total of 40,960 CUDA cores and 128GB of GPU memory crammed into one beefy chassis, Nvidia's got you covered. Yes, the DGX-1 can play Crysis (sorry to steal your thunder, commenters), but Nvidia designed it specifically for artificial intelligence workloads in the data center.
Artificial intelligence is reshaping the face of computing from the data center down to mobile devices. Nvidia GPUs already power a good portion of today's compute-intensive AI training, which is the critical step that enables the lighter-weight inference part of the deep learning equation. Processing data on the edge through inference, such as in IoT and mobile devices, brings its own set of challenges, but the real processing grunt work occurs in the data center.
Nvidia sells plenty of GPUs, but as with any business model, it's always more lucrative to step up to the system level. It also helps build a more robust developer ecosystem to further the larger objective of bringing AI to a broad spate of applications that touch every facet of modern computing.
Tesla 150W FH-HL Accelerator
Of course, Nvidia isn't going to stray too far from the discrete component side of the house. The company also announced a 300W Tesla V100 accelerator and a 150W FH-HL (Full Height-Half Length) version. Both models feature the Volta GV100, but Nvidia designed the smaller FH-HL model to bring Volta-powered inference to commodity servers.
The new model is designed to reduce power consumption and heat generation in scale-out deployments. Nvidia CEO Jen-Hsun Huang displayed the 150W version of the card on stage at GTC, but didn't provide any technical details. Inference workloads aren't as intense, and given the lower power envelope, the card will logically be slower than its 300W counterpart. The important takeaway is that the smaller version will offer a more power- and cost-efficient GPU avenue for tackling inference workloads.
Nvidia announced its first-generation DGX-1 supercomputer last year. The Tesla-powered DGX-1 came bristling with eight P100 data center GPUs. The company has already seen strong uptake, with Elon Musk's OpenAI nonprofit among the first recipients. Nvidia claimed the first-generation DGX-1 replaces 400 CPUs with up to 170 FP16 TFLOPS courtesy of 28,672 CUDA cores. NVLink provided up to 5X the performance of a standard PCIe connection.
Those are impressive specs, indeed, but now Nvidia is upping the ante with the new Tesla V100-powered DGX-1. The new system packs eight Volta GPUs into a 3U chassis to deliver a whopping 960 TFLOPS from 40,960 CUDA cores. It also adds 5,120 Tensor cores (more coverage here), and NVLink 2.0 increases throughput to 10X that of a standard PCIe connection (300GB/s). Power consumption weighs in at 3,200W. Nvidia claims the platform can replace 800 CPUs (or 400 servers) and can cut the time required for a training task from eight days on a Titan X down to eight hours.
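The headline numbers are simple multiples of the per-GPU Volta specs. As a quick back-of-the-envelope sketch (per-GPU figures assumed from Nvidia's V100 announcement):

```python
# Rough sanity check of the DGX-1 aggregate specs quoted above.
# Per-GPU figures for Tesla V100 (assumed from Nvidia's Volta announcement):
cuda_cores_per_gpu = 5120     # CUDA cores per V100
tensor_cores_per_gpu = 640    # Tensor cores per V100
tensor_tflops_per_gpu = 120   # deep learning (Tensor core) TFLOPS per V100
gpus = 8                      # GPUs in a DGX-1

print(gpus * cuda_cores_per_gpu)     # 40960 CUDA cores total
print(gpus * tensor_cores_per_gpu)   # 5120 Tensor cores total
print(gpus * tensor_tflops_per_gpu)  # 960 TFLOPS total
```

The claimed training speed-up (eight days to eight hours) works out to roughly 24X, consistent with replacing a single Titan X with eight much faster GPUs.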
Software is just as important as the hardware. The DGX-1 comes with an integrated Nvidia deep learning software stack and cloud management services, which the company claims speeds time to deployment. The software stack supports many of the common tools, such as PyTorch, Caffe, and TensorFlow, among others, and also includes a Docker containerization tool. The Volta DGX-1 will ship in Q3 for $149,000. Customers can purchase the solution today with P100 GPUs, and Nvidia will provide a free upgrade to V100 GPUs in Q3, company CEO Jen-Hsun Huang announced on stage at GTC.
Nvidia designed the DGX Station for smaller deployments, so it's essentially a smaller, cheaper alternative with a lower entry-level price.
Nvidia claims the svelte system can replace up to 400 CPUs while consuming 1/20th the power. The company claims the system is whisper quiet due to its integrated water cooling system. It pulls 1500W to power four V100 GPUs and deliver 480 TFLOPS (FP16). Nvidia claims the workstation provides a 100X speed-up for large data set analysis compared to a 20-node Spark server cluster.
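As with the DGX-1, the DGX Station's throughput figure is just the per-GPU number scaled by the GPU count. A minimal sketch, assuming the same ~120 Tensor-core TFLOPS per V100:

```python
# Rough check of the DGX Station figures quoted above.
gpus = 4                     # V100 GPUs in a DGX Station
tensor_tflops_per_gpu = 120  # assumed Tensor core TFLOPS per V100
system_watts = 1500          # total system power draw

print(gpus * tensor_tflops_per_gpu)  # 480 TFLOPS (FP16, Tensor cores)
print(system_watts / gpus)           # 375.0W naive per-GPU share of the
                                     # system budget (CPU/storage included)
```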
The system carries a $69,000 price tag and also comes with a software suite.
Nvidia also touted its HGX-1, which it developed in collaboration with Microsoft to power Azure deep learning, GRID graphics, and CUDA HPC workloads, but the company was light on details. The HGX-1 appears to be a standard DGX-1 chassis paired with a complementary 1U server.
The servers are connected via flexible PCIe cabling, and the configuration allows Microsoft to mix and match CPU and GPU requirements, such as having two or four CPUs paired with a varying number of GPUs in the secondary 3U DGX-1 chassis. The system isn't available to the public, but likely serves as a nice reminder to Nvidia's investors that it is heavily engaged in wooing the high-margin hyperscale data center customers.