AWS and Nvidia Build a Supercomputer with 16,384 Superchips, Team Up for Generative AI Infrastructure

(Image credit: Nvidia)

Although many companies are developing accelerators for artificial intelligence (AI) workloads, Nvidia's CUDA platform is currently unrivaled in its AI software support. As a result, demand for Nvidia-based AI infrastructure is high. To meet that demand, Amazon Web Services and Nvidia have entered a strategic partnership under which AWS will offer Nvidia-based infrastructure for generative AI. The two companies will partner on several key projects.

"Today, we offer the widest range of Nvidia GPU solutions for workloads including graphics, gaming, high performance computing, machine learning, and now, generative AI," said Adam Selipsky, CEO at AWS. "We continue to innovate with Nvidia to make AWS the best place to run GPUs, combining next-gen Nvidia Grace Hopper Superchips with AWS's EFA powerful networking, EC2 UltraClusters' hyper-scale clustering, and Nitro's advanced virtualization capabilities." 

Project Ceiba is a cornerstone of this collaboration, aiming to create the world's fastest GPU-powered AI supercomputer, hosted by AWS exclusively for Nvidia's own use. This ambitious project will integrate 16,384 Nvidia GH200 Superchips (using the GH200 NVL32 solution, which packs 32 GH200 Superchips and 19.5 TB of unified memory per node) and is set to offer a staggering 65 'AI ExaFLOPS' of processing power. Nvidia will use the machine for its own generative AI research and development projects.
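
As a rough sanity check on those figures (my own back-of-envelope arithmetic, not an official breakdown from AWS or Nvidia), the node count and per-chip throughput line up with Nvidia's published GH200 numbers, assuming the 'AI ExaFLOPS' figure refers to sparse FP8 throughput:

```python
# Back-of-envelope check of the Project Ceiba figures.
# Assumptions: "AI ExaFLOPS" means sparse FP8 throughput; all chips contribute equally.
TOTAL_SUPERCHIPS = 16_384
CHIPS_PER_NVL32 = 32
TOTAL_AI_EXAFLOPS = 65

nvl32_nodes = TOTAL_SUPERCHIPS // CHIPS_PER_NVL32              # 512 NVL32 nodes
pflops_per_chip = TOTAL_AI_EXAFLOPS * 1000 / TOTAL_SUPERCHIPS  # ~3.97 PFLOPS per GH200

print(f"{nvl32_nodes} NVL32 nodes, ~{pflops_per_chip:.2f} 'AI PFLOPS' per GH200")
```

The ~4 PFLOPS per Superchip that falls out of this matches the roughly 4 PFLOPS of sparse FP8 throughput Nvidia quotes for a single GH200, which suggests the headline figure is a sparse FP8 number rather than FP64.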

The Nvidia DGX Cloud hosted on AWS is another major component of the partnership. This AI-training-as-a-service platform will be the first commercially available offering to incorporate the GH200 NVL32 configuration with 19.5 TB of unified memory. It provides developers with the largest shared memory available in a single instance, significantly accelerating the training of advanced generative AI and large language models, potentially exceeding 1 trillion parameters.
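
To put the trillion-parameter claim in context, here is a rough memory estimate (my own arithmetic, using the common rule of thumb of ~16 bytes per parameter for mixed-precision training with Adam-style optimizer state):

```python
# Rough training-memory estimate for a 1-trillion-parameter model.
# Rule of thumb: FP16 weights (2 B) + FP16 gradients (2 B) + FP32 master
# weights (4 B) + Adam moments (8 B) = ~16 bytes per parameter, before activations.
params = 1e12
weights_tb = params * 2 / 1e12    # ~2 TB of FP16 weights alone
training_tb = params * 16 / 1e12  # ~16 TB of full training state

print(f"weights: ~{weights_tb:.0f} TB, training state: ~{training_tb:.0f} TB")
```

By that estimate, the full training state of a 1-trillion-parameter model (~16 TB) would just fit inside a single NVL32 instance's 19.5 TB of shared memory, which is presumably why the largest-shared-memory claim matters; activations would still require recomputation or sharding across instances.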

In addition, AWS will be the first cloud provider to offer an AI supercomputer based on Nvidia's GH200 Grace Hopper Superchips. Each instance will connect 32 Grace Hopper Superchips (with 4.5 TB of HBM3e memory) using NVLink, and deployments will scale up to thousands of GH200 Superchips connected with Amazon's EFA networking and supported by advanced virtualization (AWS Nitro System) and hyper-scale clustering (Amazon EC2 UltraClusters).
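
The 4.5 TB HBM3e figure follows directly from the per-chip memory, as a quick check of my own shows (exact usable capacities may differ slightly from Nvidia's headline numbers):

```python
# Per-instance memory from Nvidia's published per-chip figures (simple arithmetic).
chips = 32
hbm3e_per_gh200_gb = 141     # HBM3e on each GH200 Superchip
lpddr5x_per_grace_gb = 480   # LPDDR5X attached to each Grace CPU

hbm_tb = chips * hbm3e_per_gh200_gb / 1000                               # ~4.5 TB of HBM3e
unified_tb = chips * (hbm3e_per_gh200_gb + lpddr5x_per_grace_gb) / 1000  # ~19.9 TB total

print(f"HBM3e: ~{hbm_tb:.1f} TB, HBM3e + LPDDR5X: ~{unified_tb:.1f} TB per instance")
```

The same arithmetic also accounts for the 19.5 TB of unified memory cited above: roughly 4.5 TB of HBM3e plus the 480 GB of LPDDR5X attached to each of the 32 Grace CPUs (the small gap to my ~19.9 TB figure is presumably reserved or rounded capacity).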

The collaboration will also introduce new Nvidia-powered Amazon EC2 instances. These instances will feature H200 Tensor Core GPUs with up to 141 GB of HBM3e memory each for large-scale generative AI and high-performance computing (HPC) workloads. Additionally, G6 and G6e instances, equipped with Nvidia L4 and L40S GPUs, respectively, are designed for a wide array of applications, from AI fine-tuning to 3D workflows, and can leverage Nvidia Omniverse for creating AI-enabled 3D applications.

Finally, the collaboration will bring Nvidia's advanced software to AWS to speed up generative AI development. This includes the NeMo LLM framework and NeMo Retriever for creating chatbots and summarization tools, as well as BioNeMo for accelerating drug discovery.

"Generative AI is transforming cloud workloads and putting accelerated computing at the foundation of diverse content generation," said Jensen Huang, founder and CEO of Nvidia. "Driven by a common mission to deliver cost-effective state-of-the-art generative AI to every customer, Nvidia and AWS are collaborating across the entire computing stack, spanning AI infrastructure, acceleration libraries, foundation models, to generative AI services."

Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers, and from modern process technologies and the latest fab tools to high-tech industry trends.

  • bit_user
    *Yawn*

    So... how many Watts does it burn, at full load? I guess it might be interesting to know something about the pricing structure, as well.
  • weber462
    Computational power has become the new arms race. Keep it with the masses, not just the rich. Support distributed computing.
  • bit_user
    weber462 said:
    Computational power has become the new arms race. Keep it with the masses, not just the rich. Support distributed computing.
    I already told you: a lot of problems need tight integration of compute elements and don't map well to a distributed computing model.

    In a way, cloud computing is indeed very democratic. I could never access a supercomputer before, but now it's potentially within reach for me to rent a little time on one.
  • thisisaname
    Generative AI, the gift that keeps on taking. Right now it seems to me like a solution looking for a problem to solve.
  • sygreenblum
    bit_user said:
    *Yawn*

    So... how many Watts does it burn, at full load? I guess it might be interesting to know something about the pricing structure, as well.
    That's a fair question. I just read a NY Times and an MIT article where AI is expected to exceed AC units as the largest single source of electricity demand, requiring as much as 21 percent of the grid by 2030, which would likely be higher than projected EV consumption as well. Google's AI/data centers already use more power than the country of Ireland.