Nvidia provides the first public view of its fastest AI supercomputer — Eos is powered by 4,608 H100 GPUs, tuned for generative AI

(Image credit: Nvidia)

Nvidia on Thursday published a video that gives the first public glimpse into the architecture of Eos, its newest enterprise-oriented supercomputer designed for advanced AI development at the datacenter scale, and the company's fastest AI supercomputer. 

The Eos machine, currently used by Nvidia itself, is ranked the world's No. 9 highest-performing supercomputer in the latest Top 500 list, which ranks systems by FP64 performance; in pure AI tasks, it's likely among the fastest. Meanwhile, its blueprint can be used to build enterprise-oriented supercomputers for other companies, too.

"Every day EOS rises to meet the challenges of thousands of Nvidia's in-house developers doing AI research, helping them solve the previously unsolvable," Nvidia stated in the video. 

Nvidia's Eos is equipped with 576 DGX H100 systems, each containing eight Nvidia H100 GPUs for artificial intelligence (AI) and high-performance computing (HPC) workloads. In total, the system packs 1,152 Intel Xeon Platinum 8480C processors (56 cores each) as well as 4,608 H100 GPUs, enabling Eos to achieve an Rmax of 121.4 FP64 PetaFLOPS for HPC and 18.4 FP8 ExaFLOPS for AI.
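The component counts above follow directly from the node configuration. A minimal sanity check, assuming the figures stated in the article (576 DGX H100 nodes, each with eight GPUs and two Xeon CPUs):

```python
# Back-of-the-envelope check of Eos's published component counts.
# Figures are from the article: 576 DGX H100 nodes, 8 H100 GPUs
# and 2 Xeon Platinum 8480C CPUs per node.
NODES = 576
GPUS_PER_NODE = 8
CPUS_PER_NODE = 2

total_gpus = NODES * GPUS_PER_NODE  # total H100 GPUs in the system
total_cpus = NODES * CPUS_PER_NODE  # total Xeon CPUs in the system

print(f"{total_gpus:,} GPUs, {total_cpus:,} CPUs")  # 4,608 GPUs, 1,152 CPUs
```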

The design of Eos (which relies on the DGX SuperPOD architecture) is purpose-built for AI workloads and scalability: it uses Nvidia's Mellanox Quantum-2 InfiniBand with In-Network Computing technology, offering data transfer speeds of up to 400 Gb/s, which is crucial for training large AI models effectively and for scaling out.

In addition to powerful hardware, Nvidia's Eos also comes with potent software, again purpose-built for AI development and deployment, the company says. As a result, Eos can address a variety of applications, from ChatGPT-like generative AI to "AI factory" workloads.

"Eos has an integrated software stack that includes AI development and deployment software, [including] orchestration and cluster management, accelerated compute storage and network libraries, and an operating system optimized for AI workloads," Nvidia said in the video. "Eos — built from the knowledge gained with prior Nvidia DGX supercomputers such as Saturn 5 and Selene — is the latest example of Nvidia AI expertise in action. […] By creating an AI factory like Eos, enterprises can take on their most demanding projects and achieve their AI aspirations today and into the future."

We don't know how much Eos costs, and it doesn't help that pricing of Nvidia's DGX H100 systems is confidential and dependent on many factors, such as volumes. That said, given that each Nvidia H100 can cost $30,000 to $40,000 depending on volume, the GPUs alone would put the bill somewhere between roughly $138 million and $184 million, before counting CPUs, networking, storage, and software.
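The rough GPU-only estimate can be sketched as follows; the per-unit price range is the article's assumption, not a confirmed figure, since DGX pricing is confidential:

```python
# Rough GPU-only cost estimate for Eos, assuming the article's
# $30,000-$40,000 per-H100 street-price range (an assumption;
# actual DGX system pricing is confidential).
GPU_COUNT = 4_608
PRICE_LOW, PRICE_HIGH = 30_000, 40_000

cost_low = GPU_COUNT * PRICE_LOW    # lower bound, GPUs only
cost_high = GPU_COUNT * PRICE_HIGH  # upper bound, GPUs only

print(f"${cost_low:,} - ${cost_high:,}")  # $138,240,000 - $184,320,000
```

Note this excludes the 1,152 Xeon CPUs, the Quantum-2 InfiniBand fabric, storage, and software, so the real system cost would be considerably higher.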

Anton Shilov
Freelance News Writer

Anton Shilov is a Freelance News Writer at Tom’s Hardware US. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers, and from modern process technologies and the latest fab tools to high-tech industry trends.

  • Amdlova
    Four times more support?
    Reply
  • ngilbert
    I would not bet on this system being the top for AI processing. It may have 4608 H100 chips - but Microsoft's Eagle system at #3 has 14,400 H100s...
    Reply
  • vanadiel007
    What is the power connector to the cards?
    Reply
  • Amdlova
    vanadiel007 said:
    What is the power connector to the cards?
    A small nuclear reactor plant with direct connection.
    Reply
  • Integr8d
    It syphons power directly from the sun.
    Reply
  • gg83
    $180 million in gpu power. Nvidia will undercut the supercomputer companies. They can get the h100's for a much cheaper price! Lol.
    Reply
  • vanadiel007
    It's a smart sales technique. Once AI has passed they will repurpose them for crypto mining, keeping the sales of expensive hardware way higher than it should be.

    Our next GPU's will cost even more than the current generation.
    Reply