AMD’s MI300 APUs Power Exascale 'El Capitan' Supercomputer

El Capitan
(Image credit: DOE)

We already knew AMD would power what is slated to become the world’s fastest supercomputer - the US Department of Energy (DOE) El Capitan. Expected to be installed in 2023 at the Lawrence Livermore National Laboratory (LLNL), the HPE-built system was initially announced with AMD’s Zen 4 CPU cores and Instinct GPU accelerators, promising unheard-of performance above the 2-exaflop mark. Yet there’s something that the initial announcement didn’t say: the system won’t be using discrete CPUs and GPU accelerators. Instead, confirming our speculation, El Capitan will be built on AMD’s recently announced MI300 Accelerated Processing Units (APUs). It marks the first time an APU serves as a supercomputer’s main processing engine - and at exascale, no less.

“It’s the first time we’ve publicly stated this,” said Terri Quinn, associate director for HPC (High Performance Computing) at LLNL. Presenting today at the 79th HPC User Forum at Oak Ridge National Laboratory (ORNL), she added that the information came straight from the source: “I cut these words out of [AMD’s] investors document, and that’s what it says: it’s a 3D chiplet design with AMD CDNA3 GPUs, Zen 4 CPUs, cache memory and HBM chiplets.”

AMD’s MI300 APUs will feature CPU and GPU chiplets in the same 3D-enabled package with a coherent HBM3 memory architecture, powered by the company’s 4th-generation Infinity Fabric and next-generation Infinity Cache. Combining Zen 4 CPU cores with the CDNA 3 GPU compute architecture, the MI300 APUs will be built on TSMC’s 5nm process technology (likely N5 or N5P). However, the balance of CPU and GPU cores per APU is still anyone’s guess.

Because it is built on APUs, El Capitan will benefit from what’s likely to be the densest performance profile ever achieved in the world of supercomputing. Make no mistake: El Capitan will represent the pinnacle of semiconductor performance, design, and integration. It’s not hyperbolic to say that it’s likely to be one of humanity’s most technologically complex endeavors.

It is all thanks to tightly packaged AMD APUs, bundled into HPE Cray EX racks and tied together with Cray’s Slingshot-11 networking, powered by its 16-nanometer Rosetta controllers that can dish out 200 Gb/sec interconnects. The form factor and the number of accelerators per rack are still a question mark. El Capitan should also become one of the most energy-efficient systems (if not the most efficient), with operating power limited to 40 MW for an optimal performance/power balance. Workloads will run through El Capitan’s circuits starting in 2Q 2024, with the planned end of life set for 2030.
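For a rough sense of what those numbers imply, here is a back-of-the-envelope sketch that divides the performance target by the power cap. It assumes exactly 2 exaflops of peak performance (the article only says "above" that mark) and the full 40 MW draw, so treat the result as an approximate illustration rather than a measured or official efficiency figure. The function and constant names are ours, chosen for illustration.

```python
# Back-of-the-envelope efficiency from the two figures quoted in the article.
# Assumption: peak performance is taken as exactly 2 exaflops; the article
# only states "above the 2 Exaflop mark".
PEAK_FLOPS = 2e18    # 2 exaflops, in floating-point operations per second
POWER_WATTS = 40e6   # 40 MW operating power limit


def gflops_per_watt(flops: float, watts: float) -> float:
    """Return efficiency in gigaflops per watt."""
    return flops / watts / 1e9


efficiency = gflops_per_watt(PEAK_FLOPS, POWER_WATTS)
print(f"{efficiency:.0f} GFLOPS/W")
```

Under those assumptions the system would land at about 50 gigaflops per watt. Real-world efficiency depends on sustained rather than peak performance and on actual power draw, so the delivered figure could differ.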

AMD’s advance into the Top500 list of the world’s most powerful supercomputers continues at a breakneck pace. The company is steamrolling Intel’s previous dominance, already scoring five out of the world’s top ten supercomputers - including first place, thanks to Frontier - against Intel’s single Xeon-based system powering China’s Tianhe-2A, currently ranking ninth. The company has come a long way from its infamous and nearly company-breaking Steamroller architecture family.

Not all news is bad news for Intel, however, as the company has also earned an Exascale contract with the Argonne National Laboratory. The Aurora supercomputer will likewise be a 2-exaflop HPE-Intel system, and it has already undergone several revisions. Aurora’s installation is already underway, although the exact date it enters operation is still unclear. Intel’s delays on its Sapphire Rapids CPUs have already pushed back the supercomputer’s installation, so it remains to be seen how long the execution will take.

Nvidia, too, has a relevant presence in the world’s top-performing systems, although it currently only operates in the GPU provider space, scoring three systems powered by its GPUs. But it recently won an important contract as a provider of both CPUs and GPUs for MareNostrum 5, to be installed in the Barcelona Supercomputing Centre (BSC) in Spain. Operation could commence as early as 2023.

Sadly, Nvidia has already taken the “Superchip” nomenclature with its Arm-based Grace CPU product for High-Performance Computing (HPC) deployments. So perhaps AMD should be looking to claim an “Überchip” already?

Francisco Pires
Freelance News Writer

Francisco Pires is a freelance news writer for Tom's Hardware with a soft side for quantum computing.

  • KananX
    Wow, APUs have come a long long way since the early days with A8 3870 APUs. They were kinda belittled always, but are taken serious more and more, and these supercomputer APUs are just monsters.
  • renz496
    KananX said:
    Wow, APUs have come a long long way since the early days with A8 3870 APUs. They were kinda belittled always, but are taken serious more and more, and these supercomputer APUs are just monsters.
    build in a custom way they can be very powerful. the problem is people are expecting APU to do miracle things on regular PC where it cannot be build in a way to make it more powerful like in console and this upcoming APU.
  • KananX
    renz496 said:
    build in a custom way they can be very powerful. the problem is people are expecting APU to do miracle things on regular PC where it cannot be build in a way to make it more powerful like in console and this upcoming APU.
    That is true, however soon we will have RDNA3 APUs with DDR5, this will be much much more potent than current APUs and actually fast enough for 1080p gaming.
  • renz496
    KananX said:
    That is true, however soon we will have RDNA3 APUs with DDR5, this will be much much more potent than current APUs and actually fast enough for 1080p gaming.

    except people are dreaming of PC APU having the performance for something like RX5700/RX6600.
  • fball922
    KananX said:
    That is true, however soon we will have RDNA3 APUs with DDR5, this will be much much more potent than current APUs and actually fast enough for 1080p gaming.
    Yeah but you know the OEMs will STILL be pairing it with single channel memory... :ROFLMAO::disrelieved:
  • jeremyj_83
    renz496 said:
    except people are dreaming of PC APU having the performance for something like RX5700/RX6600.
    The only time I can remember an APU being really custom was that Intel CPU with AMD GPU with HBM combo for the NUC. It never took off and was abandoned after the first generation. IIRC the GPU performance was that of a 1660.
  • KananX
    renz496 said:
    except people are dreaming of PC APU having the performance for something like RX5700/RX6600.
    Next I say it will have the performance of 6600 and then you reply “except people are dreaming of having performance of 6800 XT in APUs” there’s no winning with people who want to argue and be edge lords. Bye
  • KananX
    fball922 said:
    Yeah but you know the OEMs will STILL be pairing it with single channel memory... :ROFLMAO::disrelieved:
    Yea, well don’t buy prebuilt pcs…
    jeremyj_83 said:
    The only time I can remember an APU being really custom was that Intel CPU with AMD GPU with HBM combo for the NUC. It never took off and was abandoned after the first generation. IIRC the GPU performance was that of a 1660.
    It wasn’t that fast, maybe in its dreams. Yea driver situation with that APU was a disaster, if you can call intel+Radeon a APU.