NEC supercomputer combines Intel Xeon CPUs and AMD Instinct accelerators to nearly triple performance

(Image credit: AMD)

NEC this week announced that it had been selected to develop a next-generation supercomputer for Japan's National Institutes for Quantum Science and Technology (QST). The machine will use Intel's Xeon 6900P processors and AMD's Instinct MI300A accelerators, and will offer performance of around 40 PetaFLOPS. It will be mainly tasked with advancing nuclear fusion research.

The system is set to be installed at QST's Rokkasho Institute for Fusion Energy in Aomori, Japan. It will include 360 NEC LX 204Bin-3 units powered by 720 Intel Xeon 6900P processors equipped with MRDIMM DDR5 memory, plus 70 NEC LX 401Bax-3GA units with AMD Instinct MI300A GPUs, for a combined theoretical performance of 40.4 PetaFLOPS. Both the CPU and GPU nodes will use Giga Computing-developed liquid cooling to ensure consistent performance and high reliability.

"By integrating the Intel Xeon 6900P Series, the first server CPU to support MRDIMMs, and the first in Japan, we are delivering a leap in memory performance and bandwidth, an ideal choice for complex calculations and simulations required in fusion research," said Ogi Brkic, Vice President & General Manager, Go-To-Market Builders & Technology Acceleration Office, Sales & Marketing Group, Intel.

For storage, the supercomputer will feature DDN's ES400NVX2 solution, which has a total capacity of 42.2 PB and features the Lustre-based EXAScaler file system. As for network infrastructure, the machine will use an InfiniBand setup with Nvidia's QM9700 switches. On the software side, the machine will use Altair PBS Professional software for workload management, with a scheduler optimized for AMD's Instinct MI300A accelerators.
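The article doesn't describe the job-submission setup beyond naming PBS Professional, but for readers unfamiliar with it, a batch job on such a system is typically requested with a script of `#PBS` directives. This is an illustrative sketch only: the queue name, resource chunk definitions, and the simulation binary are all hypothetical, and the real resource names on the QST machine would be site-specific.

```shell
#!/bin/bash
#PBS -N fusion_sim                 # job name (hypothetical)
#PBS -q gpu                        # hypothetical queue for the MI300A nodes
#PBS -l select=4:ngpus=1          # request 4 resource chunks, 1 accelerator each
#PBS -l walltime=12:00:00          # maximum run time
#PBS -j oe                         # merge stdout and stderr into one log

# PBS starts the job in the home directory; return to where qsub was run
cd "$PBS_O_WORKDIR"

# Launch an MPI-based simulation across the allocated nodes
# (plasma_sim and its arguments are placeholders)
mpirun ./plasma_sim --input case.cfg
```

The script would be submitted with `qsub fusion_sim.pbs`, after which the scheduler queues it until the requested nodes are free.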

"We appreciate that NEC has selected AMD's Instinct MI300A and this choice is further proof that AMD's MI300 Series GPUs offer a compelling supercomputer accelerator solution," said Jon Robottom, Corporate Vice President, Representative Director & General Manager, AMD Japan. "We believe that AMD innovation, together with NEC's advanced technological capabilities, will continue to make a significant contribution to research conducted at the QST and NIFS."

A performance level of 40.4 PetaFLOPS is roughly 2.7 times the combined performance of the two current systems at QST and the National Institute for Fusion Science (NIFS), which will provide a significant boost for supercomputer-based simulations for fusion science research as well as AI and Big Data applications.

The new supercomputer is set to be operational starting from July 2025.

Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.

  • Pierce2623
Damn, they picked the wrong CPUs AND the wrong GPUs. They need AMD CPUs and Nvidia GPUs. All AMD would be OK in a pinch, I guess.
    Reply
  • Paspanuki
    Pierce2623 said:
Damn, they picked the wrong CPUs AND the wrong GPUs. They need AMD CPUs and Nvidia GPUs. All AMD would be OK in a pinch, I guess.
Just hung out with the NEC CEO; engineers will call you next time to make sure they get it right
    Reply
  • Pierce2623
    Paspanuki said:
Just hung out with the NEC CEO; engineers will call you next time to make sure they get it right
They don’t need to call me. They just need to check out some benchmarks. SPEC would probably be fairly appropriate if they went with the more scientific test suites.
    Reply
  • bit_user
    Pierce2623 said:
Damn, they picked the wrong CPUs AND the wrong GPUs. They need AMD CPUs and Nvidia GPUs. All AMD would be OK in a pinch, I guess.
    What surprised me a little was their use of the MI300A, instead of MI300X. Don't forget that their decision to use AMD could've been influenced by the wait times or costs associated with using Nvidia.

    As for the choice of CPU, I wonder whether AMX played into it. Could it really have been as simple as the memory bandwidth advantage of MRDIMMs?
    Reply
  • Pierce2623
    bit_user said:
    What surprised me a little was their use of the MI300A, instead of MI300X. Don't forget that their decision to use AMD could've been influenced by the wait times or costs associated with using Nvidia.

    As for the choice of CPU, I wonder whether AMX played into it. Could it really have been as simple as the memory bandwidth advantage of MRDIMMs?
The wait times on Nvidia are a good point. Yeah, I definitely would have chosen the 300X over the 300A, but they may be working with a more limited budget than these supercomputers normally have.
    Reply
  • bit_user
    Pierce2623 said:
The wait times on Nvidia are a good point. Yeah, I definitely would have chosen the 300X over the 300A, but they may be working with a more limited budget than these supercomputers normally have.
Or, they believe there's value in having those integrated CPU cores for whatever they're planning to use it for. If you have some processing stages that don't run well on a GPU, it's a much better option to have integrated CPU cores than shipping the data back to the host CPU, where the bus and host memory can be a bottleneck.
    Reply
  • Pierce2623
    bit_user said:
Or, they believe there's value in having those integrated CPU cores for whatever they're planning to use it for. If you have some processing stages that don't run well on a GPU, it's a much better option to have integrated CPU cores than shipping the data back to the host CPU, where the bus and host memory can be a bottleneck.
Is the latency on the APU version really that much better, though? It’s also just weird to use APUs as your primary GPU with Xeon CPUs already installed. The whole thing is weird and seemingly very sub-optimal. The MI300A comes with MORE than enough CPU cores to keep its GPU fully fed.
    Reply
  • bit_user
    Pierce2623 said:
Is the latency on the APU version really that much better, though?
    It's not about latency so much as bottlenecks, I'm sure.

    Pierce2623 said:
    It’s also just weird to use APUs as your primary GPU with Xeon CPUs already installed.
Depends on the balance of processing types. Nvidia can pair up to 72 CPU cores with each of their GPUs. MI300A only has 24 Zen 4 cores. While Zen 4 cores seem a bit more powerful than Neoverse V2 cores, it's not that lopsided.

    Pierce2623 said:
    The whole thing is weird and seemingly very sub-optimal . The MI300a comes with MORE than enough CPU cores to keep its GPU fully fed.
    Yeah, but probably they're not running the same OS instance on it? I think no OS kernel would run across different brands of CPUs like that.

    In this case, what they're probably doing is using the host CPU to run the host software, but using the MI300A's embedded CPU cores to run custom data plane logic.
    Reply
  • P.Amini
    Pierce2623 said:
Is the latency on the APU version really that much better, though? It’s also just weird to use APUs as your primary GPU with Xeon CPUs already installed. The whole thing is weird and seemingly very sub-optimal. The MI300A comes with MORE than enough CPU cores to keep its GPU fully fed.
    It seems this Japanese giant likes sub-optimal things, or maybe they don't have a clue what they are doing?!
    Reply
  • dalek1234
Why have they chosen Xeons? They've heard of benchmarks and can read specs, no?

Maybe the person in charge of selecting hardware is one of those people who follows the "Nobody ever got fired for buying Intel" saying. Somebody needs to tell them that outdated motto needs reversing.
    Reply