When Intel announced earlier this year that its 7nm process technology would be delayed, the news had implications for Aurora, the first Intel-based exascale supercomputer. There was no clear answer back in July, but an official for the U.S. Department of Energy’s (DoE) Office of Science confirmed this week that the system will be delayed.
As reported by HPCwire, the DoE does not see this as a major problem, noting that the Argonne National Laboratory, Aurora’s operator, has a contingency plan in place.
Aurora Supercomputer Delayed
“Yes, we have indications that the Aurora system will be delayed,” said Barb Helland, associate director for Advanced Scientific Computing Research (ASCR) at the DoE’s Office of Science. Helland added that Argonne is cooperating with Intel to “mitigate the consequences not only to Argonne, but to the Exascale Computing Project and to the nation’s high-performance computing users.”
“It’s not unexpected that when we’re entering into contracts for the most advanced supercomputers in the world, four to five years before they’re deployed, that there will be some schedule delays,” said Helland. “For that reason, we build both cost and schedule contingencies into our project budgets.”
The Aurora supercomputer is based on Intel’s next-generation Xeon processor codenamed Sapphire Rapids running the Golden Cove microarchitecture, as well as the company’s first datacenter GPU codenamed Ponte Vecchio, which is powered by the Xe high-performance computing (HPC) architecture.
Sapphire Rapids is made using Intel’s 10nm Enhanced SuperFin process technology, which is expected to be on track for mass production in 2021. Meanwhile, Intel’s Xe-HPC Ponte Vecchio GPU is a multi-tile chiplet design that combines a base tile produced using Intel’s 10nm SuperFin fabrication technology, an Xe-Link I/O tile made by a foundry, a Rambo Cache tile fabbed on the 10nm Enhanced SuperFin process, and a Compute Tile that was supposed to use Intel’s 7nm node, which was delayed by about six months. Last month, Intel revealed that the Compute Tile could be made both at an external foundry and internally.
Intel says that it has always envisioned Ponte Vecchio as a multi-chiplet product with tiles coming from various sources. Making a key tile at an external foundry is not a problem per se, but tailoring the design’s thermals, voltages and packaging to other parts will take some time. Intel’s Ponte Vecchio will be used outside of Aurora, so it makes sense for Intel to eventually produce its main Compute Tile at its own fabs, but this means that there will be two versions of the Xe-HPC Ponte Vecchio GPU.
Each Aurora blade features two Intel Xeon Scalable "Sapphire Rapids" processors, as well as six Intel Xe-HPC "Ponte Vecchio" GPUs. That means volume production of Intel's datacenter graphics chips is crucial to enable Aurora.
The First Exascale Supercomputers
So far, the U.S. DoE has revealed three exascale-class supercomputers. Argonne National Laboratory’s Aurora was the first system, announced in March 2019, and is expected to deliver over 1 ExaFLOPS of performance.
Oak Ridge National Lab’s Frontier supercomputer, powered by AMD’s Epyc ‘Milan’ processors and Radeon Instinct MI200 graphics, was unveiled in May 2019 and is on track to deliver 1.5 ExaFLOPS of performance in 2021.
This March, the DoE announced Lawrence Livermore Lab’s El Capitan system that is set to hit 2 ExaFLOPS in 2023 using AMD’s Epyc "Genoa" CPUs and AMD CDNA GPUs.
All three systems use HP Enterprise's Cray EX architecture, so they will have many things in common. Aurora is the only Intel-powered supercomputer out of the three.
However, we don't know when Aurora will arrive, and the supercomputer has already faced a major setback. The system was first announced in 2015 and was described as an Intel Xeon Phi "Knights Hill"-powered 180 PetaFLOPS supercomputer due in 2018. Since Intel cancelled Knights Hill in 2017, the original Aurora project was pushed back too.