Intel on Thursday announced one of its next major steps in supercomputing platforms: the design codenamed Falcon Shores, which will bring general-purpose x86 processor cores and highly parallel Xe-HPC compute GPU cores together into one socket, along with shared high-bandwidth memory developed by Intel. The product is expected to arrive in 2024.
Bringing x86 CPU and Xe-HPC GPU resources together into the same socket and switching to a unified memory architecture will enable Intel to increase per-socket compute density by over five times compared to current platforms (due to new architectures, finer process technologies, and the addition of GPU cores), boost memory capacity and bandwidth by more than five times compared to existing designs, and raise performance per watt by over five times relative to platforms available in February 2022.
"We are working on a brand-new architecture codenamed Falcon Shores, [which] will bring x86 and Xe GPU acceleration together into a Xeon socket, taking advantage of next generation packaging, memory and IO technologies, giving huge performance and efficiency improvements for systems computing large data sets and training gigantic AI models," said Raja Koduri, the head of Intel's Accelerated Computing Systems and Graphics Group, just hours before the company's Investor Meeting 2022 kicked off.
For now, Intel calls Falcon Shores an XPU to emphasize that it packs two types of compute units: CPUs and GPUs. Falcon Shores will make extensive use of Intel's multi-chiplet, modular approach to processor design and will offer a flexible ratio of x86 and Xe-HPC cores, possibly depending on the target application. The CPU and GPU tiles are said to be made using an 'Angstrom Era Process', which might mean Intel 20A or Intel 18A, and then connected using Intel's advanced packaging technologies, which will be the key enablers for Falcon Shores. The CPU and GPU will use unified high-bandwidth memory to improve performance and greatly simplify compute GPU programming. Interestingly, Intel even hints at an all-new type of memory developed in-house, but does not elaborate.
"Falcon Shores is built on top of an impressive array of technologies, including an angstrom era process technology, next generation packaging, new extreme bandwidth shared memory being developed by Intel and industry leading I/O," said Koduri. "We are super excited about this architecture as it brings acceleration to much broader range of workloads than the current discrete solutions."
Modern supercomputers use general-purpose CPUs to run workloads that require strong single-thread performance, as well as compute GPU accelerators for highly parallel workloads. So far, this architecture has proven to be balanced in terms of performance, power, and cost, but tighter integration of CPU and GPU resources will allow performance to be boosted even further and will make accelerated computing accessible to more workloads.
Falcon Shores will be one of Intel's major steps towards its goal of enabling ZettaFLOPS-class supercomputers by 2027. To increase the performance of supercomputers by 1000 times in five years, Intel says it will need new processing architectures (i.e., improvements of x86 and Xe architectures), new process technologies and advanced packaging methods, faster memory and I/O interfaces (unified extreme bandwidth memory and advanced interconnections/packaging seem to address that), and new system architectures. Falcon Shores brings together all of the pillars that are required for ZettaFLOPS-class supercomputers, the company says.
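To put the 1000-times-in-five-years target in perspective, the short sketch below works out what it would imply if the gain were spread evenly across the period. Constant compound growth is an assumption made here for illustration, not something Intel has stated, and the function names are hypothetical.

```python
import math


def annual_factor(total_gain: float, years: float) -> float:
    """Per-year multiplier needed to reach total_gain after `years`,
    assuming the same growth factor every year (an illustrative assumption)."""
    return total_gain ** (1.0 / years)


def doubling_time_months(total_gain: float, years: float) -> float:
    """Months between performance doublings under the same assumption."""
    doublings = math.log2(total_gain)  # number of doublings in the whole period
    return 12.0 * years / doublings


if __name__ == "__main__":
    gain, years = 1000.0, 5.0  # figures taken from the article
    print(f"Required yearly gain: {annual_factor(gain, years):.2f}x")
    print(f"Doubling time: {doubling_time_months(gain, years):.1f} months")
```

Under that assumption, performance would have to roughly quadruple every year, or double about every six months.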
"Beginning with the technology foundation we have today, you need significant revolutionary gains in architecture, in power efficiency and thermal management, in process and packaging technology and in memory and IO capacity and bandwidth," said Koduri. "We have our advanced technology teams already on their way with inventions to pave the path to Zetta-Scale."