Late last year, Intel formally confirmed that select 4th Generation Xeon Scalable 'Sapphire Rapids' processors will feature on-package HBM memory, but the company had never demonstrated an actual CPU equipped with HBM or revealed its DRAM configuration. At the International Symposium on Microelectronics hosted by IMAPS earlier this week, the company finally showcased the processor with HBM and confirmed its multi-chiplet design.
While Intel has confirmed on numerous occasions that Sapphire Rapids processors will support both HBM (presumably HBM2E) and DDR5 memory, and will be able to use HBM either with or without main DDR5 memory, it had never showcased an actual HBM-equipped CPU until this week (thanks to the picture published by Tom Wassick/@wassickt).
As it turns out, each of the four Sapphire Rapids chiplets carries two HBM memory stacks, each with its own 1024-bit interface (i.e., a 2048-bit memory bus per chiplet). Formally, JEDEC's HBM2E specification tops out at a 3.2 GT/s data transfer rate, but last year SK Hynix started to mass-produce 16GB 1024-pin known-good stacked dies (KGSDs) rated for 3.6 GT/s operation.
If Intel opts to use such KGSDs, HBM2E memory will provide the Sapphire Rapids CPU with a whopping 3.68 TB/s of peak memory bandwidth (or 921.6 GB/s per chiplet), but with only 128GB of capacity. By contrast, SPR's eight DDR5-4800 memory channels, supporting one module per channel and offering 307.2 GB/s, will support at least 4TB of memory using Samsung's recently announced 512GB DDR5 RDIMM modules.
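Those peak-bandwidth figures can be sanity-checked with some simple arithmetic. The sketch below assumes eight HBM2E stacks (two per chiplet across four chiplets) at SK Hynix's 3.6 GT/s rating and eight 64-bit DDR5-4800 channels; these are the numbers quoted above, not confirmed Intel specifications:

```python
# Back-of-the-envelope check of the bandwidth figures in the article.
HBM_STACKS = 8          # 2 stacks per chiplet x 4 chiplets (assumed)
HBM_BUS_BITS = 1024     # interface width per HBM stack
HBM_RATE_GTS = 3.6      # GT/s per pin (SK Hynix's fastest HBM2E KGSDs)

# Per-stack bandwidth: rate * bus width, converted from bits to bytes.
hbm_per_stack_gbs = HBM_RATE_GTS * HBM_BUS_BITS / 8   # 460.8 GB/s
hbm_per_chiplet_gbs = hbm_per_stack_gbs * 2           # 921.6 GB/s
hbm_total_gbs = hbm_per_stack_gbs * HBM_STACKS        # 3686.4 GB/s ~= 3.68 TB/s

DDR5_CHANNELS = 8
DDR5_RATE_MTS = 4800    # MT/s per channel
DDR5_BUS_BYTES = 8      # 64-bit channel = 8 bytes per transfer

ddr5_total_gbs = DDR5_CHANNELS * DDR5_RATE_MTS * DDR5_BUS_BYTES / 1000  # 307.2 GB/s

print(f"HBM2E total:   {hbm_total_gbs:.1f} GB/s")
print(f"Per chiplet:   {hbm_per_chiplet_gbs:.1f} GB/s")
print(f"DDR5-4800 x8:  {ddr5_total_gbs:.1f} GB/s")
```

The roughly 12x gap between the two subsystems is what makes the HBM-plus-DDR5 hybrid arrangement attractive: bandwidth from the on-package stacks, capacity from the DIMMs.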
It is also noteworthy that the HBM-equipped Sapphire Rapids comes in a large BGA form factor and will be soldered directly to the motherboard. This is not particularly surprising, as Intel's LGA4677 package is pretty narrow and does not have enough space for HBM stacks.
Furthermore, processors that require a very high-performance memory subsystem like HBM tend to pack loads of cores running at high clocks and carry a very high TDP. Keeping in mind that HBM stacks are also power-hungry, it may not be easy to develop a socket that could feed an HBM-equipped beast. Therefore, it looks like HBM-equipped SPRs will only be offered to select clients (just like Intel's Xeon Scalable 9200 CPUs with up to 56 cores) and will mostly be aimed at supercomputers.
Another thing to note is that the SPR chiplets in the image are rectangular rather than square (as on early images of Sapphire Rapids in LGA4677 packaging). The author of the image said that it comes from an Intel chart "given by an Intel employee and labeled SPR, and verbally noted as Sapphire Rapids." That said, it looks like the HBM-supporting Sapphire Rapids CPU might have a different chiplet configuration than regular SPR processors (at the end of the day, regular Xeon Scalable CPUs do not need an HBM interface that takes up die space).
Intel's Sapphire Rapids processors will feature a host of new technologies, including PCIe Gen 5 support with CXL 1.1 protocol for accelerators on top, a hybrid memory subsystem supporting DDR5 and HBM, Intel’s Advanced Matrix Extensions (AMX) as well as AVX512_BF16 and AVX512_VP2INTERSECT instructions designed for datacenter and supercomputer workloads, and Intel's Data Streaming Accelerator (DSA) technology, just to name a few.
Earlier this year we learned that Intel's Sapphire Rapids uses a multi-chip package with EMIB interconnects between the dies, unlike its monolithic predecessors. While the number of cores depends on yield and power (some reports indicate that SPR will feature up to 56 active cores, but the actual chiplets may carry as many as 80 cores), it is evident that the 4th Generation Xeon Scalable will be the first to use Intel's latest packaging technologies and design paradigm.
The question is who would still be willing to integrate these into their supercomputers after the fiasco with Aurora. Which is still an ongoing fiasco...
Intel already said that Aurora is using these.
6x 9k units of Ponte Vecchio GPUs may be the hold-up now. Who else is doing 47 tile GPUs?
Maybe we'll see a demo in a couple of weeks.
The stock market said otherwise;)
Intel also hasn't shown their Rambo Cache details yet. I saw an Intel patent application with the whole Foveros base layer of a GPU populated with a grid of SRAM chiplets, so Intel has some show-and-tell coming.
And they also gained a lot of credibility with OEMs and "the common man" by being able to supply the whole market as if nothing had happened, without raising prices at all or reducing value.
Having a business is all about weighing the pros and cons of everything and going with what you believe will benefit your company the most.
One of their big problems was with their process node; yields were as crappy as they can get, so those wafers weren't reallocated for other purposes.
And don't talk like they were able to supply, because the last earnings reports clearly showed that growth was catastrophic considering we were, and still are, in a mega semiconductor cycle.
So supply is one thing, but apparently the demand for their products was not there, because everybody else benefited much more from the cycle than Intel did, even though they were less capacity-strained. Think about that instead.