AMD EPYC Milan-X Leaks With 64 Cores, 3D V-Cache Rumored

Edit 8/25/2021 7pm pt: Correction made to reflect that the chips are predicted to use 3D V-Cache, not HBM.

Original Article: AMD released the EPYC 7003-series (Milan) processors for data centers a little over five months ago. However, the chipmaker is reportedly preparing a Milan overhaul that features 3D die stacking.

It's still unknown what kind of memory AMD will stack on Milan-X. However, the general consensus is that AMD will tap its new 3D V-Cache technology. This technique, as AMD demoed earlier this year, stacks additional SRAM on top of the chip to boost L3 cache capacity. We don't know exactly how much bandwidth or capacity Milan-X will offer, since AMD hasn't revealed the design to the public yet.

AMD EPYC Milan-X Specifications*


Processor	Cores / Threads	OPN
EPYC 7773X	64 / 128	100-000000504
EPYC 7573X	32 / 64	100-000000506
EPYC 7473X	24 / 48	100-000000507
EPYC 7373X	16 / 32	100-000000508

*Specifications are unconfirmed.

Hardware detective momomo_us has purportedly uncovered the specifications for Milan-X. The OPNs (Ordering Part Numbers) can be found in AMD's Product Master document, so we know that the product numbers are at least legit, though the jury is still out about whether they match the upcoming Milan-X variants. According to leaked information, Milan-X will in all likelihood launch under the EPYC 7003X-series branding, which makes sense since Milan debuted under the 7003-series moniker.

The EPYC 7773X appears to be the flagship SKU for the Milan-X lineup. It's likely the Milan-X counterpart to the existing EPYC 7763. By the same logic, the EPYC 7573X, 7473X and 7373X are probably equivalent to the EPYC 7543, 7443 and 7343, respectively. As per momomo_us' tip, Milan-X and Milan share the same core configurations, as previously speculated. The clock speeds, on the other hand, remain a mystery. We did notice that there weren't any mentions of 56-core, 48-core, 28-core or 8-core models for Milan-X. Nevertheless, it's plausible that momomo_us hasn't found them yet, or that AMD is strategically targeting certain price bands for what will likely be very pricey high-performance chips.

Given the design, Milan-X may be a stopgap response to Intel's forthcoming Sapphire Rapids with HBM. EPYC Genoa is AMD's hard hitter and the true rival to Sapphire Rapids. However, Genoa isn't expected to arrive until 2022.


Zhiye Liu is a news editor, memory reviewer, and SSD tester at Tom’s Hardware. Although he loves everything that’s hardware, he has a soft spot for CPUs, GPUs, and RAM.

4 Comments from the forums

TechLurker

Now if AMD is able to include HBM with their top APUs on Desktop and Mobile, they'll really have something pretty vicious and effective. Use that HBM with the iGPU while gaming to reduce latency, or shift to using it for extra onboard cache before resorting to using the RAM. Even just 4GB would be pretty useful.

Assuming their future TRs are also going to be APUs, a dedicated stack of this could result in an interesting "Super" APU, having say, 16GB of HBM for use with the iGPU when gaming, or split to 8/8 between CPU and iGPU, or shifting to 16GB for number crunching while the iGPU idles or shuts off. An 8C TR with a strong iGPU and some HBM would really make for a next-gen compact build. No need for a dGPU, just some DDR5 RAM, and using all the PCIe lanes on NVMe storage.
MajorPotato

TechLurker said:
Now if AMD is able to include HBM with their top APUs on Desktop and Mobile, they'll really have something pretty vicious and effective. Use that HBM with the iGPU while gaming to reduce latency, or shift to using it for extra onboard cache before resorting to using the RAM. Even just 4GB would be pretty useful.

Assuming their future TRs are also going to be APUs, a dedicated stack of this could result in an interesting "Super" APU, having say, 16GB of HBM for use with the iGPU when gaming, or split to 8/8 between CPU and iGPU, or shifting to 16GB for number crunching while the iGPU idles or shuts off. An 8C TR with a strong iGPU and some HBM would really make for a next-gen compact build. No need for a dGPU, just some DDR5 RAM, and using all the PCIe lanes on NVMe storage.

That's stupid. The package size would have to be almost the size of mini ITX, in which case there's no point in having it as one package, and that's why computers are as they are. You can't just chuck faster memory at a processor and expect it to run faster; in fact, current desktop CPUs can't keep up with DDR4, and DDR5 is just around the corner. For the CPU or iGPU to be worthy of HBM, they would need to be high performance and have a large die size, which means more heat and more distance between stuff. SRAM is lower latency than HBM, so using HBM would increase latency. Cache isn't a synonym or replacement for RAM; the cache merely stores stuff that is in RAM, so a larger cache doesn't reduce RAM usage. The reason it's done that way is latency: if you had a cache as an extension of RAM, you would have to wait to move things back to RAM before clearing the cache, and that would be very slow.

PCIe bandwidth is a non-issue; there is no consumer hardware that can saturate PCIe 4 x16. PCIe 4 is 2GB/s per lane, 32GB/s (256Gbps) for a 16-lane slot. Also, PCIe 5 is around the corner, which doubles the bandwidth, probably driven by datacentre networking requirements, with 400GbE interfaces becoming available.
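
The slot bandwidth figures in this comment can be checked with a few lines of arithmetic. The sketch below assumes the rounded figure of 2GB/s per PCIe 4 lane quoted above (the exact rate is slightly lower once encoding overhead is counted) and a straight doubling for PCIe 5. The names `GB_PER_S_PER_LANE` and `slot_bandwidth` are illustrative only.

```python
# Approximate per-lane throughput in GB/s, using the rounded figures
# from the comment. These are assumptions for illustration, not exact
# specification values.
GB_PER_S_PER_LANE = {4: 2.0, 5: 4.0}


def slot_bandwidth(gen: int, lanes: int = 16) -> tuple[float, float]:
    """Return (GB/s, Gbps) for a PCIe slot of the given generation and width."""
    gbytes = GB_PER_S_PER_LANE[gen] * lanes
    return gbytes, gbytes * 8  # 8 bits per byte


for gen in (4, 5):
    gbytes, gbits = slot_bandwidth(gen)
    print(f"PCIe {gen} x16: {gbytes:.0f} GB/s ({gbits:.0f} Gbps)")
```

Under these rounded numbers, a single 400GbE port (400Gbps) would exceed what a PCIe 4 x16 slot can carry but fit within a PCIe 5 x16 slot, which is consistent with the comment's guess about what is driving the newer standard.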
TechLurker

MajorPotato said:
That's stupid. The package size would have to be almost the size of mini ITX, in which case there's no point in having it as one package, and that's why computers are as they are. You can't just chuck faster memory at a processor and expect it to run faster; in fact, current desktop CPUs can't keep up with DDR4, and DDR5 is just around the corner. For the CPU or iGPU to be worthy of HBM, they would need to be high performance and have a large die size, which means more heat and more distance between stuff. SRAM is lower latency than HBM, so using HBM would increase latency. Cache isn't a synonym or replacement for RAM; the cache merely stores stuff that is in RAM, so a larger cache doesn't reduce RAM usage. The reason it's done that way is latency: if you had a cache as an extension of RAM, you would have to wait to move things back to RAM before clearing the cache, and that would be very slow.

How is an HBM'd up Ryzen or TR-based APU stupid? It's pretty much what many were interested in as AMD continued to evolve their vision of a CPU and GPU being able to directly share resources, balance loads, and straight-talk to each other. Having a stack of 4-8GB of HBM primarily powering the iGPU would mitigate the issues encountered with relying on RAM, on a package not dissimilar to the hybrid Intel/Vega NUC, just better unified thanks to being all in-house. Ryzen's Vega iGPU has always been bottlenecked by RAM, and speculation was that it would have been worse if it was RDNA instead since it's even more sensitive.

I concede on the mix-up of terminology, but the concept of a PC in a chip is not a stupid one, considering how powerful said all-in-ones have slowly become over time, limited mainly by RAM speeds and cooling limits (every other recent Ryzen mobile review; almost good enough to not need a dGPU, but still held back by reliance on RAM).
MajorPotato

TechLurker said:
How is an HBM'd up Ryzen or TR-based APU stupid? It's pretty much what many were interested in as AMD continued to evolve their vision of a CPU and GPU being able to directly share resources, balance loads, and straight-talk to each other. Having a stack of 4-8GB of HBM primarily powering the iGPU would mitigate the issues encountered with relying on RAM, on a package not dissimilar to the hybrid Intel/Vega NUC, just better unified thanks to being all in-house. Ryzen's Vega iGPU has always been bottlenecked by RAM, and speculation was that it would have been worse if it was RDNA instead since it's even more sensitive.

I concede on the mix-up of terminology, but the concept of a PC in a chip is not a stupid one, considering how powerful said all-in-ones have slowly become over time, limited mainly by RAM speeds and cooling limits (every other recent Ryzen mobile review; almost good enough to not need a dGPU, but still held back by reliance on RAM).

The idea of a PC in a chip is good, but the reality is quite different. Heat is probably the largest factor for why it won't work, followed by cost. If bandwidth were the only reason why iGPUs are slow, then we would have DDR6 already, and if you still think it's all about memory bandwidth, just overclock your RAM and tell me how much faster your iGPU is.

CPUs and GPUs already directly share resources; that's what PCIe and DMA are for.

That NUC thing is not an iGPU, it's a discrete GPU with 4GB HBM Memory, and can use system memory the same way any other discrete GPU can.

Without getting into the world of memory access patterns, there's a reason why DDR is what DDR is and HBM is what HBM is.