Intel 'Emerald Rapids' 5th-Gen Xeon Platinum 8592+ Review: 64 Cores, Tripled L3 Cache and Faster Memory Deliver Impressive AI Performance

The sun shines brighter for Intel.

Intel Xeon Emerald Rapids
(Image: © Tom's Hardware)

AMD's EPYC assault has, at times, led to massive performance advantages over Intel's competing Ice Lake and Sapphire Rapids processors. TSMC's superior process node and AMD's innovative chiplet design combined to deliver embarrassing leads as Intel grappled with process node delays and difficulties on the design and verification side of its operations.

In the interim, Intel has worked to improve its process node tech while addressing glaring design and verification issues. Additionally, it remained steadfast in paving the way for the AI-powered future by investing its resources into hardware acceleration for AI workloads via AMX, AVX-512, VNNI, and Bfloat16 support, all of which deliver leading performance with AI models that leverage those features.

Those efforts, combined with a significant redesign of its tile-based Xeon architecture to improve latency and consistency, have yielded impressive gen-on-gen performance improvements. Intel also infused its new chips with up to three times the L3 cache and faster DDR5-5600 memory, which helps deliver gains that far outstrip what we expect from a refresh generation. We recorded solid performance metrics across the full gamut of workloads in our test suite.

Unfortunately, direct comparisons based on Intel's and AMD's list pricing are largely a mirage: OxMs and Tier-1 customers rarely pay the sticker price. However, we do know that Intel has been exceedingly competitive on pricing with its Sapphire Rapids processors, and the trimmed Emerald Rapids design should allow for an even lower production cost, giving Intel more room to compete on price.

Emerald Rapids' backward compatibility with the existing Eagle Stream platforms enables a drop-in replacement with substantial performance gains but, more importantly, allows OxMs to fast-track systems based on the new design. That will speed up time to market, a key factor given that Intel's next-gen Granite Rapids Xeon arrives early next year.

AMD still holds the lead in sheer core count, but as we see in many of our benchmarks, higher core counts aren't the end-all-be-all for every type of workload — particularly in latency-sensitive workloads like AI. Intel also doesn't directly compete with AMD's 96-core flagship. Instead, the Emerald Rapids 8592+ grapples with AMD's 64-core models in a core-for-core battle that finds it winning in more than a few key areas.

Emerald Rapids leverages its AI extensions and built-in accelerator engines to deliver tangible advantages in a broad range of workloads. It doesn't win every benchmark, and several workloads still perform far better on AMD's silicon, but it has carved out notable wins, particularly in AI workloads, that significantly improve Intel's competitive standing against AMD. That's a much-needed win for Intel as it prepares its Granite Rapids and Sierra Forest lineup, with up to 288 cores, for early next year.

Paul Alcorn
Managing Editor: News and Emerging Tech

Paul Alcorn is the Managing Editor: News and Emerging Tech for Tom's Hardware US. He also writes news and reviews on CPUs, storage, and enterprise hardware.

  • Tech0000
    Thank you for the article!
    1. Correction: in the second table on the first page, the Intel Xeon 8462Y+ (SPR) price should be the same as the 8562Y+ (EMR) price: $5,945. Right now (11:22 PST) you have the Xeon 8462Y+ (SPR) priced at $3,583, which is wrong.

    2. I would have liked to see a comparison between EMR and SPR for the same model, e.g. 8462Y+ vs 8562Y+, to better understand and isolate the generational core-for-core and model improvement (mostly everything else being equal). It's hard to draw conclusions when you are comparing different models and core configs, with test results all over the place and one chip winning over the other depending on the test performed.
    I suspect that an 8462Y+ vs 8562Y+ comparison would net a very modest gain (due to a marginally higher all-core turbo) and that the real performance gains are in the top-tier SKUs with triple the L3 cache, faster DDR5 support, etc.

    3. As a workstation chip, the single-socket 8558U actually seems to be pretty good "value" (relative to the other Intel SKUs). $3,720 for a 48-core chip with 250MB L3 cache is not too bad for a corporate WS. It's not as capable (in terms of accelerators) as the other loaded, pricey high-end SKUs, but 48 cores at about $77/core is pretty good. Maybe this chip can be packaged and used as a candidate Xeon W9-3585X or similar chip...
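
The per-core figure in point 3 can be checked with a few lines of Python. This is only a minimal sketch using the list price and core count quoted in the comment; the helper name `price_per_core` is hypothetical, and list prices rarely match what large customers actually pay.

```python
def price_per_core(list_price_usd: float, cores: int) -> float:
    """Return the list price divided by the core count, in USD per core."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return list_price_usd / cores


# Figures quoted in the comment: Xeon 8558U, $3,720 list price, 48 cores.
per_core = price_per_core(3720, 48)
print(f"8558U: ${per_core:.2f} per core")  # prints "8558U: $77.50 per core"
```

The same helper can be pointed at any other SKU once its list price and core count are known.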
  • thestryker
    I'm assuming a chunk of the losses are due to Zen 4 being much more efficient but it would be nice to see some clock graphs (not for every test, but maybe one per category) if possible. If that isn't possible maybe running this same suite on a 13900K/14900K and 7950X to give some context since these are extremely close in threaded performance despite Intel using more power.

    Appreciate the immediate look and hope to see some more, or maybe some Xeon W review action when those EMR refreshes come out!
  • Vanderlindemedia
    Ouch intel.

    AMD is a generation ahead of you.
  • tamalero
    Admin said:
    We put Intel’s fifth-gen Emerald Rapids Xeon Platinum 8592+ through the benchmark paces against AMD's EPYC Genoa to see which server chips come out the winner.

    What's with these "you can trust our review made by pros" claims?
    The first outlet I ever saw do that was Gamers Nexus.
    Now it seems everyone wants to add those kinds of "claims" to their own reviews.
  • bit_user
    Thanks for the review, as always!

    Some more potential cons:
    Still significantly lagging Genoa on energy-efficiency.
    PCIe deficit in 1P configurations (80 lanes for Emerald Rapids vs. 128 for Genoa). In 2P configurations, Genoa can either run at 128 lanes or match Emerald Rapids' 160 lanes, if you reduce the inter-processor links to just 3.
    Fewer memory channels (8 vs. 12 for Genoa), though the number of channels per-core is the same.
    "The 96-core EPYC Genoa 9654 surprisingly falls to the bottom of the chart in all three of the TensorFlow workloads, implying that its incredible array of chiplets might not offer the best latency and scalability for this type of model."
    I did see a few such inversions in Phoronix' review, but fewer and way less severe. This should be investigated. I recommend asking AMD about it, @PaulAlcorn . It almost looks to me like you might've had a CPU heatsink poorly mounted, forgot to replace the fan shroud, or something like that. It's way worse than anything you saw in your original Genoa review, where we basically only saw inversions in stuff that didn't parallelize too well.
    https://www.tomshardware.com/reviews/amd-4th-gen-epyc-genoa-9654-9554-and-9374f-review-96-cores-zen-4-and-5nm-disrupt-the-data-center/5
    In this review, it almost seems like the EPYC 9554 is outperforming the 9654 more often than not!
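
The channels-per-core point in the list of cons can be worked through as a short Python sketch. The channel counts (8 vs. 12) and the 64- and 96-core figures come from the review and the comment; treating the EPYC 9554 as a 64-core, 12-channel part is an assumption added here for the core-for-core comparison, and the helper name `channels_per_core` is hypothetical.

```python
from fractions import Fraction

# (memory channels, cores) per socket.
# The EPYC 9554 row is an assumption added for the 64-core comparison.
parts = {
    "Xeon 8592+ (Emerald Rapids)": (8, 64),
    "EPYC 9654 (Genoa)": (12, 96),
    "EPYC 9554 (Genoa)": (12, 64),
}


def channels_per_core(channels: int, cores: int) -> Fraction:
    """Return memory channels per core as an exact fraction."""
    return Fraction(channels, cores)


for name, (channels, cores) in parts.items():
    ratio = channels_per_core(channels, cores)
    print(f"{name}: {ratio} channels per core ({float(ratio):.4f})")
```

Under these assumptions the two flagships tie at 1/8 of a channel per core, while a 64-core Genoa part would have more channels per core than the 8592+.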
  • bit_user
    BTW, I find it a little weird that they still don't have a monolithic version that's just one of the XCC tiles, even as just a stepping stone, before you get down to the range of the regular MCC version.
  • bit_user
    Tech0000 said:
    2. I would have liked to see a comparison between EMR and SPR for the same model, e.g. 8462Y+ vs 8562Y+, to better understand and isolate the generational core-for-core and model improvement (mostly everything else being equal). It's hard to draw conclusions when you are comparing different models and core configs, with test results all over the place and one chip winning over the other depending on the test performed.
    I suspect that an 8462Y+ vs 8562Y+ comparison would net a very modest gain (due to a marginally higher all-core turbo) and that the real performance gains are in the top-tier SKUs with triple the L3 cache, faster DDR5 support, etc.
    I'd imagine the issue is that they can only test the review samples they're sent by Intel.

    Phoronix tested a limited number of benchmarks with different DDR5 speeds. Seems like the faster DDR5 wasn't a huge win, but sadly none of the AI benchmarks were included. Those should've skewed the geomean a bit higher.
    https://www.phoronix.com/review/intel-xeon-ddr5-5600
    thestryker said:
    If that isn't possible maybe running this same suite on a 13900K/14900K and 7950X to give some context since these are extremely close in threaded performance despite Intel using more power.
    To make the results more applicable, I'd suggest the E-cores should be disabled.
  • thestryker
    bit_user said:
    BTW, I find it a little weird that they still don't have a monolithic version that's just one of the XCC tiles, even as just a stepping stone, before you get down to the range of the regular MCC version.
    What does the XCC offer in Xeon Scalable that MCC doesn't? I was trying to think of something but the specs of all the SKUs seem so random for EMR I couldn't figure out what you'd be referring to.
    bit_user said:
    To make the results more applicable, I'd suggest the E-cores should be disabled.
    That would remove the entire point I was getting at of using the desktop parts as a comparison. The 13900K/14900K consistently go back and forth with the 7950X in MT performance at stock settings in standard CPU benchmarks despite the extra power consumption on the Intel side. Though with the IPC between RPL/Zen 4 so close maybe disabled E-cores + 1 CCD disabled would make for a good comparison as then it would be just 8 P-cores vs 8 Zen 4 cores. I haven't seen any such comparison though so this is just a wild guess.
  • bit_user
    thestryker said:
    What does the XCC offer in Xeon Scalable that MCC doesn't?
    I just meant that perhaps they could get more mileage out of their chiplet usage. Like, maybe there are some XCC tiles with a defect in the EMIB section, so just put those on a substrate by themselves and sell it as 32C or less.

    thestryker said:
    That would remove the entire point I was getting at of using the desktop parts as a comparison.
    Okay, well if you don't exclude the E-cores, then I don't see how those tests would be relevant to these server CPUs.

    thestryker said:
    with the IPC between RPL/Zen 4 so close maybe disabled E-cores + 1 CCD disabled would make for a good comparison as then it would be just 8 P-cores vs 8 Zen 4 cores. I haven't seen any such comparison though so this is just a wild guess.
    Heh, you might just get your chance! The new Xeon E-series 2400 have their E-cores disabled (sounds ironic, eh?). So, if anyone benchmarks a Xeon E-2488 against a Ryzen 7700X, then it'd be exactly what you're talking about.

    Annoyingly (for me), the new Xeon E 2400 also have their GPUs disabled. Otherwise, I might've been interested. I guess they could still announce G-versions, later.
  • thestryker
    bit_user said:
    I just meant that perhaps they could get more mileage out of their chiplet usage. Like, maybe there are some XCC tiles with a defect in the EMIB section, so just put those on a substrate by themselves and sell it as 32C or less.
    Ah yeah I get what you mean, but they'd be limited to 4 memory channels and half the PCIe lanes as well. I would love to know what happens in that circumstance though... like do they have to toss the whole thing?
    bit_user said:
    Okay, well if you don't exclude the E-cores, then I don't see how those tests would be relevant to these server CPUs.
    Well, like I said originally, it's more to give a known-quantity comparison than to get a direct reflection. What I mean is that if there were a test where Genoa beat EMR but the desktop CPUs were closer/equal, you could extrapolate that the server CPU differences were more likely due to efficiency than architecture. For this comparison, though, it would definitely be much better to have a P-core-only setup that matched a Zen 4 setup in performance.
    bit_user said:
    Heh, you might just get your chance! The new Xeon E-series 2400 have their E-cores disabled (sounds ironic, eh?). So, if anyone benchmarks a Xeon E-2488 against a Ryzen 7700X, then it'd be exactly what you're talking about.
    Yeah that would be the ideal comparison. I'd love to see a die shot to see if they're using ones without E-cores.
    bit_user said:
    Annoyingly (for me), the new Xeon E 2400 also have their GPUs disabled. Otherwise, I might've been interested. I guess they could still announce G-versions, later.
    Yeah I was surprised there were so many SKUs listed but none with an IGP. In the past they've always launched at least a few with graphics. Another reason why I'd love to see a die shot.