AMD 4th-Gen EPYC Genoa 9654, 9554, and 9374F Review: 96 Cores, Zen 4 and 5nm

The Server Slam Dunk

Editor's Choice

Tom's Hardware Verdict

The fourth-gen AMD EPYC Genoa series delivers unmatchable performance in threaded workloads but is also agile enough to outperform Intel's finest in more lightly-threaded fare. With a host of new leading-edge tech, like DDR5, PCIe 5.0, and CXL support, not to mention the adoption of AVX-512, AMD's Genoa is poised for success.

Pros

  • + First 5nm x86 data center processor
  • + Unmatched core density
  • + Performance in both heavily- and lightly-threaded workloads
  • + Leading-edge connectivity, like PCIe 5.0 and DDR5
  • + Support for AVX-512, VNNI, Bfloat16
  • + Support for the CXL interface

Cons

  • - Early adopter tax for DDR5


AMD’s 4th-Gen EPYC Genoa processors are the industry’s first 5nm x86 CPUs for the data center, and the flagship 96-core 192-thread EPYC 9654 leads the charge. The $11,805 EPYC 9654 enables packing an unprecedented amount of compute into slim server designs — up to 192 cores and 384 threads in a single chassis — courtesy of AMD’s chiplet-based chip design paired with the denser 5nm node and the Zen 4 microarchitecture. In addition, AMD says that a wide array of advances, including a 14% increase in IPC from the Zen 4 architecture and improved power delivery, culminate in up to ~30% more performance per core in both integer and floating point operations than Intel’s Ice Lake. That’s made even more impressive by the sheer core count advantage; the highest-end Genoa processor has more than twice the number of cores of the Ice Lake Xeons, and 60% more cores than the as-yet-unreleased Sapphire Rapids’ rumored peak of 60 cores.
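
The core-count comparisons above reduce to a few ratios. Here is a minimal Python sketch that uses only the figures quoted in this review. The 60-core Sapphire Rapids peak is the rumored number, and the two-socket chassis is an assumption inferred from the 192-core figure.

```python
# Core counts quoted in the review; the Sapphire Rapids figure is a rumor.
GENOA_CORES = 96
ICE_LAKE_CORES = 40
SAPPHIRE_RAPIDS_CORES = 60  # rumored peak, not confirmed

THREADS_PER_CORE = 2  # SMT provides two threads per core
SOCKETS = 2           # assumption: a dual-socket chassis


def core_ratio(a: int, b: int) -> float:
    """Return how many times as many cores `a` has compared to `b`."""
    return a / b


def percent_more(a: int, b: int) -> float:
    """Return the percentage by which `a` exceeds `b`."""
    return (a - b) / b * 100


chassis_cores = GENOA_CORES * SOCKETS
chassis_threads = chassis_cores * THREADS_PER_CORE

print(f"vs. Ice Lake: {core_ratio(GENOA_CORES, ICE_LAKE_CORES):.1f}x the cores")
print(f"vs. Sapphire Rapids: {percent_more(GENOA_CORES, SAPPHIRE_RAPIDS_CORES):.0f}% more cores")
print(f"Per chassis: {chassis_cores} cores / {chassis_threads} threads")
```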

The 9004-series Genoa chips also come packed with up to 384MB of L3 cache and the latest in connectivity tech, including support for up to 6TB of memory spread across twelve channels of DDR5, 128 lanes of PCIe 5.0, and CXL 1.1+, all of which makes Intel’s Ice Lake product stack, which tops out at the 40-core Intel Xeon Platinum 8380 for $9,400, look rather dated. Of course, much of that is because Intel’s oft-delayed Sapphire Rapids, which also comes brimming with advanced connectivity tech and has a host of in-built accelerators, is Genoa’s real competitor. However, it won’t arrive until January 2023.

EPYC Genoa also brings plenty of other new additions, like support for AVX-512 and the AI-accelerating VNNI and Bfloat16 instructions.

But the hefty core counts and performance come at a cost: Genoa’s flagship models come with a peak default TDP of 360W, the highest of any x86 server processor to date, and customers can tune them up to 400W to extract the utmost in performance.

As we’ve seen with GPUs, power consumption is increasing rapidly because of the insatiable demand for more compute packed into the smallest form factors. Genoa is no exception — AMD’s customers have requested higher TDP limits to improve compute density and total cost of ownership (TCO), and improvements in both processor and cooling technology have enabled the company to deliver up to 400W of performance using standard air cooling. That does come with secondary power requirements, though: For instance, our test system’s fans can draw up to 300W alone, and that’s before we pencil in the 300W consumed by the 1.5TB of DDR5 memory.

All told, that results in a platform with a voracious appetite for power, but EPYC Genoa converts that power into incredible amounts of performance and a reduced TCO that is simply unmatched by its x86 competitors. Today we put AMD’s Genoa to the test with the 96-core EPYC 9654, 64-core 9554, and frequency-optimized 32-core 9374F in our labs. Let’s dive in.

AMD 4th-Gen EPYC Genoa 9004 Series Specifications and Pricing

(Image credit: Tom's Hardware)

As you can see on the left, the Genoa processors are much larger than the previous-gen Milan chip next to it, not to mention the consumer AMD and Intel processors we also threw in for comparison.

Genoa’s larger chip package houses up to twelve 5nm Core Compute Dies (CCDs), each packing eight cores. That’s an increase of four additional CCDs compared to the previous gen Milan, necessitating a larger chip package and integrated heat spreader (IHS), which in turn helps improve cooling. The chip also includes a center 6nm I/O die to tie all the chiplets together, which we’ll cover in further depth on the following pages.
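
The chiplet arithmetic described above can be sketched in a few lines of Python. The CCD counts, the eight cores per CCD, and the 384MB L3 total come from this review. The function names are hypothetical, and the even split of L3 across CCDs is an assumption made for illustration.

```python
CORES_PER_CCD = 8  # per the review: each CCD packs eight cores


def max_cores(ccds: int, cores_per_ccd: int = CORES_PER_CCD) -> int:
    """Peak core count for a package with the given number of CCDs."""
    return ccds * cores_per_ccd


def l3_per_ccd_mb(total_l3_mb: int, ccds: int) -> float:
    """L3 cache per CCD, assuming the total is split evenly across CCDs."""
    return total_l3_mb / ccds


genoa_ccds = 12
milan_ccds = genoa_ccds - 4  # Genoa adds four CCDs over Milan

print(f"Genoa: {genoa_ccds} CCDs -> {max_cores(genoa_ccds)} cores")
print(f"Milan: {milan_ccds} CCDs -> {max_cores(milan_ccds)} cores")
print(f"Genoa L3 per CCD: {l3_per_ccd_mb(384, genoa_ccds):.0f} MB")
```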

The Genoa processors drop into the new SP5 socket that isn’t backward compatible with the Socket SP3 found on previous-gen EPYC systems, meaning the chips require an entirely new platform. In the future, SP5 will also support the Genoa-X processors, which incorporate 3D-stacked L3 cache like Milan-X, and the Bergamo chips, which have new dense Zen 4c cores that enable up to 128 cores in a single socket.

Model | Price | Cores / Threads | Base / Boost (GHz) | TDP | L3 Cache (MB) | cTDP (W) | Package (CCDs + I/O die)
--- | --- | --- | --- | --- | --- | --- | ---
EPYC Genoa 9654 | $11,805 | 96 / 192 | 2.4 / 3.7 | 360W | 384 | 320-400 | 12+1
EPYC Genoa 9634 | $10,304 | 84 / 168 | 2.25 / 3.7 | 290W | 384 | 240-300 | 12+1
EPYC Genoa 9554 | $9,087 | 64 / 128 | 3.1 / 3.75 | 360W | 256 | 320-400 | 8+1
EPYC Milan 7763 | $7,890 | 64 / 128 | 2.45 / 3.5 | 280W | 256 | - | -
EPYC Genoa 9534 | $8,803 | 64 / 128 | 2.45 / 3.7 | 280W | 256 | 240-300 | 8+1
EPYC Milan 7663 | $6,366 | 56 / 112 | 2.0 / 3.5 | 240W | 256 | - | -
EPYC Genoa 9454 | $5,225 | 48 / 96 | 2.75 / 3.8 | 290W | 256 | 240-300 | 8+1
EPYC Milan 7643 | $4,995 | 48 / 96 | 2.3 / 3.6 | 225W | 256 | - | -
Xeon Platinum 8380 | $9,359 | 40 / 80 | 2.3 / 3.2 - 3.0 | 270W | 60 | - | -
Xeon Platinum 8368 | $6,302 | 38 / 76 | 2.4 / 3.4 - 3.2 | 270W | 57 | - | -
EPYC Genoa 9354 | $3,420 | 32 / 64 | 3.25 / 3.8 | 280W | 256 | 240-300 | 8+1
EPYC Genoa 9334 | $2,990 | 32 / 64 | 2.7 / 3.9 | 210W | 128 | 200-240 | 4+1
EPYC Genoa 9254 | $2,299 | 24 / 48 | 2.9 / 4.15 | 200W | 128 | 200-240 | 4+1
EPYC Genoa 9224 | $1,825 | 24 / 48 | 2.5 / 3.7 | 200W | 64 | 200-240 | 4+1
EPYC Genoa 9124 | $1,083 | 16 / 32 | 3.0 / 3.7 | 200W | 64 | 200-240 | 4+1
EPYC Genoa 9474F | $6,780 | 48 / 96 | 3.6 / 4.1 | 360W | 256 | 320-400 | 8+1
EPYC Genoa 9374F | $4,850 | 32 / 64 | 3.85 / 4.3 | 320W | 256 | 320-400 | 8+1
EPYC Milan 7F53 | $4,860 | 32 / 64 | 2.95 / 4.0 | 280W | 256 | - | -
EPYC Genoa 9274F | $3,060 | 24 / 48 | 4.05 / 4.3 | 320W | 256 | 320-400 | 8+1
EPYC Genoa 9174F | $3,850 | 16 / 32 | 4.1 / 4.4 | 320W | 256 | 320-400 | 8+1
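
One way to read the table is to normalize list price and L3 cache by core count. The following Python sketch does that for a handful of rows copied from the table above. The `Chip` class and its property names are hypothetical, and list prices are only a guideline, so treat the output as a rough comparison rather than real-world cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    model: str
    price_usd: int  # list price from the table
    cores: int
    l3_mb: int

    @property
    def usd_per_core(self) -> float:
        """List price divided by core count."""
        return self.price_usd / self.cores

    @property
    def l3_mb_per_core(self) -> float:
        """L3 cache divided by core count."""
        return self.l3_mb / self.cores


# A subset of rows from the specification table.
CHIPS = [
    Chip("EPYC Genoa 9654", 11805, 96, 384),
    Chip("EPYC Genoa 9554", 9087, 64, 256),
    Chip("EPYC Milan 7763", 7890, 64, 256),
    Chip("Xeon Platinum 8380", 9359, 40, 60),
    Chip("EPYC Genoa 9374F", 4850, 32, 256),
    Chip("EPYC Genoa 9174F", 3850, 16, 256),
]

# Print the chips from cheapest to most expensive per core.
for chip in sorted(CHIPS, key=lambda c: c.usd_per_core):
    print(
        f"{chip.model:<20} ${chip.usd_per_core:>7.2f}/core  "
        f"{chip.l3_mb_per_core:>5.1f} MB L3/core"
    )
```

The frequency-optimized F-series parts land at the expensive end per core but carry far more L3 per core, which matches how the review describes that tier.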

The entire EPYC Genoa 9004 Series family spans 18 models in three categories — Core Performance, Core Density, and Balanced and Optimized — creating a vastly simpler product stack compared to Ice Lake Xeon, which has 56 total models with a wide range of varying feature sets.

AMD has made a concerted effort to limit its product stack to the critical swim lanes. The bulk of the Genoa family are general-purpose chips that slot into either the ‘Core Density’ category, which offers the highest core counts, or the ‘Balanced and Optimized’ category, which is geared for a mix of performance and TCO. Meanwhile, the F-Series chips, which come with higher frequencies and more L3 cache per core, slot into the ‘Core Performance’ tier. AMD also has a smattering of P-series models, like the 9354P, that are designed for single-processor (1P) systems (listed in the slides below).

The Genoa chips range from 16 to 96 cores, and we notice there are no longer 8-, 28- or 56-core offerings — at least for now. Peak clock speeds span from 3.7 GHz to 4.4 GHz, with the highest boosts coming from the frequency-optimized F-series models.

The Genoa TDP ratings span from 200W to 360W, so the lowest TDP has increased by 45W while the highest end has increased by 80W. The configurable TDP (cTDP), a customer/OEM-adjustable parameter that provides increased performance in systems with robust cooling, now tops out at 400W, an incredible 120W increase over the prior-gen chips.

As you can see from the previous-gen 7003-Series Milan chips we added to the table, the Genoa flagship brings an additional 32 cores over the previous-gen halo part, the EPYC Milan 7763, and costs $3,915 more. AMD has also increased pricing for its 64-core models by $1,200 and $1,750 over the prior-gen models, but we see much more muted price increases further down the stack. For instance, the two 32-core models have increased by $341 and $150, while the 48-core model has only increased by $200. Also, bear in mind that, like Intel, AMD’s server chip pricing is merely a guideline, so actual pricing, particularly to larger customers, can vary dramatically.

We added a few Ice Lake Xeon models to the above table but kept the additions to a minimum due to the large number of Intel SKUs. Besides, Genoa will primarily face off against the forthcoming Sapphire Rapids, and as we'll see in the benchmarks, we’ll have to wait for that launch for a fair comparison. It should go without saying, but the flagship Xeon Platinum 8380 is simply outgunned with its 40 cores and 60MB of L3 cache, while the Genoa stack tops out at 96 cores and 384MB of L3 cache. AMD has six SKUs with higher core counts than the 8380 and claims that nine SKUs offer more performance in integer workloads (last slide in album).

All of the Genoa chips support the following:

  • Simultaneous Multi-Threading (SMT)
  • 12 channels of DDR5-4800 memory in 1DPC configuration (2DPC speeds will be announced in Q1 2023)
  • 6TB of memory per socket
  • 128 Lanes of PCIe 5.0 (64 lanes support CXL 1.1+)
  • AVX-512, VNNI, Bfloat16
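
To put the memory configuration in the list above into perspective, here is a rough Python sketch of the theoretical peak bandwidth per socket. It assumes a 64-bit (8-byte) data path per DDR5 channel, which the review does not state, and it ignores real-world efficiency losses, so sustained bandwidth will be lower.

```python
TRANSFERS_PER_SEC = 4800e6  # DDR5-4800: 4,800 megatransfers per second
BYTES_PER_TRANSFER = 8      # assumption: 64-bit data path per channel
CHANNELS = 12               # per the review: twelve channels per socket


def peak_bandwidth_gbs(
    channels: int = CHANNELS,
    transfers_per_sec: float = TRANSFERS_PER_SEC,
    bytes_per_transfer: int = BYTES_PER_TRANSFER,
) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return channels * transfers_per_sec * bytes_per_transfer / 1e9


per_channel = peak_bandwidth_gbs(channels=1)
per_socket = peak_bandwidth_gbs()
per_core = per_socket / 96  # spread across the 96-core flagship

print(f"Per channel: {per_channel:.1f} GB/s")
print(f"Per socket:  {per_socket:.1f} GB/s")
print(f"Per core (96 cores): {per_core:.1f} GB/s")
```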

The Genoa processors mark the debut of several new technologies for x86 servers, like DDR5 and PCIe 5.0, with the former currently commanding a hefty premium over the incumbent DDR4 memory and the latter resulting in higher motherboard costs. Other advances, including the 5nm/6nm production nodes used inside the chips and increased power and cooling requirements for the highest-end models, also add cost. As such, AMD concedes that many customers will continue to deploy its EPYC Milan chips for lower-priority systems, so the two families will co-exist in the market for some time.

Meanwhile, the workloads that are hungriest for compute, memory bandwidth, and memory capacity will migrate to Genoa. While the upfront costs are higher with Genoa, the TCO advantages pencil out nicely due to increased performance-per-watt and rack density, as shown in the above slides.

Supporting these new features requires a new SP5 socket, platform, and chip design. Let’s move on to the technical details, platform overview, and testing results.

Paul Alcorn
Managing Editor: News and Emerging Tech

Paul Alcorn is the Managing Editor: News and Emerging Tech for Tom's Hardware US. He also writes news and reviews on CPUs, storage, and enterprise hardware.

  • Roland Of Gilead
    Okay, for those more knowledgeable than me (I'm not really into server tech), how is it that Intel is so far behind in terms of core count with these systems? Looking at some of the benches (and I might as well be blind in both eyes and using a magnifying glass to scroll through the data!) it seems to me that if Intel were able to increase core count, that they would be comparable in performance to the AMD counterparts? What gives?

    Intel have taken the performance crown with ADL and Raptor in the consumer market, but on the bigger scales can't get close.
  • rasmusdf
    Well, basically AMD has mastered the art and technology of connecting lots of small chips together. While Intel still has to make chips as one big lump. Big chips are harder and more expensive to produce. Plus when combining small chips you can always and easily add more.

    Intel might be competing right now in the consumer space - but they don't earn much profit on their expensive to produce CPUs.

    Additionally - AMD is cheating a bit and is several chip nodes ahead in production process - while Intel is struggling to get past 10 nm, AMD is on what, 4 nm? Because of TSMC's impressive technology leadership and Intel's stubbornness.
  • gdmaclew
    Strange that Tom's mentions that DDR5 support for these new Data Center CPUs is a "Con" but they don't mention it in yesterday's article of Intel's new CPUs using the same DDR5.

    But then maybe not, knowing Tom's. It's just so obvious.
  • SunMaster
    Roland Of Gilead said:
    Intel have taken the performance crown with ADL and Raptor in the consumer market, but on the bigger scales can't get close.

    Rumour has it that AMD prioritize less on the consumer market, and more on the server market. So if "one core to rule them all" it means Zen4 is a core designed primarily for server chips. Whether Alder Lake took the "performance crown" or not is at best debatable, as is Raptor Lake vs Zen4 if power consumption is taken into consideration.

    Intel does not use anything equivalent of chiplets (yet). The die size of raptor lake 13900 is about 257 square mm. Each Zen4 ccd is only 70 square mm (two in a 7950x). That gives AMD a tremendous advantage in manufacturing and cost.

    See https://www.youtube.com/watch?v=oMcsW-myRCU&t=2s for some info/estimates on yields and cost of manufacturing.
  • bitbucket
    You have forgotten the face of your father.

    Most likely multiple reasons.
    1) Intel struggled for a long time trying to reach the 10nm process
    This delayed entire product lines for a couple of years and ultimately led to Intel outsourcing some production to TSMC which wasn't struggling to shrink the fabrication process
    AMD had already been using TSMC as AMD had sold off their manufacturing facilities years before
    2) AMD moved to a chiplet strategy long before Intel, which I don't believe has a product for sale using chiplets yet, not sure though
    Large monolithic, high core-count CPUs are harder to make than smaller lower core-count CPUs
    - An example is AMD putting two 8-core chiplets in a package (plus IO die) for a product that has 16 cores
    - Intel has recently countered this by going with heterogeneous cores in their CPUs; a mix of bigger/faster and smaller/slower cores
    - I don't believe that the heterogeneous core strategy has been implemented in server products yet
  • InvalidError
    The DIMM slot fragility issue could easily be solved or at least greatly improved by molding DIMM slots in pairs for lateral stability and sturdiness.
  • Roland Of Gilead
    bitbucket said:
    You have forgotten the face of your father.
    Hile Gunslinger :)
  • Roland Of Gilead
    SunMaster said:
    Whether Alder Lake took the "performance crown" or not is at best debatable, as is Raptor Lake vs Zen4 if power consumption is taken into consideration.
    Fair point
  • GustavoVanni
    Is it just me or do you guys also think that AMD can fit 24 CCDs in the same package in the not so distant future?

    Sure there's some small SMDs in the way, but it should be doable.

    Just imagine one of those with 192 ZEN4 cores or 256+ ZEN4c cores.

    Maybe with ZEN5?
  • -Fran-
    Intel's only bastion seems to be accelerators and burn as much money as they can on adoption, even worse than AVX512.

    I just looked at the numbers of OpenSSL and I just laughed... Intel is SO screwed for general purpose machines. Their new stuff was needed in the market last year.

    Regards.