AMD EPYC Siena Has Similar Performance to Intel Sapphire Rapids at a Lower Power Draw

AMD Zen 4/Zen 4c EPYC Server Family
(Image credit: AMD)

Phoronix recently reviewed AMD's new Zen 4c-equipped EPYC 8324P and 8324PN (Siena) server processors on Linux and found that both chips offer outstanding performance in server workloads. In testing, both Zen 4c chips matched the performance of Intel's 32-core Xeon Gold 6421N in most workloads while drawing substantially less power.

The 8324P and 8324PN are part of AMD's latest 8004 series EPYC server CPU lineup, which prioritizes power efficiency over raw CPU performance. All CPUs in the 8004 lineup use AMD's new Zen 4c core, which is 35% smaller than Zen 4 and more power efficient than its larger counterpart. The only disadvantage of the Zen 4c core is its raw performance, which is lower than that of a standard Zen 4 core.

The 8324P and 8324PN sit directly in the middle of AMD's 8004 series CPU stack, sporting 32 cores, 128MB of L3 cache, six DDR5 memory channels, and 96 PCIe Gen 5 lanes. They differ in clock speeds and power ratings: the 8324P has a 2.65 GHz base clock, a 3 GHz boost clock, and a 180W TDP, while the 8324PN has a 2.05 GHz base clock, a 3 GHz boost clock, and a much lower 130W TDP.

The 8324PN's lower base clock and TDP reflect its target market: the PN series of 8004 CPUs is designed for Network Equipment Building System (NEBS) compliant deployments, which call for hardware that can withstand wider operating temperature ranges than regular chips.

(Image credit: Phoronix)

Phoronix ran the two chips through a large number of benchmarks spanning seven categories, including AI workloads, video encoding, code compiling, file compression/decompression, HPC, and more.

On average, the two AMD chips outperformed Intel's 32-core Xeon Gold 6421N at their maximum power settings. The 8324P was 2.7% quicker than the Xeon 6421N at stock settings, 2.5% slower at its lowest configurable TDP of 155W, and 5% faster in performance determinism mode.

The 8324PN, unsurprisingly, was a bit slower due to its reduced power rating and base clock. At stock settings, the 8324PN was 11.8% slower than the Intel Xeon, but it still beat the Intel chip in performance determinism mode, with a 4.8% lead.

Power efficiency also strongly favored AMD, with both the 8324P and 8324PN averaging well below the 100W mark, even with performance determinism mode enabled. By contrast, Intel's Sapphire Rapids counterpart couldn't get close to AMD's power efficiency, averaging 137 watts.
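As a rough illustration of how these figures combine, the sketch below estimates a lower bound on the 8324P's performance per watt relative to the Xeon. It uses the 2.7% stock-settings lead and the Xeon's 137W average from above. Because the article only says the EPYC chips averaged below 100W, it assumes 100W as a conservative ceiling. The function name is hypothetical, and dividing an average performance ratio by an average power ratio is a simplification, not Phoronix's methodology.

```python
def relative_perf_per_watt(perf_ratio: float, power_a_w: float, power_b_w: float) -> float:
    """Return chip A's performance per watt relative to chip B.

    perf_ratio: A's average performance divided by B's (1.0 means equal).
    power_a_w, power_b_w: average power draw of A and B in watts.
    """
    if power_a_w <= 0 or power_b_w <= 0:
        raise ValueError("power draw must be positive")
    return perf_ratio * (power_b_w / power_a_w)


# Figures quoted in the article. The 100 W value is an ASSUMED upper bound,
# since the article only says the EPYC chips averaged "well below" 100 W.
EPYC_8324P_PERF_VS_XEON = 1.027   # 2.7% faster at stock settings
EPYC_8324P_POWER_BOUND_W = 100.0  # assumption: conservative ceiling
XEON_6421N_POWER_W = 137.0        # average reported for the Xeon

floor = relative_perf_per_watt(
    EPYC_8324P_PERF_VS_XEON, EPYC_8324P_POWER_BOUND_W, XEON_6421N_POWER_W
)
print(f"EPYC 8324P perf/W vs Xeon 6421N: at least {floor:.2f}x")
```

Under these assumptions the 8324P delivers at least roughly 1.4 times the Xeon's performance per watt, and the real gap is wider to the extent that its average draw sits below the assumed 100W ceiling.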

Phoronix's testing reveals that AMD's latest Zen 4c core is very potent and can deliver performance that matches the best Intel can offer today. Intel might have beaten AMD to the E-core game with Alder Lake, but these latest benchmarks show that AMD can make a delicious efficiency core that can rival Intel's high-performance cores in the right circumstances.

Aaron Klotz
Contributing Writer

Aaron Klotz is a contributing writer for Tom’s Hardware, covering news related to computer hardware such as CPUs and graphics cards.

  • bit_user
    Zen 4c is just as fast as Golden Cove.
    Let's not be too reductive. If you dig into the results (and overlook the derivative efficiency & value metrics), there are a few decisive wins for the Xeons, such as in TensorFlow, OpenRadioss, OpenSSL/AES, and a couple of the database benchmarks. However, it's clear that the 8-channel memory configuration is what's really behind a couple of those wins, given how much worse the 6-channel Xeon did.

    It's a more complex story than to just say Zen 4C is equal to Golden Cove. In single-threaded tests, it's definitely not. However, when you pack so many of them into a CPU, then power efficiency becomes a greater liability for Golden Cove and it can't stretch its legs (i.e. run at high enough frequencies to pull ahead).
  • jthill
    So, if we overlook all of the many epyc wins golden cove pulls ahead, yes, that's the way to look at it.

    Only slather a lot of words on it so people get distracted from what we're carefully whistling our way past.
  • bit_user
    jthill said:
    So, if we overlook all of the many epyc wins golden cove pulls ahead, yes, that's the way to look at it.
    I guess you're talking to me, but I have to guess because your comment doesn't sound like you took the time to read what I wrote. I never said anything about overlooking the majority of cases where the EPYC 8324P beat the Xeon 6421N. I was trying to point out that there was a diversity of results. In some cases, the EPYC won by a large margin. In a handful of others, the Xeon even managed a decent margin for its part. Of course, the majority were fairly close, with the EPYC mostly in the lead.

    One more thing I'll note is that these Phoronix benchmark articles have some graphs mixed in where the performance is normalized by cost or by power. If you're not careful to look at only the graphs that actually show performance, you can come away with a slightly skewed impression of the performance picture, particularly if the cheaper & more-efficient part is also very competitive on the performance front - such as in this case!

    I certainly don't deny the balance of the tests, which showed a decisive win for the EPYC 8324P ("Siena"). What bothered me was the reductive conclusion that Zen 4C is equivalent to Golden Cove. It's really not. We know this, because Golden Cove-powered CPUs typically beat Zen 4-powered ones on single-threaded and lightly-threaded benchmarks. Zen 4C has the exact same microarchitecture as Zen 4, but with half the L3 cache. Also, it doesn't clock as high. So, there's no way Zen 4C is even as fast as regular Zen 4 - and Zen 4 wasn't equivalent to Golden Cove.

    jthill said:
    Only slather a lot of words on it so people get distracted from what we're carefully whistling our way past.
    Again, I have to wonder whether you're really referring to my post, because that's really not a lot of words.

    CPUs are complex machines. Their performance and behavior have a lot of nuances. To reach a robust conclusion, you have to work through the data and form a model which explains the nuances. What I was trying to do is to bring people along on that journey, using words. Honestly, it's not such a complex picture, but I think one easily befitting of the analysis I afforded it.
  • JayNor
    SPR has the tiled matrix acceleration and the possibility of HBM.
    EMR launching before year end, adding cxl 2.0 memory pools.
    Intel already launched SPR chips with vRAN layer 1 acceleration and extended temperature operating range.

    Sierra Forest and Granite Rapids will both add MCR DIMM support and full cxl 2.0.
  • bit_user
    JayNor said:
    SPR has the tiled matrix acceleration and the possibility of HBM.
    I assume that's why even the 6-channel configuration scored wins on TensorFlow/ResNet-50.

    JayNor said:
    Intel already launched SPR chips with vRAN layer 1 acceleration and extended temperature operating range.
    Does Phoronix Test Suite have a benchmark correlating with vRAN performance? If not, maybe Intel should contribute one.

    JayNor said:
    Sierra Forest and Granite Rapids will both add MCR DIMM support and full cxl 2.0.
    That's not for at least 6 months, though.
  • bit_user
    BTW, Aaron really should've shown the power usage graph, as well. It's pretty amazing that the 32-core/64-thread EPYC 8324P (at stock settings) averaged just 89.8 W, with a max of only 136 W. That's just 64.5% of what the Xeon Gold 6421N used @ stock settings & 8-channel memory.
  • TerryLaze
    bit_user said:
    BTW, Aaron really should've shown the power usage graph, as well. It's pretty amazing that the 32-core/64-thread EPYC 8324P (at stock settings) averaged just 89.8 W, with a max of only 136 W. That's just 64.5% of what the Xeon Gold 6421N used @ stock settings & 8-channel memory.
    If you want to take power as a consideration you would have to compare against ARM servers because that's what customers that need power efficiency go for.
    Anybody still sticking with x86 in server does so because they have to, it's all arm and GPUs now.
  • bit_user
    TerryLaze said:
    If you want to take power as a consideration you would have to compare against ARM servers because that's what customers that need power efficiency go for.
    Fair point. I don't know much about the CSP market, but I could believe ARM gets a lot of play there.

    TerryLaze said:
    Anybody still sticking with x86 in server does so because they have to, it's all arm and GPUs now.
    In the CSP market segment targeted by Siena, I think they're more likely to use DPUs, if anything. I think Xilinx has some products in that sector.