Be sure to check out the test notes on the previous page for important testing particulars. Further optimizations could unlock more performance from our server platforms, and all three servers come in 1U chassis, which can constrain cooling and thus performance. The servers also have varying memory capacities, but that's an unavoidable consequence of the unique platforms.
All AMD entries with "PBO" indicate an auto-overclocked configuration paired with DDR4-3600. Intel's overclocked configurations also use DDR4-3600.
It's also noteworthy that while we did experience many odd performance characteristics that disadvantage some platforms, this testing represents the current state of the software ecosystem.
As a reminder, here is a quick breakout of each server entry in the charts:
| Chart Entry | Processors | Cores / Threads | Server Test Bed | DRAM |
| --- | --- | --- | --- | --- |
| 2x EPYC 7742 | Two EPYC Rome 7742 | 128 / 256 | Supermicro AS-1023US-TR4 | 16x 32GB DDR4-3200 |
| 2x Xeon 8280 | Two Intel Xeon Platinum 8280 | 56 / 112 | Dell/EMC PowerEdge R460 | 12x 32GB DDR4-2933 |
| EPYC 7702P | One EPYC Rome 7702P | 64 / 128 | Gigabyte R15Z-Z32 | 8x 32GB DDR4-3200 |
Starting off with the LAME encoder, which is the quintessential example of a single-threaded test, may seem a bit...lame, but this series of tests helps explain some of the results you'll see throughout the rest of the review.
Both AMD and Intel have made great strides with per-core performance in their HEDT lineups over the last few years. In the case of the 3990X, that improved performance in light workloads also carries over to multiple cores when the chip is under load, which benefits many of our rendering tests.
The Threadripper 3990X's faster clock speed over competing server chips is a big advantage for some applications, like the single-threaded LAME and FLAC encoding tests, and some of our rendering tests below.
Remember, Windows breaks processors up into groups of up to 64 logical processors (threads), and some applications can't scale past those boundaries. Additionally, the server platforms are broken into several NUMA nodes.
Zooming out to the threaded Handbrake tests, we see the advantage of the 3990X's clock speed take hold as it takes the top of the chart, even beating out the other Threadripper processors, but not by much. That's largely because the x264 test doesn't fully saturate the cores in both 64-thread processor groups, meaning the bottleneck resides elsewhere, and the x265 test only scales across the cores in one processor group. That's particularly painful for the dual-EPYC and Xeon platforms because they suffer from disparate NUMA nodes.
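To make the processor-group limit concrete, here is a minimal sketch of the arithmetic. It assumes every group is filled to the 64-logical-processor cap, which is a simplification: Windows also takes NUMA topology into account when it assigns groups, so real systems can end up with smaller groups. The function names are hypothetical, and the thread counts come from the server table and the 3990X's 128 threads.

```python
import math

GROUP_SIZE = 64  # Windows caps a processor group at 64 logical processors


def processor_groups(logical_processors: int, group_size: int = GROUP_SIZE) -> int:
    """Minimum number of processor groups for a given logical-processor count.

    Assumption: groups are filled to the cap; NUMA-aware splitting is ignored.
    """
    if logical_processors < 1:
        raise ValueError("logical_processors must be positive")
    return math.ceil(logical_processors / group_size)


def reachable_threads(logical_processors: int, groups_used: int,
                      group_size: int = GROUP_SIZE) -> int:
    """Threads an application can reach if it only spans `groups_used` groups."""
    return min(logical_processors, groups_used * group_size)


# Thread counts for a few of the tested systems
systems = {
    "Threadripper 3990X": 128,
    "2x EPYC 7742": 256,
    "EPYC 7702P": 128,
}

for name, threads in systems.items():
    groups = processor_groups(threads)
    one_group = reachable_threads(threads, groups_used=1)
    print(f"{name}: {groups} groups, {one_group} of {threads} threads "
          f"reachable by a single-group application")
```

Under this simplified model, an application that stays inside one group reaches only half of the 3990X's threads and a quarter of the dual-EPYC server's, which is why group-unaware software leaves so much of these chips idle.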
The SVT-AV1 test is designed to scale well across multiple cores, but the relatively short workload doesn't scale to the second processor group, which favors the higher clock speeds of the 3970X and overclocked Xeon W-3175X.
Now let's look at a few workloads that scale well.
Cinebench R20.06 scales exceedingly well across both processor groups, and surprisingly, the overclocked 3990X even beats out the dual-EPYC server by a slim margin. Frankly, that's astounding. The 3990X also beats the dual-Xeon 8280 server by a whopping 62%. We can see the impact of the 7702P's lower clock speeds here: that processor has the same number of cores and threads but lags far behind the 3990X, even when the 3990X runs at stock settings. However, the 7702P almost matches the dual-Xeon server, highlighting the power of AMD's single-socket server platforms in these types of workloads.
For reference, the overclocked Threadripper 3990X pulled a peak of 589W (package power) during the multi-core Cinebench run, compared to roughly 480W from both Xeon processors.
Our POV-Ray charts are a bit of an eyesore, but that's because this application requires a new extension (to the existing 3.7 engine) so it can run across both processor groups. This highlights some of the challenges AMD will face with the software ecosystem as it works to unlock the full performance in threaded workloads, but also how the company is already moving forward on that front. We also included the pre-patch test results for all impacted platforms to highlight the advantages.
Again, the overclocked 3990X delivers devastating performance that edges past the dual-EPYC platform and provides more than twice the performance of the patched dual-Xeon 8280 system, but there's a catch. While the workload scaled perfectly across all 128 threads of the 3990X and all 256 threads of the dual-EPYC server, the patch doesn't appear to work on more than one NUMA node with Intel processors, which is a separate issue from processor grouping.
We extrapolated the performance of the benchmark if it were to run on both of the Xeon server's NUMA nodes, and will follow up to see if we can get a new version.
The Threadripper 3990X pulled a peak of 639W during this test compared to the dual-Xeon's extrapolated value of 800W.
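One simple way to arrive at an extrapolation like this is to assume linear scaling across NUMA nodes. That is an assumption for illustration, not necessarily the method used for the charts, and real workloads rarely scale perfectly across nodes. The function name and the sample score below are hypothetical placeholders, not measured results.

```python
def extrapolate_linear(measured: float, nodes_used: int, nodes_total: int) -> float:
    """Scale a one-node measurement to more nodes, assuming perfect linear scaling.

    Assumption: doubling the NUMA nodes doubles the score (or the power draw).
    """
    if nodes_used < 1 or nodes_total < nodes_used:
        raise ValueError("need 1 <= nodes_used <= nodes_total")
    return measured * nodes_total / nodes_used


# Hypothetical placeholder: a score recorded while the workload ran on one node
single_node_score = 5000.0
estimate = extrapolate_linear(single_node_score, nodes_used=1, nodes_total=2)
print(f"Estimated two-node score: {estimate:.0f}")
```

Because cross-node memory traffic usually costs something, a linear estimate like this is best read as an upper bound rather than a prediction.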
V-Ray scales across both processor groups/NUMA nodes for both Intel and AMD platforms, which gives the 3990X a nice lead over the dual-Xeon system and all other single-chip competitors at stock settings, though it did take PBO to beat the dual-EPYC system. The Corona ray tracing benchmark also spanned both groups and handed the 3990X a convincing win.
We couldn't run some of our benchmarks on the server platforms, but the Blender benchmark marks another strong win for the 3990X over the competing consumer processors.
The Cinebench single-threaded test finds the 3990X falling behind processors with higher frequencies, but still beating the server competition. Meanwhile, the 3990X puts up an impressive across-the-board win in the single-threaded POV-Ray test.
A few of our other tests, like rendering and visualization, photo editing, and LuxMark, respond better to higher clock rates, so the 3990X struggles to keep pace with other consumer chips.
Compression, Decompression, Encryption, AVX
The 7-zip workload works directly from memory, removing storage bottlenecks from the equation, but it only executes across one processor group/NUMA node. That disadvantages the 3990X and server platforms (particularly the EPYC server) compared to the consumer processors, at least in terms of performance scaling. Despite the restriction, the 3990X posts incredibly impressive results in the compression tests, but clock rates play a big role in the decompression test. That gives the 3970X the win.
The multi-core y-cruncher test pounds the processor with a threaded AVX workload that spans both processor groups and NUMA nodes. Here the dual-Xeon 8280 exerts its AVX prowess, but the EPYC 7702P takes the lead. The 3990X falls a bit further down the pecking order, and given that this test works directly from memory, its quad-channel memory subsystem serves as its Achilles' heel. Either way, it still beats out all other HEDT processors at stock settings.
The AIDA suite of tests, which includes the Zlib compression/decompression test, AES, SHA3, and HASH tests, scales perfectly across NUMA nodes and processor groups, which gives the dual-EPYC system a commanding lead in all of the tests. The dual-Xeon server is also very competitive, and the Threadripper 3990X easily beats all of the consumer-class silicon.
Office and Productivity
These tests find us back in the realm of mainstream and HEDT platforms. The Threadripper 3990X isn't the best solution for many of the mundane workloads in PCMark 10 and the Microsoft Office suite, but it does deliver acceptable levels of performance. Naturally, these workloads aren't optimized for a behemoth like the 3990X, so these results aren't surprising, and this certainly isn't the target market.
Browsers tend to be impacted more by the recent security mitigations than other types of applications, so Intel has generally taken a haircut in these benchmarks of fully-patched systems. While the mitigations have chipped away at Intel's lead in these tests, Intel processors still largely outperform competing AMD chips in these types of strictly single-threaded applications.