
AMD 64-Core Threadripper 3990X Review: Battle of the Flagships

The core wars rage on.

AMD Threadripper 3990X
Editor's Choice
(Image: © AMD)

Threadripper 3990X Boost Frequency

The Threadripper 3990X has an impressively high base and boost clock specification given its core count, so we ran a few tests to measure the chip's ability to hit its rated speeds. 

Generally, stock coolers that worked with the previous-gen Threadripper models should suffice for most users, but beefier coolers can unlock more performance. AMD ships all Threadripper CPUs with an Asetek bracket that provides partial coverage of the massive heat spreader using supported closed-loop liquid coolers. According to AMD, this partial coverage is fine for stock operation, but we prefer full-coverage coolers. 

For this round of testing, we compared performance at stock settings using the partial-coverage Corsair H115i and a full-coverage block on our custom loop, which has two 360mm radiators. We also used the custom loop for overclocking testing.

(Album of three images: maximum and minimum recorded frequencies, with chip temperature. Image credit: Tom's Hardware)

We begin by recording the frequencies of each core during a series of commonly used tests that should expose the peak frequencies. The first two tests are LAME and Cinebench in single-core test mode. These programs only execute on one core of the processor, which typically allows the chip to reach its peak boost frequency within its power, current, and thermal envelope. We also ran PCMark 10, Geekbench, and VRMark in rapid succession to measure frequencies during intermittent "bursty" workloads.

The per-core frequency recordings create unintelligible charts with 64 cores in action, so the album above only includes the maximum and minimum frequencies recorded during each 1-second measurement interval (100ms sampling). That means these measurements could come from any one core, but it makes the charts easier to digest. We also plotted chip temperature on the right axis (the dark red line).
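As a rough illustration of that reduction, the sketch below collapses per-core samples into one minimum and one maximum per interval. The function name, the tuple layout, and the sample readings are assumptions made for this example; they are not taken from the logging tools used in the review.

```python
from collections import defaultdict


def min_max_per_interval(samples, interval_s=1.0):
    """Collapse per-core samples into one (min, max) pair per interval.

    samples: iterable of (timestamp_s, core_id, freq_mhz) tuples.
    Returns {interval_index: (min_mhz, max_mhz)}.
    """
    buckets = defaultdict(list)
    for timestamp_s, _core_id, freq_mhz in samples:
        # The core id is ignored on purpose: the extremes may come from any core.
        buckets[int(timestamp_s // interval_s)].append(freq_mhz)
    return {idx: (min(freqs), max(freqs)) for idx, freqs in sorted(buckets.items())}


# Hypothetical readings for two cores (illustrative values only).
samples = [
    (0.0, 0, 4350), (0.0, 1, 2900),
    (0.1, 0, 4300), (0.1, 1, 3000),
    (1.0, 0, 4200), (1.0, 1, 3100),
]
summary = min_max_per_interval(samples)
print(summary)  # {0: (2900, 4350), 1: (3100, 4200)}
```

Because the core id is discarded, the minimum and maximum in a given interval can belong to different cores, which matches the caveat above.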

At stock settings with the AIO cooler, we frequently reached 4.35 GHz, which exceeds the 4.3 GHz rating. That's pretty impressive for a partial-coverage cooler on a 280W chip. We then repeated the stock-settings test with our custom loop and recorded more frequent boosts to 4.35 GHz, which results in faster performance, along with a gain in minimum clock rates. This highlights that a better cooler offers more performance, even at stock settings.

Finally, we engaged the auto-overclocking PBO feature. The chip still boosted up to 4.35 GHz, but the boosts were of shorter duration, which explains why this feature often results in slightly lower performance in lightly-threaded work. However, you'll notice that the processor has a much higher minimum boost than either stock configuration, and after examining the performance logs, we noticed much higher multi-core boosts, which explains the big performance gains in threaded applications. 

Threadripper 3990X Overclocking

AMD's Ryzen 3000 processors have drastically improved single-threaded performance, but you'll lose that benefit if you manually overclock. That's because the 7nm chips can't be manually overclocked on all cores to reach the same frequency as the single-core boost frequency. In fact, the all-core overclock ceiling is often 200 to 300 MHz lower than the single-core boost speeds. 

Given the already-prodigious power draw of these 280W TDP chips, temperatures are a concern for manual overclocking, though the solder thermal interface material (sTIM) between the heatspreader and dies does help thermal dissipation. Brute-force manual overclocking may not be the best path forward with conventional cooling, but newer BIOS revisions support per-CCX overclocking, which opens up a new pathway for more fine-grained optimizations. 

We turned to AMD's auto-overclocking Precision Boost Overdrive (PBO) feature for our battery of tests. This auto-overclocking algorithm preserves most of the benefits of the single-core boost, as seen in our boost testing above, while massively speeding up threaded workloads. We paired our PBO-enabled configurations with our custom watercooling loop and a Phanteks full-coverage Glacier C399A TR4 waterblock.

We reached temperature peaks as high as 90°C during some tests, but mostly hovered around 84°C during extended threaded workloads.

Test Notes

We have plenty of exciting results to share, but our testing comes with caveats, largely due to poor software support. You can configure the Threadripper 3990X to present itself as multiple NUMA nodes that the operating system sees as separate entities with their own banks of memory.

That brings about a whole slew of problems with many applications, but you can also configure the processor to appear as one NUMA node, which we did for our testing. However, due to the vagaries of the Windows scheduler, the operating system still sees the first 64 threads as one 'processor group,' while anything above that number of threads appears as a second group. In the case of the 3990X, that means threads 65-128 appear as their own processor group to the operating system, as shown in the image below.

(Two images showing the Windows processor groups. Image credit: Tom's Hardware)
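The grouping described above comes down to simple integer arithmetic. The sketch below is a simplified model that assumes threads are assigned to groups in order, 64 at a time. The helper names are made up for this example, and the real Windows assignment can also depend on NUMA configuration.

```python
GROUP_SIZE = 64  # a Windows processor group holds at most 64 logical processors


def processor_group(thread_number, group_size=GROUP_SIZE):
    """Return the 0-based group index for a 1-based thread number."""
    if thread_number < 1:
        raise ValueError("thread_number is 1-based")
    return (thread_number - 1) // group_size


def group_count(total_threads, group_size=GROUP_SIZE):
    """Return how many groups are needed to hold total_threads."""
    return -(-total_threads // group_size)  # ceiling division


print(group_count(128))      # 2
print(processor_group(64))   # 0
print(processor_group(65))   # 1
```

Under this model, the 3990X's 128 threads split into two groups of 64, so an application that is unaware of processor groups would only see half of the chip.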

Some applications can span both groups, but many cannot, and on the server platforms we have the added complexity of multiple NUMA nodes. That means we experience sub-par scaling in some workloads with both the 3990X and our server platforms. We ran a limited subset of tests with the server platforms due to concerns about the impact of dissimilar memory and graphics, but be aware that some of these results suffer from processor grouping and/or application scaling issues. The results are still relevant, as they highlight applications where the 3990X may or may not be a better choice than a server platform. We also tested the 3990X with the 32GB memory kit recommended by AMD, but the company says some applications could benefit from more memory capacity.

Due to these disparities, be aware that further tuning and optimizations could wring more performance out of some applications, particularly on the server side of our testing. Linux doesn't use processor grouping, and we're told that can result in better performance in some applications. In either case, the target market for this processor trends towards the Windows environment and its suite of productivity tools, so we're using it for our first look at performance. 

Threadripper 3990X Power Consumption

(Album of seven images: power consumption and efficiency charts. Image credit: Tom's Hardware)

Like the Threadripper 3970X and 3960X, the 3990X peaks around the 280W stock package power limit in both AVX and non-AVX flavors of the AIDA stress test. That’s impressive given its solid performance in some of the heavy workloads we’ll see on the following pages. The power efficiency gains of TSMC’s 7nm process are nothing short of incredible when we kick up the power, too. Engaging the auto-overclocking PBO feature pushed the chip to ~470W under full load. That’s amazing for 64 overclocked cores compared to the overclocked W-3175X’s ‘mere’ 28 cores sucking down ~760W. It will be interesting to test the 3990X’s power/performance ratio in eco mode.
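To put the overclocked figures in per-core terms, here is a back-of-the-envelope calculation using the approximate package power numbers quoted above. Dividing package power by core count is a simplification chosen for this sketch; it ignores the I/O die, fabric, and other uncore power, so treat the result as a rough comparison only.

```python
def watts_per_core(package_watts, cores):
    """Naive per-core power: package power divided evenly across cores."""
    return package_watts / cores


threadripper = watts_per_core(470, 64)  # overclocked 3990X, ~470W
xeon = watts_per_core(760, 28)          # overclocked W-3175X, ~760W

print(f"3990X:   {threadripper:.1f} W per core")  # 7.3
print(f"W-3175X: {xeon:.1f} W per core")          # 27.1
print(f"Ratio:   {xeon / threadripper:.1f}x")     # 3.7
```

By this crude measure, the overclocked W-3175X draws roughly 3.7 times as much power per core as the overclocked 3990X.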

Switching over to the y-cruncher power results, we find the overclocked Threadripper 3990X pulling down 391W, which is nearly the same amount of power as the overclocked 12nm 32-core 2990WX. Finally, a quick look at the HandBrake efficiency metrics, which quantify the number of renders you can accomplish (given our workload) per day per watt of power consumed, reveals that the Core i9-10980XE is also very power efficient. However, considering that the 3990X consists of eight 7nm compute dies and one 12nm I/O die, while the 10980XE uses a single monolithic die, these metrics are more than acceptable. AMD ties the 3990X’s nine dies together with the Infinity Fabric but tuned the fabric to be incredibly power efficient and claims a 27% reduction in IFOP power consumption. The Ryzen 9 3950X takes the crown as the most efficient processor in our test pool, and it’s noteworthy that the massive 3990X is more power-efficient than the eight-core Core i9-9900K in our HandBrake tests.
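One plausible way to compute a renders-per-day-per-watt figure is sketched below. The exact formula behind the charts is not given in the text, so this is an assumption: back-to-back renders for 24 hours, divided by average power draw. The function name and the input values are placeholders, not measured results.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def renders_per_day_per_watt(render_seconds, average_watts):
    """Renders completed in 24 hours of back-to-back work, per watt drawn."""
    renders_per_day = SECONDS_PER_DAY / render_seconds
    return renders_per_day / average_watts


# Placeholder inputs: a 100-second render at an average draw of 200W.
example = renders_per_day_per_watt(100.0, 200.0)
print(f"{example:.2f} renders per day per watt")  # 4.32
```

A metric of this shape rewards both shorter render times and lower power draw, which is how a small, efficient chip can top the chart even when a much larger chip finishes each render sooner.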

Test Setup

Server Test Beds | Processors | Cores / Threads | DRAM
Supermicro AS-1023US-TR4 | Two EPYC Rome 7742 | 128 / 256 | 16x 32GB DDR4-3200
Dell/EMC PowerEdge R460 | Two Intel Xeon Platinum 8280 | 56 / 112 | 12x 32GB DDR4-2933
Gigabyte R15Z-Z32 | One EPYC Rome 7702P | 64 / 128 | 8x 32GB DDR4-3200

AMD Socket sTRX4 (TRX40): Threadripper 3990X, 3970X, 3960X
    MSI Creator TRX40
    4x 8GB G.Skill FlareX DDR4-3200 - Stock: DDR4-3200, OC: DDR4-3600

Intel Socket 2066 (X299): Core i9-10980XE
    MSI Creator X299
    4x 8GB G.Skill FlareX DDR4-3200 - Stock: DDR4-2933, OC: DDR4-3600

AMD Socket AM4 (X570): AMD Ryzen 9 3950X
    MSI MEG X570 Godlike
    2x 8GB G.Skill FlareX DDR4-3200 - Stock: DDR4-3200, OC: DDR4-3600

Intel LGA 3647 (C621): Intel Xeon W-3175X
    ROG Dominus Extreme
    6x 8GB Corsair Vengeance RGB DDR4-2666 - Stock: DDR4-2666, OC: DDR4-3600

AMD Socket SP3 (TR4): Threadripper 2990WX, 2970WX
    MSI MEG X399 Creation
    2x 8GB G.Skill FlareX DDR4-3200 - Stock: DDR4-2933, OC: DDR4-3466

Intel LGA 1151 (Z390): Intel Core i9-9900K
    MSI MEG Z390 Godlike
    2x 8GB G.Skill FlareX DDR4-3200 - Stock: DDR4-2666, OC: DDR4-3600

All Systems: Nvidia GeForce RTX 2080 Ti
    2TB Intel DC4510 SSD
    EVGA Supernova 1600 T2, 1600W
    Windows 10 Pro (1903 - All Updates)

Cooling: Corsair H115i, Enermax Liqtech 360 TR4 II, Custom Loop


  • mohammed2006
    The Threadripper 3990X performance gap is not enough to justify it over the 3970X, which I think is the one to buy.
  • King_V
    As the article states, though - this is for specific types of workload/use cases.
  • knekker
    A large number of applications don't scale well with NUMA architectures, particularly with Windows, which is the operating system of choice for visual effects artists.
    I work in the VFX industry, where I've been at ILM, DNEG, MPC and Cinesite that work on most of the block buster movies, and I can tell you this. Windows is definitely not the OS of choice, that would be linux.
    I do however currently work at a smaller vfx studio, and they use Windows.
  • splave
    Great read Paul! I love that the 64 core makes the 32 core look reasonable now haha
  • keith12
    Really enjoyed that one! Great comparison of the HEDT CPU's v Server and Mainstream, the good, the bad, and the ugly!

    Although, I don't get the almost apologetic tone in the Gaming Test notes. Yes, we know these CPU's aren't meant for gaming, but HEDT users, I'm sure, like to down tools too and game after a hard days slog! I suspect they'd like to know, along with the majority of the community, and anyone who'd be genuinely interested in these CPU's in the first place, what kind of gaming performance they can expect (and it's pretty damn good, by all accounts! ) from them.

    Anyway, including the gaming metrics is just being comprehensive. That's why I come to Tom's. Comprehensive is good. Don't resist the urge to include these benches in future comparison's. Don't mind the detractors! :D
  • domih
    So you could run a Cassandra 21-node cluster on one PC with 21 Virtual Machines each allocated with 6 threads, keeping 2 threads for the host. With a mobo max memory of 256GB, each VM could be allocated 11GB leaving 25GB for the host. AMD enables you to have fun 🆒
  • Phaaze88
    knekker said:
    I work in the VFX industry, where I've been at ILM, DNEG, MPC and Cinesite that work on most of the block buster movies, and I can tell you this. Windows is definitely not the OS of choice, that would be linux.
    I do however currently work at a smaller vfx studio, and they use Windows.
    Would your line of work actually enjoy using the 3990X, or would it just stick with something Intel again, due to the time and money lost swapping platforms?
  • bamboe
    Well here they did a linux test if you like it
    https://www.phoronix.com/scan.php?page=article&item=3990x-threadripper-linux&num=1
  • rjacko01
    I have also worked Framestore, ILM, MPC etc & the idea of running windows for vfx on that scale is seriously scary. I think it's fair to say 95%+ of vfx are linux, cause only a few smaller houses run windows, often with horrendous results.
  • derekullo
    Hypothetically with 256 megabytes of L3 you could also have a 128 thread monero miner.

    My extrapolation from the 3970X (28900 hashes/second x 2) = 57800 hashes/second x 0.9 (due to scaling not being completely linear due to lower clock speeds) = 52020 hashes per second

    Putting that into a monero calculator with a 300 watt power drain for the system and 0.06 Cost per KWh we get $1,364 profit a year.

    $3990 / $1364 = 2.9 years to recoup your investment.

    https://www.cryptocompare.com/mining/calculator/xmr?HashingPower=52020&HashingUnit=H/s&PowerConsumption=300&CostPerkWh=0.06&MiningPoolFee=1
    Comparing this to a Geforce 2080Ti we get a strangely similar 2.89 years to recoup its investment.
    $1300 / $1.23 a day = 1057 days / 365 days = 2.89 years

    With the 3950x clocking higher and being 35% cheaper per core it would make more sense to use 3 - 3950x in 3 separate rigs than a 3990x.