AMD Threadripper Pro 3995WX Review: Ripping With 8 Memory Channels

Threadripping with eight memory channels

Lenovo ThinkStation P620
Editor's Choice

AMD Threadripper Pro 3995WX Desktop PC Application Benchmarks - The TLDR: 

Here we can see that the Threadripper Pro 3995WX continues to deliver the class-leading threaded horsepower we expect of these core-heavy chips in our geometric mean of multi-threaded workloads. Still, the 3995WX's increased memory throughput and capacity don't yield tremendous gains in most of these desktop PC-centric applications. 

Instead, the Threadripper 3990X is the right chip for that job, largely due to its higher clock rates. There are exceptions sprinkled throughout our testing below, but it's important to remember that the Threadripper 3990X and 3995WX are specialized chips targeted at certain applications - and there the chips deliver. As we can see, even from the cumulative measurements above, the Threadripper chips devastate Intel's competing chips in threaded workloads. 

Flipping through to the geometric mean of the most lightly-threaded tests in our suite, we can see that the Threadripper Pro 3995WX largely delivers the same amount of performance as its forebear, the 3990X. Surprisingly, the Threadripper processors outstrip the W-3175X in these tasks, but the Core i9-10980XE continues to hold the single-threaded crown among the workstation-class chips. As expected, consumer-focused chips still dominate our single-threaded rankings. 
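
For readers unfamiliar with the metric, here is a minimal sketch of how a geometric mean of benchmark results can be computed. The function name and the sample scores are hypothetical and are not taken from our test data; the sketch also assumes each score has already been normalized against a baseline chip. Unlike an arithmetic average, the geometric mean keeps one outsized result from dominating the overall figure.

```python
import math


def geometric_mean(scores):
    """Return the geometric mean of a list of positive benchmark scores."""
    if not scores:
        raise ValueError("need at least one score")
    if any(s <= 0 for s in scores):
        raise ValueError("scores must be positive")
    # Average the logarithms, then exponentiate. This is equivalent to the
    # n-th root of the product but avoids overflow on long lists.
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


# Hypothetical relative scores (baseline chip = 1.0 in every test).
# These numbers are illustrative only and do not come from this review.
relative_scores = [2.0, 0.5, 1.0]
print(f"geometric mean: {geometric_mean(relative_scores):.3f}")
```

In this made-up example, a chip that is twice as fast in one test and half as fast in another comes out even with the baseline, whereas an arithmetic average would show it ahead.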

Note: We see some inversions in the workloads below, with the 32GB 3995WX configuration outperforming the 128GB setup. We theorize that this is due to the lower memory latency we recorded when only one dual-channel memory controller is active.  

Rendering Benchmarks on AMD Threadripper Pro 3995WX

The rendering benchmarks land right in Threadripper Pro's target market. Cinebench has long been AMD's favorite benchmark for a simple reason: the Zen microarchitecture has always performed extremely well in this threaded test. The benchmark doesn't benefit from the increased memory throughput of the octo-channel 3995WX, so the 3990X takes the top of the chart on the strength of its higher clock rates. Meanwhile, Intel's chips lag far behind due to their comparatively low core counts. 

Flipping over to the single-threaded Cinebench workload shows that AMD has stepped forward in per-core performance with the Threadripper 3000 processors. The 3995WX and 3990X take a slim lead over the Core i9-10980XE while thoroughly outstripping the W-3175X. The consumer chips dominate the chart, though. 

We recently integrated the Intel Open Image Denoise Benchmark into our suite. This ray-tracing test uses Intel's oneAPI rendering toolkit, so it is more of an academic exercise than an indication of real-world performance, at least for now. oneAPI is still in the early days of development, not to mention adoption, and while the test is an interesting display of Intel's latest approach, it is decidedly Intel-friendly. This test does scale well with additional memory bandwidth, as we can see with the scaling between the 32, 64, and 128GB 3995WX configurations. Ultimately, that leads to the 3995WX taking the lead over the Intel Xeon W-3175X. 
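
To put the memory configurations in perspective, the sketch below estimates theoretical peak memory bandwidth as a function of populated channels. It rests on assumptions not stated in this section: DDR4-3200 memory, a 64-bit (8-byte) data bus per channel, and 16GB DIMMs installed one per channel, so that the 32, 64, and 128GB setups populate two, four, and eight channels, respectively. The function name is hypothetical, and measured throughput will always land below these ceilings.

```python
def peak_bandwidth_gbs(channels, transfer_rate_mts=3200, bus_width_bytes=8):
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes).

    channels          -- number of populated memory channels
    transfer_rate_mts -- transfers per second in millions (assumed DDR4-3200)
    bus_width_bytes   -- bytes moved per transfer per channel (64-bit bus)
    """
    return channels * transfer_rate_mts * bus_width_bytes / 1000


# Assumed mapping of capacity to populated channels (16GB DIMMs, one per channel).
for capacity_gb, channels in [(32, 2), (64, 4), (128, 8)]:
    bandwidth = peak_bandwidth_gbs(channels)
    print(f"{capacity_gb}GB ({channels} channels): {bandwidth:.1f} GB/s peak")
```

Under those assumptions, each doubling of populated channels doubles the theoretical ceiling, which is consistent with a bandwidth-sensitive test improving as capacity rises.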

The POV-Ray multi-thread benchmark puts the heft of Threadripper's threads on full display, as the 3995WX offers nearly twice the performance of the W-3175X, but again, the 3990X takes the lead. That's largely because the increased memory throughput doesn't impact this benchmark, which leaves the 3990X's higher clock rates to decide the outcome. The Threadripper chips trail the consumer chips in the single-core POV-Ray benchmark but slide past Intel's competing workstation-class chips again. 

Intel does pull off a few isolated wins in the PCMark 10 subtests, but most of these tests skew towards Threadripper. Flipping through the remainder of the tests, including V-Ray, Blender, and C-Ray, shows that most of these workloads aren't impacted by the 3995WX's extra memory throughput and capacity. In either case, the chip delivers roughly the same resounding leads over Intel's competing chips as the Threadripper 3990X. 

Encoding Benchmarks on AMD Threadripper Pro 3995WX

Our encoding tests include benchmarks that respond best to single-threaded performance, like the quintessential examples LAME and FLAC, but the SVT-AV1 and SVT-HEVC tests represent a newer class of threaded encoders. 

It's no surprise to find the Core i9-10980XE, along with the consumer chips, faring better than the Threadripper CPUs in LAME, but the Threadripper chips are surprisingly strong in the FLAC audio encoding benchmark. 

The SVT-AV1 and SVT-HEVC benchmarks show that these threaded encoders respond well to increased core counts, granting Threadripper Pro impressive results, but the software doesn't appear to be entirely optimized for the 64-core Threadripper's unique architecture - the 32-core Threadripper 3970X leads in these tests. 

Flipping over to HandBrake, we can see that the x264 and x265 tests benefit slightly from the increased memory throughput of the 128GB configuration, but it's important to note that these tests are of relatively short duration. AMD tells us that longer-duration threaded tests can expose larger performance deltas. In either case, the Threadripper chips beat the Intel comparables. 

Web Browsing on AMD Threadripper Pro 3995WX

We test all of these benchmarks in a version-locked Chrome browser, with the notable exception of the Edge test. Intel has really taken quite the performance haircut in web browsers over the last two years, largely due to mitigations for its nagging security concerns. Regardless, most of these benchmarks are almost exclusively lightly-threaded, so Intel has long held the top of the charts despite the mitigations. 

AMD's Zen 3 architecture in the Ryzen 5000 series processors has changed that paradigm entirely, but we see many of the same trends with the 3995WX as we see with the 3990X: the Threadripper chips take the lead in the threaded Edge and WebXPRT 3 test suites, but trail the 10980XE in ARES-6, JetStream 2, and Speedometer 2.

Office and Productivity on AMD Threadripper Pro 3995WX

If you're looking to build a screaming-fast workstation, you're probably not doing it to run office applications like Word at breakneck speeds. However, these types of applications are ubiquitous the world over, so snappy performance is important for daily tasks. 

The Threadripper chips perform well in the Office suite, with high marks in the Excel, Application Start-Up, and PowerPoint subtests helping to lift the overall score. 

The Intel Core i9-10980XE leads the lightly-threaded GIMP image processing benchmarks, but the W-3175X trails the rest of the test pool by a large margin in a few of the subtests, possibly indicating a conflict with its mesh interconnect. Conversely, the Threadripper processors take an easy lead in the PCMark 10 photo editing subtest, reminding us that much of the performance in individual applications boils down to how well the software can take advantage of extra cores and threads.  

Compilation, Compression, AVX Performance on AMD Threadripper Pro 3995WX

Our 7-Zip results are interesting, but this benchmark runs directly from memory. The 3995WX has a sizeable capacity advantage over the 3990X, which we tested with 32GB of memory and tighter timings. Keep that in mind as you analyze the results. Those same factors also impact the y-cruncher benchmarks, where Intel maintains a lead in the single-threaded test but trails in the multi-threaded rendition. Also, bear in mind that Geekbench test results are particularly sensitive to memory bandwidth and capacity.

Paul Alcorn
Managing Editor: News and Emerging Tech

Paul Alcorn is the Managing Editor: News and Emerging Tech for Tom's Hardware US. He also writes news and reviews on CPUs, storage, and enterprise hardware.

  • CerianK
    Probably a pointless question, but I assume the 16GB are dual-rank... I would be curious how 16GB single-rank (which I understand exist, but are the minority in the market) modules would perform in the 128GB configuration? Probably no difference, but might be worth exploring with a few select benchmarks, if possible.
  • gatg2
    hate to be that guy but, it's not actually the first PCIe 4.0 capable workstation on the market, that honor goes to the Talos II Secure Workstation
    https://www.raptorcs.com/TALOSII/
  • fellow
    I love these, especially the 12-16 cores at 4GHz, much closer to 5900 and 5950 for lightly threaded workloads. Great solution for those wanting expandable server and workstation features.

    I like the look of those Raptors too, especially the pci-4 and memory bandwidth. May get a Blackbird for testing and open source (mostly) fast hardware. See Phoenix coverage Part 2— the first were not as promising.

    For Threadripper Pro, has there been any information about the socket and CPU upgrade path?

    My main concern is the upcoming release of Zen3 Threadrippers. I imagine there will then be a Zen3 Threadripper Pro in a couple quarters or a year from now. The memory and pci expansion makes this an excellent platform for future growth.

    Since AMD has been forward looking by using the same socket for Ryzen, is it safe to expect the Zen3 TRPro will be accepted in this new socket?

    Gracias,

    fellow
  • Endymio
    ... the most powerful workstation chip on the market - it's 64 cores easily outweigh Intel's
    Emergency edit on aisle four, please.

    Also, do I misunderstand the article, or has Toms yet again pronounced a verdict on a product they as yet haven't seen, or has even been released?
  • Mandark
    Intel has nothing to touch the thread ripper so there’s nothing wrong with that statement
  • Endymio
    Mandark said:
    Intel has nothing to touch the thread ripper so there’s nothing wrong with that statement
    Examine the highlighted word.
  • hitchhiker0
    Fantastic! I like them very much.
    Picking a Threadripper Pro 3975WX, 128 GB RAM, some SSD, some NVidia GPU and make a virtual desktop infrastructure for computer-aided designing.
    You can host 4-6 virtual desktops quickly.
  • Stefan Dyulgerov
    Hey in your benchmarks, can you include compilation of the Unreal Engine editor?
    The engine is quite taxing on the cpu both c++ and the shaders.
    Most people that are alone struggle with it. If you work in studio you can share cores, but at home alone:)
  • mikewinddale
    Nice review, thanks.

    But I just discovered something interesting that you missed in the review:

    If you install six (6) dimms, applications like AIDA64, CPU-Z, etc. will recognize it as "hexa" channel, but benchmarks will reveal that the actual memory throughput is equivalent to merely dual-channel.

    So you can populate four or eight DIMMs, but be careful with six.

    For my application, I started with a 3955WX and 4x64 GB RAM. I discovered that wasn't enough, so I upgraded to 6x64. My application now had enough RAM, but performance declined. So I had to upgrade to 8x64.
  • robcowart
    @mikewinddale In my testing it is even worse than sticking to 4 or 8 populated channels. Anything less than all 8 channels has a significant impact on performance. The hardware setup for these tests was: 3995wx, ASUS Pro Sage, 256GB 3200MHz, writing to 4 x Samsung 980 Pro in RAID-0. Interesting is that while throughput dropped, meaning that technically the system is doing less work, the CPU utilization increased when all 8 memory channels weren't populated. I do wonder if the different channel-to-chiplet affinity between your 16-core and my 64-core model is responsible for why you don't see as big of a hit as I do with only 4 channels populated.

