AMD Threadripper Pro 3995WX Review: Ripping With 8 Memory Channels

Threadripping with eight memory channels

Lenovo ThinkStation P620
Editor's Choice


Lenovo's ThinkStation P620 platform is the industry's first 64-core workstation system, and it also supports the other Threadripper Pro processors. The single-socket system even offers more performance in some threaded workloads than competing dual-socket Intel workstations.

The P620 is the first and only PCIe 4.0-capable workstation and supports up to two Nvidia Quadro RTX 8000 or four RTX 4000 GPUs, 512GB of memory (with current Lenovo memory options; this could expand in the future), and 20TB of storage spread over up to eight direct-attached storage devices. Naturally, the system supports a wide array of different graphics solutions. The P620 comes with 10Gb Ethernet (via a Marvell AQtion AQN-107 NIC) as a standard networking option, which is attractive to the workstation crowd. Lenovo also offers an optional Intel 9260 802.11ac (2x2) WiFi + Bluetooth 5.1 adaptor.

Threadripper Pro processors support 128 lanes of PCIe 4.0, but Lenovo doesn't use all of the lanes for this particular chassis – the P620 supports 80 PCIe 4.0 lanes for the PCIe slots, which leads the workstation segment.

The 33-liter P620 chassis is identical to the chassis used for the Intel-powered Lenovo P520. Front panel connectivity includes two USB-A 3.2 Gen2 ports (one supports Always-On and fast charge), two USB-C 3.2 Gen2 ports, and a microphone/headphone combo jack. The rear panel holds four USB 3.2 Gen2 Type-A ports, two USB 2.0 Type-A ports, two PS/2 ports, audio in/out and microphone ports, and the RJ45 10Gb Ethernet connection. Our test subject came outfitted with a DVD-ROM and 15-in-1 card reader on the front panel, both of which are optional. Audio comes courtesy of the Realtek ALC4050H.

The side panel has a locking latch. Lenovo supports all of AMD's Pro Manageability features, like Secure Boot and the DASH Manageability suite, along with support for ThinkStation Diagnostics and TPM 2.0 data security. Internal expansion slots consist of four PCIe 4.0 x16 slots and two PCIe 4.0 x8 slots. 
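As a quick sanity check on the lane figures, the slot layout described above can be tallied against the 80 lanes Lenovo quotes for the PCIe slots. The sketch below is illustrative only: the helper names are hypothetical, and the per-lane bandwidth is the theoretical PCIe 4.0 rate (16 GT/s with 128b/130b encoding), not a measured figure from our testing.

```python
# Slot layout as described in the review: four x16 and two x8 PCIe 4.0 slots.
SLOTS = {16: 4, 8: 2}  # lanes per slot -> number of slots

CPU_LANES = 128  # lanes a Threadripper Pro processor exposes, per the review

# Theoretical PCIe 4.0 rate: 16 GT/s per lane with 128b/130b encoding,
# converted from gigabits to gigabytes per second (one direction).
GB_S_PER_LANE = 16 * (128 / 130) / 8


def slot_lanes(layout):
    """Total lanes consumed by a {width: count} slot layout."""
    return sum(width * count for width, count in layout.items())


def slot_bandwidth_gb_s(width):
    """Theoretical one-direction bandwidth of a slot with `width` lanes."""
    return width * GB_S_PER_LANE


total = slot_lanes(SLOTS)
print(f"Lanes routed to slots: {total} of {CPU_LANES}")
print(f"x16 slot: ~{slot_bandwidth_gb_s(16):.1f} GB/s per direction")
print(f"x8 slot:  ~{slot_bandwidth_gb_s(8):.1f} GB/s per direction")
```

The tally comes to 80 lanes, which matches the figure Lenovo quotes for the slots; real-world throughput will be lower than the theoretical per-slot numbers because of protocol overhead.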

The system comes with Windows 10 Pro 64, which stands in contrast to other Lenovo workstations that come with Windows 10 Pro for Workstations. Lenovo says that it has an agreement with Microsoft to use only Windows 10 Pro for the first-gen Threadripper Pro platform. Lenovo doesn't believe that this results in the loss of any key features, and the P620 also supports Ubuntu Linux LTS.

The ThinkStation P620 doesn't come with a liquid cooling option. Instead, it uses a custom-built air cooler with two fin stacks and five heatpipes running through each. The forward fin stack, which has an 80mm fan, is shorter than the rear stack. This helps ensure that the rear portion of the heat sink, which also has an 80mm fan (but in a higher mounting position), has access to airflow that isn't preheated by the forward fin stack.

AMD and Lenovo jointly developed this compact air cooler, and it is incredibly efficient given its size: we didn't encounter any unacceptably high temperatures during plenty of extremely demanding workloads (peaks in the mid-80°C range). However, we have to remember that Threadripper processors self-modulate performance based on available thermal and electrical headroom, so we could see yet more performance with beefier air or liquid coolers.

Lenovo's press materials refer to a dedicated air channel, apparently provided via a large plastic shroud that isolates the CPU from other internal componentry, but our test system didn't ship with one. This might come with specific configurations only, but we've pinged Lenovo for further detail. 

The system itself, which has a 92mm fan to draw air into the front of the case and another 92mm fan for exhaust, is also incredibly quiet, even under full load. Naturally, cooling performance will vary based upon GPU selection. Still, we didn't encounter any issues with the Gigabyte Eagle RTX 3090, which exhausts into the interior of the case, or the Nvidia Quadro RTX 8000, which uses a blower fan to exhaust waste heat out of the rear of the case. The latter type of GPU will obviously be the most common choice in this type of chassis.

Threadripper Pro chips differ from their standard Threadripper counterparts by supporting eight channels of DDR4-3200 and a maximum memory capacity of 2TB, much like their EPYC server counterparts. However, our Lenovo ThinkStation P620 supports only 512GB of memory with its one-DIMM-per-channel (1DPC) design. The company says that capacity could expand to 1TB with future 128GB modules. Naturally, 2DPC workstations would enable higher memory capacities.
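The capacity and bandwidth figures follow from simple arithmetic, sketched below. Two points are assumptions rather than anything Lenovo or AMD stated to us: the 64GB module size is inferred from 512GB spread across eight 1DPC slots, and the 2TB platform maximum is modeled here as 128GB modules at 2DPC. The bandwidth number is the theoretical DDR4 peak (a 64-bit, or 8-byte, channel at 3200 MT/s), not a benchmark result.

```python
CHANNELS = 8            # memory channels on Threadripper Pro, per the review
DATA_RATE_MT_S = 3200   # DDR4-3200
BYTES_PER_TRANSFER = 8  # a DDR4 channel is 64 bits wide


def capacity_gb(module_gb, dimms_per_channel, channels=CHANNELS):
    """Total capacity for a given module size and DIMMs-per-channel design."""
    return module_gb * dimms_per_channel * channels


def peak_bandwidth_gb_s(populated_channels=CHANNELS, data_rate=DATA_RATE_MT_S):
    """Theoretical peak bandwidth across the populated channels, in GB/s."""
    return populated_channels * data_rate * BYTES_PER_TRANSFER / 1000


# Assumed module sizes: 64GB today (inferred), 128GB in the future.
print("P620 today (64GB, 1DPC):  ", capacity_gb(64, 1), "GB")
print("P620 future (128GB, 1DPC):", capacity_gb(128, 1), "GB")
print("2DPC board (128GB, 2DPC): ", capacity_gb(128, 2), "GB")

print("Peak bandwidth, 8 channels:", peak_bandwidth_gb_s(8), "GB/s")
print("Peak bandwidth, 4 channels:", peak_bandwidth_gb_s(4), "GB/s")
```

The last two lines show why populating all eight channels matters: halving the populated channels halves the theoretical peak bandwidth, regardless of total capacity.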

In either case, Lenovo's custom WRX80 motherboard allows you to fully populate all eight memory channels across two banks of four DIMMs. As we can see above, the memory modules are actively cooled by a custom enclosure that attaches to the DIMM sockets.

Our test system came armed with 128GB of DDR4-3200 ECC memory spread across eight SK hynix HMA82GR7CJR8N-XN memory modules. The system doesn't allow manipulation of the memory frequency and timings, instead forcing us to use the default SPD profile that imposes JEDEC timings of 24-22-22-52-74. This is of no concern to most professional users but did prevent us from making 100% like-for-like comparisons with our other test subjects in the benchmarks below.
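To put those JEDEC timings in perspective, cycle counts can be converted to nanoseconds. The sketch below assumes the five numbers follow the common CL-tRCD-tRP-tRAS-tRC ordering (the article doesn't label them), and the CL16 kit used for comparison is a hypothetical example, not one of our test configurations.

```python
def cycles_to_ns(cycles, data_rate_mt_s):
    """Convert a timing in memory-clock cycles to nanoseconds.

    DDR memory transfers data twice per clock, so the memory clock in MHz
    is half the data rate in MT/s.
    """
    clock_mhz = data_rate_mt_s / 2
    return cycles * 1000 / clock_mhz


# Timings from the P620's SPD profile; the labels are an assumed ordering.
jedec_timings = {"CL": 24, "tRCD": 22, "tRP": 22, "tRAS": 52, "tRC": 74}

for name, cycles in jedec_timings.items():
    print(f"{name:>4}: {cycles:2d} cycles = {cycles_to_ns(cycles, 3200):.2f} ns")

# Hypothetical enthusiast DDR4-3200 kit with a CAS latency of 16 cycles.
print(f"CL16 at DDR4-3200: {cycles_to_ns(16, 3200):.2f} ns")
```

At the same data rate, a CAS latency of 24 cycles works out to 15 ns, versus 10 ns for the hypothetical CL16 kit, which illustrates why fixed JEDEC timings complicate like-for-like comparisons.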

Our system came with the Samsung PM981a, a PCIe 3.0 x4 OEM SSD, but we used our own PCIe 4.0 SSD for testing with professional apps (we verified the SSD operated at PCIe 4.0 speeds). Lenovo doesn't have PCIe 4.0 SSDs currently available to configure with the system, but as you would imagine, those will be listed soon. You can load the chassis with up to five 3.5" SATA HDDs and nine M.2 SSDs, though only two of the latter are mounted via a standard M.2 socket on the motherboard (supports RAID 0 and 1). Additional drives are mounted on PCIe adaptor cards.

Lenovo offers a Flex Bay for the front panel, making access to a swappable 3.5" storage device easy. In contrast, the ThinkStation's M.2 SSDs are mounted to the motherboard in a rather hard-to-access area underneath the GPU, meaning quick M.2 SSD swaps aren't an option. However, with a focus on easily-swappable internal componentry like fans, PSU, and the front bay items, the rest of the chassis is excellent in terms of serviceability.

The 1000W PSU (92% efficiency) is tool-less and pulls out easily with the embedded handle. This power supply connects directly to the motherboard via an embedded power supply connector, and the motherboard then distributes all of the system power. This arrangement, as shown above, helps to reduce internal wiring. It also means that power connectors for other devices, like the GPU and SATA drives, are fed from ports that hang off the side of the motherboard (second-to-last image) instead of through the typical wiring that comes directly from the power supply.

Naturally, pumping this much power through the motherboard itself requires a rather thick PCB, but we aren't sure of the layer count. The front I/O panel also attaches to the motherboard via a long custom PCB connector, as seen in the last image, all of which obviously results in a rather exotic motherboard compared to what we see in the consumer space. 

Lenovo positions the P620 for workloads ranging from product design, architecture, and 3D CAD/CAM to AR, VR, and simulations. The system slots in between Lenovo's single-socket P520 and the company's dual-socket P720, both of which are powered by Intel processors.

The Lenovo ThinkStation P620 starts at $3,619 for the 12-core 24-thread 3945WX processor paired with 16GB of memory, Nvidia Quadro P620 2GB, 256GB M.2 PCIe 3 SSD, and the 1000W power supply. This configuration is customizable and swapping the processor for the Threadripper Pro 3995WX bumps pricing up to $10,675. Naturally, you can spend as much as you'd like by adding a plethora of other devices to the build, like more memory, storage, and graphics. 

The highest-end preconfigured system lands at $6,029 with the 16-core 32-thread 3955WX with 32GB of DRAM, Quadro RTX 4000, and 1TB SSD. 

All configurations come with three years of on-site support, which is a critical feature for professional users. For an additional fee, you can extend that warranty up to five years, and also select a higher 'Premier' tier that offers next business day on-site service. 


Paul Alcorn
Managing Editor: News and Emerging Tech

Paul Alcorn is the Managing Editor: News and Emerging Tech for Tom's Hardware US. He also writes news and reviews on CPUs, storage, and enterprise hardware.

  • CerianK
    Probably a pointless question, but I assume the 16GB are dual-rank... I would be curious how 16GB single-rank (which I understand exist, but are the minority in the market) modules would perform in the 128GB configuration? Probably no difference, but might be worth exploring with a few select benchmarks, if possible.
  • gatg2
    hate to be that guy but, it's not actually the first PCIe 4.0 capable workstation on the market, that honor goes to the Talos II Secure Workstation
    https://www.raptorcs.com/TALOSII/
  • fellow
    I love these, especially the 12-16 cores at 4GHz, much closer to 5900 and 5950 for lightly threaded workloads. Great solution for those wanting expandable server and workstation features.

    I like the look of those Raptors too, especially the pci-4 and memory bandwidth. May get a Blackbird for testing and open source (mostly) fast hardware. See Phoenix coverage Part 2— the first were not as promising.

    For Threadripper Pro, has there been any information about the socket and CPU upgrade path?

    My main concern is the upcoming release of Zen3 Threadrippers. I imagine there will then be a Zen3 Threadripper Pro in a couple quarters or a year from now. The memory and pci expansion makes this an excellent platform for future growth.

    Since AMD has been forward looking by using the same socket for Ryzen, is it safe to expect the Zen3 TRPro will be accepted in this new socket?

    Gracias,

    fellow
  • Endymio
    ... the most powerful workstation chip on the market - it's 64 cores easily outweigh Intel's
    Emergency edit on aisle four, please.

    Also, do I misunderstand the article, or has Toms yet again pronounced a verdict on a product they as yet haven't seen, or has even been released?
  • Mandark
    Intel has nothing to touch the thread ripper so there’s nothing wrong with that statement
  • Endymio
    Mandark said:
    Intel has nothing to touch the thread ripper so there’s nothing wrong with that statement
    Examine the highlighted word.
  • hitchhiker0
    Fantastic! I like them very much.
    Picking a Threadripper Pro 3975WX, 128 GB RAM, some SSD, some NVidia GPU and make a virtual desktop infrastructure for computer-aided designing.
    You can host 4-6 virtual desktops quickly.
  • Stefan Dyulgerov
    Hey in your benchmarks, can you include compilation of the Unreal Engine editor?
    The engine is quite taxing on the cpu both c++ and the shaders.
    Most people that are alone struggle with it. If you work in studio you can share cores, but at home alone:)
  • mikewinddale
    Nice review, thanks.

    But I just discovered something interesting that you missed in the review:

    If you install six (6) dimms, applications like AIDA64, CPU-Z, etc. will recognize it as "hexa" channel, but benchmarks will reveal that the actual memory throughput is equivalent to merely dual-channel.

    So you can populate four or eight DIMMs, but be careful with six.

    For my application, I started a 3955WX with 4x64 GB RAM. I discovered that wasn't enough, so I upgraded to 6x64. My application now had enough RAM, but performance declined. So I had to upgrade to 8x64.
  • robcowart
    @mikewinddale In my testing it is even worse than sticking to 4 or 8 populated channels. Anything less than all 8 channels has a significant impact on performance. The hardware setup for these tests was: 3995wx, ASUS Pro Sage, 256GB 3200MHz, writing to 4 x Samsung 980 Pro in RAID-0. Interesting is that while throughput dropped, meaning that technically the system is doing less work, the CPU utilization increased when all 8 memory channels weren't populated. I do wonder if the different channel-to-chiplet affinity between your 16-core and my 64-core model is responsible for why you don't see as big of a hit as I do with only 4 channels populated.

