32-Core Overclock: How I Pushed the Threadripper 3970X 1 GHz Over Its Limit
I take the Cinebench world record with the Threadripper 3970X
AMD's recent CPUs are not known for their overclocking prowess. While the company ships all of its processors unlocked, including its powerful Threadrippers, most don't have enough headroom to gain more than a few hundred MHz of clock speed. However, with the help of LN2 (liquid nitrogen) cooling and some settings tweaks, I was able to push AMD's 32-core Ryzen Threadripper 3970X as high as 5.5 GHz (5.4 to 5.5 GHz across all cores) and set some new world records on Cinebench in the process.
I achieved a score of 23,323 points with the Threadripper 3970X, beating the previous record of 17,027 set with the Threadripper 2990WX and finishing well ahead of Intel's flagship Xeon W-3175X, which had earlier set a record of 17,000 points at 4.8 GHz.
The Threadripper 3970X is undoubtedly a behemoth. The PCB and integrated heat spreader (IHS) are very robust in comparison to Intel's current chips, and that matters a great deal to me. I apply hundreds of pounds of pressure to the processor when I mount my liquid nitrogen pot, an eight-pound hunk of copper, and when I tighten the screws with pliers I have to trust that the components will not bend or break. Any amount of bowing will drop memory channels and cause other headaches.
With the Threadripper 3970X, I didn't experience any issues at all, even when pushing the boundaries beyond my own comfort: AMD’s socket and hold-down is absolute perfection to me. It’s entirely idiot-proof, kind of like loading a cassette into a tape player (what are those?).
My only niggle is AMD's choice of threading for the holes that you use to mount the cooler to the socket. AMD's socket sTRX4 uses a very rare M3.5 x 0.6mm threading, which is important because I have to buy long rods to mount my LN2 pot. A yard of the rod and four thumb nuts will set you back a hard-to-accept $85.
Enough banter, let's overclock this baby. I will give you an idea of what it will hit with an all-in-one water cooler (will it even work?), a custom watercooling loop, and then finally on liquid nitrogen. You can find a list of the parts I will use below.
- AMD 3970X Processor
- Thermal Grizzly Kryonaut LHE Edition
- ASRock TRX40 Taichi Motherboard
- GSKILL NEO 8GBx4 F4-3800C14-8GTZN
- Enermax Max Tytan 1250W Power Supply
- Enermax Liqtech TR4 II Series 360 AIO
- Byski A Ryzen Tech V2 X Water Block w/ 9x120 MO-RA radiator and D5 Pumps
- 8ECC Threadripper LN2 Pot
I used the ASRock TRX40 Taichi motherboard, which has a 16-phase VRM with 90-amp chokes, all in a normal ATX form factor. It’s a tank that's built to handle the 64-core Threadripper 3990X that should be released soon, so it easily handles the “baby” 32-core model. I didn't even actively cool the VRM or use the included VRM cooler, but it remained cool to the touch. This board is built for extreme overclocking (XOC), so normal ambient overclocking is a joke. Consider the motherboard handled.
First off, let’s test the Enermax Liqtech TR4 II Series all-in-one water cooler. This cooler is made specifically for Threadripper so it should work, but can we actually overclock on it? The answer is, hell yes we can. The cold plate is extremely flat out of the box which meshes perfectly with AMD’s well-refined integrated heat spreader (IHS). It should also go without saying that the full-coverage cold plate, which covers the entirety of the IHS, is a must.
I fed copious amounts of voltage to the cores with no sweat. I managed an impressive 4,450 MHz in Cinebench R20, hitting a peak of 78C on the hottest core as I blasted my way to a monstrous score of 19,600. Honestly, that is shockingly high. This chip means business.
Next, I moved on to the custom loop with a Byski waterblock, and I was quite surprised with the results. Keep in mind that the threaded rod and those thumb nuts cost me more than the water block did, which is outrageous. I reached the same frequency limit as I did with the Enermax Liqtech TR4 II Series.
The Byski did allow me to use slightly less core voltage (0.05V less) as the temps were, on average, 5C cooler, but this made no difference and 5C is nearly irrelevant to me. Bumping up the voltage further did not help me push clocks any higher, so it's clear the CPU is at the limit of ambient cooling. Let’s remove that limit.
LN2 Overclocking the Threadripper 3970X
Now that we're done with ambient cooling, let's dump some cold juice on it. I strapped my Der8auer-designed 8ECC Threadripper pot on with my expensive threaded rods and thumb nuts and went to work.
If you think just anyone could do this, you would be wrong. We have many different things at play here: Mo’ chiplets, mo’ problems.
Threadripper is tricky when you first start out on LN2 benching because we need to limit our fabric clock in order to run extremely cold temperatures, but the limit varies with each processor. There is not a standard set-in-stone rule as to what will work.
The fabric can easily run at 1,800-1,900 MHz on all cores with ambient cooling. On LN2, I can easily run it at 1,600 MHz with no cold bug issues (even at -189C). However, I can't restart the computer when I raise the fabric to 1,700 MHz. When the system hangs, I get an error code at POST until I heat the pot back up to -158C and boot into Windows. Then I have to drop back down to -189C and try again.
Once you have your fabric sorted, then you have to tweak the memory and Uclk. The memory divider you decide on is directly tied to the Uclk. Memory and Uclk will run at a 1:1 ratio until you break 1,800 MHz (3,600 MHz). Then it switches over to 1:2.
For example, 2,000 MHz (4,000 MHz) memory will leave you with a 1,000 MHz Uclk. This is referenced as “coupled mode” in AMD's Ryzen Master software when you have a 1:1 ratio. Efficiency-wise, most benchmarks benefit from this coupled mode. In this case, instead of going for crazy bandwidth and looser latency, I opted for 3,600 MHz memory at tight latency to keep the Uclk higher and coupled to the memory.
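The coupling rule described above can be sketched in a few lines of Python. This is only an illustration of the behavior as stated in this article (1:1 up to a 1,800 MHz memory clock, 1:2 beyond it); the function name and the fixed threshold are assumptions for the example, not anything exposed by Ryzen Master or the BIOS.

```python
# Sketch of the memory-to-Uclk rule described in the text.
# Assumption: the switch from 1:1 to 1:2 happens strictly above 1,800 MHz,
# as observed on this platform; other boards or firmware may differ.

COUPLED_LIMIT_MHZ = 1800.0  # highest memory clock that stays 1:1 (per the article)


def uclk_mhz(mem_clock_mhz: float, limit_mhz: float = COUPLED_LIMIT_MHZ) -> float:
    """Return the Uclk for a real memory clock (half the DDR4 transfer rate)."""
    if mem_clock_mhz <= limit_mhz:
        return mem_clock_mhz      # coupled mode, 1:1
    return mem_clock_mhz / 2      # decoupled mode, 1:2


def is_coupled(mem_clock_mhz: float, limit_mhz: float = COUPLED_LIMIT_MHZ) -> bool:
    """True when memory and Uclk run at the same frequency."""
    return mem_clock_mhz <= limit_mhz


if __name__ == "__main__":
    for mem in (1800, 1900, 2000):
        mode = "1:1 coupled" if is_coupled(mem) else "1:2 decoupled"
        print(f"DDR4-{2 * mem}: memory {mem} MHz, Uclk {uclk_mhz(mem):.0f} MHz ({mode})")
```

Under this rule, a DDR4-3800 kit run at its rated speed (1,900 MHz) would land at a 950 MHz Uclk, while the same kit at 3,600 MHz keeps the Uclk at 1,800 MHz.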
This can be frustrating when you're at the bleeding edge of clock speeds. I can tell you that we need every single degree of cold that we can get: The system sucks down 1,200W in Cinebench R20 with a 1.78V vCore and healthy amounts of SoC voltage. The idle wattage alone, as read from the Enermax MaxTytan 1250W PSU, shows 200+ watts.
In short, the cores love the cold and the fabric hates the cold, which makes things tricky.
Now the fabric is happy at 1,600 MHz, the memory is happy with Uclk coupled at 1,800 MHz, and I'm happy because I'm not wasting LN2 to find the settings that work anymore. Now that we are in the OS “full pot,” as we XOC’ers call it (the pot is literally full of LN2 and is as cold as possible), we need to control our speeds.
Ryzen Master is an excellent overclocking application. It lets us change the core clocks individually, by half a chiplet, by an entire chiplet, or by the entire processor. With the 32-core processor essentially being four 8-core chiplets, it's safe to treat them like four separate processors.
In my case, I dialed in two of the chiplets at 5,400 MHz, one at 5,425 MHz, and the last chiplet at an impressive 5,550 MHz. Most benchmarks can handle the different clock speeds quite well and will assign the work accordingly. Geekbench, on the other hand, seems to get confused with some threads finishing certain tasks before the others and prefers continuity between the cores.
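To put a single number on those mixed clocks, here is a small Python sketch that averages the per-chiplet settings listed above. It assumes eight active cores per chiplet (as described for the 32-core part) and that every core in a chiplet runs at that chiplet's set clock; the variable names are made up for the example.

```python
# Average all-core clock from the per-chiplet settings quoted in the text.
# Assumption: 8 cores per chiplet, each running at the chiplet's set clock.

CORES_PER_CHIPLET = 8
chiplet_clocks_mhz = [5400, 5400, 5425, 5550]


def average_core_clock(clocks_mhz, cores_per_chiplet=CORES_PER_CHIPLET):
    """Core-weighted mean clock across all chiplets, in MHz."""
    total_cores = len(clocks_mhz) * cores_per_chiplet
    total_mhz = sum(clock * cores_per_chiplet for clock in clocks_mhz)
    return total_mhz / total_cores


if __name__ == "__main__":
    avg = average_core_clock(chiplet_clocks_mhz)
    print(f"Average clock across 32 cores: {avg:.2f} MHz")
```

The result works out to roughly 5,444 MHz, which lines up with the approximately 5.45 GHz figure quoted for the Cinebench R20 run.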
With these speeds, I took the Cinebench R20 Protected by Benchmate World Record at a staggering 23,323 points. That beats the previous record of 17,027 with the Threadripper 2990WX and it's well over the score from Intel's flagship Xeon W-3175X that hit 17,000.
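For a rough sense of scale, the following Python sketch compares the scores quoted in this article. The LN2 average clock is derived from the per-chiplet settings above rather than measured, so treat the clock comparison as an approximation; the helper name is an assumption for the example.

```python
# Back-of-the-envelope comparison of the Cinebench R20 results in the text.
# Assumption: the LN2 run averaged the per-chiplet clocks listed earlier.


def percent_gain(new: float, old: float) -> float:
    """Percentage increase of `new` over `old`."""
    return (new / old - 1.0) * 100.0


RECORD_SCORE = 23323      # 3970X on LN2
PREVIOUS_RECORD = 17027   # 2990WX
AIO_SCORE = 19600         # 3970X on the all-in-one cooler
AIO_CLOCK_MHZ = 4450
LN2_CLOCK_MHZ = (5400 + 5400 + 5425 + 5550) / 4  # derived average

if __name__ == "__main__":
    print(f"Record vs. previous record: +{percent_gain(RECORD_SCORE, PREVIOUS_RECORD):.1f}%")
    print(f"LN2 score vs. AIO score:    +{percent_gain(RECORD_SCORE, AIO_SCORE):.1f}%")
    print(f"LN2 clock vs. AIO clock:    +{percent_gain(LN2_CLOCK_MHZ, AIO_CLOCK_MHZ):.1f}%")
```

The score rises a little less than the core clock does, which is at least consistent with the fabric being held back to 1,600 MHz on LN2, though this arithmetic alone can't prove the cause.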
One thing that bugs me about these chips is that the integrated memory controller (IMC) is quite amazing, but confusing. I can easily run nearly 5,000 MHz on quad-channel memory, but there is no performance benefit at all. The Uclk is decoupled, and efficiency is terrible. The best performance in most cases is at 3,600 MHz, and that is why this G.Skill NEO 3800C14 Bin memory is perfection.
The combination of 3,800 MHz and CAS latency 14 is extremely impressive at 32GB with such low voltage. In my opinion, the XMP profile doesn't make sense, though, as you will actually see a gain in performance when you lower the speed to 3,600 MHz (as I have hammered home this whole article).
In other words, buy the memory for the bin quality, not for the XMP profile.
In closing, when I asked motherboard designer Nick Shih from ASRock how much vCore I can use safely on LN2, he said, “It will scale until your power supply blows up.” He’s not kidding. On LN2, I pushed 1.78V vCore into the processor and peak wattage hit 1,200W at 5.45 GHz in Cinebench R20.
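Nick's joke is easier to appreciate with some quick arithmetic. The Python sketch below uses only the figures from this article: the 1,200W peak reading, the 1.78V vCore, and the 1250W rating of the power supply in the parts list. The reading is for the whole system, so dividing it by vCore gives only a loose upper bound on core current, not a measurement; the function names are assumptions for the example.

```python
# Rough power-budget arithmetic from the figures quoted in the text.
# Assumption: the 1,200W figure is total system draw as reported by the PSU,
# so the current value below is an upper bound, not the actual CPU current.

PSU_RATING_W = 1250.0
PEAK_SYSTEM_W = 1200.0
VCORE_V = 1.78


def psu_load_fraction(draw_w: float, rating_w: float) -> float:
    """Fraction of the PSU's rated capacity in use."""
    return draw_w / rating_w


def current_upper_bound_a(power_w: float, voltage_v: float) -> float:
    """I = P / V, treating all system power as if it went to the cores."""
    return power_w / voltage_v


if __name__ == "__main__":
    load = psu_load_fraction(PEAK_SYSTEM_W, PSU_RATING_W)
    amps = current_upper_bound_a(PEAK_SYSTEM_W, VCORE_V)
    print(f"PSU load: {load:.0%} of rated capacity")
    print(f"Core current upper bound: {amps:.0f} A")
```

At 96 percent of the supply's rating, there is very little room left to keep adding voltage.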
These CPUs, even being 7nm, are not fragile by any means. There is no degradation on the CPUs at all after LN2 use. This is a nice sign for AMD as some of their auto-overclock PBO voltages seem to climb quite high, but I doubt AMD will have any issues that arise from it. The chips can take it, no problem.
The 3970X feels like overclocking a server part. There is a ton of refinement needed to get it on par with Intel in terms of OC features, and even control. It feels in a sense like my refrigerator; it’s designed to be plugged in and just work, not to be fiddled with. To be fair, I think that is what will happen in most cases.
Another thing I noticed is that AMD appears to save the best chiplets for the 32- and 64-core parts. This is obvious seeing that all four 3970X CPUs I have tried beat my best 3950X and 3900X in max clock speed on water cooling! There's nothing wrong with tight binning, but for overclockers, finding a chip that you can just overclock to be competitive with the next SKU doesn't seem possible on the current-gen AMD chips. If you want to overclock your processor, pay the extra cash and get the X-series, as the non-X series are the chips that don't hit the higher boost bins.
Overall, the Threadripper 3970X delivers strong performance in a niche that I would say AMD has created for itself. It feels to me like a professional media-creation platform for the home user.
Without a doubt, the 3970X is the fastest 32-core processor money can buy. The question you have to ask yourself is, do you need 32 cores? Why not? You only live once! I expect the 3970X to be the top dog for at least a month until the 64-core 3990X lands. See you then!
A world-champion competitive overclocker who frequently tops the charts at HWBot, a site which tracks speed records, Allen will do just about anything to push a CPU to its limits. He shares his insights into the latest processors with Tom’s Hardware readers from a hardcore, push-it-to-the-limit overclocker’s perspective.
-
none12345 Buy a tap set and thread your own rod? You dont need full threaded rod, just thread the end, and put whatever thread you want at the top for your nuts. it doesnt take very long to cut some threads.
Took me 1 minute of searching to find a $10 tap set(both hole and rod) for m3.5 x 0.6
-
Saberus Overclocking a Threadripper feels like overclocking a server part? That's because it kinda is an EPYC variant that's tooled for HEDT.
And the socket mount is similar to what HPE used in their Gen8 ProLiants. It's a good system, fairly foolproof.
-
nofanneeded What is the point of such OC when you cant use it for long time ? using LN2 ?
for what exactly?
make world records that will run forever then be proud of it.
-
bit_user Even though I have no interest in doing any OCing, myself, this was a pretty fun read. Thanks.
-
bit_user
none12345 said: Buy a tap set and thread your own rod? You dont need full threaded rod, just thread the end, and put whatever thread you want at the top for your nuts. it doesnt take very long to cut some threads.
Took me 1 minute of searching to find a $10 tap set(both hole and rod) for m3.5 x 0.6
The only concern I'd have is what sort of material he needed, in order to exert that amount of force (and possibly withstand the temperatures?). Otherwise, I was thinking the same thing.
-
bit_user
Saberus said: Overclocking a Threadripper feels like overclocking a server part? That's because it kinda is an EPYC variant that's tooled for HEDT.
I imagine what he meant was in terms of the lack of refinement, as the next sentence indicated:
"There is a ton of refinement needed to get it on par with Intel in terms of OC features, and even control."
I think server parts aren't really designed to be overclocked. So, the point would be that if AMD wants to sell ThreadRippers to users wanting to OC (which the CPU can clearly handle), then they should put some more work into the feature set needed to facilitate it.
-
bit_user
nofanneeded said: What is the point of such OC when you cant use it for long time ? using LN2 ?
for what exactly?
It's like asking what's the point of building dragsters or rigs for doing tractor-pulls. Some people just like seeing how far they can push the technology, even though it's totally impractical for normal use. There's another aspect of it, as well - just scratching the itch of curiosity. In this case, how would we know how fast modern CPUs could run on LN2, if nobody ever tried it?
Besides being fun to watch, I think there are probably some side-benefits for the rest of us, in the form of improvements they devise that eventually trickle down into the mainstream.
The same points could probably even be made about Olympic athletes, and their equipment & training regimes.
-
tntom
nofanneeded said: What is the point of such OC when you cant use it for long time ? using LN2 ?
for what exactly? make world records that will run forever then be proud of it.
I can understand why this seems pointless. Until you understand that this in a necessary science that creates transparency to the general consumer. Much like you hope a bridge was tested in a lab before you drive across it. Stressing these parts verifies manufacturer claims and provides relevant real world data on reliability. 10 mins of LN2 can simulate months or even years of continuous operation. The fact that some people find it fun does not diminish its value. We would not have disc brakes on cars if it wasn't for auto racing.
-
Christopher1 Wow! Sounds like this chip is quite an awesome piece of silicon that could burn its way through the average person's needs when partnered with a proper discrete graphics card.