The Core i9-11900K may not be one of the best CPUs anymore, but that doesn't mean the flagship Rocket Lake chip has lost its mojo. Our resident extreme overclocking guru, Allen 'Splave' Golibersuch, has set a new world record with the Core i9-11900K in PyPrime 32B, outperforming the previous record holder, the Core i9-14900KS.
Setting world records isn't just about having the fastest hardware at hand. It helps when chipmakers like AMD or Intel send you trays and trays of processors to find the best sample. However, a lot of preparation goes into an extreme overclocking endeavor, and finding the best platform for the job is a big part of it. Despite Intel and most of the hardware world having moved on from Rocket Lake, Splave has discovered that the Intel 500-series platform is probably one of the last low-latency platforms of its kind.
For those who haven't heard of PyPrime, it's an open-source RAM benchmark based on Python that scales with both processor and memory overclocking. However, the latter matters more, which is how Splave beat the previous record set by Korean overclocker safedisk despite running an older chip at a lower clock speed. Splave's Core i9-11900K was operating at 6,957.82 MHz, compared to safedisk's Core i9-14900KS, which was overclocked to 8,374.91 MHz.
With Rocket Lake, Intel introduced gear ratios, an approach similar to the one AMD had taken with its Ryzen processors. As a quick recap, in Gear 1, the processor's memory controller and the memory run in sync. Meanwhile, Gear 2 forces the processor's memory controller to run at half the memory speed, so there's a performance hit. Officially, Rocket Lake, a DDR4-only platform, supports up to DDR4-3200 in Gear 1, so Splave's Core i9-11900K is a remarkable sample since it can do DDR4-3913.
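The gear arithmetic above can be sketched in a few lines. The helper name and the Gear 2 comparison are illustrative, not from the record run:

```python
# Sketch of how Gear 1 vs. Gear 2 affects the memory controller clock,
# using the DDR4-3913 speed from this story. DDR transfers twice per
# memory clock, and Gear 2 halves the controller clock again.

def imc_clock_mhz(transfer_rate_mts: float, gear: int) -> float:
    """Return the memory controller clock for a given DDR transfer rate."""
    memory_clock = transfer_rate_mts / 2  # DDR: two transfers per clock
    return memory_clock / gear            # Gear 2 runs the IMC at half speed

# Splave's DDR4-3913 in Gear 1: the IMC runs in sync with the memory clock.
print(imc_clock_mhz(3913, gear=1))  # 1956.5 MHz
# The same kit forced into Gear 2 would halve the IMC clock.
print(imc_clock_mhz(3913, gear=2))  # 978.25 MHz
```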
Unlike other memory benchmarks that favor bandwidth, PyPrime just loves latency. Splave was running DDR4-3913 dual-rank, double-sided memory for interleaving gains. He tuned the timings to a very tight 12-11-11-18 1T, thanks to the Samsung B-die ICs inside the G.Skill Trident Z DDR4-3466 C16 memory kit. For comparison, safedisk was using DDR5-9305 with timings configured to 32-47-42-34 2T. If we do the math (CAS latency in nanoseconds is CL divided by the memory clock), Splave's latency works out to 6.133 ns, about 11% lower than safedisk's 6.878 ns.
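The latency figures quoted above can be verified with a few lines of arithmetic; `cas_latency_ns` is an illustrative helper, not a tool either overclocker used:

```python
def cas_latency_ns(cl: int, transfer_rate_mts: float) -> float:
    """Absolute CAS latency in ns: CL cycles divided by the memory clock.

    The memory clock is half the DDR transfer rate, since DDR moves data
    on both edges of the clock.
    """
    memory_clock_mhz = transfer_rate_mts / 2
    return cl / memory_clock_mhz * 1000  # cycles / MHz -> microseconds -> ns

splave = cas_latency_ns(12, 3913)    # DDR4-3913 at CL12 -> ~6.133 ns
safedisk = cas_latency_ns(32, 9305)  # DDR5-9305 at CL32 -> ~6.878 ns
print(round(splave, 3), round(safedisk, 3))
```

This is also the point bit_user makes in the comments below: raw CL numbers can't be compared across DDR4 and DDR5 without dividing by the clock first.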
For extra performance gains, Splave set the affinity to the core closest to the processor's integrated memory controller (IMC) and chose Windows 7, which is lightweight and not hindered by mitigations and security patches that newer operating systems and processors require to be safe. The result is Splave dethroning safedisk in PyPrime 32B by 285 ms. It may not seem like a considerable margin, but every last millisecond counts in competitive overclocking.
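The affinity trick can be sketched like so. Which core sits closest to the IMC varies per die, so the core number here is a placeholder, and `os.sched_setaffinity` is Linux-only; on Windows 7, Splave would use Task Manager or `start /affinity` instead:

```python
import os

# Hypothetical sketch: pin the current process to a single core so all
# memory traffic originates from one stop on the ring bus. Core 0 is an
# assumption for illustration, not the core Splave actually chose.
NEAREST_IMC_CORE = 0

os.sched_setaffinity(0, {NEAREST_IMC_CORE})  # pid 0 = the current process
print(sorted(os.sched_getaffinity(0)))       # the process now runs on one core
```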
It's always thrilling to see old hardware beating the latest and greatest. Splave's PyPrime 32B world record pays homage to Rocket Lake and, more importantly, ASRock's Z590 OC Formula, the brand's last unrestrained overclocking motherboard. And while ASRock has tested the waters with the Aqua OC, it's not the same. Fear not, overclocking enthusiasts; a little bird has whispered to us that the Formula is returning for the Z890 and Core Ultra 200 (Arrow Lake) series, which will hit the retail market before the end of the year.
Zhiye Liu is a news editor and memory reviewer at Tom’s Hardware. Although he loves everything that’s hardware, he has a soft spot for CPUs, GPUs, and RAM.
bit_user
Sadly, latency is a tradeoff that DDR5 had to make, in order to continue scaling density and bandwidth. If you compare the lowest latency DDR5 (in terms of ns, not CL), it's nearly as good as fast DDR4 memory. At least, that's what I saw in examples I looked at.
Remember, kids: you need to divide CL by the memory speed, in order to compare them.
The article said: "Splave was running DDR4-3913 dual-rank, double-sided memory for interleaving gains"
That's somewhat redundant. The key detail is "dual-rank". It just so happens that most dual-rank DIMMs are also double-sided and vice versa. However, I think it should be possible to have chip-stacked DRAM that's dual-rank without being double-sided.
jkflipflop98
Back in Ye Olde days when the company would just hand us free processors on request, I used to be into this scene. I'd go down to the SEM labs, fill a dewar with a bunch of LN2, and head home for nerdy fun on the weekends.
Gets expensive when you have to buy your own CPUs, though.
btmedic04
Rocket Lake never had DDR5 support, so that whole part about Gear 2 for DDR5 on Rocket Lake is erroneous. Do better.
TheHerald
bit_user said: "Sadly, latency is a tradeoff that DDR5 had to make, in order to continue scaling density and bandwidth. If you compare the lowest latency DDR5 (in terms of ns, not CL), it's nearly as good as fast DDR4 memory. At least, that's what I saw in examples I looked at."
It's also the size of the cache; as it grows, latency grows with it. I never tried Rocket Lake, but going from CFL to CFL+ to CML (8700 --> 9900 --> 10900), latency progressively increased. An 8700K with tuned DDR4-4000 can drop to 32-33 ns, while with a 10900K, even at DDR4-4400 C16, you'd hit a wall at around 36 ns. You can get lower than that, but with unsavory amounts of voltage.
bit_user
TheHerald said: "It's also the size of the cache; as it grows, latency grows with it. I never tried Rocket Lake, but going from CFL to CFL+ to CML (8700 --> 9900 --> 10900), latency progressively increased"
Huh? All of those CPUs used the formula of 32/256/2048 kB for L1d/L2/L3 cache per core. The only thing that changed in each case was the number of cores. That meant more stops on the ring bus, but you also got more L3 cache as a side effect.
Another thing about them is that they all used basically the same process node. If you're using a denser process node, then you can sometimes increase cache size without increasing the latency (in clock cycles). The other thing you can do to reduce the effect of cache on memory latency is simply to clock higher while keeping cache latency at the same number of clock cycles, thereby shrinking the contribution, in ns, that cache tag RAM lookups make to memory latency.
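bit_user's last point can be sketched numerically. The 40-cycle figure below is a hypothetical cache lookup latency chosen for illustration, not a measured value for any of these chips:

```python
def latency_ns(cycles: int, clock_ghz: float) -> float:
    """Convert a fixed cycle-count latency into nanoseconds."""
    return cycles / clock_ghz

# The same hypothetical 40-cycle cache lookup at two clock speeds:
# clocking higher shrinks the lookup's ns contribution to memory latency
# even though the latency in cycles is unchanged.
print(latency_ns(40, 4.0))  # 10.0 ns at 4.0 GHz
print(latency_ns(40, 5.0))  # 8.0 ns at 5.0 GHz
```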