The quarter mile in drag racing and the mile run in track and field are benchmarks for gauging speed competitively. Running a mile in under four minutes is godlike.
But what if the clock were malfunctioning, and for every second of real time you were charged only half a second on the clock? That would put an average person's eight-minute mile into the four-minute godlike category. Unfortunately, this has been happening in PC benchmarking for some time now due to an issue referred to as the “RTC Bug,” which alters the system’s perception of time.
This issue has been fixed with newer versions of Windows and Intel processors, but AMD’s Ryzen 3000 processors still suffer from the bug, so benchmark results for any AMD platform with an alterable base clock have been banned from HWBot, the organization that maintains the world’s largest competitive overclocking database.
We use benchmarks to measure the performance of different PC platforms under an equal load. Scientifically, a benchmark’s workload is constant. For 3D benchmarks the result is measured in frames per second, while other applications like SuperPi measure the time it takes to finish the workload in seconds. The results of both types of benchmarks rely completely on accurate time measurement. When a second is no longer a second in real time, we can no longer compare these results against each other. Your benchmark is therefore null and void, and the scoring is inaccurate.
Problems with time measurement in PC benchmarks are hardly new, and there has always been a fight against any form of cheating that changes the perception of time for a benchmark. But things got out of hand when Windows 8 introduced the RTC bug. Microsoft changed the time source that drives the system clock, and as a result, several OS timers malfunctioned and began reporting inaccurate results that weren’t in line with the system’s performance.
As a result, HWBOT banned results made with Windows 8 and 10 for several Intel and nearly all AMD platforms. To this day, all AMD platforms with the ability to change the reference clock can be affected by the bug. Luckily Windows 7 remained reliable, so the workaround was to fall back to it for benchmarking when necessary.
About six years after the RTC bug popped up, AMD’s Zen 2 hit the shelves and the bug remains unsolved even in the latest versions of Windows 10. To make matters worse, there is no falling back to Windows 7 anymore. That OS is currently a mess performance- and stability-wise, with no hope of improvement as it nears its end of life. As a result, HWBOT had no other choice but to disable points for all Ryzen 3000 CPUs in all ‘unsafe’ benchmarks, like SuperPi, wPrime, Cinebench, Geekbench, and so on.
The largest overclocking database, HWBOT.org, has taken this seriously since the beginning and has clear rules about what is allowed, in order to protect its database from being tainted by foul play (whether deliberate or through negligence).
The issues with the Windows system timer are complex, so we spoke with Matthias Zronek for a deep dive into the issue. Zronek is a talented software engineer who is also an overclocker, and he has developed a tool that competitive overclockers can use to verify benchmark accuracy, even in the face of the RTC bug.
In the most basic terms, the RTC bug is a timer flaw that alters the time perceived by a benchmark as a result of over- or under-clocking the base/reference clock of the system. “Let’s say you boot your i7-4770K with a bclock (base clock) of 104 MHz into Windows 10, reduce it to 98 MHz with your favorite OC tool and run a benchmark. The measured time for the run would be 6% shorter. It also works in the other direction: Booting with 98 MHz and overclocking to 104 MHz would stretch the time by 6%, ending up in a result that’s possibly worse than without overclocking,” Zronek explained.
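Zronek's example can be checked with a little arithmetic. The sketch below assumes a simplified model that the source does not spell out: the timer is calibrated against the base clock seen at boot, so the time a benchmark perceives scales with the ratio of the current base clock to the boot base clock. The function names and parameters are illustrative only.

```python
def perceived_seconds(real_seconds: float, boot_bclk_mhz: float, run_bclk_mhz: float) -> float:
    """Time a benchmark would report under the simplified model.

    Assumption: the timer was calibrated at boot_bclk_mhz, so it runs
    slow or fast in proportion to run_bclk_mhz / boot_bclk_mhz.
    """
    return real_seconds * (run_bclk_mhz / boot_bclk_mhz)


def skew_percent(boot_bclk_mhz: float, run_bclk_mhz: float) -> float:
    """Relative error of the perceived time in percent (negative = shorter)."""
    return (run_bclk_mhz / boot_bclk_mhz - 1.0) * 100.0


if __name__ == "__main__":
    # Boot at 104 MHz, drop to 98 MHz: the run looks roughly 6% shorter.
    print(f"{skew_percent(104, 98):+.1f}%")   # -5.8%
    # Boot at 98 MHz, raise to 104 MHz: the run looks roughly 6% longer.
    print(f"{skew_percent(98, 104):+.1f}%")   # +6.1%
```

Under this model the two directions are not perfectly symmetric, which is why the quoted figure of 6% is best read as a rounded value.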
These inaccurate time measurements aren’t recorded in the standard screenshots that HWBot requires to verify benchmark submissions, and in the absence of a fix, the organization has banned all Windows 8 and 10 results that might be impacted by the bug. That means that even if you haven’t adjusted your base clock, you cannot submit a Cinebench result for a Ryzen 9 3900X, for instance.
Unfortunately there isn’t any official information on the RTC bug, so Zronek began to investigate the issue.
“Some basics first: A kernel ticks,” Zronek explains. “There is a System Clock Interrupt triggered by the CPU that awakes the OS periodically to check if there is work to be done. This normally happens 64 times per second, each tick being 15.6250 ms apart. When a tick occurs, an internal tick counter will accumulate the number of ticks that happened since the last boot. Some benchmarks use this kernel tick count to measure time, especially older ones. The kernel tick is also used to update the system time in the taskbar.”
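The tick arithmetic in that description is easy to reproduce. The following is a minimal sketch, assuming the 64 Hz rate quoted above and assuming (for simplicity) that the measured workload starts exactly on a tick. The helper names are hypothetical and do not correspond to any Windows API.

```python
TICKS_PER_SECOND = 64                  # rate quoted by Zronek
TICK_MS = 1000 / TICKS_PER_SECOND      # 15.625 ms between ticks


def whole_ticks(real_ms: float) -> int:
    """Number of complete ticks that fit into a real-time interval."""
    return int(real_ms // TICK_MS)


def tick_measured_ms(real_ms: float) -> float:
    """Duration a tick-counting benchmark would report for real_ms.

    Assumption: the interval starts exactly on a tick, so only whole
    ticks are counted and the remainder is lost.
    """
    return whole_ticks(real_ms) * TICK_MS


if __name__ == "__main__":
    print(TICK_MS)                     # 15.625
    print(tick_measured_ms(100.0))     # 93.75 (6 ticks counted)
```

A benchmark that counts ticks therefore cannot resolve anything finer than one tick interval, and it inherits any error in how often those ticks actually arrive.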
But the operating system is free to choose a different time source for the System Clock Interrupt. In the past, it used the Programmable Interval Timer (PIT) to perform the wakeup. The Real Time Clock (RTC) can also be configured to fire periodical interrupts. Both have their own crystal and are independent of the CPU or bus frequency.
“Windows 8 introduced a state-of-the-art ‘tickless’ kernel, or as Microsoft calls it, ‘Dynamic Ticks.’ The System Clock Interrupt no longer produces periodical wakeups with a fixed interval; instead, the interrupt fires only when there is scheduling work to be done. This could happen, for example, if a thread sleeps for five seconds and needs to be woken up again to continue its work. During these prolonged idle times the CPU can stay in deeper C states to save power. This is especially true for low-energy devices like tablets that Windows 8 targeted for the first time,” Zronek continued.
“Together with the move to a tickless kernel, Microsoft secretly changed the preferred time source for System Clock Interrupts as well. The LAPIC timer enters the stage. It’s a widely available high-resolution timer situated inside the CPU that is driven by, and therefore directly connected to, the bus frequency of the system. We can only speculate why Microsoft decided to switch from PIT/RTC (Windows 7 and before) to the LAPIC timer (Windows 8 and 10). My best guess is that it was necessary to use a more fine-grained timer to reduce additional latency that can occur with dynamic wakeup calls. Btw, you can disable ‘Dynamic Ticks’ by executing the following in an elevated command line window: ‘bcdedit /set disabledynamictick yes’.”
Zronek says that the term “RTC bug,” though widely used to describe the problem, isn’t entirely accurate.
Zronek created his own tool that uses a driver to measure the timing issues and found that, as shown in the image above, the real time clock actually has nothing to do with the issue. Zronek thinks a more accurate term for the issue would be the “LAPIC timer bug,” because the entire problem stems from the LAPIC’s dependency on the bus for time measurement.
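The general idea of checking one clock against another can be sketched in a few lines. This is not how Zronek's driver-based tool works, which the article does not describe in detail; it is only an illustration of comparing a suspect time source with a reference over the same interval. The function `clock_ratio` and its parameters are hypothetical.

```python
import time
from typing import Callable


def clock_ratio(suspect: Callable[[], float],
                reference: Callable[[], float],
                wait: Callable[[], None]) -> float:
    """Return elapsed(suspect) / elapsed(reference) across one wait() call.

    A value near 1.0 means both clocks agree; about 0.94 would match the
    98/104 MHz example, where the suspect clock runs slow.
    """
    s0, r0 = suspect(), reference()
    wait()
    s1, r1 = suspect(), reference()
    ref_elapsed = r1 - r0
    if ref_elapsed <= 0:
        raise ValueError("reference clock did not advance")
    return (s1 - s0) / ref_elapsed


if __name__ == "__main__":
    # Assumption: these two Python clocks are backed by different OS timers
    # on the machine under test; that depends on the OS and Python version.
    ratio = clock_ratio(time.monotonic, time.perf_counter, lambda: time.sleep(0.5))
    print(f"suspect/reference ratio: {ratio:.4f}")
```

A comparison like this is only meaningful if the reference clock is independent of the base clock, and a longer wait reduces the influence of each clock's resolution on the ratio.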
At its root, the bug, no matter what you call it, is triggered by altering the base frequency. Benchmark results on impacted systems are usable if you don’t change the base frequency, which Tom's Hardware doesn't do for its reviews. Note that many motherboards have an unstable base frequency that can fluctuate by several microseconds per second in either direction; the effect is measurable but doesn’t have a significant impact on test results.
Enabling the HPET (High-Precision Event Timer) in Windows is another viable way to combat the issue, but it comes with its own caveats. This timer fixes the bug in Windows 8 and 10, but it has to be enabled manually with a command entered into an elevated command prompt (“bcdedit /set useplatformclock yes”), followed by a reboot.
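To put the two effects in proportion, the sketch below compares the error from a small base-clock fluctuation with the error from a deliberate base-clock change. The 5 microseconds per second drift and the ten-minute run length are illustrative assumptions, not measurements from the article, and the skew model is the same simplified ratio used in Zronek's 104/98 MHz example.

```python
def drift_error_seconds(run_seconds: float, drift_us_per_second: float) -> float:
    """Accumulated timing error from a constant drift, in seconds."""
    return run_seconds * drift_us_per_second * 1e-6


def skew_error_seconds(run_seconds: float, boot_bclk_mhz: float, run_bclk_mhz: float) -> float:
    """Timing error from changing the base clock after boot (simplified model)."""
    return run_seconds * abs(run_bclk_mhz / boot_bclk_mhz - 1.0)


if __name__ == "__main__":
    run = 600.0  # assumed ten-minute benchmark run
    # Assumed drift of 5 microseconds per second.
    print(f"{drift_error_seconds(run, 5.0) * 1000:.1f} ms")   # 3.0 ms
    # Boot at 104 MHz, run at 98 MHz.
    print(f"{skew_error_seconds(run, 104, 98):.1f} s")        # 34.6 s
```

With these assumed numbers the drift is several orders of magnitude smaller than the skew caused by changing the base clock, which is consistent with the point that natural fluctuation does not meaningfully affect results.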
Zronek explains that there are even more pressing issues with HPET: “The biggest problem of forcing HPET are the system-wide implications. I’ve talked about the System Clock Interrupt above, the periodical wakeups for the OS. Well, enabling HPET will switch the time source for the interrupt to HPET. So the kernel tick and every tick-dependent API function for time keeping will rely on HPET now. That’s a good thing, because it solves the problem, right?”
“Sadly, that’s only true for older CPU generations. Modern CPUs with high core counts take a serious performance hit when enabling HPET. Although it’s a very precise timer, it resides on the chipset and has slow access times. If it’s used for every timing operation in the system, it bottlenecks the CPU and will lower your FPS or even bring stuttering to the Windows UI. The situation is especially bad on Kaby Lake X and Skylake X, where I’ve detected this anomaly first. But Ryzen and Threadripper are negatively impacted as well. I’m calling it the ‘HPET bug’ (not very unique, I know) and I’ve written a full article that features a small benchmark app called “TimerBench” to test the impact of the timer configuration on your own system.”
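One rough way to see how expensive timer access is on a given system is to count how many timer reads fit into a fixed interval. The sketch below is not TimerBench and makes no claim about how that tool works; it is a minimal illustration, and it assumes that Python's `time.perf_counter` ends up reading the timer the OS is configured to use, which may vary by platform and Python version.

```python
import time


def timer_reads_per_second(duration: float = 0.2) -> float:
    """Count how many perf_counter() reads complete within `duration` seconds.

    Lower numbers suggest a slower timer source. The figure also includes
    Python interpreter overhead, so compare only runs on the same machine
    and the same Python build.
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    reads = 0
    deadline = time.perf_counter() + duration
    while time.perf_counter() < deadline:
        reads += 1
    return reads / duration


if __name__ == "__main__":
    print(f"{timer_reads_per_second():,.0f} timer reads per second")
```

Running it once with the default timer configuration and once after forcing HPET would give a before-and-after comparison on the same machine, although the absolute numbers say little on their own.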
In the video below, you can see the impact of HPET in real time during a game.
To help address the issue, Zronek has created his own utility that he thinks could define a new standard for benchmark results verification. Zronek thinks this could even be more broadly applicable to the benchmark runs that vendors use to advertise their products, and even for reviewers to use in their testing.
“It’s called BenchMate and its main goal is to make benchmarking accessible again,” Zronek said. “Just download it, fire it up, launch the benchmark of your choice and the result will be automatically captured including all necessary system data. With the click of a button you can save a screenshot and submit the data to HWBOT. It works on Windows 7, 8 and 10 as well as modern and old hardware and verifies all supported benchmarks by using the same standard. It takes care of reliable time measurement, prevents cheating, and automatically applies all necessary rules needed for comparable results and therefore competitive benchmarking.”
Zronek says that, if needed, BenchMate can be updated to deal with future changes to Windows system timers, or if new security vulnerabilities arise. The software is free and is currently geared for use with HWBot submissions for competitive overclockers, but Zronek says that we could see newer versions that incorporate more benchmarks.
You might think that benchmark verification is only important for overclockers, but it's not. Companies regularly use benchmarks to promote hardware, some reviewers base much of their analysis on these results, and people all over the globe use benchmarks to compare their systems. And they should, because benchmark results are great to show the true potential of hardware!
But this only works if the benchmarks reliably measure a system's performance equally across all platforms and OS versions. It completely eludes me why this is not taken seriously by vendors, especially AMD and Microsoft.
Still, there are ways, even with AMD processors and Windows 10, to get accurate results. You can use BenchMate, which is the only recognized way to submit AMD results to HWBot. If you're not trying to compete for dominance on HWBot, you can simply avoid changing the base clock frequency in the first place.