Intel Hyperthreading Bug Unearthed In Kaby Lake And Skylake Processors (Updated)

Update, 6/27/17, 12:00pm PT: Intel responded to our questions with the following:

“We have already identified this issue and addressed it with a fix that started rolling out in April 2017. As always, we recommend checking to make sure your BIOS is up to date, but the chance of encountering this issue is low, as it requires a complex number of concurrent micro-architectural conditions to reproduce.”

Original article, 6/26/17:

Hyperthreading, which schedules two logical threads on one physical core, has been a boon to computing since its 2002 debut, but it hasn't been without its headaches. After 15 years, we could logically expect the kinks to be ironed out, but according to Henrique de Moraes Holschuh, a Debian Linux developer, Kaby Lake and Skylake processors have a serious flaw in their hyperthreading implementation.

This warning advisory is relevant for users of systems with the Intel processors code-named "Skylake" and "Kaby Lake". These are: the 6th and 7th generation Intel Core processors (desktop, embedded, mobile and HEDT), their related server processors (such as Xeon v5 and Xeon v6), as well as select Intel Pentium processor models. [...] This advisory is about a processor/microcode defect recently identified on Intel Skylake and Intel Kaby Lake processors with hyper-threading enabled. This defect can, when triggered, cause unpredictable system behavior: it could cause spurious errors, such as application and system misbehavior, data corruption, and data loss.

Intel's errata list for the recent Skylake-X processors (unearthed by Hot Hardware) provides a bit more insight into the nuts and bolts of the issue.

Problem: Under complex micro-architectural conditions, short loops of less than 64 instructions that use AH, BH, CH or DH registers as well as their corresponding wider register (eg RAX, EAX or AX for AH) may cause unpredictable system behaviour. This can only happen when both logical processors on the same physical processor are active.

Implication: Due to this erratum, the system may experience unpredictable system behavior.

Workaround: It is possible for the BIOS to contain a workaround for this erratum.

It appears the problem is confined to the sixth-generation Skylake and seventh-generation Kaby Lake processors, but it spans from desktop and mobile processors to Xeon models. The erratum applies regardless of operating system, so it can also affect Windows users. The defect can lead to data loss or corruption and erratic system behavior. Unfortunately, the scope of the issue isn't well-defined. Specific code patterns in applications can trigger the defect, and as yet, there isn't a list of specific software to avoid.

For now, Holschuh recommends disabling hyperthreading to circumvent the issue, but that isn't an acceptable long-term fix. There are microcode fixes available for the Kaby Lake and Skylake processors through system vendors, which means you might have to wait for a BIOS/UEFI update to rectify the issue. According to the Debian post, for Kaby Lake processors that entails a BIOS/UEFI that fixes "Intel processor errata KBL095, KBW095 or the similar one for Kaby Lake," and for Skylake you'll need a fix for "Intel erratum SKW144, SKL150, SKX150, SKZ7."
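For Linux users who want a quick look at whether their system falls into the affected group, the following is a minimal sketch rather than an official detection tool. It parses the text of /proc/cpuinfo and reports the CPU generation, the loaded microcode revision, and whether hyperthreading appears to be active. The function names are hypothetical, and the family 6 model numbers in the table are an assumption drawn from Intel's public CPUID listings, not from the Debian advisory or Intel's errata. Some of these model numbers are reused by later generations, so a match is only a hint.

```python
# Assumed CPUID model numbers (family 6) for the affected generations.
# Later generations reuse some of these numbers, so treat a match as a hint.
AFFECTED_MODELS = {
    78: "Skylake (mobile)",
    94: "Skylake (desktop)",
    85: "Skylake-X / Skylake server",
    142: "Kaby Lake (mobile)",
    158: "Kaby Lake (desktop)",
}


def parse_first_cpu(cpuinfo_text):
    """Return the key/value fields of the first processor block."""
    fields = {}
    for line in cpuinfo_text.splitlines():
        if not line.strip():
            if fields:
                break  # a blank line ends the first processor block
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def summarize(cpuinfo_text):
    """Summarize the fields relevant to the hyperthreading erratum."""
    cpu = parse_first_cpu(cpuinfo_text)
    is_intel = cpu.get("vendor_id") == "GenuineIntel"
    family = int(cpu.get("cpu family", -1))
    model = int(cpu.get("model", -1))
    siblings = int(cpu.get("siblings", 0))  # logical CPUs per package
    cores = int(cpu.get("cpu cores", 0))    # physical cores per package
    generation = None
    if is_intel and family == 6:
        generation = AFFECTED_MODELS.get(model)
    return {
        "generation": generation,
        "microcode": cpu.get("microcode"),
        # More logical CPUs than physical cores implies hyperthreading is on.
        "hyperthreading_active": cores > 0 and siblings > cores,
    }
```

On a Linux machine, passing the contents of /proc/cpuinfo to summarize() returns a small dictionary. A non-empty "generation" together with "hyperthreading_active" set to True suggests the system is worth checking against the vendor's BIOS/UEFI release notes for the errata listed above.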

Mark Shinwell, an OCaml toolchain developer, discovered the bug earlier this year, but Intel hasn't responded to his queries. Intel did issue microcode updates in the interim.

It's worth mentioning that we aren't aware of the extent of the issue and how much it will impact everyday desktop users. Skylake debuted in August 2015, so if a considerable number of mainstream desktop applications triggered the erratum, the problem would likely have already been thrust into the spotlight.

We do recommend caution, though, until we learn how many motherboard vendors have already issued the fix in BIOS/UEFI updates. For now, it's best to disable hyperthreading if you handle sensitive data, particularly in business applications. We've sent along the requisite request to Intel for more information and will update accordingly. 

Paul Alcorn
Managing Editor: News and Emerging Tech

Paul Alcorn is the Managing Editor: News and Emerging Tech for Tom's Hardware US. He also writes news and reviews on CPUs, storage, and enterprise hardware.

  • leoscott
    There's a reason that buying the latest, greatest when it is first released is called bleeding edge.
  • TJ Hooker
    19865784 said:
    There's a reason that buying the latest, greatest when it is first released is called bleeding edge.

    Skylake has been out for close to two years. Even Kaby Lake is about half a year old. It really isn't "first released" behaviour at this point, and Skylake is in no way bleeding edge.
  • JamesSneed
    You can look up all the errata for Skylake; this one is the last in the list. It's a rather long list. This was linked in the article. The point is that there is stuff in here that has nothing to do with bleeding edge: lots of old errata with no fix.

    https://www.intel.com/content/www/us/en/processors/core/desktop-6th-gen-core-family-spec-update.html
  • DookieDraws
    Isn't Haswell also affected by this very same issue?
  • This may sound weird, but I'm actually happy about this news.

    I've been having problems with Windows 10 hanging almost every day ever since I built a new PC with a Skylake i7 6700K CPU. Despite tons of researching, I never found an explanation for this weird problem, not to mention a solution.

    This news is the first time since long that I've seen something that might be related to my issue. Just thinking that I may finally be able to solve my problem by disabling HT fills me with joy.
    It sucks, but disabling HT is significantly better than buying a new mainboard and CPU.
  • ammaross
    19866079 said:
    This may sound weird, but I'm actually happy about this news.

    I've been having problems with Windows 10 hanging almost every day ever since I built a new PC with a Skylake i7 6700K CPU. Despite tons of researching, I never found an explanation for this weird problem, not to mention a solution.

    This news is the first time since long that I've seen something that might be related to my issue. Just thinking that I may finally be able to solve my problem by disabling HT fills me with joy.
    It sucks, but disabling HT is significantly better than buying a new mainboard and CPU.

    No, this is very likely not your issue.
  • ffleader1
    It's kinda funny to think that after 5 years without a competitor, just needing to copy/paste the same architecture over and over again to make a profit, Intel still managed to let a bug slip into its two latest generations.
  • nzalog
    19865925 said:
    You can look up all the errata for Skylake; this one is the last in the list. It's a rather long list. This was linked in the article. The point is that there is stuff in here that has nothing to do with bleeding edge: lots of old errata with no fix.

    https://www.intel.com/content/www/us/en/processors/core/desktop-6th-gen-core-family-spec-update.html

    Literally all the errata have no fix.
  • bigpinkdragon286
    Any chance of Tom's running a few benchmarks before and after the updates to see if performance changes as a result?
  • jaber2
    This explains why I'm getting such low frame rates