Intel Patents Redundant Cores In a Many-Core Processor

According to the patent, increasingly complex processors with a greater number of cores, which the company calls many-core processors, will see higher failure rates than single- or dual-core processors. In fact, the patent states that the lifetime of a core may "shorten from generation to generation." The reasons include electromigration, stress migration, time-dependent dielectric breakdown, negative bias temperature instability (NBTI), and thermal cycling.

To alleviate failure concerns, the patent covers an approach to core management that focuses heavily on monitoring the temperature of individual cores: "Because many semiconductor failure mechanisms are expressed at elevated temperatures, temperature thus has a direct bearing on core MTTF [mean time to failure] and many-core reliability," the patent document explains. If the temperature cannot be decreased, a many-core processor would activate spare cores to protect both the possibly failing core and neighboring cores. Both failed and spare cores are described as able to "absorb heat generated by active cores, driving the temperatures on the active cores down."

In an allocation/reallocation scenario, Intel says that the temperatures of cores can be drastically reduced.
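The article does not reproduce the patent's actual algorithm, so the following is only a minimal sketch of the general idea described above: when an active core runs too hot, hand its work to a cool spare and let the retired core sit idle. The `Core` class, the `reallocate` function, and the 85 °C limit are all hypothetical names and values chosen for illustration, not details from the patent.

```python
from dataclasses import dataclass


@dataclass
class Core:
    core_id: int
    temp_c: float         # last measured temperature (hypothetical sensor reading)
    active: bool          # True if the core is currently running work
    failed: bool = False  # failed cores are never reactivated


def reallocate(cores, limit_c=85.0):
    """Swap each overheated active core for the coolest usable spare.

    limit_c is an assumed threshold, not a figure from the patent.
    Returns a list of (retired_id, activated_id) pairs.
    """
    swaps = []
    overheated = [c for c in cores if c.active and c.temp_c > limit_c]
    for hot in overheated:
        # Usable spares: inactive, not failed, and themselves below the limit.
        spares = [c for c in cores
                  if not c.active and not c.failed and c.temp_c <= limit_c]
        if not spares:
            break  # nothing left to swap in; the hot core stays active
        spare = min(spares, key=lambda c: c.temp_c)
        hot.active = False
        spare.active = True
        swaps.append((hot.core_id, spare.core_id))
    return swaps


cores = [
    Core(0, 95.0, active=True),
    Core(1, 60.0, active=True),
    Core(2, 40.0, active=False),
    Core(3, 50.0, active=False),
]
print(reallocate(cores))  # [(0, 2)]
```

In this toy model the retired core simply goes idle; the heat-absorbing effect of idle neighbors that the patent describes is not simulated.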

There is no indication of when Intel will actually use such a technology, but the examples in the patent start with at least 32 cores in total, of which 16 are active and 16 are spares.

  • xx_pemdas_xx
    Who would want a CPU that gets slower over time?? I want a bulldozer, its cores come stable for less money.
  • kjsfnkwl
    The problem here is that enthusiasts will want to unlock those extra cores, which would undermine the entire system.
  • billybobser
    I guess this would be a fail-safe against cores going down under warranty.

    As I imagine activating the inactive cores will give little to no benefit.

    Although I don't really see how this is patentable. Having redundant hardware to click in in case of failure?

    I guess you should be able to get round it by having all cores active when needed, and sleeping when not (as is possible today), and making it so a processor can carry on if a core dies mid-process.
  • Lyden
    Um... when was the last time you had a core fail on you? Me? never.
  • mcd023
    I think that they're patenting the monitoring of temperature to activate and deactivate cores to absorb and dissipate heat, not just having the redundant one there. So, others might be able to measure load on the cores to do the same thing as opposed to measuring the heat. I think?
  • ksampanna
    Might be useful in enterprise sector. Meanwhile for the desktop, spare us the extra cores & use the space for higher cache or better graphics.
  • randomstar
    Think about a satellite or spacecraft - you do not want the extra cores active and using power - but on a mission-critical piece of equipment, calling a repair man a year away to replace a failing part might not be a good backup plan.
    Extend the lifetime and reliability of the whole endeavor without having to have wholly separate computers.
  • outlw6669
    This is a terrible idea.

    Why not ship the CPU with all cores active and give it a 'soft fail' feature for failing cores?
    By 'soft fail' I mean that a failing core could be dynamically deactivated while allowing all other cores to function normally.

    This would allow you to have higher initial performance and give you uninterrupted computing in the case of a core failure.
  • saturnus
    The idea makes no sense from either a business PoV or a customer PoV. Why would you, in principle, cripple the performance of your product to retain a lower performance threshold for longer? You'd have to have really crappy product quality to calculate that enough cores will fail within the normal 3-5 year life cycle to warrant this patent. It would make much more sense to have all cores enabled from the start for maximum performance, and then, as cores fail, have that read out as an indicator for when to exchange them.
  • nottheking
    I can see some use for this (it sounds as if it could re-balance load before cores fail, simply to prevent individual cores from staying hot too long) but I can't help but think... Wasn't this sort of design perhaps already patented for use in spacecraft? Their designs tend to make use of redundant CPUs, and seem to use a method not entirely unlike what Intel described here.