
Intel Patents Redundant Cores In a Many-Core Processor

Source: Tom's Hardware US | 43 comments

Intel has just been granted a patent that claims the rights to the concept of using initially inactive processing cores to replace failing cores.

According to the patent, increasingly complex processors with a greater number of cores, referred to as many-core processors by the company, will see higher failure rates than single- or dual-core processors. In fact, the patent states that the lifetime of a core may "shorten from generation to generation." The reasons include electromigration, stress migration, time-dependent dielectric breakdown, negative bias temperature instability (NBTI), and thermal cycling.

To alleviate failure concerns, the patent covers an approach to core management that focuses heavily on monitoring the temperature of individual cores: "Because many semiconductor failure mechanisms are expressed at elevated temperatures, temperature thus has a direct bearing on core MTTF [mean time to failure] and many-core reliability," the patent document explains. If the temperature cannot be decreased, a many-core processor would activate spare cores to protect both the possibly failing core and its neighboring cores. Both failed and spare cores are said to "absorb heat generated by active cores, driving the temperatures on the active cores down."

In an allocation/reallocation scenario, Intel says that the temperatures of cores can be drastically reduced.

There is no indication when Intel will actually use such a technology, but the examples in the patent start with at least 32 cores total, which use 16 active and 16 spare cores.
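The allocation/reallocation idea can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the mechanism claimed in the patent: the 90 °C limit, the `Core` and `ManyCoreManager` names, and the "pick the coolest spare" policy are all hypothetical, and only the 32-core layout with 16 active and 16 spare cores comes from the patent's example.

```python
from dataclasses import dataclass

# Hypothetical threshold; the article gives no specific temperature.
TEMP_LIMIT_C = 90.0


@dataclass
class Core:
    core_id: int
    active: bool = False
    failed: bool = False
    temp_c: float = 40.0


class ManyCoreManager:
    """Toy model: 32 cores, 16 active and 16 initially inactive spares."""

    def __init__(self, total: int = 32, active: int = 16):
        self.cores = [Core(i, active=(i < active)) for i in range(total)]

    def active_cores(self):
        return [c for c in self.cores if c.active]

    def spares(self):
        # A usable spare is inactive, not failed, and not itself too hot.
        return [
            c for c in self.cores
            if not c.active and not c.failed and c.temp_c <= TEMP_LIMIT_C
        ]

    def retire(self, core_id: int, failed: bool = False):
        """Deactivate a core and activate the coolest spare, if any.

        Returns the replacement core's id, or None if no spare is available.
        """
        core = self.cores[core_id]
        if not core.active:
            return None
        candidates = self.spares()  # collected before the core goes inactive
        core.active = False
        core.failed = failed
        if not candidates:
            return None
        replacement = min(candidates, key=lambda c: c.temp_c)
        replacement.active = True
        return replacement.core_id

    def check_temperatures(self):
        """Swap out every active core above the limit; return {old_id: new_id}."""
        swaps = {}
        for core in self.active_cores():  # snapshot, so new cores are skipped
            if core.temp_c > TEMP_LIMIT_C:
                swaps[core.core_id] = self.retire(core.core_id)
        return swaps


manager = ManyCoreManager()
manager.cores[3].temp_c = 95.0       # pretend core 3 is overheating
print(manager.check_temperatures())  # {3: 16}
```

In this toy model the retired core stays on the die as an inactive block, which is the role the patent describes for failed and spare cores as heat absorbers; the model does not simulate the heat flow itself.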

Top Comments
  • 28 Hide
    kjsfnkwl , December 8, 2011 1:07 PM
    The problem here is that enthusiasts will want to unlock those extra cores, which would undermine the entire system.
  • 22 Hide
    randomstar , December 8, 2011 1:26 PM
    Think about a satellite or spacecraft: you do not want the extra cores active and using power, but on a mission-critical piece of equipment, calling a repair man a year away to replace a failing part might not be a good backup plan.
    Extend the lifetime and reliability of the whole endeavor without having to have wholly separate computers.
  • 20 Hide
    Lyden , December 8, 2011 1:12 PM
    Um... when was the last time you had a core fail on you? Me? never.
Other Comments
  • 5 Hide
    billybobser , December 8, 2011 1:09 PM
    I guess this would be a fail-safe against cores going down under warranty.

    As I imagine activating the inactive cores will give little to no benefit.

    Although I don't really see how this is patentable. Having redundant hardware to kick in in case of failure?

    I guess you should be able to get round it by having all cores active when needed and sleeping when not (as is possible today), and making it so a processor can carry on if a core dies mid-process.
  • 0 Hide
    mcd023 , December 8, 2011 1:13 PM
    I think that they're patenting the monitoring of temperature to activate and deactivate cores to absorb and dissipate heat, not just having the redundant one there. So, others might be able to measure load on the cores to do the same thing as opposed to measuring the heat. I think?
  • 16 Hide
    ksampanna , December 8, 2011 1:22 PM
    Might be useful in enterprise sector. Meanwhile for the desktop, spare us the extra cores & use the space for higher cache or better graphics.
  • 10 Hide
    outlw6669 , December 8, 2011 1:27 PM
    This is a terrible idea.

    Why not ship the CPU with all cores active and give it a 'soft fail' feature for failing cores?
    By 'soft fail' I mean that a failing core could be dynamically deactivated while allowing all other cores to function normally.

    This would allow you to have higher initial performance and give you uninterrupted computing in the case of a core failure.
  • 0 Hide
    saturnus , December 8, 2011 1:41 PM
    The idea makes no sense from either a business PoV or a customer PoV. Why would you in principle cripple the performance of your product to retain a lower performance threshold longer? You'd have to have really crappy product quality if you calculate that a significant number of the cores will fail within the normal 3-5 year life cycle to warrant this patent. Instead it would make much more sense to have all cores enabled from the start for maximum performance, and then, as cores fail, that would serve as an indicator of when to exchange them.
  • 0 Hide
    nottheking , December 8, 2011 1:44 PM
    I can see some use of this (it sounds as if it could re-balance load before cores fail, simply to prevent individual cores from staying hot too long) but I can't help but think... Wasn't this sort of design perhaps already patented for use in spacecraft? Their designs tend to make use of redundant CPUs, and seem to use a method not entirely unlike what Intel described here.
  • -2 Hide
    saturnus , December 8, 2011 1:49 PM
    nottheking wrote: "I can see some use of this (it sounds as if it could re-balance load before cores fail, simply to prevent individual cores from staying hot too long) but I can't help but think... Wasn't this sort of design perhaps already patented for use in spacecraft? Their designs tend to make use of redundant CPUs, and seem to use a method not entirely unlike what Intel described here."

    Standard fail-over systems used on spacecraft, and by every single serious server provider in the world, are completely separate systems that step in if one system fails. That is completely unrelated to this patent both in spirit and in practice, as it is much safer to switch to a different system altogether instead of relying on fail-over on the same silicon that is failing in the first place. No serious business would ever rely on that.
  • 6 Hide
    Honis , December 8, 2011 1:58 PM
    I don't think we'll see CPUs based on this at a consumer level for a long time. Redundancy is far more important in a server environment than it is on a home desktop. Having a fallback core on the CPU will (in theory) give the IT guy time to mitigate load off a server that's fallen to the fail safe and in the end have 0 downtime or even a slow down in available services.
  • 0 Hide
    CaedenV , December 8, 2011 2:20 PM
    This is exactly the type of article that makes me wonder about AMD fans. They talk about Bulldozer as if the only reason that it is not great is because it is ahead of its time, and that the architecture is the way of the future. Meanwhile Intel is already working with 64+ core setups and laying the groundwork for many-core products while still selling stuff that works well today!
  • 7 Hide
    infernolink , December 8, 2011 2:21 PM
    outlw6669 wrote: "This is a terrible idea. Why not ship the CPU with all cores active and give it a 'soft fail' feature for failing cores? By 'soft fail' I mean that a failing core could be dynamically deactivated while allowing all other cores to function normally. This would allow you to have higher initial performance and give you uninterrupted computing in the case of a core failure."

    This would mean that it would only get hotter faster, and possibly heat up all the cores to the point of deactivation. At that point you would technically have no CPU anymore.
  • 0 Hide
    Anonymous , December 8, 2011 2:29 PM
    I am sure that Sir Clive Sinclair proposed something similar to this for 16-bit processors and memory... many moons ago (20 or so years ago)...
  • -2 Hide
    t_wilson , December 8, 2011 2:34 PM
    Yet another example of just how ridiculous the patent system is.
  • -5 Hide
    outlw6669 , December 8, 2011 2:47 PM
    infernolink wrote: "This would mean that it would only get hotter faster, and possibly heat up all the cores to the point of deactivation. At that point you would technically have no CPU anymore."

    Not really sure how.
    As long as you have a properly designed cooling system, you should have no issues at all.
    When a defective core is disabled (presumably by power gating the affected area of the chip), overall power consumption (and therefore heat production) will be reduced.
    This will lead to a cooler system as it ages, not a hotter one...
  • -1 Hide
    theuniquegamer , December 8, 2011 3:16 PM
    Looks like Intel has its spare wheels (cores) for its superfast muscle cars (upcoming 32-core 14nm Haswell or Broadwell CPUs). Nice idea to patent before AMD or ARM or (ipeoples).
  • 2 Hide
    memadmax , December 8, 2011 3:37 PM
    The person that talks about this regarding space based processors hit it right on the money.

    Intel is talking about a processor that is super mission critical. Think space, nuclear, or some other place that man can't go to repair a computer. It sounds to me that Intel may be gearing up to knock out IBM/Motorola in the industrial compute department.
  • -2 Hide
    infernolink , December 8, 2011 4:02 PM
    outlw6669 wrote: "Not really sure how. As long as you have a properly designed cooling system, you should have no issues at all. When a defective core is disabled (presumably by power gating the affected area of the chip), overall power consumption (and therefore heat production) will be reduced. This will lead to a cooler system as it ages, not a hotter one..."

    I see it this way: if you have all your cores running at 100%, they will all get hot quickly. Since they will all have a similar temperature, they may all reach the critical temperature at the same point, which would signal them to shut off. Since Intel is working on tri-gate transistors, I believe they mean to stack cores on top of each other, making it harder for even the best coolers to keep them cool. With this method of disabling cores and keeping inactive ones, they can better compensate for that.