The Skylake-X Mess Explored: Thermal Paste And Runaway Power

Factory Imposed Limits: Heat Spreaders, TIM & The IVR

One reason for the cooling problem is Intel's use of inadequate (but arguably much cheaper) thermal paste instead of indium-based solder. Although we can debate the durability of solder over time, particularly as it relates to CPUs with small dies, we have seen processors of different sizes operate stably and error-free over many years with solder between the die and heat spreader.

Moreover, thermal pastes have their own long-term stability issues. Over time, the oils in these materials separate from the solids, introducing air gaps between the surfaces and increasing thermal resistance. This effect is different in all pastes, but it can't be prevented completely.

Why is this such a big deal to us? The following curve from our Core i9-7900X review, which we generated with a very high-end cooling solution, clearly shows that waste heat is dissipated poorly with paste between the die and heat spreader. What worked effectively for a 91W chip like the Core i7-7700K now creates a thermal bottleneck.

In the end, the following graph represents the glaring temperature differences between the heat spreader on top and cores underneath. We were shocked in our launch story and remain so today.

Although we're using some of the highest-end and most expensive cooling hardware available, we still measure a difference of up to 71 Kelvin between the cores' reported temperature and the top of the heat spreader. Obviously, a more mainstream closed-loop liquid cooler would fare even worse under full load.

Observation #2: The dissipation of waste heat is hindered by the CPU's construction and Intel's deliberate decision to use thermal paste between the heat spreader and die. Regardless of how much pressure you use or how cold you can get your heat sink, you'll never realize Skylake-X's potential the way it's currently configured. Intel applies a thermal brake, favoring longevity and sacrificing performance.

Now, you might be tempted to remove the heat spreader and replace Intel's thermal paste with something better. But that's simply not a realistic course of action for most enthusiasts. It takes a special tool, a steady hand, and some prior practice. Of course, the process also obliterates your warranty.

It'd be even more extreme to leave the die exposed and use a good torque screwdriver to minimize the possibility of mechanical damage from a non-uniform/excessive load. That's still a risky move, though.

In the end, de-lidding is one solution to this cooling bottleneck, though it lacks mass appeal. A certain contingent of enthusiasts will try their hands at it regardless, and we can only caution that you consider the risks first.

Skylake-X's Integrated Voltage Regulators

Intel's Haswell and Broadwell designs employed a Fully Integrated Voltage Regulator (FIVR), incorporating power delivery onto the package/die. FIVR was meant to simplify motherboard layouts by consolidating five platform-based voltage regulators down to just one. But the implementation created some issues for overclockers, too.

Skylake-S did away with FIVR. Now, Skylake-X re-incorporates integrated voltage regulation, though its IVR is linear, rather than switching.

What's this all about? Well, the motherboard's external voltage converters do not deliver Vcore directly, as they do for Kaby Lake-X, but rather an intermediate voltage (VCCIN, the eventual CPU input voltage) that feeds Skylake-X's IVR. If you take a look at the picture below, you can see the point for measuring VCCIN for Skylake-X or Vcore for Kaby Lake-X. The CPU determines which voltage the VRM delivers, and it can range from 1.6V to a maximum of 2.55V.

Anecdotally, it was this hybrid approach that led to so many CPU deaths in the run-up to Intel's launch, as folks switched from Skylake-X at 1.8V to Kaby Lake-X and applied far too high a voltage.

Based on the lower intermediate voltage VCCIN, delivered by the VRM (which does the biggest part of the voltage regulation job), the IVR generates voltage for the cores (Vcore) and all needed sub-voltages for the last-level cache, mesh topology, the I/O (VCCIO), the system agents (VCCSA), and the PIROM (VCC33).

This intermediate voltage VCCIN is controlled by the CPU via the SVID (Serial Voltage ID) bus, and the IR35201 controller also supports Intel's latest VR13.0 PWM. This VID-based voltage is similar to the former loadline of the Vcc of older CPUs.

Skylake-X's Maximums & Extreme Overclocking

Intel specifies a TDP of only 140W for existing Skylake-X CPUs. The maximum current is an incredible 190A (peak, for <2ms), but the Thermal Design Current is just 73A. Intel defines the Thermal Design values (for wattage and current) to clarify the VRM load and cooling requirements under a constant load. The maximum package power is set to 297W. Tests with higher values cause the motherboard to shut down at 365W.

Observation #3: Power consumption for VCCIN beyond 300W has nothing to do with realistic overclocking, since the CPU is loaded beyond its thermal spec well before that point. For long-term stability, a maximum value of up to 250W is more realistic (even if it's still quite high).
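
The wattage and current figures above are easier to compare once they are put on the same footing. The following is only a rough sketch: it assumes the entire package power is drawn at VCCIN, takes the 1.8V quoted earlier for Skylake-X as that voltage, and ignores conversion losses. The function names are ours for illustration and do not come from Intel's documentation.

```python
# Rough sketch: relate package power and VRM output current at VCCIN.
# Assumption: all package power is drawn at a fixed VCCIN of 1.8 V
# (the Skylake-X figure quoted earlier); losses are ignored.

VCCIN_V = 1.8  # assumed input voltage, not a measured value


def current_from_power(power_w: float, vccin_v: float = VCCIN_V) -> float:
    """Current in amps the VRM must supply for a given power (I = P / V)."""
    if vccin_v <= 0:
        raise ValueError("VCCIN must be positive")
    return power_w / vccin_v


def power_from_current(current_a: float, vccin_v: float = VCCIN_V) -> float:
    """Power in watts delivered at VCCIN for a given current (P = V * I)."""
    if vccin_v <= 0:
        raise ValueError("VCCIN must be positive")
    return current_a * vccin_v


if __name__ == "__main__":
    # Power figures quoted in the text, converted to current at VCCIN.
    for label, watts in [
        ("TDP", 140.0),
        ("maximum package power", 297.0),
        ("motherboard shutdown", 365.0),
    ]:
        print(f"{label}: {watts:.0f} W -> {current_from_power(watts):.1f} A")

    # Current figures quoted in the text, converted to power at VCCIN.
    for label, amps in [
        ("Thermal Design Current", 73.0),
        ("peak current", 190.0),
    ]:
        print(f"{label}: {amps:.0f} A -> {power_from_current(amps):.1f} W")
```

Under these assumptions, the 297W package power limit corresponds to about 165A, while the 73A Thermal Design Current corresponds to roughly 131W, which is in the neighborhood of the 140W TDP.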


138 comments
  • You guys don't get it??? I talked to some people who got 6 core of Skylake-X and they were able to push CPU up to 4.6Ghz on all cores where temperatures were fine under Prime. Again temperatures were much lower in anything else. In my opinion Prime is rather unrealistic stress test, not to say useless crap proving nothing. I am not defending Intel but you all approached this problem with a wrong assumption.

    With 7900X which is still built using 14nm fabrication process, there is no way in hell you are going to be fine with temperatures on overclocked 10/20 cores. That's just too many of them to keep them cool.

    If someone gets 10/20 CPU i would not push more than 4Ghz. That is a max realistic clock speed for such CPU, with 8 Core you will be better but i'd say the best thing to buy is actually 6/12 Core which can easily run at @4.5Ghz.

    People don't play Prime or any other similar >Mod edit: keep it clean<test. People game, do programming, stuff where you will never see CPU showing overheating issue. And again keep 10/20 at 4.0Ghz max. Honestly you won't gain a thing running at 4.4Ghz.
  • Also i might want to add is to wait for second iteration of x299 boards. The first batch is a joke from cooling point of view. Evga is one of the companies which will get it right. X299 need copper based cooling for VRM and chipset and also 2x8pin CPU connectors with recommended PSU of 1000W+. That's how i would run x299 setup.
  • AgentLozen
    Freak777Power said:
    You guys don't get it??? I talked to some people who got 6 core of Skylake-X and they were able to push CPU up to 4.6Ghz on all cores where temperatures were fine under Prime. Again temperatures were much lower in anything else. In my opinion Prime is rather unrealistic stress test, not to say useless crap proving nothing. I am not defending Intel but you all approached this problem with a wrong assumption.


    What's wrong with using Prime? It does a good job of testing the thermal limits of a CPU. You wouldn't test the limits of a weight lifters strength with 5 pound dumb bells. You need to go all out.

    You say that the author of this article approached this problem with a wrong assumption. Do you think that there's nothing noteworthy of Skylake X's thermal performance?

    I think this article did a good job of pointing out the glaring flaws of Skylake X. The conclusion is really interesting: "We're getting the sense, though, that the revered Core architecture can't be pushed much further." That gives me chills. I never thought I'd see the day when Core hit its limits.
  • rothbardian
    496490 said:
    The conclusion is really interesting: "We're getting the sense, though, that the revered Core architecture can't be pushed much further." That gives me chills. I never thought I'd see the day when Core hit its limits.


    It's a chilling conclusion indeed. It all points to AMD's multi-die, multi-ccx architecture of Ryzen Threadripper being superior to Intel's Core on all counts.
  • Wisecracker
    Good job -- Thank you for the in-depth analysis.

    BUT (you knew that was coming ;) right?), I question the need to call-out motherboard OEMs. I agree with the comments regarding unnecessary 'Bling' but they clearly feel they are delivering what the market demands in that regard ...

    It seems off-kilter to focus/blame board components and OEMs at the top of your conclusion page, and not really Chipzilla, while noting Sky(lake-X)-rocketing heat/power beyond that of the previous-gen 32nm AMD FX-9590 (constantly derided since its introduction as a power-hungry 'heater').

    Know what I mean, Vern?

    edit: How could I have misquoted Earnest!
  • FormatC
    To be honest, this was translated in absolute hurry over the weekend and sounds now (without my lyrics) a bit harsh. But one thing is fact: without all this kiddish plastic crap, covering the cooler surface, it might work a lot better. As I wrote on page One (intro); it is a causal chain and at the begin is the CPU.
  • AgentLozen
    Quote:
    See what I mean, Vern?


    I know it's petty, but isn't the line, "Know what I mean?" We're talking Jim Varney, right? Haha.
  • JamesSneed
    This article spells out the points why I decided to build a Ryzen based system. I waited for Skylake-x and the thermals / power are just way too far off the charts for the little extra performance. I could not be happier with the Ryzen 1800x build and yes I know I paid more for something you can get in the 1700 and OC it. I certainly agree anyone needing more than 8-cores should wait on Threadripper as it really has a chance to take Intel on performance due to these very same thermal / power issues in the i9 which means the higher core counts won't hit the same frequencies.
  • JamesSneed
    482859 said:
    To be honest, this was translated in absolute hurry over the weekend and sounds now (without my lyrics) a bit harsh. But one thing is fact: without all this kiddish plastic crap, covering the cooler surface, it might work a lot better. As I wrote on page One (intro); it is a causal chain and at the begin is the CPU.


    I agree, they should be called out when form causes a hit to function. I didn't find it harsh at all. Motherboard makers are all enamored right now with shiny pretty and are losing sight of quality. I don't care if it has LEDs or looks "cool" but never should that be at the expense of the motherboard's main function.
  • mrjhh
    Power consumption and TDP are only marginally linked. Maximum power consumption relates to the maximum the chip could possibly use, while TDP is what a heat sink needs to be able to dissipate. The chip will thermally throttle if the maximum power consumption extends for long, but this condition should not happen in normal usage. But, if one uses all execution units within the processor at the same time, one will hit maximum power consumption at least momentarily. But, it's hard to keep all execution units running all the time, as there are typically cache misses which slow the processor, as well as software inefficiencies preventing running all execution units all of the time. Normally, that would put the average power consumption within TDP limits. Unusual use cases could exceed TDP, and cause thermal throttling.
  • 496490 said:
    Freak777Power said:
    You guys don't get it??? I talked to some people who got 6 core of Skylake-X and they were able to push CPU up to 4.6Ghz on all cores where temperatures were fine under Prime. Again temperatures were much lower in anything else. In my opinion Prime is rather unrealistic stress test, not to say useless crap proving nothing. I am not defending Intel but you all approached this problem with a wrong assumption.
    What's wrong with using Prime? It does a good job of testing the thermal limits of a CPU. You wouldn't test the limits of a weight lifters strength with 5 pound dumb bells. You need to go all out. You say that the author of this article approached this problem with a wrong assumption. Do you think that there's nothing noteworthy of Skylake X's thermal performance? I think this article did a good job of pointing out the glaring flaws of Skylake X. The conclusion is really interesting: "We're getting the sense, though, that the revered Core architecture can't be pushed much further." That gives me chills. I never thought I'd see the day when Core hit its limits.


    Because it is unrealistic, and finding thermal limit is just pointless. Go and read Intel specs sheet and tells you about this processor thermal limit. We really don't need any test to show such thing. Tomshardware spent pages of writing something pretty much everyone knew about if you were to read Intel specs. And even if you did not logically anyone can conclude that using same 14nm fabrication process won't play in favor in term of overclocking and heat.

    Again 10/20 is a lot of cores and to cool that with 4.0 Ghz+ clock speed, good luck with that.

    You people think that AMD Thread Ripper will run cooler, it will with 2.4Ghz clock speed. Seriously i had chance to play with every iteration of Xeon and AMD counter part CPU and you people have no idea of what you are talking about. 18 Core Broadwell-E or Haswell-E CPU for example is hell of task to cool down therefore those CPU run <3.0Ghz speed. We didn't hit any limit with Core CPU, but with what's possible using 14nm fabrication process. The fact that you can even overclock 10/20 to 4.4Ghz with such core count and complexity CPU package itself carries is AMAZING compared to AMD Ryzen which can't hit anything above 4.0Ghz with rather high temperature.

    You people get your fact straight.
  • FormatC
    If you read the intro between the lines, this test is a kind of answer to a YT video that was telling us, that all motherboard makers failed. I only tried to show, that we have headroom enough, to use this CPU as is without any kind of limitations. Only manually OC is able to bring it in trouble. :)
  • kinggremlin
    Prime 95 is the cpu equivalent of furmark. It's basically a power virus that does not represent the power usage of any other program you can come up with including other stress test programs. Intel implemented automatic throttling for their igpu's when furmark was detected. I wouldn't be surprised to see the same thing for prime95 on their cpu's.

    The bodybuilder analogy is idiotic. The prime95 equivalent would be to require grocery stockers to bench 500lbs as part of the hiring process to demonstrate the strength necessary to lift grocery products on the shelf. They will NEVER have to lift that much making it a meaningless and unrealistic test.
  • Phil_52
    Would love to hear the reviewer's thoughts on how to set up these boards to de-clock the CPU in a way that reduces the heat issues without too much damage to normal performance. I have just ordered a X299 setup and am more interested in the chipset features (PCI Lanes, Multi M.2 support etc) than the RAW horse power of the CPU.. So the question is, if I do the opposite of nature and under-clock... can I get a good balance ?
  • FormatC
    Simply use the mainboards functions to limit the wattage and / or play with Vcore and Vccin. I will also write a follow-up in both directions when I get a better mainboard.
  • bloodroses
    2070440 said:
    496490 said:
    The conclusion is really interesting: "We're getting the sense, though, that the revered Core architecture can't be pushed much further." That gives me chills. I never thought I'd see the day when Core hit its limits.
    It's a chilling conclusion indeed. It all points to AMD's multi-die, multi-ccx architecture of Ryzen Threadripper being superior to Intel's Core on all counts.


    Once/If Threadripper is able to clock higher and have a faster IPC, then AMD can talk about being superior. Looking at the leaked specs so far, neither appears to be true. While it's great that AMD is back in the game again and finally giving Intel competition, they are no means superior outside cost/value and core count.

    To give a comparison, Threadripper supposedly will have a tdp of 125-155w, with the highest topping out at 4.1ghz boost. The 10 core equivalent has a 125w and 4ghz boost.

    http://wccftech.com/amd-threadripper-1998x-and-threadripper-1998-processors-x399-x390/

    Chances are these will run quite hot as well and are huge in size. These 2 links show the size of the die and coolers needed:

    https://www.lowyat.net/2017/133239/computex-2017-noctuas-amd-threadripper-cpu-cooler-massive/

    http://www.pcgamer.com/amds-threadripper-is-huge-with-an-equally-large-socket-and-cooler/


    Intel's I9-7900x has a tdp of 140w, with a 4.3ghz boost. Their die size is still roughly the same as that of the rest of their core lineup in comparison to Threadripper's monstrous size. The biggest mistakes Intel made was the thermal paste (as the author mentioned) and while not really a mistake; trying to cram too much into a tiny space for their socket.
  • JamesSneed
    1069610 said:
    2070440 said:
    496490 said:
    The conclusion is really interesting: "We're getting the sense, though, that the revered Core architecture can't be pushed much further." That gives me chills. I never thought I'd see the day when Core hit its limits.
    It's a chilling conclusion indeed. It all points to AMD's multi-die, multi-ccx architecture of Ryzen Threadripper being superior to Intel's Core on all counts.
    Once/If Threadripper is able to clock higher and have a faster IPC, then AMD can talk about being superior. Looking at the leaked specs so far, neither appears to be true. While it's great that AMD is back in the game again and finally giving Intel competition, they are no means superior outside cost/value and core count. To give a comparison, Threadripper supposedly will have a tdp of 125-155w, with the highest topping out at 4.1ghz boost. The 10 core equivalent has a 125w and 4ghz boost. http://wccftech.com/amd-threadripper-1998x-and-threadripper-1998-processors-x399-x390/ Chances are these will run quite hot as well and are huge in size. These 2 links show the size of the die and coolers needed: https://www.lowyat.net/2017/133239/computex-2017-noctuas-amd-threadripper-cpu-cooler-massive/ http://www.pcgamer.com/amds-threadripper-is-huge-with-an-equally-large-socket-and-cooler/ Intel's I9-7900x has a tdp of 140w, with a 4.3ghz boost. Their die size is still roughly the same as that of the rest of their core lineup in comparison to Threadripper's monstrous size. The biggest mistakes Intel made was the thermal paste (as the author mentioned) and while not really a mistake; trying to cram too much into a tiny space for their socket.


    We will see soon enough on Threadripper. I really don't think they will run hot as you do. They have a large heat spreader with the dies underneath spaced out and should be soldered. This should make for lovely cooling capabilities even with air. The 16+ core parts of AMD and Intel are where it gets really interesting seeing how Intel deals with the heat of a CPU that is 60-80% more cores than the 7900x.
  • nyannyan
    Heat is going to be a real problem when the 18 core SKUs come out regardless of Prime95. There are other AVX heavy use cases you know and the headroom will decrease with each additional core. Right now I'm quite satisfied that I went with Broadwell.
  • Aspiring techie
    I can only imagine what will be needed to cool the 18 core variant...
  • techy1966
    Great Article Thank you I found it very interesting and it answered a lot of questions for me. I think both Intel and the main board makers are at fault. Intel because they rushed these CPU's out and used thermal paste instead of soldered heat spreaders. I also think they have reached the limit of their 14nm process and need to shrink it again.

    Main board makers at fault for as you stated putting all that plastic bling bling on the boards because they think consumers think it is cool. Only teens and 20 something find that cool the rest of us just want fully working boards that do as advertised out of the box. If the plastic bling bling & RGB lighting effects the board performance it needs to go simple as that.
  • the nerd 389
    For your motherboard reviews with this CPU, might I suggest that you look up the rated lifetime of the PWM caps?

    If those are 5k caps, they'll only last 5000 hours at their rated temperature. Lifetime usually doubles for each 10C under that rating.

    These CPUs often end up in entry level workstations. If the intended usage pushes the caps to 75 C, and they are the more common 5k/105C, then conventional wisdom indicates that 10% of the caps should fail by the 40,000 hour mark. In those cases where the CPU is fully loaded most of the time, this will occur 4.5 to 5 years of age. This means that there isn't much room for mistakes in the motherboard layout, case design, or airflow requirements. It's something that potential consumers should take into account if they want to get the most out of this platform.

    I'd have to check in more detail to give any estimates for the MOSFET lifetime, but that's another factor to account for if longevity is a priority.
  • chaosmassive
    well done Intel, I applaud you !
  • FormatC
    983009 said:
    For your motherboard reviews with this CPU, might I suggest that you look up the rated lifetime of the PWM caps? If those are 5k caps, they'll only last 5000 hours at their rated temperature. Lifetime usually doubles for each 10C under that rating.
    Suggest it Thomas, he makes the mainboard reviews here. The idea is not bad to see deeper into the tech. I'm testing here VGA boards and I disassemble all.
  • Nintendork
    Intel repeated the P4 Prescott mistakes, they never wanted more than 10 cores for HEDT, but with Zen blowing them away in performance per watt, power consumption and temps they forced their HEDT out of reasonable limits to look good on benchmarks.

    Overaggressive turbo clocks, high power consumption, unacceptable temps, and this is only the 10core...

    Since they dropped the price more than they wanted now they cheap out $1 from solder to toothpaste.
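
The capacitor rule of thumb raised in the comments above (a rated lifetime at the rated temperature, roughly doubling for every 10°C below it) is simple enough to sketch. The snippet below only illustrates that rule as the commenter states it. The function name is hypothetical, and real capacitor lifetime also depends on factors such as ripple current that are not modeled here.

```python
# Sketch of the "lifetime doubles for each 10 C under the rating" rule of thumb.
# The function name and defaults are illustrative assumptions, not a vendor formula.

HOURS_PER_YEAR = 24 * 365


def estimated_cap_life_hours(
    rated_hours: float, rated_temp_c: float, operating_temp_c: float
) -> float:
    """Estimate lifetime in hours: rated life doubles per 10 C below the rating."""
    return rated_hours * 2 ** ((rated_temp_c - operating_temp_c) / 10.0)


if __name__ == "__main__":
    # The commenter's example: 5k-hour caps rated at 105 C, running at 75 C.
    hours = estimated_cap_life_hours(5000, 105, 75)
    years = hours / HOURS_PER_YEAR
    print(f"{hours:.0f} hours, about {years:.1f} years of continuous load")
```

For the commenter's case of 5k/105C capacitors running at 75°C, the rule gives 40,000 hours, or a little over 4.5 years under continuous load, which matches the figures quoted in that comment.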