WHEA_UNCORRECTABLE_ERROR - Help/Possible solution - Thermal Paste

jasev01

Distinguished
Mar 13, 2010
26
0
18,530
Hello all, I’m a frequent reader so I want to contribute and hope this helps someone out there.

Here is the executive summary: if you get a BSOD with “WHEA_UNCORRECTABLE_ERROR”, check your thermal paste.

My setup:

i7-5820K @ 3.30GHz
MSI X99 SLI Plus (new BIOS version E7885IMS v1.11(1BO))
24 GB DDR4 RAM, frequency 2133MHz
Asus GTX 970 Turbo OC edition
Corsair liquid cooler H100
2 mechanical drives
2 SSD drives (Samsung EVO, I believe)
Windows 8.1 (sorry, Media Center or riot)


The story (long, bear with me): A few days ago I had the brilliant idea of getting ready for VR by buying the GTX 970 (5 minutes before I looked at Twitter and saw the 1080 announced).

I got the new bad boy (google the card on YouTube, find the video, and you will get the joke) and threw it in. It worked fine for a few days, then I got a BSOD with “WHEA_UNCORRECTABLE_ERROR”. This was after an untimely update for the Asus GTX 970 Turbo OC edition, so I suspected the card right away and swapped the old card back in, etc. None of that worked. After that my computer had a hard time booting. It came on for a bit, then shut off. Then it would not POST. Reading through the posts, I thought my PSU was overloaded or dead. Well, I had a 950W PSU so it was not overloaded, so it must be dead.

I was an hour and a half from Microcenter, which closed in an hour and a half, so I was cutting it too close and this had to wait until the morning. I worked hard, did my research, and came up with the HXI850 refurbished. Fast forward: purchase, install, no POST, extreme frustration.

I tore the computer apart. I checked the memory DIMMs. Swapped the video cards. Rewired things. No good. Must be a bad PSU. This meant another 3 hours of driving to Microcenter to swap it out. This time I lugged the massive rig with me to let them check it out and find out why I could not get it to POST.

I go in, explain, bring the rig in, and they plug in a 600W PSU (a cheapo) and it POSTs. Clearly the PSU. Unlucky, it would seem. So no more refurbished, give me a new EVGA P2 650. I was told I would not need more. Get home, put it in, nope, no POST. More frustrated.

So the side note is that it would POST sometimes and not others. It would load with some hard drives but not others (overly complicated, but one came from an old Windows 8 laptop, so it loaded that Windows). The latest was that it would not load with a full load of USB devices plugged in. So I started stripping out pieces to try to make this thing work, adjusting the power load to see what could get it going. From reading the comments, I figured by now it must be the motherboard. At 4 am I set up an RMA online. But I’m not done with the struggle.

Back to war. Why did it POST before, and why is it spinning everything but not POSTing now? Well, everything but one CPU fan. It seems like whenever it does POST, that fan spins up. Interesting. Playing around, taking all the USB devices out and putting things in one by one, it POSTs with the fans going. Interesting.

I get into the BIOS for a look around and realize the CPU is running at nearly 100 degrees C. Not good. It’s not overclocked, nothing special, so what is going on? This is supposed to be liquid cooled. Then I remembered that I took one of the liquid cooling fans off to plug in the CPU 6 pin.

So back to the internet and the liquid cooling: can that be a thing? Fast forward, jump a few steps, and I think, what about the thermal paste? Worth a look. I have to take everything off the board anyway.

So I get the heat sink off and the paste is dry as a bone and flaky. I wipe it off and dig around for some fresh paste. Worth a try. I get the paste on and get it to POST again. Well, the temp dropped to 40 degrees C. Big change. So far so good, but not so fast: it is climbing. So far it has peaked at 94; mind you, the case is open, so who knows how the temperature is read. As an aside, coming from nearly 100 C, I expected a hot processor when I was applying the thermal paste, but it was not even warm, so take that for what it is. Does it cool down that quickly? 100 C is 212 F; it should be like a stove. I should burn myself getting near it.
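For anyone who wants to check the Celsius-to-Fahrenheit arithmetic behind that 212 F figure, here is a quick Python sketch. The function name is just made up for illustration, and the temperatures are the ones mentioned in this post.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# Temperatures from the post: after fresh paste, the later peak, and the original reading.
for temp_c in (40, 94, 100):
    print(f"{temp_c} C = {c_to_f(temp_c):.0f} F")
# prints:
# 40 C = 104 F
# 94 C = 201 F
# 100 C = 212 F
```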

So I’m writing now in medias res. I have what seems to be a stable system, albeit one that claims to be running very hot. I am questioning the purchase of a new PSU and thinking this was all a thermal paste, system-running-hot issue. Mind you, I’m a bad person who let the computer run as a server 24/7 so I can access things when I’m away from my “office”. I just flashed the BIOS because I read that could be an issue and I want to try to fix all things. That just finished and it is POSTing on the regular, a far cry from 24 hours ago when it would never POST. CPU temperature reads 96-97 degrees C now and “motherboard” temperature is holding at 38 C.

So at least as of this writing I am getting steady POSTing. It’s nearly 7 am so I’m not tinkering much more, but I plan to reinstall the old PSU to see if it can give me a bit more service, so I can save some money by returning the new EVGA 650 P2. If not, it is what it is. I’m also going to add the load back on to the USB ports and reconnect the other unplugged things. Again, hopefully things hold.

Closing Summary:

Sorry again for the long post, but I wanted to walk people through everything. I know I was frustrated with posts that seemed to lack detail, that might be like my situation but might not be, or ones that drop off with the person probably figuring out their issue but not sharing it. I just wanted to throw this solution out there in the hope that someone who has this problem will find it useful to have all this information in one place, so they don’t go on a 3 day and night saga of buying and returning and troubleshooting. My path might not work for everyone, and maybe someone with better skills can lay out a better organized step-by-step troubleshooting guide. This is just what I went through and what has worked for me. Had I known thermal paste would solve this 3 days ago, I would have a few hundred dollars in my pocket, more sleep, and less frustration. I believe this build is less than 2 years old, but the paste might have been the pre-installed Intel paste, so maybe that helps someone. Again, just details that maybe match your situation.

I apologize for the typos. As I said, 3 days, only a few hours of sleep, and I’m writing train of thought. As of 7:21 all the updates are installed (except Windows 10), the BIOS is updated, and it seems to be stable. I will update if anything changes after I put the USB load back on and swap back the old PSU, so people will know if those are potential issues as well. If anyone wants to chime in with something I might be doing wrong or other advice, I'd be happy to have it, because I might be just getting lucky with this solution for now.
 

jasev01



Thank you for your fast response. I replied earlier with more details, but there was an error and I didn't have the energy to retype everything. An update: my "solution" did not work. :(

It lasted a few hours, then no more. I scanned for viruses to see if that was the cause. I had a few things, I think mostly caught by email quarantine, and they were fixed and things were OK. I got on the phone with MSI about the RMA I requested, and the CPU was reading 90-91, which was better, and considering it had run several hours I thought OK. He had me reset the BIOS again, which I had done before when I flash updated it. Then we reset and it was back to not POSTing. :( So I just RMAed everything: the processor back to Intel, the board back to MSI, and the H75 (I think I got the wrong one in my original post) back to Corsair. I know it was not the power supply, since I tried 3 different ones from different brands, and since I threw an old AMD Phenom II X4 940 Black Edition and Asus motherboard kit I had lying around into the system and it started up. To be fair, I did have a POST problem again once, but I think that was from not pushing in the CPU 6 pin hard enough. So while the RMAs are sorted I am living with that semi-crippled system, although it does have some advantages and should suit my needs for now until things are turned around.

To your point about voltages, they were running a bit high, i.e. 3 might read as 3.2 or 1 as 1.08. I'm not an engineer so I don't know if that is normal or a huge deal, but again this was me trying to run things stock, no OC, and that was the result. Could that have been the issue? I took it that in any case the CPU should not have been at 100 degrees C, which suggests the heat sink was not doing its job, which might mean the board was not doing its job of getting the heat sink working. That is why I thought: send it all back and start over. It would cost less time than trying to pinpoint the point of failure, and since they were all under warranty and had to be taken out anyway, why not.
 

Bradleyvarol

Reputable
Jan 29, 2015
69
0
4,660


A few mVs either way is okay:

Power Supply Voltage Tolerances (ATX v2.2)

Voltage Rail        Tolerance   Minimum Voltage   Maximum Voltage
+3.3VDC             ± 5%        +3.135 VDC        +3.465 VDC
+5VDC               ± 5%        +4.750 VDC        +5.250 VDC
+5VSB               ± 5%        +4.750 VDC        +5.250 VDC
-5VDC (if used)     ± 10%       -4.500 VDC        -5.500 VDC
+12VDC              ± 5%        +11.400 VDC       +12.600 VDC
-12VDC              ± 10%       -10.800 VDC       -13.200 VDC
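
If it helps, the table above boils down to a simple range check. Here is a rough Python sketch of it. The names RAILS, rail_limits and in_tolerance are made up for illustration, and the sample readings are hypothetical; the 3.2 V example assumes the "3" reading mentioned earlier was the +3.3 V rail, which the thread does not confirm.

```python
# Rail name -> (nominal voltage, tolerance as a fraction), taken from the table above.
RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+5VSB": (5.0, 0.05),
    "-5V": (-5.0, 0.10),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
}


def rail_limits(rail):
    """Return (lower, upper) allowed voltage for a rail, as signed values."""
    nominal, tol = RAILS[rail]
    a, b = nominal * (1 - tol), nominal * (1 + tol)
    return min(a, b), max(a, b)


def in_tolerance(rail, measured):
    """True if the measured voltage falls inside the rail's allowed range."""
    lower, upper = rail_limits(rail)
    return lower <= measured <= upper


# Hypothetical readings, e.g. typed in from the BIOS hardware monitor.
for rail, reading in [("+3.3V", 3.2), ("+12V", 11.3), ("-12V", -13.0)]:
    status = "OK" if in_tolerance(rail, reading) else "OUT OF SPEC"
    print(f"{rail}: {reading} V -> {status}")
```

Keep in mind that software or BIOS readouts are not very precise, so a multimeter reading is a better basis for judging a PSU than a sensor value.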
 

jasev01

Thank you again for the response. I ended up RMAing everything. Intel turned the processor around in maybe 2 days. Corsair is taking longer, but it's on the way. MSI forgot I sent them anything, but after a few calls they found it and said they were supposed to replace the board yesterday. We will see. For now I threw together an old AM3 FX 965 Black Edition board I had sitting around. Hopefully when all the parts come back it will all work out. Another issue: Asus required the threat of a lawsuit before they honored the rebate. Even with that they still screwed me, and I'm pretty sure their card caused all of this. I don't want to explain it all or come off sounding dumb, but the summary is I am not pleased with Asus and will never use their stuff again.
 

Bradleyvarol



The only experience I've had with RMAing a motherboard was with ASRock. They sent me a new one before my old one was even shipped. Quite nice.

Hope you get things sorted!