BSOD/random restarts due to WHEA_UNCORRECTABLE_ERROR

notasandwich

Distinguished
Jun 13, 2011
115
1
18,685
I have been experiencing random restarts on my Windows 7 64-bit PC of late. According to the crash report, I get:

On Tue 6/18/2013 5:49:23 PM GMT your computer crashed
crash dump file: C:\Windows\Minidump\061813-21531-01.dmp
This was probably caused by the following module: ntoskrnl.exe (nt+0x6F880)
Bugcheck code: 0xA0 (0xB, 0x9BF7A000, 0x3, 0x40C4000)
Error: INTERNAL_POWER_ERROR
file path: C:\Windows\system32\ntoskrnl.exe
product: Microsoft® Windows® Operating System
company: Microsoft Corporation
description: NT Kernel & System
Bug check description: This bug check indicates that the power policy manager experienced a fatal error.
This problem might be caused by a thermal issue.
The crash took place in the Windows kernel. Possibly this problem is caused by another driver that cannot be identified at this time.



On Tue 6/18/2013 2:15:52 AM GMT your computer crashed
crash dump file: C:\Windows\Minidump\061713-27140-01.dmp
This was probably caused by the following module: ntoskrnl.exe (nt+0x4A359C)
Bugcheck code: 0x124 (0x0, 0xFFFFFA8003A4E038, 0x0, 0x0)
Error: WHEA_UNCORRECTABLE_ERROR
file path: C:\Windows\system32\ntoskrnl.exe
product: Microsoft® Windows® Operating System
company: Microsoft Corporation
description: NT Kernel & System
Bug check description: This bug check indicates that a fatal hardware error has occurred. This bug check uses the error data that is provided by the Windows Hardware Error Architecture (WHEA).
This is likely to be caused by a hardware problem problem. This problem might be caused by a thermal issue.
The crash took place in the Windows kernel. Possibly this problem is caused by another driver that cannot be identified at this time.



On Mon 6/17/2013 3:15:19 AM GMT your computer crashed
crash dump file: C:\Windows\Minidump\061613-27796-01.dmp
This was probably caused by the following module: ntoskrnl.exe (nt+0x4A359C)
Bugcheck code: 0x124 (0x0, 0xFFFFFA800371A8F8, 0x0, 0x0)
Error: WHEA_UNCORRECTABLE_ERROR
file path: C:\Windows\system32\ntoskrnl.exe
product: Microsoft® Windows® Operating System
company: Microsoft Corporation
description: NT Kernel & System
Bug check description: This bug check indicates that a fatal hardware error has occurred. This bug check uses the error data that is provided by the Windows Hardware Error Architecture (WHEA).
This is likely to be caused by a hardware problem problem. This problem might be caused by a thermal issue.
The crash took place in the Windows kernel. Possibly this problem is caused by another driver that cannot be identified at this time.

The above covers the last three crashes that occurred. I had a similar issue in the past that was resolved by applying a fresh coat of thermal paste to the CPU. I did so again yesterday, and also cleaned the dust out of the case and heatsink, but I am still getting random restarts. What can I do at this point? I am running a stress test on the CPU and monitoring the temperature, but so far I have found nothing out of the ordinary (CPU temp is below 60).
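
For anyone comparing several of these reports, a small script can tally the bugcheck codes so the pattern (here, two 0x124 crashes and one 0xA0) stands out. This is only a sketch: it assumes the report text uses the "Bugcheck code: ..." line format shown above, and the names `parse_bugchecks`, `summarize`, and `whea_source` are made up for illustration. The only decoding it attempts is the first parameter of 0x124, where 0x0 indicates a machine check exception.

```python
import re
from collections import Counter

# Matches lines such as:
#   Bugcheck code: 0x124 (0x0, 0xFFFFFA8003A4E038, 0x0, 0x0)
BUGCHECK_RE = re.compile(r"Bugcheck code:\s*(0x[0-9A-Fa-f]+)\s*\(([^)]*)\)")

# For bug check 0x124, parameter 1 names the error source.
# Only the value seen in this thread is listed; this is not a full table.
WHEA_SOURCE = {0x0: "machine check exception"}


def parse_bugchecks(report):
    """Return a list of (code, [params]) tuples, all as integers."""
    results = []
    for match in BUGCHECK_RE.finditer(report):
        code = int(match.group(1), 16)
        params = [int(p.strip(), 16) for p in match.group(2).split(",")]
        results.append((code, params))
    return results


def summarize(report):
    """Count how often each bugcheck code appears in the report text."""
    counts = Counter(code for code, _ in parse_bugchecks(report))
    return {hex(code): n for code, n in counts.items()}


def whea_source(params):
    """Describe parameter 1 of a 0x124 bugcheck, if this sketch knows it."""
    return WHEA_SOURCE.get(params[0], "not covered by this sketch")


# The three bugcheck lines from the report above.
SAMPLE = """\
Bugcheck code: 0xA0 (0xB, 0x9BF7A000, 0x3, 0x40C4000)
Bugcheck code: 0x124 (0x0, 0xFFFFFA8003A4E038, 0x0, 0x0)
Bugcheck code: 0x124 (0x0, 0xFFFFFA800371A8F8, 0x0, 0x0)
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
    for code, params in parse_bugchecks(SAMPLE):
        if code == 0x124:
            print(hex(code), "->", whea_source(params))
```

A machine check exception is raised by the CPU itself, which is consistent with the report's own note that this is likely a hardware problem, though it does not say which component is at fault.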

 

notasandwich



No. Don't think I've touched voltage in many years. Could it be a failing power supply?

 

Nieru Senju

Honorable
Mar 19, 2013
22
0
10,520
Well, there are tons of reasons that can cause a WHEA error.

Here are some:
- Poor voltage regulation (i.e. power supply problem, voltage regulator malfunction, capacitor degradation)
- Damage due to power spikes
- Static damage to the motherboard
- Incorrect processor voltage setting in the BIOS (too low or too high)
- Overclocking
- Permanent motherboard or power supply damage caused by prior overclocking
- Excessive temperature caused by insufficient airflow (possibly caused by fan failure or blockage of air inlet/outlet)
- Improper BIOS initialization (the BIOS configuring the motherboard or CPU incorrectly)
- Installation of a processor that is too much for your motherboard to handle (excessive power requirement, incompatibility)
- Defective hardware that may be drawing excessive power or otherwise disrupting proper voltage regulation

More information can be read here http://www.tomshardware.com/forum/358806-28-whea-logger-event-viewer-error

Credits to popatim
 

notasandwich



That doesn't really help, as it doesn't tell me how to narrow down the possible issues. Again, I don't overclock, so issues potentially related to that are not relevant.

 

Nieru Senju

Well, the simplest thing you can try to troubleshoot is your motherboard's voltage regulation. Do you have a spare motherboard to try with your processor, to see if you still get WHEA errors?
 


Sadly, WHEA is one of the tougher BSODs to nail down. Generally, the CPU, motherboard, or PSU is the problem. It could be voltage related, temperature, shoddy PSU voltage regulation, or other general instability.

I'd validate the BIOS and make sure all voltages are correct, no OC, and the like. I'd then make sure temps are within acceptable ranges.
 

notasandwich



How would I go about validating the BIOS, and what would I be looking for? Also, how would I know what the correct voltages are? I just recently updated the BIOS with the most recent patch and the temps are now down to below 40, so maybe that did the trick.
 

notasandwich



Ran memtest for 7 hours and got no errors. How can I test my motherboard and power supply without a multimeter?
 
You can't, hence the problem. Some utilities exist that will show you the voltages going to certain components in real time, so you can spot check to see if anything is falling out of bounds. There's no real way to test the motherboard, though. You're now at the point where "swap parts" may be the only way to find the problem.
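
To make "out of bounds" concrete: the ATX specification allows the +3.3V, +5V, and +12V rails to vary by 5% either side of nominal. The sketch below checks a set of readings against that band. The function name `out_of_spec` and the sample readings are hypothetical (they are not measurements from this machine), and software-reported sensor values can be inaccurate, so treat a flagged rail as a hint rather than proof.

```python
# Nominal ATX rail voltages and allowed tolerance (fraction of nominal).
ATX_RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
}


def out_of_spec(readings):
    """Return {rail: reading} for every reading outside its tolerance band."""
    bad = {}
    for rail, value in readings.items():
        nominal, tol = ATX_RAILS[rail]
        low, high = nominal * (1 - tol), nominal * (1 + tol)
        if not (low <= value <= high):
            bad[rail] = value
    return bad


# Hypothetical readings copied by hand from a monitoring utility.
SAMPLE_READINGS = {"+3.3V": 3.31, "+5V": 5.02, "+12V": 11.2}

if __name__ == "__main__":
    flagged = out_of_spec(SAMPLE_READINGS)
    if flagged:
        for rail, value in flagged.items():
            print(f"{rail} rail reads {value} V, outside the 5% band")
    else:
        print("All rails within tolerance")
```

Readings taken while the system is under load are the more useful ones, since a weak supply may look fine at idle.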
 

notasandwich

I've run into another problem now: the computer will no longer power on. I tested whether the power supply was dead by taking a paper clip, putting one end in the green pin hole and the other in one of the black pin holes, and powering it on. At first I tried it with the GPU still mounted to the motherboard and got nothing. I tried one more time, however, with a CD-ROM drive that I had lying around, connected to the PSU via a Molex connector, and the drive powered on and worked normally. Could this mean that the motherboard is dead?
 

notasandwich

So, long story short, the motherboard is not dead, and I took the computer to a Staples to see if they could do something. They ran some tests on the hardware and everything is good. Their theory is that the issue causing the restarts is in fact driver related. Finding out which driver is causing the restarts would entail an additional few days of work and another $60, so I told them no thanks. My question now is: how can I figure out which driver is causing this problem on my own?