Random system restarts/shutdowns [Resolved]

Buddy_T

Commendable
Jul 27, 2016
5
0
1,520
Hi, I'm pretty much a neophyte, having built only one system and mostly sticking to whatever default or menu options are available. I built this system 2 years ago and all worked well until recently, when it started randomly freezing or restarting, getting stuck in reboot loops, and/or completely shutting down during boot, even when trying to boot from a Windows 10 repair disc. Frequently, a successful boot requires removing AC power and holding down the power button for 60 seconds. The crashing seems to be exacerbated by opening multiple windows.

My system components are:
Motherboard: ASUS M5A99FX PRO R2.0
CPU: AMD FX-8350 8-core (with stock cooler)
GPU: Asus m5a99fx pro r2.0
RAM: Patriot Viper 3 8GB (2 x 4GB) DDR3-1600 Memory
HDD: Western Digital Blue 500GB
PSU: Sentey 750W
OS: Windows 10 Home 64-bit
BIOS set for "High Performance". (BIOS & HW Monitor indicate the CPU is running at 4374MHz)

Other than regular Windows updates, there haven't been any new HW or Software installed in several months.

Here's what I've tried so far:

- Ran CHKDSK with no problems found
- Cleaned up Windows Registry with ccleaner with no effect
- Updated all HW drivers and the BIOS with no effect.
- Replaced the HDD with a new one anyway and restored a backup system image to it, with no effect.
- Tested old HDD with Western Digital's HDD diagnostic tool. No problems found.
- Checked all voltages & temps with HW Monitor. Voltages all look ok but I don't really know what tolerances I should be looking for. Under normal loads, CPU Temp runs at 47C & GPU runs at 44C. I don't know whether these are normal op temps or not.
- Ran MemTest86, passed all tests without error
- Ran GPU stress test using Furmark for 25 minutes. No issues. Both CPU & GPU temps had stabilized at ~80C by that time.
- Tried to run RealBench bench test & stress test. Both crashed the system in less than 5 minutes.
- Tried to run Prime95 stress test & system crashed in less than 2 minutes. Results log showed several "Fatal Hardware" rounding errors.
- Monitored voltages & temps while running Prime95. +12V dropped from 12.24 to 12.15 while all other voltages remained rock solid. Temps increased only slightly given the short time the tests ran.
- Tried to re-run stress tests with BIOS set to "Normal" & "Quiet" modes but tests still crashed as above.
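
On the question of voltage tolerances raised above: the ATX12V design guide allows +/-5% on the +3.3V, +5V, and +12V rails. Below is a minimal Python sketch that checks the two +12V readings quoted in this post against that window. The function names are made up for illustration, and software sensor readings such as those from HW Monitor should be treated as approximate.

```python
# ATX12V design guide tolerance for the positive rails: +/-5% of nominal.
TOLERANCE = 0.05


def allowed_range(nominal, tolerance=TOLERANCE):
    """Return the (low, high) voltage window for a rail."""
    return (nominal * (1 - tolerance), nominal * (1 + tolerance))


def within_tolerance(nominal, measured, tolerance=TOLERANCE):
    """Return True if a measured rail voltage is within tolerance of nominal."""
    return abs(measured - nominal) <= nominal * tolerance


# +12V readings reported in this post: at idle and while running Prime95.
readings = {
    "+12V idle": (12.0, 12.24),
    "+12V under Prime95": (12.0, 12.15),
}

for label, (nominal, measured) in readings.items():
    low, high = allowed_range(nominal)
    status = "OK" if within_tolerance(nominal, measured) else "OUT OF RANGE"
    print(f"{label}: {measured:.2f} V (allowed {low:.2f}-{high:.2f} V) {status}")
```

Both readings sit well inside the 11.40-12.60 V window, so the 0.09 V droop under load is not suspicious on its own. A multimeter on the PSU connectors would be a more reliable check than motherboard sensors.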

At this point, it appears to be a toss up between a motherboard problem or a CPU problem. I'd appreciate any advice or suggestions anyone might have to help me further isolate the problem.

PS: I wasn't sure what category to select for this question but a search for "system crashes" indicated this category contained the most threads on that topic.



 

Buddy_T

Hi. Thanks for the comments. As I noted above, I'm not very knowledgeable about these matters. Not sure how to make the adjustments you suggested or even which voltage should be adjusted. But I can share this data that might help clarify the problem:

My system has 3 preset options to choose from in BIOS: Power Saving, Normal, and High Performance. The only difference between Power Saving & Normal (that I can determine) is that EPU Power Saving mode is enabled under Power Saving.

Here are the system data for each option:

Power Saving:
CPU Speed: 4000 MHz
Mem Freq: 1333 MHz
NB Freq: 2200 MHz
HT Link Freq: 2200 MHz
VCORE Voltage: 1.26V
+3.3V: 3.24V
+5V: 4.95V
+12V: 12.149V
VDDA2.5V: 2.484V

Normal: All the same as for Power Saving above

High Performance:
CPU Speed: 4374 MHz
Mem Freq: 1439 MHz
NB Freq: 2376 MHz
HT Link Freq: 2376 MHz
VCORE Voltage: 1.272V
+3.3V: 3.24V
+5V: 4.928V
+12V: 12.149V
VDDA2.5V: 2.484V

I have the same problems in all three modes.
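
For reference, the size of the High Performance overclock can be worked out from the figures above. This is a small Python sketch using only the CPU, memory, NB, and Vcore numbers listed in this post; the helper name `percent_increase` is just for illustration.

```python
# Clock readings (MHz) and Vcore (V) as listed in this post.
normal = {"CPU": 4000, "Memory": 1333, "NB": 2200, "Vcore": 1.26}
high_perf = {"CPU": 4374, "Memory": 1439, "NB": 2376, "Vcore": 1.272}


def percent_increase(base, new):
    """Percentage change from base to new."""
    return (new - base) / base * 100


# Relative change of each setting when switching Normal -> High Performance.
increase = {
    name: percent_increase(normal[name], high_perf[name]) for name in normal
}

for name, pct in increase.items():
    print(f"{name}: {normal[name]} -> {high_perf[name]} ({pct:+.1f}%)")
```

By these figures the High Performance preset raises the CPU clock by roughly 9% and the memory and NB clocks by about 8%, while Vcore goes up by only about 1%.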




 

Buddy_T

I have some more information but it just adds to the confusion. As I stated in my OP, I ran a GPU stress test with FurMark until CPU & GPU temps stabilized at about 80C. Had no stability issues. I just repeated those tests this morning and had the same results.

So I rebooted and let the system idle until the CPU & GPU temps stabilized at about 45C. I then started opening multiple windows (browsers, email, word processing, Excel) to put a load on the CPU. All seemed ok until the CPU temp reached 64C, at which point the system froze. I was able to duplicate this result twice more.

So the GPU stress tests would lead me to believe that temperature is not the problem but these somewhat artificial load tests indicated that maybe temperature is a problem.
 

Buddy_T

Update: I sent both the motherboard & CPU to the manufacturers for diagnostics/repair. Both were still under full warranty. AMD sent me a new CPU but without any explanation as to whether the CPU I sent in was bad or not. ASUS returned my original MB to me, again without any explanation as to whether they found/repaired any problems.

I installed both in the system today. I also installed a Cooler Master Hyper 212 EVO cooler on the CPU. The system now runs Prime95 & RealBench stress tests without any issues. But the system still randomly power cycles.
 

Buddy_T

New Update: Well, the power cycling issue appears to have been resolved by installing the Windows 10 Anniversary Update. Just to summarize all the above...

1) System was randomly shutting down and restarting, as if the power had been cycled.
2) The system failed Prime95 & RealBench stress testing, indicating the CPU might be bad.
3) After replacing the CPU & installing a better CPU cooler than the original stock cooler, the system easily passed the Prime95 & RealBench stress tests but still randomly shut down and restarted.
4) Installing the Windows 10 Anniversary Update resolved the random shutdown issue.

My theory as to what was going on:
1) The CPU had been overstressed by being run for 2 years in an overclocked mode (High Performance BIOS setting) with the inadequate stock cooler. It eventually degraded to the point where it could still handle normal processing loads but not extreme stress testing. So this turned out to be a secondary issue found while troubleshooting the primary problem.
2) The random shutdown problem was caused by corrupt Windows 10 files, which may have been corrupted by a previous failed update. Installing the Windows 10 Anniversary Update essentially re-installed Windows, thus correcting the corrupted system files.

Thanks for the help and advice. I hope this thread can help others with similar problems.
 