System reboots seemingly randomly, not sure how to diagnose

sleupold

Reputable
Oct 14, 2015
6
0
4,510
The short of it:
Two year old system, custom built, started restarting randomly (no error messages, BSOD, or warning) months ago. Restarts were very infrequent and have started ramping up in frequency over the last two weeks. It worked fine last night, but this morning it would not stay on for more than 4 minutes between restarts. Not sure exactly what the problem is or how to find out.
OS: Windows 10 64bit, all updates as of Aug 28 2017
Originally a Win 7 clean install and upgraded to Win 10 through the free upgrade program, if that makes a difference
CPU: AMD FX-8350 4.0 GHz
RAM: Kingston HyperX FURY 16GB Kit (2x8GB) 1866MHz DDR3 CL10 DIMM (HX318C10FBK2/16)
MB: Gigabyte GA-970A-D3P ATX
GPU: EVGA GeForce GTX 960 4GB FTW
PSU: EVGA SuperNOVA 650 G1 80+ GOLD, 650W Fully Modular
Boot drive: SanDisk Internal SSD 240GB 2.5-Inch SDSSDA-240G-G25
Other drives: I have 4 other mechanical SATA drives hooked up for additional storage.


The long of it:
System originally built Nov 2015, with no new parts or upgrades since the build was completed. It has been running great until today (aside from a few random restarts). The restarts don't seem to be triggered by heavy use or gaming; they often happen shortly after I've finished editing or rendering, when the computer is near idle and stress should be low. There are no error messages, or messages at all. The monitors, which are plugged into the same surge protector, don't switch off or lose power for a moment when the system restarts. The system never just shuts down; it always restarts immediately, and the fans don't spin down or indicate that there's an interruption of power.
It goes something like this: everything's good, all screens go black at a moment's notice (power lights still on), after a few seconds the MB splash image shows up, the system boots like normal, and I'm back at the desktop with no error messages or issues detected. Everything seems to load like normal. A few times I've gotten a message from Windows saying something like "the computer didn't shut down properly what would you like to do" or a message from Creative Cloud telling me something didn't load, but those are unusual. I did some searching and people suggested checking the system event viewer.
System event viewer simply states:
System, Kernel-Power, Event ID: 41, Task Category (63) "The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly."
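For anyone who wants to pull these entries without clicking through Event Viewer, here is a minimal sketch that shells out to the built-in `wevtutil` tool from Python. The provider name `Microsoft-Windows-Kernel-Power`, the helper name `build_query_command`, and the default of 10 events are my assumptions for illustration, not something taken from the event text above. It only lists the events; it does not say why the system reset.

```python
import subprocess
import sys

# Assumed full provider name behind the "Kernel-Power" source shown above.
PROVIDER = "Microsoft-Windows-Kernel-Power"
EVENT_ID = 41


def build_query_command(count=10):
    """Build a wevtutil command listing the newest Kernel-Power 41 events."""
    xpath = f"*[System[Provider[@Name='{PROVIDER}'] and (EventID={EVENT_ID})]]"
    return [
        "wevtutil", "qe", "System",  # query events in the System log
        f"/q:{xpath}",               # XPath filter on provider and event ID
        "/f:text",                   # plain-text output instead of XML
        f"/c:{count}",               # return at most `count` events
        "/rd:true",                  # reverse direction: newest first
    ]


# wevtutil only exists on Windows, so the query is skipped elsewhere.
if __name__ == "__main__" and sys.platform == "win32":
    result = subprocess.run(build_query_command(), capture_output=True, text=True)
    print(result.stdout or result.stderr)
```

Comparing the timestamps in the output against a hand-kept log of reboots is a quick way to confirm that every restart is being recorded as the same event.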

After much searching it seems that the error indicates that this is likely a hardware related issue as opposed to Windows. The most common suggestion from people was that this is a PSU failure and to replace the PSU. I'm not sure if that's what the problem is and if I can avoid pulling my PSU out and installing a new one I'd like to. Other people suggested it could be RAM (some people had mismatched or mistimed RAM), the power button short circuiting, the graphics card, faulty surge protector or outlet, or a virus.

In my effort to identify the problem on my own I ran the following checks:
1. System File Check: No errors
2. DISM CheckHealth: No errors
3. Windows Memory diagnostic: No errors
4. I opened my system up today and unplugged the PSU to see if there were any obvious signs of damage. The cables and PSU all seem fine. I double-checked the cables going to the MB and GPU, and unplugged and re-plugged the cables connecting into the PSU.
5. I also switched my computer to a different power cable and a different wall outlet just in case.
6. I opened up HW monitor, CPU Z, and GPU Z: Most temps below 40C when reboot happens, no indication of stress or heavy load on components.
7. I normally run three monitors, so I tried unplugging two of them just in case.

As I was trying to diagnose the issue today I started timing the reboots to see if they were happening quickly or just when I did certain things. I sat back and didn't touch my computer to ensure it wasn't something I was doing directly. This is the log of my activity today:
1. Rebooted on its own
2. Get to desktop, no input from me, 3 min then reboot
3. About 40 seconds to desktop
4. At desktop for 17 seconds before reboot
5. About 40 seconds to desktop
6. At desktop for 2 minutes before reboot
7. About 40 seconds to desktop
8. At desktop for 1 minute before reboot
9. At this point I go into the BIOS to see if it will still reboot from there. Sit in BIOS for 23 minutes, no issue.
10. Boot into Windows and go to settings to get into safe mode. Boot into safe mode with networking; system runs about 20 minutes. I decide to see if it's stress related: I open up After Effects and Premiere Pro, dink around in AE for a few minutes, then kick off a render in Premiere Pro. Render finishes (CPU was at 100% on all cores during the render). Close Premiere and AE, open up Photoshop, close PS, and watch back the video I just rendered; it plays fine in VLC. Still in safe mode, a few seconds after the video finishes playing, the system reboots. No apps were running except VLC, which wasn't actually playing a video at that point.
Between sitting in BIOS and safe mode it was over an hour since the last random restart.
11. Get to desktop, at desktop for 2 minutes before reboot
12. After reboot I decide to run Windows Memory Diagnostic, which runs without issue. After it finishes it takes me to the desktop. I send it back into safe mode, and it runs fine for 3.5 hours until I open the start menu, and then it reboots again.
13. Now I'm back to the 3 minutes at desktop into reboot. I just put it in safe mode so I can submit this post.
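To put rough numbers on the pattern in that log, here is a small sketch that compares time at the desktop before a reboot in normal Windows against the one timed safe-mode run. The figures are copied from the log entries above; the variable and function names are made up for illustration. Entry 10 (first safe-mode session) and entry 13 are left out because no exact duration was recorded for them, so treat this as an indication, not a measurement.

```python
from statistics import median

# Seconds at the desktop before each reboot, taken from the log above.
normal_windows = [3 * 60, 17, 2 * 60, 1 * 60, 2 * 60]  # entries 2, 4, 6, 8, 11
safe_mode = [3.5 * 3600]                                # entry 12
bios_no_reboot = 23 * 60                                # entry 9: never rebooted


def summarize(samples):
    """Return basic statistics for a list of uptimes in seconds."""
    return {
        "runs": len(samples),
        "shortest": min(samples),
        "median": median(samples),
        "longest": max(samples),
    }


normal = summarize(normal_windows)
# How many times longer the timed safe-mode run lasted than a typical normal run.
ratio = safe_mode[0] / normal["median"]

print(f"Normal Windows: {normal}")
print(f"Safe mode run: {safe_mode[0]:.0f} s, about {ratio:.0f}x the normal median")
print(f"BIOS: {bios_no_reboot} s with no reboot")
```

With only five short samples and one long one this proves nothing on its own, but it does show how lopsided the normal-Windows and safe-mode uptimes are.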

I don't have a multimeter to test my PSU, but I'm sure someone I know has one and will let me borrow it so I think I can test my PSU in the near future.
It seems to me (and I'm probably wrong) that if it was a power supply issue, the system should reboot regardless of whether I'm sitting in BIOS, safe mode, or normal Windows. Yet in normal Windows rebooting is very frequent, whereas it is seemingly rare in safe mode and in BIOS. It does not appear that system stress has anything to do with the reboot pattern.

Any advice or ideas that I should try before getting a new PSU?

Also, if you read all that... thank you. I didn't realize this was going to be such a long post.
 

Supahos

Expert
Ambassador
Seems unlikely to be your issue, but that board isn't one I'd run an 8350 on. Try downclocking your CPU and aiming a fan at the capacitors nearest the CPU, then see if it still restarts. Power delivery trouble is a common issue with that CPU, but yours seems to be acting a bit different from most of those cases. Just one more free thing to try before buying a new PSU, which is where I'd lean as well.
 

sleupold



Interesting. Hadn't heard that before. Will try later. Thank you!

Update: I found another thread that talked about just disabling the turbo boost. I tried that for a day. I did see improvement (about 5-6 hours between reboots), but I need something more stable. So I undervolted and reduced the multiplier to x18, giving me a max of 3.6 GHz instead of 4.0. At this point my system has been up for a little over 24 hours! It seems like it's stable. I've done some editing and rendering on it since then and not had an issue.
I'm glad I don't have to buy a new power supply right now. I can live with some downclocking for the short term.
Thanks for the advice!
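For reference, the multiplier arithmetic behind that change can be sketched as below. It assumes the board is running the stock 200 MHz reference clock, which would make the stock multiplier x20 for 4.0 GHz; both values are my assumptions and are not stated in the post, and the function name is just for illustration.

```python
# Assumed stock reference clock for this platform; a board with an
# overclocked reference clock would give different results.
REF_CLOCK_MHZ = 200


def core_clock_mhz(multiplier, ref_clock_mhz=REF_CLOCK_MHZ):
    """Core clock in MHz is the CPU multiplier times the reference clock."""
    return multiplier * ref_clock_mhz


stock = core_clock_mhz(20)    # assumed stock multiplier, 4.0 GHz
reduced = core_clock_mhz(18)  # the x18 setting from the post, 3.6 GHz
drop_pct = 100 * (stock - reduced) / stock

print(f"Stock: {stock} MHz, reduced: {reduced} MHz, drop: {drop_pct:.0f}%")
```

So x18 is a 10% cut in clock speed under these assumptions, and each further step down in the multiplier takes off another 200 MHz.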
 

sleupold

UPDATE: the downclocking worked for a little while, and I gradually stepped down the multiplier until I was at 2.4 GHz. However, now the system is rebooting more and taking longer to do it. Turning down the multiplier doesn't help anymore. I decided to replace my motherboard in case it was related to the capacitors failing or something (though now I realize I may have misinterpreted what Supahos was saying). Replaced it with the Gigabyte 78LMT-USB3 Rev 6.0.
Swapped everything out and the system runs even worse: shorter time between reboots. However, the reboots are slightly different now. Previously, when it rebooted or was just turned on, the fans would spin up for a moment and then turn off for 5-10 seconds before actually turning on. The system would then boot properly until a failure caused a reboot.
Should I return the motherboard and replace my PSU or keep the MB and replace the PSU?