Computer Shuts Off Under Load - Not Overheating

Brad Bass

Reputable
Jul 29, 2014
6
0
4,510
So I've been having this issue for quite a while now and have tried everything I have the means to try. I believe I have narrowed it down a bit, but would like a second opinion. The issue is that my computer will hard reboot at random. There may actually be 2 issues, so I'll give the whole rundown.

First off, specs.

CPU: AMD FX 8350 (Currently not OC'd, as I thought that may be contributing to the issue) Cooled with a Cooler Master H80i
RAM: 2x Corsair Vengeance 1866MHz (Currently not OC'd, same as above)
GFX: EVGA 760 SC (Currently OC'd past factory OC)
PSU: EVGA 600B (This is actually a new PSU, as the first thing I thought when this started was that my PSU was bad. I will explain later why I think it may still be an issue)
MB: Gigabyte 990FX-UD3
HDD: Seagate Momentus Hybrid SSD
Case: Cooler Master HAF 932
I also have 6 USB devices that draw power.

At idle the machine runs at 16-20C; during Prime95 or similar, it gets up to ~60C. I have gotten the processor up to ~81C when heavily OC'd (to 4.92GHz, Vcore @ ~1.6V). I've recorded the screen while running stress tests and have seen temps ranging from ~35C to ~60C when the machine shuts off, so I am very confident that it is NOT an overheating issue.

On to the randomness of the issue. I could be gaming (Arma 3) for several hours with no shutoff, or I could watch a Netflix video for 5 minutes and it shuts off. Running OCCT, Prime95, Hot CPU, etc. will consistently cause the machine to hard reboot, but sometimes the tests run for several minutes before it happens and sometimes only a few seconds. (This usually depends on how much RAM I allot to the test: the more RAM, the quicker the issue occurs.)

I have run Memtest86+ for many hours, with the modules together, separately, and in all 4 slots (individually), and get hundreds of errors in tests 7 and 9, with a few errors in test 6. I have tested the RAM in another machine, also using Memtest86+, and got zero errors, so I am quite confident that it is not the RAM. I know that this CPU has the memory controller on chip, so I am definitely leaning towards there being an issue with the CPU, although the randomness of the issue makes me doubt myself, so I've not purchased a new CPU yet.

I have also run tests on the PSU and found that at normal operation it is just enough to power everything I have. Under load, it is actually not enough, so I'm thinking I need a new PSU. But that wouldn't cause the Memtest86+ errors, so I'm thinking I need both a new CPU and PSU. Oh, I also use eXtreme Power Supply Calc to determine my power needs.

I have reinstalled the O/S, more than once. I have reinstalled every driver, more than once. I have reflashed the BIOS. I have reset the BIOS to optimal defaults. I have cleaned the inside of the case, just in case. I have even gone as far as implementing some extreme cooling and have brought my CPU down to sub-zero temps.

Even reading this I am telling myself that yes, I need a new CPU, and a new PSU wouldn't be a bad idea either, but I just want someone else to agree with me so I can stop doubting myself, Lol.

All that being said, if anyone has any other input, it would be very much appreciated.

TIA

Brad.

EDIT: I do believe that my issue may be due to the PSU, but had doubts since it has already been replaced, although again, it is just enough to power my system. It also doesn't, at least not to my knowledge, explain the Memtest86+ errors. So again, possibly, probably, both CPU and PSU need replacing.
 

InvalidError

Titan
Moderator
If memtest86 spews out tons of errors but the DIMMs check out fine on a different system, that leaves the CPU and motherboard as potential suspects.

Why suspect the motherboard? Because it provides power to the DIMMs, the CPU's memory controller and there is also the possibility that a design or manufacturing issue is causing excessive crosstalk when certain bit patterns cross the memory bus and trigger errors.

Memory errors would usually cause the system to freeze or crash and reboot instead of shutting down.

Since you have attempted some extreme OCs, damage to the motherboard's VRMs could certainly be an option.
 

Brad Bass



Thank you for the quick reply. I was under the impression that this processor actually has an integrated memory controller, which is why I wasn't looking at the motherboard. I would also not really consider 5GHz an extreme overclock for this processor, as the 9590 (which I matched the specs of) is in fact an 8350, just factory overclocked. These issues also started before I OC'd this thing to 5GHz. At that time I had thought that maybe OCing the thing might actually resolve the issue. Unfortunately this was not the case.

Anyway, thanks again for the quick reply.

 

InvalidError

Titan
Moderator

Yes, it has an integrated memory controller, but the CPU and memory still need DRAM voltages from the motherboard, and the motherboard still provides the electrical connections between the CPU and RAM.

As for the "extremeness" of a 5GHz OC, you do need to keep in mind that the FX9xxx are cherry-picked from wafers for the best electrical characteristics so even if you do manage to duplicate the clock speed, you are likely doing so with worse stability margins and probably higher power draw.
 

Brad Bass



Hmm, I was hoping to narrow it down a bit more, lol. So I guess first things first. I suppose we can say that I should definitely upgrade the PSU, as it's over a year old, runs 24/7, and the power draw of the system has gone up significantly since I got it, due to extra components, a new GFX card, etc. So no problem. I've been looking at the EVGA SuperNOVA 850 B2, as it will give me room if I decide to add another GFX card, and it has a single 12V rail. It's on sale right now for ~$110, so that's perfect.

Now, as far as CPU vs Mobo.. I don't suppose there is any way to say with some certainty which one could be the issue? I unfortunately do not have another AM3+ board that I can use to test the CPU, so other than taking it to a local shop and paying them $40 to test it, I'm kind of stuck on where to go from here. Although I suppose $40 to save on buying both a Mobo and CPU isn't all that bad.. I just wish I could narrow it down myself. I'm one of those people that really hates going to others for assistance Lol. Took almost a year just to post this Lol.

Anyway, thanks again for your assistance. Hopefully this will be sorted out soon. :)
 

InvalidError

Titan
Moderator
Your system should use only about 300W peak, which is well within your 600W EVGA's comfort zone so that should not be a problem in the foreseeable future... unless you do choose to go ahead with SLI.

As far as determining with certainty if you have a CPU or motherboard issue, short of having parts to swap out, I cannot think of anything foolproof for that. memtest86 might help with the initial guess but most of the failure patterns I can imagine can still be caused by either end.

If you have access to an oscilloscope and know how to use it, you could check how clean the memory supply rails are to possibly rule that out.
 

Brad Bass



Sorry, but how are you getting 300W? With the calculations I've run, using eXtreme Power Supply Calculator, I'm getting closer to 600W without OC. With OC, it's closer to 700W, and that's not including the OC'd RAM and GFX card, which would both be overvolted.

Anyway, thanks again, and I guess I'll just have to take the processor in to be tested. Or I suppose I can do a bit more research on the errors that Memtest is giving. Only 2 tests give errors, tests 7 and 9, so that should hopefully give me a better idea of where the issue is.

 

InvalidError

Titan
Moderator

Your power calculator is using grossly exaggerated worst-case figures.

Your CPU at stock clock will use less than 100W, your GPU at stock clock will use less than 200W and the rest of your system may use about 50W so we are talking 300-350W peak when everything is stock.

Buy a Kill-a-Watt or similar device and plug it in between your PC and the wall outlet. You will likely see less than 400W worst-case (stock) draw at the wall, and with an 85% efficient PSU, that would mean ~340W actual PSU output.
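
For anyone who wants to redo the arithmetic in this reply with their own numbers, here is a minimal Python sketch. The component figures are the rough stock-clock upper bounds quoted above, not measurements, and the function names are made up for illustration. The 600W rating is the EVGA 600B from the first post.

```python
# Rough stock-clock upper bounds quoted in the reply (estimates, not measurements).
STOCK_PEAK_W = {"cpu": 100, "gpu": 200, "rest_of_system": 50}

PSU_RATING_W = 600      # EVGA 600B from the original post
PSU_EFFICIENCY = 0.85   # efficiency assumed in the reply


def dc_output_from_wall(wall_watts, efficiency):
    """Power the PSU actually delivers to the components for a given wall draw."""
    return wall_watts * efficiency


def load_fraction(dc_watts, psu_rating_watts):
    """Share of the PSU's rated output that is in use."""
    return dc_watts / psu_rating_watts


component_sum = sum(STOCK_PEAK_W.values())          # upper end of the 300-350W estimate
dc = dc_output_from_wall(400, PSU_EFFICIENCY)       # 400W at the wall meter
frac = load_fraction(dc, PSU_RATING_W)

print(f"Sum of component estimates: {component_sum} W")
print(f"DC output at 400 W wall draw: {dc:.0f} W")
print(f"Load on a {PSU_RATING_W} W PSU: {frac:.0%}")
```

Swap in the reading from your own wall meter for the 400 to see how much of the PSU's rating is really being used.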
 
Solution

Brad Bass



Well then, good to know. Saves me a bit of cash, and now I guess I have a good excuse to get a new Mobo. I've been looking at the TUF Sabertooth 990FX anyway.

Thanks again for your assistance. :)

EDIT: Oh, and not sure why I never did this before, but I just checked my UPS, an APC XS1300, and it says the current load is ~230W at idle. Should probably have checked that earlier Lol.
 

InvalidError

Titan
Moderator

I'm guessing you have a bit more than just your PC and monitor plugged into your UPS to reach 230W idle... I have my modem, router, a 24-port switch, one LED-lit display and my PC plugged into my XS1000, and even if I run FurMark on my PC on top of the ~60% CPU load I have from open programs, I only reach 264W.

 

Brad Bass



I have actually tried this and it still crapped out just a few minutes after starting a blend test in Prime95. Already sent in an RMA request for Mobo and CPU. Figured why not. Might as well have backups. Going to grab the 9730 and the Sabertooth.



I actually plugged in as little as possible, in hopes of having it run for more than a few minutes if the power cuts out. Although there are a few devices. Right now it's the PC, 1 monitor, modem, NAS and a 4-port wireless router.