Puzzling hard shutdowns

revolink24

Distinguished
May 17, 2007
70
0
18,630
I'm a computer professional, and even I'm a bit stumped by this and the sheer number of things that could be wrong.

Every once in a while (notably, it is easily repeatable by running the digital photo test in PCMark Vantage) my computer completely shuts off. Pressing the power button does nothing; I must first cycle the PSU before it will start again.

Things I have done:

Set the motherboard to optimized defaults.
Messed about a bit with voltages on the CPU and DRAM.
Unplugged all my extra hard drives and CD drive, so only one SATA device was present.

System specs:

AMD Phenom II X6 1090T (3.2 stock reproduces the problem, was at 3.8) (aftermarket cooler, 38C idle)
Gigabyte GA-890FXA-UD5
G.Skill Ripjaws 1333 2x2
Western Digital Green 1.5TB, Partition 0 NTFS system drive, Partition 1 NTFS Files drive
All other hard drives have been unplugged.
Zalman ZM600-HP
Sapphire HD4850, reference.

I haven't yet tried swapping in any known-good parts; I don't have any spares to test with.

Here's my problem. It could be the PSU, but it seems like that should be fine given the specs. Replacing the GPU with a known good one would upset the PSU balance, and perhaps cause the problem to no longer occur, resulting in me replacing the GPU to no avail. I'm running Memtest86 now. Running the single-core 3.5 build causes an instant reboot, the multi-core 3.5 build does not start, and 3.4 is currently running. Will update with results.

Will run Prime95 and furmark when this has finished and post results.

Any ideas in the meantime?


 

coldsleep

Distinguished
Dec 18, 2009
2,475
0
19,960
What if you leave it for an hour or two? Do you still need to cycle the PSU?

Aside from the PSU part, it sounds like temperatures. Have you run CPUID (or similar) to monitor the temps while you run PCMark Vantage?

Also, how long are you running the test before it shuts off? Is it within seconds, or a few minutes, or what?
 

revolink24

Haven't tried leaving it for an hour or two; I'm too impatient.

My CPU temps are usually quite good - I'm running a Xigmatek HDT-S1283 and stock speeds. My biggest concern is the GPU temperatures - the 4850 is known to be excessively hot and reach temperatures of up to 100C, but on an image manipulation test? Seems like the memory would be at fault on that one.

A thermal shutdown was my first guess, but I'm still uncertain. I'll get CPU-Z and GPU-Z, run them and monitor the test once Memtest finishes - currently at 84% with no errors, so it doesn't seem like I'll get any.
 

revolink24

Yeah, I just didn't have the time, I'll do that tonight if nothing else happens here and see what happens. The fact that the shutdown happens almost every time makes me think one pass might have shown it if the problem was really there.
 
That sounds like a PSU problem.
How are the fans arranged in your case? Although it is generally better to have more exhaust than intake, too great an imbalance will not allow the PSU to draw enough air to cool itself.
How warm is the air coming out of your PSU? How does your system perform if you leave the side off your case?
 

revolink24

Same problem with the case panel off, PSU is quite cool, although there's not much fan down there (the PSU itself has a 120mm fan and big ol' heatsink). I've been getting vertical stripes on my monitors too, which could also be either a bad card (overheating or bad GDDR) or a bad PSU.

The thing is that if I replace my 4850 with a different model to test the card, the PSU might suddenly start working fine because of the lower load. I'm going to put in my OCZ Evostream 720 soon to see if that's any better.

Also, running full load on the CPU and GPU does not replicate the problem, so it isn't really related to GPU/CPU heat or to draw on the PSU.

My case is a P182 with a front intake in front of the upper hard disk bay and exhausts out the top and rear, bottom mounted PSU.
 
The fact that the power supply has to be turned off at the mains before it will turn back on again means that the power supply's internal protection has been activated. By far the most likely cause is that the power supply itself is faulty; other causes are possible but unlikely.
Replacing the power supply will fix your problem.

 

revolink24



What do you think of the possibility of faulty wiring before the power supply, like a faulty surge suppressor?
 

revolink24

Oh, and one more interesting tidbit: while it does require the PSU to be cycled (by switch or by unplugging and replugging), the motherboard's LEDs remain on indefinitely (so it's not just leftover charge in the capacitors), which means the board is still being powered.
 
If you unplug the PSU, the separate +5VSB circuit in the PSU will be shut off, and that LED should go out within 15-30 seconds or so.
If the GPU is defective (for example experiencing thermal runaway), it would show the lines on your screen, and then possibly overload your PSU, forcing it to shut off.
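To make the symptoms concrete, here is a toy model of what the thread describes: a protection trip latches the main rails off, +5VSB stays live as long as AC is present, and the power button is ignored until AC has been cycled. The class and method names are made up for illustration, and this is a simplified assumption about latching behaviour, not how any particular PSU is actually designed.

```python
class ToyPsu:
    """Simplified model of a PSU with latching protection.

    Hypothetical illustration only; not any vendor's actual design.
    """

    def __init__(self):
        self.ac_connected = True
        self.latched = False        # set when a protection circuit trips
        self.main_rails_on = False

    @property
    def standby_5v(self):
        # +5VSB is live whenever AC is present, even after a trip,
        # which is why the motherboard LED stays lit.
        return self.ac_connected

    def press_power_button(self):
        # The button is ignored while the protection latch is set.
        if self.ac_connected and not self.latched:
            self.main_rails_on = True
        return self.main_rails_on

    def trip_protection(self):
        # A protection event shuts the main rails off and latches.
        self.latched = True
        self.main_rails_on = False

    def cycle_ac(self):
        # Removing AC clears the latch; the rails stay off until
        # the power button is pressed again.
        self.ac_connected = False
        self.latched = False
        self.main_rails_on = False
        self.ac_connected = True


psu = ToyPsu()
psu.press_power_button()                          # system starts
psu.trip_protection()                             # hard shutdown
print(psu.press_power_button(), psu.standby_5v)   # False True
psu.cycle_ac()
print(psu.press_power_button())                   # True
```

The point of the model is that "LED on, button dead, fixed by cycling AC" is consistent with a latched protection trip, whatever the underlying trigger turns out to be.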
 

revolink24

I should add that one of the reasons this is so puzzling is that this power supply had been powering my older setup (same GPU, but 4GB DDR2 and a Phenom 9950BE) for months, and that rig should have been MORE power intensive. The shutdowns never happened before I upgraded the motherboard, CPU, and RAM (my only changes). As I understand it, the motherboard or processor could be malfunctioning, causing excessive power spikes that trip the PSU's OCP (over-current protection).

Oh, and the GPU was definitely bad, running a 9400 now, but the shutoff problem is still there, so the GPU wasn't the cause of the spikes.

Once again, the problem ALWAYS occurs during the photo manipulation in PCMark, and never happened before. It does not happen under full load on all internal components.

I'm starting to suspect motherboard woes.
 
Looking at your specs, and allowing for a little headroom, a quality 450W-500W PSU is enough for your system. Look for a model with full range active PFC (no little voltage switch) and 80+ certification. Antec, Corsair, Seasonic, and Enermax are among the better brands.
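As a rough sanity check on that 450W-500W figure, here is a back-of-envelope sketch. The 125W for the CPU is AMD's rated TDP for the 1090T at stock; the roughly 110W for the HD4850, every other wattage, and the 40% headroom factor are ballpark assumptions rather than measurements of this system.

```python
# Rough PSU sizing sketch. All wattages are ballpark assumptions,
# not measurements of the system in this thread.
ESTIMATED_DRAW_W = {
    "cpu_phenom_ii_x6_1090t": 125,  # AMD's rated TDP at stock clocks
    "gpu_hd4850": 110,              # approximate board power (assumed)
    "motherboard": 50,              # assumed
    "ram_2_dimms": 10,              # assumed
    "hard_drive": 10,               # assumed
    "fans_and_misc": 15,            # assumed
}


def recommended_psu_watts(draw_w, headroom=0.4):
    """Return (estimated total draw, total with headroom applied)."""
    total = sum(draw_w.values())
    return total, total * (1 + headroom)


total, with_headroom = recommended_psu_watts(ESTIMATED_DRAW_W)
print(f"Estimated peak draw: {total} W")
print(f"With 40% headroom:   {with_headroom:.0f} W")
```

With these guesses the total comes to 320W, or about 448W with headroom, which is in line with the 450W-500W recommendation. Overclocking the CPU or fitting a hungrier GPU would push the number up.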
 

revolink24

I just ran the test 5 times with my old PSU, and it worked like a charm. I'll run it a few more times just to be sure, but I'm thinking the issues are resolved.

That said, the first run with the new PSU resulted in a BSOD. It hasn't happened since, though, and the machine didn't just turn off, so that wasn't related to the OCP circuit.

Hopefully a replacement PSU should sort it out.