Having issues booting previously running rig

iacobus42

Distinguished
May 13, 2011
5
0
18,510
I built the rig (specs below) last fall and it has been running fine apart from one weird event. A few weeks ago, under moderate load, the system "shut down" but not all the way. The video and sound cut out and the drives stopped spinning, but the fans all stayed on and the power LED flashed. I assumed this meant I had tripped a thermal trigger or overloaded the PSU. It happened with no warning, no onscreen tearing or lag. The power button didn't toggle the power, and after a few minutes of sitting there I switched the power off at the PSU. I turned the PSU back on, powered on the rig and everything was fine. I wrote it off as a one-off.

Yesterday morning I went into my office and saw that the door had been blown shut the night before by the fan. It was hot inside, like 80 degrees or so, and the fans were all still running at normal to sub-normal speeds, but again there was no video/sound, the power LED was flashing and the power button wouldn't toggle power. I cycled the power switch on the PSU and it powered on but failed to POST before falling into the weird power state.

I went to class/work, came home and tried it again now that the AC had had a chance to cool off the room, thinking it was a heat issue. This time it POSTed and got to the GRUB bootloader screen before falling into the low power state. A repeat got similar results, and my work on it so far has made no progress.

I think that since it was successfully POSTing, all the hardware was more or less fine but something was going wrong within 15-20 seconds of power on. I popped the heat sinks off both the GPU and the CPU, cleaned them, reapplied thermal paste and re-seated them. I checked all the hardware to make sure it was plugged in and tried removing it piece by piece to see what happened. I also tried removing all the drives so it was just the mobo, CPU, GPU and RAM. Nothing got past the GRUB screen after the POST.

I am not sure where to go from here. I thought maybe it was a thermal issue based on how warm the room was, but cleaning everything and checking the fans didn't fix the problem (though it really needed the cleaning). I am now wondering if the PSU is failing and that is what is putting the system into the low power state. Any thoughts on that?

My build is

Rosewill RV2-700 700W ATX 12V 2.3 PSU

AMD Phenom II X4 945 Deneb 3.0 GHz (stock heatsink and clock)
Corsair XMS 6 GB (3x2GB) DDR3 RAM (stock)
ASUS CuCore Radeon HD 5770 GPU
MSI 870A-G54 ATX mobo

in a Cooler Master Storm Scout case (the airflow in the case seems pretty solid; I have watched temps on the CPU during Prime95 and on the GPU during gaming and haven't seen anything borderline scary).

Thanks.
 
If it gets to the GRUB menu, then it is POSTing....(might want to drop to a single mem stick in the required slot until you figure out what is going on)

See if you can begin booting from DVD....

If so, perhaps your drive has been corrupted in the earlier 'brown power' event....
 

iacobus42

I was able to get it to the Windows loading screen today (i.e. past GRUB) but it is still failing shortly afterwards. A reboot right after that results in failing earlier (sometimes before POSTing). It is a lot cooler out today, like 50 instead of 85, so that might be part of it.

I have tried all 3 sticks of RAM solo and in pairs in both channels of the mobo, with no luck.

The problem isn't with the booting process (I am not getting a message that says no OS found or no boot loader found, like I would expect in that case). Additionally, I have been able to get it to the point of booting Windows or the GRUB loader (I have not tried Ubuntu yet) but it often fails at or before that point. The screen just goes black and the monitor goes into standby.

I tried booting with the hard drives holding the OSs disconnected (so all that was connected was a hard drive with some files on it) and I didn't get a screen that said "No OS Found" or the like, as would be expected. So I have ruled out an issue with the drive/OS being corrupted during the brown power event.

Is there a good way (or a place to go) to have the PSU tested? If it is just that the PSU isn't putting out enough power or is failing or something like that, it would be nice to find out instead of trying to swap every bit of hardware in and out (especially because I am a poor grad student and do not have much in the way of hardware to swap in as a test).

 

iacobus42

Update:

I was able to get it to boot into Windows and loaded SpeedFan to watch the temps and voltages. The temps were normal at failure, as were the voltages (though if one dropped or spiked, I likely wouldn't have seen it on the gauges). I rebooted later and tried the same thing, but this time tried to isolate CPU vs GPU vs PSU vs other failure points. I ran Prime95 to stress the CPU to see if the CPU was the point of failure, but the temps stayed well within normal limits. A reboot after that failure didn't POST, and I let it sit overnight.
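Since a quick dip is easy to miss on live gauges, a rough way to check is to record the rail readings and go through them afterwards. Below is a minimal Python sketch, assuming the readings have already been collected as one dictionary per sample. The sample values and the `out_of_spec` helper are made up for illustration. The +/-5% window on the +12V, +5V and +3.3V rails is the usual ATX tolerance. Motherboard sensor readings are often inaccurate, so treat a flagged value as a hint rather than proof.

```python
# Nominal ATX rail voltages; the positive rails are allowed +/-5%.
ATX_RAILS = {"+12V": 12.0, "+5V": 5.0, "+3.3V": 3.3}
TOLERANCE = 0.05


def rail_limits(nominal, tolerance=TOLERANCE):
    """Return the (low, high) window for a rail's nominal voltage."""
    return nominal * (1 - tolerance), nominal * (1 + tolerance)


def out_of_spec(samples):
    """Return (sample index, rail, value) for every reading outside the window."""
    problems = []
    for index, sample in enumerate(samples):
        for rail, nominal in ATX_RAILS.items():
            value = sample.get(rail)
            if value is None:
                continue  # this rail was not recorded in this sample
            low, high = rail_limits(nominal)
            if not (low <= value <= high):
                problems.append((index, rail, value))
    return problems


# Hypothetical readings, not taken from this machine.
samples = [
    {"+12V": 12.10, "+5V": 5.02, "+3.3V": 3.31},
    {"+12V": 11.30, "+5V": 4.98, "+3.3V": 3.30},  # made-up sag on +12V
]

for index, rail, value in out_of_spec(samples):
    print(f"sample {index}: {rail} read {value:.2f} V")
```

A multimeter or a dedicated PSU tester would still be the more trustworthy check, since software only sees what the board's sensor chip reports.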

I have now rebooted and am just doing some web browsing and whatnot to see how long it stays online (also printing warranty information for the PSU). I have a fan blowing into the open case to increase the air flow as much as possible in case it is a thermal problem. The room is pretty cool, 63.7 F, so I am hoping that I can keep it stable for a while.

I am going to try and get the PSU checked out this week and will post whatever the outcome ends up being.