Mystery hangs/reboots...

zirconst

Distinguished
May 10, 2010
6
0
18,510
In early 2007 I purchased a desktop computer from AVAdirect; since then I have replaced nearly every component except for the graphics card (GeForce EN7600GS), case, and audio interface (external, FireWire). The RAM, power supply, motherboard, and processor have all been replaced:

* 2gb Corsair RAM (was 6gb, the 2x2gb set I had in there went bad recently, in RMA process now)
* Intel Q9450 processor
* Three internal HDs, Western Digital 7200 RPM, 500-640gb each
* Gigabyte EP45-DS3R mobo

I'm running Windows XP 64bit with all the latest updates. This is a machine used as an audio workstation, so it is connected to the internet (running Nod32/Comodo, Adaware) but generally it's a work machine.

For the last few weeks I've had serious problems with the system completely hanging, or rebooting instantly. No BSODs, though I realized today the "automatically restart" option was on. I am at a loss as to what is causing the problems. My first thought was temperatures, which all checked out fine, even under load. Next, I tested my RAM. One of the four sticks was bad, as mentioned above. I removed the bad stick and its companion, but the problems did not go away. Friends of mine suggested the power supply could have caused the RAM to go bad, so I replaced that. The problems persisted.

If it is a driver problem, I'm not sure what the issue could be, since I don't recall having installed any new hardware in months. I might have plugged in my Droid phone via USB at some point in the last month, which could have auto-configured drivers, but this isn't really exotic, and it hasn't been plugged in for a while now. My other USB devices are a printer, mouse and two external hard drives.

The problem seems to be software related, or possibly graphics card related; I did not experience any crashes when in safe mode for about 8 hours. However, the computer ran as long as 30 hours or so at one point in normal WinXP without problems, so given that the crashes/hangs are intermittent, it's impossible to know whether the safe mode test was accurate. Likewise it has never crashed in the BIOS, as I ran Memtest on my current RAM for 5 passes (~4-5 hours) with no problems. But, you never know.

Sometimes the crashes happen almost instantly when I load up Windows. They are not linked to any particular activity. Sometimes they occur just browsing in Firefox with IRC and AIM open. Other times, when I'm playing a computer game (yet other times, I can play the same game for an hour or two with no issues.) It's not bad power from the wall either; I have a nice voltage regulator/battery backup.

I know a simple suggestion would be to just reformat. However, this is very very difficult for me to do. Audio applications, particularly VST plugins, tend to have very complex, lengthy installation and authorization processes. It would take me literally weeks to install all of my gear, and as this is my full-time job, I cannot afford to lose weeks of time unless it is absolutely necessary beyond the shadow of a doubt. My Windows install is only about 6 months old so it is hard for me to believe I somehow messed up the software so badly that it would need a total wipe.

Am I missing anything on the hardware end? Is there a good method of somehow cleaning all my drivers and/or my Windows install (eg. registry) without necessarily reformatting? How else can I identify this problem? This has caused me to lose dozens of hours of work time and I'm really at my wit's end, having already spent hundreds of dollars replacing parts and running test after test.

Thanks very much in advance.
 
Which exact Corsair stick do you currently have installed? Have you manually set the RAM speed, timings, and voltage to their rated specs in the BIOS? That's the very first thing I'd do since incorrect RAM settings are a very common cause of system instability.

What PSU do you have? A low-end PSU is another nearly guaranteed way to have an unstable system.
 

zirconst

The exact Corsair set I have now would be 2 sticks of 1gb XMS2 PC2-6400 DDR2 800MHz CL5 (5-5-5-12.) I don't ever mess with the RAM speeds, timings or voltage myself; I actually had my local Micro Center install my current mobo about a year ago (the way my case is wired, it's hard for me to do) and so they would have set that stuff up. I can double-check that everything is set properly but I'm 99% sure that it is. The local MC guys are very good and wouldn't risk damaging a customer's computer by messing with the RAM timings.

The problem happened with my old PSU as well as my new one. Both are well above my power needs (600w old, 700w new), the old one being Seasonic and the new one being a well-reviewed CoolMax.
 

asteldian

Distinguished
Apr 23, 2010
1,116
0
19,360
The instability of the system does smell of a RAM problem. The crashes happen randomly, and often it just freezes as opposed to actually crashing?

Of the two sticks remaining, have you tried just one in the system? I know you tested the memory and these two sticks seem OK, but it is very easy to test the system by running one stick of RAM and using the computer as usual; if it crashes, try the other stick on its own.

It could even be some of the RAM slots themselves that are damaged. So you may find playing around with the one stick in different places may help.

I would also go into your BIOS and check the RAM settings - you don't need to change anything, just confirm that the RAM is running at the correct speed, latency, and voltage.
 

zirconst

I have not tried running the computer with one stick. I hesitate to do this simply because that would leave my system with a paltry 1gb of RAM, meaning I'd be unable to use it for any audio work. Since the crashes sometimes do not occur for 24 hours or more, this is not an ideal situation. I did test both sticks of RAM extensively as well as the RAM slots; how likely is it that such tests would reveal no problems when problems actually exist?

I'm currently waiting on a return for my 4gb kit. What I can do is replace my current 2 sticks with the new set, thus eliminating that variable and allowing me to work at the same time.

I will go ahead and check the RAM settings shortly.
 
I'm still betting on incorrect RAM settings in the BIOS. Memtest86+ is great at detecting faulty RAM, but often doesn't show errors when the settings aren't set correctly in the BIOS. I've seen multiple unstable systems pass Memtest86+ but crash during normal use. Manually setting the RAM values in the BIOS fixes the problem the majority of the time.
 

zirconst

My BIOS RAM settings are as follows:

Freq: 800mhz, 5-5-5-18 timings.
DRAM voltage 1.8
Term 0.9
Ch. A ref: 0.9
Ch. B ref: 0.9

My RAM is normally timed at 5-5-5-12. But isn't a more lax timing better anyway? Are those voltages OK?
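
For reference, the two timing sets can be compared in absolute terms by converting cycle counts to nanoseconds. This is only a rough sketch: it assumes DDR2-800 runs its I/O clock at half the data rate (400 MHz, so 2.5 ns per cycle), and the helper name `timing_ns` is made up for illustration. It shows only how much longer the looser last value (18 vs 12 cycles) is; it says nothing about whether either setting is stable on this board.

```python
def timing_ns(cycles, data_rate_mt_s):
    """Convert a DRAM timing given in clock cycles to nanoseconds.

    Assumes a double-data-rate module, so the clock is half the
    rated transfer rate (DDR2-800 -> 400 MHz).
    """
    clock_mhz = data_rate_mt_s / 2
    cycle_ns = 1000 / clock_mhz
    return cycles * cycle_ns


# Rated 5-5-5-12 versus the 5-5-5-18 reported by the BIOS, at DDR2-800.
rated = [timing_ns(c, 800) for c in (5, 5, 5, 12)]
bios = [timing_ns(c, 800) for c in (5, 5, 5, 18)]
print("rated (ns):", rated)
print("BIOS  (ns):", bios)
```

The first three values are identical in both sets; only the last one differs, and a higher cycle count there means the board waits longer than the modules are rated to need.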
 

zirconst

Just wanted to update you all - the problem turned out to be a bad USB cable. I'm not even joking. The lone BSOD I got several days after my last message indicated a USB problem, so I unhooked my devices one by one. The culprit was the USB cable leading to an external hard drive. The drive itself was working perfectly, which was even more baffling (the computer had no trouble detecting it, reading from it, or writing to it). As soon as I replaced the cable, all of my problems vanished.

Amazing!