Hello everyone,
I have a problem with my system. It randomly freezes while I'm using it and requires a hard reset. It does not blue screen, it does not slow down first, and it does NOT unfreeze no matter how long I let it sit. It seems to freeze more often when I'm running heavy graphics, but that could just be because the overall load is higher.
I have run memtest for days on end with no errors, updated the BIOS, adjusted the voltage settings (a recommendation I found for the motherboard and memory) and tested under multiple operating systems (Windows XP 64-bit, Windows Vista Ultimate 64-bit, Knoppix, Ubuntu) and received the same results.
Here are all the components, currently running on Windows XP Professional 64-bit:
I've also thrown in extra fans, a Zalman CPU fan/heatsink and a couple of DVD drives. The CPU rarely gets above 50C.
At this point I'm thinking it's either a bad motherboard, two bad video cards or possibly some kind of conflict between the motherboard's RAID and nVidia's software storage controller (which I've disabled, but still shows up). I may be testing the RAID soon (by reinstalling to a single drive) but I'm not sure how to test the other hardware.
I don't know how easily this can be solved, but I would greatly appreciate any support and I'll do my best to answer any questions you have. Thank you!
Got a few blue screens when the video drivers updated but they went away with a fresh install, so I doubt that was relevant. Other than that, no error messages, no blue screens.
I'll check the GPU temp after the next crash (very soon).
Edit: Temperatures after crash:
CPU: 48C
GPU: 47C
Edit2: I have also tried disabling SLI and running with one graphics card (first tried one, then the other). None of that seemed to make a difference. I'm leaning towards a bad motherboard at this point, but I don't feel confident I did everything I could to change voltages and other settings. Also, I'm not sure how to run an overall diagnostic against a mobo (or if you even can).
Message edited by rhathar on 08-02-2009 at 09:21:22 AM
CPU Core: 1.30000V
CPU FSB: 1.30V
Memory: 1.750V
nForce SPP: 1.40V
nForce MCP: 1.550V
Cores are set to +00mv, +00mv, +00mv, +00mv
Intel Speedstep is disabled. CPU Thermal control is disabled. C1E Enhanced Halt State is disabled. VT is enabled.
I haven't touched or really looked at any of the other BIOS settings.
Edit: Interesting news! Running the 'torture test' with Prime95 locks it up within seconds (tested three times). What do I do with this information? Also, I'm rerunning memtest86+ tonight (for the next 8-12 hours).
Message edited by rhathar on 08-03-2009 at 02:42:51 AM
memtest is on pass 7 right now (12 hours in). I'm thinking I should let it keep running until after pass 8 before I mess with more settings. What do you think? Edit: pass 9 finished with no issues.
I've reset the 'auto' stuff to hard numbers, changed the voltage settings on the different CPU cores to all be +00mv and updated the previous posts with current information. Also, I think 'sync' mode for the timing is 1:1, because the other options are Auto, 5:4 and 3:2. Also, I see options for 'P1' and 'P2' that are currently set to 'Auto' (the other option is Enabled).
Message edited by rhathar on 08-03-2009 at 02:45:19 AM
A couple of things come to mind. First, these components scream heat. Try running with the side off the case for a while.
Second, and most important, you said you turned SpeedStep off. Turn it back on so your processor throttles like it should.
The northbridge heat sink might not be doing a good job. A lot of traffic goes through that chip, which generates a lot of heat.
When you say "these are temps after crash"... so what? What are they just before the crash? The system loses a lot of heat during the crash and reboot.
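One way to get readings from just before the freeze is to log temperatures to disk continuously and look at the last line after the reset. Here is a minimal Python sketch of that idea. It assumes you supply your own `read_temps` function that returns the current sensor values; that name, the log path and the sensor labels are all hypothetical, and how you actually read the sensors depends on your OS and monitoring tool. Each line is flushed and synced right away so it should survive a hard reset.

```python
import os
import time
from typing import Callable, Dict


def log_temperatures(read_temps: Callable[[], Dict[str, float]],
                     log_path: str,
                     samples: int,
                     interval_s: float = 1.0) -> None:
    """Append one timestamped line per sample and force it to disk.

    read_temps is a caller-supplied (hypothetical) function that returns
    something like {"CPU": 48.0, "GPU": 47.0} in degrees Celsius.
    """
    with open(log_path, "a", encoding="utf-8") as log:
        for _ in range(samples):
            temps = read_temps()
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            fields = " ".join(
                f"{name}={value:.1f}C" for name, value in sorted(temps.items())
            )
            log.write(f"{stamp} {fields}\n")
            # Flush Python's buffer and ask the OS to write to disk now,
            # so the line is not lost when the machine locks up.
            log.flush()
            os.fsync(log.fileno())
            time.sleep(interval_s)


def last_reading(log_path: str) -> str:
    """Return the last logged line, i.e. the reading closest to the freeze."""
    with open(log_path, encoding="utf-8") as log:
        lines = log.read().splitlines()
    return lines[-1] if lines else ""
```

Start the logger with a large `samples` value before kicking off the load that triggers the freeze, then call `last_reading` on the log file after the reboot.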
EDIT: I had some of these problems before I lost a processor...
EDIT II: A while back there were some problems with nVidia motherboard drivers and RAID setups. I can't remember anymore what they were.
Message edited by swifty_morgan on 08-03-2009 at 10:53:52 AM
I've tried a few different things. First, I turned SpeedStep back on. I've also been running with the case open and an additional floor fan nearby. Neither of these helped.
When I have time later I'll try running with one stick of RAM at a time. My next plan is to back up all my data to an external drive (run dd off a Knoppix CD overnight...) and then try a fresh install to a single disk.
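Since the machine can lock up at any moment, it seems worth confirming that the overnight backup actually finished intact. The sketch below is not dd itself, just the same block-by-block copy idea in Python with a SHA-256 checksum added. It assumes it is run with enough privileges to read the source, and any paths passed in (the source disk, the image on the external drive) are placeholders for whatever your system uses.

```python
import hashlib
import os


def file_digest(path: str, block_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read block by block."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while True:
            block = handle.read(block_size)
            if not block:
                break
            digest.update(block)
    return digest.hexdigest()


def copy_image(src_path: str, dst_path: str,
               block_size: int = 1024 * 1024) -> str:
    """Copy src to dst in fixed-size blocks, similar in spirit to dd.

    Returns the SHA-256 hex digest of the data that was read, so it can
    be compared against the digest of the finished image.
    """
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(block)
            digest.update(block)
        # Make sure everything has reached the external drive.
        dst.flush()
        os.fsync(dst.fileno())
    return digest.hexdigest()
```

If `copy_image(source, image)` returns the same value as `file_digest(image)`, the image on the external drive matches what was read from the source. A mismatch, or a copy that never returns, means the backup should be redone before wiping anything.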