Random Restarts

Hozer

Distinguished
May 23, 2006
71
0
18,630
I built a computer about 9 months ago with the following specs:

Antec TruepowerII 550
Radeon x1900 xt
Conroe 6400
Asus P5B
corsair xms2 ddr2 (2x1gig sticks)
windows xp pro sp2


Until recently the computer had been functioning extremely well. Lately, during high CPU usage, the computer restarts. Most of the time it boots back up, and there is no Windows system error. The CPU, video card, and hard drive are all around 80 F, so I'm fairly certain heat is not the problem. I tried running with just one stick of RAM, and then with the other: same problem. I reformatted the computer: same problem. I also updated my video card drivers and flashed the most recent BIOS.

It happens sporadically, sometimes 20 minutes into the high usage, sometimes an hour.

I'm desperately trying to figure out why and resolve this, and if there are any ideas you could pass my way, I would be very grateful.

Thanks in advance!
 

Hamarabi

Distinguished
Dec 6, 2006
38
0
18,530
Forgive me, if this advice seems obvious. I have no idea if you have tried the following.

What are the motherboard and CPU temperatures under load? 80 F (27 C) must certainly be idle temperatures. Idle and load temps will vary greatly. I'm sure you have a properly installed heatsink and CPU fan, so it must be something else. What about video card temperatures? Are they within acceptable parameters under load?

Are your fans full of dust? Too much dust in the fans can cause overheating, because it will prevent air flow. Check the intake/exhaust fans, CPU fan, video card fan, and power supply fan.

Try using temperature monitoring software along with torture testing software (SpeedFan, CPU-Z, Prime95 (Orthos for dual core), 3DMark05 for video, memtest86 for memory, etc.). Test one thing at a time (i.e. CPU, RAM, or video card). If you put your system under 100% load while monitoring the temperatures, you will be able to troubleshoot a little more easily. You may be overheating under full load but not at idle, which can be remedied quite easily (fans, heatsinks, thermal compound, ambient temperatures, and airflow).

Temperatures are not always the reason for reboots; sometimes it is the voltage settings in the BIOS. Check your CPU, motherboard, RAM, and video card specs for proper voltage settings. Multipliers, dividers, timings, and voltage settings all play a crucial role.

Short circuits can cause overheating and rebooting, or worse. Check to see if there is any metal, like a missing screw lodged somewhere on the main board.

The easiest way to troubleshoot your parts would be to swap in other parts, one at a time, and test. If you don't have a spare computer that you can borrow parts from, try to borrow some from a friend. You'll need to test the power supply, RAM, CPU, heatsink/fan, motherboard, and video card. I would swap out the easiest ones (RAM, video card, and power supply) first. Of course, swap and test them one at a time to find the culprit. It's possible both sticks of RAM are bad, but unlikely. It is also possible that the video card, CPU, motherboard, or power supply is defective, even though they are new.

Good luck!

Hamarabi
 

heltoupee

Distinguished
Feb 19, 2007
79
0
18,630
^^^^^ GREAT advice!!!

Also, your PSU could be flaky and tripping off under load. Also check the power cable and power strip it's plugged into. I've solved this problem on 2 separate machines by just replacing a power strip.
 

HenrikG

Distinguished
Aug 21, 2005
63
0
18,630
I'm surprised no one has mentioned a bad Power Supply. Yeah, 550W should be enough... BUT... power supplies can go bad too. This is classic bad power supply behavior.

What else do you have in that box? How many hard drives, how many optical drives, how many peripheral cards in your PCI slots, etc.?

I wouldn't worry too much about your memory cards (or other hardware for that matter) causing the reboot. You would most likely get some sort of error (at least in your event viewer logs).

I've had the Power Supply be the cause of reboots before. Let me guess, you're in the middle of some 1337-s4uce pwnage and BLAMO... your computer restarts. :)

Get a new power supply. May I suggest a Zalman... if the problem persists, I'll be surprised.
 

Hozer

Distinguished
May 23, 2006
71
0
18,630
Wow, these are excellent responses.

The box is actually rather empty. A single 300 gig SATA Seagate, a single dvd burner. As for dust, I am very adamant about keeping it clean, so that is something I can rule out.

The event viewer is not showing anything around the time the crashes occur.

I actually tried the power settings on my processor, I think that is a very possible problem. The values were set all at "Auto" like Asus likes to set things. I changed them to median values, and the computer would restart as soon as Windows loaded. I consider myself generally competent with hardware, but the power settings in bios is something I do not understand well.

I ran memtest86, and it completed successfully with no errors.

Is there a link someone could post that would describe the best way to understand how to set core values? Or a brief description? Thanks again for the help, it is greatly appreciated.

Edit:
Also, is there a decent way to test the power supply, aside from replacing it with a different one and seeing if the problem persists? The only reason I ask is that this is the best power supply I have, and I don't think any of the others are suited to put out the power this one does.
 

alcattle

Distinguished
Jan 25, 2007
1,831
0
19,780
I recommend Mondoman's solve also, but you can test with a lesser PSU. All you want to test is the MB, CPU and RAM; those pull like 5W (don't know, but it is nothing compared to drives and GPUs).
Take a known working PSU and take the parts out onto cardboard (some people say the antistatic bag, others don't), then hook them up and jump-start it.
 

Hozer

Distinguished
May 23, 2006
71
0
18,630
alcattle wrote: "I recommend Mondoman's solve also, but you can test with a lesser PSU. All you want to test is the MB, CPU and RAM; those pull like 5W (don't know, but it is nothing compared to drives and GPUs). Take a known working PSU and take the parts out onto cardboard (some people say the antistatic bag, others don't), then hook them up and jump-start it."

This is a good idea, but the problem really only happens when I am, as HenrikG put it, "in the middle of some 1337-s4uce pwnage and BLAMO... your computer restarts. :)"

I can pick a power supply tester up on my way home today, hopefully that will give me a better idea.
 

kasperlindvig

Distinguished
Nov 28, 2006
29
0
18,530
Replace the motherboard; that should solve the problem. I had a similar issue: the resistors on the motherboard had gone bad during use and caused instability in the power to the CPU, causing random reboots. I replaced the MB and the problem disappeared. I believe the PSU, RAM and CPU are OK.
 

Hozer

Distinguished
May 23, 2006
71
0
18,630
I suppose this would be a good time to ask for a recommendation on a board. I bought the P5B when the Conroe's had just come out, and there were very few boards that supported them.

The MoBo would need to support the E6400 and be geared towards gaming. Overclocking is not a huge issue for me; I don't generally mess with it unless the extra speed is needed.

So any recommendations on a strong board would be greatly appreciated, if in fact the power source turns out to be in good shape.

Edit:
I've used Asus boards for the last 7 or 8 builds, and it seems each time I become less satisfied, so this might be a good chance to try a new board.
 

Hozer

Distinguished
May 23, 2006
71
0
18,630
So I have run Orthos and memtest86, swapped video cards, and reformatted. Would a reasonable next step be to purchase the following motherboard?

http://www.newegg.com/Product/Product.aspx?Item=N82E16813128012

If it would be worth trying to fiddle with the bios settings that control the volts, is there a link to a guide, or general practices that go along with that?

Thanks again for all the help, saves me a lot of time, and possible costly mistakes!
 

goldragon_70

Distinguished
Jan 13, 2007
731
0
18,980
If it's the motherboard, that will probably be the last thing that is replaced in the system, because just about anything/ any part, can be causing the problem. BTW are you overclocking anything?

edit: before you go too far, you might want to try resetting the BIOS too.
 

Hozer

Distinguished
May 23, 2006
71
0
18,630
I am not overclocking at all. I reset the bios to default settings, tried it, then flashed to the most current version of the bios, and am still experiencing the issue.

Prior to restarting, under full load, the temperatures are still at very manageable values, and the computer is performing very well. I have tried running multiple graphics-intensive games simultaneously: CPU usage goes to around 85%, but both games run at 60+ fps and perform very well. I have not had a restart occur in any situation other than running multiple games, or multiple instances of a game, simultaneously.

I realize that this would seem like a simple fix of "don't run under those conditions", but the computer easily has enough juice to perform, and prior to the restarts, it functions exactly how I would hope it would.
 

rquinn19

Distinguished
Sep 8, 2006
166
0
18,680
First of all, did you uncheck "Automatically restart" under System Failure, so you know whether you actually have a BSOD? If you do, it might give you some insight as to what's going on as well.
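For anyone who prefers the command line to the checkbox: on Windows XP that setting is stored as the AutoReboot value under HKLM\SYSTEM\CurrentControlSet\Control\CrashControl. Below is a minimal sketch that only builds the reg.exe command line and does not run it. The helper name is made up for illustration, and the command would need to be run as an administrator.

```python
from typing import List

# Registry key that holds the "Automatically restart" setting on Windows XP.
CRASH_CONTROL_KEY = r"HKLM\SYSTEM\CurrentControlSet\Control\CrashControl"


def build_autoreboot_command(enabled: bool) -> List[str]:
    """Hypothetical helper: return a reg.exe argument list (nothing is executed)."""
    data = "1" if enabled else "0"
    return [
        "reg", "add", CRASH_CONTROL_KEY,
        "/v", "AutoReboot",      # value name
        "/t", "REG_DWORD",       # value type
        "/d", data,              # 0 = stay on the blue screen, 1 = reboot
        "/f",                    # overwrite without prompting
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before being run by hand.
    print(" ".join(build_autoreboot_command(enabled=False)))
```

With automatic restart off, the machine stays on the blue screen, so the stop code can be written down.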
 

goldragon_70

Distinguished
Jan 13, 2007
731
0
18,980
OK, there are a few other suggestions: try another power supply; disconnect the ROM drive, any cards other than the graphics card, and any drives other than your main HDD; and then try a different HDD. After that, you can be fairly sure that it's the mobo (whether it's BIOS related or not).
 

Hamarabi

Distinguished
Dec 6, 2006
38
0
18,530
The E6400 Conroe is rated at 65 watts (TDP). Please ensure that your entire system's power requirements are met. Here is a Power Requirement Calculator you can use to determine whether your system's demands are being met (use the Lite version, it's free): http://www.extreme.outervision.com/psucalculator.jsp
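The idea behind such a calculator can be sketched in a few lines. Only the 65 W CPU figure and the 550 W supply rating come from this thread; every other wattage below is a rough placeholder assumption, not a measurement, and should be replaced with figures from the parts' spec sheets.

```python
from typing import Dict

# Rough per-component draw in watts. Only the CPU figure is from the thread;
# the rest are placeholder assumptions for illustration.
PARTS: Dict[str, int] = {
    "cpu_e6400": 65,
    "gpu_x1900xt": 120,    # assumption
    "board_and_ram": 50,   # assumption
    "hard_drive": 10,      # assumption
    "dvd_burner": 20,      # assumption
    "fans": 10,            # assumption
}


def total_draw_watts(parts: Dict[str, int]) -> int:
    """Sum the estimated draw of every component."""
    return sum(parts.values())


def load_fraction(parts: Dict[str, int], psu_watts: int) -> float:
    """Estimated draw as a fraction of the supply's rated output."""
    return total_draw_watts(parts) / psu_watts


if __name__ == "__main__":
    total = total_draw_watts(PARTS)
    print(f"Estimated draw: {total} W, "
          f"{load_fraction(PARTS, 550):.0%} of a 550 W supply")
```

A comfortable margin on paper only says the supply is rated high enough. It does not rule out a unit that has gone bad, which is the case HenrikG describes.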

The Corsair XMS2 RAM has several different chip releases that require different voltage and timing settings to run at stock speeds. Raising the Front Side Bus and voltage for your CPU will increase the clock speed, and your RAM will also speed up (possibly past its capability) unless you lower the memory divider at the same time. If your RAM wants 1.8, 1.9, 2.0, 2.1, 2.2, etc., volts to run at stock speed, then you need to configure that in the BIOS. If your default BIOS setting has your RAM running above or below its specs, you will have problems. Some RAM / motherboard combinations require a certain voltage setting to run in Dual Channel Mode, or your computer will be unstable. It could be just a matter of bumping the voltage and your system will stabilize (monitor ALL temperatures). Don't bump the voltage without first checking the specs on your motherboard and RAM. Learn to overclock, so you can understand how everything works. This will allow you to troubleshoot problems like this a great deal more easily: http://www.rebelshavenforum.com/
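The relationship between the Front Side Bus, the CPU multiplier and the memory divider is plain arithmetic, sketched below. The E6400's stock values (266 MHz bus, 8x multiplier) are its published specs; the 333 MHz bus and the 3:2 ratio are illustrative assumptions, and the function names are made up for this example.

```python
from fractions import Fraction


def cpu_clock_mhz(fsb_mhz: int, multiplier: int) -> int:
    """CPU core clock = bus clock x multiplier."""
    return fsb_mhz * multiplier


def ddr_rate(fsb_mhz: int, mem_to_fsb: Fraction) -> float:
    """Effective DDR2 data rate: memory clock (bus x ratio) doubled."""
    return float(fsb_mhz * mem_to_fsb * 2)


if __name__ == "__main__":
    # Stock E6400: 266 MHz bus x 8 = about 2.13 GHz, RAM at 1:1 = DDR2-533.
    print(cpu_clock_mhz(266, 8), ddr_rate(266, Fraction(1, 1)))

    # Raising the bus to 333 MHz while leaving a 3:2 memory ratio in place
    # pushes the RAM to roughly DDR2-1000 (illustrative numbers).
    print(cpu_clock_mhz(333, 8), ddr_rate(333, Fraction(3, 2)))

    # Dropping the ratio to 1:1 brings the RAM back down to about DDR2-667.
    print(ddr_rate(333, Fraction(1, 1)))
```

The same arithmetic applies at stock settings: if "Auto" picks a ratio that runs the sticks faster than their rating, the RAM is effectively overclocked even though the CPU is not.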

You need to run those tests (Orthos, memtest86, 3DMark, etc.) for more than just a few minutes to obtain any meaningful information. You should run each test for a minimum of a couple of hours. Some tests will run fine for a couple of hours with no errors, but then fail after 6 hours. This means your system configuration is unstable, so you will have to go into the BIOS to adjust the settings. It could be that one of your cores is not working properly. Check Orthos to make sure both cores are operating within a small variation of each other. Use Windows temperature monitoring software to keep track of ALL temperatures while under load. If one core is bad, you will have big problems trying to multi-task.

Applications will compete for the same resources and may even fight for the same memory address as your kernel, which will cause your system to crash. If either core is operating with errors, you will run into big problems, including reboots.
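If the monitoring tool can write its readings to a log, a few lines are enough to summarize a long run. This is only a sketch: the log format (minutes elapsed, degrees C) is a made-up assumption, and real tools each have their own export format that would need to be parsed first.

```python
from typing import List, Optional, Tuple

# One reading: (minutes since the test started, temperature in degrees C).
# This format is hypothetical; adapt it to whatever your tool exports.
Sample = Tuple[int, float]


def summarize_run(samples: List[Sample],
                  limit_c: float) -> Tuple[float, Optional[int]]:
    """Return the peak temperature and the first minute the limit was exceeded."""
    peak_c = max(temp for _, temp in samples)
    first_over = next(
        (minute for minute, temp in samples if temp > limit_c), None
    )
    return peak_c, first_over


if __name__ == "__main__":
    # Made-up readings from a six-hour run.
    log = [(0, 31.0), (30, 44.5), (120, 49.0), (360, 53.5)]
    peak, first_over = summarize_run(log, limit_c=50.0)
    print(f"Peak {peak} C; limit first exceeded at minute {first_over}")
```

A run that only crosses the limit hours in is exactly the case a short test would miss.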

You say that FULL Load temperatures are manageable, but you never mentioned what they were. Full load temperatures should be around 50C (122 F) for most CPUs (Lower if you have good cooling). You can safely go a bit higher, but due to engineering safety factors that were put in place to protect against abnormal events, you should strive to stay at or below that temp during full load on either core.
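Since this thread mixes Fahrenheit and Celsius, here is a small conversion sketch for comparing readings. The function names are made up for this example; the formulas are the standard ones.

```python
def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def f_to_c(fahrenheit: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5 / 9


if __name__ == "__main__":
    # The 50 C load guideline expressed in Fahrenheit.
    print(c_to_f(50))            # 122.0
    # The 80 F reading from the first post expressed in Celsius.
    print(round(f_to_c(80), 1))  # 26.7
```

An 80 F reading is therefore about 23 C below the 50 C load guideline, which supports the earlier point that it is probably an idle temperature.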

The power supply is extremely important, and if it doesn't meet the system's demands, you will have shutdowns and reboots. It obviously takes a great deal more power to run at load (playing games or running Orthos) than at idle. I would say one of the easiest ways to troubleshoot this problem would be to swap in another power supply to see if that is the culprit.

If you would like directions on overclocking or configuring your bios, just Google it or check out the forum page I linked above. There are thousands of guides out there that will help you. Warning: It will take time and study to understand it all, however, there is nothing more complicated than grade school math. The only time you will need more advanced math skills, is if you plan to re-design or mod the circuit boards, themselves.

Good luck!

Hamarabi
 

Hozer

Distinguished
May 23, 2006
71
0
18,630
Hamarabi, thank you for the very detailed post.

I just wanted to add one more thing, here is the message I get when it blue screens before restarting:

------------------------------------------------------------------------------------

Error Text

A problem has been detected and windows has been shut down to prevent damage to your computer.

While restoring the previously saved floating point state for a thread, the state was found to be invalid.

If this is the first time you've seen this stop error screen, restart your computer. If this screen appears again, follow these steps:

Check to make sure any new hardware or software is properly installed. If this is a new installation, ask your hardware or software manufacturer for any windows updates you might need.

If problems continue, disable or remove any newly installed hardware or software. Disable BIOS memory options such as caching or shadowing. If you need to use safe mode to remove or disable components, restart your computer, press F8 to select Advanced Startup Options, and then select Safe Mode.

Technical Information:

*** Stop: 0x000000E7 (0x00000001, 0x00000000, 0x00000002, 0x00000000)

Beginning dump of physical memory
Physical memory dump complete.
Contact your system administrator or technical support group for further assistance.

------------------------------------------------------------------------------------



the .dmp shows this:


-----------------------------------------------------------------------------------------------------

CUSTOMER_CRASH_COUNT: 2

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0xE7

PROCESS_NAME: WoW.exe

LAST_CONTROL_TRANSFER: from 804fe9e3 to 804f9deb

STACK_TEXT:
a321d974 804fe9e3 000000e7 00000001 00000000 nt!KeBugCheckEx+0x1b
a321d9a4 ba2f45b1 87848870 00000000 876ecc40 nt!KeRestoreFloatingPointState+0x79
WARNING: Stack unwind information not available. Following frames may be wrong.
a321da00 ba33594f 00000001 0000001b 87513008 ctoss2k+0x35b1
a321da18 ba3359a1 87513008 87e10978 e19f6c84 portcls!PcDispatchProperty+0x130
a321da40 ba3bbf4c 00000004 875da0e8 875da0e0 portcls!PropertyItemPropertyHandler+0x2b
a321daa4 ba3bbec9 87513008 0000001b e19f6c84 ks!KspPropertyHandler+0x616
a321dac8 ba333603 87513008 0000001b e19f6c20 ks!KsPropertyHandler+0x19
a321dadc ba3351df 87513008 0000001b e19f6c20 portcls!PcHandlePropertyWithTable+0x1b
a321db14 ba33369d 87e108f0 89c126c8 87513008 portcls!CPortPinWavePci::DeviceIoControl+0x1eb
a321db30 ba3bbf0f 89c126c8 87513008 a321db58 portcls!DispatchDeviceIoControl+0x49
a321db40 ba333880 89c126c8 87513008 00000000 ks!KsDispatchIrp+0x126
a321db58 ba333841 89c126c8 87513008 a321dbd8 portcls!KsoDispatchIrp+0x43
a321db68 ba393d67 89c126c8 87513008 89c52138 portcls!PcDispatchIrp+0x5f
a321dbd8 a5b242a2 87513008 875da0e8 e16613d0 ctaud2k+0x4cd67
a321dc30 ba3bbf85 87dfbba8 87513008 a321dc64 sysaudio!CPinInstance::pinDispatchIoControl+0x153
a321dc40 804ef095 87dfbba8 87513008 806e4410 ks!DispatchDeviceIoControl+0x28
a321dc50 8057e70a 87513150 8767fb60 87513008 nt!IopfCallDriver+0x31
a321dc64 8057f56d 87dfbba8 87513008 8767fb60 nt!IopSynchronousServiceTail+0x60
a321dd00 805780c2 00002308 00002150 00000000 nt!IopXxxControlFile+0x5c5
a321dd34 8054086c 00002308 00002150 00000000 nt!NtDeviceIoControlFile+0x2a
a321dd34 7c90eb94 00002308 00002150 00000000 nt!KiFastCallEntry+0xfc
0012fca4 00000000 00000000 00000000 00000000 0x7c90eb94


STACK_COMMAND: kb

FOLLOWUP_IP:
ctoss2k+35b1
ba2f45b1 ?? ???

SYMBOL_STACK_INDEX: 2

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: ctoss2k

IMAGE_NAME: ctoss2k.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 42f6830d

SYMBOL_NAME: ctoss2k+35b1

FAILURE_BUCKET_ID: 0xE7_ctoss2k+35b1

BUCKET_ID: 0xE7_ctoss2k+35b1

Followup: MachineOwner
-------------------------------------------------------------------------------------------------------------------------
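When comparing several dumps, it helps to pull out just the summary fields. Below is a minimal sketch that extracts the "KEY: value" lines from analysis text like the output above. The sample is a shortened copy of that output, the function name is made up, and the parsing rule (an all-caps key followed by a colon) is an assumption about the layout rather than a documented format.

```python
import re
from typing import Dict

# An all-caps key, a colon, then an optional value on the same line.
FIELD = re.compile(r"^([A-Z_]+):[ \t]*(\S.*)?$")

# Shortened copy of the analysis output posted above.
SAMPLE = """\
DEFAULT_BUCKET_ID: DRIVER_FAULT
BUGCHECK_STR: 0xE7
PROCESS_NAME: WoW.exe
STACK_TEXT:
a321d974 804fe9e3 000000e7 00000001 00000000 nt!KeBugCheckEx+0x1b
MODULE_NAME: ctoss2k
IMAGE_NAME: ctoss2k.sys
"""


def parse_fields(text: str) -> Dict[str, str]:
    """Collect KEY: value pairs, skipping keys with no value on the same line."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if match and match.group(2):
            fields[match.group(1)] = match.group(2).strip()
    return fields


if __name__ == "__main__":
    info = parse_fields(SAMPLE)
    print(info["BUGCHECK_STR"], info["PROCESS_NAME"], info["IMAGE_NAME"])
```

In this dump the fields point at ctoss2k.sys, and the stack passes through the Windows audio components portcls and sysaudio. That suggests checking the sound driver as well as the hardware, though a single dump is not conclusive.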



Maybe this is helpful?
 
I'd look at the PS, too. You should be able to get a cheap ($10-15) power supply tester from Radio Shack or online.

Might work, but I don't think that that is particularly promising. A momentary dropout of any of the PSU outputs can cause the "POWER GOOD" signal to drop which can reset the CPU.

You can try increasing the CPU core and memory voltage by 0.1 V DC or so, to ensure that the CPU and memory are getting enough power for whatever you are doing.
 

Hozer

Distinguished
May 23, 2006
71
0
18,630
The board is set to "auto", so I'm not sure what value would be .1 volt higher than "auto". Also, when I changed it to a medium value in the range, it just reset as soon as Windows loaded.