New computer random crashes - I have tried everything!

DeviantMonkey

Distinguished
Nov 6, 2010
12
0
18,510
Hello everyone, I recently built a new computer and it is having random crashes. And when I say random I mean random - I got through two levels of L4D2 yesterday, but sometimes I can't even open Firefox without it crashing. I have tried diagnosing everything I can think of and was hoping for some external ideas.

First, my specs:
Asus Sabertooth X58 Motherboard
Intel i7 950 w/ stock cooler
6GB Corsair PC-12800
PNY GeForce GTX 460
Seagate Barracuda 1TB drive.
ULTRA 1KW power supply
Windows 7 64 bit

Now what I have tried:
Configured RAM using XMP, checked that it is on Asus's approved memory list. It runs stable for as long as I want in Memtest86.
I sent the motherboard in to Asus. They sent it back to me saying nothing was wrong.
I checked my hard drive with chkdsk, and SMART values are fine.
I had these problems before I left for a trip, and when I got back my RAM, CPU, and video card had been stolen. With replacement parts, the problem persists.
The power supply was used in my old computer with no problems, and the rated wattage on it makes me think it isn't the problem.
I boot Ubuntu off of a flash drive, and I still get random crashes, and my girlfriend has the exact same setup as me and her system is completely stable.
I can run a "stress test" on my CPU, RAM, video card, and hard drive in Everest with no problems. Temps stabilize at around 80C.

My conclusions:
I don't think it is software related, since the same thing happens with two different OSes. Similarly, since Ubuntu didn't even mount my hard drive, I don't think the drive is the issue either.

I have tried two different graphics cards and processors. I would be stunned if they both had the same error or something.

The mobo has been checked out by Asus and they saw nothing wrong. I am inclined to trust them.

I don't think it is the power supply since it has more than enough rated power and the voltages were stable in the stress test.

I have really thoroughly tested the RAM, since random crashes normally scream RAM errors. Memtest86 running stable overnight seems to disprove this, though. I know I am using approved RAM with its hard-coded settings too.

My question:
Can you guys think of anything else that may be causing this problem? I really can't figure out the culprit!
 
Lutfij

Titan
Moderator
Welcome to the forums, newcomer! Though your post shows you're not a newcomer in this field :)

Um, here's a small question before I answer yours: was your machine prebuilt or built by you?

Chances are that you're having a short within your case or your PSU. Remove the board from the case and check that the mobo is on stand-offs. Double check the connections to the board, including the 8-pin power connector next to the CPU (it's usually a tiny error we overlook).

Don't get me wrong, but I don't trust the PSU manufacturer. You can try getting another 650W PSU from a reputable brand such as Tt, Corsair, Antec... and see if the problem persists. Do both you and your gf have the same workload on the machines? (You mentioned both of you have the same rigs.)

Lastly, take an eraser and rub the gold contacts of the RAM sticks, reseat them in the machine, and see if the problem persists. Another thing to do: get another set of RAM (maybe your gf's) and check if the memory is stable.
 
Random crashes: memory, fluctuating power, and some other stuff (heat *appears* to be eliminated for the moment).

I know you've run Everest and memtest. Despite that, please run Prime95 for an hour or two using CPUID's Hardware Monitor to watch temps, and record the high. My experience is that Prime95 uncovers more memory errors than memtest, perhaps because of the heat generated and overall stress level. Everest probably works fine, but we *know* CPUID Hardware Monitor.

Hopefully Prime95 will crash before too long with a rounding error, and we'll have found a memory issue to work through.

Ultra makes a decent psu, and this one has plenty of power, though it's "used". We'll worry about that possibility later.
 

DeviantMonkey

Though your post shows you're not a newcomer in this field :)

Thanks! That is quite the compliment!

Um, here's a small question before I answer yours: was your machine prebuilt or built by you?

Chances are that you're having a short within your case or your PSU. Remove the board from the case and check that the mobo is on stand-offs. Double check the connections to the board, including the 8-pin power connector next to the CPU (it's usually a tiny error we overlook).

I built it myself. I would try swapping out components with the GF, but she is 1000 miles away at the moment. (Part of the reasons we wanted the rigs were so we could play some SC2 as bonding time :) ) I know that there isn't a short in the computer with the motherboard, because that was the issue I had when building my first computer - I am very paranoid about them now. I checked the connections on the mobo, and they are both fine.

Lastly, take an eraser and rub the gold contacts of the RAM sticks, reseat them in the machine, and see if the problem persists.

I am a bit skeptical about this. What is the reasoning behind doing this? The contacts are clean enough for it to pass memtest - I would think dirty contacts would cause larger problems than just resets.

Try your gf's power supply too, or try yours in hers, and see if the problem persists. Then you'll know if it's that.

As mentioned above, I can't do that. I will however give my old power supply a try. It is only 500 W but it is worth a shot.

Hopefully Prime95 will crash before too long with a rounding error, and we'll have found a memory issue to work through.

I will give Prime95 a shot. Would you mind explaining what a rounding error is? I am having problems thinking of how a digital machine would need to round at the RAM level. I could understand for graphics cards rounding but not the system itself.

Thank you all for responding, I am sure you can see why I am so frustrated!
 
P95 is calculating prime numbers, i.e., doing a lot of math. If the math doesn't turn out as expected, it will detect a bit error and say it found a rounding error: a result different than expected.
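To make the "rounding error" idea concrete, here is a minimal Python sketch, not Prime95's actual code. It assumes the general technique Prime95 is known for: multiplying big integers with floating-point FFTs, where every result should land almost exactly on a whole number. The function names, the base-10 digit split, and the `corrupt` flag that fakes a hardware fault are all hypothetical choices made for illustration.

```python
import cmath

def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def multiply_with_check(a, b, corrupt=False):
    """Multiply two non-negative ints via FFT.

    Returns (product, worst_rounding_error), where the error is the
    largest distance of any raw result from the nearest whole number.
    """
    digits_a = [int(ch) for ch in reversed(str(a))]
    digits_b = [int(ch) for ch in reversed(str(b))]
    size = 1
    while size < len(digits_a) + len(digits_b):
        size *= 2
    fa = fft(digits_a + [0] * (size - len(digits_a)))
    fb = fft(digits_b + [0] * (size - len(digits_b)))
    spectrum = [x * y for x, y in zip(fa, fb)]
    if corrupt:
        # Hypothetical stand-in for a flaky bit somewhere in the machine.
        spectrum[1] += 0.45 * size
    raw = [v.real / size for v in fft(spectrum, invert=True)]
    worst = max(abs(v - round(v)) for v in raw)
    product = sum(round(v) * 10 ** i for i, v in enumerate(raw))
    return product, worst

product, error = multiply_with_check(123456789, 987654321)
print(product == 123456789 * 987654321)  # healthy run: exact answer

_, bad_error = multiply_with_check(123456789, 987654321, corrupt=True)
print(bad_error > 0.4)  # faulty run: results drift away from whole numbers
```

On healthy hardware the error stays tiny, so a result that lands far from a whole number points at the machine rather than the math. The 0.4 cutoff here is only an illustrative threshold.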

Hang in there - odds are good we'll eventually get this one together.

Edit: Erasing the contact points actually removes any tarnish that might interfere with contact. It's a long shot play, but . . . you've tried everything :)
 

Lutfij

Thanks! That is quite the compliment!

You're welcome!

Erasing the contact points actually removes any tarnish that might interfere with contact. It's a long shot play, but . . . you've tried everything

:) Thank you, Twoboxer, for that.

I had a similar problem. Weird stability issues: my rig restarted several times (even after a cold start) before it reached the desktop. I learnt the contact cleaning trick when I noticed some 'tarnish' on the contacts; after cleaning, my machine was stable again.

You need to think of your rig as a Formula One car: have things well oiled and they'll run smooth. Have some dirt in the pistons and they won't fire properly.

Bonding time huh!?! :) SC2 was an awesome game, looking into the Razer/SC2 gaming hardware this vacation.
 

DeviantMonkey

It ran Prime95 overnight with no problems. When I tried to get online to post this, though, it crashed. I am basically starting to think it works fine unless I am around to watch it, haha. Since Prime95 ran fine, I am suspecting the RAM less and less.

To double check that it isn't a grounding issue, I have removed my motherboard from my case and put it on an insulating surface (my plastic cutting board). I have also unplugged anything that isn't essential. It is just as unstable now as it was before.

Swapping out the power supply is next on my list. If I remember correctly though, my old power supply only has the 4-pin connector for the motherboard that goes next to the cpu. My new MB uses the 8 pin connector. Can you think of a way around this?

One pattern I have noticed is that when it crashes, it is more likely to crash again if I try to boot immediately. Right now it can't even get to Windows before it dies.

Also, sometimes my BIOS gives an error that "overclocking has failed" even though I am not overclocking. Could this be an XMP issue? Also, my graphics card is factory overclocked, but I doubt that is the problem.
 

Lutfij

AH, that was the problem! You're underpowering your board, and by that I mean you don't have the necessary power connectors plugged in.

Can you think of a way around this?
Get a 4-pin Molex to 8-pin CPU connector/converter like this. Available at any Microcenter or electrical store (with some looking).

To rule out the XMP issue, go into BIOS and set your memory profile to auto/default (or one that has the lowest timings).
 
Yes, it does.

Clear CMOS** and reload BIOS defaults. Let the board run memory as it wants with default settings.

Also, uninstall or disable any Windows programs that can affect "clock" settings. (Mobo & other utilities, etc.)

See if that changes things.

** Pull plug, remove mobo battery, press case power switch a few times, go grab a cup of coffee (wait 5 min), come back and replace battery, plug into wall, boot straight into BIOS, load defaults, save, exit, and boot into windows.
 

DeviantMonkey

AH, that was the problem! You're underpowering your board, and by that I mean you don't have the necessary power connectors plugged in.

What do you mean? I am running with the new power supply which does indeed have the 8 pin connector. I was questioning if it would be worth it to try a power supply swap if I only have the 4 pin on the old one.

Yes, it does.

Clear CMOS** and reload BIOS defaults. Let the board run memory as it wants with default settings.

Again, I think we have pretty much cleared the RAM at this point. Memtest and Prime95 can both run overnight with no problems. I am not really overclocking my RAM anyway. XMP is just making it run at what it is spec'ed to run at. Or do you mean there is another setting that is interfering?


EDIT: I have begun to realize it is unstable at idle rather than at load. Does this seem significant to anyone?
 
AFAIK, the max memory speed of the 950 is 1333. 1600 requires an "overclock". Yes, it should happen automatically.
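For anyone lining up the "PC-12800" label in the build list with the 1600 figure, the two are the same rating in different units: the module number is peak bandwidth in MB/s, and a DDR3 module moves 8 bytes (64 bits) per transfer. A small Python sketch of that arithmetic follows; the function names are just illustrative.

```python
BUS_WIDTH_BYTES = 8  # a DDR3 module transfers 64 bits (8 bytes) at a time

def transfer_rate_from_rating(module_rating_mb_s):
    """Module rating in MB/s (the number in 'PC3-12800') -> million transfers/s."""
    return module_rating_mb_s // BUS_WIDTH_BYTES

def peak_bandwidth_mb_s(million_transfers_per_s):
    """Million transfers/s (the number in 'DDR3-1600') -> peak MB/s."""
    return million_transfers_per_s * BUS_WIDTH_BYTES

print(transfer_rate_from_rating(12800))  # 1600
print(peak_bandwidth_mb_s(1333))         # 10664
```

So running PC-12800 sticks at 1333 leaves some of the rated bandwidth unused, while running them at 1600 asks the memory controller for the full rated speed.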

But if you haven't OC'd or changed anything else on the board, why does the mobo protest about "OC failed"? It's obviously dealing with a setting it doesn't like. Where did that setting come from? Corrupt CMOS, an errant finger?

So I'd suggest you try resetting the mobo to "stock". Maybe it's not the memory, maybe some other setting is wrong. Maybe running the memory at "stock" 1333, not XMP, will resolve the problem. Dunno.

If the instability goes away, we at least have narrowed down where the problem is.
 

DeviantMonkey

AFAIK, the max memory speed of the 950 is 1333. 1600 requires an "overclock"

...

But if you haven't OC'd or changed anything else on the board, why does the mobo protest about "OC failed"? It's obviously dealing with a setting it doesn't like.

True and True.

First I am going to try disabling SpeedStep and possibly bumping up the core voltage on my CPU. I am suspicious of it due to the fact that it seems to behave well at full power but not at idle.

If that doesn't work, I will reset my CMOS and BIOS settings.
 
If a current BIOS setting is wrong somehow, one doesn't correct it by continuing to make changes from an unknown or unstable base. Regardless of the result, you won't know what you are running.

I strongly urge you to go back to ground zero. Test it. Then make changes from the NOW-KNOWN base.
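The "known base first, then one change at a time" approach can be sketched in a few lines of Python. This is only an illustration: the setting names are hypothetical, and `is_stable` stands in for what is really hours of manual testing after each BIOS change.

```python
def isolate_setting(baseline, wanted, is_stable):
    """Change one setting at a time from a known-good baseline.

    baseline:  dict of settings already confirmed stable (e.g. BIOS defaults)
    wanted:    dict of setting name -> the value you would like to run
    is_stable: callable(settings) -> bool; in real life, a long manual test
    Returns the names of settings that break stability when changed alone.
    """
    if not is_stable(baseline):
        raise ValueError("baseline is not stable; fix that before changing anything")
    culprits = []
    for name, value in wanted.items():
        trial = dict(baseline)   # always start again from the known base
        trial[name] = value      # and change exactly one thing
        if not is_stable(trial):
            culprits.append(name)
    return culprits

# Hypothetical example: pretend the machine crashes whenever XMP is on.
baseline = {"xmp": False, "speedstep": True, "turbo": True}
wanted = {"xmp": True, "speedstep": False, "turbo": False}

def fake_is_stable(settings):
    return not settings["xmp"]

print(isolate_setting(baseline, wanted, fake_is_stable))  # ['xmp']
```

The point of copying the baseline on every pass is that each result can be blamed on a single change, which is not possible when tweaks are stacked on an unknown starting point.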
 

DeviantMonkey

I strongly urge you to go back to ground zero. Test it. Then make changes from the NOW-KNOWN base.

I have. I am testing it now.

I don't know if I have mentioned this, but I seem to randomly lose connection to the internet as well. Windows says there is no cable connected. I doubt it is the router, since I am plugged into the same ethernet port that my old system used with no problems. I can fix it by installing a different driver. Switching between two drivers has been my workaround, but it is incredibly annoying. I also think this may be related to the larger problem. Any suggestions?
 

DeviantMonkey

Re-format and perform a clean install of the OS.

I'd be surprised if it were the OS. I booted into Ubuntu off of my flash drive again to humor myself and it crashed quite quickly.

After giving up and just iterating through the various options in the BIOS, I am beginning to very strongly suspect the RAM. The crashes happen when I enable XMP. Frankly this blows my mind: it is beyond what the i7's onboard memory controller considers non-overclocked, but both Corsair and Asus say it can run at that speed with no problem. Plus, Prime95 and Memtest were both run on the RAM with XMP enabled with no problems. What gives?! Any suggestions or explanations would be most welcome.

I am also running on my old power supply, but the crashes seem just as regular as before.
 

DeviantMonkey

I strongly urge you to go back to ground zero. Test it. Then make changes from the NOW-KNOWN base.

I have. I am testing it now.

I did go back and do a complete reset. Because of that, and me iterating through all the BIOS options, I am fairly confident that it is the XMP that is causing the crashes. The question now becomes why?

The memory is clearly configured to go at the XMP settings, and the Mobo is clearly configured to run at those settings as well. I don't know how it could have passed Prime95 and Memtest and still be unstable at those values, even though it clearly is. I am confused. A call to Asus and Corsair will transpire tomorrow.
 
Hehe. "Clearly" is a word we don't use too often around here, except maybe in the case of a fire-blackened psu. In fact, when you think about it, it's sort of amazing this crap works at all. Anyhow . . . sorry I missed the "I am testing it now" . . . I was jumping very quickly on the "OC failed" symptom.

Look. I still don't know exactly what you have done and not done, except now you say you're fairly sure it's XMP.

If you did the CMOS reset, and now the system works, you still can't be certain it's the XMP. Once you are sure it's running stable, you can go back and re-XMP.

If you did the CMOS reset, got it to work for a while, then put it back into XMP, and then it failed again . . . as long as it ran well for a long enough period of time without XMP . . . you now know it's XMP. I'll assume (dangerous) that's where you are now.

So, something makes the PC unstable in normal operation when the memory runs faster, yet does NOT make it unstable when running Prime95, memtest, etc.

You could try running Prime95 and Furmark while in XMP and see if it fails. But I don't think you have to.

You now have three suspects . . . motherboard, psu, memory, probably in that order. You can't push the memory voltage any higher than 1.65V, and that's where it's supposed to be now.

So, if I had another psu, I would swap it in. If it runs, RMA the psu. If it still fails in XMP we are down to two suspects, and I would RMA the memory first.

Unless your phone calls yield a better approach. GL. Please let us know.
 
Solution

DeviantMonkey

If you did the CMOS reset, got it to work for a while, then put it back into XMP, and then it failed again . . . as long as it ran well for a long enough period of time without XMP . . . you now know it's XMP. I'll assume (dangerous) that's where you are now.

This is what I did. I spent a good portion of my day playing a lot of half-games of SC2 as I was testing out one BIOS option after another. Only one option (XMP off) had any effect on the stability. It has been about half a day now without a crash with my BIOS exactly the way I want it sans-XMP.

You now have three suspects . . . motherboard, psu, memory, probably in that order. You can't push the memory voltage any higher than 1.65V, and that's where it's supposed to be now.

So, if I had another psu, I would swap it in. If it runs, RMA the psu. If it still fails in XMP we are down to two suspects, and I would RMA the memory first.

Switching to the PSU that is intended to run the system is on the to-do list for tomorrow (I am on my old one now). I will call both Asus and Corsair tomorrow. I will be beyond upset if Asus tells me to send it back for the second time. Then again, at least it will work.
 
Don't be terribly surprised if it runs well on the new psu. It would be good to try that first so you don't unnecessarily waste time and frustrate yourself calling Asus . . . and maybe Corsair, too.

I didn't know you were running with an old psu - what is the make/model of the psu installed now, and what is the new one?
 

DeviantMonkey

Don't be terribly surprised if it runs well on the new psu.

I am running the PSU I intended to run with this computer (the ULTRA 1KW) now. This is the one I had running it originally, before I swapped in my even older power supply, an Antec Phantom 500W.

It has crashed once now. I think I had both a bad power supply and bad memory timings. I am going to fiddle around with just running this setup for now, but I may swap back to my Antec.

To Do List:
Confirm the ULTRA can't cut it
If that is the case, revert to my Antec
Mess with my memory timings more. I can't find a number for Corsair, so I am stuck with their "1 answer every 24 hours" email tech support.
 
