
Troubleshooting a relatively new build

September 24, 2009 9:21:31 PM

I started putting my first PC together in May, and it's been a lot of fun.

* CPU: Intel Core i7 920
* Motherboard: Asus X58 P6T
* Cooler: Cooler Master Hyper N520
* RAM: OCZ Gold 3 x 2GB 1600MHz (6 GB)
* GPU: BFG NVIDIA GeForce 260 OC Maxcore (Core 216)
* PSU: PC Power & Cooling 750W Quad Red Silencer
* Case: Antec Nine Hundred Two
* HDD: 1TB Samsung Spinpoint F3 and 500GB WD Caviar
* OS: Windows 7 RTM 64-Bit

Wow, it's fast (not overclocked)!!! I'm not a hard-core gamer, but I had to pick up a couple games to test the muscle in this baby. It plays Crysis, Batman Arkham Asylum, and Resident Evil 5 great.

And I love Windows 7.

It runs great most of the time. But every once in a while, it freezes. No blue screen or anything, but the 21" Samsung LCD immediately displays a "Not Optimum Mode Recommended mode: 1650x1080 60Hz" message.

Initially, this seemed to happen when I left it on overnight, so I thought it was a standby thing. So I turned off all standby modes, but it didn't solve the problem. I also played with S1 versus S3 standby in the BIOS, but it didn't help.

I thought it might be the monitor, so I tried my old Dell FP, and it would also lose signal in the middle of the night.

I googled, and saw someone had a similar issue (which remains unsolved), but he pointed out that it's probably not a monitor issue, as he notices his keyboard caps-lock indicator doesn't light up when he's in this state. That's what I'm seeing too.

And then, occasionally, it would do this same thing WHILE I WAS USING THE COMPUTER. This was annoying, but it was revealing--I was playing a game, and the audio seemed to loop. So it's definitely not just a display issue, or USB issue... it seems to be locked up.

As I first mentioned, this is my first build, so I need your help in troubleshooting. I've tried unplugging everything and checking every cable and plugging it all back in tight. And I'm monitoring my CPU and GPU temps, and they are okay. And my computer is not dusty.

The frequency of these issues is: if I leave it running overnight, it happens about every other night; when I'm actually using my computer, it's probably locked up on me about 5 or 6 times since May.

Oh, and it's not the OS--I've reinstalled Windows 7 several times in moving from RC to RTM.

Can you guys assist me in troubleshooting this issue? Could it be some other kind of overheating that I'm not monitoring? Northbridge or something? Does this sound like a Motherboard issue? I never lose power, so does that eliminate power supply issues?

Also, I had this problem before and after changing CPU coolers, and before and after adding the 1TB HD.

Thank you in advance!

Z

September 25, 2009 1:18:59 AM

I have run some stress tests earlier, and they seemed fine--memtest, prime95, furmark, etc... but I haven't run them in a few months now, and I never ran anything except memtest for more than maybe 45 minutes--they all seemed to stabilize.

I do always use RivaTuner and Everest for monitoring, using the Gadget/Widget thing to view. Currently, while not doing much, my GPU is at 41C and my CPU cores are between 32C and 36C. When playing 3D games, GPU goes up to high 50s and CPU cores go up to the 40s.

Current Ambient temp is 34C. Current System temp is 34C.

Let's see, the last time I really checked temps was after installing this new cooler in June. I've got some numbers here somewhere... no, can't find 'em. I'll rerun prime later, and run memtest overnight and see if it doesn't freeze.

Z

a b B Homebuilt system
September 25, 2009 1:42:41 AM

We do not use RivaTuner and Everest for monitoring, so it would be helpful if you took the time to DL the requested utilities and check temps.
September 25, 2009 1:59:32 AM

Okay. I have those already--just had the Everest numbers at hand via the Win7 Gadget.

No problem. I'll get back to you later tonight.
September 25, 2009 11:57:40 AM

Here are some test results. I didn't get a chance to run everything--I have a newborn at home, and my time was kinda divided. :-)

First, I ran Prime95 for 15 minutes with the following settings:



And the HWMonitor results were:



Ambient temp in the room was about 22C.

Then I ran a FurMark stability test for 10 minutes, but I screwed up the screenshots. The GPU got as hot as 67C.

But I wanted the screenshots, so I reran for 15 minutes. This time, it only peaked at 62C:





I then kicked off memtest86+ and went to bed. Woke up this morning and it said the test was complete, no errors, everything passed.

How do my temps look? Any ideas?
September 25, 2009 3:27:51 PM

Oh, one more note. I've tried multiple BIOS versions, and I'm on the most recent one now. The problem exists on all BIOS versions tested.

Z
a b B Homebuilt system
September 25, 2009 10:21:54 PM

Your temps are very solid, fine. So that's not the issue.

You tried two monitors, so losing synch with the monitor's resolution is not the issue.

Corrupt/confused Drivers? Seems not to be, since you say you have re-installed the OS several times and you have had the same issue across the installs.

Personally, I'd run Prime95 for several hours semi-observed and then continue overnight with "Detect Rounding Errors" checked. It's unlikely to fail, but then again these failures are pretty obscure.

If that fails, I guess I'm liking the mobo for this problem.

September 26, 2009 12:29:38 AM

Okay, sounds good. Blend mode is okay?

But if it just freezes after several hours, will we actually learn anything? Does Prime95 or HWMonitor create a log that will indicate anything?
a b B Homebuilt system
September 26, 2009 12:40:13 AM

I want to make sure we can't make it fail consistently after, say, 1 hr 14 min of Prime95. If that were to happen, and it were not heat related, that might tell us something.

That's why the first run semi-observed, for much longer than the 15 minutes you did.

The rest of the run is for . . . having something to do overnight lol. You're right, prolly won't learn anything - but may if it runs long enough to get a "rounding" error.
September 26, 2009 4:13:53 AM

I just ran it for just under 3 hours. Similar peak temps on the CPU--high 50s. Didn't crash or freeze. But I had to stop it because I am sick and can't stay up to semi-monitor it. Really don't like cooking my CPUs for 11 hours, unmonitored...

but I just realized I forgot to turn on Detect Rounding Errors. I'll run it again tomorrow for several hours.

Z
a b B Homebuilt system
September 26, 2009 8:44:20 AM

OK Z.

If it doesn't approach the "fry" zone after 3 hours, if you didn't notice any gradual climb in max temp after an hour or more, overnight is safe.

Be interesting to see it run without "losing the monitor".
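For anyone who would rather check for a "gradual climb" from numbers than by eye, here is a rough Python sketch. It assumes you have jotted down (minutes elapsed, max CPU temp in C) pairs during the run; the function names, the one-hour warm-up cutoff, and the 1 C per hour threshold are all made up for illustration and are not from Prime95 or HWMonitor.

```python
def slope_c_per_hour(samples):
    """Least-squares slope of temperature over time.

    samples: list of (minutes_elapsed, temp_c) pairs.
    Returns the fitted trend in degrees C per hour.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_c = sum(c for _, c in samples) / n
    num = sum((t - mean_t) * (c - mean_c) for t, c in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one point in time")
    return num / den * 60  # per-minute slope converted to per-hour


def is_climbing(samples, skip_minutes=60, threshold_c_per_hour=1.0):
    """True if temps are still trending up after the warm-up period.

    skip_minutes and threshold_c_per_hour are arbitrary assumptions;
    adjust them to taste.
    """
    settled = [(t, c) for t, c in samples if t >= skip_minutes]
    return slope_c_per_hour(settled) > threshold_c_per_hour
```

Readings that wobble around the same value after the first hour give a slope near zero, which is the "no gradual climb" case described above.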
September 27, 2009 10:31:19 PM

I ran it today with Detect rounding errors on. Checked it after 3 hours. Everything okay--CPU temps maxed at 59C. Checked it after 4 hours, same results. Sometime before 5 hours, freeze occurred as mentioned in original post.

Any other ideas? Or time to try to RMA to Asus?

Z

September 27, 2009 10:42:35 PM

Disable EIST and C1 state in BIOS. The new Core i5s/i7s crash when idle... probably an undervolting issue. I'm not sure if it's just a Win7 issue, but that's what I run as well. I've reported it to Intel.
September 27, 2009 11:58:58 PM

I can give that a try.

Can you point me to a reference somewhere on this problem?

Also, let me make sure I understand what I'll be losing by disabling these features. I'll be losing automatic down-clocking which will use more power, right? And I'll be losing the turbo feature of the i7 when only using 1 core?

Z
September 28, 2009 3:45:28 AM

zinzan said:
I can give that a try.

Can you point me to a reference somewhere on this problem?

Also, let me make sure I understand what I'll be losing by disabling these features. I'll be losing automatic down-clocking which will use more power, right? And I'll be losing the turbo feature of the i7 when only using 1 core?

Z


There's no reference, I could only find a few google hits on similar problems. I discovered the fix myself after a few days of frustrating troubleshooting. Right, no power saving, no turbo mode.
a b B Homebuilt system
September 28, 2009 6:02:55 AM

Well, the cpu was hardly idle when this shutdown occurred. It was running 8 threads of Prime95.

I guess I'd put the problem to Asus and Intel. Having similar shutdowns at idle (after several hours) and in Prime95 (after 5 hours) says to me that one of the two devices has an issue. I'd like to hear what they say.
September 28, 2009 10:24:24 AM

Good point about the freeze (not shutdown) during Prime95.

When you say "put the problem to Asus and Intel", do you mean for me to open tickets with them through their support?

Z
a b B Homebuilt system
September 29, 2009 6:43:47 AM

Yes.
September 29, 2009 4:33:42 PM

Okay.

So I talked to Intel, and they basically said it doesn't sound like a processor issue. They said the issue would be less sporadic/more constant.

I called Asus, and they said it could very well be a motherboard issue. They have issued an RMA.

Hopefully, this will resolve the problem.

I forget... I won't have to reinstall my software when I replace the motherboard, will I? I'll just have to reauthenticate Windows 7?

Z
a c 113 B Homebuilt system
September 29, 2009 6:28:11 PM

That's correct, as long as it's the same model MB.
a b B Homebuilt system
September 30, 2009 1:43:05 AM

Those are two pretty reasonable responses from the Big Guys. Can't wait to see what happens next.

Please let us know.
October 7, 2009 12:42:17 PM

Just got the replacement mobo from Asus yesterday. Hope to put it in tonight. Unfortunately, I'm out of town all weekend, starting tomorrow. I guess I'll just play with it a few hours tonight, then leave it running, and hopefully it will still be going on Sunday night when I return home.

Z

a b B Homebuilt system
October 7, 2009 4:37:31 PM

We'll be here. Hopefully the new mobo will fix it.
October 8, 2009 3:37:06 AM

Well, I've installed the replacement motherboard. It is clearly a used board--maybe that's not a surprise. But I was disappointed to see old thermal paste all over the CPU socket cap and the metal spring clip holding it in place. I lifted it out, and as soon as I did, a "ball" of the dried thermal paste fell into the socket. Not the middle pin area, but in the tall pin area around the outside of the socket.



I was tempted to just return it to Asus right then, but unfortunately, I had already pulled my board out and cleaned the CPU and HSF and just wanted to get this tested. I took some compressed air, and blew the ball of thermal paste out. It didn't go easily, and left a little residue--I was wary of bending the pins, so I just dropped a little ArctiClean in the area, let it dry, then blew it again with the compressed air.

It's running okay. Temperatures seem about the same as before, but I haven't stressed it yet. I'll run a few tests tonight, then will leave it to run over the weekend. Hopefully, it will still be running when I get home, and I'll run some further stress tests on Sunday.

Lesson learned--thoroughly check out the replacement part BEFORE uninstalling your current parts!!!

I'll send an update later.

Z
October 12, 2009 2:27:58 PM

Well, damn! It was locked up when I got home last night--didn't have the Samsung "not optimum mode... recommended mode..." error dancing around the screen, but the monitor was black, and the computer was still on. USB keyboard didn't seem to do anything (caps lock indicator wouldn't come on). I rebooted, messed around with the computer for a couple hours, then left it on.

Same thing this morning. *sigh*

Any ideas of how to log this error? Could it be a keyboard issue? I might try switching back to a PS2 keyboard...

Z
October 14, 2009 4:11:50 AM

Proximon,

If you recall, I've had this issue occur WHILE RUNNING PRIME95 overnight. It doesn't seem like a standby issue, but I could be wrong. I think I mentioned that I thought it might be a sleep/standby issue in the past, so I turned off all sleep options in Windows, and also tried S1 instead of S3 in the BIOS.

Any other ideas, or any way to log exactly what the computer is doing when it freezes up?
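One low-tech option that would at least pin down when a hard freeze happens is a heartbeat log: a tiny script that appends a timestamp to a file every few seconds and forces it to disk. After a lockup, the last line shows roughly when the machine stopped, which can be lined up against temperature logs or the Windows event log. This is only a sketch; the file name, the 10-second interval, and the function names are my own assumptions, and it will not say why the freeze happened.

```python
import os
import time
from datetime import datetime
from pathlib import Path


def write_heartbeat(log_path, now=None):
    """Append one ISO timestamp line and force it onto the disk."""
    stamp = (now or datetime.now()).isoformat(timespec="seconds")
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(stamp + "\n")
        fh.flush()
        os.fsync(fh.fileno())  # ask the OS to write through its cache
    return stamp


def last_heartbeat(log_path):
    """Return the most recent readable timestamp, or None if there is none."""
    path = Path(log_path)
    if not path.exists():
        return None
    lines = path.read_text(encoding="utf-8").splitlines()
    for line in reversed(lines):
        try:
            return datetime.fromisoformat(line.strip())
        except ValueError:
            continue  # skip a blank or half-written final line
    return None


def run(log_path="heartbeat.log", interval_s=10):
    """Write a heartbeat forever; stop it with Ctrl+C."""
    while True:
        write_heartbeat(log_path)
        time.sleep(interval_s)
```

Start `run()` before leaving the machine overnight, and after the next reboot call `last_heartbeat("heartbeat.log")` to see the time of the final entry.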

Surely this must be fixable, or at least avoidable. I'm using quality components.

Z
a c 113 B Homebuilt system
October 14, 2009 5:40:38 AM

Are we talking about the same issue? or a different one?

You could have more than one thing wrong you know ;) 
October 14, 2009 3:06:29 PM

Quote:
It runs great most of the time. But every once in a while, it freezes. No blue screen or anything, but the 21" Samsung LCD immediately displays a "Not Optimum Mode Recommended mode: 1650x1080 60Hz" message.

Initially, this seemed to happen when I left it on overnight, so I thought it was a standby thing. So I turned off all standby modes, but it didn't solve the problem. I also played with S1 versus S3 standby in the BIOS, but it didn't help.

I thought it might be the monitor, so I tried my old Dell FP, and it would also lose signal in the middle of the night.

I googled, and saw someone had a similar issue (which remains unsolved), but he pointed out that it's probably not a monitor issue, as he notices his keyboard caps-lock indicator doesn't light up when he's in this state. That's what I'm seeing too.

And then, occasionally, it would do this same thing WHILE I WAS USING THE COMPUTER. This was annoying, but it was revealing--I was playing a game, and the audio seemed to loop. So it's definitely not just a display issue, or USB issue... it seems to be locked up.


I'm talking about this.

After another day and a half of using the computer, it has frozen up two times again, with the "Not Optimum Mode..." message on the Samsung monitor.

Z
a c 113 B Homebuilt system
October 14, 2009 7:23:23 PM

We had someone last month with a tough case like this, and it turned out they needed a firmware update to their graphics card.

Since the CPU and RAM and MB seem less likely now, I would be looking at the GPU, PSU, and even the HD.
October 14, 2009 10:22:32 PM

Yeah, I'm starting to lean towards the GPU and PSU myself. I've updated the firmware drivers on the GPU, so I don't think it's that. I've even tried rolling back to older versions. No luck.

I just tried switching to PS/2 keyboard and mouse, and that didn't solve the problem.

I'm thinking of trying to RMA the GPU, or at least opening a case with them, telling them I've tried swapping out the motherboard, monitor, keyboard, and mouse.

Oh, and I'm also thinking I'm going to keep my original motherboard and send back this refurbished RMA replacement. It didn't solve the problem, and the system temp seems to run a little higher.

Z

P.S. Oh, and I don't think it's the HD. I had this problem with my original 74GB WD Raptor, and have since decommissioned it (using it as an external eSATA drive, not currently connected) and replaced it with a 1TB Samsung Spinpoint F3. I am still using the same 500GB secondary storage HD (WD Caviar).
October 20, 2009 3:49:58 PM

*sigh*

I went ahead and bought a second video card (same as before, BFG NVIDIA GeForce 260 OC). Figured I'd use this to test my machine before RMA'ing my current video card, and after everything was resolved, I'd have a backup GPU and a second card to use for SLI down the road.

Unfortunately, I again found my machine "stuck" (display signal lost, USB keyboard unresponsive) when left on overnight. Power was still on, as usual.

So, to summarize:
* I've updated motherboard BIOS.
* I've tried various video card drivers, including the latest.
* I've tried a different monitor, video card, hard drive, and keyboard and mouse (USB and PS2).
* I've turned off standby in Windows, and tried S3, S1, and Auto setting in BIOS.
* Temps seem good.
* Memory tests always pass.
* I've unplugged and replugged in power cables (obviously necessary when changing components).

That pretty much just leaves CPU and power supply, right? I mean, I'm discounting a loose cable, since I've checked, and have obviously had to unplug and replug them all in testing different video cards, motherboard, etc.

Does this sound more like a PSU or CPU issue now? PC Power & Cooling's warranty says I can send it in (at my expense), and they will repair or replace at their discretion (if they find a problem). I did try to open a case with Intel, but they pushed me to trying a different motherboard first. Since that didn't work, maybe they'll now accept an RMA on the CPU, or they might push me towards further analysis on the power supply.

Z
a c 113 B Homebuilt system
October 20, 2009 9:42:54 PM

Yes, afraid so.
a c 113 B Homebuilt system
October 20, 2009 9:43:44 PM

If it never locks up when in use, I would lean towards the PSU myself.
October 20, 2009 9:53:49 PM

Proximon said:
If it never locks up when in use, I would lean towards the PSU myself.


It has done so while in use. See post at 09-27-2009 at 06:31:19 PM.

Z
a c 113 B Homebuilt system
October 20, 2009 10:11:10 PM

Ah yes, sorry.

Could this be temperature related? I know your sensors are all reporting fine, but I'm talking about the PSU temp. Could the PSU be getting warm when this happens?

A combination of rising ambient temps (computer has been on several hours) and variations in the temp of the room?


October 20, 2009 10:22:09 PM

Room is pretty stable, around 22-23C. It's in an open room in my house downstairs.

This PSU (PC Power & Cooling 750W Quad Red Silencer) doesn't seem to share the temperature, as far as I can tell--I don't think there's a fan pin lead to the motherboard, like there was on my previous Enermax 525W. How would I obtain the PSU temp on this Quad Red Silencer?

Z
a c 113 B Homebuilt system
October 20, 2009 10:47:30 PM

There's no easy way. You could monitor exhaust temps with something, and if they were higher after a crash that would at least help point the way.

A PSU can take a long time to heat up, and also to cool down, and so can cause issues after hours of running, or faster with a few degrees ambient temp increase.

I've been wanting one of these toys, maybe for Christmas ;) 
http://www.amazon.com/gp/product/B0017L9Q9C/ref=pd_lpo_...
December 8, 2009 12:46:00 AM

Well, I feel like an idiot, but thought I should update this audience and see if y'all have any more ideas...

Can't remember where I was on my last update, but I have now replaced the CPU and PSU, and... I still lose my display at times! So, I've replaced pretty much everything (CPU, PSU, mobo, GPU, and monitor). I've reinstalled my OS many times, I've updated my mobo BIOS, and I still lose my display at times.

But I do have S3 standby working very solidly now. I've got it going into S3 standby after about 45 minutes of inactivity, so it is no longer having a problem if left on overnight. It only loses the display if I'm actively using the computer, seemingly after at least 90+ minutes of use, like if I'm playing a video game.

But honestly, I haven't used this computer a lot the past couple months. I have a work laptop, and a netbook I use when sitting on my sofa, and an infant son. So I only use this computer to manage my iTunes and update my podcasts every couple days, and to play games when I get a chance. I can usually only play for an hour here or there, so no problems. But if I try to play for 2 or 3 hours, it eventually hits me with the "Not Optimum Mode Recommended mode: 1650x1080 60Hz" message.

What could it be? Temperatures look good. The only hardware I haven't replaced is the RAM and the case. So I guess it could be the memory, but I've run memtest86+ on it overnight several times in the past and never had any problems.

:-/

Was just having a thought... should I try messing with my RAM timings? I've been using the motherboard defaults, which I think are a little different from the RAM specifications.

-Z
May 17, 2010 6:53:06 PM

Well, I just stumbled back on this old thread of mine, and thought I'd update it and "close" the case.

I guess I was on to something in my last post, when I wondered if the RAM timings could have anything to do with it. I didn't get around to it for a week or so, but I eventually modified the RAM timings to match those specified by OCZ, instead of the timings as defaulted in my BIOS. That seems to have done the trick!

This surprised me, as I thought I could trust the timings that defaulted in my BIOS, and my memory tests never failed. But since I had replaced everything else, I had nothing to lose...
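For anyone wanting to do the same comparison, here is a small Python sketch of checking what the BIOS picked against the values on the manufacturer's spec sheet. The numbers in the usage note are placeholders I made up, not the actual values from my board or from OCZ, and the setting names are just labels. The only hard fact used is that DDR memory transfers twice per clock, so one clock cycle at a data rate of R MT/s lasts 2000 / R nanoseconds.

```python
def cycle_time_ns(data_rate_mts):
    """Length of one memory clock cycle in nanoseconds.

    DDR transfers data twice per clock, so the clock in MHz is half
    the data rate in MT/s.
    """
    return 2000.0 / data_rate_mts


def latency_ns(cycles, data_rate_mts):
    """Convert a timing given in clock cycles (for example CAS latency) to ns."""
    return cycles * cycle_time_ns(data_rate_mts)


def diff_timings(bios, spec):
    """Return {setting: (bios_value, spec_value)} for every mismatch."""
    return {
        name: (bios.get(name), wanted)
        for name, wanted in spec.items()
        if bios.get(name) != wanted
    }
```

With hypothetical values such as `spec = {"data_rate_mts": 1600, "tCL": 8, "dram_voltage": 1.65}` and `bios = {"data_rate_mts": 1066, "tCL": 7, "dram_voltage": 1.5}`, `diff_timings(bios, spec)` lists all three settings as mismatched, which is the list to work through in the BIOS memory menu.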

I decided to keep the second PSU (a Corsair HX750W) since I really like the modular cables. I've been holding on to the Quad Red Silencer as a backup, but I guess I'll go ahead and eBay it.

I also recently went ahead and upgraded the CPU to an i7 930... wasn't a real need for this, but I found I could eBay the used 920 for what I paid for it at Microcenter a year ago and then buy the 930 for the same price. So it only cost me the eBay fees (about $12) and the Microcenter sales tax (again about $12) to upgrade from a C0 stepping 2.66 GHz 920 to a D0 stepping 2.8 GHz 930.

I will take a bit of a loss on eBaying the second GTX 260, but I don't think I'm really going to ever SLI these two GPUs.

-Z
May 24, 2010 3:53:28 AM

Okay, looks like I can't mark this case "SOLVED" because I can't nominate my own posting as best answer...?

Can someone add a last reply to this thread suggesting memory timings could be the issue, and I should set them as specified by the RAM manufacturer? And then I'll mark it as best answer.

Thanks,

-Z