Win7Pro/Sp1 machine shuts down, no errors, possible overheat?

ThreeLittlePiggies

Reputable
Oct 25, 2014
4
0
4,510
I have had this problem for about a year and have mostly just lived with it. I talked to techs from several of the hardware manufacturers whose parts are in this (self-built) machine, and though they seemed really knowledgeable, each eventually reached the end of their "support boundaries".

I'm one of "those" people who used to know how to work a computer, but things have changed a lot. I got as far as building this system, and I'm pretty sure the problem is hardware based. Maybe I have a setting wrong somewhere, but I think it's far more likely to be a hardware problem.

The system will occasionally simply shut down. Sometimes it happens once every couple of days, sometimes 2-3 times in a day. The screen goes black, and if any audio was playing, it loops over and over for a few seconds. The fans keep spinning and the power light stays on while it's happening. I think it eventually stops, but it's been quite a while since I've had the patience to wait; I usually just hold the power button in to shut it off, then restart.

My first thought was overheating. I installed a few different temperature utilities: CPU-Z and a few others. But they haven't been helpful. The utilities I've used apparently don't write their logs to disk as readings come in; the data is buffered (I don't know the right terminology here), so when the computer decides to act up, that info is zapped and there is no log to read when I reboot.

I used to get logs in the Event Viewer each time, but now all I see is the error saying that the computer rebooted without cleanly shutting down first.

So I went through tons of browsing history, and I'm pretty sure this was the error that I used to see:

Bug Check 0x116: VIDEO_TDR_ERROR

I think there was another error that involved nvlddmkm, which is NVIDIA, but I couldn't find it in the logs. Maybe it's possible to search the Event Viewer, but I haven't found a way to do that yet.
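One possible way to search the log without clicking through Event Viewer is the built-in wevtutil command. The sketch below is only an illustration: it assumes the events were recorded in the System log under a provider named nvlddmkm, and the helper names build_wevtutil_query and query_events are made up for this example.

```python
import subprocess
import sys


def build_wevtutil_query(provider, max_events=20):
    """Build the argument list for a wevtutil query (hypothetical helper).

    Assumes the events of interest live in the System log under the
    given provider name.
    """
    # XPath filter that matches events by provider name.
    xpath = f"*[System[Provider[@Name='{provider}']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",       # the filter built above
        "/f:text",           # plain-text output instead of XML
        f"/c:{max_events}",  # limit the number of events returned
        "/rd:true",          # newest events first
    ]


def query_events(provider, max_events=20):
    """Run the query and return its text output (Windows only)."""
    if not sys.platform.startswith("win"):
        raise RuntimeError("wevtutil is only available on Windows")
    result = subprocess.run(
        build_wevtutil_query(provider, max_events),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling print(query_events("nvlddmkm")) from a Python prompt on the affected machine would list the most recent matching events, if any exist under that provider name.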

I have updated to the newest driver. There have been several new drivers over the past year and I always update to the latest, though I don't know whether that's the best thing either, since the updates seem to be mostly related to gaming profiles and I don't play games.

It used to happen whenever I went fullscreen while viewing a Flash movie (YouTube). But as time went on, I found that it would also happen on other movie sites. Maybe that's all still Flash, I'm not sure. But it has also happened when viewing non-Flash HD movies in PotPlayer. I used VLC for years but switched to see if it made a difference; it doesn't.

Can anyone recommend an approach? I'm hoping to find something other than the generic procedures. I've spent countless hours doing all sorts of things recommended by support techs, and I was hoping that maybe someone here is familiar with this specific problem and knows a reasonable method for nailing down the exact cause.

I had read that the video card (GTX 560 Ti) might be overheating, and people have suggested taking off the heat sinks and applying higher quality thermal compound, which I did try. Same for the CPU, though I used Arctic Silver when I bought it and was pretty careful.

It might be helpful if someone knows of a program that logs device temps and actually writes to disk in real time, as opposed to buffering, so that if the computer overheats and shuts down, the log has already been written (not just held in memory).
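To illustrate what "writes to disk in real time" would mean in practice, here is a minimal sketch in Python that flushes and syncs after every sample, so each line should already be on disk before the next reading is taken. It is not a real monitoring tool: read_temps is a hypothetical callable that would have to be supplied by whatever sensor library or utility is available, and the names append_durable and log_temps are made up for this example.

```python
import os
import time
from datetime import datetime


def append_durable(path, line):
    """Append one line and force it to disk before returning."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
        f.flush()             # push Python's own buffer to the OS
        os.fsync(f.fileno())  # ask the OS to commit the data to the disk


def log_temps(path, read_temps, interval_s=2.0, samples=None):
    """Poll read_temps() and durably log each sample.

    read_temps is a hypothetical callable returning a dict such as
    {"cpu": 55.5, "gpu": 71.0} (degrees Celsius). With samples=None the
    loop runs until interrupted.
    """
    count = 0
    while samples is None or count < samples:
        temps = read_temps()
        stamp = datetime.now().isoformat(timespec="seconds")
        fields = ",".join(f"{name}={value:.1f}"
                          for name, value in sorted(temps.items()))
        append_durable(path, f"{stamp},{fields}")
        count += 1
        if samples is None or count < samples:
            time.sleep(interval_s)
    return count
```

The cost of syncing on every sample is extra disk activity, but at one line every couple of seconds that should be negligible, and the last line in the file would show the readings from just before the shutdown.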

Needless to say (I think), I have looked at temps in the BIOS, but that doesn't seem to be of much use since there is no real load on the system at that point.

Thank you!

Rob
 
Solution
It's hard to troubleshoot this kind of error without swapping parts. If the power supply is still under warranty, I would RMA it. It's the first component to suspect in random shutdowns.
It would also be helpful to list the components.

ThreeLittlePiggies

That was one of the things I wondered about too. I talked to their tech support and they couldn't find a problem, but they also suggested an RMA, and it would be under warranty. Except... the fan went out a few weeks after I got it. I was really busy editing videos and didn't have money for another supply, so I replaced the fan myself. You know, that is just now sinking in: maybe I should have just RMA'd it, since the fan going out so soon is a pretty big red flag. I can't believe how much I've done while completely overlooking that. When it starts to look complicated, that's usually when it's going to be something simple. I'll pick up another supply when I can and post some results. Thanks for the suggestion!
 

ThreeLittlePiggies

Hardware-wise, I'm using:

Motherboard- ASRock Z68 Extreme3 Gen3. This is my first non-ASUS board since the early nineties.
Memory- 16 GB RAM
Video- GeForce GTX 560 Ti
CPU- Intel i7-2700K
Power supply- COOLER MASTER eXtreme Power Plus RS700-PCAAE3

I don't remember the memory manufacturer or model. It's not generic, though I know that info is next to useless.
I am using 3 SATA drives plus a SATA DVD drive.
Nothing is overclocked.
Windows 7 SP1

Is there a software utility that will tell you the manufacturer and model of the memory modules in each slot? I think last time I had to pull the memory and use some numbers on it to cross-reference the manufacturer and model.

Everything is stock (no hardware mods, overclocking, OS tweaks, etc.).

Thanks

 
I wouldn't buy replacement components unless you're convinced the existing ones are faulty.
Try borrowing a power supply from a friend for testing.

Another thing to try is running the board outside the case for a while. That way you would rule out a short.

For memory, you can use CPU-Z (http://www.cpuid.com/medias/images/en/softwares-cpuz-05.jpg)
 

ThreeLittlePiggies

There's little point in playing around, and there's just no time. I'd just as soon replace it with a new one; I can always use the old one for a different project. You mean running the motherboard outside the case? That seems like a lot of work and time. Maybe. I'll see what happens after replacing the power supply first. I mentioned that I already used CPU-Z, and it's of no use... Thanks