Is My GPU Dying?

Krelm

Reputable
Oct 12, 2014
7
0
4,510
So, for the past two weeks or so, I've been getting BSODs with the nvlddmkm.sys error when gaming-- sometimes I'll be able to play for a couple of hours, and sometimes I'll only be able to play for five minutes before getting the bluescreen and crashing. I've tried everything under the sun to fix it software-wise (primarily done by googling the problem and applying those fixes) and nothing works.

Fixes I have tried:
- Cleaning out dust
- changing my nvlddmkm.sys file to nvlddmkm.sys.old and changing it out with a newer one
- doing clean installs of new/older drivers
- doing a clean install of Windows 7

After none of this worked, I took it to Best Buy to have them run some hardware diagnostics (a $70 mistake, I'll admit). They said everything was checking out fine, but I booted it up and tried to play Guild Wars 2 on it, and it crashed within ten minutes.

So, with all that said-- is my GPU just dying?

Specs:
MSI GT70
OS: Windows 7 64-bit
CPU: Intel Core i7-3610QM
GPU: NVIDIA GeForce GTX 670M
RAM: 12GB DDR3

Thanks.
 
There are several tests that stress different parts of the system:

- memtest86+ overnight to ensure the memory is working right.
- Prime95 will stress the crap out of the CPU (Guild Wars 2 is actually pretty hard on the CPU).
- FurMark can stress the video card, but because modern cards detect power consumption and specific programs and adapt to them, it is not a perfect test, and it can make some cards VERY hot.

Running Folding@home on a GPU stresses it about the same as a demanding game, and it will report an error if the card is not stable (it keeps retrying even after it reports errors; it just starts the work over). The program is not made to test hardware; it is made to help simulate protein folding. You should check out the website. If you use it to stress the card, turn OFF CPU folding, because the video card is what you are trying to stress.



Something else to try: downclock the card's core by about 52 MHz, or by the next available step (something like MSI Afterburner can do this), and see if it becomes stable. If it does, the card may be clocking itself too high. Boost and Turbo features on some cards can be too aggressive.
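While any of these tests run, it helps to log the GPU temperature and clocks to disk so the last few readings survive a crash. Here is a minimal sketch, assuming `nvidia-smi` is installed with the driver and that the card exposes these query fields; older mobile GeForce parts may answer "[Not Supported]" for some of them, which the parser turns into blanks. The file name `gpu_log.csv` and the function names are just placeholders.

```python
import csv
import os
import subprocess
import time
from datetime import datetime

# Fields requested from nvidia-smi: core temperature, graphics and memory clocks.
QUERY_FIELDS = ["temperature.gpu", "clocks.gr", "clocks.mem"]


def parse_sample(line):
    """Turn one CSV line from nvidia-smi into a dict; unreadable fields become None."""
    values = [v.strip() for v in line.split(",")]
    sample = {}
    for field, value in zip(QUERY_FIELDS, values):
        try:
            sample[field] = float(value)
        except ValueError:
            # e.g. "[Not Supported]" or "N/A" on cards that hide the sensor
            sample[field] = None
    return sample


def read_gpu():
    """Ask nvidia-smi for one reading of the first GPU."""
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=" + ",".join(QUERY_FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sample(out.splitlines()[0])


def log_forever(path="gpu_log.csv", interval_s=1.0):
    """Append a timestamped reading every interval_s seconds until interrupted."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        while True:
            s = read_gpu()
            stamp = datetime.now().isoformat(timespec="seconds")
            writer.writerow([stamp] + [s[k] for k in QUERY_FIELDS])
            # Force the row to disk so it is still there after a hard shutdown.
            f.flush()
            os.fsync(f.fileno())
            time.sleep(interval_s)
```

Call `log_forever()` in a separate window before starting the game or stress test, then look at the tail of the CSV after the next crash.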
 
Solution

Krelm

It's about a two-year-old GPU, yeah.

Re: Nukemaster's post

I'll try these tests and post their results. As for downclocking, I tried that also, but the Afterburner client doesn't work with laptops. The one time I tried to use it, I couldn't get the settings to save, then when I tried to play Shadow of Mordor with it running in the background my comp just up and shut off on me.

Re: overheating

I suspected this, too, which is why I took it to Best Buy, but according to them all my temps are looking fine.
 

Leadbelly78

Reputable
Aug 27, 2014
724
0
5,010


Have you tried this? http://www.ehow.com/how_7182639_repair-nvlddmkm_sys-error.html


 

Krelm



Affirmative. No dice.
 

Krelm



I thought the same thing, which is what prompted me to take it to Best Buy. I assumed they'd run the diags, find it overheating, then ship it off and replace the thermal paste, but they said the temps were checking out fine. It's shut off on me about twice, but one of those times my power cord came undone and my battery's dead.

I just ran Prime95 on it, ran my CPU at 100% for a few minutes. My CPU and all four core temps hit about 85C (according to SpeedFan) before I closed it down, which is definitely hotter than it ought to be, but I assumed that was just the program. No bluescreen, though.
 

Leadbelly78



Sorry, that's really too bad. :'(

 

Krelm

Ran Folding@Home for a few minutes and everything worked fine.

Ran Furmark for about two-three minutes and my computer shut down. The GPU temp was at 60C, so I'm assuming my CPU got too hot (I forgot to look at SpeedFan).

Could that be the problem? Would my CPU getting too hot cause the nvlddmkm error?

I was thinking about replacing the thermal paste anyway, but I didn't want to spend money on that if it turned out that my GPU was dying. From what I've seen in these tests, though, that doesn't seem to be the case.
 
Interesting that you mention the system powering right off as well.

Could be a power issue (the power supply or the VRM area of the board, which is hard to diagnose). 60C for a video card is nothing at all.

I recommend checking the other temperatures since laptops may share cooling among parts sometimes.

I have seen dust cause exactly this kind of shutdown, but it was over 100C.
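To make that reasoning concrete, here is a rough sketch that looks at the last few logged readings before a shutdown and flags any sensor that was close to a shutdown limit. The 100C limit and 10C margin are assumptions for illustration, not specifications for this laptop, and the sensor names are whatever your logging tool reports.

```python
def peak_before_shutdown(samples, window=10):
    """Return the highest reading per sensor over the last `window` samples.

    `samples` is a list of dicts such as {"gpu": 60, "cpu": 85}, oldest first.
    Missing or None readings are skipped.
    """
    tail = samples[-window:]
    sensors = {name for sample in tail for name in sample}
    peaks = {}
    for name in sensors:
        readings = [s[name] for s in tail if s.get(name) is not None]
        peaks[name] = max(readings, default=None)
    return peaks


def classify(peaks, limit_c=100.0, margin_c=10.0):
    """Flag sensors within margin_c of limit_c (both are assumed values)."""
    hot = sorted(
        name for name, temp in peaks.items()
        if temp is not None and temp >= limit_c - margin_c
    )
    if hot:
        return "thermal suspect: " + ", ".join(hot)
    return "no sensor near limit; look at power delivery"


# Readings like the ones reported in this thread: GPU around 60C, CPU around 85C.
log = [{"gpu": 58, "cpu": 80}, {"gpu": 60, "cpu": 85}]
print(classify(peak_before_shutdown(log)))
```

If nothing was near the limit when the machine cut out, heat from that sensor is an unlikely cause and the adapter, battery, or board power is the next thing to check.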
 

Krelm

Here's a thought: the last couple of times it shut down on me, when I turned it back on it wasn't picking up my adapter for a couple of minutes, and was just running off whatever little charge was left in my fried battery. Could my adapter be going out on me, and whenever I'm gaming my card isn't getting enough power?

Edit to say I just saw your post, Nuke. I'll see if there's a way to test my adapter somewhere.
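One software-side check that might help separate a bluescreen from a hard power cut, as a sketch only: Windows records unclean shutdowns as Kernel-Power event 41 in the System log, and the event carries a `BugcheckCode` field that is 0 when no stop error was recorded. The script below assumes the built-in `wevtutil` tool is available and that the events are logged on this machine; the function names are made up for the example.

```python
import re
import subprocess

# XPath filter for Kernel-Power event 41 (system restarted without a clean shutdown).
QUERY = "*[System[Provider[@Name='Microsoft-Windows-Kernel-Power'] and (EventID=41)]]"


def build_command(count=5):
    """Build the wevtutil call for the newest `count` matching events, as XML."""
    return ["wevtutil", "qe", "System", "/q:" + QUERY,
            "/c:" + str(count), "/rd:true", "/f:xml"]


def bugcheck_codes(xml_text):
    """Pull the decimal BugcheckCode value out of each event in the XML text."""
    pattern = r"<Data Name=['\"]BugcheckCode['\"]>(\d+)</Data>"
    return [int(code) for code in re.findall(pattern, xml_text)]


def describe(code):
    """Summarize one BugcheckCode value."""
    if code == 0:
        return "no stop error recorded (abrupt power loss or hard reset)"
    return "stop error 0x%X" % code


def recent_unclean_shutdowns(count=5):
    """Run wevtutil and describe the most recent unclean shutdowns, newest first."""
    out = subprocess.run(build_command(count),
                         capture_output=True, text=True, check=True).stdout
    return [describe(code) for code in bugcheck_codes(out)]
```

Running `recent_unclean_shutdowns()` from an administrator prompt after the next shutdown would show whether Windows saw a stop error or simply lost power, which points toward or away from the adapter.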
 

Krelm

After reading this thread, doing these tests, and looking up some other threads about dying GPUs (which include symptoms far worse than a single error), I've determined that my GPU is not, in fact, dying.

So, to keep this from going any further off topic, I'll say it was answered. I'll probably open up another question about this error elsewhere.

Thanks for all the help.