Nvidia Drivers Crash - Bad GPU?

mhovingh

Honorable
Jun 21, 2013
9
0
10,510
Nothing OC'ed, everything stock hardware (including cooling)
EVGA GTX 550Ti
i5-2500k
MSI P67S-C43

It started almost from day 1 with this PC. I was getting black screens and "Nvidia display driver stopped working" messages. At the time, it would only happen once a day or so, and I assumed it was just a driver bug that Nvidia would patch. Over half a year later, it happens more frequently (depending on the load). The crashes are always the Nvidia driver only. I am not getting any sudden system shutdowns, restarts, or blue screens.

I have re-seated the card and CPU multiple times. I have never had another problem with this machine. It will crash faster without fan control software running (the fan just stays at 30% without software control). With fan control software running (PrecisionX), it does better, but the driver still crashes.

I read some info that says the crash is from the GPU reaching unsafe temps. This doesn't seem to be the case, as it crashes at varying temps, not at one constant temp. Crashes have happened anywhere from 66C up to 81C (using CPUID HWM to catch max temps reached). Almost all of them happen in the 72-77C range, and crashes outside that range are rare.
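
One way to check whether the crashes line up with temperature, load, or clock changes is to log readings continuously and look at the last samples before each crash. The following is a minimal Python sketch, not a tested tool. It assumes `nvidia-smi` is installed with the driver and on the PATH, and that the driver reports these fields for this card (on older GeForce cards some of them may come back as N/A). The file name `gpu_log.csv` and the one-second interval are arbitrary choices.

```python
import csv
import os
import subprocess
import time

# Fields requested from nvidia-smi. Assumption: the installed driver
# exposes all of them for this card; unsupported ones may read "N/A".
QUERY_FIELDS = ["timestamp", "temperature.gpu", "utilization.gpu", "clocks.gr"]


def build_command(fields=QUERY_FIELDS):
    """Build the nvidia-smi command line for one CSV sample."""
    return [
        "nvidia-smi",
        "--query-gpu=" + ",".join(fields),
        "--format=csv,noheader,nounits",
    ]


def parse_sample(line, fields=QUERY_FIELDS):
    """Turn one CSV line of nvidia-smi output into a dict keyed by field name."""
    values = [v.strip() for v in line.strip().split(",")]
    if len(values) != len(fields):
        raise ValueError("unexpected nvidia-smi output: %r" % line)
    return dict(zip(fields, values))


def log_samples(path="gpu_log.csv", interval_s=1.0):
    """Append one sample per interval until interrupted (Ctrl+C) or a query fails."""
    new_file = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=QUERY_FIELDS)
        if new_file:
            writer.writeheader()
        while True:
            result = subprocess.run(
                build_command(), capture_output=True, text=True, check=True
            )
            writer.writerow(parse_sample(result.stdout.splitlines()[0]))
            f.flush()  # keep the last samples on disk if the driver resets
            time.sleep(interval_s)


if __name__ == "__main__":
    log_samples()
```

If the last few rows before a crash show no temperature spike and no clock drop, that would support the idea that the crash is not a thermal cutoff.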

I read some other info that says it is the card reaching 100% load, which triggers a driver crash to protect itself, and that it was designed that way. This is a foreign concept to me. In the past, cards under load have always dropped frame rates when they reached their limits, not run nice and smooth and then suddenly crashed the driver. The logic doesn't make sense to me, that a card would be designed to stop at 100% load to protect itself, rather than stopping at an unsafe temp. Temps are the danger to cards, AFAIK, and should be the trigger. If the card truly is stopping at 100% load, it comes off as one lazy card, with a design that I do not understand. Without a decent explanation for that design, it makes me feel quite unhappy. :(

Card maintains steady voltage readings, running between 0.950 at desktop and 1.062 on the top end.

I did mess around with different driver versions and no matter what versions I use (clean installs, not just installing over older/newer), symptoms stay the same. Simply put, bad drivers shouldn't be a concern (unless they are all bad). I just really want to know if the card is designed this way, if I have a bad card, or if something else is going on. I don't want to run out and spend money on something without a fairly good idea first of what my problem is.

Well that is my story. Been frustrated by this for quite a while now, but the impact was a small enough disruption that I haven't bothered until now to finally try and figure it out once and for all. Appreciate some enlightenment on what I have going on, or some direction on things I can test to better nail down the specific problem. Thanks for taking a look.
 

mhovingh


I have tried over a year of drivers with no changes in symptoms. Currently running 320.18 as performance does not seem to change at all with any older versions.

I have no idea how to get the desktop to push the card to higher temps (it runs in the low-to-mid 40s C at desktop). Because of this, I don't know if it could crash at desktop, but it never has. It only crashes when running games.

PSU is a CORSAIR GS600.
 

tinmann

Distinguished
Apr 28, 2009
1,121
0
19,660
If you are running the 320.18 driver, that might be the problem. Many people have been reporting issues with that driver and rolling back to driver version 314.22.
I forgot to ask: was this during gaming, and what settings were you using at the time? Yes, Kepler has a thermal threshold, but usually the card downclocks when it reaches that point.
 

mhovingh



I have tried driver packs dating back quite a while. This is not a new build and 320.18 didn't exist when I got this card/pc and this problem started (nor did 314.22 for that matter).

It does happen during gaming. Settings vary based on individual games so I am not sure how you want me to answer that one. Best I can do without more direction on what you want, is to say that I typically put settings where they run smooth consistently. If I am noticing performance problems, I drop the settings until I find something that looks smooth (not a specific FPS, just what my eye picks up). I use v sync so I never see over 60 FPS.

The info about down-clocking near/at the thermal threshold is how I always expected GPUs to work. A friend recently replaced his card because it was overheating and, instead of crashing the Nvidia drivers or his whole system, it would cause a huge performance hit (FPS). If he kept it running in that state, it would eventually cause the rest of his system to overheat and he would see a system shutdown. That is the type of behavior I am used to from a GPU that has overheating problems.
 
You might want to try updating DirectX as well. In case there's a problem there.

Any BIOS updates available for your motherboard?

To narrow things down a bit you could try to isolate the problem, potentially.

Run MemTest to make sure RAM is working fine.

Run Prime 95 to make sure the CPU is stable.

Run FurMark to stress the GPU to see if it produces a crash as well.

Your Power Supply should be plenty for your setup. Just wanted to make sure you didn't have to worry about crappy power causing issues when the system demanded juice.
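
To line up the crashes with what was running at the time, the driver resets can also be pulled out of the Windows System event log. This is a rough Python sketch under a few assumptions: that Windows logs the "display driver stopped responding and has recovered" reset as Event ID 4101 (which is what Windows 7 normally does, as far as I know), that `wevtutil` is available (it ships with Windows Vista and later), and that its text output carries a `Date:` line per event. The function names are made up for illustration.

```python
import re
import subprocess

# Assumption: the driver reset (TDR) is recorded in the System log
# under this event ID.
TDR_EVENT_ID = 4101


def build_query(max_events=50):
    """Build a wevtutil command that lists the newest matching events as text."""
    return [
        "wevtutil", "qe", "System",
        "/q:*[System[(EventID=%d)]]" % TDR_EVENT_ID,
        "/f:text",             # plain-text output
        "/rd:true",            # newest events first
        "/c:%d" % max_events,  # cap the number of events returned
    ]


def extract_dates(text):
    """Pull the value of every 'Date:' line out of wevtutil text output."""
    return re.findall(r"^[ \t]*Date:[ \t]*(\S+)", text, flags=re.MULTILINE)


if __name__ == "__main__":
    # Windows only; may need an elevated prompt depending on log permissions.
    result = subprocess.run(build_query(), capture_output=True, text=True, check=True)
    for stamp in extract_dates(result.stdout):
        print(stamp)
```

Comparing these timestamps against the FurMark and Prime 95 runs, or against gaming sessions, should show whether the resets only happen under one kind of load.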
 

mhovingh



DirectX is updated.

BIOS is updated.

I haven't run MemTest yet, but I used the Win7 built-in memory diagnostic and everything checked out.

Prime95 blend torture test, all workers showed 0 Warnings and 0 Errors. CPU temp went to 70C for the package.

No Nvidia Driver problems with FurMark. SCORE:949 points (15 FPS, 60000 ms), Max GPU Temp: 70°C, Resolution: 1920x1080 (W) - AA:0 samples, FPS: min:16, max:17, avg:15 - OPTIONS: DynBkg, GeForce GTX 550 Ti/PCIe/SSE2 (10DE-1244), 9.18.13.2018 (5-12-2013) - GL:nvoglv64, GPU core: 972 MHz, memory: 2052 MHz, Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz, 3292 MHz, 8164 MB, Windows 7 64-bit build 7601 [Service Pack 1]. Will run a longer test tomorrow since that shorter test only got the GPU to 70C.
 
You might try MemTest. I don't use any of the Windows-based utilities (Disk Defrag, for example, is next to worthless). Then run Prime 95 with "Small FFTs", as that stresses the CPU itself (the Blend test also exercises memory). I'm not really sure what's causing your crash problems.

Short of reinstalling Windows from scratch, I'm not sure what else to have you try, since everything seems fine outside of the crashes.
 

mhovingh



As far as re-installing Windows goes, I have already re-installed twice since building the system (not specifically for the GPU, just a habit I have to get rid of the clutter I accumulate on my PC), and it never changes the behavior of the GPU/Nvidia drivers.
 

mhovingh



I can't find anything in the drivers or software that is causing the issues, so my best guess is that the card was bad when I got it. Contacting EVGA is about all I can think of doing at this point. I am going to check with Best Buy first, as I bought the card there and it is a 3 minute drive, but if I don't hear what I like from them I will see what EVGA says.

Thanks for your help.
 

Sniter

Honorable
Sep 9, 2013
1
0
10,510
How did the problem end up? Any solution?

I also keep getting Nvidia driver crashes. The only difference is that I get them on the desktop, not while gaming. I have lost count of how many times my driver has crashed while I was using my web browsers.
 

tushars

Distinguished
Jun 20, 2010
21
0
18,510

Dear Sniter, the problem is the Nvidia driver, as far as I can confirm. I have been facing the same issue since driver version 320. Right now I am using driver version 310.70, which is working fine with no system crashes so far. To check whether the GPU was at fault, I also swapped in a new low-end GTX card and got the same crashes, so the problem is the Nvidia driver, not a hardware fault.