nvlddmkm Crash under Load - Possible OpenGL / DirectX issue [Detailed Diagnostics]

Rocking_Star101

Distinguished
Aug 20, 2011
15
0
18,510
Hello,
This is going to be a bit of a long post, as I will be including as much detail as possible about all the diagnostics I ran & my reasoning.
I will be including a TL;DR at the end, & also try to divide my details up into categories.

PC Specs:-
Motherboard: Gigabyte P75-D3
CPU: Intel i7-3770 @ 3.90 GHz Turbo
RAM: 16GB @1333MHz
GPU: Asus GTX 760 OC2 2 GB @ 1006/1072MHz
PSU: Thermaltake 500W
OS: Windows 10 Pro 64-bit

All of the below happened in the previous 2 weeks.


Backstory
I had to travel out of country, so I packaged up my PC (as securely as I could) & stored it away. Unfortunately, my trip was extended & long story short, the PC was in storage for about 6 months.
When I returned & unpacked my PC, it was full of dust anyway. So I took it to my local PC shop & used a blower to blow all the dust away, then used a can of compressed air to make sure the CPU fan & grills were all clean.
I then plugged in my PC & started with the usual process of getting everything updated, starting with Norton 360, my games, & Windows Update.

An important note here is that I was not running my PC attached to a UPS for around 5 days. This matters because in my country the mains voltage sometimes drops during the mornings, & I was running my PC 24/7 for quite a few days (even after failing to repair my previous UPS & getting a new one).


The Problem Starts
After about 2 or 3 days since I first started my PC, I had a total crash when playing one of my updated games.
First, the sound got messed up & the screen flickered between black & gameplay for about 2s, then it went completely black, with the sound still playing back messed up. Then the computer hung.
I waited for a few seconds to see if anything would respond; nothing did (NumLock would not even toggle On/Off), so I decided to force shutdown the PC (holding the power button on the case).

After restarting, I did not touch the game, & this problem did not happen again until a few days later.
And since then, it has crashed in this way only about 4-5 times, & it was not a game-specific crash.
Also note that AFTER this problem happened I decided to upgrade my nVidia drivers from 378.92 to 388.00 (latest).

I found out that if I force-close the game fast enough, I can salvage the situation, but I have to restart the PC anyway, because afterwards any action (moving or clicking the mouse, opening a program, typing) freezes the PC for approx. 1 second.


It completely worsens
Day-before-yesterday, while playing GTA:V Online, the crash happened again.
I restarted the PC & started the game again, this time instead of being able to play normally, within 3 seconds, it crashed again.
After this, I checked Event Viewer & found 2 nvlddmkm errors (Event ID 13) whose Event Data read Graphics Exception: Const out of Bound & Graphics Exception: ESR 0x408030=0x80000003, followed by multiple errors of:-
Display driver nvlddmkm stopped responding and has successfully recovered.
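
A rough sketch of how these entries could be tallied if the Event Viewer log were exported as plain text, one message per line. The export format, the names `PATTERNS` and `tally_nvlddmkm_events`, and the number of repeated "stopped responding" lines in the sample are assumptions for illustration only; the message fragments themselves are the ones quoted above.

```python
import re
from collections import Counter

# Message fragments copied from the Event Viewer entries quoted in this post.
# The labels on the left are hypothetical names chosen for this sketch.
PATTERNS = {
    "graphics_exception": re.compile(r"Graphics Exception", re.IGNORECASE),
    "driver_recovered": re.compile(r"nvlddmkm stopped responding", re.IGNORECASE),
}

def tally_nvlddmkm_events(lines):
    """Count how many lines match each known nvlddmkm message fragment."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts

# Illustrative input: the post only says "multiple" recovery messages,
# so two of them here is an arbitrary choice.
sample = [
    "Graphics Exception: Const out of Bound",
    "Graphics Exception: ESR 0x408030=0x80000003",
    "Display driver nvlddmkm stopped responding and has successfully recovered.",
    "Display driver nvlddmkm stopped responding and has successfully recovered.",
]
counts = tally_nvlddmkm_events(sample)
print(dict(counts))
```

Comparing these counts before and after each attempted fix would show whether the Graphics Exception errors always precede the driver recovery messages.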

This led me to go online & try various fixes that I found.


Attempted Fixes

  • Reinstalling nVidia drivers, & resetting nVidia settings
  • Downgrading to nVidia driver 378.92 (which was the installed driver before I stored my PC away)
  • Uninstalling nVidia drivers & running the games on Intel Graphics to verify GPU is the problem
  • Completely removing nVidia drivers using DDU & installing them back
  • Trying to reduce the GPU Clock & Power Limit using MSI Afterburner
  • Getting Windows Update to install my GPU drivers for me (since I read that Windows's nVidia drivers are slightly different/modified)
  • (after I figured OpenGL/DirectX is the problem) Attempted reinstall of DirectX using the July 2010 Installer


Why I think OpenGL/DirectX is the problem
Last night, while trying to diagnose the problem, I noticed that MSI Kombustor was able to run its OpenCL benchmark & stress test without the GPU crashing.
After running the OpenCL test, I ran the PhysX 3 Fluids benchmark, which, naturally, led to a driver crash within 4 seconds.

This led me to install CompuBench in an attempt to test OpenCL & CUDA to verify this.
I ran all the OpenCL & CUDA benchmarks successfully.
However, CUDA's Level-set Simulation 128 benchmark did not run as it said that:-
OpenGL error: [GL_INVALID_OPERATION]
file: C:\jenkins\workspace\cb_BuildDesktopInstaller\compubench-source\compubench\cb\gl\LSSegRenderer.cpp, line: 566


This is what initially led me to believe OpenGL may be the issue.
I researched this online & attempted to reinstall DirectX by installing its July 2010 release (since apparently that overwrites all DirectX files).
After doing this I uninstalled my GPU drivers using DDU & installed version 378.92.
This did not help, as I crashed when trying to run Kombustor's GPU Core Burner v2 (Furry PQTorus), & in any game within 10 seconds (notice the increased time until crash).

However, since then, I have been unable to run CompuBench's Ocean Surface Simulation using either OpenCL or CUDA, getting the same error as above EXCEPT mentioning line: 111 of the file instead.
ALL other benchmarks run fine (as before):-

  • Catmull-Clark Subdivision Level 3 & 5 (Game Effects)
  • N-Body simulation 128k & 1024k (Game Effects)
  • Vertex Connection and Merging (HQ CGI & Rendering)
  • Subsurface Scattering (HQ CGI & Rendering)
  • TV-L1 Optical Flow (Computer Vision)

Another thing that reinforced this idea for me is that I can see my GPU actually run some programs without crashing, namely Firefox, NetLimiter 4, etc.
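
The reasoning in this section can be laid out as a small tally, sketched below in Python. The test names and outcomes are the ones reported in this post; the "kind" column is an assumption made for this sketch ("render" for tests that draw a scene on screen or whose error came from a renderer source file such as LSSegRenderer.cpp, "compute" for the rest), and `RESULTS` and `failures_by_kind` are hypothetical names.

```python
from collections import defaultdict

# (test, kind, outcome) as described in this post.
# "kind" is an assumed label, not something the tools report themselves.
RESULTS = [
    ("Kombustor OpenCL benchmark", "compute", "ok"),
    ("CompuBench OpenCL & CUDA benchmarks (the ones that ran)", "compute", "ok"),
    ("Kombustor PhysX 3 Fluids", "render", "crash"),
    ("Kombustor GPU Core Burner v2 (Furry PQTorus)", "render", "crash"),
    ("CompuBench Level-set Simulation 128 (CUDA)", "render", "gl_error"),
    ("CompuBench Ocean Surface Simulation", "render", "gl_error"),
    ("Games", "render", "crash"),
]

def failures_by_kind(results):
    """Return {kind: (failed, total)} for the given result rows."""
    totals = defaultdict(lambda: [0, 0])
    for _name, kind, outcome in results:
        totals[kind][1] += 1
        if outcome != "ok":
            totals[kind][0] += 1
    return {kind: tuple(pair) for kind, pair in totals.items()}

summary = failures_by_kind(RESULTS)
print(summary)
```

If every failure lands in the "render" group and none in "compute", that fits the OpenGL/DirectX theory, although a handful of runs like this cannot rule out a hardware fault that only shows up under rendering load.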


TL;DR
Any heavy load on my GPU, including any game (new or old) or benchmark rendering, crashes my GPU drivers.
I initially thought my GPU was dead, but it seems my GPU can run OpenCL & CUDA renders/calculations just fine. The problem only happens when I try to run anything using OpenGL.
The crash happens within 5 seconds of starting to render / loading into the game world.
The crash starts with extreme stutter for 2 seconds, followed by extreme sound distortion & then a black screen, sometimes followed by the game screen flashing by after 3-4 seconds before returning to a black screen.
I am able to circumvent this problem if I force-close the render/game using Task Manager in the first stages of stutter, however I then HAVE to restart the PC, because every 3-5s or on performing any action the PC will freeze for 1 second.
Event Viewer first shows 2 nvlddmkm Errors with Event Data of Graphics Exception: Const out of Bound, followed by a couple of Warnings that the nvlddmkm driver stopped responding & was recovered.
For the fixes I have tried already, see the list in Attempted Fixes heading above.


Conclusion
Any help fixing this problem is appreciated.

The only thing I can think of trying now is undervolting my GPU, but it seems I cannot do that without messing with my GPU's BIOS, which I do not want to do.


Attachments
Here is a Google Drive folder that shows the Event Viewer errors & 2 of CompuBench's OpenCL render results, with Ocean Surface Simulation being successful, & it failing (after I tried reinstalling DirectX)
 

Rocking_Star101

UPDATE:-
Just upgraded to the new latest nVidia drivers 388.13 while in Windows Safe Mode.

It does seem to be much better now.
I was able to run all the MSI Kombustor Renders.
Tried a stress test running each of them for 400 seconds, & did not have any crashes.
However, when I went to run the PhysX test (last on my list), I got a crash after 32 seconds. This was weird because I had already run the PhysX test as a benchmark (i.e. it runs for about 60 seconds) before doing the stress testing.

Also weirdly, when I first ran Kombustor after updating the drivers, the application hung, but it did not cause a crash.
Luckily, as always, I had Task Manager at hand to kill Kombustor the second it started to hang, but I did wait past the point where it previously would have crashed completely & required a hard restart (using the power button).


HOWEVER, the issue still seems to be there.
I ran CompuBench tests in CUDA API & got the same OpenGL error during the Level-set Simulation 128 Benchmark.
After that, I went to run MSI Kombustor's GPU Core Burner (Furry Donut) & had a crash.

It does seem that whatever the issue is, it has reverted to its previous hit-or-miss form, sometimes messing up & causing a crash, other times running fine.

I just tried & successfully ran Euro Truck Simulator 2 for about 1 1/2 hrs now.
Towards the end I noticed a bit of jumping in frames, but I think I can safely discard that as due to the location I had freshly loaded into.
I'm going to go try other more demanding games now, & will report back the results after a few days of testing.
Still need to figure out why this happens though.

EDIT:-
Note that I downloaded the latest drivers from nvidia.com as opposed to geforce.com, which is where I usually download them.