BSoD crashes | ntoskrnl.exe & ntkrnlmp.exe

Lithia

Commendable
Jan 7, 2017
6
0
1,510
I call upon you, the people. I need technical insight.

For a little over a week now, I think, I've been plagued by crashes.
(Minidump: https://drive.google.com/open?id=1sfXNZL4DzZ9JJ5CeAuCE0G-3Tazf8iri )

It happens exclusively during my gaming sessions, and the time to crash varies from 10 minutes to over an hour. When I do crash, my screen (read: game) freezes, then goes black, and my PC automatically reboots. I don't SEE the BSoD, but I know it's there: a DMP gets created, and BlueScreenView is showing some telltale stuff.
At first I thought one particular game was the culprit, but now it also happens with another title, meaning it's something in my system. I'd like it fixed, as I've crashed about 8 times at this point and it's really starting to get on my nerves, but I'm not quite sure what to do.

Nothing specific comes to mind, as to what could have caused this sudden behaviour.

- The system I'm running (with an overclocked CPU) has been stable for about half a year now, so I doubt it's got anything to do with that; no problems of any kind prior to this "week".
The past 2 days I've been monitoring my CPU and GPU temps as best I could, and according to NZXT CAM (...I know, I know) I'm nowhere near danger levels.
(CAM regulates my Kraken X62 fans and they're not running at max, and never have so far, so that should count for something...?)

- I ran a memory diagnostic and Windows told me the RAM is fine.

I HAD more dumps, but they've mysteriously disappeared (honestly, how?), so I've got just this one.
This one is a KMODE_EXCEPTION_NOT_HANDLED error, but I've been getting one other alongside it: IRQL_NOT_LESS_OR_EQUAL.
However, BOTH consistently blame the same "drivers" (ntoskrnl.exe and ntkrnlmp.exe).

If anyone could take a look at the dump file and shed some light on my woes, that'd be great.
(I'd love to get back to gaming without interruption)


- Lith.


EDIT:

Perhaps system info is handy/necessary:

OS: Windows 10 Pro (x64)
GPU: MSI GTX 1080
CPU: Intel Core i7-8700K
Mobo: MSI Z370 Gaming Pro Carbon AC (up-to-date BIOS)
RAM: 2x8 GB G.Skill Trident Z RGB

...That should do?
 
Your CPU is running at 4.9 GHz.

Remove the overclocking software:
NTIOLib_X64.sys
and
C:\Program Files (x86)\MSI Afterburner\RTCore64.sys Fri Sep 30 05:03:17 2016
C:\Program Files (x86)\Intel\Intel(R) Extreme Tuning Utility\Drivers\IocDriver\64bit\iocbios2.sys Fri Sep 15 03:22:21 2017

This is a very old driver (Vista era); it is suspect just because of its age:
D:\Program Files (x86)\NZXT\CAM 3.7\OpenHardwareMonitorLib.sys Sat Jul 26 06:29:37 2008
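
As a rough illustration of the age check above, here is a minimal Python sketch that parses the three timestamped driver lines from this post and flags anything built before a cutoff year. The helper names and the cutoff (2015) are assumptions made for illustration; they are not part of any tool used in this thread.

```python
from datetime import datetime

# Driver entries copied from the listing in this post: path, then build timestamp.
DRIVER_LINES = [
    r"C:\Program Files (x86)\MSI Afterburner\RTCore64.sys Fri Sep 30 05:03:17 2016",
    r"C:\Program Files (x86)\Intel\Intel(R) Extreme Tuning Utility\Drivers\IocDriver\64bit\iocbios2.sys Fri Sep 15 03:22:21 2017",
    r"D:\Program Files (x86)\NZXT\CAM 3.7\OpenHardwareMonitorLib.sys Sat Jul 26 06:29:37 2008",
]


def parse_driver_line(line):
    """Split 'path Day Mon DD HH:MM:SS YYYY' into (path, datetime)."""
    # The timestamp is always the last five space-separated fields,
    # so splitting from the right keeps spaces inside the path intact.
    parts = line.rsplit(" ", 5)
    stamp = datetime.strptime(" ".join(parts[1:]), "%a %b %d %H:%M:%S %Y")
    return parts[0], stamp


def old_drivers(lines, cutoff_year):
    """Return (file name, build year) for drivers built before cutoff_year, oldest first."""
    flagged = []
    for line in lines:
        path, stamp = parse_driver_line(line)
        if stamp.year < cutoff_year:
            flagged.append((path.rsplit("\\", 1)[-1], stamp.year))
    return sorted(flagged, key=lambda item: item[1])


# Hypothetical cutoff: anything built before 2015 is treated as suspect.
print(old_drivers(DRIVER_LINES, 2015))
```

With a 2015 cutoff only OpenHardwareMonitorLib.sys is flagged; raising the cutoff to 2017 would also flag RTCore64.sys. An old build date is only a hint, not proof that a driver is at fault.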

Go to the Nvidia website and do a full reinstall of the graphics driver and its audio support driver.
You have a mismatched set of Nvidia drivers: one from Microsoft Windows Update on Sept 5 and the other from January.

Processor Version Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
Processor Voltage 8ch - 1.2V
External Clock 100MHz
Max Speed 8300MHz
Current Speed 4900MHz
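
The figures above can be turned into a CPU ratio with simple arithmetic. The sketch below is a hedged illustration only: the function name is made up, and the 3700 MHz "rated" value is taken from the processor name string (the base clock, not the turbo clock), so the percentage is relative to base speed.

```python
def multiplier(speed_mhz, external_clock_mhz):
    """CPU ratio = core speed divided by the external (base) clock."""
    return speed_mhz / external_clock_mhz


EXTERNAL_CLOCK_MHZ = 100  # "External Clock 100MHz"
CURRENT_SPEED_MHZ = 4900  # "Current Speed 4900MHz"
RATED_SPEED_MHZ = 3700    # from the name string "@ 3.70GHz" (base clock)

current_ratio = multiplier(CURRENT_SPEED_MHZ, EXTERNAL_CLOCK_MHZ)
rated_ratio = multiplier(RATED_SPEED_MHZ, EXTERNAL_CLOCK_MHZ)

# How far the current speed sits above the rated base speed, in percent.
overclock_pct = (CURRENT_SPEED_MHZ / RATED_SPEED_MHZ - 1) * 100

print(f"current ratio: {current_ratio:.0f}x, rated ratio: {rated_ratio:.0f}x")
print(f"running {overclock_pct:.1f}% above the rated base clock")
```

This gives a 49x ratio against a 37x rated base ratio, which is why the dump data shows the CPU is overclocked even though no overclocking utility was used to set it.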
 

Lithia



First off: thanks for replying!

Second:

I'm not using overclocking software (and if I am, I've never used it to overclock). All the overclocking I've done, I've done through the BIOS, and it has been working fine for more than 6 months at this point.
Technically speaking, Afterburner and NZXT CAM are "overclocking software" because they *can* overclock stuff, but I don't use them to that end. Are you saying I should still remove those as well? (I kind of depend on Afterburner for my automated GPU fan curve...)

I will remove the 4 .sys files mentioned above and use DDU to uninstall everything Nvidia, then reinstall the whole thing (I had no idea I had a mismatch O . o)

Thanks again. This gives me something to work with.


- Lith.
 

Lithia

UPDATE #2 (26/09/'18):

I just finished a MemTest86 run (default test, 4 passes, because I don't actually know what things are)
And the result showed zero errors whatsoever, so...I can at least rule out RAM issues?

UPDATE:

Today I had a string of fierce crashes. Getting a little angsty at this point.

I did the stuff described above: removed the .sys files (some were not dated as old on my rig as stated in the posts above, btw), reinstalled drivers, the works.
First crash happened, as per usual, during gaming, roughly 1 hour in.

After that I get a crash during reboot: SYSTEM_SERVICE_EXCEPTION.
I reboot, and I immediately get another one: ATTEMPTED_EXECUTE_OF_NOEXECUTE_MEMORY.

I'm thinking: windows is getting as fed up as I am, and is starting to feel the heat of this constant crashing.

I end up being able to log back in and I leave the room. I come back some 10 mins later and I'm staring at another BSoD: the traditional KMODE_EXCEPTION_NOT_HANDLED one.

The first 2 minidumps are 0 bytes in size, so they're probably useless.
The other two crashes do have some data; please view them here: https://drive.google.com/open?id=16zhRwrUZr-yjt74BS3IfNZSB0ll409o7
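
For anyone wanting to sort usable dumps from empty ones before uploading, here is a minimal Python sketch. It assumes the dumps sit in the default Windows minidump folder; the function name and the `DEFAULT_DUMP_DIR` constant are illustrative, not taken from any tool mentioned here.

```python
from pathlib import Path

# Assumption: the default Windows location for small memory dumps.
DEFAULT_DUMP_DIR = r"C:\Windows\Minidump"


def split_dumps(dump_dir):
    """Return (usable, empty) lists of .dmp file names found in dump_dir."""
    usable, empty = [], []
    for path in sorted(Path(dump_dir).glob("*.dmp")):
        # A 0-byte dump holds no crash data, so set it aside.
        if path.stat().st_size > 0:
            usable.append(path.name)
        else:
            empty.append(path.name)
    return usable, empty
```

Calling `split_dumps(DEFAULT_DUMP_DIR)` returns two lists of file names: the dumps worth sharing and the 0-byte ones to skip.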

Some brainspins:
- Can an overclocked system, as a whole, be the culprit? (I've lowered the CPU from 4900 MHz to 4800 MHz and dropped the voltage from 1.220 V to 1.200 V. Minimal, I know, but it just seems so...unlikely that this is *suddenly* the issue)

- The memory diagnostic said that my RAM is fine. Do I go for a second opinion with something else? I've never used memtest, for example.

- Can Realtek Audio be a culprit even though I'm not really using it? (I use a SteelSeries Arctis Wireless Pro, which has its own soundcard/-pod, so I don't use the realtek HD audio as "speaker")

- From time to time I see that "Corsair service" has stopped working in Reliability Monitor. Could this be a potential culprit? (I use Corsair software for my keyboard --> RGB stuff etc.)

- Roughly 2-3 months ago I was forced to reinstall Windows. I did a clean reinstall, but I did nothing with my storage drives. I've got 3 of those. Is this bad in any way? (Bad enough to link it to this?)

Any additional info is much appreciated.


- Lith.
 
Overclock drivers take turns tweaking voltages even if you do not set an overclock. Each driver can make changes on the fly. You need to reproduce the problem without any overclock driver installed. You should also use an updated BIOS or reset the BIOS to defaults to get the best default timings and voltages.
Changing a voltage in the BIOS can cause various issues. It is best to test with default BIOS settings and no overclock drivers installed.

Audio drivers can cause memory corruption; there are known bugs in the motherboard audio drivers that were fixed in late summer of 2017.

The Corsair software stopping can indicate issues with the software itself, or the firmware in the device might need to be updated. If it is on a USB 2.x port, you might need to update the CPU chipset driver; if it is on a USB 3.x port, you may need to update the external USB 3.0 chipset driver. With all USB devices, you might need to update the BIOS to match the USB driver version.

Often USB 2.x will work better than USB 3.0 for slow devices, since Windows Update will update chipset drivers but generally will not update external USB 3.0 drivers (you have to get them from the motherboard vendor).
 
Solution

Lithia

As far as the Corsair "issue" goes (I never notice it myself; I only noticed *that* it is a thing because I had to go through BlueScreenView etc.): all possible system drivers are up to date (CPU, audio, LAN, mobo, the whole shebang), and I'm purposely running both USB cables from my keyboard off the only 2 USB 2.0 ports my motherboard has.

As for the actual issue: I've tweaked a few settings and lowered my CPU multiplier, and...this might have worked a bit.
The little bit of gaming that I've been able to do has gone smoothly. There were a few minor 'hiccups' here and there that made me go "Is this where it *would* have crashed...?"

But until I can get some more session-time in, so I can really sit down and have a game run for over 1-2 hours, I can't say for sure.

Will keep you (read: everyone) posted.

If this seems to fail I'll reset OC settings to default and try again.
If the OC turns out to be the problem...well...it's electronic hardware, so I *shouldn't* be surprised, but I'll be just a bit, as I've gone sans issue all this time.
Oh well...


- Lith.
 

Lithia



UPDATE:

I've gotten through a few gaming sessions now and...there's not been a single issue.
I'm going to chalk it up to my overclock settings.
Let this also be a beacon of guidance to others that might be having similar problems, one day in the future.
What used to work fine, might not always be fine.
I didn't look twice at my OC settings, as they'd served me fine for a good while: "why would it be a problem now?"
...All things age, I suppose. Lowering the voltage (by as little as 0.020 V) and the ratio (by as little as 2), and tweaking some RAM settings, sorted things out.

@Johnbl: Thanks so much for looking into this and helping me out!!


- Lith.