BSOD Followup: memory_corruption and Probably caused by : ntkrnlmp.exe ( nt!KxWaitForLockOwnerShipWithIrql+12 )

AmericaMe

Commendable
Nov 2, 2016
7
0
1,510
Hello all,

Here's the issue: ever since I upgraded my PC from Windows 7 to Windows 10, I have been getting BSODs, and I can't find the cause.

Here's what I have done thus far:
1) Uninstalled Norton and am now using Windows Defender
2) I am not overclocking
3) I downloaded and ran WinDbg and analyzed the dump files (see the 2 attached from today)
4) System restored to an earlier point
5) Fresh installed Windows 10 on my SSD
6) All my drivers are up to date
7) I also tried using an older graphics driver when the updated graphics driver didn't work

From other threads and forums, I've read that my CPU or RAM may need to be replaced. Is there any way to identify which one, or has my diagnosis so far been wrong? Also, do you guys have any other ideas?

Thanks for any help or ideas in advance!
 

Colif

Win 11 Master
Moderator
The SYSTEM_THREAD_EXCEPTION_NOT_HANDLED_M bug check has a value of 0x1000007E. This indicates that a system thread generated an exception which the error handler did not catch. Bug check 0x1000007E has the same meaning and parameters as bug check 0x7E (SYSTEM_THREAD_EXCEPTION_NOT_HANDLED).
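
As a side note on how the two codes relate numerically, here is a minimal Python sketch. It only compares the two values quoted above. Treating the difference as a single "flag" bit is my own assumption for illustration, and `strip_high_flag` is a hypothetical helper, not part of any debugger.

```python
# Bug check values quoted in the post above.
BUGCHECK_M = 0x1000007E     # SYSTEM_THREAD_EXCEPTION_NOT_HANDLED_M
BUGCHECK_BASE = 0x7E        # SYSTEM_THREAD_EXCEPTION_NOT_HANDLED

# XOR shows which bits differ between the two codes.
difference = BUGCHECK_M ^ BUGCHECK_BASE


def strip_high_flag(code: int, flag: int = 0x10000000) -> int:
    """Hypothetical helper: clear the assumed high flag bit from a code."""
    return code & ~flag


print(hex(difference))
print(hex(strip_high_flag(BUGCHECK_M)))
```

Either way, the parameters are read the same for both codes, so the advice for 0x7E applies here too.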

ntkrnlmp.exe is part of the Windows kernel, so it isn't likely to be the cause.

The resolution steps here may help you identify it further, since you have the debugger already: resolution advice

It looks like a driver error; the name is just hidden. If you cannot identify it using the above resolution steps, you can try Driver Verifier - just make sure to read the instructions first.

You can run the free version of http://www.memtest86.com/ on the RAM, one stick at a time. It creates a bootable USB so you don't need Windows. Any errors are too many; you want 0 errors.
 

AmericaMe



I am going to try the troubleshooting next, but I wanted to give an update since there were errors reported. I ran MemTest86 for almost 5 hours and it found 5 errors (you can see them here). It was stated I should have 0 errors, but unfortunately I had 5. Any insight as to what this means? I'm assuming it means my RAM is bad?

Thanks for any advice.
 

Colif

Starting from MemTest86 v6.2, the user may see a warning indicating that the RAM may be vulnerable to high frequency row hammer bit flips. This warning appears when errors are detected during the first pass (maximum hammer rate) but no errors are detected during the second pass (lower hammer rate). See MemTest86 Test Algorithms for a description of the two passes that are performed during the Hammer Test (Test 13). When performing the second pass, address pairs are hammered only at the rate deemed as the maximum allowable by memory vendors (200K accesses per 64ms). Once this rate is exceeded, the integrity of memory contents may no longer be guaranteed. If errors are detected in both passes, errors are reported as normal.
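
To put the "200K accesses per 64ms" figure in perspective, here is a rough back-of-the-envelope calculation in Python. It uses only the two numbers quoted above; the variable names are mine, and this is arithmetic on the quoted limit, not a description of how MemTest86 itself is implemented.

```python
# Figures quoted from the MemTest86 text above.
MAX_ACCESSES = 200_000   # maximum allowable accesses per window
WINDOW_MS = 64           # window length in milliseconds

# Convert the limit to accesses per second.
accesses_per_second = MAX_ACCESSES * 1000 / WINDOW_MS

# Average time between accesses at exactly that limit, in nanoseconds.
ns_per_access = WINDOW_MS * 1_000_000 / MAX_ACCESSES

print(f"{accesses_per_second:,.0f} accesses per second")
print(f"{ns_per_access:.0f} ns between accesses")
```

So the second pass hammers at roughly 3 million accesses per second, and the first pass goes faster than that.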

The errors detected during Test 13, albeit exposed only in extreme memory access cases, are most certainly real errors. During typical home PC usage (e.g. web browsing, word processing, etc.), it is less likely that the memory usage pattern will fall into the extreme case that makes it vulnerable to disturbance errors. It may be of greater concern if you were running highly sensitive equipment such as medical equipment, aircraft control systems, or bank database servers. It is impossible to predict with any accuracy if these errors will occur in real life applications. One would need to do a major scientific study of thousands of computers and their usage patterns, then do a forensic analysis of each application to study how it makes use of the RAM while it executes. To date, we have only seen 1-bit errors as a result of running the Hammer Test.

There are several actions that can be taken when you discover that your RAM modules are vulnerable to disturbance errors:

Do nothing
Replace the RAM modules
Use RAM modules with error-checking capabilities (e.g. ECC)

Depending on your willingness to live with the possibility of these errors manifesting themselves as real problems, you may choose to do nothing and accept the risk. For home use you may be willing to live with the errors. In our experience, we have several machines that have been stable for home/office use despite experiencing errors in the Hammer Test.

http://www.memtest86.com/troubleshooting.htm#hammer

How often are you getting the BSOD? Try the troubleshooting steps, and if no driver shows itself, maybe replace the RAM.
 
Solution

AmericaMe




Thanks for the link and info. I will look into this more tomorrow and report any action(s) I take. I get the BSOD regularly; it's guaranteed to occur if I am playing PC games such as Battlefield 1 or Witcher 3. It's only a matter of when: anywhere from 20 minutes to an hour. I use those games as a test after I make a change, since the BSOD always occurs while playing them.
 
6 = binary 0110
E = binary 1110 (the first bit flipped; that is the data corruption)
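
If you want to check that comparison yourself, a quick XOR of the two hex digits shows exactly which bit differs. This is only a small Python sketch; I am assuming 6 is the expected value and E is the value actually read back, and the variable names are made up for the example.

```python
# Assumption: 0x6 was the expected nibble and 0xE is what was read back.
expected = 0x6
actual = 0xE

# XOR leaves a 1 only in the bit positions where the two values differ.
flipped = expected ^ actual

# Bit positions counted from 0 at the least significant (rightmost) bit.
flipped_positions = [bit for bit in range(4) if (flipped >> bit) & 1]

print(f"expected {expected:04b}, actual {actual:04b}, diff {flipped:04b}")
print("bits that differ:", flipped_positions)
```

Only one bit differs, which fits the 1-bit errors the Hammer Test is known to produce.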

A failure in the bit hammer test means that one bit has leaked charge to the adjacent cell, so the data in memory becomes corrupted.

You will want to slow down your memory timings, or isolate the bad RAM stick and replace it.

Windows 10 actually checks for data corruption and will call a bugcheck if any of its data structures are modified.
It assumes the corruption is caused by malware even when it is just a simple hardware failure.
The checking was added to fight malware.

It is pretty common for RAM to fail this test; most people never test their RAM. As RAM gets more dense, you see this problem much more often than you used to. Sometimes you can adjust your memory voltages and timings and get the test to pass.
I would just make sure your settings in the BIOS are correct (update the BIOS; vendors make corrections to the memory timings and don't tell you). If the RAM still fails with the new BIOS and correct settings, I would just replace the RAM stick.
Also, if you isolate the RAM stick, sometimes you can put it in the fastest memory slot (the one closest to the CPU) and it will work.
The closest memory slot might be 9 ns faster and provide just the timing change needed for it to work correctly.
Generally, slowing down the memory access timings will also work.

 

AmericaMe

Thank you guys for the assistance on this issue. I ended up replacing both RAM sticks and have not encountered the issue all weekend. I'm marking this as solved since I have not encountered the BSOD since replacing my RAM.

As always, you all help me continually grow my knowledge and understanding of the inner workings of computers.