Recurring BSoD Problem

jbonham91980

Distinguished
Jun 24, 2011
9
0
18,510
Hey guys, I built my computer last year around Cyber Monday 2012, and up until a month or two ago had no issues whatsoever. The problem first started only when playing Final Fantasy XIV, but after I took a break from the game the problems seemed to go away for a while.

The problem has now resurfaced when playing other games, especially when running Pandora in the background. I have gotten two different errors so far when the BSoD occurs: IRQL_NOT_LESS_OR_EQUAL and KERNEL_SECURITY_CHECK_FAILURE.

I have been going through all my drivers and updating them to make sure they aren't the issue, but the problem is still happening. I'm not very good at determining whether or not it's a hardware issue, but from reading through various threads with similar problems, it seems as though my RAM could be causing it.

I have not done any overclocking whatsoever, and have been running my computer with stock settings since I bought it.

Here are two of the crash dumps.

https://skydrive.live.com/redir?resid=B29B66DAA577F311!382&authkey=!APUkzIb2f39uFzM&ithint=folder%2c

Thanks in advance for any help.
 
edit: new realtek audio drivers from here:
http://www.realtek.com.tw/downloads/downloadsView.aspx?Langid=1&PNid=13&PFid=5&Level=5&Conn=4&DownTypeID=3&GetDown=false
(there were reports that the old driver was corrupting kernel memory)

Ok, it looks like you got a bugcheck, updated your BIOS and replaced your memory, then got another bugcheck a few weeks later.
The cause of both bugchecks is not clear, but it is most likely a buffer overflow caused by a driver.

I would suspect one of these three:
nvhda64v.sys Thu Nov 28 05:38:09 2013 (nvidia audio driver for HDMI)
RTKVHD64.sys Tue Jun 12 03:02:32 2012 (Realtek(r) High Definition Audio Function Driver )
mbam.sys (malwarebytes driver Thu Feb 28 12:33:07 2013, just old and you might want to remove or update)
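If you want to sort the suspect drivers by age yourself, here is a rough Python sketch. It only uses the three timestamps listed above and the debug session time of the bugcheck A dump. The function name is made up for the example, and time zones are ignored, so treat the day counts as approximate.

```python
from datetime import datetime

# Driver build timestamps as listed in this post.
DRIVERS = [
    ("nvhda64v.sys", "Thu Nov 28 05:38:09 2013"),
    ("RTKVHD64.sys", "Tue Jun 12 03:02:32 2012"),
    ("mbam.sys", "Thu Feb 28 12:33:07 2013"),
]

# Debug session time of the bugcheck A dump (fractional seconds dropped).
CRASH_TIME = "Sat Jan 18 11:30:14 2014"

STAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def driver_ages(drivers, crash_time):
    """Return (name, age_in_days) pairs, oldest driver first.

    Hypothetical helper: time zones are ignored, so ages are approximate.
    """
    crash = datetime.strptime(crash_time, STAMP_FORMAT)
    ages = []
    for name, stamp in drivers:
        built = datetime.strptime(stamp, STAMP_FORMAT)
        ages.append((name, (crash - built).days))
    return sorted(ages, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, days in driver_ages(DRIVERS, CRASH_TIME):
        print(f"{name}: about {days} days old at crash time")
```

An old driver is not automatically a bad driver; the age is just a hint about where to look first.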


nvxdsync.exe was running.
My guess is that there is a bug in the nvidia HDMI audio driver that caused a buffer overflow and corrupted memory that belongs to other drivers.
Or there is some interaction between the nvidia sound driver and the older realtek sound driver. (The bug could be in the realtek driver as well, I don't know.)

I suspect the bug is triggered by using the nvidia program nvxdsync.exe (sound control app? you might look up what it does)

I would recommend that you disable the nvidia audio driver in control panel -> device manager -> high definition audio devices,
or disable the one on your motherboard, depending on which one you use most.


Notes: you did not have any detected memory corruption in any loaded windows core files. The corruption was in shared data structures in the drivers' memory space. Basically, a driver corrupted memory, and later a windows driver went to get its data from memory, detected the corruption and called a bugcheck.

Windows will load drivers in a different order each time it boots. Most drivers don't look for corruption, but a few do look in an effort to isolate damage
to the OS from a virus or driver errors. This is why you will not get a bugcheck on each boot but still have the same problem: it is just not detected
until a core windows driver's data is corrupted and that driver calls a bugcheck.
(directx called one bugcheck, and a memory manager routine called the other)









debug info:
Built by: 9200.16628.amd64fre.win8_gdr.130531-1504
Machine Name:
Kernel base = 0xfffff800`5627f000 PsLoadedModuleList = 0xfffff800`5654ba20
Debug session time: Sat Jan 18 11:30:14.299 2014 (UTC - 8:00)
System Uptime: 0 days 0:00:19.012



BugCheck A, {fffff700010a2888, 0, 0, fffff80056313183}


nvxdsync.exe
mbam \??\C:\Windows\system32\drivers\mbam.sys Thu Feb 28 12:33:07 2013

RTKVHD64.SYS Realtek(r) High Definition Audio Function Driver Jun 12 03:02:32 2012

RTKVHD64 \SystemRoot\system32\drivers\RTKVHD64.sys Tue Jun 12 03:02:32 2012
nvhda64v \SystemRoot\system32\drivers\nvhda64v.sys Thu Nov 28 05:38:09 2013

Identifier = REG_SZ AMD64 Family 21 Model 1 Stepping 2
ProcessorNameString = REG_SZ AMD FX(tm)-8120 Eight-Core Processor

Processor ID 120f6000fffb8b17
Processor Version AMD FX(tm)-8120 Eight-Core Processor
Processor Voltage 8dh - 1.3V
External Clock 200MHz
Max Speed 3100MHz
Current Speed 3100MHz

BIOS Version 2104
BIOS Starting Address Segment f000
BIOS Release Date 11/21/2013
Manufacturer ASUSTeK COMPUTER INC.
Product SABERTOOTH 990FX R2.0

memory
bank 0: none
bank 1: 4096MB 1333MHz Kingston 99U5403-043.A00LF
bank 2: none
bank 3: 4096MB 1333MHz Kingston 99U5403-043.A00LF


core OS files not corrupt in memory

--------------------------------------
BugCheck 139, {3, fffff88007fb6510, fffff88007fb6468, 0}
KERNEL_SECURITY_CHECK_FAILURE (139)
A kernel component has corrupted a critical data structure. The corruption
could potentially allow a malicious user to gain control of this machine.
Arguments:
Arg1: 0000000000000003, A LIST_ENTRY has been corrupted (i.e. double remove).
Arg2: fffff88007fb6510, Address of the trap frame for the exception that caused the bugcheck
Arg3: fffff88007fb6468, Address of the exception record for the exception that caused the bugcheck
Arg4: 0000000000000000, Reserved

ERROR_CODE: (NTSTATUS) 0xc0000409 - The system detected an overrun of a stack-based buffer in this application.
This overrun could potentially allow a malicious user to gain control of this application.
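For anyone wondering how a "double remove" gets caught: Arg1 = 3 corresponds to a corrupted LIST_ENTRY, and the kernel's list removal code sanity-checks the neighbour links before unlinking an entry. Here is a small Python sketch of that idea. It is only an illustration, not Windows code; the class and function names are made up for the example.

```python
class CorruptListEntry(Exception):
    """Raised when the neighbour links do not point back at the entry."""


class ListEntry:
    """Hypothetical stand-in for a doubly linked list node (flink/blink)."""

    def __init__(self, name):
        self.name = name
        self.flink = self  # forward link
        self.blink = self  # backward link


def insert_tail(head, entry):
    """Link entry in just before head, i.e. at the tail of the list."""
    last = head.blink
    entry.flink = head
    entry.blink = last
    last.flink = entry
    head.blink = entry


def remove_entry(entry):
    """Unlink entry, but only if both neighbours still point back at it."""
    nxt, prev = entry.flink, entry.blink
    if nxt.blink is not entry or prev.flink is not entry:
        # In this sketch an exception plays the role of the bugcheck.
        raise CorruptListEntry(f"links around {entry.name!r} are inconsistent")
    prev.flink = nxt
    nxt.blink = prev


if __name__ == "__main__":
    head, a, b = ListEntry("head"), ListEntry("a"), ListEntry("b")
    insert_tail(head, a)
    insert_tail(head, b)
    remove_entry(a)      # fine
    try:
        remove_entry(a)  # second removal: neighbours no longer point at a
    except CorruptListEntry as err:
        print("detected:", err)
```

The same check also trips if some other driver has scribbled over the links, which is why this bugcheck does not tell you by itself which driver is at fault.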

------------------------------------------------------
machine 2
Windows 8 Kernel Version 9200 MP (8 procs) Free x64
Built by: 9200.16628.amd64fre.win8_gdr.130531-1504
Debug session time: Wed Dec 25 11:26:03.581 2013 (UTC - 8:00)
System Uptime: 0 days 0:34:10.298

BugCheck 139, {3, fffff88007fb6510, fffff88007fb6468, 0}
KERNEL_SECURITY_CHECK_FAILURE (139)
A kernel component has corrupted a critical data structure. The corruption
could potentially allow a malicious user to gain control of this machine.
Arguments:
Arg1: 0000000000000003, A LIST_ENTRY has been corrupted (i.e. double remove).
Arg2: fffff88007fb6510, Address of the trap frame for the exception that caused the bugcheck
Arg3: fffff88007fb6468, Address of the exception record for the exception that caused the bugcheck
Arg4: 0000000000000000, Reserved



BIOS Release Date 07/10/2012
Manufacturer ASUSTeK COMPUTER INC.
Product SABERTOOTH 990FX R2.0
memory
bank1 and bank 3: 4096MB 1600MHz Manufacturer2


 

jbonham91980

I went ahead and updated my Realtek audio drivers, then disabled all the unused audio ports that had to do with Nvidia. I normally use my motherboard's audio jacks. I keep my Malwarebytes always up to date, so I'm not sure why that file is old. Would removing it pose any risk to anything else? Otherwise I'll just go ahead and delete it. Thanks for the advice, I'll be running through the tests that have been blue screening me today to see if it helps. I really appreciate the help.
 
The malwarebytes driver might be the current one. I think you can do a scan without installing the driver; you might do that to remove a possible point of failure.



 

jbonham91980

Looks like kernel data corruption again, this time using a bad address.
Looking at your OS image, the only thing I can think of is to turn off the special hardware for your USB 3.0 in the BIOS and see if the problem goes away.
I mention this because it looks like you have generic USB 3.0 drivers, but there were lots of hardware versions that had bugs in their USB 3.0 circuits. Those buggy circuits require custom drivers to work correctly. So, disable USB 3.0 on the motherboard, put your devices on USB 2 and hope.

confirm that a chkdsk /f /r of your drive comes back ok
confirm that your OS files are ok: open cmd.exe as admin, then run
sfc.exe /scannow
maybe swap your memory between slots and run a RAM test?
 

jbonham91980

I updated my computer to Windows 8.1 after running chkdsk because it came up with corrupt files. I was planning on rerunning chkdsk to see if it found anything after installing the new version of Windows, but before I got there I blue screened again with a new error. I didn't catch the message I got when it crashed, only that it was different from the other two I've gotten previously, but here's the minidump for it.

https://skydrive.live.com/redir?resid=B29B66DAA577F311!646&authkey=!AGL7QyqozgPWa_I&ithint=file%2c.dmp

I haven't disabled USB 3.0 yet, or moved my RAM around to run memtest, but was curious if the new crash dump showed anything different.

Thank you again for your help.
 
BugCheck 4E, {8d, b7faf, 410080, fffff68000073b83}
PFN_LIST_CORRUPT (4e)
Typically caused by drivers passing bad memory descriptor lists (ie: calling
MmUnlockPages twice with the same list, etc). If a kernel debugger is
available get the stack trace.
Arguments:
Arg1: 000000000000008d,
Arg2: 00000000000b7faf
Arg3: 0000000000410080
Arg4: fffff68000073b83

the description for Arg1= 8d = The page-free list is corrupted. This error code most likely indicates a hardware issue.

i did not find any corrupt OS modules in the debugger.
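If you want to pull the stop code and arguments out of these debugger summary lines yourself, here is a minimal Python sketch. It is not a debugger feature: the function name and the small lookup table (which only covers the three codes seen in this thread) are just for illustration.

```python
import re

# Only the three stop codes that show up in this thread.
KNOWN_BUGCHECKS = {
    0x0A: "IRQL_NOT_LESS_OR_EQUAL",
    0x4E: "PFN_LIST_CORRUPT",
    0x139: "KERNEL_SECURITY_CHECK_FAILURE",
}

# Matches lines such as: BugCheck 4E, {8d, b7faf, 410080, fffff68000073b83}
BUGCHECK_LINE = re.compile(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}")


def parse_bugcheck(line):
    """Return the code, name and arguments of a summary line, or None."""
    match = BUGCHECK_LINE.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)  # the debugger prints everything in hex
    args = [int(part.strip(), 16) for part in match.group(2).split(",")]
    return {
        "code": code,
        "name": KNOWN_BUGCHECKS.get(code, "UNKNOWN"),
        "args": args,
    }


if __name__ == "__main__":
    info = parse_bugcheck("BugCheck 4E, {8d, b7faf, 410080, fffff68000073b83}")
    print(info["name"], [hex(arg) for arg in info["args"]])
```

For bugcheck 4E the first argument (8d here) is the one that says which kind of list corruption was detected.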

Run memtest86 (http://www.memtest86.com/). If it passes, swap your memory modules between slots and run it again. If it still passes after that, you are back to hunting down a driver corruption issue.
 

jbonham91980

memtest86 v 4.3.6
you had two types of test failures

Test 7 [Moving inversions, 32 bit pattern]

This is a variation of the moving inversions algorithm that shifts the data pattern left one bit for each successive address. The starting bit position is shifted left for each pass. To use all possible data patterns 32 passes are required. This test is quite effective at detecting data sensitive errors but the execution time is long.
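To make the Test 7 description a bit more concrete, here is a minimal Python sketch of one moving-inversions pass over a simulated memory (a plain list of 32-bit words). This is not memtest86's code: the function names and the StuckBitMemory fault model are invented for the example. The real test runs against physical RAM and repeats the pass for all 32 starting bit positions.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, count):
    """Rotate a 32-bit value left by count bits."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32


def moving_inversions_32(memory, start_bit=0):
    """One pass; returns the indices where a read-back mismatch was seen."""
    errors = []
    size = len(memory)

    def pattern(index):
        # A single set bit, shifted left by one for each successive address.
        return rotl32(1, start_bit + index)

    # Phase 1: fill memory with the shifting pattern.
    for i in range(size):
        memory[i] = pattern(i)
    # Phase 2: ascending addresses, verify the pattern, write its complement.
    for i in range(size):
        if memory[i] != pattern(i):
            errors.append(i)
        memory[i] = pattern(i) ^ MASK32
    # Phase 3: descending addresses, verify the complement, restore pattern.
    for i in reversed(range(size)):
        if memory[i] != pattern(i) ^ MASK32:
            errors.append(i)
        memory[i] = pattern(i)
    return errors


class StuckBitMemory(list):
    """Hypothetical fault model: one cell with bits that always read as 1."""

    def __init__(self, size, bad_index, stuck_mask):
        super().__init__([0] * size)
        self.bad_index = bad_index
        self.stuck_mask = stuck_mask

    def __setitem__(self, index, value):
        if index == self.bad_index:
            value |= self.stuck_mask
        super().__setitem__(index, value)


if __name__ == "__main__":
    print("good memory:", moving_inversions_32([0] * 64))
    print("bad memory:", moving_inversions_32(StuckBitMemory(64, 5, 0x10)))
```

Because every cell is checked against both a pattern and its complement, a bit that is stuck at either 0 or 1 gets caught in one of the two verify phases.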

Test 9 [Modulo 20, Random pattern]
Using the Modulo-X algorithm should uncover errors that are not detected by moving inversions due to cache and buffering interference with the algorithm.
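Along the same lines, here is a rough Python sketch of the Modulo-X idea behind Test 9, again on a simulated memory. It follows the general description of the algorithm (write every 20th location with a pattern, write everything else with the complement, then check the every-20th locations). The function name, the fixed pattern and the CouplingFaultMemory fault model are assumptions made for the example; the real test uses random patterns on physical RAM.

```python
MASK32 = 0xFFFFFFFF


def modulo_x_test(memory, pattern, modulus=20):
    """Return indices where the pattern did not survive neighbouring writes."""
    errors = []
    complement = pattern ^ MASK32
    size = len(memory)
    for offset in range(modulus):
        # Write every modulus-th location (starting at offset) with the pattern.
        for i in range(offset, size, modulus):
            memory[i] = pattern
        # Write all other locations with the complement, twice.
        for _ in range(2):
            for i in range(size):
                if i % modulus != offset:
                    memory[i] = complement
        # Check that the pattern locations were not disturbed.
        for i in range(offset, size, modulus):
            if memory[i] != pattern:
                errors.append(i)
    return errors


class CouplingFaultMemory(list):
    """Hypothetical fault: a write to `aggressor` clears bit 0 of `victim`."""

    def __init__(self, size, aggressor, victim):
        super().__init__([0] * size)
        self.aggressor = aggressor
        self.victim = victim

    def __setitem__(self, index, value):
        super().__setitem__(index, value)
        if index == self.aggressor:
            super().__setitem__(self.victim, self[self.victim] & ~1)


if __name__ == "__main__":
    print("good memory:", modulo_x_test([0] * 100, 0xA5A5A5A5))
    print("bad memory:", modulo_x_test(CouplingFaultMemory(100, 41, 40), 0xA5A5A5A5))
```

The point of the long gap between writing the pattern and checking it is that the value has to survive a lot of traffic to its neighbours, so a cached or buffered copy cannot hide a cell that gets disturbed.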

- I would check the memory timings from the memory modules and compare them to what the BIOS actually has them set to.
- I would also try different slots to see if I get the same errors
- try your other set of memory to see if they have the same issue

It looks like there are a lot of BIOS updates (December, November, October, ...) for this board. It looks like you are current, but the board may have issues. I would talk to ASUS about getting your memory timings correct so the board passes the memtest86 tests. ASUS will know which memory settings affect this test; lots of people run this test and this is a popular board.

The error can even be in the memory controller in the CPU. Hopefully it is going to be in one stick or one slot of your motherboard.
 
Solution

jbonham91980

It looks like the standard settings for my RAM are 1600 MHz and 1.65 V, compared to the 1333 MHz and 1.501 V my BIOS had them set at. I changed them, but haven't swapped my sticks just yet. I'm going to see if I crash again with the current settings, then I'll move the sticks from slots 2 and 4 on the motherboard to 1 and 3 and see what happens. Thanks again for your help.