P8P67 Pro + 2500k OC'd for 2 years has suddenly become unstable

heyrobscott

Honorable
Jun 2, 2013
8
0
10,510
My system, overclocked to 4.5GHz, was rock solid for nearly two years, but suddenly I've been getting random blue screens with "WHEA_UNCORRECTABLE_ERROR". I built my current rig in June 2011 with:
- ASUS P8P67 Pro
- 2500k
- Noctua NH-D14 dual-fan heatsink
- 8GB of RAM (2x 4GB)
- 560ti gfx card
- Fractal R3 case

I immediately OC'd by only adjusting two settings:

1.) Raise the multiplier to 45, taking the cores to 4500MHz
2.) Adjust the VRM Frequency from 300 to 350

The only changes are that I'm now running Windows 8 64-bit and have upgraded the BIOS to 3602 (the latest version on the ASUS page). Running Prime95 in blend mode will trigger the crash within a few minutes. Sometimes it successfully reboots, sometimes it hard locks (if it hard locks, it takes down my router's internet connectivity with it - see below*).

Restoring BIOS settings to factory defaults seems to make it pretty stable: I've run Prime95 multiple times for a few hours at factory settings and haven't had a single crash. I've tried cleaning out the inside and reseating the RAM and video card. The only thing I haven't tried is reseating the cooler, but right now it's idling at 31°C and jumps to 42°C after 20 minutes with Prime95 running, so I feel like the cooler is doing its job.

Any ideas as to what would suddenly cause the instability? Any suggestions on how to troubleshoot it?

* If the machine hard locks after a blue screen, it also renders my router's internet connection unresponsive. It's the strangest thing: the computer is connected by Ethernet to one of the router ports, but once it locks, the router is somehow overloaded. If you're connected by wifi at the time, you can still access the admin interface, but you can't access it remotely, and all internet activity is shut down. Restarting the computer restores the connection without any additional interaction with the router. It's important to stress that the RT-N16 router seems otherwise unfazed, apart from its inability to communicate with the internet during the rig's lockup.
 

heyrobscott



It doesn't happen now at default settings, but it has run games OC'd for hundreds of hours over the past two years with no problems, and the only things that have changed are the BIOS updates and the upgrade to Windows 8... I'm concerned that it's a hardware failure issue, and disabling BIOS features is only delaying the inevitable. If it's memory or even the processor, that's easier to replace, but if it's motherboard failure, I'm better off building a new rig from scratch.
 

heyrobscott

So I've ruled out memory, heatsink/fans, BIOS, CMOS settings, drivers, and Windows installs. Since I made this post I've reinstalled Windows 8 clean, downgraded the BIOS, done multiple CMOS clears, tried each memory stick one at a time, and reseated the CPU with fresh thermal paste. This leaves only three possible points of failure:

- the motherboard
- the CPU
- the power supply (Corsair HX 650w)

Is there any way to rule out these areas? The motherboard is still under warranty, but how do I know it's not the CPU or PSU?
 

Tradesman1

Legenda in Aeternum
PSU is fairly easy if you have another, or know a friend or co-worker you can borrow one from to test with for an evening or so... When you did the BIOS update, and if your DRAM is 1600 or better, did you re-enable XMP? Another thing would be to check and make sure all your drivers are up to date; your BSOD is indicative of a hardware failure...
 

heyrobscott



Unfortunately I don't have access to another PSU. The DRAM is only 1333, so no XMP. The drivers are up to date: I did a clean Windows 8 install with drivers from asus.com and all Windows Updates.
 

heyrobscott

Here's the debug report:


Microsoft (R) Windows Debugger Version 6.2.9200.20512 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [C:\Windows\MEMORY.DMP]
Kernel Bitmap Dump File: Only kernel address space is available

Symbol search path is: srv*c:<local folder>*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows 8 Kernel Version 9200 MP (4 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 9200.16581.amd64fre.win8_gdr.130410-1505
Machine Name:
Kernel base = 0xfffff800`c7e76000 PsLoadedModuleList = 0xfffff800`c8142a20
Debug session time: Sat Jun 15 14:13:52.525 2013 (UTC - 5:00)
System Uptime: 0 days 0:07:24.175
Loading Kernel Symbols
...............................................................
................................................................
...................
Loading User Symbols
PEB is paged out (Peb.Ldr = 000007ff`fffdb018). Type ".hh dbgerr001" for details
Loading unloaded module list
..........
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 124, {0, fffffa8007e2e028, be200000, 5110a}

Probably caused by : GenuineIntel

Followup: MachineOwner
---------

2: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

WHEA_UNCORRECTABLE_ERROR (124)
A fatal hardware error has occurred. Parameter 1 identifies the type of error
source that reported the error. Parameter 2 holds the address of the
WHEA_ERROR_RECORD structure that describes the error conditon.
Arguments:
Arg1: 0000000000000000, Machine Check Exception
Arg2: fffffa8007e2e028, Address of the WHEA_ERROR_RECORD structure.
Arg3: 00000000be200000, High order 32-bits of the MCi_STATUS value.
Arg4: 000000000005110a, Low order 32-bits of the MCi_STATUS value.

Debugging Details:
------------------


BUGCHECK_STR: 0x124_GenuineIntel

DEFAULT_BUCKET_ID: WIN8_DRIVER_FAULT

PROCESS_NAME: prime95.exe

CURRENT_IRQL: f

STACK_TEXT:
fffff880`01a67868 fffff800`c7e37965 : 00000000`00000124 00000000`00000000 fffffa80`07e2e028 00000000`be200000 : nt!KeBugCheckEx
fffff880`01a67870 fffff800`c7fd6ca9 : 00000000`00000001 fffffa80`066d8490 00000000`00000000 fffffa80`07e2e028 : hal!HalBugCheckSystem+0xf9
fffff880`01a678b0 fffff800`c7e37703 : 00000000`00000728 00000000`00000002 fffff880`01a67a10 fffffa80`066d8490 : nt!WheaReportHwError+0x249
fffff880`01a67910 fffff800`c7e37020 : 00000000`00000010 fffffa80`066d8490 fffff880`01a67ac8 fffffa80`066d8490 : hal!HalpMcaReportError+0x53
fffff880`01a67a70 fffff800`c7e36f1b : fffffa80`0676a7a0 00000000`00000001 00000000`00000002 00000000`00000000 : hal!HalpMceHandlerCore+0xd4
fffff880`01a67ac0 fffff800`c7e36d78 : 00000000`00000004 00000000`00000001 00000000`00000000 00000000`00000000 : hal!HalpMceHandler+0xe3
fffff880`01a67b00 fffff800`c7e37f0f : fffffa80`0676a7a0 fffff880`01a67d30 00000000`00000000 00000000`00000000 : hal!HalpMceHandlerWithRendezvous+0xd4
fffff880`01a67b30 fffff800`c7ece77b : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : hal!HalHandleMcheck+0x40
fffff880`01a67b60 fffff800`c7ece52e : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxMcheckAbort+0x7b
fffff880`01a67ca0 00000001`419496b7 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiMcheckAbort+0x16e
00000000`03f2e610 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x00000001`419496b7


STACK_COMMAND: kb

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: GenuineIntel

IMAGE_NAME: GenuineIntel

DEBUG_FLR_IMAGE_TIMESTAMP: 0

FAILURE_BUCKET_ID: 0x124_GenuineIntel_PROCESSOR_CACHE

BUCKET_ID: 0x124_GenuineIntel_PROCESSOR_CACHE

Followup: MachineOwner
---------

 

heyrobscott



All drivers are definitely up to date. I've tried drivers in waves: first a barebones clean Win8 install, then Windows Update, then files from asus.com, then finally newer versions posted to http://www.overclock.net/t/910402/asus-p67-series-information-thread-drivers-bioses-overclocking-reviews-updated-4-22 specifically for my motherboard. Unfortunately, nothing changes.