BSOD caused by hal.dll, ntoskrnl.exe and pshed [Windows 10]

1337jesuschrist

Honorable
Dec 25, 2012
26
0
10,530
Hello,

I've seen several threads about this issue, but I also see that people gave more detailed suggestions after seeing the logs and memory dumps. Apologies if creating another post like this is frowned upon.

I've recently upgraded to a Core i7 8700K, a Gigabyte Aorus Ultra Gaming, and 16GB of 3200 MHz G.Skill Ripjaws RAM. The operating system is Windows 10.

Just today, after about 2 weeks of zero issues, I started getting constant BSODs.

I got home at around 6 PM, and two hours later it is still constantly BSODing.

I had the chance to run BlueScreenView and it showed the same 3 culprits:
hal.dll
pshed
ntoskrnl.exe

==================================================
Dump File : 122017-16328-01.dmp
Crash Time : 12/20/2017 8:00:11 PM
Bug Check String :
Bug Check Code : 0x00000124
Parameter 1 : 00000000`00000000
Parameter 2 : ffffbe02`8c945028
Parameter 3 : 00000000`be000000
Parameter 4 : 00000000`00800400
Caused By Driver : hal.dll
Caused By Address : hal.dll+3bf1f
File Description : Hardware Abstraction Layer DLL
Product Name : Microsoft® Windows® Operating System
Company : Microsoft Corporation
File Version : 10.0.16299.98 (WinBuild.160101.0800)
Processor : x64
Crash Address : ntoskrnl.exe+1640e0
Stack Address 1 :
Stack Address 2 :
Stack Address 3 :
Computer Name :
Full Path : C:\Windows\Minidump\122017-16328-01.dmp
Processors Count : 12
Major Version : 15
Minor Version : 16299
Dump File Size : 848,956
Dump File Time : 12/20/2017 8:02:50 PM
==================================================
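
For anyone who wants to compare several of these BlueScreenView records side by side, a small script can turn the text into key/value pairs. This is only a sketch in Python: the function name `parse_bsv_record` is made up for illustration, and it assumes every field is laid out as `Key : Value` on its own line, as in the record above.

```python
def parse_bsv_record(text):
    """Parse 'Key : Value' lines from a BlueScreenView-style text record.

    Hypothetical helper; assumes one field per line, as in the post above.
    """
    record = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the separator rows made only of '='.
        if not line or set(line) == {"="}:
            continue
        # Split on the first " :" so values like "C:\..." stay intact.
        key, sep, value = line.partition(" :")
        if sep:
            record[key.strip()] = value.strip()
    return record


sample = """\
Dump File : 122017-16328-01.dmp
Bug Check Code : 0x00000124
Caused By Driver : hal.dll
Processors Count : 12
"""

rec = parse_bsv_record(sample)
code = int(rec["Bug Check Code"], 16)
print(rec["Dump File"], hex(code), rec["Caused By Driver"])
```

Fields that are empty in the record (such as the bug check string here) simply come back as empty strings.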

I also have a full 800 MB memory dump that I can provide, but I don't know of a reliable service for uploading files of that size.


I've updated the BIOS to the latest version on GIGABYTE's website with no success, and I've also pulled out my PCIe sound card without any change in the system's behavior.

Any help is much appreciated. I really don't know why it is suddenly playing up.

Thanks
 
Solution
What I would do is take it to a repair shop and get them to identify the exact cause. It could still be the CPU or the motherboard, and I find it's faster to get a store to confirm which part is broken than to RMA the motherboard only to find out it's the CPU.

Someone who has spare parts they know work will figure this out faster than guessing.

Colif

Win 11 Master
Moderator
Can you go to C:\Windows\Minidump
Copy the dmp files from there to another folder
Upload the copies from the other folder to a file sharing site and I will get someone to look at them for you. They will show what drivers were loaded at the time and give us a hint as to the cause.
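
If copying the files by hand is awkward between crashes, the copy step can be scripted. The following is a minimal Python sketch, not an official tool: `copy_minidumps` is a hypothetical helper name, and the destination folder is whatever you choose.

```python
import shutil
from pathlib import Path


def copy_minidumps(src_dir, dest_dir):
    """Copy every *.dmp file from src_dir into dest_dir and return the new paths.

    Hypothetical helper; only files directly inside src_dir are considered.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for dump in sorted(src.glob("*.dmp")):
        # copy2 also preserves timestamps, which helps match dumps to crashes.
        copied.append(Path(shutil.copy2(dump, dest / dump.name)))
    return copied
```

It could be called as `copy_minidumps(r"C:\Windows\Minidump", r"C:\DumpCopies")`, where `C:\DumpCopies` is just an example destination. Reading the Minidump folder may require running Python from an administrator prompt.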

NTOSKRNL = Windows kernel. It handles all driver requests, power management, and memory management. It sits between hardware and applications. It got blamed but it's not the cause.
HAL.dll = Hardware Abstraction Layer. It sits between the hardware and the kernel.
Not sure about that other file name.

WHEA stands for Windows Hardware Error Architecture. The error is raised by the CPU but not necessarily caused by it. It can be caused by overclocking software.
Remove overclocks
Remove any overclocking software (such as MSI Afterburner)

Try running this on CPU: https://downloadcenter.intel.com/download/19792/Intel-Processor-Diagnostic-Tool
 

1337jesuschrist

Hello Colif,

I am currently at work and do not have access to the machine. I've also had very limited success using the computer for more than 1 minute before it BSODs and the cycle repeats.

I did, however, grab a memory dump (not a minidump, unfortunately) before it began the cycle last night. It's been uploaded to Dropbox here: https://www.dropbox.com/s/u9010ssoni6fmgu/MEMORY.DMP?dl=0

Hopefully that will be able to give some insight as to what is going on.

The computer is fully stock: no overclocking, no MSI Afterburner, etc.

Thanks for your help!

 

gardenman

Splendid
Moderator
Hi, I ran the dump file through the debugger and got the following information: https://pste.eu/p/bUnt.html

File: MEMORY.DMP (Dec 20 2017 - 20:07:53)
BugCheck: [WHEA_UNCORRECTABLE_ERROR (124)]
Probably caused by: GenuineIntel (Process: System)
Uptime: 0 Day(s), 0 Hour(s), 00 Min(s), and 39 Sec(s)

The overclocking driver "IOCBios2.sys" was found on your system. (Intel Extreme Tuning Utility)

I can't help you with this. Wait for additional replies. Good luck.
 

1337jesuschrist

Hi Gardenman,

I've uninstalled Intel Extreme Tuning Utility and got hit with another BSOD, this time with the error code CLOCK_WATCHDOG_TIMEOUT.

After rebooting, it BSOD'd as usual and was back to WHEA_UNCORRECTABLE_ERROR.

Thanks for your input!
 

1337jesuschrist

Reinstalling Windows didn't help; the crashes continue at the same frequency with the same error.

I have not run the Intel Processor Diagnostic Tool yet. I'll try to run it, and hopefully the system will stay up long enough for it to complete.
 

gardenman

Here's the minidump results before the Windows reinstall: https://pste.eu/p/GyQK.html

File: 122217-18015-01.dmp (Dec 22 2017 - 22:44:19)
BugCheck: [CLOCK_WATCHDOG_TIMEOUT (101)]
Probably caused by: Unknown_Image (Process: System)
Uptime: 0 Day(s), 0 Hour(s), 01 Min(s), and 38 Sec(s)

File: 122217-17562-01.dmp (Dec 22 2017 - 22:57:32)
BugCheck: [WHEA_UNCORRECTABLE_ERROR (124)]
Probably caused by: GenuineIntel (Process: System)
Uptime: 0 Day(s), 0 Hour(s), 00 Min(s), and 35 Sec(s)

File: 122117-16312-01.dmp (Dec 21 2017 - 20:20:15)
BugCheck: [WHEA_UNCORRECTABLE_ERROR (124)]
Probably caused by: GenuineIntel (Process: System)
Uptime: 0 Day(s), 0 Hour(s), 00 Min(s), and 28 Sec(s)

File: 122117-16203-01.dmp (Dec 21 2017 - 19:44:52)
BugCheck: [WHEA_UNCORRECTABLE_ERROR (124)]
Probably caused by: GenuineIntel (Process: System)
Uptime: 0 Day(s), 0 Hour(s), 02 Min(s), and 19 Sec(s)

File: 122117-16109-01.dmp (Dec 21 2017 - 20:05:38)
BugCheck: [WHEA_UNCORRECTABLE_ERROR (124)]
Probably caused by: GenuineIntel (Process: MsMpEng.exe)
Uptime: 0 Day(s), 0 Hour(s), 03 Min(s), and 41 Sec(s)
I can't help you with this. Wait for additional replies. Good luck.
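
To see the pattern in a list like this at a glance, the summaries can be tallied by bug check name and uptime. The snippet below is a rough Python sketch that assumes the exact four-line layout used in this post (`File:`, `BugCheck:`, `Probably caused by:`, `Uptime:`); the regular expression and the `summarize` function are illustrative names, not part of any debugger.

```python
import re
from collections import Counter

# Matches one four-line entry in the summary format shown above (an assumption).
ENTRY_RE = re.compile(
    r"File: (?P<file>\S+).*?\n"
    r"BugCheck: \[(?P<name>\w+) \((?P<code>[0-9A-Fa-f]+)\)\]\n"
    r"Probably caused by: (?P<cause>.+?) \(Process: (?P<process>.+?)\)\n"
    r"Uptime: (?P<d>\d+) Day\(s\), (?P<h>\d+) Hour\(s\), "
    r"(?P<m>\d+) Min\(s\), and (?P<s>\d+) Sec\(s\)"
)


def summarize(text):
    """Return (bug check counts, list of uptimes in seconds) for all entries."""
    counts = Counter()
    uptimes = []
    for m in ENTRY_RE.finditer(text):
        counts[m["name"]] += 1
        hours = int(m["d"]) * 24 + int(m["h"])
        uptimes.append((hours * 60 + int(m["m"])) * 60 + int(m["s"]))
    return counts, uptimes


SAMPLE = """\
File: 122217-18015-01.dmp (Dec 22 2017 - 22:44:19)
BugCheck: [CLOCK_WATCHDOG_TIMEOUT (101)]
Probably caused by: Unknown_Image (Process: System)
Uptime: 0 Day(s), 0 Hour(s), 01 Min(s), and 38 Sec(s)

File: 122217-17562-01.dmp (Dec 22 2017 - 22:57:32)
BugCheck: [WHEA_UNCORRECTABLE_ERROR (124)]
Probably caused by: GenuineIntel (Process: System)
Uptime: 0 Day(s), 0 Hour(s), 00 Min(s), and 35 Sec(s)

File: 122117-16312-01.dmp (Dec 21 2017 - 20:20:15)
BugCheck: [WHEA_UNCORRECTABLE_ERROR (124)]
Probably caused by: GenuineIntel (Process: System)
Uptime: 0 Day(s), 0 Hour(s), 00 Min(s), and 28 Sec(s)

File: 122117-16203-01.dmp (Dec 21 2017 - 19:44:52)
BugCheck: [WHEA_UNCORRECTABLE_ERROR (124)]
Probably caused by: GenuineIntel (Process: System)
Uptime: 0 Day(s), 0 Hour(s), 02 Min(s), and 19 Sec(s)

File: 122117-16109-01.dmp (Dec 21 2017 - 20:05:38)
BugCheck: [WHEA_UNCORRECTABLE_ERROR (124)]
Probably caused by: GenuineIntel (Process: MsMpEng.exe)
Uptime: 0 Day(s), 0 Hour(s), 03 Min(s), and 41 Sec(s)
"""

counts, uptimes = summarize(SAMPLE)
print(counts.most_common(), "longest uptime (s):", max(uptimes))
```

For the five dumps above this gives four WHEA_UNCORRECTABLE_ERROR entries and one CLOCK_WATCHDOG_TIMEOUT, with no uptime longer than 221 seconds.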
 

Colif

It sure crashes fast...

So you haven't loaded any drivers and it still crashes at the same rate? Does it crash at boot or after login? If it's after login, see what happens in safe mode:
On the screen you have to click to get the login box to appear, click the power button in the bottom right
While holding down a Shift key, click Restart
This should load a blue menu
Choose Troubleshoot
Choose Advanced options
Choose Startup Settings
Hit the Restart button
Choose a safe mode (it doesn't matter which) by pressing the number associated with it
The PC will restart and load safe mode

If it gets into safe mode and doesn't crash within a few minutes, it might be a driver. Otherwise it's looking like something in your PC is broken and causing it to crash.

You can try running the Intel Processor Diagnostic Tool in Linux and see if it completes there: http://www.tcsscreening.com/files/users/IPDT_LiveUSB/index.html

Linux can also get WHEA-type hardware errors (it reports them as machine check exceptions), and if you get them there, it's a good sign it's hardware, possibly the CPU if it crashes while running a load test.
 

1337jesuschrist

Well, good news: it appears to be a driver issue, as booting into safe mode gave me complete stability. I left it on with BlueScreenView and a few tabs open so that I'd know if a BSOD happened, and when I returned everything was still there!

Furthermore, the only driver I installed was the NVIDIA driver for my GTX 970. I uninstalled it in safe mode and I am now writing this post from a normal boot without any GPU drivers installed...

I will be reseating my GPU (I doubt that will change anything) and installing the latest driver from NVIDIA. If it persists, I guess the GPU is toast? That wouldn't make sense, though, if it can function as it is meant to without drivers.


Any ideas?

Again, thank you all for your help.
 

1337jesuschrist

Okay, back to the drawing board. While I was writing the message above from a normal boot, it crashed several minutes later. I found the reason: Windows had simply installed the NVIDIA drivers automatically since it had an internet connection (the resolution went to 1080p instead of the low res). But now, even after removing all possible NVIDIA drivers, the system still runs at the higher resolution.

Honestly, I'm about to give up at this rate. This is ridiculous. Forever booted in safe mode :(
 

Colif

Sorry, I had stuff I had to do the last 2 days...

Try removing all nonessential parts: remove the GPU and run on the integrated graphics off the motherboard, and remove all RAM except 1 stick. See if we can install Win 10 with limited hardware.

I will ask for a 2nd opinion
 

1337jesuschrist

No problem,

I've been running on integrated graphics and it still continued. It seemed that any time I attempted to load any driver (chipset, storage, GPU, etc.) it would die. Even if left alone it would eventually BSOD.

I switched back to my 4670K; I got too tired of it. Someone on Reddit did some analysis as well and it's pointing most likely to the CPU. I'm awaiting their response to my updates.
 
Since it fails at any driver install, and without one too, it points towards motherboard failure. But before jumping to that conclusion I would run more tests, just to be sure. Check your RAM with memtest86 (I doubt that's it, since the system works in safe mode, but do it nevertheless). Try installing Windows on a different drive to rule out a bad drive. And, as Colif mentioned, run Linux from a bootable USB.
 

1337jesuschrist

I upgraded from a 4670K, and the two CPUs use different chipsets, so I essentially have another motherboard, set of RAM and CPU that I swapped in. I believe the RAM to be okay, but I'll try to run memtest in safe mode / driverless Windows.

The Intel Diagnostic Tool was able to complete yesterday and the CPU "passed" everything. On my Reddit thread, people have said they've seen bad CPUs still pass the test.

I tried installing Windows on an older SSD I had, with no luck. As we speak I'm using the 4670K setup with all the existing drives, having swapped only the motherboard, RAM and CPU.
 
If the RAM passes the test, that leaves you with only two possible hardware suspects: the CPU and the motherboard. I agree a CPU could pass the Intel test and still be faulty, but if it were faulty it would most likely give you BSODs in safe mode too. And the fact that you had a BSOD when trying to reinstall Windows from scratch makes me believe it is not purely a software issue. So unless I see something that contradicts it, I assume it is a faulty motherboard that's giving you the trouble.
 

1337jesuschrist

All fair points. I think by tonight I'll swap everything over again; I'm getting faster and faster at it at this rate. I'll reinstall Windows on the newer hardware (8700K etc.), run memtest, run the diagnostic tool again (if the system is stable enough, since I cannot run it in safe mode) and post the results. I just wish there was a more concrete way to prove which part is the problem so that RMA'ing would be ezpz.

Thanks again for all your help, everyone.
 

1337jesuschrist

Here is what the person from Reddit who ran manual analysis has told me:

Due to the 0x124 bug checks being consistently Internal timer I highly suspect it's a CPU problem. The 0x124 bug checks caused by a motherboard problem that I have seen were all related to PCIe communication issues. Are there newer dump files that have a larger variety of bug check codes?

Likewise, if it was a power problem I'd expect the BSODs to vary much more than what I've seen so far. Same with a heat problem. It's just too consistent and internal to the CPU rather than external.

I've helped out in some other threads where the OP replaced just about everything, tested hardware in every way available (including the Intel Diagnostic Tool which passed), and the only thing that made the system stable was a new CPU. They were Skylake and Kaby Lake CPUs. Here is one of the more detailed threads if you have the time and patience.

I guess my advice would be to not wait too long before trying a different CPU since dealing with Intel for a warranty would likely take quite a bit longer than dealing with the shop you got it from.

Please let us know what you find out.


I will still be running the tests I mentioned previously.
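
The "Internal timer" reading can be reproduced from the bug check parameters posted at the top of this thread. The Python sketch below assumes the documented meaning of the 0x124 parameters when parameter 1 is 0 (a machine check exception): parameter 3 holds the high 32 bits and parameter 4 the low 32 bits of the MCi_STATUS register. The flag positions and the few error codes listed follow the Intel architectural layout as I understand it; the lookup table is deliberately partial, and the function names are made up for illustration.

```python
# Partial lookup of simple MCA error codes (bits 15:0 of IA32_MCi_STATUS).
# Only a few entries are listed here; this is not the complete table.
SIMPLE_MCA_CODES = {
    0x0000: "No error",
    0x0001: "Unclassified",
    0x0400: "Internal timer error",
}

# Architectural flag bits in the upper part of the status register.
FLAG_BITS = {"VAL": 63, "OVER": 62, "UC": 61, "EN": 60,
             "MISCV": 59, "ADDRV": 58, "PCC": 57}


def parse_param(text):
    """Convert a value such as 00000000`be000000 to an int."""
    return int(text.replace("`", ""), 16)


def decode_mci_status(param3, param4):
    """Combine the high (param3) and low (param4) halves and decode them."""
    status = ((param3 & 0xFFFFFFFF) << 32) | (param4 & 0xFFFFFFFF)
    mca_code = status & 0xFFFF
    return {
        "status": status,
        "flags": [name for name, bit in FLAG_BITS.items() if (status >> bit) & 1],
        "mca_code": mca_code,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "meaning": SIMPLE_MCA_CODES.get(mca_code, "not in this partial table"),
    }


# Parameters 3 and 4 from the first dump posted in this thread.
decoded = decode_mci_status(parse_param("00000000`be000000"),
                            parse_param("00000000`00800400"))
print(hex(decoded["status"]), decoded["flags"], decoded["meaning"])
```

For that dump the combined status is 0xbe00000000800400: the valid, uncorrected and processor-context-corrupt flags are set, and the low 16 bits are 0x0400, which is the internal timer code.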
 

1337jesuschrist

Welp, back with some results.
After reinstalling the 8700K system, I was able to reinstall Windows.
On the first boot of Windows it essentially BSOD'd before the login screen.
When I returned to it with a restart, it idled for a bit before BSOD'ing.
I ran memtest86 for 4 passes, which took approximately 4 hours, and have the results in a generated HTML file.
Zero issues found at all. It was clear across the board.
Then I booted it and ran the Intel diagnostic tool again. Passed with flying colors.
Next I tried disabling cores one by one and then installing the NVIDIA drivers or letting it idle. Each time I disabled a core, the BSODs came faster and faster.
I then had it with one core enabled, and now the computer won't start. It will not even pass POST, and I cannot get into the BIOS. There is no prompt, display, progress or anything. The screen is completely black and the computer is just on.

Here are the minidumps that I also provided to the user on Reddit: https://share.rtechsupport.org/minidumpsDec26.zip

I'm really leaning towards CPU failure.
 

gardenman

I ran the dump files through the debugger and got the following information: https://pste.eu/p/1cRA.html

File: 122617-15468-01.dmp (Dec 26 2017 - 20:43:35)
BugCheck: [MACHINE_CHECK_EXCEPTION (9C)]
Probably caused by: ntkrnlmp.exe (Process: setup.exe)
Uptime: 0 Day(s), 0 Hour(s), 01 Min(s), and 10 Sec(s)

File: 122617-15375-01.dmp (Dec 26 2017 - 20:33:44)
BugCheck: [CLOCK_WATCHDOG_TIMEOUT (101)]
Probably caused by: Unknown_Image (Process: ShellExperienc)
Uptime: 0 Day(s), 0 Hour(s), 01 Min(s), and 04 Sec(s)

File: 122617-15296-01.dmp (Dec 26 2017 - 20:37:06)
BugCheck: [MACHINE_CHECK_EXCEPTION (9C)]
Probably caused by: dxgmms2.sys (Process: System)
Uptime: 0 Day(s), 0 Hour(s), 02 Min(s), and 54 Sec(s)

File: 122617-14750-01.dmp (Dec 26 2017 - 20:48:23)
BugCheck: [MACHINE_CHECK_EXCEPTION (9C)]
Probably caused by: ntkrnlmp.exe (Process: svchost.exe)
Uptime: 0 Day(s), 0 Hour(s), 00 Min(s), and 16 Sec(s)

File: 122617-14500-01.dmp (Dec 26 2017 - 20:45:39)
BugCheck: [MACHINE_CHECK_EXCEPTION (9C)]
Probably caused by: ntkrnlmp.exe (Process: dwm.exe)
Uptime: 0 Day(s), 0 Hour(s), 01 Min(s), and 10 Sec(s)
I can't help you with this. Wait for additional replies. Good luck.