Windows 10 BSOD but Fine in Safe Mode

jameshan2155

Prominent
Jan 20, 2018
11
0
510
Specs:
i5-6600k
GIGABYTE GA-Z170-MX 5 GAMING --> MSI Z270M Mortar
Samsung EVO 850 250GB SSD
Corsair Vengeance LPX 3200MHz 16GB (2x8GB)
WD 1TB HDD

I installed some NVIDIA GTX1070 drivers around 2-3 months ago, and a week later I began to experience BSODs, usually MACHINE_CHECK_EXCEPTION, IRQL_NOT_LESS_OR_EQUAL, and CLOCK_WATCHDOG_TIMEOUT. I initially thought it might be because of my overclock settings, so I reset my BIOS settings, but the BSODs persisted.
I then suspected a CPU, motherboard, or RAM hardware issue, so I tested my CPU on my friend's Z270 motherboard and ran VR and Prime95, which had no errors or crashes. I ran memtest86 for 10 passes, again with no errors, and I replaced my Gigabyte motherboard with the MSI motherboard shown in the specs. Still BSODs.
I decided to try running in Safe Mode, which miraculously had no BSODs for over 2 hours, during which I played an hour-long YouTube video and ran Prime95 for an hour, with no errors or crashes. This made me think that it was a driver problem, so I wiped my SSD with Samsung Magician and attempted to clean install Windows using Windows Boot Media.
Interestingly, the Windows Boot Media still had BSODs (keep in mind that at this point, the SSD was blank, and the HDD was unplugged). I tried running Windows Boot Media with both the SSD and HDD unplugged, but still had BSODs. I suspected the Boot Media might be bad, so I first ran sfc /scannow, which said there were no errors with the Boot Media. I also tried running Ubuntu's "Test on this computer" and tried installing Ubuntu (with all combinations of HDD, SSD, or none plugged in), and it still froze.
At this point, I'm completely lost on what to do next. Since the computer ran fine in Safe Mode, the issue has to be a software/driver problem, but I have no OS installed, and the problems persist even without an SSD or HDD plugged in, meaning it's a hardware problem, which makes absolutely no sense. Does anyone have any ideas?
 

Colif

Win 11 Master
Moderator
If Ubuntu and Win 10 won't install, it's not likely to be software.

Which BSOD did you get while installing: IRQ, WHEA, or MCE? (The last two are the same error, just different descriptions.)
All of the errors you got originally could be software, but it's too late to tell now that you've swapped the motherboard.

So it shouldn't be RAM (Memtest isn't perfect).
It shouldn't be CPU (you could have run the Intel Processor Diagnostic Tool on it in the other PC as well).
It's unlikely to be the motherboard (since you swapped it).

Try installing Win 10 with just 1 RAM stick and no GPU and see if it makes any difference, since you mentioned these all happened after the GPU was added. Remove power from the HDD while you do it.

WHEA = Windows Hardware Error Architecture. These errors are called by the CPU but not necessarily caused by it. They can be caused by overclocking and/or overclocking software.
Clock Watchdog Timeout is also a hardware error associated with the CPU.

IRQ errors are normally drivers.

Safe Mode doesn't test the PC out much; it uses minimal settings and default safe drivers. Hardware that fails in Windows can work in Safe Mode.
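
For reference, the stop codes mentioned in this thread can be put in a small lookup table. This is only a sketch: the hex values are the standard Windows bug check codes for these names, while the `STOP_CODES` table, the `describe` helper, and the hint wording are made up here to paraphrase the rules of thumb above. They are not a diagnosis.

```python
# Bug check (stop) codes for the errors named in this thread.
# The hint strings paraphrase the rules of thumb in the post above;
# they are assumptions for illustration, not a definitive diagnosis.
STOP_CODES = {
    "IRQL_NOT_LESS_OR_EQUAL": (0x0000000A, "normally drivers"),
    "MACHINE_CHECK_EXCEPTION": (0x0000009C, "hardware error reported by the CPU"),
    "CLOCK_WATCHDOG_TIMEOUT": (0x00000101, "hardware error associated with the CPU"),
    "WHEA_UNCORRECTABLE_ERROR": (0x00000124, "hardware error reported by the CPU"),
}


def describe(name):
    """Return a one-line summary for a stop code name (case-insensitive)."""
    key = name.upper()
    code, hint = STOP_CODES[key]
    return f"{key} (0x{code:08X}): {hint}"


if __name__ == "__main__":
    for stop_name in STOP_CODES:
        print(describe(stop_name))
```

The hints only say where to look first. As this thread shows, a "driver" code can still show up on a machine with a hardware fault.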
 

jameshan2155

Prominent

Thanks for the quick reply!
I got mostly MCE while installing, but I also got IRQ once or twice.
All my troubleshooting so far has been with no GPU attached, and I've tried every combination of RAM on my motherboard.
 

jr9

Estimable
Well it seems like a hardware issue for sure if you can't even install an OS, and you've already swapped out so many parts. Both DIMMs failing is very unlikely as well.

Do you have another power supply you can try? Also which one do you have?

Is your RAM running at 2133MHz or does it have an XMP OC?
 

jameshan2155

Prominent


I could try a different power supply, and the one I have is a Thermaltake Gold 750W.
Even still, why would the computer work in safe mode if the power supply was faulty?

I tried running the RAM at both 2133MHz and XMP OC at 3200MHz, but no difference.
 

jr9

Estimable
When the GPU has its drivers loaded and it's in use, it will draw far more power than in Safe Mode, where GPU drivers aren't even loaded. If the PSU can't supply clean power, or enough of it, to the system or GPU, you can get OS-level crashes. I've seen bad power supplies generate BSODs before. The errors you are getting really make no sense at all; they are very random.
 

jameshan2155

Prominent


What you said about the GPU drawing power makes sense, but the computer is crashing even when the GPU is not connected at all to the computer. Is there any other reason the power supply could be causing BSODs in regular Windows but not in Safe Mode?
 

jr9

Estimable
I didn't know that the GPU was disconnected when the PC crashed. Well, if it crashes in normal mode but not in Safe Mode with the GPU removed and the PC running Intel integrated graphics, I would still try another PSU just to rule it out. It's less likely to help if the GPU isn't part of the equation, but the way I see it, if you are installing the PSU yourself you don't have anything to lose except some time. I think a motherboard issue is more likely, but I still like ruling out the PSU 1st or 2nd on any PC I work on with hardware issues, because it can cause so many bizarre issues and lead you on a wild goose chase looking at other parts.

This case is odd because crashes only outside Safe Mode point directly to driver or software problems, while crashes during Windows installations, and especially Linux installations, point towards hardware problems. All you can do is eliminate things one by one.
 

jameshan2155

Prominent




I recently tested a newly bought PSU and got the same errors.
I also tried building out of the case on a cardboard box, but still got the same errors.
I also tested the installation media on a friend's computer, and it didn't crash, so the installation media is not the issue.
 

jr9

Estimable
Did you try using just the integrated graphics when it was on the cardboard box? I would definitely try running the installation media using the integrated graphics if you haven't already. Also try with just one stick of RAM, and run the RAM at the default non-XMP speed, which is generally 2133MHz.
 

jameshan2155

Prominent


I've only been using integrated graphics, and it was still crashing.
 

jr9

Estimable
With:

- Just one stick of RAM at non-XMP speeds. Reset them back to defaults.
- Any drive other than the one you are trying to reinstall Windows on disconnected

See if you can get through the Windows installation. Disconnect ethernet so it doesn't try to do any updates. Don't even download any drivers, and see if it's stable.
 

jameshan2155

Prominent

Actually, I tried using only one stick of RAM without XMP, and it still crashed.
I thought it might be a hard drive issue, so I disconnected my hard drive and my SSD, and the installer still crashes without any drives plugged in.
 

jr9

Estimable
Wow. So that rules out anything inside Windows, or Windows in general, as the cause of the issue, and both drives too. It crashes with the Linux boot USB as well :| That is dire. To state the obvious, this has to be a hardware problem then. It must be one of these:

- RAM
- CPU
- Motherboard

At this point I'd recommend a shop to do this diagnostic for you unless you have spares of these parts. I personally don't recommend buying anything else as you'd just be guessing. Sorry the PSU didn't help, I was pretty sure it was the PSU.

My last-ditch ideas are trying a full BIOS reset and checking the CPU socket for bent pins.
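
The elimination above can be written out as a tiny set intersection: whatever was present in every crashing run is still a suspect. This is only a sketch. The `RUNS` list simplifies the tests described in this thread, and the part labels (`psu_old`, `psu_new`, and so on) and the `common_suspects` helper are names made up for illustration.

```python
# Each entry: (set of parts present during the run, whether it crashed).
# These runs are a simplified, assumed summary of the tests in this thread;
# all runs are on the replacement MSI board.
RUNS = [
    ({"cpu", "ram", "motherboard", "psu_old", "ssd"}, True),  # installer, HDD unplugged
    ({"cpu", "ram", "motherboard", "psu_old"}, True),         # installer, no drives
    ({"cpu", "ram", "motherboard", "psu_new"}, True),         # after swapping the PSU
]


def common_suspects(runs):
    """Return the parts that were present in every crashing run."""
    crashing = [parts for parts, crashed in runs if crashed]
    if not crashing:
        return set()
    return set.intersection(*crashing)


if __name__ == "__main__":
    print(sorted(common_suspects(RUNS)))
```

This only narrows things down by presence. It assumes a single faulty part and would miss the case where both the original and the replacement part are bad.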
 

jameshan2155

Prominent


I actually replaced my previous motherboard because I noticed bent pins, but that didn't help.
I tested the CPU for over 2 hours on my friend's computer with VR and Prime95, and it didn't crash on his computer, so I don't think it's the CPU.
I also ran memtest86 for 10 runs, with 0 errors detected (I know memtest isn't perfect, but I've read that 99.99% of important errors can be detected by the first few runs).

After all these tests, I wouldn't think it could be CPU, RAM, or motherboard, which is strange because I've basically isolated the issue to these three components.
 

jr9

Estimable
Well, it seems absolutely impossible for it not to be one of those 3 things. Everything else was physically removed from the system and it still crashes. Do you have any idea what other hardware connected to the system could cause it, other than the GPU, HDD, SSD, OS, and PCI devices, since those were all removed from the system?

- Try a different electrical outlet
- BIOS reset
 

jameshan2155

Prominent


I've tested it both at my house and my friend's house, so the problem isn't affected by the outlet.
I also tried multiple BIOS resets as well.
Only other connected hardware is: USB mouse, USB keyboard, HDMI for display.
 

jr9

Estimable
Well, I've never seen a display or USB keyboard crash a system. That would be a first. I still think it's one of those three primary components. The only way I would be sure is by replacing each of the parts with a known working one, starting with the CPU and RAM on that system. A shop would have these. If I were in the same position, I would cast aside what I think and start going by what is impossible and what's not. It's impossible that the GPU is the problem. It's not impossible that software-level testing missed the problem or that the MSI board has issues as well.
 

jameshan2155

Prominent


Good point. I'll take it into a shop. Thanks for the help.
 
If you changed mobos, did you do a clean install?

It may not crash in Safe Mode because not a lot loads.

But if you didn't do a clean install, the drivers from the old mobo will load and probably crash the system with the new mobo if you boot into Windows normally.

 

jameshan2155

Prominent


I wiped the SSD and HDD as one of the troubleshooting steps and I am unable to install Windows at all due to the freezing with the installation media, so old drivers from Windows would not be affecting it, as far as I know.