CPU: Intel Core i5-3470 3.2GHz Quad-Core Processor
Motherboard: Gigabyte GA-H77M-D3H Micro ATX LGA1155 Motherboard
Memory: Corsair Vengeance 8GB (2 x 4GB) DDR3-1600 Memory
Storage: Western Digital Caviar Blue 500GB 3.5" 7200RPM Internal Hard Drive
Video Card: HIS Radeon HD 7870 2GB Video Card
Case: Rosewill Challenger-U3 ATX Mid Tower Case
Power Supply: Corsair Enthusiast 650W 80 PLUS Certified ATX12V / EPS12V Power Supply
OS: Windows 7 Professional 64 Bit
Long story short: I built a new computer a week ago, and it started crashing a handful of times while I was using onboard graphics. It would crash during Windows updates and under light usage like streams and Skype video chat. Other times it would stay stable for hours, letting me watch downloaded movies (they weren't HD, but still). I got a 7870, and things still didn't improve.
I get two kinds of failures: sometimes a blue screen saying uncorrectable hardware error, and other times my computer just shuts off with no message. If I do something like update my Windows Experience Index, which tests the graphics and such, it always shuts off during that. Ran memtest for 2 hours, memory came up clean. I can get around in the BIOS just fine. When I try to go through safe mode, it always blue screens after a minute.
I can't restore back to a point when it was working. It crashes early on even after I've reinstalled my OS. I don't have the error logs anymore, but one of the crashes was in hal.dll.
I ran memtest on my memory for 2 hours. I'm downloading Windows straight from the site and putting it on a USB drive to install. It crashes with or without my graphics card. I'm not sure it could be my PSU, because it has more than enough power and is not a cheap one.
Is there anything else I can troubleshoot?
The only thing I've used to "stress" test it is WEI, because it has a function that tests the CPU, the graphics, etc. It always crashes on that. But if I'm doing light usage, like just browsing the internet like I am now, I can get by for hours without crashing.
Post a screenshot of what BlueScreenView reports for the BSODs. That will help a bit, especially if various BSOD messages are being generated.
Anyway, a Machine Check Exception BSOD (0x9C) points to a HW problem, probably mobo or PSU related. Normally, 0x124 [WHEA_UNCORRECTABLE_ERROR] would be generated for a generic HW problem, but if a crash occurs before WHEA gets initialized, a MACHINE_CHECK_EXCEPTION BSOD gets thrown instead. The fact that hal.dll [the HW Abstraction Layer] is crashing further points to a HW related problem.
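If it helps when reading the BlueScreenView output, here is a rough sketch of a lookup for the two bug check codes mentioned above. The table only holds those two codes, and the function names (`parse_code`, `describe`) are made up for illustration; this is not part of any tool.

```python
# Minimal lookup for the two bug check codes discussed in this thread.
# The table is deliberately incomplete; anything else is reported as unknown.
BUG_CHECKS = {
    0x9C: "MACHINE_CHECK_EXCEPTION",
    0x124: "WHEA_UNCORRECTABLE_ERROR",
}


def parse_code(text):
    """Turn a hex string such as '0x0000009C' into an integer."""
    return int(text, 16)


def describe(code):
    """Return a one-line description for a bug check code (hypothetical helper)."""
    name = BUG_CHECKS.get(code)
    if name is None:
        return f"0x{code:X}: not in this table"
    return f"0x{code:X}: {name} (hardware-reported error)"


if __name__ == "__main__":
    for raw in ("0x0000009C", "0x00000124", "0x0000007E"):
        print(describe(parse_code(raw)))
```

Both entries in the table point toward hardware, so seeing either one is a reason to look at temperatures, voltages and the physical build before blaming drivers.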
For the most part, I treat this BSOD similarly to the WHEA BSODs when it comes to debugging. First thing to check is temperatures, to make sure nothing is overheating. Next would be the voltages that are being reported, to ensure they remain within ATX spec. After that, you start going to a barebones configuration [one RAM stick, etc] to try and reduce the number of potential HW factors [for instance, some low-end mobos are unstable when using 4 sticks of RAM with certain aggressive timings].
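For the voltage step, a quick sketch of the range check, assuming the commonly quoted ±5% tolerance on the +3.3V, +5V and +12V rails. The readings in the example are made up, and `check_rails` is just an illustrative name; you would type in whatever your BIOS or monitoring tool shows. Keep in mind software readings can be inaccurate, so a multimeter is the better check if a rail looks off.

```python
# Nominal rail voltage and tolerance (assumed +/-5% for the positive rails).
ATX_RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
}


def check_rails(readings):
    """Return {rail: (low, high, ok)} for each measured rail voltage."""
    report = {}
    for rail, measured in readings.items():
        nominal, tolerance = ATX_RAILS[rail]
        low = nominal * (1 - tolerance)
        high = nominal * (1 + tolerance)
        report[rail] = (round(low, 3), round(high, 3), low <= measured <= high)
    return report


if __name__ == "__main__":
    # Made-up readings for illustration only.
    sample = {"+3.3V": 3.31, "+5V": 5.02, "+12V": 11.2}
    for rail, (low, high, ok) in check_rails(sample).items():
        status = "ok" if ok else "OUT OF RANGE"
        print(f"{rail}: {sample[rail]} V (allowed {low} to {high} V) {status}")
```

A rail that sags out of range only under load (for example, during the WEI test) would fit the pattern of being stable while browsing and crashing under stress.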
It's solved. It was an improperly installed cpu heatsink/fan. It was running too hot, I just didn't realize the temperatures were wrong/too high because I've never monitored temperatures before. Reseated the fan and put a little more thermal paste in, and there are no more crashes.