This February I built an almost entirely new PC: new mobo, CPU, video card, and RAM. I used an Asus Radeon HD 7750 (which comes "overclocked") and had tons of trouble with DX11: not only driver crashes that recovered, but entire program lockups that required ending the task. When I switched back to DX9 most of my problems went away, with the exception of a few random driver crashes. Using the Asus utility for the card, I underclocked it and no longer had crashes.
My first thoughts were driver issues, a power supply issue (possibly underpowered), or even a problem with the Windows install (I had not used an official Windows disc to install, since I no longer had one). Long story short, I had been using a 550W power supply (with a max of 600W) from around 2005, a TTGI TT-550K04, but I had no replacement supply to test with at the time. I updated all drivers, tried downgrading the video drivers, then upgrading, then totally removing them, clearing out registry keys, and reinstalling, to no avail. So I went ahead and found where to download the actual Windows 7 install ISO, checked the hash, installed it using my product key, and reinstalled everything being super careful: install a driver, restart, rinse, repeat... and the problem still existed. I even removed everything except my SSD, video card, RAM, processor, keyboard, and mouse.
So where things stood was that with a fresh, seemingly good installation of Windows and all the updated drivers, I could not run DX11 without crashes, and had to underclock to avoid random crashes during games. I just dealt with it for a bit, because I couldn't RMA the graphics card, or honestly any parts yet, since I was in the middle of classes for college and I was not about to program on a netbook for 3 weeks.
Fast forward to now. I ordered a new video card and a new power supply. I uninstalled everything related to AMD, ATI, Radeon, etc., ripped open the PC, installed everything, did crazy awesome cable management, and booted up. Installed drivers, set everything up, and everything seemed great... for about 3 hours. Basically I had a game running, DVD Fab copying a movie, and Firefox running with YouTube, then suddenly all the screens flashed to white. They flashed back, Firefox crashed, I closed Firefox, and suddenly one screen was purple and the other white. The system just sat there hung on white and purple till I did a hard reset. Since that hard reset the system has been crashing constantly: the Nvidia kernel driver crashing, Firefox unable to load without the entire system hard crashing to a reboot... all kinds of hell. Worse yet, now randomly while the BIOS is booting the screen shoots to purple and the system reboots or hangs.
So I went and did a few things:
Ran a memory test. It ran for a good amount of time with zero errors, but then the screen went purple and it crashed.
Reinstalled Windows. Everything seemed to be running well until I installed the Nvidia drivers. Then constant crashes and hangs, sometimes unable to even finish loading Windows before the screen goes some weird color and the system reboots. Or the screen doesn't turn on.
Safe mode appears to work flawlessly, but it doesn't load the Nvidia drivers. That led me to believe I could run in normal mode without the Nvidia drivers and have no problems, but that isn't the case. I uninstalled all the drivers, and the system, while more stable, still had occasional hangs. Of course, the fact that it crashed the same way in the BIOS makes it hard to call this a driver error; it seems more likely the drivers are exacerbating a hardware problem.
Removed the new video card, reinstalled the old card, removed the old drivers completely, and installed new drivers for the Radeon. The system seemed fine, so I loaded up FurMark to try to crash it, because I'm a sadist. Everything ran well for about 5 minutes. The video card got to about 60C with the fan still only at 30 percent (why such a slow ramp-up, I don't know). I was watching the rest of the system in SpeedFan and nothing got over 44C.
Suddenly a hard crash, and then the system wouldn't boot. I was forced to unplug the power, wait, and try again, but still no boot. I then unplugged the power, flipped off the power supply, unplugged it from the mobo, and pulled the CMOS battery... put it all back together, and it kept failing to boot. The power light turns on, the fans turn on, then the power light fades away. I kept unplugging and replugging, and after about 15 minutes it turned on and started to boot. It seems to be booting and resetting far slower than normal, but other than that it's running. Now I am getting random resets.
I'm about to use the Ultimate Boot CD to run more diagnostics and tests and see if I can narrow down the problem. I'll be checking memory, hard drives, and anything else I can. I have dump files from the Nvidia card that I've attempted to read, but I don't really know enough to fully interpret them. I can provide them if needed. Below is what I believe to be the most pertinent information.
3 dumps with this information:
VIDEO_TDR_FAILURE (116)
DEFAULT_BUCKET_ID: GRAPHICS_DRIVER_TDR_FAULT
MODULE_NAME: nvlddmkm
IMAGE_NAME: nvlddmkm.sys
and 1 dump with this:
WHEA_UNCORRECTABLE_ERROR (124)
DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT
PROCESS_NAME: iexplore.exe
MODULE_NAME: hardware
IMAGE_NAME: hardware
The system has not made any dump files with the ATI card; apparently it hasn't bluescreened, just hard reset... Of course, I never saw a blue screen with the Nvidia card either, but that's what Windows recorded.
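A rough sketch of how the dump summaries above could be tallied by bug check and module, in case more dumps pile up. The function name `summarize_dumps` and the sample text are made up for illustration; the sample just repeats the fields quoted above, and this only reads pasted text, not the .dmp files themselves.

```python
import re
from collections import Counter

# A bug check line looks like "VIDEO_TDR_FAILURE (116)".
BUGCHECK_RE = re.compile(r"^([A-Z_]+) \(([0-9A-Fa-f]+)\)$")

# Sample text assembled from the fields quoted in the post (3 + 1 dumps).
TDR_ENTRY = """VIDEO_TDR_FAILURE (116)
DEFAULT_BUCKET_ID: GRAPHICS_DRIVER_TDR_FAULT
MODULE_NAME: nvlddmkm
IMAGE_NAME: nvlddmkm.sys
"""
WHEA_ENTRY = """WHEA_UNCORRECTABLE_ERROR (124)
DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT
PROCESS_NAME: iexplore.exe
MODULE_NAME: hardware
IMAGE_NAME: hardware
"""
SAMPLE = TDR_ENTRY * 3 + WHEA_ENTRY


def summarize_dumps(text):
    """Count (bug check name, module name) pairs in pasted dump summaries."""
    counts = Counter()
    current = None  # bug check name of the entry being read
    for raw in text.splitlines():
        line = raw.strip()
        match = BUGCHECK_RE.match(line)
        if match:
            current = match.group(1)
            continue
        if line.startswith("MODULE_NAME:") and current is not None:
            module = line.split(":", 1)[1].strip()
            counts[(current, module)] += 1
            current = None
    return counts


if __name__ == "__main__":
    for (bugcheck, module), n in summarize_dumps(SAMPLE).most_common():
        print(f"{n} x {bugcheck} in {module}")
```

If every dump blamed the video driver, the driver or the card would be the obvious suspect; a dump that blames "hardware" instead is part of why the driver alone doesn't explain it.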
A list of all hardware being used is below.
OCZ Agility 3 AGT3-25SAT3-60G 2.5" 60GB SATA III MLC Internal Solid State Drive (SSD) (About 1 year old this month, never had a problem with it in the old build, assuming it's good)
About a 5 year old 500GB Western Digital 7200 RPM drive, which I disconnected during the new install of windows and tests so I'll consider it a non factor in the original or ongoing problems.
ASRock Z75 Pro3 LGA 1155 Intel Z75 HDMI SATA 6Gb/s USB 3.0 ATX Intel Motherboard
Intel Pentium G860 Sandy Bridge 3.0GHz LGA 1155 65W Dual-Core Desktop Processor Intel HD Graphics BX80623G860
G.SKILL Sniper Series 8GB (2 x 4GB) 240-Pin DDR3 SDRAM DDR3 1866 (PC3 14900) Desktop Memory Model F3-14900CL9D-8GBSR
ASUS HD7750-1GD5-V2 Radeon HD 7750 1GB 128-bit GDDR5 PCI Express 3.0 x16 HDCP Ready Video Card (Original Video Card)
EVGA GeForce GTX 660 SUPERCLOCKED 2048MB GDDR5 DVI HDMI DP Graphics Card 02G-P4-2662-KR (New Video Card)
OCZ ZT Series 750W Fully-Modular 80PLUS Bronze High Performance Power Supply compatible with Intel Sandy Bridge Core i3 i5 i7 and AMD Phenom (New Power Supply)
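To put the "possibly underpowered" worry in rough numbers, here is a back-of-the-envelope sketch. Only the 65W CPU figure and the PSU ratings come from the list above; the GPU wattages and the allowance for the rest of the system are assumed ballpark values (factory-overclocked cards can draw more), and the helper names are made up.

```python
# Back-of-the-envelope load estimate. Figures marked "assumed" are NOT from
# the parts list; they are ballpark values for illustration only.
CPU_TDP_W = 65                 # from the Pentium G860 listing
GPU_TDP_W = {
    "HD 7750": 55,             # assumed approximate board power
    "GTX 660": 140,            # assumed approximate board power
}
REST_OF_SYSTEM_W = 75          # assumed: motherboard, RAM, drives, fans


def estimated_load_w(gpu):
    """Very rough peak draw in watts for the build with the given GPU."""
    return CPU_TDP_W + GPU_TDP_W[gpu] + REST_OF_SYSTEM_W


def load_fraction(psu_rated_w, gpu):
    """Estimated load as a fraction of the PSU's rated wattage."""
    return estimated_load_w(gpu) / psu_rated_w


if __name__ == "__main__":
    for gpu, psu_w in (("HD 7750", 550), ("GTX 660", 750)):
        load = estimated_load_w(gpu)
        pct = 100 * load_fraction(psu_w, gpu)
        print(f"{gpu} on {psu_w}W PSU: ~{load}W, about {pct:.0f}% of rating")
```

Under these assumptions neither supply looks undersized on paper, but a rated wattage says nothing about what an aging or faulty unit actually delivers, so this doesn't rule the PSU out.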
Any advice would be greatly appreciated. I'm about to whip out my multimeter and start checking leads at this point. In retrospect I wish I had done a stress burn-in the day I bought everything; perhaps I would have found a problem then. Never again will I skip that.
Thanks everyone.
Edit:
I meant to add that I'm thinking the problem is less likely the video card at this point and more likely the motherboard, the PSU, or both. That said, I find it highly unlikely that the new PSU would be bad on arrival, and I had stability problems with the old PSU as well. The only things that have been constant are the mobo, processor, and RAM. If the RAM continues to test good, I'm thinking it's gotta be the mobo. I know the processor is almost never the culprit.
I have also gone ahead and had replacements shipped for the new video card and PSU (I ordered them through Amazon, and it's free since I have Prime), so on Friday I will at least be able to try swapping things in if I don't find anything else wrong. If it's a problem with the mobo, though, I'm hesitant to add a video card to it, because perhaps that would fry it. I know it wouldn't cost me anything to replace again, but if I start frying parts from lack of due diligence, that's definitely not a financial burden I want someone else to have to carry.