Is it my GPU, PSU or motherboard that is failing?

hdmackay

Reputable
Jan 18, 2016
12
0
4,510
Hi guys,

A couple of months ago my computer started occasionally doing this thing where both of my monitors went a blank colour with lines across them and the only thing I could do was restart, then everything would be back to normal. I didn't think much of it since it happened pretty rarely.

This is a picture of the most recent failure (which was actually minutes after playing a game and exiting out of it):
FjgNyG5.jpg


However, it's starting to happen a lot now (mainly when I play games). When the failure occurs, 50% of the time I can still hear my music and everything fine, I just can't do anything else with the PC. The other 50% of the time the sound glitches too and just creates a continuous hum.

My PC spec is as follows (bearing in mind I am not overclocking at all):

Motherboard - Gigabyte GA-H97-D3H
GPU - MSI AMD R9 280 (3GB, GDDR5)
CPU - Intel i5 4690 Quad Core
PSU - Corsair Builder Series CXM 600W Modular 80 PLUS Bronze Certified
Memory - Corsair CMZ8GX3M2A1600C9 Vengeance 8GB (2x4GB) DDR3

Now, I think the chances are that it is my GPU that is failing. However, I want to be 100% sure before forking out for a new one in case it ends up being the PSU or the motherboard. My whole PC is only 18 months old.

My case has plenty of space and decent airflow with 3 case fans. I fully cleaned out my case, took my GPU out, plugged both my monitors into my motherboard and used the Intel HD integrated graphics for a while. The problem never happened. This points towards the GPU being faulty, but then it occurred to me that the PSU is under less demand without the GPU plugged in, so I'm thinking there is still a chance that it is the PSU.

I ran a CPU torture test and everything was fine, then I ran a GPU stress test and the failure occurred.

I ran MSI Afterburner whilst playing a game and this is what the numbers were:

GPU Max. Temp. - 45 °C
GPU fan tachometer - 944 RPM
GPU Core Clock - 972 MHz
GPU Memory Clock - 1250 MHz
CPU Max. Temperature - 51 °C

These temperatures are all fine so it obviously isn't overheating.

Am I reading too much into this, or is there a way I can fully make sure that it isn't my PSU or motherboard that is causing the problem?

Sorry for the wall of text and cheers in advance for any help.
 
I think your temps, especially your GPU temps, are very low for a stress test. You should be getting up to the 60s or so. If your system is not correctly sensing the temperature, it might not be ramping up your fan when it needs to. You might try manually setting your GPU fan to run on high and repeating some tests to see if that helps.
It is also possible that your PSU is not keeping up. Voltage irregularities can cause the sort of GPU failures you are experiencing. You might be able to observe PSU issues by watching voltages using Speccy or a similar utility.
 

hdmackay



My bad, I should have said: the GPU temperatures were measured when I was playing a game. I didn't run Afterburner when I stress tested.
 

hdmackay



I checked in my BIOS and the voltages were all fine. I had pretty much decided it was my PSU at fault but now I'm back to square one.

I ran a stress test for half an hour last night (gpu was at 85 degrees) and it never failed.

Any ideas?

 
It could be that your issue was caused by a driver. I believe there was a version of AMD's drivers not too long ago that caused overheating. I think the issue has since been fixed, but you should make sure you use the latest drivers and/or check to see if the problem still exists with the drivers you used before you noticed the problem.
 

hdmackay



I ran a CPU torture test and everything was fine. I ran the Furmark GPU stress test (at a separate time).

 

hdmackay

UPDATE: I bought a brand new Platinum PSU and added it to my PC and the problem is still there. This is very disappointing, and now I have legitimately no idea what to do.

Is it possible it could be an OS problem? Or maybe a hard drive fault?
 

hdmackay



I recently did the free upgrade to Windows 10. However the problem started before that when I was on Windows 8.1 Pro.

I'm not entirely sure what you mean by dump file, but I found this in Windows Event Viewer:
ruKPN1D.png


Although I don't think that's what you were asking about? That is obviously just getting logged when I press the reset button on my case.
 


I doubt this very much. You are experiencing pretty clear signs of GPU failure. Your new PSU has pretty much ruled out your GPU failing due to power issues. It still could be a GPU driver issue, but the problem is almost certainly with your GPU. Have you tried using different drivers and/or manually changing your GPU fan settings?
 

hdmackay



Well, the problem started when I hadn't updated my GPU drivers in quite a long time. I then installed the newest ones and it still happened. Just earlier I did a completely clean uninstall of all drivers and reinstalled the newest ones again and it still happened.

I thought it was a GPU problem, but the fact that I ran a half-hour FurMark test without it failing makes me think it is OK. The recorded temperatures were as high as 85 during that time and it never crashed. Yet it can crash when under barely any load.


I appreciate all the help you guys are giving me by the way. :)
 

hdmackay



Will it create a dump file for this type of error? It's not actually a BSOD I'm getting, just a blank screen (usually white).
 

hdmackay

Okay guys, I thought I had tried this already but apparently I had not. My motherboard has a second PCI-E slot so I've tried my GPU in there. So far it hasn't crashed. It's only been a few hours so it might be too early to say, but it would usually have happened by now.

I must have forgotten to do this as it was one of the first things I thought of doing.

I'll leave it a few days to see if the problem truly has gone away, and if so, I guess I need a new motherboard. The slot that it is in currently is way too close to all the SATA/PSU cables.

A faulty PCI-E slot, who would've known!