HELP!!! A slight Voltage spike causes MASSIVE GPU OVERHEAT

kolokol220

Prominent
Jan 23, 2018
20
0
510
[strike]Today I installed a new PCIe sound card in my PC. Right after I finished installing the sound card driver, my GPU suddenly overheated and shut down. Since then, it takes about 10 seconds after starting the PC for the GPU to reach critical temperature and shut down. MSI Afterburner shows that while the GPU is idle and running at 200-300 MHz, the temperature stays at around 45-50 °C. As soon as some process wakes the card from idle and the clock goes back to 1250 MHz, the temperature jumps 10-15 degrees in an instant, then keeps climbing at about 1 °C per second until it reaches the critical 95 °C and the card shuts down.

I have tried completely reinstalling the video drivers, but that didn't solve anything. I double-checked whether my PSU can feed my rig with the addition of a new "mouth" in the form of the sound card, and the PSU is definitely not the cause.
I opened up my PC and took the GPU out. I cleaned the dust out of the cooler and oiled the fans (even though they were spinning fine). The cooling system is working absolutely fine, and a problem couldn't have suddenly appeared in it, so I have to look somewhere else.[/strike]

[strike]I really don't know what to do and am in desperate need of helpful advice.[/strike]

[strike]My RIG: [/strike]
GTX 970 MSI 4G Gaming - GPU
[strike]Creative Sound Blaster Z - new sound card[/strike]


Update: It has become clear that the temperature instability begins once the voltage crosses some level between 1000 mV and 1137 mV. That is not a particularly high voltage; I've heard this card is supposed to run smoothly even at 1250 mV and above. Besides, this anomaly appeared very suddenly just today, seemingly without any reason, although chronologically it did happen right after I installed drivers for the newly installed sound card. I could try modifying the GPU BIOS and locking the voltage at a safe point, but the card really shouldn't behave like this, and I would like to keep my voltages unlocked if there is another way to fix this problem, so please give me your suggestions.
 

kolokol220



Ah, yes, I forgot to give my rig info. I'll update now.
 
Backtrack. Remove the new sound card (and uninstall its drivers) to see if the problem goes away, since it could be a coincidence that the problem appeared at the same time as the new hardware. If it does go away, then there is some kind of conflict between the two cards. But if the problem persists, then it is most likely a hardware problem with the GPU (a failed temp sensor, perhaps?).
 

kolokol220



Ok, i'll try that now

Upd.: OK, I removed the sound card from the PCIe slot, uninstalled the driver, and rebooted the PC, and the problem was still present.
 

kolokol220



My sound card is at the bottom of the case and there are more than 10 cm between it and the GPU, so it can't be that.
 


In general, check the airflow and make sure the GPU has enough power to spin the fans, or whether they are spinning at all.
I would say you either disconnected the additional GPU power cable by mistake, overloaded the PSU, or blocked the airflow, either with the card or with stray hair.

 


So it is a problem with the GPU itself. Can you test it in another machine?
 

kolokol220



I've already said that the coolers are working fine and there is no way the sound card could somehow block the airflow. The power cables are all connected properly and in the right places.

 

kolokol220



sadly, no, i can't test it in another pc
 

kolokol220



Different PCIe x16 slot then?

I only have one PCIe 3.0 x16 slot; the other one is 2.0, and it's kind of pointless to put the GPU in it.
 

kolokol220



It's not a fan issue! The fans are set to always run at 100% speed; besides, my case is open, so I can see and hear perfectly well how they spin.

Just look at how FAST the overheating happens: it is DEFINITELY NOT an airflow issue. If it were, it would take at least somewhat more than A FEW SECONDS to reach almost 100 °C!!!
 

RobCrezz

Expert
Ambassador
I would say scan for malware, as that can hammer the GPU and cause the temperature to go up. But the fans should increase speed to combat the heat, as long as the fan curve hasn't been modified in Afterburner.

If you are sure it's not a software issue, then replace the thermal paste on the GPU.
 

RobCrezz



I didn't get that from what you said...

Then how is it going to be a software issue, if it's getting hot while the fans are maxed out?

Replace the thermal paste on the GPU if you think it's not transferring heat correctly.
 

kolokol220



I don't think thermal paste has anything to do with it, because right up until I installed the driver for the new device, the GPU worked perfectly. I mean, thermal paste doesn't go bad in an instant! I would have noticed a gradual decrease in cooling efficiency over time, because I often check the graphs. It seems to me as if the GPU started drawing much more power than it is supposed to, but it's not like the PSU is damaged or can't feed my rig.

And I've already run a malware scan and it didn't detect anything.
 


That is a very good thought. You should check the voltages on the GPU, because voltage is the most likely cause of such sudden temperature increases (aside from a failed temp sensor).
 

kolokol220



Okay, I did notice a spike. When the card is idle (135 MHz GPU clock, 324 MHz memory clock), the voltage is a constant 843 mV. I gave the card a few seconds of load (1250 MHz GPU clock, 3750 MHz memory clock) and the voltage jumped to 1137 mV; at the same instant the temperature jumped from 48 to 62 °C. After a few more tests it became perfectly clear that the voltage spike is connected to the temperature spike. If the voltage stays at the 1137 mV it jumps to, the temperature rushes to maximum extremely fast, and just lowering the voltage back to 843 mV tames it and the card cools down in seconds.
P.S. To me it doesn't seem like this voltage should be too high for this card.
 


Correct, those voltages are perfectly fine.
 

kolokol220



The temperature actually starts dropping before the voltage gets back down to 843 mV. Sadly, the refresh rate of the real-time monitoring is way too slow, but I did notice that at 1000 mV, at least, the card is already cooling down. Perhaps some GPU BIOS setting got corrupted, and what I should do is find the critical voltage and lock the card there in the BIOS?
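
For anyone who wants to narrow that critical point down from a monitoring log instead of watching the graph, here is a minimal sketch. It assumes the readings can be exported as (seconds, core voltage in mV, temperature in °C) samples; the function name `bracket_critical_voltage`, the 0.5 °C/s cutoff, and the sample values are all made up for illustration and are not taken from a real log.

```python
from typing import List, Optional, Tuple

# One reading: (time in seconds, core voltage in mV, temperature in °C)
Sample = Tuple[float, float, float]


def bracket_critical_voltage(
    samples: List[Sample], min_rate: float = 0.5
) -> Tuple[Optional[float], Optional[float]]:
    """Bracket the voltage at which the card starts heating up.

    Returns (highest voltage at which the temperature held or dropped,
             lowest voltage at which it rose by at least min_rate °C/s).
    Either value is None if no such interval was logged.
    """
    cooling: List[float] = []
    heating: List[float] = []
    for (t0, v0, c0), (t1, v1, c1) in zip(samples, samples[1:]):
        dt = t1 - t0
        # Skip bad timestamps and intervals where the voltage changed,
        # since the temperature change there cannot be pinned on one level.
        if dt <= 0 or v0 != v1:
            continue
        rate = (c1 - c0) / dt
        (heating if rate >= min_rate else cooling).append(v0)
    return (max(cooling) if cooling else None,
            min(heating) if heating else None)


# Hypothetical log loosely shaped like the behaviour described in this thread.
log = [
    (0, 843, 48), (1, 843, 48),                  # idle, stable
    (2, 1137, 62), (3, 1137, 63), (4, 1137, 64),  # load, climbing
    (5, 1000, 63), (6, 1000, 61),                # backing off, cooling
    (7, 843, 55), (8, 843, 50),                  # idle again
]

low, high = bracket_critical_voltage(log)
print(f"critical voltage is somewhere between {low} and {high} mV")
```

With finer voltage steps in the log, the two numbers should move closer together, which would give a safer value to lock than a guess.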