My GPUs are running extremely hot

Deralinoux

Distinguished
Jul 8, 2014
49
0
18,530
Hey guys !

I'm currently running 2 Sapphire Tri-X R9 290X non-overclocked GPUs in Crossfire in a medium-sized ATX case, a Cooler Master Storm Enforcer. Now I know those babies, even the triple-fan model that I have, have a tendency to run slightly hotter than usual, especially when 2 such monsters are put so close to one another, but mine are going mental.

So I took my recently built computer apart and put it back together a few days ago. I didn't change any parts, but I routed the cables much more efficiently and removed a useless HDD cage to improve overall airflow. I also somehow bent 3 of the pins in my MB's CPU socket during the cleaning/checking process and had to straighten them (fixed 2 but broke the last one :/). Fortunately, there didn't appear to be any issue because of it. I'm now using the computer for pretty much the first time since, and I've noticed a huge change in the way it's running. My in-game performance is apparently better than it was before (not quite certain, but I think so at least), but in return the GPUs are getting super hot...

Right now, I've been playing Batman: Arkham City at 4K resolution with max graphics (aside, obviously, from AA and DirectX 11 tessellation). Before, it ran at around 50 fps with Vsync on, but now it's a rock-solid 60 (aside maybe from the cutscenes or the very demanding action/panorama scenes; I've only checked with the built-in benchmark in the options menu). WoW, as far as I can tell, is the same: almost everything maxed out in 4K (aside from AA, some AF, and a -very- slightly lower view distance), and it now runs between 60 and 120 fps without Vsync and at a solid 55 with it (a 60Hz monitor actually feels like a drawback when you have to lock your framerate to avoid tearing :c )

Let me tell you about the heat now. I don't actually know how hot it ran before. The built-in fan curve seemed to favor silence over cooling, from what I know, so it did run fairly hot when playing stuff like Batman: AC for a long time, but I don't recall the fans ever having to go full speed, nor the case getting noticeably hot, though overall it certainly was working pretty hard.

Right now, however, when idling on the desktop, my first GPU hovers at around 50°C according to both MSI Afterburner and GPU-Z. After I started playing Batman: AC, it quickly jumped to 75-80°C and the second one to 65-70°C. During the 1.5 to 2 hours I played, both temperatures kept slowly but surely climbing, with slight drops when I paused and Alt-Tabbed out of the game to check AB. Even with the fans running at max thanks to my custom curve, the first one started going over 90°C (highest so far was 93) and the second one over 80°C (highest 84), and I assume they would have climbed even higher... It's now been about half an hour since I Alt-Tabbed to write this and check other stuff, and they've come down to hover at a constant 80 and 73°C respectively (fans still at max and no load on them).

To be honest, performance is so good that I personally really don't mind the loud noise of these 6 fans running so fast. But while it's still barely manageable in Batman: AC (I don't mind pausing for a while every 1.5 to 2 hours), there are more demanding games that I haven't run yet and still plan to play, like Crysis 3, Metro: Last Light, and all the sweet-looking upcoming next-gen stuff, and even now I'm worried that the heat could end up doing damage... It really just feels like I've somehow removed a bottleneck and finally freed my GPUs, so they're now running at their best, but with my case and only stock fans that raw power is just too much to keep temperatures low... So kind of a good thing really, but I'm still worried about the drawbacks :p

Now there are many questions to be asked and hopefully answered: How do you think that change happened? What caused it? Could it be the thing with the pins (really no idea how that could ever be)? Is it really possible that, non-OC'd, they'd be running that hot even when idle, or is it some kind of malfunction? But, most importantly, what can I do about it? I've already raised the fan curve a lot in AB, so the fans run at max from 80°C upwards. I could also speed up my 2 case fans, but that may end up raising the noise level to an uncomfortable point... Since my case isn't that big and I'm running dual GPUs, most existing closed-loop water-cooling systems would probably not work (need confirmation on that though; if you have suggestions, feel free). Maybe a custom one would, but it would probably be both more of a hassle and more expensive, and I'm still not sure I could fit it in (I have a Corsair H60 for my CPU, and the combined width of the radiator and fan makes them barely fit next to the MB...)

Sorry for the long read, and thanks a lot in advance for any help and suggestions!
 
Solution
You really need to track down exactly what that broken CPU pin does. If it is a temperature sensor pin, and the default temperature reading with the pin broken is zero, it might be that the CPU thinks it has plenty of thermal headroom to play with, so there is no temperature throttling and it is running full blast. That would account for the better game response you mentioned. And, I suppose, the GPUs are running hotter because they are trying to keep up. All of that is just a guess without knowing exactly what that broken pin does. You might find a technical manual for the CPU online.

The best solution would be to replace the CPU. It seems silly to me to add a water cooling system to a broken CPU. IMO, if the problem is as described above, the CPU isn't going to last long with or without the water cooling system, because for however long it lasts it will be running full throttle.

In the meantime, you might try running it with one or both side covers removed to get more air through the box. If the temperatures come down with the sides off, you probably need a bigger case as well. The downside of waiting is that heat damage is permanent, so you risk frying your GPUs and possibly the RAM, PSU and mobo as well.

 

Deralinoux



Well, I misclicked and chose your reply as best solution, sorry :/ But thanks for your tips!

So, the thing is, the broken/bent pins were actually on the MB's CPU socket, not the CPU itself, which means that I should actually be trying to find a new LGA 2011 MB, and those are not as easy to find as CPUs around here :p

Also, your idea about the temperature sensor is quite interesting, and I will have to run some tests on the CPU as well to check it out. But I'm already pretty sure that the CPU wasn't really putting out that much heat, only the GPUs were, hence my doubts as to whether these 2 things (the broken pin and the change in my GPUs' performance) are linked...

Just in case, are the temperatures my GPUs reached potentially damaging, in your opinion? And how hot can my i7-4820K safely get?

Well, either way, thanks a lot for your time and help!
 

Deralinoux

Well, here's an update at least:

My CPU appears to be doing fine, really... It idles under 40°C and stayed in the high 60s under load (and didn't appear to be working too hard).

So, it probably isn't what you thought, right?