This is a long post but should be an easy read. I have a lot at stake, and enormously appreciate your taking time to help me!
I recently obtained an HP z400 computer. It has an HP-customized version of an ATX system, a quad-core, 8-thread Xeon processor, 8GB of DDR3 RAM and, well, that’s about it.
I’m moving to Adobe Premiere Pro CS5 for video production and really want to take advantage of the CUDA and Mercury Playback Engine benefits of using it with a Premiere-supported CUDA GPU. I went with the new GTX 470 after the Adobe gurus unanimously said it is both the best card and the best value for the money.
But the GPU needs more power than the computer’s power supply can deliver, and the computer has a custom power system that doesn’t lend itself to being replaced by a higher-output ATX PSU.
So, I opted to keep a separate power supply outside the machine, in back, to power the GPU.
I installed Windows 7, and I installed the drivers for the GPU from the provided CD (just the drivers).
I was awaiting two extensions I had ordered for the PCI-Express power cords for the GPU. Their plugs would fit through an empty PCI opening. In the meantime, I kept the side panel off the machine so I could run PCI-Express power cables off the second power supply sitting on the floor behind the machine.
I left the machine on the whole time for two or three days, with the door off.
The default power management settings had the computer going into a “user locked” mode or something after about ten or fifteen minutes of being idle. After a little while longer the screen would go black and the machine would go to sleep – and the power button on the front would start blinking. Hitting a key or moving the mouse would bring the screen back on. All proper behavior so far.
For two nights I played around with Windows 7 a little bit to get familiar with it.
When I came into the office on the third day, though, the computer was frozen. I couldn’t get the screen to come back on when hitting keys or moving the mouse. The power light on the front of the machine was solid, as it is in normal mode. I thought maybe it had gone to sleep, and the HP techs agreed that in that case the power button would be blinking and that hitting it once would wake the machine up. Hitting it didn’t do anything.
I checked to make sure everything inside the machine was running. Power light on the system board. Fan on the external power supply. Fan on the case. Fan on the CPU. Fan on the GPU, which required I feel the underside of it since the machine was standing up in the normal position. All were running fine.
Then, I reached under the GPU again to touch the other half in case there was a fan there also – and I nearly burned my fingers! I mean it was scorching hot! It was so hot I was worried it could melt!
To test it to see if that was the problem with the machine being frozen, I placed a large 12” fan in front of it and ran it awhile to see if the video would come back. The GPU certainly cooled down dramatically, to the point of just being slightly warm, but the display didn’t come back and the machine was still frozen.
So, I restarted the machine, and Windows came up fine.
I removed the 12” fan and, with the side panel still removed, I kept the machine on for another twelve hours or so, doing a little work on it here and there – periodically waking it and taking it out of lock. The entire time – all twelve hours – the GPU stayed just mildly warm. Barely enough to even notice.
What should I make of this? I’m very perplexed, and so are my couple of tech advisors.
They explained to me that there’s far less cooling inside the box when the side panel is off, but this is an air-conditioned office and this GPU has a fan in it and even an exhaust port at the second PCI opening it uses – and the machine wasn’t doing any work at all overnight, not even showing a static display.
I realized later that it should have kicked into sleep mode within about 15-20 minutes of when I had left the office the previous night. But the power light was solid… and the machine was frozen solid.
Do I need to put a special cooling device into the machine for the GPU? (Maybe it got hot when I was using it before I left the office the night before and froze right after I left, before going into sleep mode?)
Should I assume there’s something incompatible between the GPU and the motherboard because of something HP customized on it? (The HP techs can’t think of any reason that should be the case, but my tech gurus say they wouldn’t be surprised at all.)
Should I expect it to stay cool enough once the side panel is replaced?
Should I do as my main tech guru suggests and just go build a machine made to handle this GPU, with a large power supply and case loaded with fans?
That option is appealing, but we’re talking thousands of dollars and lots of time for what I thought was finally completed and ready to rock – until this heat mystery arose.
I need to figure this out and get a machine working so I can get back to working.
If that GTX 470 was running at very high temperatures for two days, you could have damaged the chip. Fortunately, I would expect it to throttle its clock speeds if it reaches 105°C.
Those GTX 400 Fermi chips run very hot, just so you know, but if it was burning hot then you might have cause for concern. Try putting the side panel back on and see if it runs a bit cooler. Try checking the temperature using GPU-Z.
You might need to underclock it a bit. There are custom GPU cooling devices, mainly waterblocks for watercooling, but I'm not sure about air cooling. Otherwise you could try turning the fan speed up.
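If you would rather keep a running record than watch GPU-Z by hand, here is a rough sketch of a temperature logger in Python. It assumes the `nvidia-smi` tool that ships with the NVIDIA driver is on your PATH and that it reports a temperature for this card, which I have not verified for the GTX 470. The function names, the log file name and the 100°C warning threshold are all made up for illustration. Because each reading is appended to a file, the last line written before a freeze would show how hot the card was just before the lock-up.

```python
import subprocess
import time

# Assumption: nvidia-smi is installed with the driver and reachable on PATH.
NVIDIA_SMI_CMD = [
    "nvidia-smi",
    "--query-gpu=temperature.gpu",
    "--format=csv,noheader,nounits",
]


def parse_temps(output):
    """Convert nvidia-smi output (one reading per line, one line per GPU) to ints."""
    temps = []
    for line in output.splitlines():
        line = line.strip()
        if line.isdigit():  # skips blank lines and non-numeric values
            temps.append(int(line))
    return temps


def read_gpu_temps():
    """Run nvidia-smi once and return the temperatures in degrees Celsius."""
    result = subprocess.run(
        NVIDIA_SMI_CMD, capture_output=True, text=True, check=True
    )
    return parse_temps(result.stdout)


def too_hot(temps, limit_c=100):
    """True if any reading is at or above limit_c (an arbitrary warning threshold)."""
    return any(t >= limit_c for t in temps)


def watch(interval_s=30, limit_c=100, log_path="gpu_temps.log"):
    """Append a timestamped reading to log_path every interval_s seconds."""
    while True:
        temps = read_gpu_temps()
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        flag = " WARNING" if too_hot(temps, limit_c) else ""
        # Reopen the file on every pass so each line is flushed to disk
        # and survives a hard freeze.
        with open(log_path, "a") as log:
            log.write(f"{stamp} {temps}{flag}\n")
        time.sleep(interval_s)
```

To use it, call `watch()` from a Python prompt and leave it running overnight, then check the end of `gpu_temps.log` the next morning.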
I've done more testing, and spoken to HP tech support further.
I've decided it has nothing to do with cooling. I was using it fine for two days, and I've used it fine for 15 hours now since rebooting, and it's remained cool the whole time. And before it locked up and got hot it was idle; my testing since has involved full-screen videos and still it has remained cool.
Something caused it to lock up, which in turn is what caused it to get so hot. I'm sure of that now.
So, my question is whether the cause was one of two things. I’m 99% convinced it was one of these two:
A. Bad GPU.
B. Incompatibility with the HP machine.
It's hard to say which way to go. The GPU, and system with GPU in it for that matter, have apparently worked fine for a few days (albeit with minimal work) except for that one lock-up (which was within 3 days of setting up the system).
One interesting piece of information I discovered is that the sound is not working on the machine, and that in Device Manager under "Sound, video and game controllers" the on-board sound is not appearing like it should, but what is showing up are four instances of "NVIDIA High Definition Audio" -- which has the HP tech baffled.
So, A or B? Bad GPU, or incompatibility with HP's little z400 workstation system?
What do you think?
(And if we decide it’s not the GPU, what’s the chance I hurt the GPU with that period of extreme heat? It “seems” to be working fine since, but with minimal demand on it as yet.)