Random Freezes Under Heavy GPU Load

willpett

Aug 30, 2012
Hello everyone, this is my first time here.

I am having random freezes which seem to occur primarily when the GPU is under heavy load. The freezes are hard lockups that take down the entire system, mouse, keyboard, and all. The only solution is a reboot. I usually cannot go more than 24 hours without a lockup while under 100% GPU load. Despite my suspicion that the GPU is involved, the freezes also occasionally occur when simply surfing the web or doing other tasks that are not very GPU-intensive. It even froze once while in the BIOS.

My system specs are as follows:

GPU: Nvidia GTX 690
MOBO: ASRock X79 Extreme4
CPU: Intel Core i7-3930K
RAM: Corsair Vengeance 8GB CMZ8GX3M4X1600C9
PSU: Rosewill HIVE-750 750W
SSD: Crucial M4 64GB
HDD: Maxtor 7Y250M0
Heatsink: CM Hyper 212+
Case: Rosewill FUTURE Mid-tower
OS: Ubuntu 12.04

I have made absolutely sure that my BIOS settings reflect the ratings for my RAM, and I was able to run memtest86 for 24 hours without errors.

I have updated my BIOS, SSD firmware, and NVIDIA drivers all to the latest versions (2.10, 000F, and 304.37 respectively), which at first seemed to decrease the frequency of the freezes.

I have monitored my voltages, and the 12V rail drops from 12.35V while idle to 12.24V under heavy GPU load, which seems fine.
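To put those two readings in context, here is a minimal Python sketch of the arithmetic. It assumes the commonly cited ATX tolerance of +/-5% on the 12V rail (11.40V to 12.60V); the function name and the dictionary keys are made up for illustration. Note that it only judges the steady-state readings a software monitor reports, not brief transients.

```python
# Assumption: nominal 12 V rail with the commonly cited ATX +/-5% tolerance.
NOMINAL_V = 12.0
TOLERANCE = 0.05

def rail_report(idle_v, load_v, nominal=NOMINAL_V, tol=TOLERANCE):
    """Summarize the droop between an idle and a loaded rail reading."""
    low, high = nominal * (1 - tol), nominal * (1 + tol)
    droop = idle_v - load_v
    return {
        "droop_v": round(droop, 3),                   # absolute drop in volts
        "droop_pct": round(100 * droop / idle_v, 2),  # drop relative to idle
        "within_spec": low <= idle_v <= high and low <= load_v <= high,
    }

# Readings taken from the post: 12.35 V idle, 12.24 V under GPU load.
report = rail_report(12.35, 12.24)
print(report)
```

With the readings above, the droop is about 0.11V, a little under 1% of the idle value, and both readings sit inside the assumed window.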

I can run Prime95 on all 6 cores for at least 24 hours without any errors, CPU temp staying below 64C.

The consistent freezes come mainly when I put the GPU directly under heavy load, for example with cuda_memtest or lengthy CUDA computations. In those cases it freezes within 12-24 hours (often much sooner).

I have monitored the GPU temps and they never go above 72C, which should be reasonable, yes?

I should also add that there is no helpful information pertaining to any errors at the time of freeze in any system or kernel logs.
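Since a hard freeze can lose anything still sitting in a write buffer, one option is to log sensor readings yourself and force every line to disk as it is written. The Python sketch below is only an illustration of that idea: `log_samples` and `fake_sensor` are hypothetical names, and `fake_sensor` is a stand-in that would need to be replaced with a real temperature or voltage query for your setup. Even with `fsync`, the very last sample before a lockup may not make it to disk.

```python
import os
import tempfile
import time

def log_samples(read_sample, path, count, interval_s=1.0):
    """Append timestamped readings, committing each line before continuing."""
    with open(path, "a") as f:
        for _ in range(count):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            f.write(f"{stamp} {read_sample()}\n")
            f.flush()             # push Python's buffer to the OS
            os.fsync(f.fileno())  # ask the OS to commit it to the disk
            time.sleep(interval_s)

def fake_sensor():
    # Hypothetical stand-in; replace with a real sensor query.
    return "gpu_temp_c=72"

# Short demo run that writes three samples to a file in the temp directory.
demo_path = os.path.join(tempfile.gettempdir(), "sensor_demo.log")
log_samples(fake_sensor, demo_path, count=3, interval_s=0.1)
```

After a freeze and reboot, the last lines of the file show the readings from shortly before the lockup.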

One thing that might be important is I have a mid-tower case which is quite compact and the cabling situation is kind of messy inside, with the 5V power cable hanging right in front of the GPU fans, and fan cables dangling everywhere.

I am wondering if it is a power/heat issue? If so, maybe getting a water block for the GPU is in order? I am just wondering if there is something else to try before spending $$.

Thanks!
 
First off, let me commend you on an excellent and organized post.

Take the side of the case off for a few days and point a household fan into it. This will help determine if it's heat related.

You always have to consider the PSU when the causes seem mysterious. The HIVE-750 should be adequate, but remember that there can be power problems that neither the motherboard's sensors nor a DMM would pick up.