Hi everyone. I'm new to the forums but I've visited the site some in the past. Hopefully someone can help me out with this.
I've been running my homebuilt machine for about three months now, and occasionally I get weird graphics crashes. It's hard to explain - sort of like parts of the image are overwritten by an uneven checkerboard, and everything in the checkerboard gets turned shades of pink. It flickers for a second, then my monitor (Acer: running at 1920 x 1200) says 'no signal' and I have to reboot.
I think (and hope) it's an overheating issue, and not a GFX memory failure or something. I'm running an ASRock K10N780SLiX3 - Wifi motherboard, which has two PCIe x16 slots side by side and an x8 slot further over. I have the two cards (EVGA 9800 GTX+'s) in the x16 slots, which means the top one's fan is right up against the back face of the bottom one. Usually the crashes happen when I'm gaming, but not always. It's sometimes at weird times, like as a level loads or even as the game is starting and I skip the opening cutscene. Sometimes it's out of nowhere when I'm playing.
Anyways, the MB supports three-way SLi, and so it came with a triple SLi bridge as well as a regular double one. My question is:
Can I put one of the cards (the secondary one - that is the one my monitor ISN'T plugged into) in the x8 slot, so they have a gap between them, and then use the triple SLi bridge to link them? And if so, what kind of performance hit would I expect to see when running games in SLi mode? My current highest-end game gfx-wise is Mirror's Edge, with 4x anti-aliasing and PhysX on.
I too have the same motherboard and run two Inno3D GTX 260s in SLI. I have noticed that the top card gets VERY hot because its fan is right up against the other card. I fixed this by downloading nTune and setting the top card's fan to 100%. Worked for me. If you put one of the cards in the 3rd slot, it will reduce your performance a fair bit, as the 3rd slot is only an x8 slot instead of an x16.
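For a rough sense of what x8 versus x16 means on paper, here is a small sketch of the theoretical link bandwidth. It assumes the slots are PCIe 2.0 at roughly 500 MB/s per lane per direction; the thread does not state the PCIe generation, so treat that as a guess. The function name is made up for illustration, and the numbers say nothing about actual frame rates, which depend on the game.

```python
# Theoretical per-direction PCIe link bandwidth by lane count.
# Assumption: commonly quoted usable per-lane rates after 8b/10b encoding.
PER_LANE_MB_S = {
    1: 250,  # PCIe 1.x, 2.5 GT/s per lane
    2: 500,  # PCIe 2.0, 5 GT/s per lane
}


def pcie_bandwidth_mb_s(lanes: int, generation: int = 2) -> int:
    """Return theoretical one-way bandwidth in MB/s (hypothetical helper)."""
    if generation not in PER_LANE_MB_S:
        raise ValueError("only PCIe generations 1 and 2 are covered here")
    if lanes <= 0:
        raise ValueError("lane count must be positive")
    return lanes * PER_LANE_MB_S[generation]


if __name__ == "__main__":
    x16 = pcie_bandwidth_mb_s(16)
    x8 = pcie_bandwidth_mb_s(8)
    print(f"x16: {x16} MB/s, x8: {x8} MB/s, ratio: {x8 / x16:.0%}")
```

On these assumptions the x8 slot has half the raw link bandwidth of an x16 slot. How much of that shows up in a game is a separate question that only a benchmark on the actual machine can answer.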
Thanks, I'll look into that. I switched the position of the top and bottom cards yesterday, so that when I'm not using SLi (some old XP games seem to dislike it) I'm on card 2 now, not card 1. So far it's only done the crash when running in SLi, leading me to believe that card 1, now on the bottom, has failed somehow. I'll contact EVGA if it still hasn't crashed running on just card 2 in a few days. Either way, I'll DL nTune and crank up the fans - my machine's a little noisy anyways so it won't bother me to have a little extra whirring.
Thanks again, I'll update once I'm sure about what the problem is.
You could also try a Masscool slot fan from Newegg. I don't have time to find the link right now, but it fits right under the video card (in your case, the lower one) and really helps bring down the temps. On my 8800 GTX they came down about 15 degrees, and it only cost $8.99.
I did some more poking around and I've only succeeded in puzzling myself further. I thought only one card had the problem - maybe it's both, since I did get a single crash of the same type right after I swapped the two cards, and not in SLi mode.
I downloaded and installed nTune - it crashed my video drivers, and I had to boot in safe mode and completely remove my Nvidia drivers, then reinstall them without nTune. After that, I downloaded EVGA's Precision utility - works flawlessly - and I cranked up both the fans to about 65% (auto had them on 30%-ish).
Here's the puzzling thing though: the utility says my core temps (and I'd guess it's reasonably accurate) are about 40 degrees and 34 degrees for top/bottom when only running on the top. So idle temp is in the low 30's, while load temp is less than 10 degrees C above that?
I think it must be that heating isn't the issue. I took a picture of the screen during the crash - maybe it will help someone identify it. I'll post it soon - gotta go for now.
Okay, here are the images I took of the screen during the crash. The first one is the screen looking normal (beginning of a BF1942 level). The second and third pictures are of the same crash - the screen is frozen on the Spitfire I'm flying (picture 3 is a close-up of one part of the screen). You can see the grid/checkerboard that forms, and the way certain areas, like the clouds, get especially messed up.
Hopefully these will help someone identify the problem.
EDIT: If higher-res pictures would help, I can post the originals, which are like 5 megapixels. Just ask - it's no trouble to upload them.
A third option is that your MB is going bad. I can't imagine that both cards would go bad at once, but it's not out of the question. Do you have another computer to swap cards with, to see if you get the same issue with a different card in your machine or with your cards in a different computer? This is a helpful way to narrow down the root of the issue. If a different card does the same thing on your computer, I would say the MB is bad. If the other card has no issues, that points to the GPUs, especially if you put them in another computer and get the same issue.
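The swap test above boils down to a small decision table, sketched here in Python. This is only an illustration of the reasoning in the post: the function and argument names are invented, and it deliberately ignores other suspects such as the power supply or drivers.

```python
from typing import Optional


def likely_culprit(other_card_crashes_here: Optional[bool],
                   my_cards_crash_elsewhere: Optional[bool] = None) -> str:
    """Map swap-test results to a suspect (hypothetical helper).

    Each argument is True/False for a completed test, or None if that
    test has not been run yet.
    """
    if other_card_crashes_here is True:
        # A known-good card misbehaving in this machine points at the board.
        return "motherboard"
    if other_card_crashes_here is False:
        if my_cards_crash_elsewhere is True:
            # Both tests agree: the problem follows the cards.
            return "GPUs (both tests agree)"
        return "GPUs"
    # The other-card test has not been run.
    if my_cards_crash_elsewhere is True:
        return "GPUs"
    return "undetermined"


if __name__ == "__main__":
    print(likely_culprit(other_card_crashes_here=True))
    print(likely_culprit(other_card_crashes_here=False,
                         my_cards_crash_elsewhere=True))
    print(likely_culprit(None, None))
```

With neither test run, the sketch returns "undetermined", which is the situation until a spare PCIe card turns up.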
Good advice. The other computer in our house is a family computer, and I can't be ripping components out of that, and the only old card I have lying about is a PNY GeForce 7600 - which is AGP, not PCIe.
I may be able to find another card - borrow one from a friend or something...
On the plus side, it hasn't happened recently... I haven't been playing the games that it crashed most often on (BF1942, Serious Sam II on GameTap, and Homeworld 2 on GameTap)... hopefully it's some driver compatibility issue, and won't happen on most new games. BF1942 didn't work on the family computer (also Vista, although running on a Dell box) so it doesn't have the best track record, and GameTap is built on the concept of emulating a PC on a PC by mapping nonexistent X and Y drives. That'd confuse me if I were a pair of 9800 GTXs <grin>.
I have seen patterns like that, and it has usually been memory corruption on the video card. I would suggest taking one card out altogether and seeing if it still happens. From the sounds of it, it will. If it does, try turning the memory clock down slightly in Precision. When I overclock my 8800 GTX memory, it looks the same way. It may or may not solve it, but it will help to troubleshoot it. Then do the same with the other card to try to isolate it further.
Also, it may be worth trying a program called Driver Sweeper to get rid of any pieces of old drivers. Sometimes Nvidia drivers don't set up correctly if there was a previous version on the system. It is a simple thing to do to ensure there are no conflicts. http://www.guru3d.com/category/driversweeper/
One other thing I can think of, though it probably isn't the cause, is the power supply. What type do you have?
I have a Raidmax RX-730SS... It's 'SLi Certified' so if it is the problem, at least it should be covered by the warranty. I'm going to pull a card out of it in a minute and test it (found out that Freelancer crashes it readily - usually less than 1/2 hour in). If that doesn't work, I'll try underclocking the memory. I'm a bit hesitant to do that, as it's not overclocked (well, factory overclocked, but not above how it came), and I'd rather not make it quit working completely. Thanks for the help.
Oh and as far as drivers: if this doesn't work I'll try, but the drivers have only been installed once, so I don't think that's the problem.
EDIT: I checked in the EVGA Utility, and it says my memory clock is at '1123'. IDK what the units are, or if it's even a single number (and not a series), but is that faster than normal for an EVGA GeForce 9800GTX+? Is it possible that something could have set the clock to a higher value than the 'stock' value?
I guess it's possible. In EVGA Precision, just turn the memory down to what you know it should be and see if it helps. It won't mess up the card, just make it run slower. Actually, make sure all the settings are what you know they should be. If it does work, then you can start to look for whatever may have turned them up.
Okay, thanks. I looked on Newegg, and checked the specs. The GPU clock is supposed to run at 738 MHz, but the utility says it's going at '756' (I assume MHz, no units are printed). The core clock and shader clock are coupled, with the shader clock at '1836'.
My concern is the memory clock: the site claims it runs at 2200 MHz, but the number in the utility (unitless, but again I'm assuming MHz) is only 1123. I checked (without applying settings, of course) and the slider goes from '875' to '1800'. Why is this number only half of what (I think...) it should be? If I'm misunderstanding the whole thing, my question is simply: what, exactly, should all of the numbers be?
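A likely explanation for the factor of two: the card's GDDR3 memory is double data rate, so spec sheets often quote the "effective" rate, which is twice the actual clock. The sketch below works through the numbers from this post, assuming the utility shows the actual clock and the store listing shows the effective one. That assumption is not confirmed anywhere in this thread, and the helper names are made up.

```python
def effective_mhz(actual_mhz: float, transfers_per_clock: int = 2) -> float:
    """Effective data rate of DDR-style memory (hypothetical helper)."""
    return actual_mhz * transfers_per_clock


def percent_over(reading: float, reference: float) -> float:
    """How far a reading sits above a reference value, in percent."""
    return (reading - reference) / reference * 100.0


if __name__ == "__main__":
    listed_effective = 2200               # MHz, from the store listing
    stock_actual = listed_effective / 2   # 1100.0 MHz under the DDR assumption

    reported_memory = 1123                # value shown in the utility
    print(effective_mhz(reported_memory))                         # 2246
    print(round(percent_over(reported_memory, stock_actual), 1))  # 2.1

    # Same comparison for the core clock: listed 738 MHz, utility shows 756.
    print(round(percent_over(756, 738), 1))                       # 2.4
```

Under that reading, 1123 is not half of stock but about 2% above it, and the core clock is about 2.4% above its listed value.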
YES! I think it worked! I dropped the memory clock to 1000 and the crashes stopped, even in the two games which crashed most frequently. I set the core clock to 738 and bumped the memory clock back up to 1100 - seems to still be running stable. THANK YOU so much for your help - I'm really glad this thing is resolved.
I'll post back in a couple of days when I figure out what the max stable clock speed is before it starts again... maybe it'll be useful to someone. Thanks again.
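For anyone trying the same hunt for the highest stable memory clock, the procedure is basically "start from a known-good value, step up a little, test, stop at the first failure". Here is a minimal sketch of that loop. The `is_stable` callback is hypothetical: in practice it stands for setting the clock in the tuning utility and playing a crash-prone game for a while. The 1120 MHz failure point in the demo is invented purely to make the example run, and the loop assumes the starting clock is already stable.

```python
from typing import Callable


def find_max_stable(start_mhz: int, limit_mhz: int, step_mhz: int,
                    is_stable: Callable[[int], bool]) -> int:
    """Step the clock upward from a known-stable value.

    Returns the last clock that passed the test. `start_mhz` is assumed
    to be stable already and is not re-tested.
    """
    if step_mhz <= 0:
        raise ValueError("step must be positive")
    best = start_mhz
    clock = start_mhz + step_mhz
    while clock <= limit_mhz:
        if not is_stable(clock):
            break  # first failure: stop and keep the previous value
        best = clock
        clock += step_mhz
    return best


if __name__ == "__main__":
    # Stand-in for a real stress test; the threshold is made up.
    def fake_test(mhz: int) -> bool:
        return mhz < 1120

    print(find_max_stable(1100, 1800, 5, fake_test))  # 1115
```

Small steps and long test sessions matter more than the loop itself, since these crashes sometimes took half an hour to appear.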