GTX 295 Failure - but perhaps not?

Hey everyone,

I bought an eVGA GTX 295 a few months back and it's been working great. This morning, I decided to do something I usually do during the summer and pull the card apart to replace the thermal compound. I did a quick search on how to properly disassemble the card (as I had never taken apart a dual-PCB card before) and quickly figured out how it was supposed to go. I got the card apart, and about an hour later, I had it back together with the silicone-based TC removed and Arctic Silver 5 in its place.

I hooked up the card and, to my shock, found artifacts littering the screen! So bad, in fact, that I could barely read anything on the display. After reseating everything and trying again, I got the same artifact-filled display. This time I took the card completely apart again and reseated the 2 tiny cables bridging the 2 PCBs, in case I had jolted them loose during re-assembly. Unfortunately, once again I got the same artifact-filled screen.

At this point, I went over everything I had done step by step to see if anything popped into my head as the source of the problem. I realized that it was very warm where I live today, and the humidity was very low. A split second later, I also realized I did not use proper ESD protection when handling the card during the TC replacement, which I believe to be the cause of the problem.

I then went and got another GTX 295 (my credit card company now loves me, by the way), and figured I would attempt to get the seemingly defective card replaced under warranty (the card has not been modified in any way, other than the TC replacement). After a few hours of using the new card, I decided to shut down and put my original card back in to take some pics of what I was getting on my monitor, so I could show my friends my colossal f**k up later. This time, however, the screen was almost artifact free! They're not completely gone; there are still a few minor glitches in the display, so something has gone wrong with the card (unfortunately this still doesn't negate the 'my fault' part of the equation). Because of this, I'm going to try once more a little later on tonight, but I am left wondering... has anyone ever seen an artifacting problem get slightly better over time? I'm still pissed at myself for screwing this up, but right now I'm more puzzled than anything. If anyone would care to share their ideas on this one, I'd appreciate it.

Many Thanks,
TP
 
Well, I have two guesses as to the problem with the original card. First, a capacitor or resistor may have been damaged during disassembly: the handling of the card may have been just heavy enough to inadvertently bend or break loose a soldered connection. If you have the card apart again, perform a visual examination, maybe even with a magnifying glass, and look over both sides of the card carefully for a loose or broken soldered component. Second, I have experienced the same 'artifact filled display' due to an insufficient, overworked, hot-running PSU.

Edit for writer's embellishment: I'm currently trying to 'repair' a couple of cards I inadvertently 'broke'. LOL! I have the parts that broke off in my glove box. I don't know which direction to head out in to find the replacements. The cards artifact like crazy!
 
I had considered that I broke a capacitor or resistor while disassembling the card; however, due to the following, I'm more inclined to rule that out, though I'm still puzzled. After leaving the card out for another hour or so, I plugged it back into my tower.

It works flawlessly this time. It gets into Windows without issue. I let it come back to normal idle temp and fire up a game to see how it handles the load. After a few minutes, it seems fine, so I open a session of 3DMark Vantage. I set everything as high as it will go and start the tests. A few minutes into the first test, it starts artifacting again, and eventually freezes the system entirely. I have opened the card up again and removed some of the TC I applied earlier, as it looked to be too much once it had spread around after I re-assembled the card the first time.

I am about to shut down and pop the old card in again to see what happens. I'll post back and let you know what the result was.
 

efeat

Thermal paste/grease does take some time to set in and reach its full heat conductivity. Perhaps you just pushed it too hard, too quickly, and used too much as well.
 


That was rather pointless. Since I posted this thread, it seems to be a little late for that, isn't it?

Getting back on track now... I retried the card once more and it is now back to serious artifacting, like it did when I first noticed the problem. Perhaps I should wait a few more hours for the paste to reset itself. If heat is the issue, then I should wind up with a stable card this time around, as there is less TC on both chips. Hopefully it won't insulate this time...
 

hundredislandsboy

My Ti4600 was at the top of the GPU line when I bought it, and after 2 years it started having problems just like the OP's artifacts. It would happen after 30 mins of Generals: Zero Hour. I disassembled and cleaned the card and fan, and the artifacts almost disappeared (about 90% of them), but they still came back after about 30 minutes of gameplay, though never in Windows 2D. I thought it was just a heat problem, but the side of the case was already off and my apartment temps were cool. I took a break, and the next evening, on a fresh cold boot, I went right into Zero Hour and about 50 percent of the artifacts came back after about 40 minutes. I took a 10 minute break, started Zero Hour again, and now the artifacts came back in full force (100% of the original problem) within 20 minutes. Over the next week, the artifacts started to appear sooner and sooner, but since I knew I couldn't sell the card as usable and I was going to get a replacement anyway, I kept playing until it would freeze and crash 5 minutes into the game. Once it reached that point, the artifacts began, a little at a time, to appear permanently during POST and on the 2D desktop, and the more I kept playing 3D, the more artifacts appeared.

With that "experiment", my theory is that with each crash resulting from overheating and artifacts, slight irreversible damage occurs in one of those millions of transistors, pathways, or gates, maybe not enough to discern, like a faint scratch on a photo or a vinyl record. But the damage is still there, and eventually, just like CPU degradation from overvoltage or overheating, that GPU will become less tolerant of its "pedal getting pushed to the metal", and with each overheat/artifact incident it loses more and more of its stability under full load.

I would RMA that card if it was possible. Doesn't EVGA sell that model with a lifetime warranty?

 
All eVGA cards sold in North America come with a lifetime warranty if registered within 30 days of purchase, which this card is. That's why I'm playing around with it to see if I can get it to work. Eventually I'll give up and RMA it if I can't figure out what's wrong.
 

hundredislandsboy

Maybe you meant all GTX 295s, because I checked Newegg and looked at all the new retail EVGA video cards: some have 1 year, some have 2 years, and even some that are over $120 have only a two-year warranty.

I prefer XFX after EVGA burnt me on a 90-day recertified 8800 GTS 512 that died in less than 4 months of use, and EVGA refused support. It was my fault for not seeing what a 90-day warranty really means: the manufacturer will sell it as B stock but has absolutely no confidence in that item, hence 90 days only.
 


My Ti4200 did something just like that, but it was the video memory that had gone bad. A quick RMA later and it still works to this day.

Any number of things could have caused this, but I think it's safe to say it's not fixable now if you have tried that many times.
 


I was referring to brand new eVGA video cards sold in North America. A returned card of course would not be eligible for the original warranty. eVGA also specifies that their warranty extends to the original purchaser only. According to eVGA's Warranty page:

All EVGA products come automatically with a one year, parts and labor, limited warranty. Upon registration within 30 days of your original purchase, you will be upgraded to one of the extended warranties we offer. For details on each program, click the appropriate warranty below:


Lifetime:
The EVGA limited lifetime warranty is only eligible for part numbers ending in:
-A1, -A2, -A3, -A4, -AR, -AX, -CR, -CX, -DX, -FR, -FX, -SG, -SX.

1+2:
EVGA 1 + 2 limited warranty only eligible for part numbers ending in:
-K1, -KR.

1+1:
EVGA 1 + 1 limited warranty only eligible for part numbers ending in:
-LA, -LE, -LR, -LX, -T1, -TR, -TX.

Complete details are on eVGA's Warranty Page
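
For anyone who wants to check a card against that list, here is a rough sketch of the suffix lookup in Python. The suffix lists are copied from the warranty text quoted above; the function name, the tier labels, and the placeholder part number are made up for illustration, and it assumes the card was registered within the 30 days (otherwise only the base one-year warranty applies).

```python
# Suffix lists copied from the warranty text quoted above.
# The function name, tier labels, and sample part number are hypothetical.
WARRANTY_SUFFIXES = {
    "lifetime": ("A1", "A2", "A3", "A4", "AR", "AX", "CR", "CX",
                 "DX", "FR", "FX", "SG", "SX"),
    "1+2": ("K1", "KR"),
    "1+1": ("LA", "LE", "LR", "LX", "T1", "TR", "TX"),
}


def warranty_tier(part_number: str) -> str:
    """Return the extended warranty tier implied by a part number's suffix.

    Assumes the card was registered within 30 days of purchase; an
    unrecognized suffix falls back to the base one-year warranty.
    """
    # Take whatever follows the last dash, e.g. "AR" from "XXX-XX-XXXX-AR".
    suffix = part_number.strip().upper().rsplit("-", 1)[-1]
    for tier, suffixes in WARRANTY_SUFFIXES.items():
        if suffix in suffixes:
            return tier
    return "1 year (default)"


# Placeholder part number, not a real one.
print(warranty_tier("XXX-XX-XXXX-AR"))
```

Swap in the part number from the sticker on your own card; the real terms are whatever eVGA's page says, not this snippet.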
 

Vythiel

I've never disassembled a video card past the NVIDIA 7xxx generation, so this is entirely based on what I've read online. Some cards come with "pads" of thermal compound between their components and heatsink. These pads can be quite thick. It is possible that after you applied the Arctic Silver, there were gaps left between the heatsink, GPU, and memory, since it is not as thick as the original layer of thermal putty. I also remember reading that you have to insert plastic rings (at the screws) to make up for this difference in space.
 
The new TC is, if anything, thinner than the old silicone-based compound that ships with the card. The plastic washers are only needed if you are installing an aftermarket heatsink which requires an extra bit of a gap between the new 'sink and the mounting points. Since there was no aftermarket heatsink involved... no washers needed either.
 

Vythiel



That's what I was saying. Because the new thermal compound is not as thick as the factory stuff, the heatsink may not be making full contact with the card's components.