Radeon x1950 256mb crashes

CRamano

Distinguished
Mar 4, 2010
10
0
18,510
Hello,

When I run some games my video card crashes, and I'm not sure what causes it. These games are older and do not put a massive load on the card. When I play RRT3 (Railroad Tycoon 3), it crashes even while idling at the main menu.


For example, these games crash:

1 Railroad Tycoon 3 - crashes predictably (a lot)
2 Empire Earth 2 - crashes unexpectedly
3 SimCity 4 - rarely crashes

But some games don't crash at all:

1 RailWorks train sim / MS Flight Sim
2 C&C 3
3 The Sims 3


I have the following hardware:

OS: XP Home SP-3
MB: ASUS M4A78-E
RAM: 2 GB
CPU: Athlon 64 X2 4800+
Video: Radeon X1950 Pro 256 MB PCI-E
Monitor: Dual 22"
HDD: 2x 300 GB

Cooling: I have an amazing liquid cooling system and chiller that keep my NB, SB, CPU, and GPU very cool.

      No Load | Full Load
CPU   18°C    | 25°C
GPU   20°C    | 30°C




So, any guesses why my card crashes? (Sometimes the VPU doesn't recover.)


 

CRamano

Both screens go blank, then the game crashes. I am pretty sure it crashes to the desktop, because the PC seems to stay responsive, but the graphics card does not recover, so I can't see anything and have to reboot. It has to be a cold reboot (power off for 10 seconds), because the card does not respond after a quick reboot. This happens only with RRT3. With all other games the card usually goes into VPU Recover; sometimes the game continues, and if not, it drops to the desktop.
 

CRamano

I tested all the RAM with memtest for a few hours; everything tested normal, with no errors.

I am pretty sure this is a software problem.
 

CRamano

After weeks of research, testing, and benchmarking, this is what I've come up with:

The first thing I did was research my video card (X1950 Pro 256 MB) for similar problems. I turned up a few articles like mine, some more useful than others, but most of them had too many hardware differences to rely on. I did remember a forum thread about the VRMs on ATI cards, saying that the VRMs on some ATI models had a tendency to get very hot and even overheat. This was news to me, mostly because I had no idea what a VRM was. After some more research I learned that Voltage Regulator Modules are everywhere in computers.

Anyway, I searched for my video card together with that issue and found an article about the Radeon X1950 Pro and its VRMs overheating. Now, VRMs are usually relatively large, but I discovered that the X1950 has 3 tiny digital VRMs near the end of the card, so they were only covered by the edge of the original heatsink. The funny thing is that when I removed the original heatsink to put on the water block about 3 months ago, I didn't know about the VRMs, and all that time they were bare. My PC always ran pretty well, but I guess I didn't notice the problem until I tried playing some games.

To remedy the situation, I took the original heatsink (it's large and covers most of the card), removed the housing, and stripped off all the cooling fins. Now I have a flat copper plate with a hole where the fan was and a couple of cooling fins in the corners over the VRMs. I sanded the top surface smooth, reattached the plate to the video card, and placed the water block on the plate over the GPU. I figured this should solve the problem, since the heat should dissipate through the plate. But it still wasn't enough: after testing I had the same result, video card crashes. This fix only bought me minutes, so I took the card apart again and soldered some copper tubing around the edge of the heatsink over the VRMs, then sanded, cleaned, reattached it to the card, and ran coolant through it. I also made sure to get thermal paste on the VRMs.

When I booted up again, I checked the VRM temps with "Everest", a cool program with some cool features, e.g. a VRM temp monitor. My VRMs had been running at 80-90°C and are now running at 50°C under load and 35-40°C at idle. Hurray, another great success... or not.

I downloaded FurMark to stress test my video card, and again the card crashes less than 5 minutes in, even though the temps on the VRMs and GPU never peak over 65°C. This is where you guys come in: what else could possibly be causing this?
 

CRamano

I found a program to test video card memory, very similar to memtest. After running the test, I got these results:

[2/10/2011 2:16:44 PM] Test started for "Primary Display Driver (Radeon X1950 Pro Secondary)"...
Trying 16bpp RGB:565 mode...OK
Error at [063D200A]: must be FFFF, but found FEFF (bits: 0000000100000000)
Error at [061D200A]: must be FFFF, but found FEFF (bits: 0000000100000000)
Error at [061D200A]: must be FF00, but found FE00 (bits: 0000000100000000)
Error at [01EFCEBA]: must be FF00, but found FF80 (bits: 0000000010000000)
Error at [063D200A]: must be 5555, but found 5455 (bits: 0000000100000000)
Error at [01EFCEBA]: must be 0001, but found 0081 (bits: 0000000010000000)
Error at [00BB0F0C]: must be 0001, but found 8001 (bits: 1000000000000000)
Error at [01EFCEBA]: must be 0004, but found 0084 (bits: 0000000010000000)
Error at [00BF0F0C]: must be 0010, but found 8010 (bits: 1000000000000000)
Error at [01EFCEBA]: must be 0020, but found 00A0 (bits: 0000000010000000)
Error at [01AF0F0C]: must be 0020, but found 8020 (bits: 1000000000000000)
Error at [01EFCEBA]: must be 0100, but found 0180 (bits: 0000000010000000)
Error at [01EFCEBA]: must be 0200, but found 0280 (bits: 0000000010000000)
Error at [01EFCEBA]: must be 0400, but found 0480 (bits: 0000000010000000)
Error at [01EFCEBA]: must be 0800, but found 0880 (bits: 0000000010000000)
Error at [01AF0F0C]: must be 0800, but found 8800 (bits: 1000000000000000)
Error at [01EFCEBA]: must be 2000, but found 2080 (bits: 0000000010000000)
Error at [01EFCEBA]: must be 4000, but found 4080 (bits: 0000000010000000)
Error at [01AF0F0C]: must be 4000, but found C000 (bits: 1000000000000000)
Error at [0085200A]: must be FFFB, but found FEFB (bits: 0000000100000000)
Error at [061D200A]: must be FFF7, but found FEF7 (bits: 0000000100000000)
Error at [0CC63A88]: must be FFEF, but found FDEF (bits: 0000001000000000)
Error at [0CC63A88]: must be FFDF, but found FDDF (bits: 0000001000000000)
Error at [0085200A]: must be 7FFF, but found 7EFF (bits: 0000000100000000)
Trying 16bpp RGB:555 mode...OK
Error at [0DA7CA02]: must be 00FF, but found 00FE (bits: 0000000000000001)
Error at [07195EBA]: must be 0001, but found 0081 (bits: 0000000010000000)
Error at [07195EBA]: must be 0004, but found 0084 (bits: 0000000010000000)
Error at [06E49F0C]: must be 0004, but found 8004 (bits: 1000000000000000)
Error at [07195EBA]: must be 1000, but found 1080 (bits: 0000000010000000)
Error at [07195EBA]: must be 4000, but found 4080 (bits: 0000000010000000)
Error at [0646B00A]: must be FDFF, but found FCFF (bits: 0000000100000000)
Error at [0626B00A]: must be FDFF, but found FCFF (bits: 0000000100000000)
Error at [0DA7CA02]: must be EFFF, but found EFFE (bits: 0000000000000001)
Trying 16bpp BGR:565 mode...OK
Error at [07195EBA]: must be FF00, but found FF80 (bits: 0000000010000000)
Error at [07195EBA]: must be 0008, but found 0088 (bits: 0000000010000000)
Error at [07195EBA]: must be 0020, but found 00A0 (bits: 0000000010000000)
Error at [07195EBA]: must be 0400, but found 0480 (bits: 0000000010000000)
Error at [07195EBA]: must be 0800, but found 0880 (bits: 0000000010000000)
Error at [07195EBA]: must be 8000, but found 8080 (bits: 0000000010000000)
Error at [0C8FCA88]: must be FFFB, but found FDFB (bits: 0000001000000000)
Error at [086EB00A]: must be FDFF, but found FCFF (bits: 0000000100000000)
Error at [086EB00A]: must be BFFF, but found BEFF (bits: 0000000100000000)
Error at [0CA3CA88]: must be 7FFF, but found 7DFF (bits: 0000001000000000)
Trying 32bpp RGB:888 mode...OK
Error at [01595EB8]: must be FF00FF00, but found FF80FF00 (bits: 00000000100000000000000000000000)
Error at [01595EB8]: must be 00000001, but found 00800001 (bits: 00000000100000000000000000000000)
Error at [01595EB8]: must be 00000800, but found 00800800 (bits: 00000000100000000000000000000000)
Error at [01595EB8]: must be 00001000, but found 00801000 (bits: 00000000100000000000000000000000)
Error at [01595EB8]: must be 00010000, but found 00810000 (bits: 00000000100000000000000000000000)
Error at [01595EB8]: must be 00200000, but found 00A00000 (bits: 00000000100000000000000000000000)
Error at [01595EB8]: must be 04000000, but found 04800000 (bits: 00000000100000000000000000000000)
Error at [01595EB8]: must be 20000000, but found 20800000 (bits: 00000000100000000000000000000000)
Error at [01595EB8]: must be 40000000, but found 40800000 (bits: 00000000100000000000000000000000)
Error at [0CE3CA88]: must be FDFFFFFF, but found FDFFFDFF (bits: 00000000000000000000001000000000)
Trying 32bpp BGR:888 mode...OK
Error at [0D87CA00]: must be FFFFFFFF, but found FFFEFFFF (bits: 00000000000000010000000000000000)
Error at [00F95EB8]: must be 00000002, but found 00800002 (bits: 00000000100000000000000000000000)
Error at [00F95EB8]: must be 00200000, but found 00A00000 (bits: 00000000100000000000000000000000)
Error at [02C3CA88]: must be FFFFFFFE, but found FFFFFDFE (bits: 00000000000000000000001000000000)
Error at [0D87CA00]: must be FFFFFEFF, but found FFFEFEFF (bits: 00000000000000010000000000000000)
Error at [02AFCA88]: must be FFFEFFFF, but found FFFEFDFF (bits: 00000000000000000000001000000000)
Error at [02AFCA88]: must be F7FFFFFF, but found F7FFFDFF (bits: 00000000000000000000001000000000)
[2/10/2011 5:16:21 PM] Pass completed (60 errors found).


Now, since there are only 60 errors, is there some way to repair or block these bad bits?
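
To see whether the errors cluster on a few addresses or a few bits, the log can be tallied with a short script. This is only a minimal Python sketch, assuming the log lines keep the exact "Error at [...]: must be ..., but found ..." format shown above. The `summarize` function and its regular expression are made up for illustration and are not part of the test program. Bit positions are counted from the least significant bit (bit 0) of each test word, which is an assumption about how to read the log.

```python
import re
from collections import Counter

# Matches one error line in the format shown in the log above (assumed format).
ERROR_RE = re.compile(
    r"Error at \[([0-9A-F]+)\]: must be ([0-9A-F]+), but found ([0-9A-F]+)"
)

def summarize(log_text):
    """Count errors per address and per flipped bit position (hypothetical helper)."""
    by_address = Counter()
    by_bit = Counter()
    for addr, expected, found in ERROR_RE.findall(log_text):
        # XOR leaves a 1 in every bit position where the two values differ.
        diff = int(expected, 16) ^ int(found, 16)
        by_address[addr] += 1
        bit = 0
        while diff:
            if diff & 1:
                by_bit[bit] += 1
            diff >>= 1
            bit += 1
    return by_address, by_bit

# A few lines copied from the log above; paste the full log here to tally everything.
sample = """\
Error at [063D200A]: must be FFFF, but found FEFF (bits: 0000000100000000)
Error at [01EFCEBA]: must be FF00, but found FF80 (bits: 0000000010000000)
Error at [01EFCEBA]: must be 0001, but found 0081 (bits: 0000000010000000)
Error at [00BB0F0C]: must be 0001, but found 8001 (bits: 1000000000000000)
"""

addresses, bits = summarize(sample)
print("Errors per address:", addresses.most_common())
print("Errors per bit position:", bits.most_common())
```

If the same handful of addresses and bit positions keep coming up across the different modes, that would suggest the errors are repeatable rather than random, though the tally alone cannot say what is causing them.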