GTX 970 SSC: no signal upon logging in to Windows

Curbo34

Reputable
Jan 30, 2016
44
0
4,560
Hey guys, my GTX 970 SSC ACX 2.0 (04G-P4-3979-B6) had a MOSFET (4C10N) short out around 4 months ago, and I finally got around to removing it so the card would boot again. I didn't replace the shorted MOSFET; I simply removed it.

System specs:

  • CPU: R5 1500X 3.9GHz
  • Mobo: Asus Prime X370 Pro
  • RAM: Corsair LED White 3200MHz 2x8GB
  • PSU: CX600M
  • Drives tested with: 250GB SSD, Windows 10 Pro (main); 500GB HDD, Windows 10 Pro (fresh install)
  • GPU: GTX 970 SSC (problem card), GTX 750 (non-Ti, stable card)

I booted my system, installed drivers, and loaded up some GTA V. The game ran perfectly, with no heating issues at all on the GPU core, though I have no idea how hot the VRMs got or how much extra load they were under to make up for the missing 4C10N. GPU utilization was around 60%. After about half an hour I got bored of GTA and moved over to Rocket League. I played Rocket League for around 20 minutes, got bored, and tabbed out. Forgetting I had Rocket League open, I went on to watch around 20 minutes of YouTube videos before getting off.

When I got off, I noticed rocketleague.exe was still running, so I right-clicked it and hit "Close window". The very second I did, I heard what sounded like coil whine with an additional "grinding" sound coming from the GPU, and my display then lost signal.
My GPU has always had coil whine, but there was something distinctly different about the sound it made.

Now whenever I try to boot into Windows, the display goes to no signal as soon as I type my password and hit Enter. Once in a while it loses signal before that, and on rare occasions I can make it past that point, but only for a minute or two at best. I tried the card in another system, one that didn't have Nvidia drivers installed, and it was stable on the basic VGA driver for the entire time I used it (around an hour, I think), but every time I tried to install drivers I got "Installation failed". I then tried a fresh install of Windows 10 Pro (brand new, installed purely for this test). Once again, "Installation failed". Searching through Event Viewer on my main SSD, I found this error:
- System
  - Provider
    [ Name] Service Control Manager
    [ Guid] {555908d1-a6d7-4695-8e1e-26931d2012f4}
    [ EventSourceName] Service Control Manager
  - EventID 7023
    [ Qualifiers] 49152
  Version 0
  Level 2
  Task 0
  Opcode 0
  Keywords 0x8080000000000000
  - TimeCreated
    [ SystemTime] 2018-03-28T04:37:59.278862100Z
  EventRecordID 10422
  Correlation
  - Execution
    [ ProcessID] 616
    [ ThreadID] 772
  Channel System
  Computer DESKTOP-DMUHIR3
  Security
- EventData
  param1 NVIDIA Telemetry Container
  param2 %%14109
  4E007600540065006C0065006D00650074007200790043006F006E007400610069006E00650072000000

Binary data:

In Words
0000: 0076004E 00650054 0065006C 0065006D
0010: 00720074 00430079 006E006F 00610074
0020: 006E0069 00720065 0000

In Bytes
0000: 4E 00 76 00 54 00 65 00 N.v.T.e.
0008: 6C 00 65 00 6D 00 65 00 l.e.m.e.
0010: 74 00 72 00 79 00 43 00 t.r.y.C.
0018: 6F 00 6E 00 74 00 61 00 o.n.t.a.
0020: 69 00 6E 00 65 00 72 00 i.n.e.r.
0028: 00 00 ..
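
For anyone reading the dump above: the long hex string in EventData appears to be nothing more than the failing service's short name stored as null-terminated UTF-16LE text. A minimal Python sketch to decode it (the helper name `decode_event_string` is just an illustration, not part of any Windows tooling):

```python
# Hex payload copied verbatim from the EventData section of the event above.
EVENT_HEX = (
    "4E007600540065006C0065006D00650074007200790043006F006E00"
    "7400610069006E00650072000000"
)


def decode_event_string(hex_payload: str) -> str:
    """Decode a hex dump of null-terminated UTF-16LE text (illustrative helper)."""
    raw = bytes.fromhex(hex_payload)
    # Each character is two bytes, low byte first; drop the trailing null.
    return raw.decode("utf-16-le").rstrip("\x00")


if __name__ == "__main__":
    print(decode_event_string(EVENT_HEX))
```

It decodes to NvTelemetryContainer, which matches the ASCII column of the "In Bytes" dump, so the binary data adds nothing beyond the service name already given in param1.
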
Any ideas? Should I replace the mosfet and try again? Is it just kinda toast? Software issue of some sort? I'm really stumped on this, why would it fail on the desktop but not under load?


Thanks all. :D
 

Eximo

Titan
Ambassador
Well, the better question is why the MOSFET failed. If there is another fault in the card sending voltage where it shouldn't go, that might be what is killing the components.

Putting the same burden on the remaining MOSFETs may certainly have led to premature failure. Rocket League typically runs at many more FPS than GTA, so the rapid shifts in power demand might have made it give up the ghost.
 

Curbo34


Indeed, that is a very good question. I found another forum thread where someone with a GTX 970 FTW of some sort had the exact same MOSFET (location-wise) fail; his had melted down. I honestly have no idea, but wouldn't it make sense that if mine had died from overvolting, it would have melted too? Mine physically appeared fine, but was shorted.

P.S. GTA was using about 60% of the GPU to hold 60 FPS on my current settings. RL had the FPS cap set as high as it will go without a cfg edit, which is 250. I only tried it in training; if I recall, that put somewhere between 60% and 80% load on the card.

Would it be logical to assume that, if something is leaking power to the MOSFETs, the first one to receive it would be the one to fail? The one highlighted is the one that failed. (This is an actual picture of my board, taken shortly before I figured out which one was shorted.) Also, the one that looks like it isn't marked (top left) just has glare on it.
[Attached image: 60cd4d7e33b4c51936f9a6277ef09338-png.jpg]
 

Eximo

That would take some circuit analysis and looking at the mosfet drivers on the left there.

If you had a pair of mosfets in parallel to make up each phase, for example, then you were running the expected load of two mosfets through one. That could easily make it kerplode. Worth a shot to replace the mosfets if you have experience doing such a thing. There is always a chance of random part failure.
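
To put rough numbers on the point above: conduction loss in a MOSFET scales roughly with the square of the current through it (I² × Rds(on)), so a lone MOSFET carrying what a pair used to share dissipates about four times as much as each one did before. A minimal Python sketch, assuming the current splits evenly and ignoring switching losses and the rise of Rds(on) with temperature. The 30 A phase current and 5 mΩ on-resistance are made-up illustration values, not measurements or datasheet figures for this card:

```python
def conduction_loss_per_fet(phase_current_a: float,
                            fets_sharing: int,
                            rds_on_ohm: float) -> float:
    """Approximate I^2 * R conduction loss in one MOSFET.

    Assumes the phase current divides evenly between the MOSFETs sharing it.
    """
    current_per_fet = phase_current_a / fets_sharing
    return current_per_fet ** 2 * rds_on_ohm


# Hypothetical values for illustration only (not taken from the card).
PHASE_CURRENT_A = 30.0
RDS_ON_OHM = 0.005

paired = conduction_loss_per_fet(PHASE_CURRENT_A, 2, RDS_ON_OHM)
single = conduction_loss_per_fet(PHASE_CURRENT_A, 1, RDS_ON_OHM)

print(f"two MOSFETs sharing: {paired:.3f} W each")
print(f"one MOSFET alone:    {single:.3f} W")
print(f"ratio:               {single / paired:.1f}x")
```

Whatever the real current and resistance are, the ratio stays at 4x under these assumptions, because doubling the current through one part quadruples its I²R heating.
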

I have to admit, I have not kept up on 'modern' circuit design as much as I would like to, so I am not too familiar with the manufacturers of common parts. If there was someone out there to ask, it would probably be someone like Buildzoid over at Actually Hardcore Overclocking. He might even have a teardown of that card somewhere.
 

Curbo34


Naturally, when I was researching the issue I looked for Buildzoid talking about this card or something similar. I didn't see anything about it, but learned a lot anyway. The reason I didn't put another 4C10N on there is purely because getting one shipped is like $10... In reality that sounds like a fairly crap reason not to properly fix a GPU. I might end up reflowing the two MOSFETs below the one I removed, because I sort of, uhhh... knocked them off while trying to remove my target (which for some reason didn't want to release when everything around it already had). So it's possible I didn't get them reseated perfectly, and any movement near my PC (me moving, the cat running around, someone in the next room over, etc.) could be causing it to freak out. Thanks for all the help; when/if I figure anything more out I'll come back and update you. (Of course, you're welcome to ask any questions or give any advice you might think of or come by.)
 

Curbo34

Okay, so as of now my card seems to be stable.
All I did was boot into Safe Mode on my main drive, use DDU to uninstall all the graphics drivers, restart back into Safe Mode, and install the Nvidia drivers.
That worked perfectly, and the card has been stable for a while now. (I figured this out within a few minutes of my last post here, but wanted to wait a day to see if it remained stable.) I've played a solid few hours of GTA V, Rocket League, and Far Cry 4 with no issues at all. If it goes kaboom on me I'll let you know.
 

Curbo34



Okay, in a great big shock to everyone (not), it broke again, and in another shock (not), it seems to be related to the 12V rail into the GPU.
The fans attempt to spin every 5 or so seconds (more than a twitch, but far from the spin they want; it lasts 2-3 seconds until they stop entirely). As far as I've been able to figure out, this basically means the card isn't getting the 12V rail, so it doesn't initialize.
Booting into Windows with my 750 works fine. I also tested in my other system with another PSU.
Once booted into Windows with the 750 as my main card, the 970 doesn't appear in Device Manager at all.

The card had been fine for several LONG gaming days, something north of 6 hr/day over the last week for sure, probably closer to 8. That time was spent between Fortnite, GTA V, and Rocket League, most of them pushing the GPU to 100% constantly for various reasons.
The card was fine while gaming. I did a normal shutdown last night, then woke up and tried to power it on. Any thoughts?
Did the other MOSFETs finally have enough of picking up the slack and go kerboom? That doesn't seem likely given it broke while sitting. I also had the system idle (nothing more than YouTube) for 40 or so minutes after gaming, so it had time to cool off for sure.
I'm stumped. I want to try to fix it because I find it fun, but at some point I need to cut my losses. I'm 16 and currently don't have a job (a geographical thing mostly, no work around here that interests me), so I'm about ready to just go out and get a shit job so I can have a bit of money to spend on things I want before I end up having to spend all my money and go into debt for college and bills.
 

Eximo

Inrush current is a thing. But without access to the card, a good oscilloscope and probes, probably not going to be able to tell you much.

I'm told EVGA has a repair service they offer outside of warranty, might give them a call. No charge if they deem it unrepairable. Something like $65 an hour, which could be far cheaper than a new card.

Used 970s are around $250 as I recall, but a new 1050 Ti, which would perform similarly, is not much more.
 

Curbo34



Now I'm properly confused. The card had been twitching away in my system for a bit while I did some daily stuff, and when I decided to give it one more shot before removing it again, it worked. I'm guessing you're right about the inrush current, and as I don't have an oscilloscope (or the experience required to use one), it's just going to remain a mystery. I'll dink with it daily if it continues acting up. I'm probably going to buy a new MOSFET for it and put it on, as well as make sure all the others in the area are properly seated.

You can find used 970s on Reddit for about $200-$225 shipped, but I'm not willing to pay the insane prices. I'll just suffer with my 750 until prices drop. I'm planning to buy a 1080 from my friend (a trusted friend, not just some "friend") when he moves to the new generation (1180, 2080?), so that should happen soon anyway. Any more updates from here on out will serve more as a log for both myself and anyone else with a similar problem, so you're more than welcome to turn off notifications for this thread, or stick around if you're interested. I'm very interested in that repair service though; I'll look into that.

Thanks for all the help.

-Curbo
 

Eximo

Interesting.

If he can get one. I'm holding out for the 2080Ti to replace my 1080. I don't imagine it will be easy to get the 2080. I suspect Nvidia will do an FE launch again, meaning it is probably going to be $800 at launch factoring in increased DRAM prices (1080 was $700 at launch, I paid $650 when the AIB cards launched)
 

Curbo34

Well, that lasted a solid month. It just went pop while I was playing Warframe, not even under that much load. Unsurprisingly, another 4C10N (the one directly to the left of the highlighted one) went kaboom. I'm going to order 2 and put 'em on to see if it comes back to life. I'm assuming it exploded due to a drop in efficiency: today was the first hot day here in Minnesota, and it was ~85°F as well as fairly humid. I'll post again to update on the resurrection, assuming I do decide to go that route.
 

IHateSmurfs

Prominent
Mar 10, 2017
149
0
760
There's an urban myth that says older Nvidia GPUs have MOSFET leaks and failures to make you upgrade to the newer gen xD. No, seriously, do you see any leakage on the board? You should send it to an expert who could try to fix it.