Dead RTX 5090 with a cracked PCB gets urgent surgery from repair wizard — tech casually reballs the core, replaces a memory chip twice, and runs more wires across its traces than the NSA

RTX 5090 being brought back to life by an ingenious technician
(Image credit: northwestrepairs on YouTube)

A broken RTX 5090 recently landed on the desk of Northwest Repairs, run by "Tony," who also serves as the face of the shop's chaotic YouTube channel. Northwest Repairs specializes in semiconductor work, particularly GPUs that need to be brought back from the dead. So, when a PNY RTX 5090 came in with a cracked PCB and no signs of life, Tony got to work and took viewers along for the roller-coaster ride that the repair turned into.

RTX 5090s are no stranger to controversy. They've had their fair share of melting power connectors, a legacy carried forward from the direct predecessor—the RTX 4090. However, this was somehow not an overheating GPU drunk under the weight of its own power addiction, but rather a cracked PCB, which is an increasingly common failure point on hefty modern GPUs. Cracks in the PCB can interrupt signal traces buried deep inside the board, leading to seemingly random failures that are tough to diagnose and even tougher to fix.

Video: "I removed the core from RTX 5090 to see whats underneath it" (YouTube)

Tony began by taking the card apart, starting with the shroud, which eventually revealed the cooler's underlying design—which was... fine. The memory contact plate wasn’t touching the vapor chamber properly, meaning that thermal transfer to the memory was virtually nonexistent. Higher-end variants of the 5090 might not share these oversights as they'll feature more robust cooling systems. The VRMs and surface components, on the other hand, were solid, so power testing was the next step.

Even while idle, the GPU was pulling around 5 amps, which is high but normal for a power-hungry beast like the 5090. Tony had to raise the current limit on his power supply tester from 4A to 8A just to get the card to start, since the extra headroom was needed to handle the initial power spike. The card powered on, the lights came on, and the fans spun, but there was no display output, despite the monitor's LED turning on. This meant the GPU was powered but not posting, leading the repairman to suspect a VRAM issue.
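To put those current figures in perspective, here is a rough Python sketch of the arithmetic. It assumes the bench supply feeds the card's 12V input, which the article does not state, and the function names are purely illustrative.

```python
# Rough power arithmetic for the bench-supply figures quoted above.
# ASSUMPTION: the supply feeds the card's 12 V input; the source gives only amps.
RAIL_VOLTAGE_V = 12.0


def power_watts(current_a: float, voltage_v: float = RAIL_VOLTAGE_V) -> float:
    """Electrical power from P = V * I."""
    return current_a * voltage_v


def limit_is_sufficient(limit_a: float, draw_a: float) -> bool:
    """True if the supply's current limit covers the card's draw."""
    return limit_a >= draw_a


if __name__ == "__main__":
    idle_draw_a = 5.0  # approximate idle draw reported in the video
    print(f"Idle draw: {power_watts(idle_draw_a):.0f} W")
    for limit_a in (4.0, 8.0):
        verdict = "enough" if limit_is_sufficient(limit_a, idle_draw_a) else "too low"
        print(f"{limit_a:.0f} A limit ({power_watts(limit_a):.0f} W): {verdict}")
```

Under that 12V assumption, a 4A limit caps the card at about 48 W, below the roughly 60 W it draws at idle, which would be consistent with the card refusing to start until the limit was raised.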

Memory diagnostics came next, with the processor's iGPU handling display output, and voila: a training error flagged one specific memory chip as the problem. The GPU wasn’t able to complete its DRAM initialization sequence, a sign that the solder joints on that chip had likely gone bad. Tony removed the chip from the board, reballed it, and installed it back onto the PCB in a beautiful montage sequence, after which the card posted successfully. This was a 2GB Samsung GDDR7 chip, which means the card was from an older batch of 5090s, made before Nvidia started using SK Hynix-manufactured modules.

What exactly is reballing?

A memory chip from an RTX 5090 being reballed

(Image credit: northwestrepairs on YouTube)

Reballing is the process of removing a BGA (Ball Grid Array) chip from a circuit board, cleaning both the chip and the board of old solder, applying new solder balls using a stencil, and reattaching the chip using controlled heat. It requires expert-level skill, proper alignment, and specialized equipment like rework stations and microscopes. The slightest mistake during reballing can damage the chip or board permanently, which is why you should never try this at home.

Unfortunately, the job didn't end there: once the card was running, the fans immediately ramped to full speed and stayed there. A GPU fan stuck at 100% without variation means serious trouble; it's likely that the GPU itself thinks there's something wrong with it. At the same time, the PCIe interface suddenly went out. The earlier heat cycles may have worsened whatever internal damage the PCB already had, and PEX (the initial link the PCIe protocol makes to the motherboard, signaling that the GPU is all good and ready to go) had worked one last time before deeper damage set in.

To rule that out, the GPU core was fully reballed, an unbelievably meticulous job that involves the highest level of precision and skill. Not only that, but this is also an RTX 5090 so it's not like donor boards are flying off the shelves; there is simply no room for error. After the process was complete, Tony checked everything and quickly found that the reball had inadvertently introduced a new problem: there was a dead short on memory, likely due to thermal expansion.

An RTX 5090 core being reballed

(Image credit: northwestrepairs on YouTube)

Initially, the thermal camera connected to Tony's phone did not catch anything, so he employed "critical thinking" and started to check for rising temperatures across the board. Eventually, he identified the faulty chip—the same one that he had reballed earlier. After replacing it again, the short was fixed, but the card still wasn’t being recognized by the system as there was no PEX.

At this point, the original PCB crack had likely worsened during heat cycles, fully severing internal connections. Digging deeper revealed that only a few of the VRM power phases were actually active. The usual voltage rails, 12V and Vcore, were present, but the digital "Driver ON" signals weren’t reaching half the VRM controllers. Tony's guess was proven right: a trace buried inside the PCB had been severed.

Out came the wiring, perhaps the most impressive part of the repair. Since the signal wasn’t shared across phases, a simple jumper wire was run to reconnect the broken path. That restored full power delivery, but PEX was still missing. Another jumper was added to bridge a missing PCIe enable signal. This time it worked: PCIe came back online, and the card posted again.

A jumper wire being run across the traces of the RTX 5090

(Image credit: northwestrepairs on YouTube)

Thinking the card was repaired for good, Tony reassembled it, but the issues returned. PCIe detection failed again, the fans maxed out like before, and one memory phase didn’t power up. Upon probing, he discovered that the enable signal for PEX, sourced from the 3.3V PCIe slot rail, was no longer reaching its destination, essentially rendering that phase dead. The final fix involved running a third wire to supply 3.3V directly to that memory enable signal. After that, everything worked.

The PCIe link was stable at last, power delivery was balanced across all phases, and the GPU passed a full round of stress testing consisting of both benchmarks and games. Tony carefully put the GPU back together, even gluing back one of the PNY stickers that had fallen off a fan earlier. Mission complete.

From broken inner-layer traces to a fully-fledged core reballing, and even signal patching, every fix pushed right up against the limits of what can be done on a multilayer board without a factory. For a card that might cost more than an entire high-end build, this was one of the rare cases where saving it was not only possible, but entirely necessary, and Northwest Repairs did one phenomenal job at that.


Hassam Nasir
Contributing Writer

Hassam Nasir is a die-hard hardware enthusiast with years of experience as a tech editor and writer, focusing on detailed CPU comparisons and general hardware news. When he’s not working, you’ll find him bending tubes for his ever-evolving custom water-loop gaming rig or benchmarking the latest CPUs and GPUs just for fun.

  • txfeinbergs
    Yeah, but how much would that repair have costed if I paid someone to repair my board? Probably still cheaper to just buy a whole new card.
  • ezst036
    Admin said:
    However, this was somehow not an overheating GPU drunk under the weight of its own power addiction
    I can have more sympathy for a super rare cracked PCB than a commonly melting power connector.