Defective RX 9070 XT card with pitted silicon surface runs extremely hot — report indicates it's unclear if this was an isolated incident

RX 9070 XT Hellhound
(Image credit: PowerColor)

Igor's Lab received a defective PowerColor RX 9070 XT Hellhound from a reader who had purchased the card. While it had apparently passed quality control tests, Igor found a defective RDNA 4 die that resulted in unsustainable temperatures — which persisted even after re-pasting. According to the tech outlet, the culprit was "pronounced pitting," which can occur during the backgrinding process.

Igor's Lab does preface its report by noting that, for now, this is an isolated case and it cannot fully confirm what caused the damage to the die. It could be a single bad card, or perhaps a faulty production line may have resulted in a bad batch of dies. Either way, the extent of the problem isn't clear at present.

Igor's Lab PowerColor RX 9070 XT Hellhound pit measurement (Image credit: Igor's Lab)

Although nearly invisible to the naked eye, the irregularities on the silicon surface translated directly into extremely high hotspot temperatures, making the GPU unusable. Igor's Lab recorded a whopping 46-degree Celsius (C) delta between the average GPU temperature and the hotspot temperature, with the latter touching 113 C. Because 110 C is the limit for RDNA-based products, the high hotspot temperature caused the RX 9070 XT card to thermally throttle.
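The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration: it assumes the 46 C delta and the 113 C peak were recorded at the same moment (the report does not say so explicitly), and the function names and the "at or above the limit" throttling rule are hypothetical, not taken from AMD or Igor's Lab.

```python
# Figures as reported in the article.
HOTSPOT_C = 113.0        # peak hotspot temperature
DELTA_C = 46.0           # hotspot minus average GPU temperature
HOTSPOT_LIMIT_C = 110.0  # limit cited for RDNA-based products


def average_temp(hotspot_c: float, delta_c: float) -> float:
    """Average GPU temperature implied by a hotspot reading and its delta."""
    return hotspot_c - delta_c


def is_throttling(hotspot_c: float, limit_c: float = HOTSPOT_LIMIT_C) -> bool:
    """Assumption: the card throttles once the hotspot reaches the limit."""
    return hotspot_c >= limit_c


if __name__ == "__main__":
    avg = average_temp(HOTSPOT_C, DELTA_C)
    print(f"Implied average GPU temperature: {avg:.0f} C")
    print(f"Hotspot over limit by: {HOTSPOT_C - HOTSPOT_LIMIT_C:.0f} C")
    print(f"Throttling: {is_throttling(HOTSPOT_C)}")
```

Under that assumption, the average GPU temperature sits in the high 60s C while the hotspot alone pushes the card past its limit, which is why the delta matters more than either number in isolation.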

Further inspection with a microscope revealed more than 1,934 craters or pits in the silicon, amounting to over 1% of the chip surface. Igor alleges that this is well outside the normal tolerance level for modern chips, particularly high-power chips like the Navi 48 used in the 9070 XT.

Igor's Lab says it used general industry guideline values as a reference point for allowable pit size, as RDNA 4 currently lacks any publicly available specifications for the maximum depth of a defect. "... As a rule, depths ≤ 5-10 µm with a diameter ≤ 50-100 µm are not considered critical, provided they do not occur near die edges or bond surfaces. In more sensitive areas or in applications with high mechanical stress (such as particularly thin dies), a defect with a depth of more than 2-3 µm can already be critical."

The outlet measured one pit on the faulty card with a depth of 12.59 µm and a diameter of 212.36 µm, which exceeds even the upper end of those general guideline values. Igor's Lab suspects the damage originated from improper backgrinding of the die. Backgrinding is a process where the back of the silicon (the "top" once the chip is mounted on a PCB) is ground down to an appropriate thickness, which can vary depending on the design and use case.
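As a rough illustration of that comparison, the sketch below checks the measured pit against the upper ends of the guideline ranges quoted by Igor's Lab (10 µm depth, 100 µm diameter). The `Pit` class, the constant names, and the "either dimension over the limit" rule are hypothetical choices made for this example; they are not an AMD or TSMC specification, and the stricter 2-3 µm limit for sensitive areas is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pit:
    depth_um: float
    diameter_um: float


# Upper ends of the quoted "not considered critical" ranges.
MAX_DEPTH_UM = 10.0
MAX_DIAMETER_UM = 100.0


def exceeds_guideline(
    pit: Pit,
    max_depth_um: float = MAX_DEPTH_UM,
    max_diameter_um: float = MAX_DIAMETER_UM,
) -> bool:
    """Assumption: a pit is out of tolerance if either dimension is too large."""
    return pit.depth_um > max_depth_um or pit.diameter_um > max_diameter_um


# The single pit measured by Igor's Lab.
measured = Pit(depth_um=12.59, diameter_um=212.36)

if __name__ == "__main__":
    print(f"Depth vs. limit: {measured.depth_um / MAX_DEPTH_UM:.2f}x")
    print(f"Diameter vs. limit: {measured.diameter_um / MAX_DIAMETER_UM:.2f}x")
    print(f"Exceeds guideline: {exceeds_guideline(measured)}")
```

Even against the most lenient reading of the guideline, the measured pit is over the limit on both depth and diameter, with the diameter being the larger excursion.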

Backgrinding is similar to sanding, and complications can occur along the way. Debris from the grinding process can cause scratches, pitting, flaking, or other irregularities that affect the silicon's structural integrity and reduce cooling effectiveness. Excessive thermomechanical stress during grinding can also damage the die.

Regardless of where the damage came from, Igor's Lab concludes that multiple parties failed to catch the problem. As the card came from PowerColor's factory, the company is ultimately responsible for inadequate quality assurance (QA). TSMC and other involved parties also passed this particular sample, potentially due to AI-based inspection algorithms that didn't have sufficient training to detect the problems.

At present, the issue doesn't appear to be widespread. AMD told Igor's Lab that the defective PowerColor RX 9070 XT is "an isolated incident." Hopefully, that's correct and there aren't further incidents. And considering the GPU prices and supply shortages, hopefully the owner of the card is able to RMA it — or get a replacement if Igor's Lab purchased the card from the reader.

Aaron Klotz
Contributing Writer

Aaron Klotz is a contributing writer for Tom’s Hardware, covering news related to computer hardware such as CPUs and graphics cards.

  • hotaru251
    do wonder if we'll see future issues relating to this as everyone rushes them out and uncommon issues escape into wild.
  • -Fran-
    Sapphire Pulse 9070XT here, sample 1 out of 1, but no issues with temps and been toying around with OC'ing and undervolting.

    In fact, it runs incredibly cool for a 300W card with """basic""" cooling on core, hotspot and memory. Quotations intentional as this thing is monstrous still and has a lot of metal to dissipate heat.

    Regards.
  • George³
    Well, more and more defects from the factory, with the shrinking of lithography process.
  • edzieba
    Since grinding is a wafer-scale process, that means there is at least a wafer worth of similarly affected dies that could have made it to circulation if QC could not detect this one.
  • YSCCC
    This is really a kind of defect I'd never even have thought of... with PowerColor using PTM7950 I was actually surprised at the huge temp difference, wonder how many of the cards out there have this issue and won't boost. Pathetic times when ppl are paying retail scalping prices for some GPU just for gaming and get this, or the missing ROPs in Nvidia
  • Ali3nWnd3rWaRE4
    New here but use Techpowerup.com everyday. I bought a PowerColor RX 9070 XT Red Devil on launch day. I got the card installed and tested some benchmarks and noticed the card would not hit the rated speeds at all.

    I then tested a few games and found that my GPU temp would get into the 80s C while my hotspot was hitting 110 C. I have seen it get up to 113-114 C.

    I contacted PowerColor about the issue and I'm in talks with them.

    I tried changing the thermal material. I know they use PTM7950, but I put on some Kryonaut Extreme and tested with the same results. I then swapped back to my own Honeywell PTM7950 just to test if maybe they used a lesser-quality PTM and still got the same results.

    After reading this article, now I'm thinking I may have gotten one with this issue.

    I tested this on two workstations as well, both in Thermaltake Tower 900 cases with custom loops on the CPU side.

    1st
    Asus Crosshair VIII hero wifi (x570)
    Ryzen 3950x
    32GB 2x16 Gskill 3600M/T system memory
    Seasonic Prime TX-1000w
    2x WD Black SN750 2TB
    ASUS STRIX RX 5600 XT (was pulled out for testing)
    Powercolor RX 9070 XT Red Devil (First time purchasing PowerColor)
    2 year old windows 11 Pro install with 23h2

    2nd
    Asrock x870e Taichi with bios 3.20
    Ryzen 9950X3D
    48GB 2x24 Gskill 6400M/T system memory
    Seasonic Prime TX-1300w
    2x WD Black SN850x 4TB
    Powercolor RX 9070 XT Red Devil (First time purchasing PowerColor)
    Brand new Windows 11 Pro install with 24h2
  • Exploding PSU
    I wonder if something like this used to happen in the past... I had a Vega 56 that was such a hohehoheho to cool... It was a Sapphire though, but an issue like this would happen before the actual die even landed on the AIB's factory, right?