AMD RDNA4 Navi 48 is 25% denser than Nvidia Blackwell GPUs — 53.9 billion transistors in a die smaller than GB203

A high-res die shot of AMD's Navi 48 GPU
(Image credit: AMD)

Today, AMD announced details about RDNA 4 and the forthcoming Radeon RX 9000-series GPUs. As part of its presentation, AMD revealed new details about the Navi 48 GPU die, which could be one of the densest GPUs we've ever seen.

Navi 48, the GPU die that AMD's RX 9070 XT and RX 9070 cards build upon, has officially been confirmed to be 357 mm². This is quite a bit smaller than the 390 mm² estimate that floated around the internet after AMD showed off Navi 48 to outlets at CES 2025. It is also smaller than the GB203 die used by Nvidia in the RTX 5080 and RTX 5070 Ti, with which Navi 48 directly competes.

Not only is the die smaller than Nvidia's, but it is also significantly denser. Navi 48 contains 53.9 billion transistors, compared to the 45.6 billion within GB203. After some simple division, Navi 48 works out to roughly 150M transistors per mm² versus GB203's 120M transistors per mm². That makes AMD's latest GPU die about 25% denser than its competitor's. Even compared with Blackwell's consumer peak, the GB202 used in the RTX 5090, Navi 48 comes out ahead of that chip's 123 MTr/mm² density.
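The division behind those density figures can be reproduced in a few lines. This is a minimal sketch, not an official calculation: the transistor counts and Navi 48's 357 mm² area are the figures quoted above, while the 378 mm² area for GB203 is taken from Nvidia's published specifications and should be treated as an assumed input. The function and variable names are illustrative.

```python
def density_mtr_per_mm2(transistors_billions: float, area_mm2: float) -> float:
    """Return transistor density in millions of transistors per mm²."""
    return transistors_billions * 1_000 / area_mm2


# Navi 48: 53.9 billion transistors in 357 mm² (figures quoted in the article).
navi48 = density_mtr_per_mm2(53.9, 357.0)

# GB203: 45.6 billion transistors; the 378 mm² die area is an assumed input
# taken from Nvidia's published specs.
gb203 = density_mtr_per_mm2(45.6, 378.0)

# Relative density advantage of Navi 48 over GB203, as a fraction.
advantage = navi48 / gb203 - 1.0

print(f"Navi 48: {navi48:.1f} MTr/mm²")
print(f"GB203:   {gb203:.1f} MTr/mm²")
print(f"Navi 48 is {advantage:.0%} denser")
```

With these inputs the ratio lands at about 1.25, which is where the 25% headline figure comes from. Small changes to the assumed die area or to how transistors are counted would shift it by a point or two.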

Obviously, this is not a fair comparison of performance, power, efficiency, or any other metric, as the RTX 5090 likely sits comfortably ahead of RDNA 4 across the board. Still, it is worth noting that Nvidia did not seem to prioritize transistor density as highly as AMD has in its current generation. We should also mention that transistor counts are generally considered approximate and that there are different ways of counting, so it's possible that this affects the figures we've derived from AMD's and Nvidia's official specs.

Judging how much RDNA 4 improved in density over RDNA 3 is more difficult. While Navi 48 is a monolithic die, with its cache on the same die as the graphics compute, RDNA 3's Navi 31 was not. Navi 31's GCD also sits at 150M transistors per square millimeter, but that figure does not have to account for cache sharing space on the same die. Comparing the transistor density of Navi 48 and Navi 31 is therefore an apples-to-oranges game. Still, credit should go to RDNA 4 for matching the transistor density of its predecessor while also fitting its 64 MB L3 cache on the die. AMD's decision to abandon RDNA 3's chiplet-style design for a return to a monolithic die does not seem to have sacrificed density or efficiency.

Navi 48 is coming out of the gate swinging. With the best transistor density in its class, as well as a major investment in improving ray tracing and FSR 4 performance, AMD appears to be attacking Nvidia's upper-midrange products with a zeal we haven't seen from the company's GPU division in quite some time.

If the RX 9070 XT can hit its MSRP of $599 at its March launch (nothing short of a miracle based on current GPU stock), AMD will be impossible to ignore for new GPU buyers. For more on the architecture of Navi 48 and RDNA 4, be sure to read our deep dive into everything we know about the new generation. Ultimately, though, it comes down to performance. Whatever the claimed transistor density, the faster chip will still be faster. Check back next week for the full reviews.

Dallin Grimm
Contributing Writer

Dallin Grimm is a contributing writer for Tom's Hardware. He has been building and breaking computers since 2017, serving as the resident youngster at Tom's. From APUs to RGB, Dallin has a handle on all the latest tech news. 

  • Notton
    Navi 48 having more density and transistor count compared to GB203 while, most likely, being slower than an RTX5080 is not a win... just say'n
  • mhmarefat
    Notton said:
    Navi 48 having more density and transistor count compared to GB203 while, most likely, being slower than an RTX5080 is not a win... just say'n
    Why didn't you compare 9070 XT vs 5070 Ti?
  • A Stoner
    If nVidia had a 25% denser package and kept the same yields, that would translate to more chips being produced and more product to be sold.

    Making more from less is better as long as you do not lose something in the process.

    It should also make things faster and cost less energy as it has less distance to cover.
  • systemBuilder_49
    It has lots of infinity cache which will drive up density vs the 4080. Cache circuitry is very small and packs extremely tightly onto the chip!

    The correct AMD chip to compare it to is 7800xt since it's a beefed up 7800xt!

    No sense comparing this chip to 5070 ti - that chip is ALREADY IN THE REARVIEW MIRROR as it's inferior ..
  • JamesJones44
    The 5090/5080 use the same N4p node from TSMC. I would be curious to know where the extra 25% density is coming from. I understand node process isn't everything, but 25% is a big difference for the same process node.
  • helper800
    systemBuilder_49 said:
    No sense comparing this chip to 5070 ti - that chip is ALREADY IN THE REARVIEW MIRROR as it's inferior ..
    That's nonsense. It is likely the 5070 ti will have faster raster than the 9070 XT. We shall see shortly with reviews in the coming days.
  • JarredWaltonGPU
    Notton said:
    Navi 48 having more density and transistor count compared to GB203 while, most likely, being slower than an RTX5080 is not a win... just say'n
    As we note in the final paragraph.

    "Ultimately, though, it comes down to performance. Whatever the claimed transistor density, the faster chip will still be faster."

    Also, there are questions as to how dense these really are. I personally think there's a difference in counting transistors. Maybe AMD includes blank space, or debug logic that Nvidia omits, or something. I don't know. I'm just saying that all indications are AMD doesn't intend to take on the 5080, and so Nvidia having a slightly larger chip that's faster but has fewer transistors? That may or may not be fully accurate.
    systemBuilder_49 said:
    It has lots of infinity cache which will drive up density vs the 4080. Cache circuitry is very small and packs extremely tightly onto the chip!
    AMD and Nvidia have both talked about how cache density scaling has slowed down. That was one of AMD's key talking points for GPU chiplets with RDNA 3! I'm still not fully convinced it's accurate, but I don't know. What I do know is that AMD and Nvidia both have a lot of cache in a monolithic design now, and transistor density doesn't seem to have been hurt too much.
  • lmcnabney
    JarredWaltonGPU said:
    As we note in the final paragraph.

    "Ultimately, though, it comes down to performance. Whatever the claimed transistor density, the faster chip will still be faster."

    Also, there are questions as to how dense these really are. I personally think there's a difference in counting transistors. Maybe AMD includes blank space, or debug logic that Nvidia omits, or something. I don't know. I'm just saying that all indications are AMD doesn't intend to take on the 5080, and so Nvidia having a slightly larger chip that's faster but has fewer transistors? That may or may not be fully accurate.

    AMD and Nvidia have both talked about how cache density scaling has slowed down. That was one of AMD's key talking points for GPU chiplets with RDNA 3! I'm still not fully convinced it's accurate, but I don't know. What I do know is that AMD and Nvidia both have a lot of cache in a monolithic design now, and transistor density doesn't seem to have been hurt too much.
    They are both 'staying / going back to' monolithic because they are both stuck on the same mature process. If they broke out chiplets, those would be on the same process, so they might as well keep everything on the same die. Chiplets only work when some components can be fabricated on older tech. There is no money to save by leaving part of the GPU on 5nm.
  • Notton
    mhmarefat said:
    Why didn't you compare 9070 XT vs 5070 Ti?
    Uhhh, maybe because 5070Ti is the cut down GB203, whereas the 5080 is the fully enabled one?

    And maybe because the 9070 is the cut down Navi48, whereas the 9070XT is the fully enabled one?
  • bit_user
    The article said:
    AMD's decision to abandon RDNA 3's chiplet-style design for a return to a monolithic die does not seem to have sacrificed density or efficiency.
    I found some info on how TSMC N4P compares with N5. According to wikichip, it's 6% denser than N5. So, replacing the high-speed links to the MCDs with 64 MiB of L3 cache would've yielded a density of 141.5 MTr/mm^2 on the N5 node used by the RX 7900 GCD.

    In other words, they did sacrifice density vs. RDNA3, but (assuming both numbers are accurate) gained back enough to compensate. Had they kept the same architecture as RDNA3, they would've gotten up to perhaps 159 MTr/mm^2. That loss of potential density increase seems to me like a sacrifice.

    Source:
    https://fuse.wikichip.org/news/6439/tsmc-extends-its-5nm-family-with-a-new-enhanced-performance-n4p-node/
    BTW, "... or efficiency"? The RX 7900 XTX actually burned something like 10W on those chiplet links, IIRC. The chiplet approach was less efficient, but it was something AMD did for the sake of cost reduction and perhaps in anticipation that the supply of N5 production would be limited like it was in the pandemic era.

    The article said:
    AMD appears to be attacking Nvidia's upper-midrange products with a zeal we haven't seen from the company's GPU division in quite some time.
    RDNA2 was quite competitive. AMD had a winning idea with Infinity Cache. I think they hoped their chiplets would be another game changer, but it just didn't pan out. If N5 production had been more scarce, maybe it'd have worked better for them.
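The node-scaling arithmetic in bit_user's comment can be sketched as follows. It assumes, as the comment does, a flat 6% density gain for N4P over N5 and the rounded 150 MTr/mm² figure for both Navi 48 and the Navi 31 GCD. Real designs do not scale uniformly (logic, SRAM, and analog circuits shrink at different rates), so this is only a back-of-envelope check, and the names are illustrative.

```python
# Assumed uniform density gain of TSMC N4P over N5 (figure cited in the comment).
N4P_OVER_N5_GAIN = 0.06

# Rounded densities from the article, in MTr/mm².
NAVI48_DENSITY_N4P = 150.0    # monolithic die, cache included
NAVI31_GCD_DENSITY_N5 = 150.0 # graphics compute die only, cache on separate dies

# What Navi 48's layout would hypothetically achieve if built on N5.
navi48_on_n5 = NAVI48_DENSITY_N4P / (1.0 + N4P_OVER_N5_GAIN)

# What a Navi 31-style GCD would hypothetically achieve if ported to N4P.
navi31_gcd_on_n4p = NAVI31_GCD_DENSITY_N5 * (1.0 + N4P_OVER_N5_GAIN)

print(f"Navi 48 layout on N5:    {navi48_on_n5:.1f} MTr/mm²")
print(f"Navi 31 GCD style on N4P: {navi31_gcd_on_n4p:.1f} MTr/mm²")
```

Under these assumptions the two results are roughly 141.5 and 159 MTr/mm², the same figures the comment arrives at.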