AMD Navy Flounder and Sienna Cichlid GPU Specs Leak in ROCm Update


A sharp-eyed redditor found a listing in the new ROCm (Radeon Open Compute) firmware update that reveals some of the specs for the highly anticipated RDNA2 Sienna Cichlid (Big Navi) and Navy Flounder (Navi 22 or 23) graphics cards. Things could change over the next month as AMD fine-tunes its gear, so take these specs with a grain of salt: even though the numbers come from an official firmware update, they aren't confirmed.

The firmware update indicates Sienna Cichlid (Navi 21, i.e., Big Navi) will feature 80 CUs and a 256-bit memory bus. The CU count is interesting because it lines up with what we already thought we knew about Big Navi, with most signs pointing to 80 CUs (or thereabouts). If this is true, and if the GPU is built on TSMC's latest 7nm process, we could see RTX 3080 levels of performance, but with much better efficiency.

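Navi GPU Specifications from Linux Kernel Firmware Update:

| Parameter | Navi 10 | Navi 14 | Navi 12 | Sienna Cichlid (Big Navi) | Navy Flounder (Navi 22/23) |
| --- | --- | --- | --- | --- | --- |
| gc_num_se | 2 | 1 | 2 | 4 | 2 |
| gc_num_cu_per_sh | 10 | 12 | 10 | 10 | 10 |
| gc_num_sh_per_se | 2 | 2 | 2 | 2 | 2 |
| gc_num_rb_per_se | 8 | 8 | 8 | 4 | 4 |
| gc_num_tccs | 16 | 8 | 16 | 16 | 12 |
| gc_num_gprs | 1024 | 1024 | 1024 | 1024 | 1024 |
| gc_num_max_gs_thds | 32 | 32 | 32 | 32 | 32 |
| gc_gs_table_depth | 32 | 32 | 32 | 32 | 32 |
| gc_gsprim_buff_depth | 1792 | 1792 | 1792 | 1792 | 1792 |
| gc_double_offchip_lds_buffer | 1024 | 512 | 1024 | 1024 | 1024 |
| gc_wave_size | 32 | 32 | 32 | 32 | 32 |
| gc_max_waves_per_simd | 20 | 20 | 20 | 16 | 16 |
| gc_lds_size | 64 | 64 | 64 | 64 | 64 |
| num_sc_per_sh | 1 | 1 | 1 | 1 | 1 |
| num_packer_per_sc | 2 | 2 | 2 | 4 | 4 |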
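
The headline CU counts and bus widths are not listed directly in the table; they can be worked out from the shader engine, shader array, and TCC fields. The Python sketch below shows one way to do that arithmetic. It rests on two assumptions that are not stated in the firmware listing: that total CUs equal gc_num_se × gc_num_sh_per_se × gc_num_cu_per_sh, and that each TCC corresponds to 16 bits of GDDR6 memory bus (a heuristic that happens to match the known 256-bit Navi 10 and 128-bit Navi 14). Navi 12 is left out because it is paired with HBM2, where that heuristic would not apply. The helper names are made up for illustration.

```python
# Values copied from the firmware table:
# (gc_num_se, gc_num_sh_per_se, gc_num_cu_per_sh, gc_num_tccs)
GPUS = {
    "Navi 10": (2, 2, 10, 16),
    "Navi 14": (1, 2, 12, 8),
    "Sienna Cichlid": (4, 2, 10, 16),
    "Navy Flounder": (2, 2, 10, 12),
}

# Assumption (not from the firmware): one TCC maps to 16 bits of GDDR6 bus.
BITS_PER_TCC = 16


def total_cus(num_se, sh_per_se, cu_per_sh):
    """Shader engines x shader arrays per engine x CUs per array."""
    return num_se * sh_per_se * cu_per_sh


def bus_width_bits(num_tccs):
    """Estimated memory bus width under the 16-bits-per-TCC assumption."""
    return num_tccs * BITS_PER_TCC


def summarize(gpus=GPUS):
    """Return {name: (cu_count, bus_width_bits)} for each GPU."""
    return {
        name: (total_cus(se, sh, cu), bus_width_bits(tccs))
        for name, (se, sh, cu, tccs) in gpus.items()
    }


for name, (cus, bus) in summarize().items():
    print(f"{name}: {cus} CUs, {bus}-bit bus (estimated)")
```

Under those assumptions, Sienna Cichlid works out to 80 CUs on a 256-bit bus and Navy Flounder to 40 CUs on a 192-bit bus, which matches the figures quoted in this article.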

The firmware specifications also list AMD's Navy Flounder with 40 CUs and a 192-bit memory bus. This GPU appears to be a direct replacement for the RX 5700 XT and/or RX 5700 on the new RDNA2 architecture. If each CU has the same 64 stream processors as RDNA1, then Navy Flounder will have the same core count as the 5700 XT.
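
The core-count comparison is simple multiplication, sketched here in Python. The 64 stream processors per CU is the RDNA1 figure mentioned above and is only assumed to carry over to RDNA2; the function name is hypothetical.

```python
# Assumption: RDNA2 keeps RDNA1's 64 stream processors per CU.
SP_PER_CU = 64


def stream_processors(cus, sp_per_cu=SP_PER_CU):
    """Total stream processors for a given CU count."""
    return cus * sp_per_cu


print("Navy Flounder (40 CUs):", stream_processors(40))
print("Sienna Cichlid (80 CUs):", stream_processors(80))
```

With 40 CUs the result is 2,560 stream processors, the same as the RX 5700 XT; 80 CUs would double that to 5,120.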

Strangely, at 192 bits the memory bus is narrower than that of the Navi 10 chips (RX 5700/XT). AMD could be pulling an Ampere tactic by using higher-frequency GDDR6X memory to compensate for the narrower bus. Unfortunately, there is not enough information to indicate just where these GPUs will be placed in AMD's lineup - they could compete with the rumored RTX 3060 Ti, possibly the RTX 3070, or even other SKUs in the future.
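
To put a number on how much faster the memory would need to be, peak bandwidth can be estimated as bus width in bytes times the per-pin data rate. The Python sketch below uses the RX 5700 XT's 256-bit bus with 14 Gbps GDDR6 as the baseline; the memory speed on Navy Flounder is unknown, so the 14 Gbps case for the 192-bit bus is purely a what-if, and the function names are made up for illustration.

```python
def bandwidth_gb_s(bus_bits, gbps_per_pin):
    """Peak memory bandwidth in GB/s: (bus width in bytes) x per-pin data rate."""
    return bus_bits / 8 * gbps_per_pin


def required_rate_gbps(bus_bits, target_gb_s):
    """Per-pin data rate (Gbps) needed to hit a target bandwidth on a given bus."""
    return target_gb_s * 8 / bus_bits


# Baseline: RX 5700 XT, 256-bit bus with 14 Gbps GDDR6.
rx_5700_xt = bandwidth_gb_s(256, 14)
print("RX 5700 XT:", rx_5700_xt, "GB/s")

# What-if: a 192-bit bus at the same 14 Gbps (hypothetical configuration).
print("192-bit @ 14 Gbps:", bandwidth_gb_s(192, 14), "GB/s")

# Data rate a 192-bit bus would need to match the RX 5700 XT.
print("Rate to match:", round(required_rate_gbps(192, rx_5700_xt), 2), "Gbps")
```

At the same 14 Gbps, a 192-bit bus would deliver 336 GB/s against the RX 5700 XT's 448 GB/s, so matching it would take memory running at roughly 18.7 Gbps per pin.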

Hopefully, this time around, AMD can make a bigger dent in Nvidia's market share over the upcoming months and years. AMD will announce these new GPUs under the RX 6000 series branding on October 28th.

Aaron Klotz
Contributing Writer

Aaron Klotz is a contributing writer for Tom’s Hardware, covering news related to computer hardware such as CPUs and graphics cards.

  • Endymio
    > " if the GPU runs on TSMC's latest 7nm process, we could see RTX 3080-levels of performance, but with much better efficiency. ..."
    How exactly is that conclusion reached? Isn't that the same node the 3080 is manufactured on?
  • craigss
    no its samsung 8nm
  • Endymio
    craigss said:
    no its samsung 8nm
    Yes, but I thought the two nodes were essentially equal in performance (and density), but that NVidia switched to Samsung because the thermals were better?
  • craigss
    no TSMC is better
  • dwards
    Endymio said:
    Yes, but I thought the two nodes were essentially equal in performance (and density), but that NVidia switched to Samsung because the thermals were better?
    No, I remember reading that Samsung’s node can hold 80 somethings per square mm, while TSMC’s can hold 96, which is exactly 20% more, so I assume that Nvidia could’ve made GPUs with better thermals if it used TSMC’s node instead of Samsung’s
  • spentshells
    Nvidia went with samsung due to availability from what I've read, sorry no references to offer as it was a while ago. TSMC is super busy.
  • digitalgriffin
    spentshells said:
    Nvidia went with samsung due to availability from what I've read, sorry no references to offer as it was a while ago. TSMC is super busy.
    It was a combination of factors. 8nm had a higher yield and depreciation cost so it was much cheaper per chip than the 7nm (Speculation) This decision was made 2 years ago. That's when you start to plan out the layout of the chip based on the fab rules. THIS IS ALL SPECULATION. But I cannot find fault in Adored's analysis.

    View: https://www.youtube.com/watch?v=tXb-8feWoOE&t=0s

    Plus 8nm Samsung was available. AMD and all the others seized that silicon capacity early on.
  • MasterMadBones
    Endymio said:
    Yes, but I thought the two nodes were essentially equal in performance (and density), but that NVidia switched to Samsung because the thermals were better?
    Samsung 8nm is roughly equivalent to early-mid 2019 TSMC 7nm in terms of performance and yield, but efficiency is much worse. I'm not sure truly how efficient either of these nodes is, but when density is 20% higher and efficiency is 15% higher on TSMC, thermal density should be roughly the same. The absolute amount of heat that needs to be dissipated from the Samsung die is larger though, which has resulted in large and elaborate cooling solutions on the FE cards.
    digitalgriffin said:
    8nm had a higher yield and depreciation cost so it was much cheaper per chip than the 7nm (Speculation) This decision was made 2 years ago
    This is comparing Samsung 8nm to Samsung 7nm. Samsung 7nm can be considered a 'broken' node.
    digitalgriffin said:
    Plus 8nm Samsung was available. AMD and all the others seized that silicon capacity early on.
    Nvidia tried to bully TSMC into lowering prices but they didn't budge because demand for 7nm was already high. AMD took up a large part of the capacity that Nvidia was looking to get in the meantime.
  • Giroro
    256-bit memory bus sounds like memory bandwidth could be too bottle-necked to reach 3080 performance, even if they use best available GDDR6X.
    But, then again The Xbox Series S is targeting 1440p120 using only 10GB total system memory... so maybe Navi2 is some miracle architecture with way better memory efficiency than anybody thought was possible.
  • digitalgriffin
    Giroro said:
    256-bit memory bus sounds like memory bandwidth could be too bottle-necked to reach 3080 performance, even if they use best available GDDR6X.
    But, then again The Xbox Series S is targeting 1440p120 using only 10GB total system memory... so maybe Navi2 is some miracle architecture with way better memory efficiency than anybody thought was possible.

    It's the caching architecture. There is a very large cache on Navi. And to be honest this is the first step to MCM Navi chips that scale using a similar type of infinity fabric.


    When you render to a small block, only a small subset of data is needed because blocks are relatively small. And you can only execute so many at a time. This limits the memory needed. BUT and this is a biggggggggg but, this is highly dependent on the look ahead predicting what memory is going to be needed for which tile and having it ready. That is some VERY tricky architectural challenges there. This might be possible as final texture render doesn't happen till later stages of the pipe. It could be easy to trip up the cache system. Textures are what eat up memory.