AMD Navy Flounder and Sienna Cichlid GPU Specs Leak in ROCm Update
Big Navi's looking good
A sharp-eyed redditor found a listing in the new ROCm (Radeon Open Compute) firmware update that reveals some of the specs for the highly anticipated RDNA2 Sienna Cichlid (Big Navi) and Navy Flounder (Navi 22 or 23) graphics cards. Take these specs with a grain of salt: even though the numbers come from an official firmware update, AMD hasn't confirmed them, and things could change over the next month as the company fine-tunes its gear.
The firmware update indicates Sienna Cichlid (Navi 21, i.e., Big Navi) will feature 80 CUs and a 256-bit memory bus. The 80-CU figure follows from the table below: 4 shader engines × 2 shader arrays per engine × 10 CUs per array. That count is notable because it lines up with earlier rumors, most of which pointed to 80 CUs (or thereabouts) for Big Navi. If it holds, and if the GPU runs on TSMC's latest 7nm process, we could see RTX 3080-level performance with much better efficiency.
Parameter | Navi 10 | Navi 14 | Navi 12 | Sienna Cichlid (Big Navi) | Navy Flounder (Navi 22/23) |
gc_num_se | 2 | 1 | 2 | 4 | 2 |
gc_num_cu_per_sh | 10 | 12 | 10 | 10 | 10 |
gc_num_sh_per_se | 2 | 2 | 2 | 2 | 2 |
gc_num_rb_per_se | 8 | 8 | 8 | 4 | 4 |
gc_num_tccs | 16 | 8 | 16 | 16 | 12 |
gc_num_gprs | 1024 | 1024 | 1024 | 1024 | 1024 |
gc_num_max_gs_thds | 32 | 32 | 32 | 32 | 32 |
gc_gs_table_depth | 32 | 32 | 32 | 32 | 32 |
gc_gsprim_buff_depth | 1792 | 1792 | 1792 | 1792 | 1792 |
gc_double_offchip_lds_buffer | 1024 | 512 | 1024 | 1024 | 1024 |
gc_wave_size | 32 | 32 | 32 | 32 | 32 |
gc_max_waves_per_simd | 20 | 20 | 20 | 16 | 16 |
gc_lds_size | 64 | 64 | 64 | 64 | 64 |
num_sc_per_sh | 1 | 1 | 1 | 1 | 1 |
num_packer_per_sc | 2 | 2 | 2 | 4 | 4 |
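The CU counts and bus widths quoted in this article are not listed directly in the firmware; they can be derived from the table. The sketch below shows one way to do that. It assumes each TCC maps to a 16-bit slice of the memory bus (a ratio that matches the known 256-bit Navi 10 and 128-bit Navi 14) and that each CU keeps RDNA1's 64 stream processors. Navi 12 is left out because it pairs with HBM2, where that per-TCC ratio would not apply. The helper names are illustrative and not part of ROCm.

```python
# Values copied from the firmware table in this article.
GPUS = {
    "Navi 10": {"gc_num_se": 2, "gc_num_sh_per_se": 2,
                "gc_num_cu_per_sh": 10, "gc_num_tccs": 16},
    "Navi 14": {"gc_num_se": 1, "gc_num_sh_per_se": 2,
                "gc_num_cu_per_sh": 12, "gc_num_tccs": 8},
    "Sienna Cichlid": {"gc_num_se": 4, "gc_num_sh_per_se": 2,
                       "gc_num_cu_per_sh": 10, "gc_num_tccs": 16},
    "Navy Flounder": {"gc_num_se": 2, "gc_num_sh_per_se": 2,
                      "gc_num_cu_per_sh": 10, "gc_num_tccs": 12},
}

BITS_PER_TCC = 16         # assumption: 16 bits of bus width per TCC
STREAM_PROCS_PER_CU = 64  # assumption: same as RDNA1


def compute_units(gpu):
    """Shader engines x shader arrays per engine x CUs per array."""
    return gpu["gc_num_se"] * gpu["gc_num_sh_per_se"] * gpu["gc_num_cu_per_sh"]


def bus_width_bits(gpu):
    """Estimated memory bus width under the 16-bits-per-TCC assumption."""
    return gpu["gc_num_tccs"] * BITS_PER_TCC


def stream_processors(gpu):
    """Estimated shader count under the 64-per-CU assumption."""
    return compute_units(gpu) * STREAM_PROCS_PER_CU


for name, gpu in GPUS.items():
    print(f"{name}: {compute_units(gpu)} CUs, "
          f"{stream_processors(gpu)} stream processors, "
          f"{bus_width_bits(gpu)}-bit bus")
```

Under these assumptions, Sienna Cichlid comes out at 80 CUs on a 256-bit bus and Navy Flounder at 40 CUs on a 192-bit bus, matching the figures reported here.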
The firmware specifications also list AMD's Navy Flounder with 40 CUs and a 192-bit memory bus. This GPU appears to be a direct replacement for the RX 5700 XT and/or RX 5700 on the new RDNA2 architecture. If each CU has the same 64 stream processors as RDNA1, then Navy Flounder will have the same core count as the 5700 XT (2,560).
Strangely, at 192 bits, the memory bus is narrower than that of Navi 10 chips (RX 5700/XT). AMD could be pulling an Ampere tactic by using higher-frequency GDDR6X memory to compensate for the narrower bus. Unfortunately, there is not enough information to indicate just where these GPUs will be placed in AMD's lineup. They could compete with the rumored RTX 3060 Ti, possibly the RTX 3070, or even other SKUs in the future.
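To put the narrower bus in perspective, peak memory bandwidth is the bus width in bytes multiplied by the per-pin data rate. The sketch below estimates how fast the memory on a 192-bit bus would need to run to match the RX 5700 XT. The 14 Gbps figure is the RX 5700 XT's GDDR6 data rate, not something stated in the firmware, and the function names are illustrative.

```python
def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s: (bus width in bytes) x (Gbps per pin)."""
    return bus_width_bits / 8 * data_rate_gbps


def required_data_rate_gbps(target_gb_s, bus_width_bits):
    """Per-pin data rate needed to reach a target bandwidth on a given bus."""
    return target_gb_s * 8 / bus_width_bits


# Assumption: RX 5700 XT baseline of a 256-bit bus with 14 Gbps GDDR6.
navi10_bandwidth = bandwidth_gb_s(256, 14.0)

# Same memory speed on the rumored 192-bit Navy Flounder bus.
flounder_same_speed = bandwidth_gb_s(192, 14.0)

# Data rate a 192-bit bus would need to match the baseline.
flounder_needed_rate = required_data_rate_gbps(navi10_bandwidth, 192)

print(f"Navi 10 baseline:        {navi10_bandwidth:.0f} GB/s")
print(f"192-bit at 14 Gbps:      {flounder_same_speed:.0f} GB/s")
print(f"Rate needed to match it: {flounder_needed_rate:.2f} Gbps")
```

At the same memory speed, the 192-bit bus delivers three-quarters of the baseline bandwidth, so the memory would have to run about a third faster to close the gap, unless the architecture reduces how much bandwidth it needs in the first place.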
Hopefully, this time around, AMD can make a bigger dent in Nvidia's market share over the upcoming months and years. AMD will announce these new GPUs under the RX 6000 series branding on October 28th.
Aaron Klotz is a contributing writer for Tom’s Hardware, covering news related to computer hardware such as CPUs and graphics cards.
Endymio
> "if the GPU runs on TSMC's latest 7nm process, we could see RTX 3080-levels of performance, but with much better efficiency. ..."
How exactly is that conclusion reached? Isn't that the same node the 3080 is manufactured on?
Endymio
> craigss said: no its samsung 8nm
Yes, but I thought the two nodes were essentially equal in performance (and density), but that NVidia switched to Samsung because the thermals were better?
dwards
> Endymio said: Yes, but I thought the two nodes were essentially equal in performance (and density), but that NVidia switched to Samsung because the thermals were better?
No, I remember reading that Samsung’s node can hold 80 somethings per square mm, while TSMC’s can hold 96, which is exactly 20% more, so I assume that Nvidia could’ve made GPUs with better thermals if it used TSMC’s node instead of Samsung’s.
spentshells
Nvidia went with Samsung due to availability from what I've read, sorry no references to offer as it was a while ago. TSMC is super busy.
digitalgriffin
> spentshells said: Nvidia went with samsung due to availability from what I've read, sorry no references to offer as it was a while ago. TSMC is super busy.
It was a combination of factors. 8nm had a higher yield and depreciation cost so it was much cheaper per chip than the 7nm (Speculation). This decision was made 2 years ago. That's when you start to plan out the layout of the chip based on the fab rules. THIS IS ALL SPECULATION. But I cannot find fault in Adored's analysis.
https://www.youtube.com/watch?v=tXb-8feWoOE&t=0s
Plus 8nm Samsung was available. AMD and all the others seized that silicon capacity early on.
MasterMadBones
> Endymio said: Yes, but I thought the two nodes were essentially equal in performance (and density), but that NVidia switched to Samsung because the thermals were better?
Samsung 8nm is roughly equivalent to early-mid 2019 TSMC 7nm in terms of performance and yield, but efficiency is much worse. I'm not sure truly how efficient either of these nodes is, but when density is 20% higher and efficiency is 15% higher on TSMC, thermal density should be roughly the same. The absolute amount of heat that needs to be dissipated from the Samsung die is larger though, which has resulted in large and elaborate cooling solutions on the FE cards.
> digitalgriffin said: 8nm had a higher yield and depreciation cost so it was much cheaper per chip than the 7nm (Speculation) This decision was made 2 years ago
This is comparing Samsung 8nm to Samsung 7nm. Samsung 7nm can be considered a 'broken' node.
> digitalgriffin said: Plus 8nm Samsung was available. AMD and all the other seized that silicon capacity early on.
Nvidia tried to bully TSMC into lowering prices but they didn't budge because demand for 7nm was already high. AMD took up a large part of the capacity that Nvidia was looking to get in the meantime.
Giroro
256-bit memory bus sounds like memory bandwidth could be too bottle-necked to reach 3080 performance, even if they use best available GDDR6X.
But, then again the Xbox Series S is targeting 1440p120 using only 10GB total system memory... so maybe Navi2 is some miracle architecture with way better memory efficiency than anybody thought was possible.
digitalgriffin
> Giroro said: 256-bit memory bus sounds like memory bandwidth could be too bottle-necked to reach 3080 performance, even if they use best available GDDR6X.
> But, then again The Xbox Series S is targeting 1440p120 using only 10GB total system memory... so maybe Navi2 is some miracle architecture with way better memory efficiency than anybody thought was possible.
It's the caching architecture. There is a very large cache on Navi. And to be honest this is the first step to MCM Navi chips that scale using a similar type of infinity fabric.
When you render to a small block, only a small subset of data is needed because blocks are relatively small. And you can only execute so many at a time. This limits the memory needed. BUT and this is a biggggggggg but, this is highly dependent on the look ahead predicting what memory is going to be needed for which tile and having it ready. Those are some VERY tricky architectural challenges there. This might be possible as final texture render doesn't happen till later stages of the pipe. It could be easy to trip up the cache system. Textures are what eat up memory.