
AMD Big Navi and RDNA 2 GPUs: Release Date, Specs, Everything We Know

AMD Big Navi teaser
(Image credit: AMD)

AMD Big Navi, RX 6000, Navi 2x, RDNA 2. Whatever the name, AMD's next-generation GPUs are promising big performance and efficiency gains, along with feature parity with Nvidia in terms of ray tracing support. Will Team Red finally take pole position in our GPU benchmarks hierarchy and lay claim to the crown for the best graphics card, or will the Nvidia Ampere architecture cards keep the top spots? With the Radeon RX 6800 XT and RX 6800 launched, and the Radeon RX 6900 XT following close behind, the answer is... well, it's the closest AMD has come in quite some time to taking the top spot. Here's everything we know about AMD Big Navi, including the RDNA 2 architecture, specifications, performance, release date and pricing.

Nvidia has launched the GeForce RTX 3090, GeForce RTX 3080, GeForce RTX 3070 and GeForce RTX 3060 Ti. There are various ways of looking at the Nvidia Ampere launch. It's Nvidia doing its best to bury AMD before Big Navi even steps out the door, or Nvidia is scared of what AMD is doing with RDNA 2, or Nvidia rushed the launch to get ahead of the holiday shopping spree, or ... you get the point. The RTX 3060 Ti and 3070 are priced reasonably (relative to the Turing launch, at least), but demand right now is very high. Frankly, AMD would have loved to launch Big Navi earlier, but it has had a lot of other balls in the air (like Zen 3, not to mention all the COVID disruptions).

AMD officially unveiled Big Navi, including specs for the RX 6900 XT, RX 6800 XT, and RX 6800. We've updated this article with revised details and expectations, though there are still future RDNA 2 products yet to be revealed. Based on what we've seen, Big Navi has finally put AMD's high graphics card power consumption behind it. Or at least, Big Navi is no worse than Nvidia's RTX 30-series cards, considering the 3080 and 3090 have the highest Nvidia TDPs ever for single GPUs. Let's start at the top, with the new RDNA 2 architecture that powers RX 6000 / Big Navi / Navi 2x. Here's what we know, expect, and occasionally guess for AMD's upcoming GPUs.

Big Navi / RDNA 2 at a Glance

  • Up to 80 CUs / 5120 shaders
  • 50% better performance per watt
  • Launched November 18 (RX 6800 series) and December 8 (RX 6900 XT)
  • Pricing of $579, $649 and $999 for RX 6800, RX 6800 XT and RX 6900 XT

(Image credit: AMD)

The RDNA 2 Architecture in Big Navi 

Every generation of GPUs is built from a core architecture, and each architecture offers improvements over the previous generation. It's an iterative and additive process that never really ends. AMD's GCN architecture went from first generation for its HD 7000 cards in 2012 up through fifth gen in the Vega and Radeon VII cards in 2017-2019. The RDNA architecture that powers the RX 5000 series of AMD GPUs arrived in mid 2019, bringing major improvements to efficiency and overall performance. RDNA 2 looks to double down on those improvements in late 2020.

First, a quick recap of RDNA 1 is in order. The biggest changes with RDNA 1 over GCN involve a redistribution of resources and a change in how instructions are handled. In some ways, RDNA doesn't appear to be all that different from GCN. The instruction set is the same, but how those instructions are dispatched and executed has been improved. RDNA also adds working support for primitive shaders, something present in the Vega GCN architecture that never got turned on due to complications.

Perhaps the most noteworthy update is that the wavefronts—the core unit of work that gets executed—have been changed from being 64 threads wide with four SIMD16 execution units, to being 32 threads wide with a single SIMD32 execution unit. SIMD stands for Single Instruction, Multiple Data; it's a vector processing element that optimizes workloads where the same instruction needs to be run on large chunks of data, which is common in graphics workloads.

This matching of the wavefront size to the SIMD size helps improve efficiency. GCN issued one instruction per wave every four cycles; RDNA issues an instruction every cycle. GCN used a wavefront of 64 threads (work items); RDNA supports 32- and 64-thread wavefronts. GCN has a Compute Unit (CU) with 64 GPU cores, 4 TMUs (Texture Mapping Units) and memory access logic. RDNA implements a new Workgroup Processor (WGP) that consists of two CUs, with each CU still providing the same 64 GPU cores and 4 TMUs plus memory access logic.
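To make the issue-rate difference concrete, here's a back-of-the-envelope sketch (our illustration, not AMD's numbers) of how many cycles each architecture needs to push one instruction through a full wavefront:

```python
# Back-of-the-envelope issue-rate comparison (illustrative only).
def cycles_per_wave_instruction(wave_size: int, simd_width: int) -> int:
    """One wavefront instruction occupies a SIMD for wave_size/simd_width cycles."""
    return wave_size // simd_width

print(f"GCN  (wave64 on SIMD16): {cycles_per_wave_instruction(64, 16)} cycles")
print(f"RDNA (wave32 on SIMD32): {cycles_per_wave_instruction(32, 32)} cycle")
```

That four-to-one difference is why RDNA can issue an instruction per wave every cycle while GCN needed four.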

How much do these changes matter when it comes to actual performance and efficiency? It's perhaps best illustrated by looking at the Radeon VII, AMD's last GCN GPU, and comparing it with the RX 5700 XT. Radeon VII has 60 CUs, 3840 GPU cores, 16GB of HBM2 memory with 1 TBps of bandwidth, a GPU clock speed of up to 1750 MHz, and a theoretical peak performance rating of 13.8 TFLOPS. The RX 5700 XT has 40 CUs, 2560 GPU cores, 8GB of GDDR6 memory with 448 GBps of bandwidth, and clocks at up to 1905 MHz with peak performance of 9.75 TFLOPS.

On paper, Radeon VII looks like it should come out with an easy victory. In practice, across a dozen games that we've tested, the RX 5700 XT is slightly faster at 1080p gaming and slightly slower at 1440p. Only at 4K is the Radeon VII able to manage a 7% lead, helped no doubt by its memory bandwidth. Overall, the Radeon VII only has a 1-2% performance advantage, but it uses 300W compared to the RX 5700 XT's 225W.

In short, AMD is able to deliver roughly the same performance as the previous generation, with a third fewer cores, less than half the memory bandwidth and using 25% less power. That's a very impressive showing, and while TSMC's 7nm FinFET manufacturing process certainly warrants some of the credit (especially in regards to power), the performance uplift is mostly thanks to the RDNA architecture.

(Image credit: AMD)

That's a lot of RDNA discussion, but it's important because RDNA 2 appears to carry over all of that, with two major new additions. First is support for ray tracing, Variable Rate Shading (VRS), and everything else that's part of the DirectX 12 Ultimate spec. The other big addition is, literally, big: a 128MB Infinity Cache that will help optimize memory bandwidth and latency.

There are other tweaks to the architecture, but AMD is making some big claims about Big Navi / RDNA 2 / Navi 2x when it comes to performance per watt. Specifically, AMD says RDNA 2 will offer 50% more performance per watt than RDNA 1, which is frankly a huge jump—the same large jump RDNA 1 saw relative to GCN.

It means RDNA 2 could deliver the same performance while using 33% less power, or 50% higher performance at the same power, or, most likely, something in between, with higher performance and lower power requirements. It looks like AMD is doing a bit of both, as TBP (Total Board Power) is 300W for the 6900 XT and 6800 XT, and 250W for the RX 6800. Part of the performance-per-watt improvement comes thanks to the Infinity Cache, which we'll discuss more below.
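Taking the 50% figure at face value, the two endpoints of that trade-off work out as follows (a minimal sketch of the arithmetic, not an AMD projection):

```python
# Taking the "+50% performance per watt" claim at face value.
PPW_GAIN = 1.5

# Endpoint 1: same performance, so power drops to 1/1.5 of the original.
print(f"Same performance: {(1 - 1 / PPW_GAIN):.0%} less power")  # 33% less

# Endpoint 2: same power, so performance rises by the full factor.
print(f"Same power: {PPW_GAIN - 1:.0%} more performance")        # 50% more
```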

RDNA 2 / Big Navi / RX 6000 GPUs support ray tracing, via DirectX 12 Ultimate or VulkanRT. That brings AMD up to feature parity with Nvidia. There was some question as to whether AMD would use the same BVH approach to ray tracing calculations as Nvidia, and with the PlayStation 5 and Xbox Series X announcements out of the way, the answer is yes. The devil is in the details, though.

If you're not familiar with the term BVH, it stands for Bounding Volume Hierarchy and is used to efficiently find ray and triangle intersections; you can read more about it in our discussion of Nvidia's Turing architecture and its ray tracing algorithm. While AMD didn't provide much detail on its BVH hardware, BVH as a core aspect of the ray tracing APIs is required, and we heard similar talk about ray tracing and BVH with the VulkanRT and DirectX 12 Ultimate announcements.

AMD's RDNA 2 chips contain one Ray Accelerator per CU, which is similar to what Nvidia has done with its RT cores. Even though AMD takes roughly the same approach as Nvidia, the comparison between the two isn't clear cut. Nvidia, for example, says it roughly doubled the performance of its RT cores in Ampere. AMD says the Ray Accelerator does ray-triangle intersection calculations about 10 times faster than a software (shader) solution. In testing, Big Navi RT performance generally doesn't come anywhere close to matching Ampere, though it can usually keep up with Turing RT performance.
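For a sense of what a Ray Accelerator is actually speeding up, here's a minimal software version of the standard ray-triangle intersection test (Möller-Trumbore). This is purely illustrative of the math involved; it is not AMD's hardware algorithm or driver code:

```python
# Minimal Moller-Trumbore ray-triangle intersection (illustrative only;
# a Ray Accelerator does this class of work in fixed-function hardware).
def ray_triangle_intersect(orig, direction, v0, v1, v2, eps=1e-8):
    """Return the hit distance t along the ray, or None on a miss."""
    def sub(a, b):   return [a[i] - b[i] for i in range(3)]
    def dot(a, b):   return sum(a[i] * b[i] for i in range(3))
    def cross(a, b): return [a[1]*b[2] - a[2]*b[1],
                             a[2]*b[0] - a[0]*b[2],
                             a[0]*b[1] - a[1]*b[0]]

    edge1, edge2 = sub(v1, v0), sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < eps:                  # ray is parallel to the triangle
        return None
    inv_det = 1.0 / det
    tvec = sub(orig, v0)
    u = dot(tvec, pvec) * inv_det       # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det      # distance along the ray
    return t if t > eps else None

# A ray fired down the -z axis from (0, 0, 1) hits this triangle at t = 1.0.
print(ray_triangle_intersect([0, 0, 1], [0, 0, -1],
                             [-1, -1, 0], [1, -1, 0], [0, 1, 0]))
```

A single frame can require millions of these tests plus the BVH traversal that narrows down which triangles to test, which is why dedicated hardware matters.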

Big Navi die shot, Navi 21

AMD rendered image of the Navi 21 die. The Infinity Cache is the large green sections above and below the GPU cores. (Image credit: AMD)

The Infinity Cache is perhaps the most interesting change. By including a whopping 128MB cache (L3, but with AMD branding), AMD should be able to keep basically all of the framebuffer cached, along with the z-buffer and some recent textures. That will dramatically reduce memory bandwidth use and latency, and AMD claims the Infinity Cache allows the relatively tame 16 Gbps GDDR6 memory to deliver an effective bandwidth 2.17 times what the raw numbers would suggest.
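Some quick math (ours, not AMD's) shows why 128MB is enough to be interesting: even at 4K, a 32-bit color target plus a 32-bit depth buffer fit with plenty of room left over for other hot data.

```python
# Will a 4K frame fit in 128MB? Assumes a 4-byte-per-pixel color target and
# a 4-byte depth/stencil buffer (our assumption, not an AMD statement).
width, height, bytes_per_pixel = 3840, 2160, 4
target_mb = width * height * bytes_per_pixel / 2**20

print(f"4K color target:      {target_mb:.1f} MB")      # ~31.6 MB
print(f"Color + depth buffer: {2 * target_mb:.1f} MB")  # ~63.3 MB of 128 MB
```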

The Infinity Cache also helps with ray tracing calculations. We've seen on Nvidia's GPUs that memory bandwidth can impact RT performance on lower tier cards like the RTX 2060, but memory latency might also be to blame. We can't test Big Navi's RT performance without the Infinity Cache, however; all we know is that its RT performance tends to lag behind Nvidia's.

One thing we don't know about the Infinity Cache is how it will propagate down to the lower tier RDNA 2 chips. 128MB is very large, and based on AMD's image of the die, it's about 17 percent of the total die area on Navi 21. The CUs by comparison are only about 31 percent of the die area, with memory controllers, texture units, video controllers, video encoder/decoder hardware, and other elements taking up the rest of the chip. Navi 22 and Navi 23 will have far lower CU counts, which means a 128MB cache would occupy even more of the die space. Most likely, the Infinity Cache will either be cut down in size to 64MB (maybe 96MB or 32MB or some other value), or on the budget parts it might be removed completely. We'll have to wait and see what AMD does, but a smaller size is the most likely approach.

It's worth noting that Nvidia also has Tensor cores in its Turing architecture, which are used for deep learning and AI computations, as well as DLSS (Deep Learning Super Sampling), which has now been generalized with DLSS 2.0 (and DLSS 2.1) to improve performance and image quality and make it easier for games to implement DLSS. AMD doesn't have a Tensor core equivalent, though it mentioned an upcoming Super Resolution feature that will attempt to take on DLSS. AMD's CAS (Contrast Aware Sharpening) and RIS (Radeon Image Sharpening) already overlap with DLSS in some ways, and Super Resolution will be part of the open FidelityFX suite, so we'll have to wait and see how it compares against DLSS. Even without Tensor cores, it may be possible to do all the inference side of a DLSS-style algorithm using the FP16 capabilities of Navi 2x. For that matter, Super Resolution may even work on older generation GPUs from AMD, Nvidia, and even Intel.

AMD plans to have multiple Navi 2x products, and we expect to see extreme, high-end and mainstream options—though the latter are likely coming later, given the RX 5500 XT and RX 5600 XT aren't that old. AMD has released the highest performance options first and will eventually follow up with mid-range and budget solutions.

(Image credit: AMD)

RX 6000 / Big Navi / Navi 2x Specifications 

What does all of this mean for RX 6000 / Big Navi / RDNA 2 desktop GPUs? If you thought the Xbox Series X GPU sounded potent, what with its 52 CUs, Big Navi is taking things a step further. AMD is basically doubling down on Navi 10 when it comes to CUs and shaders, shoving twice the number of both into the largest Navi 21 GPU.

Navi 10 is relatively small at just 251mm²; Navi 21 more than doubles that. After months of rumors and speculation, we can finally shed some light on the full specs for the top three Big Navi GPUs.

AMD RX 6000 / Big Navi / Navi 2x Specifications
| Graphics Card | RX 6900 XT | RX 6800 XT | RX 6800 | RX 6700 XT? | RX 6500 XT? |
|---|---|---|---|---|---|
| GPU | Navi 21 XTX | Navi 21 XT | Navi 21 XL | Navi 22? | Navi 23? |
| Process (nm) | 7 | 7 | 7 | 7 | 7 |
| Transistors (billion) | 26.8 | 26.8 | 26.8 | ? | ? |
| Die size (mm^2) | 519 | 519 | 519 | ? | 236? |
| CUs | 80 | 72 | 60 | 40 | 32? |
| GPU cores | 5120 | 4608 | 3840 | 2560 | 2048? |
| Max Clock (MHz) | 2250 | 2250 | 2105 | 2100? | 2100? |
| VRAM Speed (MT/s) | 16000 | 16000 | 16000 | 16000? | 16000? |
| VRAM (GB) | 16 | 16 | 16 | 12 | 8 |
| Bus width (bits) | 256 | 256 | 256 | 192 | 128 |
| Infinity Cache (MB) | 128 | 128 | 128 | ? | ? |
| ROPs | 64 | 64 | 64 | 64 | 32? |
| TMUs | 320 | 288 | 240 | 160 | 128 |
| TFLOPS (boost) | 23.0 | 20.7 | 16.2 | 10.2 | 8.2 |
| Bandwidth (GB/s) | 512 | 512 | 512 | 384? | 256? |
| TBP (watts) | 300 | 300 | 250 | 200? | 125? |
| Launch Date | Dec 8, 2020 | Nov 18, 2020 | Nov 18, 2020 | Early 2021? | Early 2021? |
| Launch Price | $999 | $649 | $579 | $399? | $299? |
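As a sanity check, the boost TFLOPS row falls straight out of the GPU core counts and max clocks (one FMA, or two FP operations, per core per clock):

```python
# Cross-checking the "TFLOPS (boost)" row: cores x 2 FP ops (FMA) x max clock.
for name, cores, mhz in [("RX 6900 XT", 5120, 2250),
                         ("RX 6800 XT", 4608, 2250),
                         ("RX 6800",    3840, 2105)]:
    print(f"{name}: {cores * 2 * mhz * 1e6 / 1e12:.1f} TFLOPS")
# -> 23.0, 20.7 and 16.2 TFLOPS, matching the table.
```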

The highest spec parts all use the same Navi 21 GPU, just with differing numbers of enabled functional units. Navi 21 has 80 CUs and 5120 GPU cores, and at 519mm² it's more than double the size of the Navi 10 chip used in the RX 5700 XT. But a big chip means lower yields, so AMD has parts with 72 and 60 CUs as well.

So far, the Radeon RX 6900 XT has only been seen in very limited quantities, and there aren't nearly as many third-party designs as there are for the 6800 XT. That makes sense, as yields of fully functional Navi 21 chips are likely to be quite low. Disable eight CUs and yields jump significantly; with 20 CUs disabled, most of the chips on a wafer are probably viable. Anyway, the RX 6900 XT gets the full chip, the RX 6800 XT has 72 CUs and is the main launch part, and the RX 6800 is the third Navi 21 variant with 60 CUs.
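A toy defect model illustrates why salvage parts matter. Assuming, purely for illustration, that defects land randomly on a wafer and each one knocks out a single CU, the fraction of dice usable for a given SKU rises quickly with the number of CUs you're willing to disable:

```python
import math

# Toy Poisson defect model (our illustration, not AMD data): a die works for
# a given SKU if no more than `spare_cus` of its 80 CUs are hit by defects.
def sku_yield(defects_per_die: float, spare_cus: int) -> float:
    """P(defective CUs <= spare_cus) with a Poisson-distributed defect count."""
    return sum(math.exp(-defects_per_die) * defects_per_die**k / math.factorial(k)
               for k in range(spare_cus + 1))

lam = 4.0  # assumed average CU-killing defects per Navi 21 die (made up)
print(f"RX 6900 XT (80 CUs, 0 spares):  {sku_yield(lam, 0):.0%}")   # ~2%
print(f"RX 6800 XT (72 CUs, 8 spares):  {sku_yield(lam, 8):.0%}")   # ~98%
print(f"RX 6800    (60 CUs, 20 spares): {sku_yield(lam, 20):.0%}")  # ~100%
```

Real defect rates and their effects are far messier than this, but the shape of the curve is why fully enabled chips are the rare ones.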

Big Navi / RDNA 2 adds support for ray tracing and some other tech, which require quite a few transistors. The very large Infinity Cache is also going to use up a huge chunk of die area, but it also helps overcome potential bandwidth limitations caused by the somewhat narrow 256-bit bus width on Navi 21. Note that Nvidia has opted for a 320-bit bus on the 3080 and 384-bit on the 3090, plus faster GDDR6X memory, so Nvidia has much higher raw bandwidth, but it comes at the cost of higher power requirements and a larger die size.

Architecturally, AMD has tuned Big Navi to hit the highest GPU clock speeds we've seen to date, which come thanks to optimizations and data path updates that AMD has put into the design. While the Game Clock is only around 2GHz, in practice we've seen clocks in the 2.2-2.4 GHz range. That's perhaps one of several advantages stemming from TSMC's N7 process, which is generally superior to Samsung's 8N (revised 10nm) that Nvidia's using for Ampere.

Navi 10

(Image credit: AMD)

Getting back to the memory side of things, AMD's configurations are going to be very interesting to see in practice. 16GB of GDDR6 for the top three cards is great to see, and it means memory capacity at least won't be a problem. The memory bandwidth might look a bit weak, but the Infinity Cache goes a long way toward alleviating demands on the GDDR6 memory. If AMD's claimed 117% improvement in effective bandwidth proves accurate, the Navi 21 GPUs will behave as though they have over 1 TBps of bandwidth despite the raw 512 GBps figure. That's more than even the RTX 3090.
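Here's how those numbers stack up, taking AMD's 2.17x effective-bandwidth multiplier at face value (the RTX 3090 figure is the standard 384-bit, 19.5 Gbps GDDR6X calculation):

```python
# Raw bandwidth = bus width (bytes) x data rate. AMD's effective-bandwidth
# claim is a 2.17x multiplier on Navi 21's raw number.
navi21_raw = 256 / 8 * 16      # 256-bit bus, 16 GT/s GDDR6  -> 512 GB/s
rtx3090_raw = 384 / 8 * 19.5   # 384-bit bus, 19.5 GT/s GDDR6X -> 936 GB/s

print(f"Navi 21 raw:       {navi21_raw:.0f} GB/s")
print(f"Navi 21 effective: {navi21_raw * 2.17:.0f} GB/s")  # ~1111 GB/s
print(f"RTX 3090 raw:      {rtx3090_raw:.0f} GB/s")
```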

There are two other Navi 2x cards listed where we still don't have official specs. We've filled in the blanks with the latest rumors, but as with all rumors, take them with a scoop of salt. RX 6700 XT looks like it will carry on from the current RX 5700 XT at the $400 price point, and the RX 6500 XT (there's apparently no RX 6600 XT, at least so far) will take over from the RX 5600 XT. We don't expect either of these to launch until 2021, and they may not come out until March or even April.

As far as a true budget Navi 2x card is concerned, no one is posting any real information on that yet. There might be a Navi 24 or something in the coming year, with only 20-30 CUs max. That would put it at the level of the Xbox Series S, at which point we're not sure if it's really worth including ray tracing support. We'll have to see how things develop in the coming months, though, as 1080p with Super Resolution upscaling might run fine on such a GPU.

[Gallery: AMD Biggest Navi (RX 6900 XT) performance, 6 images] (Image credit: Tom's Hardware)

Big Navi / Navi 2x Performance

With the official launches now complete, we have replaced AMD's internal performance metrics with our own benchmarks — 21 of them, to be precise. All of the testing was done on a Core i9-9900K setup, though as we saw with the RX 6800 XT launch, the 10900K and Ryzen 9 5900X don't radically change things (especially at 1440p and 4K). The specs and performance are good to great in rasterization games, while AMD generally comes up far short of the competition in ray tracing workloads.

At the top, the RX 6900 XT goes up against RTX 3090. AMD gets some decisive wins, and overall the price to performance ratio of the 6900 XT looks good... until we check out ray tracing. Of course, when we're looking at $1,000 and $1,500 graphics cards, value shouldn't be a primary concern. Anyone with the money to pay for the absolute best should probably go with Nvidia, which provides access to DLSS and other extras that AMD doesn't offer.

Stepping down to the RX 6800 XT and RTX 3080, things get quite a bit more heated in the matchup. AMD's card comes up with the overall win at 1440p and 1080p (granted, in an AMD-weighted suite of games), while the 3080 still leads at 4K ultra. Of course, the 6800 XT costs $50 less and uses a bit less power as well. Ray tracing, again, is a weak spot for AMD, and that's without enabling DLSS on the Nvidia cards, which can boost performance by 50 percent or more in many cases.

The RTX 3070 and RX 6800 head-to-head swings even further in AMD's favor. AMD leads across nearly the entire test suite, the main exception being Watch Dogs Legion where we enabled DXR. This time, however, AMD's card costs more than the Nvidia competition, and we can point to the $400 RTX 3060 Ti as being the best overall value.

More important than price and performance right now is going to be availability, however. Whoever can ship more GPUs in the next month or two will come out ahead, and Nvidia of course got a two month head start over AMD in that respect. Based on what we've seen so far, supply of Big Navi GPUs is quite a bit worse than that of Ampere GPUs, and will likely remain so until February 2021 or later.

Big Navi / Navi 2x Graphics Card Model Names 

(Image credit: AMD)

AMD has said it will launch a whole series of Navi 2x GPUs. The Navi 1x family consists of RX 5700 XT, RX 5700, RX 5600 XT, and RX 5500 XT (in 4GB and 8GB models), along with RX 5600/5500/5300 models for the OEM market that lack the XT suffix. The RX 6000 series so far consists of just three options, but more will come.

AMD could simply add 1000 points to the current models, but there are a few changes this round. At the top, the 6900 and 6800 models don't have any direct previous-generation equivalent, which gives us new performance and pricing tiers with the RX 6900 XT, RX 6800 XT, and RX 6800. Below those, AMD hasn't announced any actual cards yet; we expect the RX 6700 XT, RX 6500 XT, and other models to show up next year.

There's still a 200 point gap between the high-end 6700 XT and the mainstream/budget 6500 XT, according to the latest rumors. But then, the rumors could be wrong so we'll just have to wait and see how things develop.

RX 6000 / Big Navi / RDNA 2 Release Date 

AMD successfully launched RDNA 2, aka Big Navi, in 2020 — depending on how you want to define some of those words. November 18 saw the RX 6800 XT and RX 6800 hit retail outlets for a few seconds, and the same happened on December 8 for the RX 6900 XT. What we don't know is how many cards were actually sold. We don't know how many Nvidia Ampere GPUs have been sold either, though the RTX 3080 at least has managed to show up in the latest Steam Hardware Survey (it's 'tied' with the RX 5600 XT in overall market share at 0.23 percent, though Valve has never divulged the statistics behind its survey, so take that with a shovel of salt).

The impact of COVID-19 around the globe continues to be a problem, and AMD has lots of pies cooking right now. Besides Big Navi, there's the new Ryzen 5000 CPUs, Xbox Series X and Xbox Series S, and PlayStation 5. All of the chips are manufactured using TSMC's N7 node, which is perhaps why Nvidia ended up at Samsung 8N for the consumer Ampere GPUs.

Regardless, supply likely won't be sufficient to meet the demand from gamers and enthusiasts any time soon, even on extreme parts like the RX 6900 XT. Just like Nvidia's Ampere GPUs, we expect AMD to sell everything it produces for at least the rest of 2020 and well into 2021. Hopefully by spring 2021 things will calm down a bit, and we'll have enough GPUs to go around. And less pandemic mayhem (knock on wood).

RX 6000 / Big Navi / Navi 2x Cost 

(Image credit: AMD)

It's not too surprising that AMD's RX 6000 GPUs are targeting higher prices than the previous generation. Even the RX 6800 costs $579, which is more than any of the previous RX series GPUs. (Only the Radeon VII cost more.) The RX 6900 XT meanwhile goes for a cool grand, a price we haven't seen from an AMD graphics card since the dual-GPU HD 7990 back in 2013 (though 2014's dual-GPU R9 295X2 still holds the record for AMD pricing at $1,499).

Earlier guesses at pricing ended up being far too ambitious, though considering where AMD stands relative to Nvidia's RTX 30-series performance, that's not too surprising. When you can play with the best, you expect people to pay for the best. AMD comes in well below Nvidia's pricing on the RTX 3090, and slightly below the RTX 3080 pricing as well. Meanwhile, the RX 6800 costs slightly more than the RTX 3070.

As for overall value, it's a bit of a mix. AMD gets a lot of wins in traditional rasterized games, but Nvidia has been doing RTX hardware and DLSS for two years now, and there's at least some benefit in that. Use of DLSS in particular is starting to pick up, and we definitely appreciate the option to boost performance with very little change in image fidelity. Depending on whether you want to play with ray tracing on or off, paying a bit more for Nvidia isn't out of the question.

Considering the large 519mm² die size, the cost of TSMC's N7 lithography, and the 16GB of VRAM on the models AMD has officially revealed, the pricing looks pretty reasonable. There are already situations where 8GB of VRAM can be a bit of a handicap, and the RTX 3070 sometimes struggles or has to drop settings a notch. With 16GB, VRAM isn't ever really a problem on the Big Navi cards. AMD has competitive GPUs that can go toe to toe with Nvidia, and just as it's doing with Zen 3 CPUs now that it appears to have a clear lead over Intel, AMD is raising its prices and ditching — or at least deemphasizing — the value proposition.

Traditionally, AMD GPUs tend to drop below MSRP much more quickly than Nvidia GPUs, but with the current shortages across all GPUs, that's not likely to happen any time soon. Maybe by mid-2021 we could see sub-$500 RX 6800, but much of that will depend on supply and demand. Nvidia can drop prices as well, so that's something to watch for once supply improves.

Big Navi and RX 6000 Closing Thoughts

AMD has a lot riding on Big Navi, RDNA 2, and the Radeon RX 6000 series. After playing second fiddle to Nvidia for the past several generations, AMD is ready to take its shot at the top. AMD has to worry about more than just PC graphics cards, though. RDNA 2 is the GPU architecture that powers the next generation of consoles, which tend to have much longer shelf lives than PC graphics cards. Look at the PS4 and Xbox One: both launched in late 2013 and are still in use today.

If you were hoping for a clear win from AMD, across all games and rendering APIs, that didn't happen. Big Navi performs great in many cases, but with ray tracing it looks decidedly mediocre. Higher performance in games that don't use ray tracing might be more important today, but a year or two down the road, that could change. Then again, the consoles have AMD GPUs and are more likely to see AMD-specific optimizations, so AMD isn't out of the running yet.

Just as important as performance and price, though, we need actual cards for sale. There's clearly demand for new levels of performance, and every Ampere GPU and Big Navi GPU so far has sold out as quickly as the products are available for purchase. There's only so much silicon to go around, sadly. Samsung apparently can't keep up with demand for Ampere GPUs, and TSMC has a lot more going on — it can only produce so many N7 wafers per month!

The bottom line is that if you're looking for a new high-end graphics card, Big Navi is a good competitor. But if you want something that can run every game at maxed out settings, even with ray tracing, at 4K and 60 fps? Not even the RTX 3090 can manage that, which means even while we're plagued with shortages on all the current GPUs, we're already looking toward the future next-gen GPUs. Save us, Hopper and RDNA3. You're our only hope!

  • animalosity
    Unless my math is wrong; 80 Compute Units * 96 Raster Operations * 1600 Mhz clock = 12.28 TFLOPS of single precision floating point (FP32).

    Not bad AMD. Not bad. Let's see what that translates to in the real world, though with the advances of DX12 and now Vulkan being implemented I expect AMD to be on a more even level playing field with high end Nvidia. I might be inclined to head back to team Red, especially if the price is right.
    Reply
  • JarredWaltonGPU
    animalosity said:
    Unless my math is wrong; 80 Compute Units * 96 Raster Operations * 1600 Mhz clock = 12.28 TFLOPS of single precision floating point (FP32).

    Not bad AMD. Not bad. Let's see what that translates to in the real world, though with the advances of DX12 and now Vulkan being implemented I expect AMD to be on a more even level playing field with high end Nvidia. I might be inclined to head back to team Red, especially if the price is right.
    Your math is wrong. :-)

    FLOPS is simply FP operations per second. It's calculated as a "best-case" figure, so FMA instructions (fused multiply add) count as two operations, and each GPU core in AMD and Nvidia GPUs can do one FMA per clock (peak theoretical performance). So FLOPS ends up being:
    GPU cores * 2 * clock

    For the tables:
    80 CUs * 64 cores/CU * 2 * clock (1600 MHz) = 16,384 GFLOPS.

ROPs and TMUs and some other functional elements of GPUs might do work that sort of looks like an FP operation, but they're not programmable or accessible in the same way as the GPU cores, so any instructions run on the ROPs or TMUs generally aren't counted as part of the FP32 performance.
    Reply
  • animalosity
    JarredWaltonGPU said:
    Your math is wrong. :)

    FLOPS is simply FP operations per second. It's calculated as a "best-case" figure, so FMA instructions (fused multiply add) count as two operations, and each GPU core in AMD and Nvidia GPUs can do one FMA per clock (peak theoretical performance). So FLOPS ends up being:
    GPU cores * 2 * clock

    For the tables:
    80 CUs * 64 cores/CU * 2 * clock (1600 MHz) = 16,384 GFLOPS.

Ah yes, I knew I was forgetting Texture Mapping Units. Thank you for the correction. I am assuming you meant 16.3 TFLOPS vice GigaFLOPS. I knew what you were trying to convey. Either way, those are some pretty impressive theoretical compute numbers. Excited to see how that translates to real world performance versus some pointless synthetic benchmark.
    Reply
  • JamesSneed
    Speaking of FLOPS we also should note that AMD gutted most of GCN that was left especially the parts that helped compute. I fully expect the same amount of FLOPS from this architecture to translate into more FPS since they are no longer making general gaming and compute GPU but a dedicated gaming GPU.
    Reply
  • JarredWaltonGPU
    animalosity said:
Ah yes, I knew I was forgetting Texture Mapping Units. Thank you for the correction. I am assuming you meant 16.3 TFLOPS vice GigaFLOPS. I knew what you were trying to convey. Either way, those are some pretty impressive theoretical compute numbers. Excited to see how that translates to real world performance versus some pointless synthetic benchmark.
    Well, 16384 GFLOPS is the same as 16.384 TFLOPS if you want to do it that way. I prefer the slightly higher precision of GFLOPS instead of rounding to the nearest 0.1 TFLOPS, but it would be 16.4 TFLOPS if you want to go that route.
    Reply
  • JarredWaltonGPU
    JamesSneed said:
    Speaking of FLOPS we also should note that AMD gutted most of GCN that was left especially the parts that helped compute. I fully expect the same amount of FLOPS from this architecture to translate into more FPS since they are no longer making general gaming and compute GPU but a dedicated gaming GPU.
    I'm not sure that's completely accurate. If you are writing highly optimized compute code (not gaming or general code), you should be able to get relatively close to the theoretical compute performance. Or at least, both GCN and Navi should end up with a relatively similar percentage of the theoretical compute. Which means:

    RX 5700 XT = 9,654 GFLOPS
    RX Vega 64 = 12,665 GFLOPS
    Radeon VII = 13,824 GFLOPS

    For gaming code that uses a more general approach, the new dual-CU workgroup processor design and change from 1 SIMD16 (4 cycle latency) to 2 SIMD32 (1 cycle latency) clearly helps, as RX 5700 XT easily outperforms Vega 64 in every test I've seen. But with the right computational workload, Vega 64 should still be up to 30% faster. Navi 21 with 80 CUs meanwhile would be at least 30% faster than Vega 64 in pure compute, and probably a lot more than that in games.
    Reply
  • JamesSneed
    JarredWaltonGPU said:
    I'm not sure that's completely accurate. If you are writing highly optimized compute code (not gaming or general code), you should be able to get relatively close to the theoretical compute performance. Or at least, both GCN and Navi should end up with a relatively similar percentage of the theoretical compute. Which means:

    RX 5700 XT = 9,654 GFLOPS
    RX Vega 64 = 12,665 GFLOPS
    Radeon VII = 13,824 GFLOPS

    For gaming code that uses a more general approach, the new dual-CU workgroup processor design and change from 1 SIMD16 (4 cycle latency) to 2 SIMD32 (1 cycle latency) clearly helps, as RX 5700 XT easily outperforms Vega 64 in every test I've seen. But with the right computational workload, Vega 64 should still be up to 30% faster. Navi 21 with 80 CUs meanwhile would be at least 30% faster than Vega 64 in pure compute, and probably a lot more than that in games.


    "Navi 21 with 80 CUs meanwhile would be at least 30% faster than Vega 64 in pure compute, and probably a lot more than that in games. "

Was my point ^ We will see more FPS in games than the FLOPS is telling us. It's not a case of 30% more FLOPS meaning we can expect that much more gaming performance; it won't be linear this go around.
    Reply
  • JarredWaltonGPU
    JamesSneed said:
    "Navi 21 with 80 CUs meanwhile would be at least 30% faster than Vega 64 in pure compute, and probably a lot more than that in games. "

Was my point ^ We will see more FPS in games than the FLOPS is telling us. It's not a case of 30% more FLOPS meaning we can expect that much more gaming performance; it won't be linear this go around.
    I agree with that part, though it wasn't clear from your original post that you were saying that. Specifically, the bit about "AMD gutted most of GCN that was left especially the parts that helped compute" isn't really accurate. AMD didn't "gut" anything -- it added hardware and reorganized things to make better use of the hardware. And ultimately, that leads to better performance in nearly all workloads.

    Interesting thought:
    If AMD really does an 80 CU Navi 2x part, at close to the specs I listed, performance should be roughly 60% higher than RX 5700 XT. Considering the RTX 2080 Ti is only about 30% faster than RX 5700 XT, that would actually be a monstrously powerful GPU. I suspect it will be a datacenter part first, if it exists, and maybe AMD will finally get a chance to make a Titan killer. Except Nvidia can probably get a 40-50% boost to performance over Turing by moving to 7nm and adding more cores, so I guess we wait and see.
    Reply
  • jeremyj_83
    JarredWaltonGPU said:
    I agree with that part, though it wasn't clear from your original post that you were saying that. Specifically, the bit about "AMD gutted most of GCN that was left especially the parts that helped compute" isn't really accurate. AMD didn't "gut" anything -- it added hardware and reorganized things to make better use of the hardware. And ultimately, that leads to better performance in nearly all workloads.

    Interesting thought:
    If AMD really does an 80 CU Navi 2x part, at close to the specs I listed, performance should be roughly 60% higher than RX 5700 XT. Considering the RTX 2080 Ti is only about 30% faster than RX 5700 XT, that would actually be a monstrously powerful GPU. I suspect it will be a datacenter part first, if it exists, and maybe AMD will finally get a chance to make a Titan killer. Except Nvidia can probably get a 40-50% boost to performance over Turing by moving to 7nm and adding more cores, so I guess we wait and see.
    Looking at the numbers AMD could get an RX 5700XT performance part in a 150W envelope if their performance/watt numbers can be believed. Having a 1440p GPU in the power envelope of a GTX 1660 would be a killer product.
    Reply
  • JamesSneed
    I am expecting INT8 performance to not move much from the RX5700XT. Shall see though as they do need to handle ray tracing.
    Reply