The Nvidia Ampere architecture will power the RTX 3080 and other GPUs and will be the next major upgrade from Team Green. We'll know more on August 31, 2020, apparently, as Nvidia has issued a countdown to the 21st anniversary of its first GPU, the GeForce 256. The Ampere GPUs should rank high on our GPU hierarchy and list of the best graphics cards once they arrive, and we expect the same from AMD's Big Navi. Let's get into the details of what we know about Ampere, including potential specifications, release date, price, features, and more.
In an apparent break from recent tradition, Ampere will find its way into upcoming GeForce RTX 3090, RTX 3080, RTX 3070 and RTX 3060 graphics cards. Models may also include a suffix like Ti, Super, or even Ultimate. Those model names aren't set in stone, however, so perhaps Nvidia will throw us all a curveball again and change things at the last second. That seems unlikely, though, as Micron has confirmed the RTX 3090 will have 12GB of 21 Gbps GDDR6X.
Getting into the details, there will be multiple variations of Ampere, as with previous Nvidia GPUs. Nvidia CEO Jensen Huang unveiled the data center focused A100 on May 14, giving us our first official taste of what's to come. However, the A100 is not expected to go into consumer GeForce cards. It might find its way into a Titan, perhaps, but it's the replacement for the Volta GV100. Ampere will be in consumer GPUs as well, and the latest indications are that the RTX 3090, RTX 3080, and other graphics cards will arrive in the next month or two.
It's important to note that many of the rumors and leaks are unconfirmed, and any claims of pricing are fabrications/guesses. Outside of the GA100, Nvidia A100, DGX A100 and related parts, Nvidia has released no concrete information. The GA100 GPU isn't the same core design as we'll see in consumer models, as it has no ray tracing hardware, includes extra hardware for FP64 operations, and likely has extra Tensor cores for deep learning and machine intelligence work. While we can estimate where Nvidia may go on other Ampere GPUs, nothing is certain.
What's more, no GPU company gives out pricing details months in advance of a product's launch. Nvidia is very tight-lipped about what it's working on, and the transition from the Turing to Ampere architecture is going to be particularly big for the company. Just wait one more month (give or take) and we'll hopefully be able to spill the beans.
We're as excited as anyone about Nvidia's next generation GPU architecture, but we also want to separate fact from fiction. Or in other words, take everything with a grain of salt. Let's also point out that Ampere is critical for Nvidia, on many levels. Recently, in its Super Spring laptops announcement, Nvidia revealed that "15 million RTX GPUs" have been sold, which is fine but almost certainly not as many as Nvidia would like.
Ampere is Nvidia's chance to prove ray tracing is more than just a high-end feature. This is 2nd Gen RTX hardware, and after a relatively slow start for ray tracing use in games, Nvidia has a lot to prove. The RTX 3090, RTX 3080, RTX 3070, etc. (which is what we're calling them for now) need to provide not just better performance in games using traditional rendering techniques, but also a dramatic increase in ray tracing performance, which would open the doors to doing more RT effects without tanking frame rates. For now, the supposed performance data leaks suggest the RTX 3080 might be 30% faster than an RTX 2080 Ti. Yummy!
Nvidia Ampere At A Glance:
- Up to 128 SMs / 8192 GPU cores (for GA100)
- The GPUs should be much faster than the RTX 20-series
- Nvidia's first 7nm part should be much more efficient than Turing
- Release Date: We expect to see Ampere in September 2020
- Price: RTX 3080 likely to cost a lot, but we'll have to wait and see
Meet the Nvidia GeForce RTX 3080
Starting with the most concrete 'leaks,' above is a preview of what appears to be Nvidia's reference model RTX 3080 Founders Edition. While it's possible things will change, there are enough images floating about now that we can be confident the card pictured above will appear in some form. Maybe it won't be called the Founders Edition, or maybe it's a third party card — certainly we'll see a bunch of custom designs from Nvidia's add-in board (AIB) partners — but the RTX 3080 and associated GPUs might look nothing like the previous generation Nvidia cards.
Perhaps that's for the best. While I like the look of the current RTX 20-series Founders Edition line of cards, I've noticed that the backplate on the GPUs can get extremely hot while gaming—especially on the RTX 2080 Ti Founders Edition. The new cooler has two fans pushing air through the heatsink, one on each side, and there are rumors circulating that RTX 3090 could have a 350W TDP (thermal design power). That's supposedly the new 'halo' card, taking over the spot formerly occupied by the 2080 Ti; alternatively, it's replacing the Titan RTX. Either way, it's a beast.
It's not just the RTX 3090 packing a new cooler and a massive TDP, however. Igor's Lab claims contacts have verified the RTX 3080 will have a 320W TDP, which is the highest power draw from a consumer Nvidia card outside of the dual-GPU Titan Z and GTX 690. Perhaps Nvidia is feeling some pressure from AMD, or maybe it just wants to knock one out of the park. We'll have to wait and see the final specs, of course, but 300W consumer cards seem pretty unlikely given the move to 7nm.
Nvidia's Ampere Architecture
With the initial GA100 and Nvidia A100 announcement behind us, a few things have been cleared up. For one, Nvidia will continue to have two separate lines of GPUs, one focused on data centers and deep learning, and the other on graphics and gaming. The changes made with the data center GA100 may or may not propagate to the other Ampere GPUs, but here's what we know of the Ampere architecture so far.
First, Ampere is far more than a simple die shrink of Turing from 12nm to 7nm. The fundamental building block of Nvidia GPUs is called a Streaming Multiprocessor (SM). Its AMD analog is the Compute Unit (CU), and at a high level, it's relatively safe and easy to compare the two companies' GPUs based on SMs vs. CUs. The Turing architecture brought plenty of changes to the SM configuration, and it's a safe bet that Ampere will bring additional changes.
Turing added RT cores and Tensor cores, for ray tracing ray/triangle intersection calculations and deep learning FP16 calculations, respectively. Beyond the RT and Tensor cores, the CUDA core is the major GPU hardware in Nvidia graphics cards. For Turing, Nvidia switched from having 128 CUDA cores per SM to 64 CUDA cores. Turing also added a dedicated integer (INT) pipeline to each CUDA core, which allows for concurrent INT and FP (floating-point) calculations. Previously, a shader core would have to switch from doing FP to doing INT, which reduced overall efficiency and throughput. The Turing CUDA cores also added support for rapid packed math (FP16) calculations, which basically double the computational throughput relative to FP32 at the cost of reduced precision, a trade-off that's useful for certain types of calculations.
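To make the "reduced precision" part of that trade-off concrete, here's a minimal sketch that rounds a few values to FP16 and back using Python's standard `struct` module. It only illustrates the number format, not how Turing's hardware executes packed math, and the helper name `to_fp16` is our own.

```python
import struct

def to_fp16(value: float) -> float:
    """Round a Python float to IEEE 754 half precision (FP16) and back."""
    # The "e" format character packs a 16-bit half-precision float.
    return struct.unpack("<e", struct.pack("<e", value))[0]

# 0.5 is exactly representable; 1/3 and 2049 are not.
for value in (0.5, 1 / 3, 2049.0):
    stored = to_fp16(value)
    error = abs(stored - value)
    print(f"{value!r} stored as FP16 -> {stored!r} (absolute error {error:.3g})")
```

FP16 keeps only about three decimal digits of precision, which is fine for many shading and machine learning calculations but not for work that needs exact results.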
We're really condensing everything that changed with Turing here, but in addition to the above, there were changes to the L1/L2 cache, support for Variable Rate Shading (VRS), mesh shaders, Texture Space Shading (TSS), Multi-View Rendering (MVR), and enhancements to Simultaneous Multi-Projection (SMP). Most of those are now part of the official DirectX 12 Ultimate API, and also have support in VulkanRT. Oh, and the NVENC hardware got a major upgrade that added hardware accelerated encoding and decoding of higher resolutions and more codecs like VP9 and HEVC.
With the GA100, Nvidia upgrades the Volta GV100 architecture from late 2017. There are no RT cores, but there are major changes to the Tensor cores at the very least. Also, there are a lot of SMs: GA100 has up to 128 SMs (only 108 SMs are enabled in the Nvidia A100), with 8192 FP32 CUDA cores, 8192 INT32 CUDA cores, and 4096 FP64 CUDA cores. Most importantly, the GA100 has 54 billion transistors, 2.56 times as many as the GV100, with a die size of 826 mm² that's only 1.3% larger than the GV100.
A huge chunk of the additional transistors must be going to new features. GA100 'only' has 52% more SMs and GPU cores available. We know the L2 cache is larger, and the 3rd generation Tensor core (Volta is 1st gen, Turing is 2nd gen) adds support for both TF32 and FP64 operations, along with support for 'sparsity' operations. These will prove vital for the data center, while FP64 isn't typically used with consumer GPUs (at least not for gaming purposes). Overall, Nvidia says the Tensor cores in the GA100 deliver twice the throughput of those in GV100, even though there are half as many cores, so each core is effectively 4X faster.
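The GA100 versus GV100 comparisons above are simple ratios, and they can be checked with a few lines of Python. Note that the GV100 figures used here (21.1 billion transistors, 815 mm², 84 SMs) aren't quoted in this article; they're Nvidia's published Volta specs and should be treated as our assumption.

```python
# GA100 figures come from the article; the GV100 figures are an assumption
# based on Nvidia's published Volta specifications.
GA100 = {"transistors_billion": 54.0, "die_mm2": 826, "sms": 128}
GV100 = {"transistors_billion": 21.1, "die_mm2": 815, "sms": 84}

def ratio(key: str) -> float:
    """How many times larger GA100 is than GV100 for the given metric."""
    return GA100[key] / GV100[key]

def percent_more(key: str) -> float:
    """Percentage increase from GV100 to GA100 for the given metric."""
    return (ratio(key) - 1) * 100

transistor_ratio = ratio("transistors_billion")
die_growth_pct = percent_more("die_mm2")
sm_growth_pct = percent_more("sms")

# Twice the Tensor throughput from half as many cores means 4X per core.
tensor_speedup_per_core = 2.0 / 0.5

# CUDA cores per SM implied by the full 128-SM configuration.
fp32_per_sm = 8192 // GA100["sms"]
fp64_per_sm = 4096 // GA100["sms"]

print(f"Transistors: {transistor_ratio:.2f}x")
print(f"Die size: +{die_growth_pct:.1f}%")
print(f"SMs: +{sm_growth_pct:.0f}%")
print(f"Tensor speedup per core: {tensor_speedup_per_core:.0f}x")
print(f"FP32 cores per SM: {fp32_per_sm}, FP64 cores per SM: {fp64_per_sm}")
```

The gap between a 2.56x transistor budget and a 52% increase in SMs is what suggests that much of the extra silicon went to new features rather than more cores.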
The GA100 also has two additional HBM2 channels available compared to GV100, though one of those is disabled in the currently shipping Nvidia A100 solutions. Additional features include multi-instance GPU support, allowing the GA100 to function as up to seven separate smaller GPUs, support of sparsity acceleration (another data center feature), and NVLink speed is now 600 GBps, twice as fast as in GV100.
Nvidia Ampere Potential Specifications
Unlike AMD's Big Navi, Nvidia doesn't have any announced console tie-ins with hardware specs, but with the GA100 reveal we have plenty to go on regarding lower spec Ampere solutions that will go into cards like the RTX 3080. There have also been various supposed leaks over the past six months, which as usual need to be taken with a healthy dose of skepticism.
The biggest problem is that all of the leaks so far appear to have been of GA100 GPUs, which are not going into consumer graphics cards. Some of the technology, like the enhanced Tensor cores, may trickle down into consumer models (probably not with FP64 support), and the GeForce RTX 30-series hardware will absolutely have RT and Tensor cores.
At present, Nvidia is rumored to have at least three Ampere GPUs slated to launch in 2020 through early 2021, and potentially as many as three additional Ampere solutions will come out during the coming year or so. The top model Nvidia A100 will likely only be for deep learning and HPC solutions, but the others will go into GeForce and Quadro cards. Here's what the rumors indicate, along with some of our own speculation, and there are plenty of question marks in the table.
|Graphics Card|Nvidia A100|GeForce RTX 3090|GeForce RTX 3080|GeForce RTX 3070|GeForce RTX 3060|GeForce RTX 3050|
|---|---|---|---|---|---|---|
|Die Size (mm^2)|826|~500?|~367?|~267?|~200?|~150?|
|SMs|Up to 128|Up to 84?|Up to 60?|Up to 40?|Up to 30?|Up to 20?|
|Boost Clock (MHz)|1410|1750?|2000?|1900?|2000?|2000?|
|VRAM Speed (Gbps)|2.43|21 (GDDR6X)|19-21?|18?|16?|14?|
|VRAM (GB)|48 max|12|11??|10?|8?|6?|
|Bus Width (bits)|6144 max|384|352?|320?|256?|192?|
|Tensor TFLOPS (FP16)|739|602?|492?|311?|246?|164?|
|TBP (watts)|400 (250 PCIe)|350??|320??|250??|160?|120?|
|Launch Date|May 2020|September 2020|September 2020|Fall 2020|Winter 2021|Spring 2021|
|Launch Price|$199K for DGX A100|$1,500??|$800??|$600??|$400??|$250??|
The biggest and baddest GPU is the A100, where we've listed the maximum specs. It has up to 128 SMs, of which only 108 are currently enabled in the Nvidia A100, but future variations could have the full GPU and RAM configuration. However, the GA100 isn't going to be a consumer part, just like the GP100 and GV100 before it were only for data center use (and the Titan V).
Stepping down to the chips likely to be used in GeForce RTX 30-series cards, the GA102 will be the top configuration, likely going into the RTX 3090. We've heard rumors (take with a huge helping of salt) that it could have 84 SMs and 12GB of GDDR6X. The memory is basically confirmed, but the GPU isn't, and we've heard a few rumors it might be even bigger, like up to 128 SMs. One possibility is that Nvidia basically takes the GA100, strips out FP64, and adds in ray tracing. Whatever the actual GPU has, it will likely be up to 50% faster than the current RTX 2080 Ti in typical performance, and we've heard the ray tracing cores have been overhauled to provide a substantial boost in performance. Even with the added features, the chip should still be quite a bit smaller than the current TU102, thanks to 7nm.
The other chips progress down the line, though there's disagreement on what the GPUs will be called, which ones will go into each GeForce model, etc. Will the RTX 3080 use a trimmed down (partially disabled) GA102, or will it have a separate GA103 chip? Either is possible. It's also possible there will be multiple levels of some GPUs, which we'll discuss below. However Nvidia gets there, we expect a range of SM configurations, which we've estimated in the above table. The performance follows the SM and GPU clocks, both of which can be tweaked as needed. In other words, nothing is set in stone until Nvidia actually announces the parts, and it could easily change certain specifications up to that time.
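Since performance follows the SM count and clocks, the Tensor TFLOPS row in the table can be reproduced from the other rows. The sketch below assumes every Ampere SM matches GA100's rate of 1024 dense FP16 fused multiply-adds per clock, and that the listed figures include the 2X sparsity speedup. Neither assumption is confirmed for GeForce parts, and the function and constant names are ours.

```python
# Assumptions (not confirmed for consumer GPUs): GA100-style Tensor cores
# doing 1024 dense FP16 FMAs per SM per clock, with 2X from sparsity.
FMA_PER_SM_PER_CLOCK = 1024
OPS_PER_FMA = 2          # one multiply plus one add
SPARSITY_SPEEDUP = 2

def tensor_tflops(sms: int, boost_mhz: float, sparse: bool = True) -> float:
    """Estimate peak FP16 Tensor TFLOPS from SM count and boost clock."""
    ops_per_second = sms * FMA_PER_SM_PER_CLOCK * OPS_PER_FMA * boost_mhz * 1e6
    if sparse:
        ops_per_second *= SPARSITY_SPEEDUP
    return ops_per_second / 1e12

# (SMs, boost clock in MHz) taken from the rumored specs table.
rumored_specs = {
    "Nvidia A100 (full GA100)": (128, 1410),
    "GeForce RTX 3090": (84, 1750),
    "GeForce RTX 3080": (60, 2000),
    "GeForce RTX 3070": (40, 1900),
    "GeForce RTX 3060": (30, 2000),
    "GeForce RTX 3050": (20, 2000),
}

for card, (sms, boost_mhz) in rumored_specs.items():
    print(f"{card}: {tensor_tflops(sms, boost_mhz):.0f} TFLOPS")
```

Changing either the SM count or the boost clock scales the estimate linearly, which is why these numbers are only as good as the rumored specs behind them.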
Moving on to memory, Nvidia has been shipping 8GB VRAM with high-end GPUs with the GTX 10-series and RTX 20-series, with 11GB on the 1080 Ti and 2080 Ti. We tentatively expect to see a move to 11GB on the RTX 3080 this next round, with 12GB on the top RTX 3090 model. With the Micron data dump, the VRAM targets are looking a lot clearer, but there's still wiggle room. A third round of 11GB feels weird, for example. It was surprising when Nvidia 'turned the dial to 11' with the 1080 Ti, but now we'd just like to see a more normal number, like maybe 12GB or 16GB. With the added speed of GDDR6X, though, Nvidia may opt for 10GB and also use 10GB (but GDDR6, not GDDR6X) on the 3070.
Whatever the memory capacity, we expect higher memory clocks on Ampere. The RTX 3090 will use 12GB GDDR6X, but everything else is less certain. Faster GDDR6 exists as well, so that could mean 18 Gbps or more on the higher tier GPUs, and 16 Gbps on lower tier parts. Nvidia will probably continue with 14 and 12 Gbps on the budget and midrange cards, as prices would be lower. There's also a question about whether 8GB VRAM will be 'enough' for the coming generation of games, as there are already games that come close to using 8GB. Nvidia could do 6GB on the budget models, but midrange and above really need to have 8GB or more. Games like Doom Eternal and Red Dead Redemption 2 limit the settings you can use if you don't have at least 8GB VRAM.
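To see why the per-pin speed matters, peak memory bandwidth is just the bus width multiplied by the data rate, divided by eight to convert bits to bytes. Here's a quick sketch using the rumored figures from the table. The RTX 3080 entry uses the low end of its 19-21 Gbps range, and the RTX 2080 Ti reference point (352-bit, 14 Gbps) is our addition for comparison.

```python
def memory_bandwidth_gbs(bus_width_bits: int, speed_gbps: float) -> float:
    """Peak memory bandwidth in GB/s from bus width and per-pin data rate."""
    return bus_width_bits * speed_gbps / 8  # 8 bits per byte

# (bus width in bits, per-pin speed in Gbps). Everything except the
# RTX 2080 Ti comes from the rumored specs table.
configs = {
    "GeForce RTX 2080 Ti (reference)": (352, 14),
    "GeForce RTX 3090": (384, 21),
    "GeForce RTX 3080": (352, 19),
    "GeForce RTX 3070": (320, 18),
    "GeForce RTX 3060": (256, 16),
    "GeForce RTX 3050": (192, 14),
}

for card, (bus_width_bits, speed_gbps) in configs.items():
    bandwidth = memory_bandwidth_gbs(bus_width_bits, speed_gbps)
    print(f"{card}: {bandwidth:.0f} GB/s")
```

If the 21 Gbps GDDR6X figure holds, the RTX 3090 would land at roughly 1 TB/s, well beyond what 14 Gbps GDDR6 delivers on a similar bus.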
Note that the latest power figures are much higher than the RTX 20-series. If correct, it means Nvidia is ready to push really hard for extreme performance, possibly just to ensure it stays ahead of AMD. 350W for the RTX 3090 would be far more than previous single GPU chips. And let me douse those TDP rumors with some other information.
Besides the 400W Nvidia A100 that uses the SXM form factor with a mezzanine connector, the Nvidia A100 is also available in PCIe form. Those cards will only have a 250W TDP, and they'll also be limited to just a single NVLink connection (as opposed to NVSwitch). Nvidia says the PCIe card will offer 90% of the performance of the SXM model in single-GPU workloads. Basically, a huge chunk of the 400W TDP appears to be for NVSwitch and multi-GPU configurations. I'll be quite surprised if Nvidia goes way beyond 250W for the RTX 3090 and RTX 3080, and suspect 250W-275W TDP (for stock operation) is far more likely. Partner cards are of course free to exceed that.
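The A100 numbers make the efficiency argument easy to quantify. This rough sketch takes Nvidia's 90% single-GPU performance claim at face value and assumes both cards actually draw their rated TDP, which won't hold for every workload.

```python
# Figures from Nvidia's A100 announcements as cited above. Treating rated
# TDP as actual power draw is a simplifying assumption.
SXM_TDP_WATTS = 400
PCIE_TDP_WATTS = 250
PCIE_RELATIVE_PERF = 0.90  # PCIe card vs. SXM in single-GPU workloads

def perf_per_watt(relative_perf: float, watts: float) -> float:
    """Relative performance delivered per watt of rated power."""
    return relative_perf / watts

efficiency_gain = (
    perf_per_watt(PCIE_RELATIVE_PERF, PCIE_TDP_WATTS)
    / perf_per_watt(1.0, SXM_TDP_WATTS)
)
power_reduction_pct = (1 - PCIE_TDP_WATTS / SXM_TDP_WATTS) * 100

print(f"Power reduction: {power_reduction_pct:.1f}%")
print(f"Performance per watt vs. SXM: {efficiency_gain:.2f}x")
```

Giving up 10% of the performance for a 37.5% cut in power is the kind of trade that makes a 250W-275W consumer card look plausible.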
Nvidia historically does a full stack of GPUs from each family (e.g., Turing has GTX 1650 up through Titan RTX), and there's a moderate chance Ampere will have ray tracing on all levels. Where Turing has the ray tracing RTX 20-series and the non-ray tracing GTX 16-series, Ampere could unify the features similar to the GTX 10-series. If that rumor proves correct, we anticipate yet another step up in budget and mid-range graphics card pricing, with the least expensive cards costing at least $200, possibly more. Anything below that will be left for previous generation hardware and integrated graphics.
That actually makes a lot of sense, based on what we've seen with the latest Xe Graphics rumors and previews. We looked at integrated Gen11 Graphics performance recently, and the 28W Tiger Lake chips could come close to doubling its performance. That would make anything less than a GTX 1660 level dedicated GPU somewhat pointless, in our opinion.
The table also leaves some clear price gaps that might be filled with intermediate hardware, for example RTX 3080 Ti/Super, 3060 Ti, etc. In the past, Nvidia has launched most of the initial GPUs without a suffix (except RTX 2080 Ti and GTX 1660 Ti), and then followed with an improved variant of the GPUs about a year later (like the 'Super' line of RTX and GTX cards). Nvidia also normally delays the launch of lower tier GPUs by 6-12 months, so the RTX 3050 at least is probably coming in Spring 2021. The lower tier GPUs may also be manufactured on Samsung's 8nm or 7nm tech, as TSMC's 7nm capacity is largely tapped out right now. That shouldn't matter too much, but we'll need to wait for further details.
There's certainly a lot of fuzziness in the above potential specs, so don't take anything as gospel truth just yet. GA100 is a known quantity; everything else is up in the air right now. We anticipate an official announcement of at least a few Ampere GPUs for consumers by September, based on how Nvidia rolled out Pascal back in 2016.
Nvidia Ampere Graphics Card Models
We've been referring to the upcoming Ampere GPUs as RTX 3090, 3080, 3070 and 3060 so far, and indications are that Nvidia will mostly stick with a familiar pattern for the coming GPUs, with RTX 3090 being the exception. The last time Nvidia did an 'x90' GPU was with the GTX 690 in 2012, a dual-GPU card. Given how little support there is for multi-GPU in games these days, however, as well as Micron's detailing a 12GB configuration, RTX 3090 isn't going that route.
Elsewhere, we wrote last year about Nvidia filing for trademarks on 3080, 4080 and 5080 in the European Union to block a rumored RX 3080 brand from AMD's Navi GPUs. AMD didn't end up using RX 3080 (whether it ever intended to try that or not isn't clear), but we expect Nvidia will. The leaked images that claim to show the new GPUs also clearly say RTX 3080 on the product, and we're pretty sure the images aren't faked.
What about suffixes like 'Super' or 'Ti,' though? Will we see RTX 3080 Super, or 3070 Ti? We're going to give that a big, fat 'maybe' (probably), but most likely not for the initial launch. Instead, Ti or Super variants could arrive next year as a refresh to the first RTX 30-series GPUs. Nvidia's current branding seems to be working fine, so hopefully it doesn't choose to fix what isn't broken. RTX 3090 is currently planned as the halo product for Ampere consumer cards, possibly replacing the Titan and/or RTX 3080 Ti.
Nvidia RTX 3090 and RTX 3080 Release Date
Many are wondering when the RTX 3080 and other Ampere GPUs will launch. There are plenty of indications that the Ampere launch is imminent, with supplies of RTX 2080 Ti and RTX 2080 Super seemingly dwindling. The Nvidia A100 reveal during Jensen's keynote suggests the rest of the lineup will be announced sooner rather than later. COVID-19 delays are certainly happening, but we expect GeForce RTX 30-series graphics cards to arrive this fall, most likely in late August or early September. One rumor pegs the official reveal as September 9, but that's not official by any means.
Historically Nvidia does a staggered launch. The fastest GPU comes out first, meaning RTX 3090 and RTX 3080, then comes the step down RTX 3070, and later the final additions arrive (e.g., RTX 3060 and RTX 3050). It has varied over the years, of course. GTX 1080/1070 launched first, with the GTX 1080 Ti arriving almost a year later. RTX 2080 Ti and 2080 on the other hand launched within a week of each other, followed by the 2070 the next month and 2060 three months after that. The 2070 Super and 2060 Super meanwhile launched last year at the same time, with 2080 Super coming a few weeks later, but those were updates of existing GPUs.
Ampere looks like it will be similar to the Pascal launch, as the GP100 announcement preceded the GTX 10-series details, but actual Pascal graphics cards were more easily found than data center hardware. However, we also expect the consumer parts will follow the Turing pattern, meaning RTX 3090 and RTX 3080 will come out basically at the same time, with RTX 3070 coming a month or so later. RTX 3060 meanwhile will probably show up later, with January 2021 being a reasonable guess, and then RTX 3050 could launch in the spring. Or Nvidia could mix things up just to keep everyone on their toes.
How Much Will RTX 3090 and RTX 3080 Cost?
One arm, one leg—next! But seriously, in the estimated specs table, we listed our own guesses as to pricing. We're probably being overly optimistic, as Nvidia could go a very different route.
There's been a steady increase in generational pricing since the GTX 900-series launched. The GTX 970 was a $329 part, GTX 1070 was $379-$449, and RTX 2070 jumped to $499-$599 at launch. The RTX 2070 Super walked that back a bit to $499, which is still $120 more than the previous generation. Or we could look at the Ti cards: $649 for the 980 Ti, $699 for the 1080 Ti, and $1,199 for the 2080 Ti. The 2080 Ti was supposed to have third party cards starting at $999, but even now, more than 18 months later, such cards are almost impossible to find in stock—and now the 2080 Ti is being phased out.
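Here's a quick way to put those generational jumps in percentage terms. Where a launch price was a range, we've used the low end, which is our simplification and understates what many buyers actually paid.

```python
# Launch prices in USD from the paragraph above, using the low end of any
# range (a simplifying assumption).
x70_prices = [("GTX 970", 329), ("GTX 1070", 379), ("RTX 2070", 499)]
ti_prices = [("GTX 980 Ti", 649), ("GTX 1080 Ti", 699), ("RTX 2080 Ti", 1199)]

def generational_increases(prices):
    """Return (older, newer, dollar change, percent change) per generation."""
    steps = []
    for (old_name, old_price), (new_name, new_price) in zip(prices, prices[1:]):
        change = new_price - old_price
        steps.append((old_name, new_name, change, change / old_price * 100))
    return steps

for lineup in (x70_prices, ti_prices):
    for old_name, new_name, change, pct in generational_increases(lineup):
        print(f"{old_name} -> {new_name}: +${change} ({pct:.1f}%)")
```

The Turing generation stands out in both lineups, with the jump to the RTX 2080 Ti far larger than the step before it.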
The good news is that the market has changed quite a bit since the RTX 20-series debut. There was basically no competition from AMD at the top of the GPU hierarchy for the RTX 2080 and RTX 2080 Ti, and the GTX 1080 Ti still tends to match AMD's highest performance parts. When the RTX 3080 and Ampere launch, AMD's Big Navi / RDNA 2 / Navi 2x parts will be following soon after. That could mean competitive performance as well as a similar feature set.
Nvidia is known for its aggressive business tactics. Part of the low price on GTX 970 was undoubtedly thanks to AMD having competitive R9 290/290X parts already on the market. The RTX 20-series Super cards also dropped prices to combat the RX 5700/5700 XT launch. It doesn't really matter whether Big Navi launches just before or just after Ampere; Nvidia is going to want to maintain its lead in outright performance, while remaining competitive in terms of bang for the buck. At least, that's our hope.
All of that leads to our price estimates. It's doubtful Nvidia will walk back pricing to pre-RTX levels, especially with the move to TSMC's more expensive 7nm lithography, but the timing and circumstances surrounding the RTX 3080 and Ampere launch similarly make it unlikely Nvidia will go after substantially higher prices. Well, except on the RTX 3090 and/or Titan cards, which are probably going to be stupidly expensive if the rumored specs and performance prove correct.
It's also worth noting that Intel's Xe Graphics will be joining the GPU market this year, though initially only in integrated variants. The dedicated Intel Xe HPG isn't coming until 2021, and it could be competitive if there are models with 512 EUs or more; 512 EUs would translate to 4096 GPU 'cores' (ALU pipelines). That's enough to at least raise an eyebrow and might actually challenge AMD and Nvidia in the mid-range to high-end markets. We'll know more in the coming months.
As with AMD's Big Navi, the best advice right now is to wait and see what actually materializes. There's plenty of speculation, including here, about what RTX 3080 and Ampere will bring to the table. Ultimately, we need to get official specs and pricing from Nvidia, and then run our own tests.
We hope and anticipate that Ampere will be a massive jump in GPU performance, with and without ray tracing. If Nvidia doubles down on ray tracing, it's also possible the RTX 3060 could match or exceed the performance of the RTX 2080 Ti in games like Minecraft RTX where the RT cores are pushed to the limit. Certainly Nvidia is going to aim high with the RTX 3090, in price as well as performance.
Ampere should certainly be faster and more efficient than Nvidia's current Turing GPUs, as 7nm alone should ensure that. However, prices and real-world performance are what really matter. The upcoming GPU launches from all three major players are going to be exciting and will shake things up at the very least. Will Nvidia maintain its pole position, or can AMD spoil the party? Check back in the next month or two and we should have the answer.