The Nvidia Ampere architecture will be the next major upgrade for GPUs from Team Green, and will find its way into the upcoming GeForce RTX 3080 Ti, RTX 3080, RTX 3070 and RTX 3060 graphics cards. Or perhaps Nvidia will throw us all a curveball again and change the model numbers. Whatever. The GPUs should rank high on our GPU hierarchy and list of the best graphics cards. Ampere is coming to consumers, and Nvidia CEO Jensen Huang unveiled the data center-focused A100 on May 14, giving us a taste of what's to come. Here's what we know about Ampere, including potential specifications, release date, price, features, and more.
First, it's important to note that all of the rumors and leaks of the past year or so are unconfirmed, and any claims of pricing are complete fabrications/guesses. Outside of the GA100, Nvidia A100, DGX A100 and related parts, no concrete information has been released. No GPU company releases pricing details months in advance of a product's launch. Nvidia is very tight-lipped about what it's working on, and the transition from the Turing to Ampere architecture is going to be particularly big for the company. For example, the GA100 GPU isn't the same as consumer models, and it has no ray tracing hardware, so while we can estimate where Nvidia may go on other Ampere GPUs, nothing is certain.
We're as excited as anyone about Nvidia's next generation GPU architecture, but we also want to separate fact from fiction. There's precious little of the former that can be proven, and potentially plenty of the latter, so take everything with a grain of salt. Let's also point out that Ampere is critical for Nvidia, on many levels. Recently, in its Super Spring laptops announcement, Nvidia revealed that "15 million RTX GPUs" have been sold. That sounds nice, but damn if that doesn't seem awfully low for a GPU architecture that's been around for over 18 months.
The problem is that Nvidia doesn't normally provide hard data on the number of units sold. The current Steam Hardware Survey suggests that there are about four times as many GTX 10-series GPUs in the wild as RTX 20-series GPUs, but the statistics behind Steam's survey are opaque at best so we can't be too sure about real figures. Regardless, the attitude of many with RTX 20-series was to "wait and see," with the sage advice being that the first generation of any new technology — ray tracing hardware, in this case — might be interesting, but generation two will be where it really takes off.
Ampere is ray tracing Gen2, in other words, and after a relatively slow start for ray tracing hardware and the RTX 20-series (from our perspective), Ampere has a lot to prove. The RTX 3080, RTX 3070, etc. (which is what we're calling them for now) need to provide not just better performance in games using traditional rendering techniques, but also a dramatic increase in ray tracing performance, which would open the door to more RT effects without tanking frame rates.
Nvidia GeForce RTX 3080 and Ampere At A Glance:
- Up to 128 SMs / 8192 GPU cores (for GA100)
- The GPUs should be much faster than the RTX 20-series
- Nvidia's first 7nm part should be much more efficient than Turing
- Release Date: We expect to see Ampere in 2020, probably fall
- Price: RTX 3080 likely to cost around $699-$799 (but we hope it's lower)
The Ampere Architecture in GeForce RTX 3080
With the initial GA100 and Nvidia A100 announcement behind us, a few things have been cleared up. For one, Nvidia will continue to have two separate lines of GPUs, one focused on data centers and deep learning, and the other on graphics and gaming. The changes made with the data center GA100 may or may not propagate to the other Ampere GPUs, but here's what we know of the Ampere architecture so far.
First, Ampere is far more than a simple die shrink of Turing from 12nm to 7nm. The fundamental building block of Nvidia GPUs is called a Streaming Multiprocessor (SM). Its AMD analog is the Compute Unit (CU), and at a high level, it's relatively safe and easy to compare the two companies' GPUs based on SMs vs. CUs. The Turing architecture brought plenty of changes to the SM configuration, and it's a safe bet that Ampere will bring additional changes.
Turing added RT cores and Tensor cores, for ray tracing ray/triangle intersection calculations and deep learning FP16 calculations, respectively. Beyond the RT and Tensor cores, CUDA cores are the main computational hardware in Nvidia graphics cards. For Turing, Nvidia switched from having 128 CUDA cores per SM to 64 CUDA cores. Turing also added a dedicated integer (INT) pipeline to each CUDA core, which allows for concurrent INT and FP (floating-point) calculations. Previously, a shader core would have to switch from doing FP to doing INT, which reduced overall efficiency and throughput. The Turing CUDA cores also added support for rapid packed math (FP16) calculations, which basically double the throughput of FP32 but with reduced precision; FP16 is useful for certain types of calculations.
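To put the SM and FP16 discussion in concrete terms, peak shader throughput can be estimated from the core count and clock speed. The sketch below assumes the usual convention of counting one fused multiply-add as two floating-point operations per core per clock, and uses the RTX 2080 Ti's reference specs (68 SMs, 1545 MHz boost) as an illustration. The function names are our own, and these are theoretical peaks rather than measured performance.

```python
# Rough peak-throughput estimate for a Turing-style GPU.
# Assumption: each CUDA core retires one FMA (2 FLOPs) per clock.

CORES_PER_SM_TURING = 64  # Turing uses 64 CUDA cores per SM


def cuda_cores(sms: int, cores_per_sm: int = CORES_PER_SM_TURING) -> int:
    """Total FP32 CUDA cores for a given SM count."""
    return sms * cores_per_sm


def peak_tflops(cores: int, boost_mhz: float, flops_per_core_per_clock: int = 2) -> float:
    """Theoretical peak TFLOPS = cores * FLOPs per clock * clock (Hz) / 1e12."""
    return cores * flops_per_core_per_clock * boost_mhz * 1e6 / 1e12


# RTX 2080 Ti reference specs: 68 SMs, 1545 MHz boost clock.
cores_2080ti = cuda_cores(68)
fp32 = peak_tflops(cores_2080ti, 1545)
# Packed FP16 handles two half-precision values per FP32 lane, doubling the rate.
fp16 = peak_tflops(cores_2080ti, 1545, flops_per_core_per_clock=4)

print(f"CUDA cores: {cores_2080ti}")
print(f"Peak FP32: {fp32:.1f} TFLOPS")
print(f"Peak FP16: {fp16:.1f} TFLOPS")
```

The same arithmetic applies to any rumored Ampere configuration once SM counts and clocks are known, provided the cores-per-SM figure is adjusted to whatever Nvidia actually ships.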
We're really condensing everything that changed with Turing here, but in addition to the above, there were changes to the L1/L2 cache, support for Variable Rate Shading (VRS), mesh shaders, Texture Space Shading (TSS), Multi-View Rendering (MVR), and enhancements to Simultaneous Multi-Projection (SMP). Most of those are now part of the official DirectX 12 Ultimate API, and also have support in VulkanRT. Oh, and the NVENC hardware got a major upgrade that added hardware accelerated encoding and decoding of higher resolutions and more codecs like VP9 and HEVC.
With the GA100, Nvidia upgrades the Volta GV100 architecture from late 2017. There are no RT cores, but there are major changes to the Tensor cores at the very least. Also, there are a lot of SMs: GA100 has 128 SMs with 8192 FP32 CUDA cores, 8192 INT32 CUDA cores, and 4096 FP64 CUDA cores. Most importantly, the GA100 has 54 billion transistors, 2.56 times as many as the GV100, with a die size of 826 mm^2, only 1.3% larger than the GV100.
A huge chunk of the additional transistors must be going to new features. GA100 'only' has 52% more SMs and GPU cores available. We know the L2 cache is larger, and the 3rd generation Tensor core (Volta is 1st gen, Turing is 2nd gen) adds support for both TF32 and FP64 operations. Both will prove vital for the data center, while FP64 isn't typically used with consumer GPUs (at least not for gaming purposes). Overall, Nvidia says each GA100 SM delivers twice the Tensor throughput of a GV100 SM, even though it has half as many Tensor cores, so that's a 4X speedup per core.
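The ratios quoted above are easy to check. The sketch below uses the GA100 figures from this article alongside Nvidia's published specs for the full GV100 chip (21.1 billion transistors, 815 mm^2, 84 SMs), which we're supplying here rather than quoting from the text above. The variable names are our own. It also works through the Tensor core claim: twice the throughput from half as many cores works out to a 4X speedup per core.

```python
# GA100 figures as quoted in the article; GV100 figures are Nvidia's
# published specs for the full chip (an addition, not from the text above).
ga100 = {"transistors_b": 54.0, "die_mm2": 826, "sms": 128}
gv100 = {"transistors_b": 21.1, "die_mm2": 815, "sms": 84}

transistor_ratio = ga100["transistors_b"] / gv100["transistors_b"]
die_growth_pct = (ga100["die_mm2"] / gv100["die_mm2"] - 1) * 100
sm_growth_pct = (ga100["sms"] / gv100["sms"] - 1) * 100

# Transistor density in millions of transistors per mm^2.
ga100_density = ga100["transistors_b"] * 1000 / ga100["die_mm2"]
gv100_density = gv100["transistors_b"] * 1000 / gv100["die_mm2"]

# Tensor cores: 2x the throughput with 0.5x the core count.
per_core_speedup = 2.0 / 0.5

print(f"Transistors: {transistor_ratio:.2f}x")
print(f"Die size: +{die_growth_pct:.1f}%")
print(f"SMs: +{sm_growth_pct:.0f}%")
print(f"Density: {gv100_density:.1f} -> {ga100_density:.1f} MTr/mm^2")
print(f"Per-Tensor-core speedup: {per_core_speedup:.0f}x")
```

The density line is the interesting one: nearly all of the transistor growth comes from the process shrink rather than a bigger die.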
The GA100 also has two additional HBM2 channels available compared to GV100, though one of those is disabled in the currently shipping Nvidia A100 solutions. Additional features include multi-instance GPU support, allowing the GA100 to function as up to seven separate smaller GPUs, support for sparsity acceleration (another data center feature), and a faster NVLink that now runs at 600 GBps, twice as fast as in GV100.
Potential Ampere Specifications
Unlike AMD's Big Navi, Nvidia doesn't have any announced console tie-ins with hardware specs, but with the GA100 reveal we have plenty to go on regarding lower spec Ampere solutions that will go into cards like the RTX 3080. There have also been supposed leaks over the past six months, which as usual need to be taken with a healthy dose of skepticism.
The biggest problem is that all of the leaks so far appear to have been of the GA100 GPU, which is not going into consumer graphics cards. Some of the technology, like the enhanced Tensor cores, may trickle down into consumer models (with modifications like axing FP64 acceleration), but the GeForce RTX 30-series hardware will absolutely have ray tracing support.
At present, Nvidia has at least three Ampere GPUs slated to launch in 2020 through early 2021, and potentially as many as three additional Ampere solutions will come out during the coming year or so. The top model may only be for deep learning and HPC solutions, but the others will go into GeForce and Quadro cards. Here's what the rumors indicate, along with some of our own speculation — there are plenty of question marks in the table.
| Graphics Card | Nvidia A100 | GeForce RTX 3080 Ti? | GeForce RTX 3080 | GeForce RTX 3070 | GeForce RTX 3060 | GeForce RTX 3050 |
|---|---|---|---|---|---|---|
| Die Size (mm^2) | 826 | ~500? | ~367? | ~267? | ~200? | ~150? |
| SMs | Up to 128 | Up to 84? | Up to 60? | Up to 40? | Up to 30? | Up to 20? |
| Boost Clock (MHz) | 1410 | 1750 | 2000 | 1900 | 2000 | 2000 |
| VRAM Speed (Gbps) | 2.43 | 18 | 18 | 18 | 16 | 16 |
| VRAM (GB) | 48 max | 16 | 12 | 10 | 8 | 8 |
| Bus Width (bits) | 6144 max | 512 | 384 | 320 | 256 | 128 |
| Tensor TFLOPS (FP16) | 739 | 602 | 492 | 311 | 246 | 164 |
| Launch Date | May 2020 | Fall 2020 | Fall 2020 | Fall 2020 | Winter 2021 | Spring 2021 |
| Launch Price | $199K for DGX A100 | $1,499? | $799? | $549? | $349? | $199? |
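The Tensor TFLOPS row in the table appears to follow directly from the SM counts and boost clocks. The sketch below reproduces it under one assumption of ours, inferred from the numbers rather than stated anywhere official: 4,096 FP16 Tensor operations per SM per clock, which lines up with the A100's sparsity-accelerated rate. Every spec for the GeForce cards is a guess carried over from the table, so treat the output as a consistency check and not a prediction.

```python
# Assumption (inferred from the table, not confirmed by Nvidia):
# each Ampere SM performs 4096 FP16 Tensor FLOPs per clock.
TENSOR_FLOPS_PER_SM_PER_CLOCK = 4096

# (max SMs, boost clock in MHz) copied from the speculative table.
specs = {
    "Nvidia A100": (128, 1410),
    "GeForce RTX 3080 Ti?": (84, 1750),
    "GeForce RTX 3080": (60, 2000),
    "GeForce RTX 3070": (40, 1900),
    "GeForce RTX 3060": (30, 2000),
    "GeForce RTX 3050": (20, 2000),
}


def tensor_tflops(sms: int, boost_mhz: float) -> float:
    """Peak Tensor TFLOPS = SMs * FLOPs per SM per clock * clock (Hz) / 1e12."""
    return sms * TENSOR_FLOPS_PER_SM_PER_CLOCK * boost_mhz * 1e6 / 1e12


# Round to whole TFLOPS to compare against the table row.
estimates = {name: round(tensor_tflops(*spec)) for name, spec in specs.items()}

for name, tflops in estimates.items():
    print(f"{name}: {tflops} TFLOPS")
```

Because the figure scales linearly with both SM count and clock speed, any change to the rumored configurations shifts these estimates proportionally.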
The biggest and baddest GPU is the A100, where we've listed the maximum specs. It has up to 128 SMs, of which only 108 are currently enabled, but future variations will likely have the full GPU and RAM configuration. However, the GA100 isn't going to be a consumer part, just like the GP100 and GV100 before it were only for data center use.
Stepping down to the chips likely to be used in GeForce RTX 30-series cards, the GA102 will be the top configuration. We've heard rumors (take with a huge helping of salt) that it could have up to 84 SMs and 16GB of GDDR6, and Quadro versions will double the VRAM. Whatever the actual GPU has, it will likely be close to 50% faster than the current RTX 2080 Ti in typical performance, and we've heard the ray tracing cores have been overhauled to provide a substantial boost in performance. Even with the added features, the chip will still be quite a bit smaller than the current TU102, thanks to 7nm.
The other chips progress down the line, with the GA103 coming next. Nvidia may go with custom chips for each GPU as shown in the table, or it may use harvested GA103 chips for RTX 3070 similar to what it's done in the past. However it gets there, we expect a range of SM configurations, which we've estimated in the above table. The performance follows the SM and GPU clocks, both of which will be tweaked as needed.
Nvidia has been shipping 8GB of VRAM on high-end GPUs in both the GTX 10-series and RTX 20-series, and we tentatively expect to see a move to 12GB this next round, or maybe not. Don't take the VRAM estimates as anything but a rough guess right now. We do expect higher GDDR6 clocks from Ampere, however, with 18 Gbps likely on the fastest GPUs, and 16 Gbps on lower tier parts. Nvidia might also continue with 14 and 12 Gbps on the budget and midrange cards. The question is whether 8GB of VRAM will be 'enough' for the coming generation of games, as some games are already pushing close to 8GB of VRAM use.
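Memory speed and bus width combine into the figure that actually matters: bandwidth. As a rough sketch, peak bandwidth in GB/s is the bus width in bits multiplied by the per-pin data rate in Gbps, divided by eight. The Ampere bus widths and speeds below are the speculative values from our table, with the RTX 2080 Ti's actual 352-bit, 14 Gbps configuration added for reference. The function and dictionary names are our own.

```python
def bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width (bits) * Gbps per pin / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, data rate in Gbps). Only the 2080 Ti entry is a real spec;
# everything else is a guess from the speculative table.
configs = {
    "RTX 2080 Ti (actual)": (352, 14),
    "RTX 3080 Ti? (guess)": (512, 18),
    "RTX 3080 (guess)": (384, 18),
    "RTX 3070 (guess)": (320, 18),
    "RTX 3060 (guess)": (256, 16),
    "RTX 3050 (guess)": (128, 16),
}

bandwidths = {name: bandwidth_gbs(*cfg) for name, cfg in configs.items()}

for name, gbs in bandwidths.items():
    print(f"{name}: {gbs:.0f} GB/s")
```

If the guesses hold, even a 384-bit card at 18 Gbps would comfortably out-run the 2080 Ti's memory subsystem, which is one reason faster GDDR6 matters as much as capacity.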
There are rumors that Nvidia will be doing a full stack of ray tracing GPUs with Ampere. Where Turing has the ray tracing RTX 20-series and the non-ray tracing GTX 16-series, Ampere could unify the features similar to the GTX 10-series. If that rumor proves correct, we anticipate yet another step up in budget and mid-range graphics card pricing, with the least expensive cards costing at least $200. Anything below that will be left for previous generation hardware and integrated graphics.
The table also leaves some clear price gaps that might be filled with intermediate hardware, for example RTX 3060 Ti and RTX 3050 Ti. In the past, Nvidia has delayed the launch of lower tier GPUs by 6-12 months, so the 3050 at least is probably a year out. The lower tier GPUs are also likely to be manufactured on Samsung's 8nm or 7nm tech, as TSMC's 7nm capacity is largely tapped out right now. That shouldn't matter too much, but we'll need to wait for further details.
There's certainly a lot of fuzziness in the above potential specs, so don't take anything as gospel truth just yet. GA100 is a known quantity; everything else is up in the air right now. We anticipate an official announcement of at least a few Ampere GPUs for consumers by July or August, based on how Nvidia rolled out Pascal back in 2016.
Nvidia Ampere Graphics Card Models
We've been referring to the upcoming Ampere GPUs as RTX 3080 Ti, 3080, 3070 and 3060 so far, and all indications are that Nvidia will stick with a familiar pattern for the coming GPUs. We wrote last year about Nvidia filing for trademarks on 3080, 4080 and 5080 in the European Union to block a rumored RX 3080 brand from AMD's Navi GPUs. AMD didn't end up using RX 3080 (whether it ever intended to try that or not isn't clear), but we expect Nvidia will.
What about suffixes like 'Super' or 'Ti,' though — will we see RTX 3080 Super, or 3070 Ti? We're going to give that a big, fat 'maybe' (probably), though launch windows will vary across the range of Ampere GPUs. Nvidia's current branding seems to be working fine, so hopefully it doesn't choose to fix what isn't broken. RTX 3080 Ti should end up as the halo product for Ampere consumer cards, probably with some form of Titan for those with bottomless wallets.
Nvidia Ampere and RTX 3080 Release Date
Perhaps the biggest question — and another question with a lot of uncertainty — is when the RTX 3080 and other Ampere GPUs will launch. 2020 seemed a given a few months ago, and we fully expected to hear at least something at GTC 2020 in March. The Nvidia A100 reveal during Jensen's keynote suggests the rest of the lineup will be announced sooner rather than later. COVID-19 delays are certainly happening, but we expect GeForce RTX 30-series graphics cards to arrive this fall, rather than slipping into 2021.
Historically, Nvidia has favored a staggered launch. The fastest GPU comes out first, then the step down, then another step down. It has varied over the years, of course. GTX 1080/1070 launched first, with the GTX 1080 Ti arriving almost a year later. RTX 2080 Ti and 2080 on the other hand launched within a week of each other, followed by the 2070 the next month and 2060 three months after that.
Ampere looks like it will be similar to the Pascal launch, as the GP100 announcement preceded the GTX 10-series details, but actual Pascal graphics cards were more easily found than data center hardware. However, we also expect the consumer parts will follow the Turing pattern, meaning RTX 3080 Ti and RTX 3080 will come out basically at the same time, with RTX 3070 coming a month or so later. RTX 3060 meanwhile will probably show up in January 2021 at the earliest.
How Much Will RTX 3080 Cost?
One arm, one leg — next! But seriously, in the estimated specs table, we listed our own guesses as to pricing. We're probably being overly optimistic, as Nvidia could go a very different route. There's been a steady increase in generational pricing since the GTX 900-series launched.
The GTX 970 was a $329 part, GTX 1070 was $379-$449, and RTX 2070 jumped to $499-$599 at launch. The RTX 2070 Super walked that back a bit to $499, which is still $120 more than the previous generation. Or we could look at the Ti cards: $649 for the 980 Ti, $699 for the 1080 Ti, and $1,199 for the 2080 Ti. The 2080 Ti was supposed to have third party cards starting at $999, but even now, more than 18 months later, such cards are almost impossible to find in stock.
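To see how steep the climb has been, the generational jumps can be worked out from the launch prices above. This is a quick sketch using only the figures quoted in this article; where we gave a range, it takes the lower number, and the data layout and function name are our own.

```python
# Launch prices (USD) quoted in the article; where a range was given,
# we use the lower figure.
launch_prices = {
    "x70": [("GTX 970", 329), ("GTX 1070", 379), ("RTX 2070", 499)],
    "Ti": [("GTX 980 Ti", 649), ("GTX 1080 Ti", 699), ("RTX 2080 Ti", 1199)],
}


def generational_increases(cards):
    """Return (card, dollar increase, percent increase) for each generation step."""
    steps = []
    for (_, old_price), (name, new_price) in zip(cards, cards[1:]):
        delta = new_price - old_price
        steps.append((name, delta, delta / old_price * 100))
    return steps


increases = {tier: generational_increases(cards) for tier, cards in launch_prices.items()}

for tier, steps in increases.items():
    for name, delta, pct in steps:
        print(f"{tier}: {name} +${delta} ({pct:.0f}%)")
```

The Pascal step was modest in both tiers, while the Turing step is where pricing really took off, particularly on the Ti card.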
The good news is that the market has changed quite a bit since the RTX 20-series debut. There was basically no competition from AMD at the top of the GPU hierarchy for the RTX 2080 and RTX 2080 Ti — even now, the GTX 1080 Ti tends to match or exceed AMD's highest performance part. But when the RTX 3080 and Ampere launch, there's a good chance AMD will also have Big Navi / RDNA 2 / Navi 2x parts available or coming soon. That could mean competitive performance as well as a similar feature set.
Nvidia is known for its aggressive business tactics. Part of the low price on GTX 970 was undoubtedly thanks to AMD having competitive R9 290/290X parts slated to arrive just a month or so later. It doesn't really matter whether Big Navi launches just before or just after Ampere; either way, Nvidia is going to want to maintain its lead in outright performance, while remaining competitive in terms of bang for the buck.
All of that leads to our price estimates. It's doubtful Nvidia will walk back pricing to pre-RTX levels, especially with the move to TSMC's more expensive 7nm lithography, but the timing and circumstances surrounding the RTX 3080 and Ampere launch similarly make it unlikely Nvidia will go after substantially higher prices. Well, except on the RTX 3080 Ti and Titan cards, which are probably going to be stupidly expensive if the rumored specs and performance prove correct.
It's also worth noting that Intel's Xe Graphics will be joining the dedicated graphics card market this year, most likely during the summer or early fall. It's unknown how fast Xe Graphics will be, but there could be up to 512 EU variants — which would likely translate to 4096 GPU 'cores.' That's enough to at least raise an eyebrow and might actually challenge AMD and Nvidia in the high-end market. We'll know more in the coming months.
As with AMD's Big Navi, the best advice right now is to wait and see what actually materializes. There's plenty of speculation — including here — about what RTX 3080 and Ampere will bring to the table, but ultimately we need to get official specs and pricing, and then run our own tests.
We hope and anticipate that Ampere will be a massive jump in GPU performance, with and without ray tracing. If Nvidia doubles down on ray tracing, it's also possible the RTX 3060 could match or exceed the performance of the RTX 2080 Ti in games like Minecraft RTX where the RT cores are pushed to the limit.
Ampere will certainly be faster and more efficient than Nvidia's current Turing GPUs; 7nm alone will ensure that. However, prices and real-world performance are what really matter. The upcoming GPU launches from all three major players are sure to be exciting.