GeForce RTX 3080 Founders Edition: Hail to the King!
Nvidia's GeForce RTX 3080 Founders Edition is here, claiming the top spot on our GPU benchmarks hierarchy, and ranking as the best graphics card currently available — provided you're after performance first, with price and power being lesser concerns. After months of waiting, we finally have independent benchmarks and testing data. Nvidia has thrown down the gauntlet, clearly challenging AMD's Big Navi to try and match or beat what the Ampere architecture brings to the table.
We're going to hold off on a final verdict for now, as we still have third-party RTX 3080 cards to review, starting as soon as tomorrow. That's good news, as it means customers won't be limited to Nvidia's Founders Edition for the first month or so like we were with the RTX 20-series launch. Another piece of good news is that there's no Founders Edition 'tax' this time: The RTX 3080 FE costs $699, direct from Nvidia, and that's the base price of RTX 3080 cards for the time being. The bad news is that we fully expect supply to fall short of what should be exceptionally high demand.
The bottom line, if you don't mind spoilers, is that the RTX 3080 FE is 33% faster than the RTX 2080 Ti, on average. Or, if you prefer other points of comparison, it's 57% faster than the RTX 2080 Super, 69% faster than the RTX 2080 FE — heck, it's even 26% faster than the Titan RTX!
But there's a catch: We measured all of those 'percent faster' results across our test suite running at 4K ultra settings. The lead narrows if you drop down to 1440p, and it decreases even more at 1080p. It's still 42% faster than a 2080 FE at 1080p ultra, but this is very much a card made for higher resolutions. Also, you might need a faster CPU to get the full 3080 experience — check out our companion GeForce RTX 3080 CPU Scaling article for the full details.
| Graphics Card | RTX 3080 FE | RTX 2080 Super FE | RTX 2080 FE |
| --- | --- | --- | --- |
| Process Node | Samsung 8N | TSMC 12FFN | TSMC 12FFN |
| Die Size (mm²) | 628.4 | 545 | 545 |
| FP32 CUDA Cores | 8704 | 3072 | 2944 |
| Boost Clock (MHz) | 1710 | 1815 | 1800 |
| VRAM Speed (Gbps) | 19 | 15.5 | 14 |
| VRAM Bus Width | 320 | 256 | 256 |
| Tensor TFLOPS FP16 (Sparsity) | 119 (238) | 89 | 85 |
Meet GA102: The Heart of the Beast
We have a separate article going deep into the Ampere architecture that powers the GeForce RTX 3080 and other related GPUs. If you want the full rundown of everything that's changed compared to the Turing architecture, we recommend starting there. But here's the highlight reel of the most important changes:
The GA102 is the first GPU from Nvidia to drop into the single digits on lithography, using Samsung's 8N process. The general consensus is that TSMC's N7 node is 'better' overall, but it also costs more and is currently in very high demand — including from Nvidia's own A100. Could the consumer Ampere GPUs have been even better with 7nm? Perhaps. But they might have cost more, only been available in limited quantities, or been delayed a few more months. Regardless, GA102 is still a big and powerful chip, boasting 28.3 billion transistors packed into a 628.4 mm² die. If you're wondering, that's 52% more transistors than the TU102 chip used in the RTX 2080 Ti, but in a 17% smaller area.
Ampere ends up as a split architecture, with the GA100 taking on data center ambitions while the GA102 and other consumer chips have significant differences. The GA100 focuses far more on FP64 performance for scientific workloads, as well as doubling down on deep learning hardware. Meanwhile, the GA102 drops most of the FP64 functionality and instead includes ray tracing hardware, plus some other architectural enhancements. Let's take a closer look at the Ampere SM found in the GA102 and GA104.
Nvidia GPUs consist of several GPCs (Graphics Processing Clusters), each of which has some number of SMs (Streaming Multiprocessors). Nvidia splits each SM into four partitions that can operate on separate sets of data. With Ampere, each SM partition now has 16 FP32 CUDA cores, 16 FP32/INT CUDA cores, a third-gen Tensor core, load/store units, and a special function unit. The whole SM has access to shared L1 cache and memory, and there's a single second-gen RT core. In total, that means each SM has 64 FP32 cores and 64 FP32/INT cores, four Tensor cores, and one RT core. Let's break that down a bit more.
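As a sanity check, the headline CUDA core count falls straight out of that layout, since Nvidia counts both datapaths. A quick sketch (the 68-SM figure for the RTX 3080 comes from Nvidia's published specs):

```python
# Per-SM core counts for consumer Ampere (GA102/GA104)
partitions_per_sm = 4
fp32_per_partition = 16      # dedicated FP32 datapath
fp32_int_per_partition = 16  # flexible FP32/INT datapath

cores_per_sm = partitions_per_sm * (fp32_per_partition + fp32_int_per_partition)
assert cores_per_sm == 128

# RTX 3080 enables 68 SMs on GA102
sms_enabled = 68
total_fp32_cores = sms_enabled * cores_per_sm
print(total_fp32_cores)  # 8704, matching the spec-sheet CUDA core count
```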
The Turing GPUs added support for concurrent FP32 (32-bit floating point) and INT (32-bit integer) operations. FP32 tends to be the most important workload for graphics and games, but there's still a decent amount of INT operations — for things like address calculations, texture lookups, and various other types of code. With Ampere, the INT datapath is upgraded to support INT or FP32, but not at the same time.
If you look at the raw specs, Ampere appears to be a far bigger jump in performance than the 70% we measured. 30 TFLOPS! But it generally won't get anywhere near that high because the second datapath is an either/or situation: It can't do both types of instructions on the pipeline in the same cycle. Nvidia says around 35% of gaming calculations are INT operations, which means you'll end up with something more like 20 TFLOPS of FP32 and 10 TOPS of INT on the RTX 3080.
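A rough back-of-the-envelope sketch shows how the paper figure shrinks. This uses the nominal 1710 MHz boost clock and Nvidia's ~35% INT estimate, and assumes the hardware is fully utilized; real workloads will land somewhere else:

```python
# Why ~30 "paper" TFLOPS becomes ~20 TFLOPS in practice on the RTX 3080.
cores = 8704
boost_ghz = 1.71
peak_tflops = cores * 2 * boost_ghz / 1000  # 2 FMA ops per core per clock, ~29.8

# Half the cores are FP32-only; the other half do FP32 OR INT each cycle.
dedicated = peak_tflops / 2  # always FP32
shared = peak_tflops / 2     # FP32 or INT, never both in the same cycle

# If ~35% of issued shader ops are INT, the shared datapath absorbs them all,
# leaving the remainder of its throughput for FP32:
int_tops = 0.35 * peak_tflops               # ~10.4 TOPS
fp32_tflops = dedicated + (shared - int_tops)  # ~19.3 TFLOPS
```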
While we're on the subject, let's also point out that a big part of the increased performance comes from increased power limits. RTX 2080 was a 225W part (for the Founders Edition), and RTX 3080 basically adds 100W to that. That's roughly 42% more power for 70% more performance. It's technically a win in overall efficiency, but in the pursuit of performance, Nvidia had to move further to the right on the voltage and frequency curve. Nvidia says RTX 3080 can deliver a 90% improvement in performance-per-watt if you limit performance to the same level on both the 2080 and 3080 … but come on, who wants to limit performance that way? Well, maybe laptops, but let's not go there.
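Those ratios pencil out as follows, assuming the FE board powers and our measured ~70% uplift at 4K ultra:

```python
# Efficiency at full tilt: more performance, but also a lot more power.
power_2080, power_3080 = 225, 320  # Founders Edition board power, watts
perf_gain = 1.70                   # ~70% faster at 4K ultra (our results)

power_increase = power_3080 / power_2080 - 1            # ~0.42, i.e. 42% more power
perf_per_watt_gain = perf_gain / (power_3080 / power_2080) - 1  # ~0.20

# ~20% better perf/W when both cards run flat out, versus Nvidia's 90%
# claim, which only holds at a fixed (2080-level) performance target.
```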
One thing that hasn't changed much is the video ports. Okay, that's only partially true. There's still a single HDMI port, but it's now HDMI 2.1 instead of Turing's HDMI 2.0b, while the three DisplayPort connections remain 1.4a. And last but not least, there's no VirtualLink port this round — apparently, VirtualLink is dead. RIP. The various ports are all capable of 8K60 using DSC (Display Stream Compression), a "visually lossless" technique that isn't truly lossless. But you might not notice at 8K.
Getting back to the cores, Nvidia's third-gen tensor cores in GA102 work on 8x4x4 FP16 matrices, so up to 128 matrix operations per cycle. (Turing's tensor cores used 4x4x4 matrices, while the GA100 uses 8x4x8 matrices.) With FMA (fused multiply-add), that's 256 FP operations per cycle, per tensor core. Multiply by the 272 total tensor cores and clock speed, and that gives you 119 TFLOPS of FP16 compute. However, Ampere's tensor cores also add support for fine-grained sparsity — basically, it eliminates wasting time doing multiplications by 0, since the answer is always 0. Sparsity can provide up to twice the FP16 performance in applications that can use it.
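Here's the arithmetic behind that 119 TFLOPS figure, using the nominal 1710 MHz boost clock:

```python
# Dense FP16 tensor throughput for the RTX 3080 (no sparsity).
tensor_cores = 272      # 68 SMs x 4 third-gen tensor cores
fp_ops_per_clock = 256  # 128 FMAs = 256 FP operations per cycle, per core
boost_ghz = 1.71

dense_tflops = tensor_cores * fp_ops_per_clock * boost_ghz / 1000  # ~119
sparse_tflops = dense_tflops * 2  # fine-grained sparsity doubles it, ~238
```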
The RT cores receive similar enhancements, with up to double the ray/triangle intersection calculations per clock. The RT cores also support a time variable, which is useful for calculating things like motion blur. All told, Nvidia says the 3080’s new RT cores are 1.7 times faster than the RTX 2080’s, and they can be up to five times as fast for motion blur.
There are plenty of other changes as well. The L1 cache/shared memory capacity and bandwidth have been increased to better feed the cores (8704KB vs. 4416KB in aggregate), and the L2 cache is also 25% larger than before (5120KB vs. 4096KB). The L1 cache can also be configured as varying amounts of L1 vs. shared memory, depending on the needs of the application. The register file is also nearly 50% larger (17408KB vs. 11776KB) on the RTX 3080. GA102 can also do concurrent RT + graphics + DLSS (previously, using the RT cores would stop the CUDA cores).
Finally, the raster operators (ROPs) have been moved out of the memory controllers and into the GPCs. Each GPC has two ROP partitions of eight ROP units each. This decouples ROP count from the memory interface: where the full GA102 has up to 112 ROPs, the RTX 3080 disables two memory controllers but only one GPC and ends up with 96 ROPs. This matters more for the RTX 3070 / GA104, however, which still has 96 ROPs even though it only has eight memory controllers. Each GPC also includes six TPCs (Texture Processing Clusters), each with eight TMUs (Texture Mapping Units) and a polymorph engine, though Nvidia only enables 34 TPCs for the 3080.
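The ROP counts check out as simple multiplication, using the GPC counts from Nvidia's published GA102 configuration:

```python
# ROPs now scale with enabled GPCs rather than memory controllers.
rops_per_partition = 8
partitions_per_gpc = 2

ga102_gpcs = 7     # full chip
rtx3080_gpcs = 6   # one GPC disabled on the RTX 3080

full_rops = ga102_gpcs * partitions_per_gpc * rops_per_partition      # 112
rtx3080_rops = rtx3080_gpcs * partitions_per_gpc * rops_per_partition # 96
```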
With the core enhancements out of the way, let's also quickly discuss the memory subsystem. GA102 supports up to twelve 32-bit memory channels, of which ten are enabled on the RTX 3080. Nvidia teamed up with Micron to use its GDDR6X memory, which uses PAM4 signaling to boost data rates even higher than before. Where the RTX 20-series cards topped out at 15.5 Gbps in the 2080 Super and 14 Gbps in the other RTX cards, GDDR6X runs at 19 Gbps in the RTX 3080. Combined with the 320-bit interface, that yields 760 GBps of bandwidth, a 70% improvement over the RTX 2080.
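The bandwidth math is straightforward:

```python
# Bandwidth = per-pin data rate (Gbps) x bus width (bits) / 8 bits per byte
def bandwidth_gbps(data_rate_gbps, bus_width_bits):
    return data_rate_gbps * bus_width_bits / 8

rtx3080 = bandwidth_gbps(19, 320)    # 760 GB/s
rtx2080 = bandwidth_gbps(14, 256)    # 448 GB/s
improvement = rtx3080 / rtx2080 - 1  # ~0.70, i.e. 70% more bandwidth
```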
The RTX 3080’s memory controller has also been improved, with a new feature called EDR: Error Detection and Replay. When the memory detects a failed transmission, rather than crashing or corrupting data, it simply tries again. It will do this until it's successful, though it's still possible to cause a crash with memory overclocking. The interesting bit is that with EDR, higher memory clocks might be achievable, but still result in lower performance. That's because the EDR ends up reducing memory performance when failed transmissions occur. We'll have more to say on this in the overclocking section.
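To illustrate the idea — and only the idea, since the error rates below are invented for the sketch rather than measured — the effect of EDR on throughput looks something like this:

```python
# Toy model of EDR: failed transfers are replayed instead of crashing,
# so effective bandwidth scales down with the retry fraction.
# The error rates here are made-up illustration values, not measurements.
def effective_gbps(clock_gbps, error_rate):
    # every failed transfer costs a replay, trimming useful throughput
    return clock_gbps * (1 - error_rate)

stock = effective_gbps(19.0, 0.0)    # error-free at stock speed
pushed = effective_gbps(21.5, 0.15)  # "stable" overclock with heavy replays

assert pushed < stock  # higher clock, yet lower real-world performance
```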
GeForce RTX 3080 Founders Edition: Design, Cooling, Aesthetics
Nvidia has radically altered the design of its Founders Edition cards for the RTX 30-series. The new design still includes two axial fans, but Nvidia heavily redesigned the PCB and shortened it so that the 'back' of the card (away from the video ports) consists of just a fan, heatpipes, radiator fins, and the usual graphics card shroud. Nvidia says the new design delivers substantial improvements in cooling efficiency, while at the same time lowering noise levels. We'll see the fruits of the design later.
Aesthetics are highly subjective, and we've heard plenty of people like the new design, while others think it looks boring. There's no RGB bling if that's your thing, and the only lighting consists of a white GeForce RTX logo on the top of the card with subtle lighting around the 'X' on both sides of the card (but only half of the 'X' is lit up on the side with the "RTX 3080" logo).
Personally, I think the new card looks quite nice, and it feels very solid in the hand. It's actually about 100g heavier than the previous RTX 2080 design, and as far as I'm aware, it's the heaviest single-GPU card Nvidia has ever created. It's also about 2cm longer than the previous generation cards and uses the typical two-slot width. (The GeForce RTX 3090 is about ready to make the 3080 FE look puny, though, with its massive three-slot cooler.)
Nvidia provided the above images of the teardown of the RTX 3080 Founders Edition. We're not ready to attempt disassembly of our card yet — and frankly, we're out of time — but we may return to the subject soon. We're told getting the card apart is a bit trickier this round, mostly because Nvidia has hidden the screws behind tiny covers.
The main board looks far more densely populated than previous GPUs, with the 10 GDDR6X memory chips surrounding the GPU in the center. You can also see the angled 12-pin power connector and the funky-looking cutout at the end of the PCB. Power delivery is obviously important with a 320W TGP, and you can see all the solid electrolytic capacitors placed to the left and right of the memory chips.
The memory arrangement is also interesting, with four chips on the left and right sides of the GPU, up to three chips above the GPU (two mounting positions are empty for the RTX 3080), and a final single chip below the GPU. Again, Nvidia clearly spent a lot of effort to reduce the size of the board and other components to accommodate the new and improved cooling design. Spoiler: It works very well.
One interesting thing is that the 'front' fan (near the video ports) spins in the usual direction — counterclockwise. The 'back' fan, which will typically face upward when you install the card in an ATX case, spins clockwise. If you look at the fins, that means the back fan spins the opposite direction from what we normally expect. The reason is that Nvidia found this arrangement pulls air through the radiator better and generates less noise. Also note that the back fan is slightly thicker, and the integrated ring helps increase static pressure on both fans while keeping RPMs low.
If you don't like the look of the Founders Edition, rest assured there will be plenty of other options. We have a few third-party RTX 3080 cards in for testing, all of which naturally include RGB lighting. None of the third party cards use the 12-pin power connector, either — not that it really matters, since the required adapter comes with the card. Still, that vertically-mounted 12-pin port just looks a bit less robust if you happen to swap GPUs on a regular basis. I plan to leave the adapter permanently connected and just connect or disconnect the normal 8-pin PEG cables. The 12-pin connector appears to be rated for 25 'cycles,' and I've already burned through half of those (not that I expect it to fail any time soon).
GeForce RTX 3080: Initial Overclocking Results
If you've followed the world of CPU overclocking, you've probably noticed how AMD and Intel are both pushing their CPUs closer and closer to the practical limits. I still have fond memories of 50% overclocks on some old CPUs, like the Celeron 300A. These days, the fastest CPUs often only have a few hundred MHz of headroom available — and then only with substantial cooling. I begin with that preface because the RTX 3080 very much feels like it's also running close to the limit.
For the GPU core, while Nvidia specs the nominal boost clock at 1710 MHz, in practice, the GPU boosts quite a bit higher. Depending on the game, we saw sustained boost clocks of at least 1830 MHz, and in some cases, clocks were as high as 1950 MHz. That's not that different from the Turing GPUs, or even Pascal. The real question is how far we were able to push clocks.
The answer: Not far. I started with a modest 50 MHz bump to clock speed, which seemed to go fine. Then I pushed it to 100 MHz and crashed. Through a bit of trial and error, I ended up at +75 MHz as the best stable speed I could hit. That's after increasing the voltage by 100 mV using EVGA Precision X1 and ramping up fan speeds to keep the GPU cool. The result was boost clocks in the 1950-2070 MHz range, typically settling right around the 2GHz mark.
Memory overclocking ended up being far more promising. I started with 250 MHz increments. 250, 500, 750, and even 1000 MHz went by without a problem before my test (Unigine Heaven) crashed at 1250 MHz. Stepping back to 1200 MHz, everything seemed okay. And then I ran some benchmarks.
Remember that bit about EDR we mentioned earlier? It works. 1200 MHz appeared stable, but performance was worse than at stock memory clocks. I started stepping down the memory overclock and eventually ended up at 750 MHz, yielding an effective speed of 20.5 Gbps. I was really hoping to sustain 21 Gbps, in honor of the 21st anniversary of the GeForce 256, but it was not meant to be. We'll include the RTX 3080 overclocked results in the ultra quality charts (we didn't bother testing the overclock at medium quality).
Combined, we ended up with stable performance using the +75 MHz core overclock and +750 MHz GDDR6X overclock. That's a relatively small 4% overclock on the GPU, and a slightly more significant 8% memory overclock, but neither one is going to make a huge difference in gaming performance. Overall performance at 4K ultra improved by about 6%.
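For those keeping score, the percentages work out like this. The +750 MHz memory offset maps to +1.5 Gbps effective, consistent with the 19 Gbps stock speed landing at 20.5 Gbps:

```python
# Overclock math for the RTX 3080 FE results above.
core_gain = 75 / 1710             # ~4.4% on the nominal 1710 MHz boost clock

mem_effective = 19 + 0.750 * 2    # +750 MHz offset -> 20.5 Gbps effective
mem_gain = mem_effective / 19 - 1 # ~7.9%
```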
GeForce RTX 3080: Test Setup
We're using our standard GPU test bed for this review. However, the Core i9-9900K CPU is now approaching its second birthday, and the Core i9-10900K might be a bit faster. Never fear! While we didn't have nearly enough time to retest all of the graphics cards on a new test bed, we did run a series of RTX 3080 CPU Scaling benchmarks on five other processors. Check that article out for the full results, but the short summary is that the 9900K and 10900K are effectively tied at 1440p and 4K.
You can see the full specs of our GPU test PC to the right. We've equipped it with some top-tier hardware, including 32GB of DDR4-3200 CL16 memory, a potent AIO liquid cooler, and a 2TB M.2 NVMe SSD. We actually have two nearly identical test PCs, only one of which is equipped for testing power use — that's the OpenBenchTable PC. Our second PC has a Phanteks Enthoo Pro M case, which we test with the side removed because of all the GPU swapping. (We've checked with the side installed as well, and we have sufficient fans that temperatures for the GPU don't really change with the panel installed.)
Our current gaming test suite consists of nine games, some of which are getting a bit long in the tooth, and none with ray tracing or DLSS effects enabled. We're adding 14 additional 'bonus' graphics tests, all at 4K ultra with DLSS enabled (where possible) on a limited selection of GPUs to show how pushing the RTX 3080 to the limit changes things … or doesn't, in this case.
Also, Microsoft enabled GPU hardware scheduling in Windows 10 a few months back, and Nvidia's latest drivers support the feature. We tested the RTX 3080 with HW scheduling both enabled and disabled; all of our other GPU results were run with the feature disabled. It's not that the feature doesn't help, but overall it ends up being within the margin of error at most settings. We'll include the HW scheduling enabled results in our charts as well.
GeForce RTX 3080: Sweeping the 4K Gaming Benchmarks
For each of the games we tested, we have results at 1080p, 1440p, and 4K. However, we're not going to show or discuss all the 1080p data because the RTX 3080 most definitely isn't designed as a 1080p gaming card. The exception would be running ray tracing effects at maximum quality, and even then you should probably use DLSS in performance mode to upscale to 4K. We'll start with the 4K results first, at ultra and medium settings, before wrapping up with the 1440p charts and the overall 1080p performance charts.
Starting at the high-level overview where we average performance across all nine games in our test suite, this is a good view of what's to come. The RTX 3080 Founders Edition blew away the competition: It's not even close. It ended up 27% faster than the Titan RTX, 32% faster than the RTX 2080 Ti, 57% faster than the RTX 2080 Super it replaces at the $700 price point, and 69% faster than the RTX 2080 FE. If you're hanging on to a 10-series Nvidia card, the 3080 also beat the 1080 Ti by 87% and ended up nearly 150% faster than the GTX 1080 FE.
AMD's best showing at 4K ultra is still the Radeon VII, which the 3080 beats by 79% — though, of course, we're more interested in seeing how Big Navi compares. The RTX 3080 also beat the RX 5700 XT by 95% overall. AMD has talked about 50% better performance-per-watt for Navi 2x, which might be enough to catch the RTX 3080 (depending on how AMD gets that 50% boost — a 300W card that's 50% more efficient should do it).
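That back-of-the-napkin scenario can be checked directly. This assumes a hypothetical 300W Navi 2x part scaled from the 225W RX 5700 XT with perfectly linear scaling, which real GPUs won't quite achieve:

```python
# Could a 50% perf/W improvement catch the RTX 3080? Idealized scaling only.
base_perf = 1.00     # RX 5700 XT at 225W as the baseline
rtx3080_perf = 1.95  # 95% faster at 4K ultra (our results)

# 50% better perf/W, plus 33% more power budget (300W vs. 225W):
navi2x_perf = base_perf * 1.5 * (300 / 225)  # ~2.0x the 5700 XT

assert navi2x_perf >= rtx3080_perf  # just barely ahead, on paper
```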
Perhaps more importantly, the RTX 3080 is the first graphics card we've tested that manages to break 60 fps in every game in our current test suite at 4K ultra, averaging nearly 100 fps overall. That doesn't mean minimum fps stayed above 60, and sometimes it dropped into the 40s, but it's the best result we've seen by a fairly large margin.
Stepping down to 4K medium settings, the RTX 3080 maintains its pole position, and the margin of victory was basically unchanged. It's still 26% faster than the Titan RTX (at less than one third the price) and 69% faster than the RTX 2080 FE. That's the great thing about 4K: Even at medium quality, it's still typically GPU bottlenecked (unless you're using a very slow CPU).
As we'll see below, individual games will vary a bit from these margins, and some still appear to encounter a bit of CPU bottlenecking, but these overall averages tell the real story.
Borderlands 3 is an AMD-promoted game, and we tested using the DirectX 12 (DX12) API. It's a demanding game that pushes GPUs more than normal, and in fact, it gave the RTX 3080 its second-largest margin of victory (give or take) over the other GPUs. RTX 3080 beat the RTX 2080 by 81% this time, and interestingly that grows to 88% at 4K medium. It's also 68% faster than the 2080 Super (75% at medium), 34% faster than the 2080 Ti (39% at medium), and 73% faster than the RX 5700 XT (85% at medium), which is AMD's best showing this time.
The Division 2 is another AMD-promoted game, though it's perhaps less shader or geometry intensive; whatever the reason, AMD's Radeon VII jumped back ahead of the RX 5700 XT here. Again, we used the DX12 API. The 3080 was 35% faster than the 2080 Ti, 64% faster than the 2080 Super, and 78% faster than the 2080 FE. It also led the Radeon VII by 89% and more than doubled the 5700 XT performance (112% faster). 4K medium narrowed the gap across all the tested GPUs by a few percent, and AMD's GPUs benefited more from medium quality than Nvidia's GPUs — the 3080 was 'only' 84% faster than the 5700 XT and 66% faster than the VII.
Far Cry 5 is one of the games on our chopping block for GPU benchmarks. It's another AMD promoted title, but it's a bit old and doesn't push modern hardware quite as hard as other games. Plus, Far Cry 6 is coming soon-ish (Feb 2021), which will likely replace it. The game definitely hits some CPU bottlenecks, so it's one of the lesser victories for the RTX 3080. The lead over the 2080 FE was only 63% at 4K ultra, or 65% at 4K medium, and less than 30% ahead of the 2080 Ti.
Final Fantasy XIV is also ready for retirement, and it showed similar results to Far Cry 5. RTX 3080 beat the 2080 FE by 61% (58% at medium). Final Fantasy XIV does make use of some of Nvidia's GameWorks libraries, which also leads to larger victories over AMD this time. The 3080 is over 80% faster than the Radeon VII and 90-100% faster than the RX 5700 XT.
I thought about replacing Forza Horizon 4 with Project CARS 3, but I'm holding off for now. This is our car racing representative, and it's one of the lesser wins for the RTX 3080. The newcomer still bests the previous generation 2080 FE by 50%, and it's 116% faster than the GTX 1080, but all of the cards managed more than 60 fps at 4K ultra. Dropping the settings to 4K medium, Nvidia's fastest GPUs start to encounter more CPU limitations, narrowing the gap between the various cards.
Metro Exodus is another extremely demanding game — and that's without enabling ray-tracing effects. It's an Nvidia-promoted game, though without RT global illumination, it's reasonably agnostic about your GPU choice. The RTX 3080 FE is the first GPU ever to break 60 fps at 4K ultra, running at stock clocks. And let's be clear that minimum fps still fell below that mark regularly. Still, the 3080 basically matches its overall average, beating the 2080 FE by 67% and the 5700 XT by 94%.
Borderlands 3 is the second-most demanding game in our suite, with top honors going to Red Dead Redemption 2 — which we're not even running at maxed-out settings. We set all of the settings to minimum, then change all of the top (non-advanced) settings to medium or maximum (high/ultra). It's how we started doing testing months ago, and we've stayed with it since the game doesn't have built-in presets. Not surprisingly, since this is the most demanding game in our test suite, the RTX 3080 also saw some of its biggest leads here. It was 86% faster than the 2080 FE, 47% faster than the 2080 Ti, and over twice as fast as the RX 5700 XT.
Shadow of the Tomb Raider is slightly less demanding than our overall average game, at least without ray-traced shadows enabled. Like all of the other ray tracing games right now, it was also promoted by Nvidia. It's also due for replacement (I'm looking at Watch Dogs: Legion and Cyberpunk 2077 as new alternatives), but in the meantime, we saw another 65% lead over the previous generation RTX 2080, and slightly more than double the performance of the RX 5700 XT.
Last in our list, we have Strange Brigade, which is the only game in our primary test suite that uses the Vulkan API. It's another AMD promotional title, so if you're keeping score, our current test suite is slightly AMD slanted (four AMD, three Nvidia). Strange Brigade also manages to hit the highest framerates of any of the games we tested, perhaps thanks to the Vulkan API. Even at maxed-out settings, the RTX 3080 hit 156 fps and outperformed the RTX 2080 by 87%. It's not a game that favors AMD GPUs either, as the 3080 got its largest lead (119%) over the RX 5700 XT. Dropping the quality to medium doesn't really change things much either — this is simply a game that tends to be GPU limited even at lower settings.
GeForce RTX 3080: 1440p Gaming Benchmarks
As you'd expect, dropping the resolution to 1440p tends to narrow the gap between the RTX 3080 and the other GPUs. Given the level of performance we're talking about, 1440p gaming on the RTX 3080 is going to be similar in many ways to 1080p gaming on an RTX 2080 Ti. You can do it, of course, but you'll almost certainly start to hit CPU bottlenecks in a lot of games.
Our high level overall average charts show most of what you'll want to see. At 1440p ultra, the RTX 3080 ended up 24% faster than the 2080 Ti, 43% faster than the 2080 Super, and 54% faster than the 2080 FE. Or if you're still hanging on to a GTX 1080 from 2017, it's more than twice as fast — a reasonable upgrade option. It's also twice as fast as the RX Vega 64, 74% faster than the RX 5700 XT, and 67% faster than Radeon VII.
CPU bottlenecks become even more of a factor when you drop to 1440p and medium settings. You can knock ~10-15% off the 3080's lead over the 'slower' GPUs like GTX 1080, ~10% off the lead over the cards in the middle of our charts, and only ~5% less against the top previous-gen cards, like the 2080 Ti. Again, all of that is thanks to the 3080 becoming increasingly CPU-limited.
We're going to dispense with discussions of 1440p performance on the individual games since there's really not much to add. The RTX 3080 still easily reigns as the fastest 1440p gaming GPU. It did come in slightly behind the Titan RTX at 1440p medium in Forza Horizon 4, but both cards were clearly smacking into an fps ceiling of some form.
GeForce RTX 3080: 1080p Gaming Benchmarks
We won't belabor the point, but without ray tracing and/or DLSS, most games simply don't benefit much from the performance the RTX 3080 delivers. We'll skip the individual gaming charts and just show the overall averages:
If you want to use the RTX 3080 to play at 1080p ultra, it's now only 18% faster than the RTX 2080 Ti. Or rather, the CPU and other elements aren't able to keep up with the 3080, so it putters along admiring the view and enjoying a relaxing stroll down 1080p lane. And if you want to take things a step further, it was only 8% faster than the RTX 2080 Ti at 1080p medium — and there were a few instances where the older GPU actually came out ahead (likely thanks to better optimizations in the games for the existing Turing architecture).
Speaking of which, it's worth noting that since the Ampere architecture is brand new, games and drivers are unlikely to be fully optimized for it right now. Existing games probably won't see patches specifically targeting Ampere, but Nvidia has a history of getting at least 5-10% more performance from its latest architecture through driver updates. It's not guaranteed by any means, but we do expect the RTX 3080's lead over the RTX 2080 to increase somewhat over the coming months.
GeForce RTX 3080: Bonus 4K Ultra Gaming Benchmarks
Which is a great segue into our bonus gaming and graphics benchmarks. We tested the RTX 3080, RTX 2080 Ti, and RTX 2080 FE on 14 additional games and graphics tests, many of which utilize ray tracing and/or DLSS. We only tested at 4K and ultra settings or equivalent, except for 3DMark Port Royal, which runs at 1440p. Here are the results:
We'll start with the overall average again, where we take the average fps of all fourteen games and graphics tests. (Yeah, it says 14 games, but we're just trying to make the chart title not look huge.) We ran some of the tests at multiple settings and included all results in the average. It's interesting that, even with an almost completely different selection of games and benchmarks, the RTX 3080 FE ends up with a nearly identical lead at 4K ultra.
In our primary nine-game suite, the RTX 3080 FE was 32% faster than the RTX 2080 Ti and 69% faster than the RTX 2080 FE. For these 14 tests, the RTX 3080 FE was … wait for it! … 32% faster than the 2080 Ti and 70% faster than the RTX 2080 FE. And trust me, I don't have time to pick and choose benchmarks to try and skew things for that result. I just grabbed a bunch of recent games and some new ray tracing benchmarks that are available and threw them at these GPUs, and it proves that the overall view is basically correct. You gotta love statistics. (Okay, you don't, but it's great when it all works out so nicely.)
Let's dig into the various benchmarks and look at the individual results.
3DMark Port Royal reports an overall score, but we used OCAT to log frametimes instead so that we can generate percentile charts and minimum fps results. I also tested Time Spy but didn't bother with gathering frametimes — it's not included in the overall average. You can see the 3DMark images above (the unnamed CPU/GPU results are for the 3080, which 3DMark's System Info utility failed to identify).
As for performance, Port Royal averaged 53 fps, even though it's running at 1440p. That might seem low, but it does use a lot of ray tracing effects and the benchmark doesn't use DLSS for upscaling. By percentages, the 3080 was 30% faster than the 2080 Ti and 75% faster than the 2080. Relying on any single benchmark as a universal determination of performance is liable to lead to cheating and gaming the system, though, so even though Port Royal is only off by 2-5% from our overall metric, it's best to keep using lots of tests. (Incidentally, Time Spy only gives the 3080 a 21% lead over the 2080 Ti, and 51% over the 2080 FE.)
Next up, Battlefield V with DirectX Raytracing (DXR) reflections and with DLSS enabled. This is a DLSS 1.0 game, and it's also the first game to get RTX enhancements. We ran around the Nordlys mission, where there are lots of puddles that provide shiny reflections. It's not as demanding as some of the other levels, or as multiplayer mode, but it's at least repeatable. Here, the 3080 is only 23% faster than the 2080 Ti and 52% faster than the 2080 FE.
The Boundary Benchmark is a new one from Chinese developers Studio Surgical Scalpels, and it includes lots of ray tracing effects: global illumination, reflections on opaque and translucent surfaces, shadows, and ambient occlusion. It also supports DLSS in performance, balanced, and quality modes, but since we were running at 4K, we just stuck with the performance option. Even then, frame rates were below 60 fps on the 3080, and just 29 fps on the 2080. This is a great example of what a next-generation ray tracing enhanced game engine might look like, and it shows that even with twice the ray tracing performance of Turing, games are going to be able to tax even Ampere GPUs if they want to.
Bright Memory Infinite is another benchmark from Chinese developer FYQD-Studio — they actually plan to make it a full game on Steam, though it's not finished yet. Like the Boundary benchmark, it includes basically all the ray tracing options, including some we haven't seen before in other games: reflections, refractions, shadows, ambient occlusion, order-independent transparency (OIT), and caustics. It also uses multiple-bounce rays at higher settings, which makes for more realistic reflections and refractions at the cost of performance. It has four presets, and we tested using the high and very high options, which also necessitated using DLSS 2.0 in performance mode (1080p upscaled to 4K).
The high preset uses low-resolution caustics, five bounce refractions, one bounce reflections, OIT, AO, and shadows from up to three light sources. Very high kicks things up another notch with high-resolution caustics, six bounce refractions, two bounce reflections, OIT, AO, and ray tracing for "all" light sources. Either way, the images are damn impressive and show what ray tracing games of the future could bring to the party. Performance was similar to Boundary when using the very high preset on the 3080, but the 2080 and 2080 Ti were a bit slower; the high preset runs better and nearly reaches 60 fps on the 3080.
Control was the first game to push more than one major ray tracing effect, with reflections, diffuse lighting, transparent reflections, and ambient occlusion. It was also one of the first implementations of DLSS 2.0, but actually showed the rendered resolution rather than using nebulous "quality, balanced, performance" labels. The 2080 Ti could just barely manage to average 60 fps with all the RT effects enabled, using 1080p upscaling to 4K. The 3080 FE can basically match that level of performance with 'quality' 1440p upscaling, or perform 40% faster than the 2080 Ti with 1080p upscaling. The 3080 is also 81% faster than the 2080 FE.
Death Stranding already ran quite well at 4K ultra settings with even a 2080 using the DLSS quality preset, so we didn't bother with lower DLSS levels. It's already a bit CPU limited, but it's a great showcase for how DLSS 2.0 can look better than native rendering — mostly because the native rendering uses temporal AA and tends to blur things too much. Anyway, it looks good, and if you like hiking around a weirder than snot representation of post-apocalyptic USA, you can now do so at over 120 fps — perfect for playing on a 4K 120Hz LG OLED TV!
Doom Eternal is our second Vulkan game, and like Strange Brigade, performance is very good across all three GPUs we tested. We did a full Doom Eternal performance analysis back when the game launched, and patches and driver updates have improved performance even more since then. There's no fancy ray tracing or DLSS effects in the game (even though it was originally on the list of games that were planning to add ray tracing when Turing launched), but you do get 171 fps on the 3080 at 4K ultra nightmare settings. Oh, and the RTX 3080 is more than twice as fast as the RTX 2080 for a change — one of the only cases where we measured such an increase in performance.
We recently did a full performance analysis of Microsoft Flight Simulator 2020 and found it to be a game that taxes CPUs more than anything else currently available. Also, it uses DirectX 11 and appears to be severely hampered by draw call limitations. Anyway, we wanted to see how the 3080 handled 4K ultra, bearing in mind the ultra preset basically maxes out at around 51 fps on our Core i9-9900K test PC. The result is better than the RTX 2080 Ti by 24%, and 54% higher than the 2080, but you'd be better off dropping a few settings (as we cover in the MSFS 2020 benchmarking article).
Similar to Death Stranding, Horizon Zero Dawn is another Sony PlayStation 4 game recently ported over to PC. Despite using the same engine, it doesn't run as well — and there's no DLSS support either. In terms of performance, however, it ends up within 1% of our overall averages. Also, 4K ultra easily sails past 60 fps now, which wasn't possible without dropping the settings on the 2080 Ti.
We already benchmark Metro Exodus without ray tracing or DLSS enabled, but we also wanted to try getting the full experience. The RTX 2080 Ti couldn't come close to 60 fps before, and even the RTX 3080 still barely averages 60 fps, with frequent dips into the 40-50 fps range. The good news is that you could tweak the settings a bit to improve performance — or maybe plunk down $1,500 on an RTX 3090 next week? Hmmm.
Call of Duty Modern Warfare ran well enough at 4K ultra with ray tracing shadows enabled that the developers didn't feel a need to support DLSS. Not that it really matters since this is primarily a multiplayer game, and most multiplayer gamers aren't going to worry about tanking framerates for realistic shadows. Relative performance is once again quite close to our overall metric.
Project CARS 3 recently launched, and we've used the previous version for our CPU reviews, so we thought we'd check out the new update. It ended up being more CPU-limited than other games and lacks some major options like post-process anti-aliasing. Still, even the RTX 2080 clears 60 fps, while the RTX 3080 comes close to 100 fps.
Shadow of the Tomb Raider was cleverly named, what with the introduction of ray-traced shadows. Except the patch that added the DXR effect only arrived many months after the game's 2018 release, when most gamers had probably moved on. Still, we use it as a benchmark, and enabling the DXR shadows can drop performance quite a bit. It also supports DLSS, which we enabled, but it's DLSS 1.0 (aka, the not as good version of DLSS).
Last in our list of bonus benchmarks, Wolfenstein Youngblood also belatedly added ray tracing and DLSS support months after its release. It only does ray-traced reflections, but it also uses the Vulkan API. So far, Vulkan games have tended to perform better than DX12 games, but that might be the developers rather than the API. Anyway, Youngblood is interesting in that the RTX 3080 can run with DLSS Quality mode and still outperform the RTX 2080 Ti in DLSS Performance mode. It can also break 144 fps on the 3080 using DLSS performance mode if you have a high refresh rate monitor.
GeForce RTX 3080: Mining Performance and Compute Benchmarks
Remember how we said we weren't done with benchmarks yet? Well, this is one area where we have a few more tests we want to run. We'll update this over the coming days with additional testing. We also plan to run the Blender GPU benchmark, and we're looking at some other potential options.
We did want to check one item off the list, though: cryptocurrency mining performance. There were rumors that the RTX 3080 can hit 120 MH/s in Ethereum mining, which is nearly triple what you can get from a 2080. That would potentially make this a profitable and desirable mining solution. We grabbed NiceHash Miner and ran the benchmark mode, which failed on most of the hashing algorithms.
It did work on Dagger Hashimoto (Ethash), however, generating a result of 80 MH/s. That's nearly double the 2080 Ti's 43 MH/s and more than double the 2080 FE's 34.6 MH/s … except mining algorithms tend to be very picky about drivers and OS choices. Could some optimized miner boost performance to 120 MH/s? Possibly. And if so, well, prepare for yet another cryptocurrency miner shortage while every miner and their dog tries to snag an RTX 3080.
Current estimates for a 120 MH/s GPU using 320W of power are around $4 per day in net cryptocurrency mining income. That means potentially hitting the break-even point in under 200 days. Of course, the cryptocurrency market remains extremely volatile, and you could end up taking two or three times as long — or you could strike it rich and end up breaking even in only a few months.
At the current 80 MH/s we measured, however, things are far less dire. That's only $2.50-$3.00 in net profits per day at current rates, which would require at least 230 days and potentially 300 days to break even. More likely, it would take over a year, and you're chewing through power, heating your house, and potentially wearing out your GPU fans in the process. That's still probably not bad enough to keep miners away, though. Sorry, gamers, but we might end up in another 2016/2017 GPU shortage if the miners see crypto-gold in them thar crypto-hills.
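The break-even math above can be sketched in a few lines. This is a rough estimate only: the income-per-MH/s rate and the $0.10/kWh electricity price are assumptions chosen to roughly match the figures cited above, and real rates swing constantly with the cryptocurrency market.

```python
# Rough mining break-even sketch using the ballpark figures discussed
# above. The income rate per MH/s and the electricity price are
# assumptions, not measured values.

def break_even_days(card_cost, hashrate_mhs, income_per_mhs_day,
                    power_watts, electricity_per_kwh):
    """Days of continuous mining needed to recoup the card's price."""
    gross_per_day = hashrate_mhs * income_per_mhs_day
    power_cost_per_day = power_watts / 1000 * 24 * electricity_per_kwh
    net_per_day = gross_per_day - power_cost_per_day
    if net_per_day <= 0:
        return float("inf")  # never breaks even at these rates
    return card_cost / net_per_day

# Hypothetical rates: ~$0.04 gross per MH/s per day, $0.10/kWh power.
print(round(break_even_days(699, 120, 0.04, 320, 0.10)))  # rumored 120 MH/s: 173 days
print(round(break_even_days(699, 80, 0.04, 320, 0.10)))   # measured 80 MH/s: 287 days
```

At the rumored 120 MH/s, net income lands around $4 per day and break-even comes in under 200 days; at the 80 MH/s we actually measured, it stretches toward 300 days, before accounting for wear and market swings.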
GeForce RTX 3080: Power, Temperatures, and Fan Speeds
We mentioned the significantly higher TGP rating of the RTX 3080 already, but now let's put it to the test. We use Powenetics in-line power monitoring hardware and software so that we can report the real power use of the graphics card. You can read more about our approach to GPU power testing, but the takeaway is that we're not dependent on AMD or Nvidia (or Intel, once dedicated Xe GPUs arrive) to report to various software utilities exactly how much power the GPUs use.
So how much power does the RTX 3080 use? It turns out the 320W rating is perhaps slightly conservative: We hit 335W averages in both Metro Exodus and FurMark.
Temperatures and fan speeds are closely related to power consumption. More airflow from a higher-RPM fan can reduce temperatures, while at the same time increasing noise levels. We're still working on some noise measurements, but the RTX 3080 is measurably quieter than the various RTX 20-series and RX 5700 series GPUs. As for fan speeds and temperatures:
Wow. No, seriously, that's super impressive. Despite using quite a bit more power than any other Nvidia or AMD GPU, temperatures topped out at a maximum of 72C during testing and averaged 70.2C. What's more, the fan speed on the RTX 3080 was lower than any of the other high-end GPUs. Again, power, temperatures, and fan speed are all interrelated, so changing one affects the others. A fourth factor is GPU clock speed:
Here you can see that the RTX 3080 clocks to slightly lower levels than the 2080 Super, but that's the only GPU with a higher average speed in Metro Exodus. The 3080 drops a bit lower in FurMark, keeping power in check without compromising performance too much — unlike some other cards where we've seen them blow past official TDPs.
Overall, the combination of the new heatsink and fan arrangement appears more than able to keep up with the RTX 3080 FE's increased power use. It's one of the quietest high-end cards we've seen in a long time. There will undoubtedly be triple-slot coolers on some custom AIB cards that run just as quiet, but Nvidia's new 30-series Founders Edition cooler shows some great engineering talent.
GeForce RTX 3080: Is 10GB VRAM Enough?
In the past few weeks since the RTX 30-series announcement, there have been quite a few discussions about whether the 3080 has enough memory. Take a look at the previous generation with 11GB, or the RTX 3090 with 24GB, and 10GB seems like it's maybe too little. Let's clear up a few things.
There are ways to exceed using 10GB of VRAM, but it's mostly via mods and questionably coded games — or running a 5K or 8K display. The problem is that a lot of gamers use utilities that measure allocated memory rather than actively used memory (e.g., MSI Afterburner), and they see all of their VRAM being sucked up and think they need more memory. Even some games (Resident Evil 3 remake) do this, informing gamers that they 'need' 12GB or more to run the ultra settings properly. (Hint: They don't.)
Using all of your GPU's VRAM to basically cache textures and data that might be needed isn't a bad idea. Call of Duty Modern Warfare does this, for example, and Windows does this with system RAM to a certain extent. If the memory is just sitting around doing nothing, why not put it to potential use? Data can sit in memory until either it is needed or the memory is needed for something else, but it's not really going to hurt anything. So, even if you look at a utility that shows a game using all of your VRAM, that doesn't mean you're actually swapping data to system RAM and killing performance.
You'll notice when data actually starts getting swapped out to system memory because it causes a substantial drop in performance. Even PCIe Gen4 x16 only has 31.5 GBps of bandwidth available. That's less than 5% of the RTX 3080's 760 GBps of bandwidth. If a game really exceeds the GPU's internal VRAM capacity, performance will tank hard.
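The bandwidth gap is easy to quantify. The figures below are the ones used above: PCIe 4.0 x16 at roughly 31.5 GBps each direction, and the RTX 3080's GDDR6X at 19 Gbps per pin across a 320-bit bus.

```python
# Back-of-the-envelope comparison: PCIe 4.0 x16 host bandwidth vs the
# RTX 3080's on-board GDDR6X bandwidth, illustrating why spilling data
# out of VRAM into system RAM tanks performance.

PCIE4_X16_GBPS = 31.5                 # PCIe 4.0 x16, ~31.5 GBps per direction
GDDR6X_GBPS = 19.0 * 320 / 8          # 19 Gbps/pin * 320-bit bus = 760 GBps

ratio = PCIE4_X16_GBPS / GDDR6X_GBPS
print(f"PCIe x16 is {ratio:.1%} of local VRAM bandwidth")  # ~4.1%
```

Any working set that has to stream over the PCIe bus instead of living in VRAM runs at under 5% of the card's local memory bandwidth, which is why the frame rate collapses rather than degrading gracefully.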
If you're worried about 10GB of memory not being enough, my advice is to just stop. Ultra settings often end up being a placebo effect compared to high settings — 4K textures are mostly useful on 4K displays, and 8K textures are either used for virtual texturing (meaning, parts of the texture are used rather than the whole thing at once) or not used at all. We might see games in the next few years where a 16GB card could perform better than a 10GB card, at which point dropping texture quality a notch will cut VRAM use in half and look nearly indistinguishable.
There's no indication that games are set to start using substantially more memory, and the Xbox Series X also has 10GB of GPU memory, so an RTX 3080 should be good for many years, at least. And when it's not quite managing, maybe then it will be time to upgrade to a 16GB or even 32GB GPU.
If you're in the small group of users who actually need more than 10GB, by all means, wait for the RTX 3090 reviews and launch next week. It's over twice the cost for at best 20% more performance, which basically makes it yet another Titan card, just with a better price than the Titan RTX (but worse than the Titan Xp and 2080 Ti). And with 24GB, it should have more than enough RAM for just about anything, including scientific and content creation workloads.
GeForce RTX 3080: The New King of the Graphics Card Hill
Nvidia's last major launch was two years ago with the Turing architecture and RTX 20-series GPUs. Since then, Nvidia has released a slew of high-end, midrange, and budget offerings, plus the Super product refreshes, all built on TSMC's 12nm FinFET lithography. Meanwhile, AMD came out with its first 7nm GPUs in early 2019 with the Radeon VII and followed up with the Navi 1x RX 5700 series GPUs last July. One year later, Nvidia is finally ready to move beyond Turing and 12nm. It's about time.
Since the RTX 2080 Ti debuted in September 2018, the top of the GPU hierarchy has been static. Sure, there was the seemingly inevitable and laughably-priced Titan RTX a few months later, but at double the price of the 2080 Ti for a few percent more performance, that was purely for bragging rights — or professional users that actually needed the 24GB of memory, perhaps. Now the RTX 3080 knocks everything else off the mountain and plans to reign as the king for a while.
Except, of course, the GeForce RTX 3090 launches next week. But is that the king, or more like the emperor of GPUs? And do you really need it? Unless you were already in the market for a Titan RTX or thinking the RTX 2080 Ti was a reasonable option, probably not. On paper, the RTX 3090 is only 20% faster than the 3080, which means it will likely be more like 10-15% faster in practice — assuming you don't run into CPU bottlenecks. It does have more than twice as much memory, but as noted above, even that's more than a bit overkill.
The GeForce RTX 3080 is here, right now, and priced pretty reasonably considering the performance it offers. Last month, you could have spent $2,500 on dual RTX 2080 Ti cards hooked up via NVLink, only to find that multi-GPU support in games is largely dead, particularly in new releases. Now, for $700, you get 30% better performance than the outgoing RTX 2080 Ti and pocket $500 in savings. That's assuming you can find an RTX 3080 in stock.
Let's also be clear that the RTX 3080 is primarily for high-resolution gaming. Yes, you can run 1440p with RTX effects, and it will be a good fit. It's a better fit for 4K gaming. Don't bother with it if you're using a 1080p display, as you could get nearly the same level of performance with a lesser GPU. Which brings us to the next option: Wait for the RTX 3070 or RX 6800 XT (whatever AMD's $400-$500 option ends up being called).
The RTX 3070 should still be plenty fast for 1440p gaming, and more than fast enough for 1080p — just like the RTX 2080 Ti. Nvidia says it will perform "better than the 2080 Ti," though we take that marketing-speak with a scoop of salt. Out of all the benchmarks we ran, there was only one (Doom Eternal) where the 3080 actually doubled the 2080's performance.
Anyway, saving $200 and buying a 3070 could make a lot of sense. It's worth noting, however, that the RTX 3070 is a substantial step down from the RTX 3080. The 3080 has 48% more GPU, RT, and Tensor cores, it has 25% more memory, and the memory is clocked 36% higher. That's a big enough gap that we could see an RTX 3070 Ti down the road, but at what price? Alternatively, wait and see what AMD's Navi 2x / RX 6000 GPUs can do, which we'll hear about more on October 28.
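Those spec-gap percentages fall straight out of Nvidia's published numbers for the two cards (8704 vs. 5888 CUDA cores, 10GB of 19 Gbps GDDR6X vs. 8GB of 14 Gbps GDDR6), as a quick check shows:

```python
# Spec gap between RTX 3080 and RTX 3070, computed from Nvidia's
# published figures for each card.

specs = {
    "RTX 3080": {"cuda_cores": 8704, "vram_gb": 10, "mem_speed_gbps": 19.0},
    "RTX 3070": {"cuda_cores": 5888, "vram_gb": 8, "mem_speed_gbps": 14.0},
}

for key in ("cuda_cores", "vram_gb", "mem_speed_gbps"):
    gap = specs["RTX 3080"][key] / specs["RTX 3070"][key] - 1
    print(f"{key}: RTX 3080 has {gap:.0%} more")
```

The RT and Tensor core counts scale with the SM count, so they show the same 48% gap as the CUDA cores.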
The bottom line is that the RTX 3080 is the new high-end gaming champion, delivering truly next-gen performance without a massive increase in price. If you've been sitting on a GTX 1080 Ti or lower, waiting for a good time to upgrade, that time has arrived. The only remaining question is just how competitive AMD's RX 6000, aka Big Navi, will be. Even with 80 CUs, on paper, it looks like Nvidia's RTX 3080 may trump the top Navi 2x cards, thanks to GDDR6X and the doubling down on FP32 capability. AMD might offer 16GB of memory, but it's probably going to be paired with a 256-bit bus and clocked quite a bit lower than 19 Gbps, which may limit performance.
If you're still happy with your current graphics card, of course, there's no need to upgrade. But right now, we're staring at Cyberpunk 2077, Watch Dogs: Legion, and the next generation consoles with ray tracing support. I, for one, know I want to run those games with all the bells and whistles enabled — at least for a little while, before deciding if my GPU can cope. Certainly some time between now and December, I'm going to need a new GPU. Plus, November is my birthday month, by which time we'll hopefully have benchmarks of the RX 6900 XT. One of those cards is destined to end up in my gaming PC.