The Nvidia RTX 3060 12GB brings a new level of performance to the mainstream market, sort of. Officially, the RTX 3060 launches today with prices starting at just $329. Realistically? You're as likely to find one at that price as you are to find an RTX 3060 Ti at $399, RTX 3070 at $499, or RTX 3080 at $699: not entirely impossible, perhaps, but highly unlikely. Nvidia's Ampere architecture now powers many of the best graphics cards, and they're all seeing massive levels of demand from both gamers and cryptocurrency miners. Nvidia has added firmware and driver code to detect Ethereum mining, which should help a bit. But when people are willing to pay extreme scalper pricing on eBay, even for cards like the GTX 1660 Super and RTX 2060, it's no surprise that everything in our GPU benchmarks hierarchy is pretty much sold out right now. Nvidia is even working with partners to bring back previous generation Turing and Pascal cards.
None of that makes this a bad GPU, but we expect the RTX 3060 to be just as difficult to acquire as any other modern GPU. Eventually, the current Ethereum mining boom will fade away, but it could take a year or more before we see the end of chip shortages. That shouldn't surprise anyone at this point, but if you've been hoping for a reasonably priced gaming PC upgrade, it's a depressing state of affairs.
Unlike the previous Ampere GPUs, Nvidia won't offer an RTX 3060 Founders Edition, so we're looking at a third-party card. Nvidia shipped us the EVGA GeForce RTX 3060 XC for this launch review, a reasonably compact and relatively unassuming card. There's no metal (or even plastic) backplate, no RGB lighting, and two custom-sized 87mm fans for cooling with a 2.0-slot form factor. The card measures 202x110x38mm and weighs 653g, which is quite the change of pace compared to the other third-party Ampere cards we've reviewed so far.
There are reasons for that, of course. Creating a mainstream card and decking it out with all the bells and whistles costs money. And we think most gamers shopping for a good value are far better served by modest designs with good performance. There will certainly be extreme variants of the RTX 3060, and some of them will be priced higher than the budget RTX 3060 Ti options. Let's be clear: Even the fastest RTX 3060 won't beat a 3060 Ti in most situations — yes, even with 12GB VRAM. That's because memory capacity isn't a huge factor once you go above 8GB, and having more memory bandwidth, thanks to its wider memory bus, gives the 3060 Ti a big advantage. Also, the 3060 Ti has 35% more GPU cores.
|Graphics Card|RTX 3060 Ti|RTX 3060|RTX 2060 Super|RTX 2060|
|---|---|---|---|---|
|Process Technology|Samsung 8N|Samsung 8N|TSMC 12FFN|TSMC 12FFN|
|Die size (mm^2)|392.5|276|445|445|
|Base Clock (MHz)|1410|1320|1470|1410|
|Boost Clock (MHz)|1665|1777|1650|1680|
|VRAM Speed (Gbps)|14|15|14|14|
|VRAM Bus Width (bits)|256|192|256|192|
|TFLOPS FP32 (Boost)|16.2|12.7|7.2|6.5|
|TFLOPS FP16 (Tensor)|65 (130)|51 (102)|57|52|
Here's how things break down, comparing the RTX 3060 with its closest Ampere sibling and Turing predecessors. The RTX 2060 and 2060 Super show how much things have changed for the -60 suffix cards between Turing and Ampere. Ampere gives you a lot more shader cores, which means potentially much higher computational performance, and a minor improvement in memory bandwidth for the 12GB card. It also doubles VRAM capacity (at least until the anticipated RTX 3060 6GB shows up, though perhaps Nvidia will just leave that for the RTX 3050 line) and boasts improvements in the RT and Tensor cores, as well as the memory subsystem, all leading to better performance. Power use remains similar, with a 170W TGP (Total Graphics Power), a decent step down from the RTX 3060 Ti's 200W TGP.
One interesting tidbit is that this is the first time Nvidia has used 15Gbps GDDR6 memory. The RTX 20-series cards all used 14Gbps memory, except for the RTX 2080 Super that came equipped with 15.5Gbps VRAM. That narrows the bandwidth gap between the 3060 and 3060 Ti a bit, though the extra 64 bits of interface width still give the GA104 cards a clear advantage. One area where GA106 doesn't have an advantage is in ROPs (Render Outputs), as it only has 48, the same as the RTX 2060.
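To put rough numbers on that gap, peak memory bandwidth can be estimated as the per-pin data rate multiplied by the bus width, divided by eight bits per byte. The sketch below applies that to the speeds and bus widths from the table. The function name is purely illustrative, and these are theoretical peaks rather than measured throughput.

```python
def bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (rate per pin x pins / 8)."""
    return data_rate_gbps * bus_width_bits / 8


# (VRAM speed in Gbps, bus width in bits), taken from the spec table
cards = {
    "RTX 3060 Ti": (14, 256),
    "RTX 3060": (15, 192),
    "RTX 2060 Super": (14, 256),
    "RTX 2060": (14, 192),
}

for name, (rate, width) in cards.items():
    print(f"{name}: {bandwidth_gbs(rate, width):.0f} GB/s")

# Hypothetical: the same 192-bit bus with 14Gbps memory instead of 15Gbps
print(f"RTX 3060 at 14Gbps: {bandwidth_gbs(14, 192):.0f} GB/s")
```

By this estimate, the RTX 3060 lands at 360 GB/s against 448 GB/s for the 3060 Ti. With 14Gbps memory it would have matched the RTX 2060 at 336 GB/s.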
However, the differences between Turing and Ampere GPUs don't always show up in specs tables like the above. Theoretically, the RTX 3060 has up to 95% more FP32 performance and 97% more FP16 Tensor core performance than the RTX 2060. In practice, the actual performance difference is much smaller, as half of the FP32 pipelines share processing resources with INT32 pipelines. The 3060 shouldn't ever be slower for gaming purposes, but most of the time, it will only be around 20-25% faster.
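For reference, the theoretical uplift can be recomputed from the table as a simple ratio. This is only a sketch: it uses the rounded table values, so results can differ by a point or so from the quoted percentages, and it assumes the parenthesized Tensor figure (102) is the with-sparsity number that the comparison is based on.

```python
def percent_gain(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new / old - 1) * 100


# Rounded TFLOPS values from the spec table: RTX 3060 vs. RTX 2060
fp32_gain = percent_gain(12.7, 6.5)
# Assumption: 102 is the RTX 3060's Tensor throughput with sparsity
tensor_gain = percent_gain(102, 52)

print(f"FP32:        {fp32_gain:.0f}% more")
print(f"FP16 Tensor: {tensor_gain:.0f}% more")
```

Either way, the on-paper gain is close to double, which is far more than the 20-25% typically seen in games.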
This is the first desktop card to use Nvidia's GA106 processor. At a high level, there are three GPCs (Graphics Processing Clusters), each with up to 10 SMs and 16 ROPs (the two blocks of eight blue rectangles each at the bottom of the GPC). The full chip has 30 SMs while the 3060 disables two and ends up with 28 SMs, but everything else is left alone. (Note that the mobile RTX 3060 has all 30 SMs enabled, though it only comes with 6GB of memory, which is also clocked lower than on the desktop card.)
Each SM contains 64 dedicated FP32 CUDA cores, plus 64 more FP32+INT32 CUDA cores. The latter can execute either FP32 or INT32 in a given cycle, not both. The SMs also contain one second-gen RT core and four third-gen Tensor cores, each of which offers up to twice the performance of the previous generation cores, and with sparsity the Tensor cores are potentially four times as fast as on Turing. Finally, there are six 32-bit memory interfaces, each one linked to a single 8Gb or 16Gb GDDR6 module. The 16Gb modules are reserved for desktops at present, with the 8Gb modules used on laptops.
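The SM layout also explains where the table's FP32 figure comes from. As a rough sketch, peak throughput is the number of FP32-capable cores multiplied by the boost clock and by two operations per cycle. The factor of two assumes each core retires one fused multiply-add per clock, which is the usual convention for these ratings and not something stated above. The names here are illustrative.

```python
FP32_CORES_PER_SM = 64 + 64  # dedicated FP32 cores plus shared FP32/INT32 cores
OPS_PER_CORE_PER_CLOCK = 2   # assumption: one fused multiply-add counts as 2 ops


def fp32_tflops(sm_count: int, boost_mhz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPS at the given boost clock."""
    ops_per_second = sm_count * FP32_CORES_PER_SM * OPS_PER_CORE_PER_CLOCK * boost_mhz * 1e6
    return ops_per_second / 1e12


# Desktop RTX 3060: 28 enabled SMs at the 1777MHz boost clock from the table
print(f"RTX 3060 (28 SMs): {fp32_tflops(28, 1777):.1f} TFLOPS")
# Hypothetical: a full 30-SM GA106 at the same clock
print(f"Full GA106 (30 SMs): {fp32_tflops(30, 1777):.1f} TFLOPS")
```

The 28-SM result matches the 12.7 TFLOPS in the table. That peak assumes the shared cores do nothing but FP32 work, which is why real games fall short of it.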
The full GA106 chip has 12 billion transistors, down from 17.4 billion in GA104. That shrinks the die size from 393 mm^2 to just 276 mm^2, which not only helps to reduce the cost of the chip, but also increases the number of chips Nvidia can get from a single wafer. If you're wondering, GA106 is less than half the size of GA102, which measures 628.4 mm^2 and has 28.3 billion transistors. At an estimate, Nvidia can get around 130 dies per wafer with GA104 (some of which are defective, though most of those end up as partially disabled chips), while the smaller size of GA106 allows for around 200 dies per wafer. Smaller dies mean better yields, and more dies per wafer mean more graphics cards to go around. That's the hope.
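For a sense of how die area translates into dies per wafer, here is a back-of-the-envelope sketch using a common first-order approximation: wafer area divided by die area, minus a correction for partial dies lost around the edge. It assumes standard 300mm wafers, which the figures above don't specify, and it ignores scribe lines, edge exclusion, and defects. It therefore gives somewhat more optimistic counts than the estimates quoted above, and it's the ratio between the two chips that is most useful.

```python
import math


def gross_dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> float:
    """First-order estimate of whole dies per wafer, before any yield loss.

    Assumption: no scribe lines or edge exclusion are modeled, so this
    overestimates what a real production wafer delivers.
    """
    radius = wafer_diameter_mm / 2
    wafer_area = math.pi * radius ** 2
    # Approximate number of partial dies lost along the wafer's circumference
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return wafer_area / die_area_mm2 - edge_loss


ga104 = gross_dies_per_wafer(392.5)
ga106 = gross_dies_per_wafer(276.0)

print(f"GA104: ~{ga104:.0f} gross dies per wafer")
print(f"GA106: ~{ga106:.0f} gross dies per wafer")
print(f"GA106 / GA104 ratio: {ga106 / ga104:.2f}x")
```

Under these assumptions, GA106 yields roughly one and a half times as many candidate dies per wafer as GA104, broadly in line with the ratio implied by the estimates above.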