Meet TU104 and GeForce RTX 2080
TU104: Turing With Middle Child Syndrome
It’s not that TU104 goes unloved, but again, we’re not used to seeing three GPUs introduced alongside a new architecture. Then again, with GeForce RTX 2080 Ti starting at $1000, the RTX 2080, priced from $700, is going to find its way into more gaming PCs.
Similar to TU102, TSMC manufactures TU104 on its 12nm FinFET node. But a transistor count of 13.6 billion results in a smaller 545 mm² die. “Smaller,” of course, requires a bit of context. Turing Jr out-measures the last generation’s 471 mm² flagship (GP102) and comes close to the size of GK110 from the 2013-era GeForce GTX Titan.
TU104 is constructed with the same building blocks as TU102; it just features fewer of them. Streaming Multiprocessors still sport 64 CUDA cores, eight Tensor cores, one RT core, four texture units, 16 load/store units, 256KB of register space, and 96KB of L1 cache/shared memory. TPCs are still composed of two SMs and a PolyMorph geometry engine. Only here, there are four TPCs per GPC, and six GPCs spread across the processor. Therefore, a fully enabled TU104 wields 48 SMs, 3072 CUDA cores, 384 Tensor cores, 48 RT cores, 192 texture units, and 24 PolyMorph engines.
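The totals above follow directly from the GPC/TPC/SM hierarchy, and a few lines of Python make the multiplication explicit. This is only a sketch of the arithmetic: the per-SM counts and the 6 x 4 x 2 layout come from the text, while the function name and its `disabled_tpcs` parameter are made up for illustration.

```python
# Per-SM resources as listed in the paragraph above.
PER_SM = {"cuda_cores": 64, "tensor_cores": 8, "rt_cores": 1, "texture_units": 4}

SMS_PER_TPC = 2   # each TPC pairs two SMs with one PolyMorph engine
TPCS_PER_GPC = 4  # TU104 layout
GPCS = 6


def chip_totals(gpcs=GPCS, tpcs_per_gpc=TPCS_PER_GPC, disabled_tpcs=0):
    """Return unit totals for the chip (hypothetical helper, not an Nvidia tool)."""
    tpcs = gpcs * tpcs_per_gpc - disabled_tpcs
    sms = tpcs * SMS_PER_TPC
    totals = {name: count * sms for name, count in PER_SM.items()}
    totals["sms"] = sms
    totals["polymorph_engines"] = tpcs  # one PolyMorph engine per TPC
    return totals


full_tu104 = chip_totals()
print(full_tu104)
```

Calling `chip_totals(disabled_tpcs=1)` removes two SMs from every total, which is the configuration GeForce RTX 2080 ships with.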
A correspondingly narrower back end feeds the compute resources through eight 32-bit GDDR6 memory controllers (256-bit aggregate) attached to 64 ROPs and 4MB of L2 cache.
TU104 also loses an eight-lane NVLink connection, limiting it to one x8 link and 50 GB/s of bi-directional throughput.
GeForce RTX 2080: TU104 Gets A (Tiny) Haircut
After seeing the GeForce RTX 2080 Ti serve up respectable performance in Battlefield V at 1920x1080 with ray tracing enabled, we can’t help but wonder if GeForce RTX 2080 is fast enough to maintain playable frame rates. Even a complete TU104 GPU is limited to 48 RT cores, compared to the 68 active on GeForce RTX 2080 Ti’s TU102. But because Nvidia goes in and turns off one of TU104’s TPCs to create GeForce RTX 2080, another pair of RT cores is lost (along with 128 CUDA cores, eight texture units, 16 Tensor cores, and so on).
In the end, GeForce RTX 2080 struts onto the scene with 46 SMs hosting 2944 CUDA cores, 368 Tensor cores, 46 RT cores, 184 texture units, 64 ROPs, and 4MB of L2 cache. Eight gigabytes of 14 Gb/s GDDR6 on a 256-bit bus move up to 448 GB/s of data, adding more than 100 GB/s of memory bandwidth beyond what GeForce GTX 1080 could do.
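The bandwidth figure is simply the per-pin data rate multiplied by the bus width, converted from bits to bytes. The short Python sketch below works it through. The 14 Gb/s rate and 256-bit bus come from the text; the 10 Gb/s GDDR5X rate for GeForce GTX 1080 is not stated here and is filled in as an assumption consistent with that card's 320 GB/s rating. The function name is illustrative.

```python
def memory_bandwidth_gbs(data_rate_gbps, bus_width_bits):
    """Peak bandwidth in GB/s: per-pin rate (Gb/s) x bus width (bits) / 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8


rtx_2080 = memory_bandwidth_gbs(14, 256)  # GDDR6 at 14 Gb/s on a 256-bit bus
gtx_1080 = memory_bandwidth_gbs(10, 256)  # GDDR5X at 10 Gb/s (assumed, see lead-in)

print(f"RTX 2080: {rtx_2080:.0f} GB/s")
print(f"GTX 1080: {gtx_1080:.0f} GB/s")
print(f"Difference: {rtx_2080 - gtx_1080:.0f} GB/s")
```

Since both cards use a 256-bit bus, the entire gain comes from the faster memory.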
Specification | GeForce RTX 2080 FE | GeForce GTX 1080 FE |
Architecture (GPU) | Turing (TU104) | Pascal (GP104) |
CUDA Cores | 2944 | 2560 |
Peak FP32 Compute | 10.6 TFLOPS | 8.9 TFLOPS |
Tensor Cores | 368 | N/A |
RT Cores | 46 | N/A |
Texture Units | 184 | 160 |
Base Clock Rate | 1515 MHz | 1607 MHz |
GPU Boost Rate | 1800 MHz | 1733 MHz |
Memory Capacity | 8GB GDDR6 | 8GB GDDR5X |
Memory Bus | 256-bit | 256-bit |
Memory Bandwidth | 448 GB/s | 320 GB/s |
ROPs | 64 | 64 |
L2 Cache | 4MB | 2MB |
TDP | 225W | 180W |
Transistor Count | 13.6 billion | 7.2 billion |
Die Size | 545 mm² | 314 mm² |
SLI Support | Yes (x8 NVLink) | Yes (MIO) |
Reference and Founders Edition RTX 2080s share a 1515 MHz base frequency. Nvidia’s own overclocked Founders Edition carries a GPU Boost rating of 1800 MHz, while the reference spec is 1710 MHz. Peak FP32 compute performance of 10.6 TFLOPS puts GeForce RTX 2080 Founders Edition behind GeForce GTX 1080 Ti (11.3 TFLOPS), but well ahead of GeForce GTX 1080 (8.9 TFLOPS). Of course, the faster Founders Edition model also uses more power. Its 225W TDP is 10W higher than the reference GeForce RTX 2080, and a full 45W above last generation’s GeForce GTX 1080.
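For readers who want to check the TFLOPS numbers, peak FP32 throughput is conventionally quoted as CUDA cores x 2 operations per clock (one fused multiply-add) x the GPU Boost clock. That convention is an assumption here, since the text doesn't spell it out, but it reproduces the quoted figures. The sketch below uses only the core counts and clocks given above, and the function name is illustrative.

```python
def peak_fp32_tflops(cuda_cores, boost_mhz):
    """Peak FP32 TFLOPS, assuming one fused multiply-add (2 ops) per core per clock."""
    # cores * 2 * MHz yields MFLOPS; divide by 1e6 to get TFLOPS.
    return cuda_cores * 2 * boost_mhz / 1e6


rtx_2080_fe = peak_fp32_tflops(2944, 1800)   # Founders Edition boost clock
rtx_2080_ref = peak_fp32_tflops(2944, 1710)  # reference boost clock
gtx_1080 = peak_fp32_tflops(2560, 1733)

for name, value in [("RTX 2080 FE", rtx_2080_fe),
                    ("RTX 2080 reference", rtx_2080_ref),
                    ("GTX 1080", gtx_1080)]:
    print(f"{name}: {value:.1f} TFLOPS")
```

By the same math, the reference-clocked RTX 2080 lands at roughly 10.1 TFLOPS, about half a teraflop below the Founders Edition.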