Nvidia’s Turing Architecture Explored: Inside the GeForce RTX 2080

Meet TU104 and GeForce RTX 2080

TU104: Turing With Middle Child Syndrome

It’s not that TU104 goes unloved, but we’re not used to seeing three GPUs introduced alongside a new architecture. Then again, with GeForce RTX 2080 Ti starting at $1000, the RTX 2080, priced from $700, is going to find its way into more gaming PCs.

Similar to TU102, TSMC manufactures TU104 on its 12nm FinFET node. But a transistor count of 13.6 billion results in a smaller 545 mm² die. “Smaller,” of course, requires a bit of context. Turing Jr out-measures the last generation’s 471 mm² flagship (GP102) and comes close to the size of GK110 from the 2013-era GeForce GTX Titan.

TU104 is constructed with the same building blocks as TU102; it just features fewer of them. Streaming Multiprocessors still sport 64 CUDA cores, eight Tensor cores, one RT core, four texture units, 16 load/store units, 256KB of register space, and 96KB of L1 cache/shared memory. TPCs are still composed of two SMs and a PolyMorph geometry engine. Only here, there are four TPCs per GPC, and six GPCs spread across the processor. Therefore, a fully enabled TU104 wields 48 SMs, 3072 CUDA cores, 384 Tensor cores, 48 RT cores, 192 texture units, and 24 PolyMorph engines.
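The totals above follow directly from the GPC/TPC/SM hierarchy. As a rough sketch of that arithmetic (the dictionary keys and the function name `full_chip_totals` are ours, chosen for illustration, not Nvidia terminology):

```python
# Per-SM resources as listed in the article.
PER_SM = {"cuda_cores": 64, "tensor_cores": 8, "rt_cores": 1, "texture_units": 4}

SMS_PER_TPC = 2    # each TPC also carries one PolyMorph engine
TPCS_PER_GPC = 4   # TU104 layout
GPCS = 6           # TU104 layout


def full_chip_totals(gpcs=GPCS, tpcs_per_gpc=TPCS_PER_GPC):
    """Multiply per-SM resources up through the TPC and GPC levels."""
    tpcs = gpcs * tpcs_per_gpc
    sms = tpcs * SMS_PER_TPC
    totals = {name: per_sm * sms for name, per_sm in PER_SM.items()}
    totals["sms"] = sms
    totals["polymorph_engines"] = tpcs  # one per TPC
    return totals


tu104 = full_chip_totals()
print(tu104)
```

Changing `gpcs` or `tpcs_per_gpc` is enough to model a differently sized chip built from the same SM.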

A correspondingly narrower back end feeds the compute resources through eight 32-bit GDDR6 memory controllers (256-bit aggregate) attached to 64 ROPs and 4MB of L2 cache.

TU104 also loses one of TU102’s two eight-lane NVLink connections, limiting it to one x8 link and 50 GB/s of bi-directional throughput.

GeForce RTX 2080: TU104 Gets A (Tiny) Haircut

After seeing the GeForce RTX 2080 Ti serve up respectable performance in Battlefield V at 1920x1080 with ray tracing enabled, we can’t help but wonder if GeForce RTX 2080 is fast enough to maintain playable frame rates. Even a complete TU104 GPU is limited to 48 RT cores, compared to the 68 enabled on GeForce RTX 2080 Ti’s TU102. But because Nvidia goes in and turns off one of TU104’s TPCs to create GeForce RTX 2080, another pair of RT cores is lost (along with 128 CUDA cores, eight texture units, 16 Tensor cores, and so on).
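To see where those losses come from, here is a small sketch that disables one TPC on a full TU104 and reports what goes away and what is left. The helper names `scale` and `cut_down` are hypothetical; the per-SM figures are the ones quoted in this article.

```python
# Per-SM resources as listed in the article.
PER_SM = {"cuda_cores": 64, "tensor_cores": 8, "rt_cores": 1, "texture_units": 4}
SMS_PER_TPC = 2
FULL_TU104_TPCS = 24  # 6 GPCs x 4 TPCs


def scale(sm_count):
    """Resources provided by a given number of SMs."""
    return {name: per_sm * sm_count for name, per_sm in PER_SM.items()}


def cut_down(disabled_tpcs):
    """Return (resources lost, resources remaining, active SM count)."""
    lost_sms = disabled_tpcs * SMS_PER_TPC
    active_sms = FULL_TU104_TPCS * SMS_PER_TPC - lost_sms
    return scale(lost_sms), scale(active_sms), active_sms


lost, remaining, active_sms = cut_down(1)
print("lost:", lost)
print("remaining:", remaining, "across", active_sms, "SMs")
```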

In the end, GeForce RTX 2080 struts onto the scene with 46 SMs hosting 2944 CUDA cores, 368 Tensor cores, 46 RT cores, 184 texture units, 64 ROPs, and 4MB of L2 cache. Eight gigabytes of 14 Gb/s GDDR6 on a 256-bit bus move up to 448 GB/s of data, adding 128 GB/s of memory bandwidth beyond what GeForce GTX 1080 could do.
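The bandwidth figure is simply the per-pin data rate multiplied by the bus width, divided by eight bits per byte. A minimal sketch follows; note that the 10 Gb/s rate used for GeForce GTX 1080's GDDR5X is not quoted in the text and is an assumption inferred from that card's 320 GB/s on a 256-bit bus.

```python
def memory_bandwidth_gbs(data_rate_gbps, bus_width_bits):
    """Peak bandwidth in GB/s: per-pin rate (Gb/s) x bus width (bits) / 8."""
    return data_rate_gbps * bus_width_bits / 8


rtx_2080 = memory_bandwidth_gbs(14, 256)  # 14 Gb/s GDDR6, 256-bit bus
gtx_1080 = memory_bandwidth_gbs(10, 256)  # assumed 10 Gb/s GDDR5X, 256-bit bus

print(rtx_2080, gtx_1080, rtx_2080 - gtx_1080)
```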

| | GeForce RTX 2080 FE | GeForce GTX 1080 FE |
|---|---|---|
| Architecture (GPU) | Turing (TU104) | Pascal (GP104) |
| CUDA Cores | 2944 | 2560 |
| Peak FP32 Compute | 10.6 TFLOPS | 8.9 TFLOPS |
| Tensor Cores | 368 | N/A |
| RT Cores | 46 | N/A |
| Texture Units | 184 | 160 |
| Base Clock Rate | 1515 MHz | 1607 MHz |
| GPU Boost Rate | 1800 MHz | 1733 MHz |
| Memory Capacity | 8GB GDDR6 | 8GB GDDR5X |
| Memory Bus | 256-bit | 256-bit |
| Memory Bandwidth | 448 GB/s | 320 GB/s |
| ROPs | 64 | 64 |
| L2 Cache | 4MB | 2MB |
| TDP | 225W | 180W |
| Transistor Count | 13.6 billion | 7.2 billion |
| Die Size | 545 mm² | 314 mm² |
| SLI Support | Yes (x8 NVLink) | Yes (MIO) |

Reference and Founders Edition RTX 2080s have a 1515 MHz base frequency. Nvidia’s own overclocked models ship with a GPU Boost rating of 1800 MHz, while the reference spec is 1710 MHz. Peak FP32 compute performance of 10.6 TFLOPS puts GeForce RTX 2080 Founders Edition behind GeForce GTX 1080 Ti (11.3 TFLOPS), but well ahead of GeForce GTX 1080 (8.9 TFLOPS). Of course, the faster Founders Edition model also uses more power. Its 225W TDP is 10W higher than the reference GeForce RTX 2080, and a full 45W above last generation’s GeForce GTX 1080.
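The peak FP32 numbers can be reproduced from core counts and boost clocks. This sketch assumes the usual convention of two floating-point operations per CUDA core per clock (one fused multiply-add) at the rated GPU Boost frequency; the function name is ours.

```python
def peak_fp32_tflops(cuda_cores, boost_mhz):
    """Cores x 2 FLOPs per clock (assumed FMA) x clock in MHz, scaled to TFLOPS."""
    # MHz -> Hz is x1e6 and FLOPS -> TFLOPS is /1e12, so the net scale is /1e6.
    return cuda_cores * 2 * boost_mhz / 1e6


rtx_2080_fe = peak_fp32_tflops(2944, 1800)   # Founders Edition boost
rtx_2080_ref = peak_fp32_tflops(2944, 1710)  # reference boost
gtx_1080 = peak_fp32_tflops(2560, 1733)

for name, value in [("RTX 2080 FE", rtx_2080_fe),
                    ("RTX 2080 reference", rtx_2080_ref),
                    ("GTX 1080", gtx_1080)]:
    print(f"{name}: {value:.1f} TFLOPS")
```

By the same arithmetic, the reference-clocked RTX 2080 lands at roughly 10.1 TFLOPS, about half a teraflop behind the Founders Edition.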
