Fewer than three months have passed since Nvidia took the wraps off GeForce GTX Titan X, and the company is already launching another GM200-based graphics card, the GeForce GTX 980 Ti. It’s about $400 cheaper than the flagship’s street price, yet we’re told it gives up only a few percentage points of performance. Is there still a reason to lust after the Titan X? Could you, in good conscience, spend $500 on the 980 knowing that this monster exists (yes, the 980 is dropping $50, according to Nvidia)? Is this move preempting AMD’s upcoming ultra-high-end Fiji unveiling?
Any answer to that last question would be purely speculative. But we weren’t expecting to see a Titan X derivative so soon. Nvidia introduced its original GeForce GTX Titan in February of 2013 and followed up nine months later with GeForce GTX 780 Ti, also based on the GK110 GPU. Those cards were decidedly not built for the same customers. The Titan had one of its SMX clusters turned off, a then-unprecedented 6GB of memory and a GPU equally adept at 3D and double-precision math. Meanwhile, the 780 Ti featured a full 2880 CUDA cores and 240 texture units for graphics supremacy, higher clock rates and a $300-lower price tag. Most gamers with money to spend had little trouble choosing 780 Ti over the Titan.
Unfortunately, there was also a good reason to ding it: Nvidia armed GeForce GTX 780 Ti with 3GB of memory, and the rumored 6GB models never materialized. Two years ago, that was fine for 2560x1440. And 4K screens weren’t really “a thing” yet; those that did exist were $3000+ affairs. We did, however, figure out that 3GB wasn’t enough RAM to game smoothly on a trio of QHD displays (>11 million pixels). Later, we also ran into situations where 4K (>8 million pixels) was held back by the card’s available memory.
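For a sense of scale, the pixel counts quoted above can be reproduced with a few lines of Python. This is only an illustrative sketch; the helper name `pixel_count` is made up for the example, and the resolutions are the standard QHD (2560x1440) and Ultra HD (3840x2160) figures.

```python
# Per-frame pixel counts for the display setups discussed above.
QHD = (2560, 1440)       # one 2560x1440 panel
UHD_4K = (3840, 2160)    # one Ultra HD panel

def pixel_count(width, height, screens=1):
    """Total pixels rendered per frame across `screens` identical displays."""
    return width * height * screens

triple_qhd_pixels = pixel_count(*QHD, screens=3)
uhd_pixels = pixel_count(*UHD_4K)

print(f"3x QHD: {triple_qhd_pixels:,} pixels")
print(f"4K:     {uhd_pixels:,} pixels")
```

Three QHD panels push roughly a third more pixels per frame than a single 4K screen, which is consistent with the triple-display setup being the first place 3GB ran short.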
Today’s monitor market looks nothing like it did then. Ultra HD screens start under $500. Nvidia’s G-Sync variable refresh rate technology is almost 18 months old. And AMD’s FreeSync equivalent is gaining momentum as well. We have to assume that anyone shopping for a high-end graphics card in 2015 is at least considering an upgrade to 4K.
Tweaking GM200 For GeForce GTX 980 Ti
Nvidia knows where the display market is heading, and it isn’t about to shortchange this generation’s Titan derivative in the memory department. Beyond adding more on-board GDDR5 than 780 Ti, the company’s Maxwell architecture utilizes available bandwidth to greater effect, something we first observed in February 2014 with GeForce GTX 750 Ti and its GM107 GPU. GM200 is built even more robustly than that early implementation of Maxwell. Each of its SMMs sports 96KB of shared memory and a 48KB texture/L1 cache, while a large 3MB L2 cache minimizes requests made to DRAM as much as possible. All of those hardware-oriented changes, combined with new color compression schemes, make playable performance at 4K a more realistic goal for certain single-GPU systems.
That’s the good news. But because Nvidia’s GeForce GTX Titan X already features a fully enabled GM200 processor, there’s really no way to make the 980 Ti faster. This creates a bit of an issue for differentiating two high-end cards based on the same ASIC.
How about characterizing their strengths in compute-oriented workloads? Last generation, the Titan was capable of around 1.5 TFLOPS of double-precision math, since its GK110 could run FP64 at 1/3 of the FP32 rate. Nvidia artificially dialed the 780 Ti to 1/8 of that ratio (1/24 of FP32), or roughly 210 GFLOPS, creating a nice split between them. But the same option isn’t available today, since GM200 gives up its compute potential altogether in favor of efficient gaming. As a result, the Titan X and 980 Ti are both limited to native FP64 rates of 1/32 of FP32.
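A back-of-the-envelope sketch shows where those double-precision figures come from. Several inputs here are assumptions not stated in this story: two FLOPs per CUDA core per clock (one fused multiply-add), base clocks of 837MHz for the original Titan and 875MHz for the 780 Ti, a 2688-core Titan (2880 minus one 192-core SMX), and FP64 ratios of 1/3 and 1/24 for the two GK110 cards. The GM200 core counts and 1000MHz base clock come from elsewhere in this article. The function name `peak_gflops` is hypothetical.

```python
from fractions import Fraction

def peak_gflops(cuda_cores, clock_mhz, fp64_ratio=Fraction(1)):
    """Theoretical peak GFLOPS, assuming 2 FLOPs (one FMA) per core per clock.

    Pass fp64_ratio to scale the FP32 peak down to the card's FP64 rate.
    """
    return float(2 * cuda_cores * clock_mhz * fp64_ratio / 1000)

# (CUDA cores, base clock in MHz, FP64:FP32 ratio) -- clocks and GK110
# ratios are assumptions for illustration, as noted in the text above.
cards = {
    "GTX Titan":   (2688, 837, Fraction(1, 3)),
    "GTX 780 Ti":  (2880, 875, Fraction(1, 24)),
    "GTX Titan X": (3072, 1000, Fraction(1, 32)),
    "GTX 980 Ti":  (2816, 1000, Fraction(1, 32)),
}

fp64 = {name: peak_gflops(*spec) for name, spec in cards.items()}

for name, gflops in fp64.items():
    print(f"{name:12s} ~{gflops:,.0f} GFLOPS FP64")
```

Under those assumptions, the two GM200 cards land within a few percent of each other in FP64 throughput, so double-precision math no longer separates the Titan from its Ti sibling the way it did with GK110.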
So, with Titan X already out there, selling for more than $1000, the company’s only option seemed to be a surgical incision, trimming away some of GM200’s resources and creating a GeForce GTX 980 Ti that’s slightly less potent than Titan X, but more compelling than GeForce GTX 980 (and a big upgrade over 780 Ti).
At least the haircut isn’t dramatic. We’re still looking at GM200 and its six Graphics Processing Clusters, but two Streaming Multiprocessors across that sextet are disabled. With 128 CUDA cores per SMM, you’re down 256, yielding a total of 2816 cores across the processor. Similarly, losing the eight texture units in each disabled SMM leaves the GPU with 176 (instead of 192).
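The arithmetic is easy to verify. The sketch below uses the per-SMM figures from the paragraph above; the four-SMMs-per-GPC value is inferred (3072 cores / 128 per SMM = 24 SMMs across six GPCs) rather than quoted, and `gm200_resources` is a hypothetical helper name.

```python
GPCS = 6
SMMS_PER_GPC = 4            # inferred: 24 SMMs spread over six GPCs
CORES_PER_SMM = 128
TEXTURE_UNITS_PER_SMM = 8

def gm200_resources(disabled_smms=0):
    """Shader and texture resources left after fusing off some SMMs."""
    active = GPCS * SMMS_PER_GPC - disabled_smms
    return {
        "smms": active,
        "cuda_cores": active * CORES_PER_SMM,
        "texture_units": active * TEXTURE_UNITS_PER_SMM,
    }

titan_x = gm200_resources()                   # fully enabled GM200
gtx_980_ti = gm200_resources(disabled_smms=2)

# Fraction of shader/texture resources removed relative to Titan X.
cut = 1 - gtx_980_ti["cuda_cores"] / titan_x["cuda_cores"]

print(titan_x)
print(gtx_980_ti)
print(f"Resources disabled: {cut:.1%}")
```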
You might guess that fusing off ~8% of GM200’s shader and texturing resources would result in a corresponding performance drop in games bound by those parts of the graphics pipeline. But Nvidia claims that the difference between GeForce GTX Titan X and 980 Ti is minor.
The company doesn’t seem to be worried. It isn’t trying to compensate with higher clock rates—GeForce GTX 980 Ti is marketed at the same 1000MHz base and 1075MHz GPU Boost clock rates as Titan X. And the GPU’s back-end doesn’t change either. From our Titan X story:
“GeForce GTX 980’s four ROP partitions grow to six in (GeForce GTX 980 Ti). With 16 units each, that’s up to 96 32-bit integer pixels per clock. The ROP partitions are aligned with 512KB slices of L2 cache, totaling 3MB in GM200. When it introduced GeForce GTX 750 Ti, Nvidia talked about a big L2 as a mechanism for preventing bottlenecks on a relatively narrow 128-bit memory interface. That’s not as big of a concern with GM200, given its 384-bit path populated by 7 Gb/s memory. Maximum throughput of 336.5 GB/s matches the GeForce GTX 780 Ti, and exceeds GeForce GTX Titan, GeForce GTX 980 and Radeon R9 290X.”
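The back-end numbers in that quote follow from simple multiplication, sketched here in Python. The constant and function names are illustrative only. Note that a nominal 7 Gb/s per pin works out to 336 GB/s; the quoted 336.5 GB/s would imply an actual data rate of roughly 7.01 Gb/s, which is an inference on our part rather than something the quote states.

```python
ROP_PARTITIONS = 6
ROPS_PER_PARTITION = 16
L2_SLICE_KB = 512            # one L2 slice per ROP partition
BUS_WIDTH_BITS = 384
DATA_RATE_GBPS = 7.0         # nominal per-pin data rate

def memory_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: bus width x per-pin rate / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8

pixels_per_clock = ROP_PARTITIONS * ROPS_PER_PARTITION
l2_total_mb = ROP_PARTITIONS * L2_SLICE_KB / 1024
bandwidth_gbs = memory_bandwidth_gbs(BUS_WIDTH_BITS, DATA_RATE_GBPS)

print(f"ROP throughput: {pixels_per_clock} pixels/clock")
print(f"L2 cache:       {l2_total_mb:.0f} MB")
print(f"Bandwidth:      {bandwidth_gbs:.1f} GB/s (nominal)")
```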
Whereas the Titan X sports 12GB of GDDR5 memory, the GeForce GTX 980 Ti comes with 6GB at the same 7 Gb/s. That’s hardly a compromise, we’d say. Six gigabytes is plenty for 4K or three QHD screens in Surround. Don’t expect to see 12GB versions down the road, either. Nvidia doesn’t plan to chew into Titan X sales with a beefed-up 980 Ti.