There are more than 7 billion humans on our fair planet. As it happens, Nvidia’s GK110 GPU—previously found only in the Tesla K20X and K20 cards—comprises 7.1 billion transistors. If you could break GK110 into single-transistor pieces, everyone on Earth could have one. And here I am, staring at three of these massive GPUs driving a trio of GeForce GTX Titan cards.
According to the folks at Nvidia, GK110 is the largest chip that TSMC can manufacture using its 28 nm node. Common sense dictates that this GPU would be expensive, power-hungry, hot, and susceptible to low yields. Two of those are absolutely true. The other assumptions, surprisingly, are not.
My, Titan, You Look Familiar
In spite of its large, complex graphics processor, GeForce GTX Titan isn’t a massive card by any means. It falls right in between the 10” GeForce GTX 680 and 11” GeForce GTX 690, measuring 10.5” long (half of an inch shorter than AMD’s Radeon HD 7970). It also trades the 680’s comparatively cheap-feeling plastic shroud for a solid dual-slot shell clearly derived from Nvidia’s work on GeForce GTX 690.
There are notable differences between the two premium boards, though. Whereas the 690 needed an axial-flow fan to effectively cool two GPUs, GeForce GTX Titan employs a centrifugal blower to exhaust hot air out the back of the card. No more recirculating heat back into your case—that’s the good news. Unfortunately, Nvidia says the 690’s magnesium alloy fan housing was too expensive, so the entire cover is now aluminum (except for the polycarbonate window—another design cue that carries over from GeForce GTX 690).
Blower-style fans are sometimes derided for making more noise than axial-flow coolers. And if you manually force this card’s fan to the upper range of its duty cycle, it’ll scream at you. Let the board balance its own thermal situation, though, and GeForce GTX Titan is hardly audible at all. Nvidia attributes this partly to vapor chamber cooling (a technology both Nvidia and AMD employ in their high-end cards), but also to more effective thermal interface material, the extended stack of aluminum cooling fins, and damping material around its fan.
Similar aesthetics and acoustics aren’t the only qualities GeForce GTX Titan and GeForce GTX 690 share. Like the flagship board, this new card sports the GeForce GTX logo along its top edge. The green text is likewise LED-backlit, and the lighting is controllable through bundled software.
You’ll also find SLI connectors above the Titan’s rear I/O bracket. Nvidia enables two-, three-, and four-way configurations, though company reps readily admit that scaling on the fourth GPU is better suited to chasing performance records than to delivering real-world gaming gains.
Any combination of GeForce GTX Titan cards is able to output to four displays simultaneously—three screens in Surround, along with an accessory display. Obviously, with just one card plugged in, you’d need to use all of its outputs: two dual-link DVI connectors, one full-sized HDMI port, and one full-sized DisplayPort output. Two- and three-way SLI setups give you several other options for hooking up multi-monitor configurations.
Although the Titan’s GK110 processor is immense, it’s less power-hungry than two ~3.5 billion transistor GK104s working in tandem on a GeForce GTX 690. That card bears a 300 W TDP and consequently requires two eight-pin power leads.
GeForce GTX Titan is rated for 250 W—the same as a Radeon HD 7970 GHz Edition—necessitating one eight- and one six-pin connector. The math adds up to 75 W from a PCI Express x16 slot, 75 W from the six-pin plug, and 150 W from the eight-pin connection: 300 W of available power, leaving 50 W of headroom to stay within spec. Nvidia recommends you match this card up to at least a 600 W power supply, though most of the shops building mini-ITX systems seem to be getting away with 450 and 500 W PSUs.
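If you want to sanity-check that budget, the arithmetic is simple enough to sketch out (a minimal example; the per-connector limits are the standard PCI Express allotments, and the 250 W figure is Nvidia’s rating):

```cuda
// power_budget.cu -- back-of-the-envelope check of Titan's power delivery headroom
#include <cstdio>

int main()
{
    const int slot_w      = 75;   // PCI Express x16 slot
    const int six_pin_w   = 75;   // six-pin auxiliary connector
    const int eight_pin_w = 150;  // eight-pin auxiliary connector
    const int board_tdp_w = 250;  // Nvidia's rating for GeForce GTX Titan

    const int available_w = slot_w + six_pin_w + eight_pin_w;
    printf("available: %d W, rated: %d W, headroom: %d W\n",
           available_w, board_tdp_w, available_w - board_tdp_w);
    // prints: available: 300 W, rated: 250 W, headroom: 50 W
    return 0;
}
```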
| | GeForce GTX Titan | GeForce GTX 690 | GeForce GTX 680 | Radeon HD 7970 GHz Ed. |
|---|---|---|---|---|
| Shaders | 2,688 | 2 x 1,536 | 1,536 | 2,048 |
| Texture Units | 224 | 2 x 128 | 128 | 128 |
| Full Color ROPs | 48 | 2 x 32 | 32 | 32 |
| Graphics Clock | 836 MHz | 915 MHz | 1,006 MHz | 1,000 MHz |
| Texture Fillrate | 187.5 Gtex/s | 2 x 117.1 Gtex/s | 128.8 Gtex/s | 134.4 Gtex/s |
| Memory Clock | 1,502 MHz | 1,502 MHz | 1,502 MHz | 1,500 MHz |
| Memory Bus | 384-bit | 2 x 256-bit | 256-bit | 384-bit |
| Memory Bandwidth | 288.4 GB/s | 2 x 192.3 GB/s | 192.3 GB/s | 288 GB/s |
| Graphics RAM | 6 GB GDDR5 | 2 x 2 GB GDDR5 | 2 GB GDDR5 | 3 GB GDDR5 |
| Die Size | 551 mm2 | 2 x 294 mm2 | 294 mm2 | 365 mm2 |
| Transistors (Billion) | 7.1 | 2 x 3.54 | 3.54 | 4.31 |
| Process Technology | 28 nm | 28 nm | 28 nm | 28 nm |
| Power Connectors | 1 x 8-pin, 1 x 6-pin | 2 x 8-pin | 2 x 6-pin | 1 x 8-pin, 1 x 6-pin |
| Maximum Power | 250 W | 300 W | 195 W | 250 W |
| Price (Street) | $1,000 | $1,000 | $460 | $430 |
Cut away the beefy cooler and you’ll expose this card’s massive GPU, memory subsystem, and voltage regulation circuitry.
The Titan’s GK110 graphics processor runs at 836 MHz, minimum. However, it benefits from a reworked version of GPU Boost that Nvidia says can typically keep the chip operating at 876 MHz. We know from extensive testing in GeForce GTX 680 2 GB Review: Kepler Sends Tahiti On Vacation, however, that the behavior of GPU Boost depends heavily on a number of factors, right down to the temperature in your room. In our World of Warcraft benchmark, for example, GeForce GTX Titan barely crests 70% of its board power. So, the GPU ramps up to 993 MHz, even as its temperature hovers around 77 degrees Celsius. More on GPU Boost 2.0 shortly.
Twelve 2 Gb packages on the front of the card and 12 more on the back add up to 6 GB of GDDR5 memory. The 0.33 ns Samsung parts are rated for up to 6,000 Mb/s, and Nvidia operates them at 1,502 MHz (an effective 6,008 MT/s). On a 384-bit aggregate bus, that works out to 288.4 GB/s of bandwidth.
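The bandwidth figure falls straight out of the memory math. Here’s a minimal sketch of that calculation (GDDR5 transfers four bits per pin per command clock, which is where the 6,008 MT/s effective rate comes from):

```cuda
// gddr5_bandwidth.cu -- how 1,502 MHz GDDR5 on a 384-bit bus reaches 288.4 GB/s
#include <cstdio>

int main()
{
    const double command_clock_mhz = 1502.0;   // GDDR5 command clock
    const double transfers_per_clk = 4.0;      // quad data rate
    const double bus_width_bits    = 384.0;    // six 64-bit memory controllers

    const double data_rate_mtps = command_clock_mhz * transfers_per_clk;          // 6,008 MT/s
    const double bandwidth_gbps = data_rate_mtps * 1e6 * bus_width_bits / 8.0 / 1e9;
    printf("%.0f MT/s x %d bits = %.1f GB/s\n",
           data_rate_mtps, (int)bus_width_bits, bandwidth_gbps);                   // ~288.4 GB/s
    return 0;
}
```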
Six power phases for the GPU and two for the memory are relatively easy to identify toward Titan’s back half. In comparison, the GeForce GTX 690 employed five phases per GPU (10 total) and one memory phase per GPU. This is significant because Nvidia enables overvolting on GeForce GTX Titan, which directly affects the card’s ability to hit higher GPU Boost frequencies. Maxing out the voltage settings exposed through EVGA’s Precision X software, we were able to increase our sample’s clock from a typical 876 MHz to nearly 1.2 GHz—all while keeping the GPU’s temperature under a defined threshold of 87 degrees Celsius.
Think back to Nvidia’s last generation of graphics cards, the Fermi-based 500-series. For each of its GPUs, the company’s marketing team came up with different battlefield classes: the tank, the hunter, and the sniper, each configuration optimized for a different role. The GeForce GTX 580’s GF110 was the heavy-hitting tank. Big, powerful, and expensive, it represented the Fermi architecture’s maximum potential.
In comparison, we knew right out of the gate that the GeForce GTX 680’s GPU was no GF110 successor, even though Nvidia wanted $500 for the privilege of owning one. GK104 is optimized for gaming, and it sacrificed compute performance in a dramatic way, underperforming the 580 in our OpenCL-based tests. At the time, Nvidia downplayed the significance of GK104’s compromises, preferring instead to hammer home how well its 3.5 billion transistor chip did against AMD’s 4.3 billion transistor Tahiti GPU in games.

But then the company introduced its Tesla K20 family, powered by GK110—the true tank (even if Nvidia isn’t using that parallel any more).
Inside The SMX
A complete GK110 GPU consists of 15 Streaming Multiprocessors, which, remember, now go by the name SMX. These SMX blocks are largely the same as the ones in GK104, powering GeForce GTX 680. They still include 192 CUDA cores, 16 texture units, and very similar cache structures. There are simply a lot more of them: GK104 includes eight SMX blocks, while GK110 hosts 15. Because the chip is so big and complex, though, defects seriously affect yields. Perfectly-manufactured GPUs undoubtedly exist; however, even the highest-end GK110-based products ship with one SMX disabled. Multiply 192 shaders by the 14 active SMXes and you get a GPU with 2,688 CUDA cores. Likewise, 16 texture units in each of those 14 SMXes adds up to 224 TMUs, up from GeForce GTX 680’s 128.
| Per SMX: | GF100 (Fermi) | GF104 (Fermi) | GK110 (Kepler) | GK104 (Kepler) |
|---|---|---|---|---|
| CUDA Compute Capability | 2.0 | 2.0 | 3.5 | 3.0 |
| Threads/Warp | 32 | 32 | 32 | 32 |
| Maximum Warps/SMX | 48 | 48 | 64 | 64 |
| Maximum Threads/SMX | 1,536 | 1,536 | 2,048 | 2,048 |
| Maximum Thread Blocks/SMX | 8 | 8 | 16 | 16 |
| 32-bit Registers/SMX | 32,768 | 32,768 | 65,536 | 65,536 |
| Maximum Registers/Thread | 63 | 63 | 255 | 63 |
| Maximum Threads/Thread Block | 1,024 | 1,024 | 1,024 | 1,024 |
Beyond simply piling on additional resources that accelerate gaming, GK110 addresses the “hunter’s” most glaring shortcoming (particularly if you consider GeForce GTX 680 a replacement for GeForce GTX 580): its compute potential. In GK104, each SMX features 192 FP32-capable cores, yielding more than 3 TFLOPS of peak single-precision performance. But each SMX only gets eight FP64 units, capping double-precision throughput at 1/24 of the FP32 rate. A GK110 SMX incorporates 64 FP64 CUDA cores, narrowing that ratio to 1/3. Nvidia says GeForce GTX Titan offers up to 4.5 TFLOPS of single-precision and 1.5 TFLOPS of peak double-precision compute power. In theory, that puts it just ahead of AMD’s Radeon HD 7970 GHz Edition card, rated for 4.3 TFLOPS of single- and 1.01 TFLOPS of double-precision performance.
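Those peak numbers are straightforward to reproduce. The sketch below runs the multiplication, using the unit counts and base clock from the spec table earlier and counting each CUDA core as one fused multiply-add (two FLOPs) per cycle:

```cuda
// kepler_peak_flops.cu -- the multiplication behind Nvidia's 4.5/1.5 TFLOPS claims
#include <cstdio>

int main()
{
    const int    active_smx   = 14;     // one of GK110's 15 SMXes is disabled on Titan
    const int    fp32_per_smx = 192;
    const int    fp64_per_smx = 64;
    const double clock_ghz    = 0.836;  // base clock; GPU Boost typically runs higher

    // One fused multiply-add per core per clock counts as two floating-point operations.
    const double sp_tflops = active_smx * fp32_per_smx * 2 * clock_ghz / 1000.0;
    const double dp_tflops = active_smx * fp64_per_smx * 2 * clock_ghz / 1000.0;
    printf("peak FP32: %.2f TFLOPS, peak FP64: %.2f TFLOPS\n", sp_tflops, dp_tflops);
    // prints roughly 4.49 and 1.50 TFLOPS at the 836 MHz base clock
    return 0;
}
```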
GK110's SMX, with 64 FP64 CUDA cores
GK104's SMX: Not pictured, eight FP64 cores
We’re naturally happy to see GK110 bring an emphasis back onto compute. However, there’s no question that GeForce GTX Titan’s ability to cut through real-time graphics is top priority. In order to balance that 75% increase in shader and texture unit count, Nvidia also bolsters the GPU’s back-end. GK104’s four ROP partitions are able to output eight 32-bit integer pixels per clock, adding up to what the company calls 32 ROP units. GK110 leverages six of those blocks, increasing that number to 48.
Both the GeForce GTX 680 and Titan employ GDDR5 memory running at 1,502 MHz. But because GK110 features six 64-bit memory interfaces, rather than GK104’s four, peak bandwidth increases 50% from 192 GB/s to 288 GB/s. That matches AMD’s reference Radeon HD 7970 GHz Edition card, which also sports 1,500 MHz GDDR5 on a 384-bit bus.
As I was testing Nvidia’s GeForce GTX Titan, but before the company was able to talk in depth about the card’s features, I noticed that double-precision performance was dismally low in diagnostic tools like SiSoftware Sandra. Although it should have been 1/3 the FP32 rate, my results looked more like the 1/24 expected from GeForce GTX 680.
It turns out that, in order to maximize the card’s clock rate and minimize its thermal output, Nvidia purposely runs GK110’s FP64 units at 1/8 of the chip’s clock rate by default. Multiply that 1/8 clock by the 1:3 ratio of double- to single-precision CUDA cores and you land right at 1/24—so the numbers I saw initially turn out to be correct.
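The back-of-the-envelope version of that reconciliation, with the two factors laid out:

```cuda
// fp64_default_ratio.cu -- why the stock FP64 throttle looks like GK104's 1/24 rate
#include <cstdio>

int main()
{
    const double core_ratio  = 64.0 / 192.0;  // FP64 to FP32 cores per SMX: 1/3
    const double clock_ratio = 1.0 / 8.0;     // FP64 units run at 1/8 clock by default

    printf("effective FP64 rate: 1/%.0f of FP32\n", 1.0 / (core_ratio * clock_ratio));
    // prints 1/24 -- the same ratio GeForce GTX 680 delivers
    return 0;
}
```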
But Nvidia claims this card is the real deal, capable of 4.5 TFLOPS single- and 1.5 TFLOPS double-precision throughput. So, what gives?
It’s improbable that Tesla customers are going to cheap out and buy gaming cards that lack ECC memory protection, the bundled GPU management/monitoring software, support for GPUDirect, or support for Hyper-Q (Update, 3/5/2013: Nvidia just let us know that Titan supports Dynamic Parallelism and Hyper-Q for CUDA streams, and does not support ECC, the RDMA feature of GPUDirect, or Hyper-Q for MPI connections). However, developers can still get their hands on Titan cards to further promulgate GPU-accelerated apps (without spending close to eight grand on a Tesla K20X), so Nvidia does want to enable GK110’s full compute potential.

Tapping into the full-speed FP64 CUDA cores requires opening the driver control panel, clicking the Manage 3D Settings link, scrolling down to the CUDA – Double precision line item, and selecting your GeForce GTX Titan card. Enabling the option effectively disables GPU Boost, so you’d only want to toggle it on if you specifically needed to spin up the FP64 cores.
We can confirm the option unlocks GK110’s compute potential, but we cannot yet share our benchmark results. So, you’ll need to look out for those in a couple of days.
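In the meantime, for anyone who wants to sanity-check the toggle once cards ship, a simple dependent-FMA loop is enough to expose the difference. The sketch below is our own illustrative probe (not Nvidia’s or SiSoftware’s code), and its result will also swing with GPU Boost, so treat it as a rough gauge rather than a benchmark—just run it once with the CUDA double-precision setting off and once with it on:

```cuda
// fp64_probe.cu -- rough FP64 throughput probe; rerun it after toggling the
// CUDA double-precision option in the driver control panel and compare results
#include <cstdio>
#include <cuda_runtime.h>

__global__ void fp64_fma(double *out, int iters)
{
    double a = 1.0 + threadIdx.x * 1e-12;
    for (int i = 0; i < iters; ++i)
        a = fma(a, 1.0000001, 1e-9);                 // one FP64 FMA = 2 FLOPs
    out[blockIdx.x * blockDim.x + threadIdx.x] = a;  // store so the loop isn't optimized away
}

int main()
{
    const int blocks = 1024, threads = 256, iters = 200000;
    double *out = nullptr;
    cudaMalloc(&out, blocks * threads * sizeof(double));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    fp64_fma<<<blocks, threads>>>(out, 1);           // warm-up launch
    cudaEventRecord(start);
    fp64_fma<<<blocks, threads>>>(out, iters);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    const double flops = 2.0 * iters * blocks * threads;
    printf("%.1f GFLOPS (FP64)\n", flops / (ms * 1e-3) / 1e9);

    cudaFree(out);
    return 0;
}
```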
GPU Boost is Nvidia’s mechanism for adapting the performance of its graphics cards based on the workloads they encounter. As you probably already know, different games place different demands on a GPU’s resources. Historically, clock rates had to be set with the worst-case scenario in mind, which meant that under lighter loads, performance was left on the table. GPU Boost changes that by monitoring a number of different variables and adjusting clock rates up or down as the readings allow.
In its first iteration, GPU Boost operated within a defined power target—170 W in the case of Nvidia’s GeForce GTX 680. However, the company’s engineers figured out that they could safely exceed that power level, so long as the graphics processor’s temperature was low enough. Therefore, performance could be further optimized.
Practically, GPU Boost 2.0 is different only in that Nvidia now pushes the clock rate toward an 80-degree thermal target, rather than a power ceiling. That means you should see higher frequencies and voltages, up to 80 degrees, and within the fan profile you’re willing to tolerate (setting a higher fan speed pushes temperatures lower, yielding more benefit from GPU Boost). It still reacts within roughly 100 ms, so there’s plenty of room for Nvidia to make this feature more responsive in future implementations.
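Conceptually, the change boils down to which reading the control loop chases. The toy sketch below is purely illustrative—Nvidia’s actual algorithm, sampling intervals, and voltage tables aren’t public—but it captures the “clock up until you hit the thermal target, then back off” behavior:

```cuda
// boost2_toy.cu -- illustrative only: steer the clock toward a thermal target, not a power cap
#include <algorithm>
#include <cstdio>

int main()
{
    const double thermal_target_c = 80.0;        // GPU Boost 2.0's default target
    const int    base_mhz = 836, step_mhz = 13;  // Kepler adjusts clocks in ~13 MHz bins
    const int    max_mhz  = 1006;                // arbitrary ceiling for this sketch

    double temp_c    = 60.0;                     // stand-in for a real sensor reading
    int    clock_mhz = base_mhz;

    for (int tick = 0; tick < 20; ++tick) {      // each tick is ~100 ms in the real thing
        if (temp_c < thermal_target_c)
            clock_mhz = std::min(clock_mhz + step_mhz, max_mhz);   // headroom: clock up
        else
            clock_mhz = std::max(clock_mhz - step_mhz, base_mhz);  // too hot: back off
        // crude thermal model: temperature relaxes toward a clock-dependent equilibrium
        temp_c += 0.1 * ((clock_mhz - base_mhz) * 0.3 + 55.0 - temp_c);
        printf("t=%2d  %4d MHz  %5.1f C\n", tick, clock_mhz, temp_c);
    }
    return 0;
}
```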
Of course, thermally-dependent adjustments do complicate performance testing more than the first version of GPU Boost. Anything able to nudge GK110’s temperature up or down alters the chip’s clock rate. It’s consequently difficult to achieve consistency from one benchmark run to the next. In a lab setting, the best you can hope for is a steady ambient temperature.

Vendor-Sanctioned Overvoltage?
When Nvidia creates the specifications for a product, it targets five years of useful life. Choosing clock rates and voltages is a careful process that must take this period into account. Manually overriding a device’s voltage setting typically causes it to run hotter, which adversely affects longevity. As a result, overclocking is a sensitive subject for most companies—it’s standard practice to actively discourage enthusiasts from tuning hardware aggressively. Even if vendors know guys like us ignore those warnings anyway, they’re at least within their rights to deny support claims on components that fail prematurely due to overclocking.
Now that GPU Boost 2.0 is tied to thermal readings, the technology can make sure GK110 never ventures into territory that will hurt it. So, Nvidia now allows limited voltage increases to improve overclocking headroom, though add-in card manufacturers are free to narrow the range as they see fit. Our reference GeForce GTX Titans default to a 1,162 mV maximum, though EVGA’s Precision X software pushed them as high as 1,200 mV. You are asked to acknowledge the increased risk of electromigration; however, your warranty shouldn’t be voided.
There are a few different scenarios where I personally think Nvidia’s GeForce GTX Titan makes sense (and many where it doesn’t). A very high-end gaming PC is, naturally, one of the applications for this board. I’m not talking about a $2,000 machine with one really nice graphics card. Think bigger—like multiple flagship GPUs operating in concert. That’s the rarefied air of gaming-oriented hardware.
See, if you’re satisfied with a single dual-slot card, and you plan to game at 2560x1600 or 5760x1080, one GeForce GTX 690 is the most elegant option (albeit at $1,000). Beyond that, you’re looking at two GeForce GTX 690s or two/three GeForce GTX Titans as the most beastly combinations available. Of course, we could also get into three or four GeForce GTX 680s and Radeon HD 7970s, but until we can get capture-based frame time analysis going in the lab, it’s going to be difficult to quantify the experience those super-parallel setups deliver.
Nevertheless, in order to demonstrate the very upper limits of what GeForce GTX Titan can do, Nvidia had Geek Box build us one of its Ego Maniacal X79 gaming systems with three cards. The box is a monster, centering on SilverStone’s Temjin TJ11B-W. Aluminum though the big case might be, it still weighs darned near 40 pounds completely empty. Packed with hardware, it took a FedEx semi-trailer, a pallet, and a big lift gate to get it into my garage.
All set up, though, the finished product clearly reflects the Geek Box crew’s attention to detail. Cable management is easily handled through the use of individually-sleeved power supply wires, while water-cooling applied to the CPU and motherboard voltage regulation circuitry helps address the typical standing-air issues we bring up any time we debate cooling strategies.
Not that air flow is an issue in the TJ11B-W. In fact, fan noise is perhaps this setup’s most glaring weakness. It’s not loud, per se. But when Nvidia is trying to demonstrate the sound of three Titans running quietly, and all you can hear is air moving through radiators, well, we call that an exercise in futility. The good news, of course, is that three Titans, back to back to back, are incredibly quiet.
To be fair, the Ego’s acoustic output is likely a product of its Core i7-3970X processor running at a stable 4.6 GHz—an overclock presumably chosen to mitigate the potential for platform bottlenecks behind three potent GPUs. The six-core CPU sits on Asus’ Rampage IV Extreme motherboard and is fed data through 32 GB of Corsair Dominator memory in a quad-channel configuration. Two of the company’s Neutron GTX SSDs form a 480 GB striped array, pretty much guaranteeing that you won’t spend any time waiting for levels to load.

Because the machine’s acoustic properties would have interfered with some of our benchmarks, we transplanted the Titans into our test bed. But not before jumping into 3DMark 11 and generating scores in excess of (redacted—look out for benchmarks in a couple of days) using the Extreme preset and GPU clock rates in excess of 1,100 MHz.
Completely unrelated to the Ego Maniacal’s performance, it’s really interesting that Geek Box is borrowing liberally from BMW M GmbH with its logo. Although I’m an AMG guy myself, the colors and lines are very classy. Still, I’d think someone in Germany might take issue with the homage.
There’s one other place I see a grotesquely expensive single-GPU card making more sense than a pair of value-oriented boards: in an environment only able to accommodate one dual-slot solution. You may have already read my opinions of Falcon Northwest’s Tiki—in fact, I liked that system enough to buy it when my eval was up. Now I have iBuyPower’s Revolt in the lab, and Digital Storm wants to tell the Tom’s Hardware audience how its Bolt came to be (and then came to be improved upon).
The sample sent to our lab includes a Core i7-3770K on iBuyPower’s own branded iBP-Z77E/S Revolt motherboard. A single 8 GB DDR3-1600 memory module unfortunately leaves one of the platform’s channels unpopulated (though we’re told our system should have come with two 4 GB sticks). From there, storage is tiered, with 1 TB of disk-based capacity complementing a 120 GB Intel SSD 330. It came as some surprise that such a capable list of specs adds up to a fairly modest $1,515.
Officially, iBuyPower hasn’t qualified the GeForce GTX Titan. But it does have the card running in the lab and says it works. That was all of the encouragement I needed to transplant a Titan into the place previously occupied by my review system’s GeForce GTX 670. This isn’t something I’d recommend to anyone, by the way. The Revolt is somewhat user-serviceable, but getting it back together without smashing down on its water-cooling system is challenging.
At any rate, the Titan is a drop-in replacement. Its centrifugal fan pulls air from the same place as a reference 670. It’s a little longer than the GK104-based board, but there’s plenty of room in the Revolt for it to fit. Really, the only spec you need to worry about is Titan’s 250 W TDP, which gobbles up much of the Revolt’s peak output (our system includes a 500 W 80 PLUS Gold-rated supply from FSP). Nvidia recommends a 600 W PSU.
Both the GeForce GTX 670 and Titan fit inside iBuyPower's Revolt
Fortunately, companies like iBuyPower, Falcon Northwest, and Digital Storm will take care of guaranteeing their mini-ITX machines support Titan before they start selling the card. But what I’m hearing is that GeForce GTX Titan will replace 680 as the highest-end option offered in the most diminutive systems. The allure of small form factor gaming naturally gets even stronger as a result.
What’s the experience like? Faster, without an impact on acoustics. Neither the Tiki nor the Revolt is silent. They’re both quiet, though. Where you can tell a difference between them is in the pitch they generate. Falcon Northwest’s box makes noise at a lower frequency, while the Revolt’s output is a little more noticeable (perhaps as a result of its 40 mm power supply fan). Either way, the graphics card’s fan isn’t perceptible at idle and stays incredibly quiet under load.
So long as system builders are able to give it enough power, GeForce GTX Titan minds its manners in the confines of very cramped cases. The card exhausts all of its waste heat out of an externally-facing vent using a fan that cuts through air almost silently. Pretty amazing for a 250 W card hosting a 7 billion transistor GPU.
Nobody likes a tease. Unfortunately, Nvidia is asking that GeForce GTX Titan’s benchmark results remain confidential for another couple of days. We’re naturally using that time to generate as much data as possible: comparisons against GeForce GTX 680, GeForce GTX 690, and Radeon HD 7970 GHz Edition; two-way and three-way SLI configurations to pit against 690s in four-way SLI; power consumption; heat; noise; and of course, compute performance. Even when we are able to publish the outcome of our testing, GeForce GTX Titan won't be available for you to buy. The company says to expect availability the week of February 25.
Given the numbers we’ve already run using earlier drivers, along with the information presented today, what can we say about Nvidia’s GeForce GTX Titan? I’ve actually seen enough to draw my conclusions; the upcoming data dump is only going to serve to support my opinion.
Enthusiasts shopping for ultra-high-end cards like this one know they’re not going to get a good deal. Two GeForce GTX 680s sell for about $920. Better still, two Radeon HD 7970s (which are faster) can be had for $800. And as you slide down the scale, every dollar spent tends to stretch further. That’s not what the GeForce GTX Titan is about, though.
Titan: At home in big, beefy gaming PCs and mini-ITX enclosures
Rather, this thing incorporates a GPU currently found in the Tesla K20X (which HP will sell you for $7,700), a cooler clearly derived from the marquee GeForce GTX 690, and a staggering 6 GB of GDDR5 memory. As you’ll soon see, the combination generally falls between a GeForce GTX 690 and Radeon HD 7970 GHz Edition in our benchmarks. That means:
- Pay the same $1,000 for a GeForce GTX 690 if you only want one dual-slot card and your case accommodates the long board. It remains the fastest graphics solution we’ve ever tested, so there's no real reason not to favor it over Titan.
- The Titan isn’t worth $600 more than a Radeon HD 7970 GHz Edition. Two of AMD’s cards are going to be faster and cost less. Of course, they’re also distractingly loud when you hit them with a demanding load. Make sure you have room for two dual-slot cards with one vacant space between them. Typically, I frown on such inelegance, but more speed for $200 less could be worth the trade-off in a roomy case.
- Buy a GeForce GTX Titan when you want the fastest gaming experience possible from a mini-ITX machine like Falcon Northwest’s Tiki or iBuyPower’s Revolt. A 690 isn’t practical due to its length, power requirements, and axial-flow fan.
- Buy a GeForce GTX Titan if you have a trio of 1920x1080/2560x1440/2560x1600 screens and fully intend to use two or three cards in SLI. In the most demanding titles, two GK110s scale much more linearly than four GK104s (dual GeForce GTX 690s). Three Titan cards are just Ludicrous Gibs!
We appreciate Nvidia’s continued attention to acoustics. We’re glad to see the GeForce GTX Titan exhausting all of its hot air. And as you’ll see in a couple of days, there’s a lot to like about this card’s performance. Our beef is with its stratospheric price tag, which limits the Titan to small form factor gaming boxes and multi-card configurations in ultra-high-end PCs. Most enthusiasts will rightly balk at this card. But if you’re in its target demographic, the GeForce GTX Titan is essentially unrivaled.