
GK110: The True Tank

Nvidia GeForce GTX Titan 6 GB: GK110 On A Gaming Card

Think back to Nvidia’s last generation of graphics cards, the Fermi-based 500-series. For each of its GPUs, the company’s marketing team came up with different battlefield classes: the tank, the hunter, and the sniper, each configuration optimized for a different role. The GeForce GTX 580’s GF110 was the heavy-hitting tank. Big, powerful, and expensive, it represented the Fermi architecture’s maximum potential.

GK110 block diagram

In comparison, we knew right out of the gate that the GeForce GTX 680’s GPU was no GF110 successor, even though Nvidia wanted $500 for the privilege of owning one. GK104 is optimized for gaming, and it sacrifices compute performance in a dramatic way, underperforming the GeForce GTX 580 in our OpenCL-based tests. At the time, Nvidia downplayed the significance of GK104’s compromises, preferring instead to hammer home how well its 3.5 billion-transistor chip did against AMD’s 4.3 billion-transistor Tahiti GPU in games.

But then the company introduced its Tesla K20 family, powered by GK110—the true tank (even if Nvidia isn’t using that parallel any more).

Inside The SMX

A complete GK110 GPU consists of 15 Streaming Multiprocessors, which, remember, now go by the name SMX. These SMX blocks are largely the same as the ones in GK104, which powers GeForce GTX 680. They still include 192 CUDA cores, 16 texture units, and very similar cache structures. But there are obviously a lot more of them: GK104 includes eight SMX blocks, while GK110 hosts 15. Because the chip is so big and complex, though, defects seriously affect yields. Perfectly manufactured GPUs undoubtedly exist. However, even the highest-end GK110-based products have one SMX disabled. Multiply 192 shaders by 14 active SMXes, and you get a GPU with 2,688 CUDA cores. Moreover, 16 texture units for each of 14 SMXes gives you a total of 224 TMUs, up from GeForce GTX 680’s 128.
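
The unit counts above are simple multiplication, and a few lines of Python make them easy to check. This is only a sketch built from the per-SMX figures quoted in this section; the helper name `totals` is ours, and the fully enabled 15-SMX configuration is hypothetical rather than a shipping product described here.

```python
# Per-SMX resources quoted above for Kepler (same for GK104 and GK110).
CORES_PER_SMX = 192
TMUS_PER_SMX = 16

def totals(active_smx):
    """Return (CUDA cores, texture units) for a GPU with `active_smx` enabled SMX blocks."""
    return active_smx * CORES_PER_SMX, active_smx * TMUS_PER_SMX

gtx_680 = totals(8)      # GK104 with all eight SMX blocks enabled
titan = totals(14)       # GK110 with one of its 15 SMX blocks disabled
full_gk110 = totals(15)  # hypothetical fully enabled GK110

print(gtx_680)                 # (1536, 128)
print(titan)                   # (2688, 224)
print(full_gk110)              # (2880, 240)
print(titan[0] / gtx_680[0])   # 1.75
```

The last line shows that the 14-SMX configuration carries 75% more shaders than GeForce GTX 680.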

Per SMX                        GF100 (Fermi)  GF104 (Fermi)  GK110 (Kepler)  GK104 (Kepler)
CUDA Compute Capability        2.0            2.0            3.5             3.0
Threads/Warp                   32             32             32              32
Maximum Warps/SMX              48             48             64              64
Maximum Threads/SMX            1,536          1,536          2,048           2,048
Maximum Thread Blocks/SMX      8              8              16              16
32-bit Registers/SMX           32,768         32,768         65,536          65,536
Maximum Registers/Thread       63             63             255             63
Maximum Threads/Thread Block   1,024          1,024          1,024           1,024
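
Several rows in that table are related to one another, which a short Python sketch can show. It uses only the table's figures; the function names are ours, and the register math deliberately ignores allocation granularity and other per-kernel limits, so treat the results as rough upper bounds rather than what a real kernel would achieve.

```python
THREADS_PER_WARP = 32

# (maximum warps, 32-bit registers, maximum registers per thread) per SM/SMX, from the table.
CHIPS = {
    "GF100": (48, 32768, 63),
    "GF104": (48, 32768, 63),
    "GK110": (64, 65536, 255),
    "GK104": (64, 65536, 63),
}

def max_threads(chip):
    """Maximum resident threads: maximum warps times threads per warp."""
    warps, _, _ = CHIPS[chip]
    return warps * THREADS_PER_WARP

def registers_per_thread_at_full_occupancy(chip):
    """Whole registers available to each thread when every warp slot is filled."""
    warps, registers, _ = CHIPS[chip]
    return registers // (warps * THREADS_PER_WARP)

def threads_at_max_registers(chip):
    """Threads that fit if each one uses the per-thread register maximum."""
    _, registers, max_per_thread = CHIPS[chip]
    return registers // max_per_thread

for chip in CHIPS:
    print(chip, max_threads(chip),
          registers_per_thread_at_full_occupancy(chip),
          threads_at_max_registers(chip))
# GF100 1536 21 520
# GF104 1536 21 520
# GK110 2048 32 257
# GK104 2048 32 1040
```

Under these simplifications, a GK110 kernel that uses all 255 registers per thread leaves room for only a few hundred resident threads per SMX, so the higher limit trades occupancy for per-thread state.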


Beyond simply piling on additional resources that accelerate gaming, GK110 addresses the “hunter’s” most glaring shortcoming (particularly if you consider GeForce GTX 680 a replacement for GeForce GTX 580): its compute potential. In GK104, each SMX features 192 FP32-capable cores, yielding more than 3 TFLOPS of peak floating-point performance. But you only get eight FP64 units per SMX, capping double-precision performance at 1/24 of the FP32 rate. A GK110 SMX incorporates 64 FP64 CUDA cores, raising that ratio to 1/3. Nvidia says a GeForce GTX Titan offers up to 4.5 TFLOPS of single-precision and 1.5 TFLOPS of peak double-precision compute power. In theory, that puts it just ahead of AMD’s Radeon HD 7970 GHz Edition card, rated for 4.3 TFLOPS of single- and 1.01 TFLOPS of double-precision performance.
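
For readers who want to see where those peak numbers come from, here is a rough Python sketch. It assumes each core retires one fused multiply-add (two floating-point operations) per clock, and it assumes base clocks of 1,006 MHz for GeForce GTX 680 and 837 MHz for GeForce GTX Titan; neither clock is stated in this section, so treat both as our assumptions. The function name `peak_tflops` is ours as well.

```python
from fractions import Fraction

def peak_tflops(units, clock_mhz):
    """Peak TFLOPS, assuming one fused multiply-add (2 FLOPs) per unit per clock."""
    return units * 2 * clock_mhz / 1e6

# Assumed base clocks in MHz (not given in this section).
GTX_680_MHZ = 1006
TITAN_MHZ = 837

# Units per SMX times active SMX count, from the figures above.
gtx_680_fp32 = peak_tflops(8 * 192, GTX_680_MHZ)
gtx_680_fp64 = peak_tflops(8 * 8, GTX_680_MHZ)
titan_fp32 = peak_tflops(14 * 192, TITAN_MHZ)
titan_fp64 = peak_tflops(14 * 64, TITAN_MHZ)

print(round(gtx_680_fp32, 2), round(gtx_680_fp64, 2))  # 3.09 0.13
print(round(titan_fp32, 2), round(titan_fp64, 2))      # 4.5 1.5

# FP64-to-FP32 unit ratios per SMX.
print(Fraction(8, 192), Fraction(64, 192))             # 1/24 1/3
```

These are theoretical ceilings only. Real double-precision throughput depends on the clocks the card actually sustains under load.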

GK110's SMX, with 64 FP64 CUDA cores
GK104's SMX: Not pictured, eight FP64 cores

We’re naturally happy to see GK110 bring an emphasis back onto compute. However, there’s no question that GeForce GTX Titan’s ability to cut through real-time graphics is top priority. In order to balance that 75% increase in shader and texture unit count, Nvidia also bolsters the GPU’s back-end. GK104’s four ROP partitions are each able to output eight 32-bit integer pixels per clock, adding up to what the company calls 32 ROP units. GK110 leverages six of those blocks, increasing that number to 48.

Both the GeForce GTX 680 and Titan employ GDDR5 memory running at 1,502 MHz. But because GK110 features six 64-bit memory interfaces, rather than GK104’s four, peak bandwidth increases 50% from 192 GB/s to 288 GB/s. That matches AMD’s reference Radeon HD 7970 GHz Edition card, which also sports 1,500 MHz GDDR5 on a 384-bit bus.
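
The bandwidth figures can be reproduced with a little arithmetic. The sketch below assumes the quoted 1,502 MHz is the GDDR5 memory clock and that the effective per-pin data rate is four times that clock, which is how GDDR5 speeds are normally stated; the function name is ours.

```python
def gddr5_bandwidth_gbs(memory_clock_mhz, bus_width_bits):
    """Peak bandwidth in GB/s, assuming four transfers per pin per memory clock."""
    transfers_per_second = memory_clock_mhz * 1e6 * 4
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9

gtx_680 = gddr5_bandwidth_gbs(1502, 4 * 64)  # four 64-bit interfaces = 256 bits
titan = gddr5_bandwidth_gbs(1502, 6 * 64)    # six 64-bit interfaces = 384 bits

print(round(gtx_680, 1))          # 192.3
print(round(titan, 1))            # 288.4
print(round(titan / gtx_680, 2))  # 1.5
```

Because the memory clock is identical on both cards, the 50% gain comes entirely from the wider bus.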

Top Comments
  • 27 Hide
    Trull , February 19, 2013 12:19 PM
    Dat price... I don't know what they were thinking, tbh.

    AMD really has a chance now to come strong in 1 month. We'll see.
  • 26 Hide
    tlg , February 19, 2013 12:27 PM
    AMD already said in (a leaked?) teleconference that they will not respond to the TITAN with any card. It's not worth the small market at £1000...
  • 22 Hide
    jaquith , February 19, 2013 12:19 PM
    Hmm...$1K yeah there will be lines. I'm sure it's sweet.

    Better idea, lower all of the prices on the current GTX 600 series by 20%+ and I'd be a happy camper! ;) 

    Crysis 3 broke my SLI GTX 560's and I need new GPU's...
Other Comments
  • 3 Hide
    firefyte , February 19, 2013 12:20 PM
    Anyone else having problems with the 7th page?
  • 10 Hide
    tlg , February 19, 2013 12:23 PM
    The high price OBVIOUSLY is related to low yields, if they could get thousands of those on the market at once then they would price it near the gtx680. This is more like a "nVidia collector's edition" model. Also gives nVidia the chance to claim "fastest single gpu on the planet" for some time.
  • -8 Hide
    wavebossa , February 19, 2013 12:36 PM
    "Twelve 2 Gb packages on the front of the card and 12 on the back add up to 6 GB of GDDR5 memory. The .33 ns Samsung parts are rated for up to 6,000 Mb/s, and Nvidia operates them at 1,502 MHz. On a 384-bit aggregate bus, that’s 288.4 GB/s of bandwidth."

    12x2 + 12x2 = 6? ...

    "That card bears a 300 W TDP and consequently requires two eight-pin power leads."

    Shows a picture of a 6pin and an 8pin...

    I haven't even gotten past the first page but mistakes like this bug me


    Nevermind, the 2nd mistake wasn't a mistake. That was my own fail reading.
  • 21 Hide
    infernolink , February 19, 2013 12:46 PM
    Titan.. for those who want a Titan e-peen
  • 5 Hide
    ilysaml , February 19, 2013 12:47 PM
    Quote:
    The Titan isn’t worth $600 more than a Radeon HD 7970 GHz Edition. Two of AMD’s cards are going to be faster and cost less.

    My understanding from this is that Titan is just 40-50% faster than HD 7970 GHz Ed that doesn't justify the Extra $1K.
  • 15 Hide
    battlecrymoderngearsolid , February 19, 2013 1:11 PM
    Can't it match GTX 670s in SLI? If yes, then I am sold on this card.

    What? Electricity is not cheap in the Philippines.
  • 14 Hide
    Fulgurant , February 19, 2013 1:20 PM
    Quote (Ninjawithagun):
    $1000 per Titan card is a bit hard for most mid-range gamers and even high-end gamers to afford.


    Titan is a luxury product. It's not supposed to offer a competitive price/performance ratio, just as a Ferrari's price isn't based on its horsepower or fuel efficiency. Titan is a statement moreso than it is a bona-fide money maker for nVidia.

    The idea of status-symbol computer components strikes me as a little silly, of course, but I'm not in the target market. Neither are most gamers, whether high end or not.

    If you generally spend $1600 on the graphics' subsystem of your computer, then I'm not even sure you fit in the so-called high-end. Super-high-end, maybe. You are the 1%. :) 
  • 2 Hide
    azraa , February 19, 2013 1:22 PM
    Waaaay too much hype :/ 
    Its an engineering beauty, but what could make us wish it? Most gamers already have enough with 7970Ghz or 670s so... not a smart choice.
  • 19 Hide
    mindless728 , February 19, 2013 1:23 PM
    Quote:
    "Twelve 2 Gb packages on the front of the card and 12 on the back add up to 6 GB of GDDR5 memory. The .33 ns Samsung parts are rated for up to 6,000 Mb/s, and Nvidia operates them at 1,502 MHz. On a 384-bit aggregate bus, that’s 288.4 GB/s of bandwidth."

    12x2 + 12x2 = 6? ...


    the chips are Gb (Gigabit) not GB (Gigabyte) which is a difference of 8x

    so 12x2Gb+12x2Gb = 48 Gb = 6GB

    chips are commonly refereed to in capacity as the bit size not byte size
  • 13 Hide
    oxiide , February 19, 2013 1:23 PM
    Quote (wavebossa):
    12x2 + 12x2 = 6?

    Assuming proper notation is being observed (often its not), "b" is a bit and "B" is a byte.

    6 Gigabytes = 48 Gigabits as 1 Byte = 8 bits.
  • 9 Hide
    renz496 , February 19, 2013 1:31 PM
    to me this thing is in the same league as Asus ARES II. both product are not something you discuss about price/performance.

    btw very interested how far this 'beast' will overclock
  • 8 Hide
    bl1nds1de13 , February 19, 2013 1:41 PM
    When compared to the GTX690 I would have to differ on saying that " there's no real reason not to favor it over Titan " ..... Any SLI or crossfire solution, including dual board cards like the 690, will have microstutters when compared to a single card setup. This has been thoroughly shown in several tests, and have seen it myself. A single card will never have scaling issues or microstutters.

    BL1NDS1DE13
  • 2 Hide
    Au_equus , February 19, 2013 1:45 PM
    Quote:
    Unfortunately, Nvidia says the 690’s magnesium alloy fan housing was too expensive...
    o.O and $1000 is cheap? The 690 sold for around the same price and nothing was said then. Can they come up with a better excuse? Idk, like aliens stole our magnesium... smh.
  • 11 Hide
    hero1 , February 19, 2013 1:48 PM
    Pass! I don't think it's worth forking out $2000 for 2 of these cards no matter how good or rare or awesome they are. $2000 gets you a nice i7 rig with 2x AMD Radeon 7970 GHz Ed./ GTX 680 SLI that are more than capable of handling anything you throw at them. Time to go ahead and place that order for GHz cards. Nice to see your face Titan but you ain't selling at the price we are looking for!
  • 2 Hide
    mayankleoboy1 , February 19, 2013 2:01 PM
    if i really want ultra performance, i would get 2xHD7970. Would get better gaming and compute performance.
  • 3 Hide
    aberkae , February 19, 2013 2:02 PM
    I cant wait for the review