
Nvidia GeForce GTX 590 3 GB Review: Firing Back With 1024 CUDA Cores

GeForce GTX 590: Bringing The Heat

In this corner...

Today, the worst-kept secret in technology officially gets the spotlight. Hot on the heels of AMD’s Radeon HD 6990 4 GB introduction three weeks ago, Nvidia is following up with its GeForce GTX 590 3 GB. According to Nvidia, it could have introduced this card more than a month ago. However, we know it continued revising its plans for a new flagship well into March. The result is a board deliberately intended to emphasize elegance, immediately after the Radeon HD 6990 bludgeoned us over the head with abrasive acoustics.

Pursuing quietness might sound ironic, given that GPUs based on Nvidia’s Fermi architecture are notoriously hot and power-hungry. To think the company could put two on a single PCB and not out-scream AMD’s dual-Cayman-based card is almost ludicrous. And yet, that’s what Nvidia says it did.

It admits that getting there wasn’t an easy task, though. Compromises were made. For example, Nvidia uses the same mid-mounted fan design for which we chided AMD. It dropped the clocks on its GPUs to help keep thermals under control. And the card still uses more power than any graphics product we’ve ever tested.

And in the other corner...

But it’s quiet. Crazy-freaking quiet. The quietest dual-GPU board I’ve tested since ATI’s Rage Fury Maxx (how’s that for back-in-the-day?). Mission accomplished on that front. The question remains, though: was Nvidia forced to give up the farm just to show AMD that hot cards don't have to make lots of noise?

Under The Hood: Dual GF110s, Both Uncut

In my discussions with Nvidia, the company made it clear that it wanted to use two GF110 processors, and it didn’t want to hack them up. Uncut GF110s, as you probably already know from reading GeForce GTX 580 And GF110: The Way Nvidia Meant It To Be Played, employ four Graphics Processing Clusters, each with four Streaming Multiprocessors. You’ll find 32 CUDA cores in each SM, totaling 512 cores per GPU. Each SM also offers four texturing units, yielding 64 across the entire chip. Of course, there’s one Polymorph engine per SM as well, though as we’ve seen in the past, Nvidia’s approach to parallelizing geometry doesn’t necessarily scale very well.
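The per-GPU totals above follow directly from the cluster hierarchy. A minimal sketch of that arithmetic, using only the counts quoted in the text (the function and constant names are illustrative, not anything from Nvidia):

```python
# Resource counts for an uncut GF110, as quoted in the paragraph above.
GPCS_PER_GPU = 4       # Graphics Processing Clusters per GPU
SMS_PER_GPC = 4        # Streaming Multiprocessors per GPC
CORES_PER_SM = 32      # CUDA cores per SM
TEX_UNITS_PER_SM = 4   # texturing units per SM
POLYMORPH_PER_SM = 1   # one Polymorph engine per SM


def gf110_totals(gpus: int = 1) -> dict:
    """Multiply the hierarchy out for one or more uncut GF110 GPUs."""
    sms = GPCS_PER_GPU * SMS_PER_GPC * gpus
    return {
        "sms": sms,
        "cuda_cores": sms * CORES_PER_SM,
        "texture_units": sms * TEX_UNITS_PER_SM,
        "polymorph_engines": sms * POLYMORPH_PER_SM,
    }


if __name__ == "__main__":
    print("one GF110:", gf110_totals(1))
    print("GeForce GTX 590 (two GF110s):", gf110_totals(2))
```

With two GPUs the same multiplication yields the 1024 CUDA cores in the card's headline spec.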

As in our GTX 580 review, GF110 doesn't get cut back here

The GPU’s back-end features six ROP partitions, each capable of outputting eight 32-bit integer pixels at a time, adding up to 48 pixels per clock. An aggregate 384-bit memory bus is divisible into a sextet of 64-bit interfaces, and you’ll find 256 MB of GDDR5 memory at all six stops. That adds up to 1.5 GB of memory per GPU, which is how you arrive at the GeForce GTX 590’s 3 GB.
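The back-end figures can be checked the same way. This is a small sketch built only from the numbers in the paragraph above; the names are made up for illustration.

```python
# Back-end layout of one GF110, as described above.
ROP_PARTITIONS = 6
PIXELS_PER_PARTITION = 8        # 32-bit integer pixels per clock
BUS_BITS_PER_PARTITION = 64     # one 64-bit memory interface per partition
MEMORY_MB_PER_PARTITION = 256   # GDDR5 attached to each interface


def back_end(gpus: int = 1) -> dict:
    """Aggregate pixel output and memory for the given number of GPUs."""
    return {
        "pixels_per_clock": ROP_PARTITIONS * PIXELS_PER_PARTITION * gpus,
        # The bus is per GPU: each GF110 has its own 384-bit path to memory.
        "bus_bits_per_gpu": ROP_PARTITIONS * BUS_BITS_PER_PARTITION,
        "memory_mb": ROP_PARTITIONS * MEMORY_MB_PER_PARTITION * gpus,
    }


if __name__ == "__main__":
    single = back_end(1)
    card = back_end(2)
    print(single)                                   # one GPU
    print(card["memory_mb"] / 1024, "GB on GTX 590")  # both GPUs combined
```

Note that the 3 GB is split between the GPUs rather than pooled, which is why the spec table lists the frame buffer as 2 x 1.5 GB.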

Nvidia ties GTX 590’s GF110 processors together using its own NF200 bridge, which takes a single 16-lane PCI Express 2.0 interface and multiplexes it out to two 16-lane paths—one for each GPU.
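To put the shared upstream link in perspective, here is a rough sketch of the theoretical bandwidth involved. It assumes the standard PCI Express 2.0 figures (5 GT/s per lane with 8b/10b encoding), which are not stated in the article, and it models only the simple case where both GPUs talk to the host at the same time.

```python
# Assumed PCI Express 2.0 link parameters (from the PCIe 2.0 standard,
# not from the article).
GT_PER_S_PER_LANE = 5.0        # raw signalling rate per lane
ENCODING_EFFICIENCY = 8 / 10   # 8b/10b line coding overhead
LANES = 16


def link_gb_per_s(lanes: int = LANES) -> float:
    """Theoretical usable bandwidth in GB/s, per direction."""
    gbit_per_s = GT_PER_S_PER_LANE * ENCODING_EFFICIENCY * lanes
    return gbit_per_s / 8  # bits to bytes


if __name__ == "__main__":
    upstream = link_gb_per_s()
    print(f"x16 slot to bridge: {upstream:.1f} GB/s per direction")
    print(f"bridge to each GPU: {upstream:.1f} GB/s per direction")
    # Worst case for host traffic: both GPUs contend for the one upstream link.
    print(f"per GPU when both hit the host at once: {upstream / 2:.1f} GB/s")
```

Each GPU still sees a full 16-lane path to the bridge, but the path from the bridge to the host remains a single x16 link.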

| | GeForce GTX 590 | GeForce GTX 580 | Radeon HD 6990 | Radeon HD 6970 | Radeon HD 6950 |
|---|---|---|---|---|---|
| Manufacturing Process | 40 nm TSMC | 40 nm TSMC | 40 nm TSMC | 40 nm TSMC | 40 nm TSMC |
| Die Size | 2 x 520 mm² | 520 mm² | 2 x 389 mm² | 389 mm² | 389 mm² |
| Transistors | 2 x 3 billion | 3 billion | 2 x 2.64 billion | 2.64 billion | 2.64 billion |
| Engine Clock | 607 MHz | 772 MHz | 830 MHz | 880 MHz | 800 MHz |
| Stream Processors / CUDA Cores | 1024 | 512 | 3072 | 1536 | 1408 |
| Compute Performance | 2.49 TFLOPS | 1.58 TFLOPS | 5.1 TFLOPS | 2.7 TFLOPS | 2.25 TFLOPS |
| Texture Units | 128 | 64 | 192 | 96 | 88 |
| Texture Fillrate | 77.7 Gtex/s | 49.4 Gtex/s | 159.4 Gtex/s | 84.5 Gtex/s | 70.4 Gtex/s |
| Pixel Fillrate | 58.3 Gpix/s | 37.1 Gpix/s | 53.1 Gpix/s | 28.2 Gpix/s | 25.6 Gpix/s |
| Frame Buffer | 2 x 1.5 GB GDDR5 | 1.5 GB GDDR5 | 2 x 2 GB GDDR5 | 2 GB GDDR5 | 2 GB GDDR5 |
| Memory Clock | 853 MHz | 1002 MHz | 1250 MHz | 1375 MHz | 1250 MHz |
| Memory Bandwidth | 2 x 163.9 GB/s (384-bit) | 192 GB/s (384-bit) | 2 x 160 GB/s (256-bit) | 176 GB/s (256-bit) | 160 GB/s (256-bit) |
| Maximum Board Power | 365 W | 244 W | 375 W | 250 W | 200 W |
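The theoretical rates in the table can be approximated from the unit counts and clocks. The sketch below does this for the two Nvidia cards. It rests on assumptions the article does not spell out: Fermi's shader clock runs at twice the listed engine clock, each CUDA core retires two floating-point operations per clock (a fused multiply-add), and GDDR5 moves four bits per pin per listed memory clock. The ROP count of 48 per GPU comes from the back-end description earlier on this page.

```python
def tflops(cores: int, shader_mhz: float, flops_per_clock: int = 2) -> float:
    """Peak single-precision throughput; assumes one FMA per core per clock."""
    return cores * shader_mhz * flops_per_clock / 1e6


def gtex_per_s(texture_units: int, engine_mhz: float) -> float:
    """Peak texture fillrate: one texel per unit per engine clock."""
    return texture_units * engine_mhz / 1e3


def gpix_per_s(rops: int, engine_mhz: float) -> float:
    """Peak pixel fillrate: one pixel per ROP per engine clock."""
    return rops * engine_mhz / 1e3


def gb_per_s(memory_mhz: float, bus_bits: int, transfers_per_clock: int = 4) -> float:
    """Peak memory bandwidth; assumes GDDR5 at 4 transfers per listed clock."""
    return memory_mhz * transfers_per_clock * (bus_bits / 8) / 1e3


FERMI_SHADER_MULTIPLIER = 2  # assumption: shader clock = 2 x engine clock

gtx_590 = {
    "tflops": tflops(1024, 607 * FERMI_SHADER_MULTIPLIER),
    "gtex": gtex_per_s(128, 607),
    "gpix": gpix_per_s(96, 607),             # 2 GPUs x 48 ROPs
    "gb_per_gpu": gb_per_s(853, 384),
}
gtx_580 = {
    "tflops": tflops(512, 772 * FERMI_SHADER_MULTIPLIER),
    "gtex": gtex_per_s(64, 772),
    "gpix": gpix_per_s(48, 772),
    "gb_per_gpu": gb_per_s(1002, 384),
}

if __name__ == "__main__":
    for name, card in (("GTX 590", gtx_590), ("GTX 580", gtx_580)):
        print(name, {k: round(v, 2) for k, v in card.items()})
```

Under these assumptions the results land within rounding of the table. The GTX 590's per-GPU bandwidth comes out marginally below the listed 163.9 GB/s, presumably because the 853 MHz memory clock is itself a rounded figure.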

What changed from the ill-received GF100-based GeForce GTX 480 to GF110? From my GeForce GTX 580 review:

“The GPU itself is largely the same. This isn’t a GF100 to GF104 sort of change, where Shader Multiprocessors get reoriented to improve performance at mainstream price points (read: more texturing horsepower). The emphasis here remains compute muscle. Really, there are only two feature changes: full-speed FP16 filtering and improved Z-cull efficiency.

GF110 can perform FP16 texture filtering in one clock cycle (similar to GF104), while GF100 required two cycles. In texturing-limited applications, this speed-up may translate into performance gains. The culling improvements give GF110 an advantage in titles that suffer lots of overdraw, helping maximize available memory bandwidth. On a clock-for-clock basis, Nvidia claims these enhancements have up to a 14% impact (or so).”

That's a 12-layer PCB with 10-phase power, and NF200 in the middle

Other than that, we’re still talking about two pieces of silicon manufactured on TSMC’s 40 nm node and composed of roughly 3 billion transistors each. At 520 square millimeters, GF110 is substantially larger than AMD’s Cayman processor, which measures 389 mm² and is made up of 2.64 billion transistors.

Now, it’s great to get all of those resources (times two) on GeForce GTX 590. However, while the GeForce GTX 580 employs a 772 MHz graphics clock and 1002 MHz memory clock, the GPUs on GTX 590 slow things down to 607 MHz and 853 MHz, respectively.
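To see how much the lower clocks cost on paper, the short sketch below compares the GTX 590's clocks with the GTX 580's. The "versus one GTX 580" figure is a purely theoretical ceiling that assumes throughput scales linearly with clock and GPU count; it ignores multi-GPU scaling losses and is not a benchmark result.

```python
# Clocks quoted in the paragraph above (MHz).
GTX_580 = {"engine": 772, "memory": 1002}
GTX_590 = {"engine": 607, "memory": 853}   # per GPU


def clock_ratio(domain: str) -> float:
    """GTX 590 clock as a fraction of the GTX 580 clock for one domain."""
    return GTX_590[domain] / GTX_580[domain]


def theoretical_vs_one_580(gpus: int = 2) -> float:
    """Upper bound on shader throughput relative to a single GTX 580.

    Assumes perfect scaling across GPUs, which real games do not deliver.
    """
    return gpus * clock_ratio("engine")


if __name__ == "__main__":
    print(f"engine clock: {clock_ratio('engine'):.1%} of GTX 580")
    print(f"memory clock: {clock_ratio('memory'):.1%} of GTX 580")
    print(f"paper ceiling: {theoretical_vs_one_580():.2f}x one GTX 580")
```

In other words, each GPU runs at a little under four-fifths of a GTX 580's engine clock, so even with ideal scaling the card tops out well short of twice a GTX 580.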

As a result, this card’s performance isn’t anywhere near what you’d expect from two of Nvidia’s fastest single-GPU flagships: each GPU gives up roughly 21% of the GTX 580’s graphics clock and 15% of its memory clock. That might be all right, though. After all, AMD launched Radeon HD 6970 as a GeForce GTX 570 contender; the 580 sat in a league of its own. So, although AMD’s Radeon HD 6990 comes very close to doubling the performance of the company’s quickest single-GPU cards, GeForce GTX 590 doesn’t have to do the same thing in order to be competitive at the $700 price point AMD already established and Nvidia plans to match.

We already know what AMD had to do in order to deliver “the fastest graphics card in the world.” Now, how does Nvidia counter?