Kepler in the Workstation: Nvidia Rolls Out New Quadros
A Kepler-based Quadro line
Last summer at SIGGRAPH, Nvidia introduced the first Kepler-based Quadros: the K5000 desktop card and the Kx000M series of GPUs for mobile workstations.
| | CUDA Cores | Memory | Memory Bandwidth | Dual-Link DVI | DisplayPort | Monitors |
|---|---|---|---|---|---|---|
| Quadro K4000 | 768 | 3 GB | 134 GBps | 1 | 2 | 4 |
| Quadro K2000 | 384 | 2 GB | 64 GBps | 1 | 2 | 4 |
| Quadro K2000D | 384 | 2 GB | 64 GBps | 2 | 1 | 4 |
| Quadro K600 | 192 | 1 GB | 29 GBps | 1 | 1 | 2 |
Now Nvidia is introducing the rest of its desktop Kepler-based Quadro family: the entry-level K600 (which can also be configured to fit a small-form-factor chassis), the K2000 (and its dual-DVI variant, the K2000D), and the K4000. Together they fill the space in the product line beneath the K5000. The update brings more CUDA cores, more memory, and faster memory to the entire product line.
More CUDA Cores
The Kepler architecture's increased single-precision floating-point performance means the new cards deliver roughly 2.5 to 3 times the CUDA performance of their predecessors on compute tasks. Single-precision floating-point tasks include most GPU-based 3D rendering, as well as tasks like CUDA effects rendering in Premiere. For instance, the Quadro K5000 reaches 2,150 GFLOPS of single-precision floating-point performance, about triple its predecessor's 718 GFLOPS, and the Quadro K4000 reaches 1,246 GFLOPS, a 2.56x improvement over its predecessor's 486 GFLOPS. The additional memory in the new cards will also directly affect compute performance, because one of the primary limiting factors in GPU-based 3D rendering is the amount of memory on the card. See how easily the GPU memory of the Fermi-based Quadro 2000 was overwhelmed in the system we reviewed here.
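As a quick check on the figures quoted above, the generational speedup is simply the new card's GFLOPS divided by its predecessor's. The sketch below uses only the numbers from this article; the `speedup` helper and the dictionary layout are illustrative names, not anything from Nvidia.

```python
# Single-precision GFLOPS quoted in the article: (new card, predecessor).
gflops = {
    "Quadro K5000": (2150, 718),
    "Quadro K4000": (1246, 486),
}


def speedup(new_gflops, old_gflops):
    """Return how many times faster the new card is than the old one."""
    return new_gflops / old_gflops


for card, (new, old) in gflops.items():
    print(f"{card}: {speedup(new, old):.2f}x over its predecessor")
```

This gives about 2.99x for the K5000 and 2.56x for the K4000, so "triple" is a fair description of the top card rather than of the whole line.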
More Displays, More Flexible Arrangement
Except for the entry-level K600, all of the new cards double the number of active displays from two to four. NVIDIA is also updating its Mosaic technology to make large multi-display workspaces easier to manage. With four displays on each of four cards, you can get sixteen displays working as a single workspace.
Maximus Expanded
The release of the new Kepler-based Quadro cards also expands the range of next-generation Maximus configurations that can be paired with the Tesla K20. Technically, the current preferred Maximus configuration is a Quadro K5000 with the Tesla K20, but NVIDIA now supports configurations with the other Kepler-based Quadro cards for those who need more compute capability and less real-time 3D capability in their workstations.
Availability and Pricing
These new Kepler-based Quadro cards are available immediately. The Quadro K4000 is priced at $1,269, the Quadro K2000 and K2000D at $599, and the K600 at $199.

That is where Titan comes in.
It would be nice to understand how much real world difference there is between: high end gaming cards, last gen workstation cards, and current gen workstation cards.
greatms, doing detailed reviews using Solidworks has... licensing issues.
The Quadro FX 4800 is truly comparable to the K2000 in single-precision computation, minus the OpenGL 4.1 capability; the older FX 4800 has OpenGL 3.3. That's about it.
For some tasks the memory bandwidth available per core is much more important than the number
of cores. This is why the newer GTX 600 series cards are not as fast as one might expect compared
to the 500 series - a lot more cores, but not the mem bw to feed them. And that's why using several
cheaper cards with fewer cores but more bw per core often results in much better performance than
a single card with loads of cores. Try using four GTX 460s and see how it compares to a 670 or 680
for AE.
Transistor count doesn't mean anything though. A newer design might have more elements purely
because of a more sophisticated power delivery system and power management system, which
doesn't translate to better performance, but reduces cost re power consumption. Transistor count
is as useful as the old pointless MIPS metric for CPUs.
Ian.
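The bandwidth-per-core point in the comment above can be illustrated with the figures from the article's own table. This is only a rough sketch: the `bandwidth_per_core` helper is a made-up name, decimal units (1 GB = 1000 MB) are assumed, and the ratio says nothing by itself about real application performance.

```python
# (CUDA cores, memory bandwidth in GB/s) taken from the table in the article.
cards = {
    "Quadro K4000": (768, 134),
    "Quadro K2000": (384, 64),
    "Quadro K600": (192, 29),
}


def bandwidth_per_core(cores, gb_per_s):
    """Memory bandwidth available per CUDA core, in MB/s (assumes 1 GB = 1000 MB)."""
    return gb_per_s * 1000 / cores


for name, (cores, bandwidth) in cards.items():
    print(f"{name}: {bandwidth_per_core(cores, bandwidth):.1f} MB/s per core")
```

By this measure the three cards sit fairly close together (roughly 150 to 175 MB/s per core), even though their core counts differ by a factor of four.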
Note these cards can vary greatly in their performance depending on the available main CPU power.
The potential of a good Quadro can be wasted if the CPU is poor. Thus, for a proper comparison,
tests should be done using both a typical generic 'professional' config (standard XEON or two, with
no oc'ing) and an enthusiast-style normal mbd (eg. 5GHz 2700K, or even an older oc'd dual-core).
The differences can be amazing, especially for CPU-sensitive apps like ProE.
Ian.
I forgot to include references for my comments. See:
http://www.sgidepot.co.uk/misc/viewperf.txt
AE data still under construction (working on 3930K and dual-X5570 tests atm).
Ian.