Benchmark System, Drivers, And Software
Our Benchmark System
Choosing the supporting hardware to benchmark with put us in a bit of a quandary. The dual-CPU workstation based on a pair of AMD Opteron 4284 processors (Valencia, 3.0 GHz with Turbo Core) that we used last year turned out to bottleneck the higher-end graphics cards in this round-up. The same went for a system with two Intel Xeon E5-1660 processors running at 3.3 GHz. Under both platforms, the top-end boards we benchmarked landed very close together in the SPECapc tests, along with apps that didn't benefit from heavy threading, but instead needed big instruction-per-clock throughput from the host processor. Differences between the lower-end cards were much smaller, indicating the bottleneck.
We finally settled on a desktop-oriented processor with a high clock rate and fast memory. This might not reflect the average workstation very well (though Intel does sell a Xeon E3 configured very similarly that's often used in workstations), but it does give us a less constrained look at the differences between high-end graphics cards, some of which went from almost identical performance to huge performance deltas after transitioning to the new machine. In the end, the fastest cards sped up almost 20 percent in my new benchmark system than in the old workstation.
|CPU||Intel Core i7-3770K (Ivy Bridge), Four Cores, Eight Threads, 8 MB Shared L3 Cache, Overclocked to 4.5 GHz (Water Cooled)|
|RAM||32 GB Corsair Dominator Platinum at 2133 MT/s|
|Motherboard||Gigabyte G1 Sniper 3 Rev 1.0, Z77 Express, BIOS F7|
|SSD||2 x Corsair Neutron 480 GB|
|OS||Windows 7 Ultimate x64|
|Driver||Depending on Application|
Our Driver Selection
Driver selection is easy when it comes to benchmarking desktop graphics cards in games. Generally, we simply use the latest driver. Increasingly, this involves using beta software, since AMD and Nvidia are both trying to stay ahead of each other and the latest game titles before their software can be WHQL-certified.
This is done differently in the workstation space, and our round-up reflects that fact. Professionals pay more for these graphics cards because they need drivers certified by application vendors, guaranteeing the best possible reliability. You won't always get the best possible performance, but the certification does ensure proper functionality. Naturally, that's imperative, since any error can cost a company far more than the price of the hardware. Stability trumps performance in this space.
The same narrative applies to AMD and Nvidia equally, so our results don’t just speak to the theoretical performance of each GPU, but also to both companies' driver optimization. That can be more important than the raw potential of a card, creating interesting benchmark results that wouldn't seem to make sense otherwise.
Our Benchmark Software
We believe in a healthy blend of synthetic tests (for isolating specific performance attributes) and real-world application benchmarks. It’s also important to make sure that the selection of metrics is balanced so that the results don't get skewed toward one card vendor or the other. Finally, our selection of professional applications is limited to those for which we're able to attain legal licensing. I tried to include all of the titles that our readers wanted to see, and was mostly successful. There are some exceptions, though. For example, V-Ray’s makers never got back to me, so we aren't including their software. Also, some applications check the graphics hardware and simply won’t work with desktop cards.
Otherwise, we're only limited by time. Testing one card takes up to about 10 hours, which is to say an entire day. Clearly, putting this piece together was a labor of love.
A quick word on reading the graphs: the red bars always represent AMD's FirePro cards, and the green ones are always Nvidia Quadros. The black bars are reserved for the desktop graphics cards. We didn’t use different colors for the Radeons and GeForces because they mostly amount to a footnote when they're compared in professional applications.
A valiant effort, but in my view, a very important aspect of the comparisons has been neglected, namely, image quality,
It is useful to make quantitative comparisons of workstation cards performing the same tasks, but when gaming / consumer cards are also compared only in terms of speed, the results are not necessarily reflective of these cards' use in content creation. Yes, speed is critical in navigating 3D models- shifting polygons, but the end result of those models is likely to be renderings or animations in which the final quality- refinement of detail and subtlety is more critical than in games.
A fundamental aspect that reflects on the results in this comparison is that the test platform using an i3-3770K is not indicative of a workstation platform for which the workstation cards were designed and the drivers optimized. There are a number of very good reason for Xeons and Opterons and especially, for the existence of dual CPU's with lots of threads. There are other aspects of these components that bear on results, e.g., the memory bandwidth of the i7-3770K is only about half of a Xeon E5-1660. Note too, that that there are good reasons why Xeons have locked multipliers and can not be overclocked- speed is not their measure of success in priority to precision and extreme stability. Also important in this comparison is the presence of ECC RAM which is present in both the system and workstation GPU memory, which was treated a bit lightly, but that is essential for precision, especially in simulations and tasks like financial analysis. Also, ECC affects system speed in it's error correcting duties and parity checks and therefore runs slower than non-ECC. Again, to be truly indicative of workstation cards, it would be more useful to use a workstation to make the comparisons.
An aspect of this report that was not sufficiently clarified, is that the rendering based applications are entirely reflective of CPU performance. Rendering is one of the few tasks that can use all the available system threads and anyone who renders images from 3D models and especially doing animations will today have a dual CPU six or eight core Xeon. I That comparisons were made involving rendering applications on a four-core machine in conditions of which the number of cores / threads matters significantly. believe that some of the dramatic differences in Maya performance in these tests may have been related to the platform used. I have a Previous generation dual four-core system yielding eight cores and sixteen threads at 3.16GHz (Xeon X5460) and during rendering, all eight cores go from 58C to 93C and the RAM (DDR2-667 ECC)from 68C to 85C in about ten minutes.
Also, it's possible that the significant variation in rendering performance then may be due to system throttling and the GPU drivers that are finishing every frame under error-correcting RAM. In this task, the image quality is dependent on precision polygon calculation and i.e, particle placement, such that there are no artifacts, that shadows and color gradients are accurate and refined. Gaming cards emphasize frame rates and are optimized to finish frames more "casually" to achieve higher frame rates. This is why a GTX can't be used for Solidworks modeling either as tasks like structural, thermal, and gas flow simulations must have error correcting memory and Solidworks can produce as much as 128X anti-aliasing where a GTX will produce 16X. When a GTX is pushed in this way, especially on a consumer platform they perform poorly. Again, the image precision and quality aspect was lost in favor of a comparison of speed only.
The introduction of tests involving single and double precision and comments regarding the fundamental differences of priority in the drivers were useful and in my view might have been more extensive as this gets more to the heart of the differences between consumer and workstation cards.
Making quantitative comparisons of image quality is contradictory by definition, but in my view, quality is fundamental to an understanding of these graphics cards. As well, this would assist in explaining to content cobsumers the most important reason content creators are willing to spend $3,500 on a Quadro 6000 when an $800 GTX will make some things faster. Yes, AutoCad 2D is purposely made to run on almost any system- but when the going gets tough***, the tough get a dual Xeon, a pile of ECC, and a Quadro / Firepro!
In the tests I found the CUDA numbers disappointing, but you would get a Tesla card for CUDA not a workstation card.
On the OpenCL numbers it paints a different picture where there is almost no difference between the consumer card and the workstation card. I was actually expecting the workstation cards to perform better, but once again I think that's an avenue of FireStream and Tesla cards.
People buy workstation cards for better viewport performance and better image quality and as you can see from specviewperf numbers, gaming GPUs are completely useless for that.
I totally get that in some work-areas you want ecc, you want certified drivers, you want as much stability and security and / or extra performance in specific areas. Compared to the work the hardware cost is of little importance, so I totally agree, get a pro workstation with a pro card. You want to be on the safest side while doing big engineering projects, parts for planes, scientific and / or financial calculations etc.
But that being said, and especially for the content creation / entertainment / media sector you really need think if a pro card is useful and worth it. Most 3D apps work great on game cards, and as you can see as far as rendering is concerned game cards are your best choice for speed if you can live with the limitations. Also for a lot of CAD work you can get away fine with a game card.
So it's not just Autocad or Inventor which don't need a pro card. Most people will be just fine with them on 3ds max or alike, rhino and solidworks.
I don't get why there are no test scores with Solidworks and game cards in this article? Game cards work fine mostly and pro cards offer little extra featurewise in this app. The driver issue really seems like a bad excuse not to have some game scores in there.
Also, I have never really looked at Specview. It's seems to heavily favor pro cards while it doesn't tell you most apps will work fine with game cards.
CPUs, we can finally see these cards show their true potential. You're
getting results that more closely match my own this time, confirming what
I suspected, that workstation CPUs' low clock rates hold back the
Viewperf 11 tests significantly in some cases. Many of them seem very
sensitive to absolute clock rate, especially ProE.
And interesting to compare btw given that your test system has a 4.5GHz
3770K. Mine has a 5GHz 2700K; for the Lightwave test with a Quadro 4000,
I get 93.21, some 10% faster than with the 3770K. I'm intrigued that you
get such a high score for the Maya test though, mine is much lower
(54.13); driver differences perhaps? By contrast, my tcvis/snx scores are
I mentioned ProE (I get 16.63 for a Quadro 4K + 2700K/5.0); Igor, can you
confirm whether or not the ProE test is single-threaded? Someone told me
ProE is single-threaded, but I've not checked yet.
FloKid, I don't know how you could miss the numbers but in some cases
the gamer cards are an order of magnitude slower than the pro cards,
especially in the Viewperf tests. As rmpumper says, pro cards often give
massively better viewport performance.
bambiboom, although you're right about image quality, you're wrong about
performance with workstation CPUs - many pro apps benefit much more from
absolute higher speed of a single CPU with less threads, rather than just
lots of threads. I have a dual-X5570 Dell T7500 and it's often smoked for
pro apps by my 5GHz 2700K (even more so by my 3930K); compare to my
Viewperf results as linked above. Mind you, as I'm sure you'd be the
first to point out, this doesn't take into account real-world situations
where one might also be dealing with large data sets, lots of I/O and
other preprocessing in a pro app such as propprietory database traversal,
etc., in which case yes indeed a lots-of-threads workstation matters, as
might ECC RAM and other issues. It varies. You're definitely right though
about image precision, RAM reliability, etc.
falchard, the problem with Tesla cards is cost. I know someone who'd
love to put three Teslas in his system, but he can't afford to. Thus, in
the meantime, three GTX 580s is a good compromise (his primary card
is a Quadro 4K).
catmull-rom, if I can quote, you said, "... if you can live with the
limitations.", but therein lies the issue: the limitation is with
problems such as rendering artifacts which are normally deemed
unacceptable (potentially disastrous for some types of task such as
medical imaging, financial transaction processing and GIS). Also, to
understand Viewperf and other pro apps, you need to understand viewport
performance, and the big differences in driver support that exist between
gamer and pro cards. Pro & gamer cards are optimised for different types
of 3D primitive/function, eg. pro apps often use a lot of antialiased
lines (games don't), while gamer cards use a lot of 2-sided textures (pro
apps don't). This is reflected in the drivers, which is why (for example)
a line test in Maya can be 10X faster on a pro card, while a game test
like 3DMark06 can be 10X faster on a gamer card.
Also, as Teddy Gage pointed out on the creativecow site recently, pro
cards have more reliable drivers (very important indeed), greater viewport
accuracy, better binned chips (better fault testing), run cooler, are smaller,
use less power and come with better customer support.
For comparing the two types of card, speed is just one of a great many
factors to consider, and in many cases is not the most important factor.
Saving several hundred $ by buying a gamer card is pointless if the app
crashes because of a memory error during a 12-hour render. The time lost
could be catastrophic it means one misses a submission deadline; that's
just not viable for the pro users I know.