OpenCL: General-Purpose Computing
Measuring the general-purpose compute performance of multi-GPU solutions is a challenge because not every app knows how to exploit more than one graphics processor at a time. We also have to strike CUDA- or Stream/APP-only software from our list. That doesn’t leave many options, which is why we’re limiting our search to OpenCL-accelerated applications.
The most obvious benefit of OpenCL is that both vendors’ cards compete on a playing field that is as level as we can make it. Besides, a comparison using real-world metrics covering single-precision (FP32) and double-precision (FP64) math is much more interesting than a huge field of synthetic benchmarks. As usual, we also include a number of current workstation-class cards to see how they fare relative to their consumer siblings.
We chose two different renderers that take almost opposing approaches to optimization. On one hand, we have the well-known LuxMark benchmark based on the LuxRender engine. On the other, we use the integrated benchmark of RatGPU, an application that tends to favor Nvidia cards but isn’t really optimized for either architecture. LuxMark reports its result in samples per second, while RatGPU measures the time per run.
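Because the two renderers report in opposite directions (higher is better for samples per second, lower is better for time per run), it helps to convert both to a common "times faster than the baseline" figure before comparing cards. The following is only a minimal sketch of that conversion; the function name and the numbers are our own placeholders, not values from either benchmark.

```python
def relative_performance(value, baseline, higher_is_better):
    """Return how many times faster `value` is than `baseline`.

    Hypothetical helper for illustration; not part of LuxMark or RatGPU.
    """
    if value <= 0 or baseline <= 0:
        raise ValueError("benchmark results must be positive")
    # Throughput metrics (samples per second): a bigger number is faster.
    # Time metrics (seconds per run): a smaller number is faster, so invert.
    return value / baseline if higher_is_better else baseline / value


# Placeholder numbers for illustration only (not measured results).
throughput_gain = relative_performance(3000.0, 1500.0, higher_is_better=True)
runtime_gain = relative_performance(20.0, 40.0, higher_is_better=False)
print(throughput_gain, runtime_gain)
```

With both results expressed as a ratio against the same baseline card, a throughput chart and a run-time chart can be read the same way.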
There’s really not much to say about LuxMark that the chart doesn’t already tell us. AMD’s GCN architecture dominates, and an OpenCL-optimized application able to exploit two Tahiti GPUs simply screams.
Meanwhile, RatGPU shows us what many CUDA-enabled renderers have proven in the past, namely that none of the Kepler-based GeForce cards can keep up with the Fermi-based GeForce GTX 580 in compute-heavy software. It’s a little strange that the VLIW4-based Radeon HD 6970 is faster than the Radeon HD 7970 GHz Edition, though.
The software we’re using for this test treats the multi-chip cards as if they have one GPU, so performance scales very well. AMD’s Radeon HD 7990, which seems to excel in integer-based hashing operations, performs really well, followed by a number of other GCN-based boards.
Financial Analysis Performance (Float/FP32)
We see the same sort of near-ideal scaling from the Radeon HD 7990 in our four financial analysis benchmarks (two benchmarks with two levels of precision each). Indeed, AMD’s flagship delivers almost twice the performance of the single-GPU Radeon HD 7970 GHz Edition, despite slightly lower clock rates. Meanwhile, the GeForce GTX Titan and 690 can’t even compete.
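To put a number on "near-ideal scaling," one option is to divide the measured speed-up by the speed-up you would expect from the GPU count, adjusted for any clock difference. The sketch below assumes performance scales linearly with both GPU count and clock rate, which is a simplification; the function name, scores, and clock values are hypothetical placeholders rather than our measurements.

```python
def scaling_efficiency(single_score, multi_score, gpu_count,
                       single_clock_mhz=None, multi_clock_mhz=None):
    """Return (speedup, efficiency) of a multi-GPU card versus a single-GPU card.

    Scores must be higher-is-better. Efficiency of 1.0 means the multi-GPU
    card reaches the ideal, clock-adjusted speed-up. Illustrative helper only.
    """
    if single_score <= 0 or multi_score <= 0 or gpu_count < 1:
        raise ValueError("scores must be positive and gpu_count at least 1")
    speedup = multi_score / single_score
    ideal = float(gpu_count)
    # Optionally scale the ideal speed-up by the clock ratio, since the
    # multi-GPU card may run each GPU slower than the single-GPU card.
    if single_clock_mhz and multi_clock_mhz:
        ideal *= multi_clock_mhz / single_clock_mhz
    return speedup, speedup / ideal


# Placeholder values for illustration only (not measured results).
speedup, efficiency = scaling_efficiency(
    single_score=100.0, multi_score=190.0, gpu_count=2,
    single_clock_mhz=1000.0, multi_clock_mhz=950.0)
print(f"speed-up: {speedup:.2f}x, efficiency: {efficiency:.0%}")
```

Leaving out the clock arguments compares against a plain doubling instead, which understates how well a lower-clocked dual-GPU board is really scaling.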
Financial Analysis Performance (Double/FP64)
Repeating those two benchmarks using double-precision math makes the differences even more apparent. While Nvidia’s other cards struggle with FP64, the Titan actually does quite decently, especially compared to the GK104-based GeForce GTX 690 and GTX 680. The trick is to activate CUDA’s double-precision mode in the card’s driver, which also extends the functionality to OpenCL. Although this lowers clock rates, the card is faster overall in FP64-based workloads.
Meanwhile, the Radeon HD 7990 doesn’t need any tweaking to achieve its impressive and chart-topping performance.
If I had 1,000 dollars... I would buy a Titan. Its power efficiency, drivers and uber-chip goodness is unmatched.
Thats some nice gains from the prototype driver.
Nice article!! Unbeatable performance out of the box.
Sort of seems like a mess to me. The game bundle is nice.
Here's an idea. Take away the 8 games at 40 bucks a piece and deduct that from the insane 1000 price tag.
this test was 99% useless to the average gamer, Test the card at 1900x1080 like most of us use to get a real ideal of what its like, only your unigine benchmarks helped the average gamer, who cares what any card can do at a resolution we cant use anyway?
whyso wrote: "Power usage? Thats some nice gains from the prototype driver."
Power is the one thing I didn't have time for. We already know the 7990 is a 375 W card, while GTX 690 is a 300 W card, though. We also know AMD has Zero Core, which is going to shave off power at idle with one GPU shut off. I'm not expecting any surprises on power that those specs and technologies don't already insinuate.
nice article! here comes the Competitor of gtx 690!
donquad2001 wrote: "this test was 99% useless to the average gamer, Test the card at 1900x1080 like most of us use to get a real ideal of what its like, only your unigine benchmarks helped the average gamer, who cares what any card can do at a resolution we cant use anyway?"
If you're looking to game at 1920x1080, I can save you a ton of money by recommending something less than half as expensive. This card is for folks playing at 2560 *at least.* Next time, I'm looking to get FCAT running on a 7680x1440 array ;)
Nice article. I was hoping that they would have addressed the whining but they haven't and that's a shame. Performance wise it can be matched by GTX 680 SLI and GTX 690 without the huge time variance and runt frames. Let's hope they fix their whining issue and FPS without forcing users to turn on V-sync. For now I know where my money is going, considering that I have dealt with AMD before: XFX and Sapphire and didn't like the results (whining, artifacts, XF stops working etc). Sorry but I gave the red team a try and I will stick with Nvidia until AMD can prove that they have fixed their issues.