Benchmark Results: OpenCL
Alright. We know that Piledriver represents a respectable improvement over Bulldozer, lending Trinity competitive performance versus AMD’s previous-generation Llano-based APUs. And we now know that the more efficient VLIW4 architecture, coupled with higher clock rates, translates into frame rates anywhere from 15 to 30% higher in a number of mainstream games.
But AMD is trumpeting this message of heterogeneous computing—exploiting processing resources, wherever they may be, to maximize performance. We’ve been working on a series of stories with AMD to quantify the effects of open standards like DirectCompute and OpenCL in different software environments, but it remains a challenge to benchmark some of the applications currently being optimized to exploit the hardware AMD is developing.
We’ve really done video transcoding to death. Although we haven’t yet circled back to cover the quality implications of Intel’s second-gen Quick Sync implementation, Nvidia’s NVEnc, or AMD’s VCE, we know that Ivy Bridge’s fixed-function logic is some of the fastest we’ve tested. Moreover, we still haven’t seen VCE enabled in an optimized application (though UVD3 and VCE are fixed-function components of Trinity).
Aside from titles like MediaConverter and MediaEspresso, we’ve been at a loss for productivity-oriented software to incorporate into our benchmark suite. That’s starting to change more quickly, as companies like Adobe tie OpenCL support into their offerings. Perhaps the biggest win thus far for AMD is Corel’s WinZip 16.5. I mentioned a few pages ago that Corel is deliberately locking out Intel and Nvidia, and I don’t particularly approve of that. However, the compression utility is still immensely popular, making it a great example of how graphics hardware can be applied to a workload not previously associated with graphics.
I included the FX-8150 in the chart so you can see how long it takes the eight-core chip to finish a workload that’s now supposedly optimized for parallelized hardware.
As you can see, though, enabling OpenCL acceleration has a huge impact on performance. What once took 2:11 on the A10-5800K only takes 1:28 when the APU’s Devastator graphics core contributes to the effort. That’s a 32.8% reduction in completion time, and likely what AMD is hoping to see across the board as software developers begin figuring out how much of their code can be sped up using graphics resources.
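For anyone who wants to check the math, here is a minimal Python sketch of how that percentage falls out of the two WinZip timings quoted above. The helper names (`to_seconds`, `time_reduction_pct`, `speedup`) are made up for illustration and aren't part of any benchmark tool; the only inputs are the 2:11 and 1:28 figures from our run.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'M:SS' timing string into whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def time_reduction_pct(before: str, after: str) -> float:
    """Percentage of the original run time that was saved."""
    b, a = to_seconds(before), to_seconds(after)
    return (b - a) / b * 100


def speedup(before: str, after: str) -> float:
    """How many times faster the accelerated run finishes."""
    return to_seconds(before) / to_seconds(after)


# Timings from the A10-5800K WinZip 16.5 run (hypothetical variable names).
cpu_only = "2:11"     # OpenCL disabled
with_opencl = "1:28"  # OpenCL enabled

print(f"{time_reduction_pct(cpu_only, with_opencl):.1f}% less time")
print(f"{speedup(cpu_only, with_opencl):.2f}x as fast")
```

The two numbers describe the same result from different angles: shaving roughly a third off the run time is equivalent to finishing the job about one and a half times as fast.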
LuxMark, which centers on the SmallLuxGPU2 rendering engine, is another OpenCL-based measurement tool we’ve been using.
In it, we see an A10-5800K trailing a discrete GeForce GTS 450 graphics card in an FX-8150-based machine.
Remember, Trinity employs AMD’s VLIW4 architecture rather than GCN, and it’s GCN that bolsters compute performance substantially. As such, it’s not surprising to see the Llano-based A8-3850 outrun the A8-5600K, which has fewer shaders. The next-gen APU family, Kaveri, will employ GCN, though.
We also had plans to run Musemage 1.9, introduced to us in William Van Winkle’s most recent exploration of GPU-accelerated image editing apps. However, the software’s licensing scheme is such that the license is revoked after three hardware changes. Paraken Technology, the company responsible for Musemage, sent over a handful of licenses to use, but I didn’t have time to get everything set back up again. We do plan to test Musemage going forward, though.