Benchmark Results: GIMP
Our early GIMP testing threw us a bit of a curveball. We originally set out to test with three GEGL effects: bilateral filter, edge-laplace, and motion-blur. However, in repeated testing, the edge-laplace and motion-blur tests came back with identical results on the A8-based desktop platform with OpenCL enabled, regardless of whether we were testing with APU graphics or our discrete Radeon HD 7970 card. The 7970 should have blown the APU out of the water, or at least been decisively faster.
Discussions with AMD and developers confirmed our suspicions: we were hitting a CPU bottleneck on the A8. There simply wasn’t enough compute work happening for the GPU to make its presence felt. This raises an interesting value point: depending on how your app is coded, if your workloads aren’t sufficiently demanding, you may not realize as much benefit from GPU assist as you expect.
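To see why swapping in a much faster GPU can leave a result unchanged, here is a rough Python sketch of the bottleneck. It is our own illustrative model, not a description of how GIMP or GEGL schedules work: the function name and every number in it are made up for illustration, and it assumes the CPU-side portion and the GPU portion simply run back to back.

```python
def total_time(cpu_seconds, gpu_work, gpu_throughput):
    """Hypothetical model: fixed CPU-side time plus GPU work divided by GPU speed."""
    return cpu_seconds + gpu_work / gpu_throughput


# Illustrative throughput figures only (arbitrary units), not measured values.
APU_THROUGHPUT = 10.0
DISCRETE_THROUGHPUT = 40.0
CPU_SECONDS = 2.0  # assumed CPU-side overhead, identical for both GPUs

for label, gpu_work in (("light filter", 1.0), ("heavy filter", 400.0)):
    apu = total_time(CPU_SECONDS, gpu_work, APU_THROUGHPUT)
    discrete = total_time(CPU_SECONDS, gpu_work, DISCRETE_THROUGHPUT)
    # The ratio shows how much of the discrete card's advantage is visible.
    print(f"{label}: APU {apu:.3f}s, discrete {discrete:.3f}s, "
          f"ratio {apu / discrete:.2f}x")
```

With the light workload, the fixed CPU time dominates and the two GPUs finish within a few percent of each other. Only the heavy workload lets the faster card pull ahead, which is the behavior we ran into with edge-laplace and the default motion-blur settings.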
For our purposes, we had to modify our tests to increase the processing load enough to demonstrate GPU compute scaling. We replaced edge-laplace with Gaussian blur, cranking up the Size X and Y variables to 20.0 each. We kept the motion-blur filter, but increased the Length parameter to 100 and Angle to 45. This gave us the following GIMP results.
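As a rough illustration of why larger blur settings pile on compute work, the Python sketch below counts how many input samples a blur reads per output pixel. It rests on assumptions of ours: that the size setting behaves like a radius in pixels, and that the kernel spans 2 * radius + 1 samples per axis. GEGL's actual kernel sizing and implementation may differ, and the function name is hypothetical.

```python
def taps_per_pixel(radius, separable=True):
    """Samples read per output pixel for a square blur kernel.

    Assumes a kernel width of 2 * radius + 1 (an illustrative convention,
    not necessarily what GEGL uses).
    """
    width = 2 * radius + 1
    # A separable blur runs one horizontal and one vertical pass;
    # a naive 2D blur reads the full width x width window.
    return 2 * width if separable else width * width


for radius in (1, 5, 20):
    print(f"radius {radius}: separable {taps_per_pixel(radius)}, "
          f"naive 2D {taps_per_pixel(radius, separable=False)}")
```

Either way, per-pixel work climbs quickly with the radius, so the parallel portion of the job grows relative to whatever fixed work the CPU has to do.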
In these and subsequent tests, you’ll notice the obvious gap next to our HP notebook where OpenCL results should be. That's because today’s Sandy Bridge-based HD Graphics engines don’t support OpenCL (and we still haven't been able to get our hands on any Ivy Bridge-based Core i5 machines). Still, we left the Intel platform in the mix for comparison, because there are some cases in which the performance of Intel’s CPU working only in software makes for an interesting counterpoint to GPU-based acceleration. After all, with GPU assist still in its toddler stage and many applications not yet optimized for the new technology, it’s important to keep one eye on how non-accelerated platforms behave.
In these GIMP tests, though, the benefits of OpenCL-based GPU acceleration are glaring. Even stating the difference as a percentage or multiple seems irrelevant. The point is that without acceleration, these filters are nearly unusable on any system. Workflow comes to a complete stop as the system creeps through adding the blur one block at a time. With OpenCL turned on, we suddenly see very even, expected performance scaling as we step up from mobile APU to desktop APU, and from APU to discrete graphics. Note that it’s not just the graphics processor doing all of the work. Depending on the test, the CPU side still contributes another 20% to 40% to the end result.
Of course, this is only true when a suitable workload is present. Remember that we had to modify our original testing in order to expose more noticeable scaling from the GPU. Without that deliberate pressure, AMD's x86 cores stand in the way of greater utilization of graphics resources. We're certain that software developers know what they're up against when it comes to balancing resource utilization, and we have to assume that the scaling we're presenting will become more common as developers code for highly parallel operations. Today, expect the impact to be somewhat more muted.