Results: OpenCL and HSA
While we don’t want to rely on synthetic benchmarks, we'll use a couple to illustrate the potential gains of OpenCL and AMD's HSA initiative as those efforts start to take hold.
LuxMark 2.0
To assess the A10-7800's capabilities, we run three separate tests: CPU-only, GPU-only, and both combined. While we might have expected the more resource-laden APUs to win, Intel's Core i3-4330, with its superior IPC, still puts in a strong showing. Let’s start with just the CPU:
We naturally expect the Haswell design to fare well in a CPU-only measure of performance. But while the win is unsurprising, the magnitude of Intel's advantage is fairly overwhelming. The balance shifts in AMD's favor when only the on-chip graphics engine is used:
Intel's HD Graphics implementation is outclassed by Graphics Core Next in the Kaveri design. But what happens when CPU and GPU resources are utilized simultaneously?
Just as Bill Murray likes to photobomb pictures, the Core i3-4330 sneaks right into the middle of AMD’s family photo. One of the reasons appears to be that the Core i3's CPU and GPU scores are added together, yielding an aggregate, while the APUs are weighted differently. Perhaps one on-chip complex or the other isn't running at peak performance under a combined full load.
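One way to check that reasoning on your own results is to compare each chip's combined score against the simple sum of its CPU-only and GPU-only scores. The snippet below is only a minimal sketch: the function name and the sample scores are hypothetical placeholders, not numbers from our charts, and it assumes the three LuxMark scores are directly comparable.

```python
def combined_scaling(cpu_score, gpu_score, combined_score):
    """Return the combined score as a fraction of the ideal CPU + GPU sum."""
    ideal = cpu_score + gpu_score
    if ideal <= 0:
        raise ValueError("CPU and GPU scores must sum to a positive value")
    return combined_score / ideal


# Hypothetical placeholder scores (CPU-only, GPU-only, combined).
# These are NOT measurements from this article.
samples = {
    "chip_a": (200, 100, 295),
    "chip_b": (100, 300, 320),
}

for name, (cpu, gpu, both) in samples.items():
    ratio = combined_scaling(cpu, gpu, both)
    print(f"{name}: combined run reaches {ratio:.0%} of the CPU + GPU sum")
```

A ratio close to 100 percent would suggest the two scores simply add up, while a noticeably lower value would be consistent with one complex or the other throttling under combined load.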
HSA: A Great Idea in Need of Software Support
Leading up to the Kaveri APU introduction, AMD put a lot of effort into evangelizing the benefits of its HSA (Heterogeneous System Architecture) initiative. Again, if you want to know more, read our Kaveri launch article. However, even now, several months later, the number of applications exploiting HSA remains small. That's a disappointing state of affairs, since the fundamentals of HSA are enticing.
Moving from theory to practice, we turn to the LibreOffice benchmark originally provided by AMD, which impressively documents the value proposition of HSA. Once again, we conduct three separate test runs: CPU-only, OpenCL-only, and HSA. Since Intel CPUs do not support HSA, they aren't represented in the third chart. And we didn’t bother benchmarking the outdated Core i3-2100 at all.
Let’s start with the CPU-only chart, which, as we might expect, the Core i3-4330 dominates:
Thirty-seven percent higher performance. Such an advantage cries out for an OpenCL-based rematch. No sooner said than done. Surprisingly, Intel’s relatively modest graphics engine is slower than its CPU. Or perhaps the company's drivers are to blame.
It'd be easy to guess that a vendor-supplied test like this one would favor AMD's hardware. And it does, demonstrating what a highly parallelized workload can do on hardware with the necessary support. The gray bars are for comparison only. They depict the execution times of Intel's Core i3-4330 and AMD's A10-7800 in software and OpenCL modes.
Seeing the HSA-based results right after the OpenCL numbers reminds us of a quote attributed to General LeMay. When asked about the purpose of a nuclear second-strike capability, he replied, "To make the cinders dance." However, not every application is suitable for HSA optimizations, and adoption of the initiative by the software industry has thus far been disappointing.