At a high level, the A72 looks nearly identical to the A57, but at a lower level there are a significant number of changes throughout the entire pipeline that appear to make the A72 a decent upgrade. The most notable changes affecting performance are the improved branch prediction, increased dispatch bandwidth, lower-latency execution units, and higher-bandwidth L2 cache. All of these enhancements, and many more that we did not discuss here, lead to better performance: between 16% and 50% across a range of synthetic benchmarks, according to ARM. Real-world performance gains will be smaller, of course, but the A72 is definitely an improvement over the A57, especially with floating-point workloads.
The A72 is not a pure performance play, however. ARM is targeting much higher power efficiency with this architecture than with any previous high-end CPU core. It’s clear from the lengthy list of optimizations discussed above that reducing power consumption was paramount; many of the changes focus purely on power, with no net performance gain.
Reducing power and area (the A72 achieves a 10% core area reduction overall) obviously has a positive effect on battery life and cost, but it has a secondary effect on performance too. Normally, reducing latency in the execution units puts pressure on the maximum attainable core frequency due to increased circuit complexity and tighter timing windows; however, the A72’s power and area optimizations elsewhere, not to mention the move to FinFET, actually allow the A72 to reach a slightly higher frequency. Reducing power also reduces thermal load, allowing for higher sustained performance, something the A57 struggles with at 20nm.
The Cortex-A72 may not be a revolutionary design: it does not leapfrog Apple’s Twister CPU in the A9 SoC in single-core performance, nor does it undercut the A53 in power consumption. It’s a significant update nonetheless, addressing the A57’s issues by enabling higher peak and sustained performance while using less power.
Update, 1/12/16, 10:55am PT: Clarified how the Decode/Rename/Dispatch pipeline works and added some information about the issue queues.