ARM announced the Cortex-A72, the high-end successor to the Cortex-A57, near the beginning of 2015. For more than a year now, SoC vendors have been working on integrating the new CPU core into their products. Now that mobile devices using the A72 are imminent, it’s a good time to discuss what makes ARM’s flagship CPU tick.
With the A57, ARM looked to expand the market for its CPUs beyond mobile devices and into the low-power server market. Using a single CPU architecture for both smartphones and servers sounds unreasonable, but according to ARM’s Mike Filippo, lead architect for the A72, high-end mobile workloads put a lot of pressure on caches, branch prediction, and the translation lookaside buffer (TLB), which are also important for server workloads. Where the A57 seemed skewed towards server applications based on its power consumption, the A72 takes a more balanced approach and looks to be a better fit for mobile.
The Cortex-A72 is an evolution of the Cortex-A57; the baseline architecture is very similar. However, ARM tweaked the entire pipeline for better power and performance. Perhaps the A57’s biggest weakness was its relatively high power consumption, especially on the 20nm node. This severely limited sustained performance in mobile devices, relegating the A57 to short, bursty workloads and forcing SoCs to fall back on the lower-performing Cortex-A53 cores for extended use.
ARM looks to correct this issue with the A72, going back and optimizing nearly every one of the A57’s logical blocks to reduce power consumption. For example, ARM was able to realize a 35-40% reduction in dynamic power for the decoder stage. By using an early instruction cache (IC) tag lookup, the A72’s 3-way set-associative L1 instruction cache and 2-way set-associative L1 data cache also use less power, similar to what direct-mapped caches would use. According to ARM, all of the changes made to the A72 result in about a 15% reduction in energy use compared to the A57 when both cores run the same workload at the same frequency on the same 28nm process. The A72 sees an even more significant reduction on a modern FinFET process, such as TSMC’s 16nm FinFET+, where an A72 core stays within a 750mW power envelope at 2.5GHz, according to ARM.
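To see why an early tag lookup can bring a set-associative cache close to direct-mapped power, it helps to count data-array reads. The Python sketch below is a toy model, not ARM's design: it assumes a conventional cache reads every way's data array in parallel with the tag compare, while an early lookup resolves the tag first and then reads only the matching way. The class name, the 64-byte line size, the 256-set geometry, and the round-robin replacement are all illustrative assumptions.

```python
class ToyCache:
    """Toy set-associative cache that only counts data-array reads."""

    def __init__(self, num_sets, num_ways, line_bytes=64, early_tag_lookup=False):
        self.num_sets = num_sets
        self.num_ways = num_ways
        self.line_bytes = line_bytes
        self.early_tag_lookup = early_tag_lookup
        self.data_array_reads = 0
        # tags[set][way] is the tag stored in that way, or None if empty
        self.tags = [[None] * num_ways for _ in range(num_sets)]
        # simple round-robin replacement pointer per set (an assumption)
        self.next_victim = [0] * num_sets

    def access(self, address):
        line = address // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        ways = self.tags[index]
        hit = tag in ways

        if self.early_tag_lookup:
            # Tag is resolved first, so only the hitting way is read.
            if hit:
                self.data_array_reads += 1
        else:
            # All ways are read speculatively while the tags are compared.
            self.data_array_reads += self.num_ways

        if not hit:
            victim = self.next_victim[index]
            ways[victim] = tag
            self.next_victim[index] = (victim + 1) % self.num_ways
        return hit


def count_reads(early_tag_lookup, addresses, num_sets=256, num_ways=3):
    """Run an address trace and return the number of data-array reads."""
    cache = ToyCache(num_sets, num_ways, early_tag_lookup=early_tag_lookup)
    for address in addresses:
        cache.access(address)
    return cache.data_array_reads


# Two passes over a 4KB loop of 4-byte instruction fetches.
trace = [pc for _ in range(2) for pc in range(0, 4096, 4)]
print("parallel lookup reads:", count_reads(False, trace))
print("early tag lookup reads:", count_reads(True, trace))
```

In this model the parallel design pays for three data-array reads on every fetch, while the early-lookup design pays for at most one, which is the same per-access cost a direct-mapped cache would have. Real hardware trades this saving against the latency of serializing the tag check, which the sketch does not model.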
[Image Source: Hiroshige Goto PC Watch]