PAO Steps In
Sometimes the race to satisfy Moore's Law leaves promising technologies and optimizations on the table. Faster development cycles force more trade-offs in capabilities (there isn't enough time to implement all of them), and they also prevent manufacturers from fully exploiting the lessons learned during the first step on the microarchitecture ladder.
Intel's additional 14nm cycle, the "Optimize" portion of its new PAO scheme, allows the company to make promising claims by tweaking its existing Skylake architecture. Intel tuned the transistors to provide more performance on the same node, but it is applying much of the increased headroom to Turbo Boost rather than raising its base clock rates.
The tactic works well for a mobile-first approach, but it's hard to determine how Intel will apply the faster transistors to desktop-oriented SKUs. Higher-TDP models typically aren't as useful in battery-powered applications, so we may see more substantial base frequency increases on those parts. We also expect a broader implementation of other technologies, like the software-based Turbo Boost Max 3.0 first seen in Broadwell-E. Intel won't comment on any of that until later this year.
Intel also made what appear, at least on the surface, to be relatively minor changes to the encode and decode blocks in its graphics engine, which we're calling Gen9+. These targeted adjustments should yield impressive gains in specific tasks. Offloading the encode/decode process from the CPU during the majority of HEVC and VP9 workloads should have a tangible impact on performance during content creation and consumption, not to mention battery life.
Intel provided several compelling demonstrations during its briefing sessions, and playing Overwatch on a 15W platform at 32 FPS with maximum FOV and HD resolution is impressive. It certainly bodes well for the more powerful mobile-oriented SKUs that will come next year.
Some will be dismayed that the slow cadence of incremental upgrades only appears to be getting slower. But the economics of the semiconductor design and manufacturing process dictate that there will be trade-offs at some point. Intel delayed its 10nm Cannonlake release when it switched to the PAO flow, while some foundries are skipping 10nm FinFET entirely. GlobalFoundries, for example, recently indicated that it is transitioning directly from 14nm to 7nm. The company suggests its drastic change of course is due to the limited performance improvements available from 10nm products.
The resurgent AMD claims its Zen architecture is competitive with current-generation Skylake processors, and Intel's relatively small performance jumps (at least with the mobile Kaby Lake products) might allow AMD to assume a more competitive stance. However, the semiconductor clock keeps ticking, and a jump to 10nm could provide Intel with additional breathing room. Of course, that hinges on how fast both companies can bring their designs to market.
Intel's incremental performance increases may not appear earth-shattering, but the first products built on the 14nm+ process target a majority of the workloads that are relevant to mobile users. In all, the refinements will give many users a reason to upgrade from older systems, but they surely will not spur tech enthusiasts to discard their Skylake mobile devices in favor of Kaby Lake designs. That isn't Intel's intention, though. The key is to give the technology "laggards" who are still on older platforms a reason to upgrade, and the improvements might provide enough incentive.