14nm+ Overview, Tri-Gate And Speed Shift
Moore's Law, as it is usually quoted, predicts a doubling of transistor density roughly every two years. Unfortunately, it is intersecting with the laws of economics, or more specifically, Rock's law, which predicts that the cost of building a new semiconductor fab will double every four years. A typical fab incurs a capital expenditure of roughly $14 billion, so new process shrinks will require either higher retail prices or longer amortization windows to offset the increased investment. The trick is to find the right balance between transistor density and cost, and Intel is quite bullish that it can maintain its race against physics as it continues to shrink its chips. However, increased fab and R&D costs are the likely culprits behind the relaxed tick-tock-tock schedule.
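To put the compounding in perspective, here is a minimal Python sketch of Rock's law applied to the $14 billion figure quoted above. The function names are ours, and the projection is purely illustrative arithmetic, not a forecast of any real fab budget.

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Growth factor after `years` for a quantity that doubles every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)


def projected_fab_cost(base_cost_billion: float, years: float,
                       doubling_period_years: float = 4.0) -> float:
    """Project fab cost under Rock's law (cost doubles every four years by default)."""
    return base_cost_billion * growth_factor(years, doubling_period_years)


if __name__ == "__main__":
    # 14.0 is the article's rough capital expenditure figure, in billions of dollars.
    for years in (0, 4, 8):
        cost = projected_fab_cost(14.0, years)
        print(f"+{years} years: ~${cost:.0f} billion")
```

Under those assumptions, a fab built eight years later would cost about four times as much, which is why longer amortization windows start to look attractive.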
Kaby Lake recycles Skylake's microarchitecture, which means that the pipeline (and its IPC throughput) remains unchanged. Instead, Intel's 14nm+ process optimizations focused on developing faster transistors, yielding higher available clock rates. These frequency increases are important to single-threaded applications, and in a mobile environment, that means finishing a workload quickly to get back down to idle. Coupled with Intel's aggressive clock gating techniques, this naturally promotes longer battery life.
Tri-Gate Gets A Facelift
Intel began using its 3D tri-gate FETs (similar to FinFETs) when it transitioned to the 22nm process, which allowed it to increase performance within the same power envelope. Unfortunately, 3D transistors add cost and complexity to the already-expensive design and manufacturing process.
Intel notes that it has the industry's highest transistor density, and since 14nm+ doesn't involve a lithography shrink, the density metric likely remains unchanged. Instead, Intel is optimizing its transistors by improving their fin profile with taller fins and a wider gate pitch. It's also improving the transistor channel strain.
Of course, Intel does not provide exact measurements for the new fin profile and gate pitch for comparison, but a glance at an IDF 2014 presentation illustrates the company's previous advances and the scale of the problem. Intel hasn't officially billed the new process as its next-generation tri-gate design, but it is safe to assume that is what it amounts to.
Interconnects, the small sandwiched filaments that connect the transistors, are also becoming more of a challenge as lithography shrinks. Transistors tend to become faster as they get smaller, but copper interconnects become slower because they have a reduced ability to carry current. Many of the recent advances in interconnect technology have centered on improving the interconnect insulators, but Intel notes that it increases interconnect performance on the 14nm+ process through pitch and aspect ratio optimizations.
According to the company, the net effect of its 14nm+ transistor and interconnect optimizations yields a 12% performance increase.
Higher Clocks Power Faster Speed Shifts
Switching efficiently into and out of the various power states is one of the most important techniques to reduce power consumption. In the past, the operating system signaled the processor to switch in and out of P-states using Enhanced Intel SpeedStep Technology (EIST). However, the signaling latency limited the technology's efficiency, so Intel introduced Speed Shift with Skylake. Speed Shift allows the processor to control its own P-states, reducing command latency by a factor of 30.
Speed Shift technology remains unchanged with the Kaby Lake generation, but the chart above outlines the impact of faster clock rates. The horizontal axis is time-based, and each plot line represents the completion time for the same workload with various settings. The vertical axis plots frequency during the test as it fluctuates.
The orange line plots the amount of time a Skylake-based Core i7-6500U requires to complete the task using the slower EIST implementation. Turning P-state control over to the same Skylake processor (green line) via Speed Shift technology decreases the latency involved with switching into faster clock rates, so the workload completes in less than half the time.
Combining Speed Shift with the higher Turbo Boost frequencies on the Kaby Lake-based Core i7-7500U (yellow line) reduces the time to complete the task even more. The higher frequency allows the processor to return to idle faster after completion, resulting in longer battery life.
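The relationship between ramp latency, peak clock rate, and completion time can be sketched with a toy model in Python. This is our own simplification, not Intel's methodology: it assumes the processor sits at a fixed idle clock until a single ramp latency elapses and then jumps straight to its peak clock. Every number below (workload size, idle clock, peak clocks, and the 30 ms and 1 ms latencies chosen to reflect a 30x gap) is a placeholder picked for illustration.

```python
def completion_time_s(work_gcycles: float, idle_ghz: float,
                      turbo_ghz: float, ramp_latency_s: float) -> float:
    """Time to finish `work_gcycles` billion cycles of work.

    Toy model: run at `idle_ghz` for `ramp_latency_s`, then at `turbo_ghz`.
    Real processors step through intermediate P-states; this ignores that.
    """
    work_during_ramp = idle_ghz * ramp_latency_s  # Gcycles finished before the ramp
    if work_gcycles <= work_during_ramp:
        # The task ends before the clock ever rises.
        return work_gcycles / idle_ghz
    return ramp_latency_s + (work_gcycles - work_during_ramp) / turbo_ghz


# Hypothetical inputs, not measurements.
WORK = 0.1   # billions of cycles
IDLE = 0.8   # GHz

os_controlled = completion_time_s(WORK, IDLE, turbo_ghz=3.1, ramp_latency_s=0.030)
hw_controlled = completion_time_s(WORK, IDLE, turbo_ghz=3.1, ramp_latency_s=0.001)
hw_higher_turbo = completion_time_s(WORK, IDLE, turbo_ghz=3.5, ramp_latency_s=0.001)

if __name__ == "__main__":
    for label, t in (("OS-controlled ramp", os_controlled),
                     ("hardware-controlled ramp", hw_controlled),
                     ("hardware ramp + higher peak clock", hw_higher_turbo)):
        print(f"{label}: {t * 1000:.1f} ms")
```

Even in this crude model the ordering matches the chart's story: a shorter ramp helps most on brief, bursty tasks, and a higher peak clock trims the remainder.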
Intel also provides unique features for mobile devices, such as its Intel Adaptive Performance Technology (APT). The implementation uses sensors that feed information back to the system to enhance power management at the hardware level. Intel acknowledges that vendors already employ some APT features in existing devices, but contends that Kaby Lake devices integrate the technology more tightly. This may indicate that the CPU itself uses the sensor feedback to make Turbo Boost and Speed Shift decisions, but we await further details.
The company demoed a 7mm-thick Asus Transformer 3 that adapts its frequency and performance based on sensor feedback. The "skin" temperature sensors allow the device to monitor its surface temperature and adjust frequencies accordingly, and the device can choose to stay in a Turbo Boost state for longer periods of time based upon the thermal headroom. Accelerometers also allow the device to adjust performance based on its orientation. For instance, the device will switch to a higher-power mode when it is in a static 45-degree orientation (which indicates docking), as opposed to a 90-degree orientation, which indicates a user is holding it.
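Intel has not published how these decisions are made, so the following Python sketch is purely hypothetical. It only shows how a policy of this kind could combine the two sensor inputs described above. The mode names, the 40-degree Celsius skin limit, and the 10-degree tilt tolerance are all our inventions.

```python
from dataclasses import dataclass


@dataclass
class SensorReading:
    tilt_degrees: float   # device angle reported by the accelerometer
    is_static: bool       # True when the device is not moving
    skin_temp_c: float    # surface ("skin") temperature in Celsius


def choose_power_mode(reading: SensorReading,
                      skin_limit_c: float = 40.0,      # hypothetical threshold
                      tilt_tolerance: float = 10.0) -> str:  # hypothetical tolerance
    """Pick a power mode from sensor feedback (illustrative policy only)."""
    # No thermal headroom left: back off regardless of orientation.
    if reading.skin_temp_c >= skin_limit_c:
        return "throttled"
    # A static tilt near 45 degrees is treated as "docked", as in the demo.
    docked = reading.is_static and abs(reading.tilt_degrees - 45.0) <= tilt_tolerance
    return "high_power" if docked else "handheld"


if __name__ == "__main__":
    print(choose_power_mode(SensorReading(45.0, True, 32.0)))
    print(choose_power_mode(SensorReading(90.0, False, 32.0)))
    print(choose_power_mode(SensorReading(45.0, True, 41.0)))
```

Checking temperature first reflects one plausible design choice: a comfort limit on the chassis would override any performance boost that docking might otherwise allow.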