Intel Kaby Lake: 14nm+, Higher Clocks, New Media Engine

14nm+ Overview, Tri-Gate And Speed Shift

14nm+ Overview

Moore's Law predicts a doubling of transistor density roughly every two years (a figure often quoted as 18 months). Unfortunately, the oft-quoted law is colliding with the laws of economics, or more specifically, Rock's law, which predicts that the cost of building a new semiconductor fab will double every four years. A typical fab incurs a capital expenditure of roughly $14 billion, so new process shrinks will require either higher retail prices or longer amortization windows to offset the increased investment. The trick is to find the right balance between transistor density and cost, and Intel is quite bullish that it can maintain its race against physics as it continues to shrink its chips. However, increased fab and R&D costs are the likely culprits behind the relaxed tick-tock-tock schedule.
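To put rough numbers on how the two laws compound, here is a minimal sketch. The function name growth_factor and the 12-year horizon are our own illustrative choices, not anything Intel has published. The doubling period is a parameter, so either the 18-month or the two-year reading of Moore's Law can be plugged in.

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Multiplicative growth after `years` for a quantity that doubles
    once every `doubling_period_years` (simple exponential model)."""
    if doubling_period_years <= 0:
        raise ValueError("doubling period must be positive")
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    horizon_years = 12  # arbitrary illustrative span (assumption)

    # Transistor density under two common readings of Moore's Law.
    density_18_months = growth_factor(horizon_years, 1.5)
    density_two_years = growth_factor(horizon_years, 2.0)

    # Fab construction cost under Rock's law (doubling every four years).
    fab_cost = growth_factor(horizon_years, 4.0)

    print(f"Density, 18-month doubling: {density_18_months:.0f}x")
    print(f"Density, two-year doubling: {density_two_years:.0f}x")
    print(f"Fab cost, four-year doubling: {fab_cost:.0f}x")
```

Density still outpaces fab cost in this idealized model. The economic squeeze comes from the fact that the cost curve starts from billions of dollars and has to be recovered over a product cycle that does not get any longer on its own.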

Kaby Lake recycles Skylake's microarchitecture, which means that the pipeline (and its IPC throughput) remains unchanged. Instead, Intel's 14nm+ process optimizations focused on developing faster transistors, yielding higher available clock rates. These frequency increases are important to single-threaded applications, and in a mobile environment, that means finishing a workload quickly to get back down to idle. Coupled with Intel's aggressive clock gating techniques, this naturally promotes longer battery life.

Tri-Gate Gets A Facelift

Intel began using its 3D tri-gate FETs (similar to FinFETs) when it transitioned to the 22nm process, which allowed it to increase performance within the same power envelope. Unfortunately, 3D transistors add cost and complexity to the already-expensive design and manufacturing process.

Intel notes that it has the industry's highest transistor density, and since 14nm+ doesn't involve a lithography shrink, the density metric likely remains unchanged. Instead, Intel is optimizing its transistors by improving their fin profile with taller fins and a wider gate pitch. It's also improving the transistor channel strain.

Of course, Intel does not provide exact measurements for the new fin profile and gate pitch for comparison, but a glance at an IDF 2014 presentation illustrates the company's previous advances and the scale of the problem. Intel hasn't officially named the new process as its next-gen tri-gate, but it is safe to assume that it is.

Interconnects, the small sandwiched filaments that connect the transistors, are also becoming more of a challenge as lithography shrinks. Transistors tend to become faster as they get smaller, but copper interconnects become slower because they have a reduced ability to carry current. Many of the recent advances in interconnect technology have centered on improving the interconnect insulators, but Intel notes that it increases interconnect performance on the 14nm+ process through pitch and aspect ratio optimizations. 

According to the company, the net effect of its 14nm+ transistor and interconnect optimizations yields a 12% performance increase.

Higher Clocks Power Faster Speed Shifts

Switching efficiently into and out of the various power states is one of the most important techniques to reduce power consumption. In the past, the operating system signaled the processor to switch in and out of P-states using EIST. However, the signaling latency limited the technology's efficiency, so Intel introduced Speed Shift with Skylake. Speed Shift allows the processor to control its own P-states, reducing command latency by 30x.

Speed Shift technology remains unchanged with the Kaby Lake generation, but the chart above outlines the impact of faster clock rates. The horizontal axis is time-based, and each plot line represents the completion time for the same workload with various settings. The vertical axis plots frequency during the test as it fluctuates.

The orange line plots the amount of time required to complete the task with a Skylake-based Core i7-6500U's slower EIST implementation. Turning control over to the same Skylake processor (green line) via Speed Shift technology decreases the latency involved with switching into faster clock rates, so the workload completes in less than half the time.

Combining Speed Shift with the higher Turbo Boost frequencies on the Kaby Lake-based Core i7-7500U (yellow line) reduces the time to complete the task even more. The higher frequency allows the processor to return to idle faster after completion, resulting in longer battery life.
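The relationship between switching latency, peak clock, and completion time can be approximated with a toy model. The sketch below is a simplification under stated assumptions, not a reproduction of Intel's chart: it treats the clock as sitting at a single low frequency until one instantaneous jump to the peak frequency, whereas real processors ramp through several steps. The workload size, the 0.8 GHz idle clock, and the 30 ms and 1 ms latencies (picked only to reflect the 30x ratio Intel cites) are hypothetical. The 3.1 GHz and 3.5 GHz peaks are the rated maximum Turbo Boost clocks of the Core i7-6500U and Core i7-7500U.

```python
def completion_time_ms(work_mcycles: float,
                       f_low_ghz: float,
                       f_high_ghz: float,
                       switch_latency_ms: float) -> float:
    """Time to finish a fixed workload when the clock starts low and
    jumps to the high frequency after a fixed switching latency.

    1 GHz equals one million cycles per millisecond, so work expressed in
    megacycles divided by a frequency in GHz gives milliseconds.
    """
    if min(f_low_ghz, f_high_ghz) <= 0 or switch_latency_ms < 0:
        raise ValueError("frequencies must be positive, latency non-negative")

    # Work completed at the low clock while waiting for the switch.
    done_before_switch = f_low_ghz * switch_latency_ms
    if work_mcycles <= done_before_switch:
        # The task finishes before the faster clock ever kicks in.
        return work_mcycles / f_low_ghz

    remaining = work_mcycles - done_before_switch
    return switch_latency_ms + remaining / f_high_ghz


if __name__ == "__main__":
    WORK_MCYCLES = 62.0  # hypothetical short burst of work
    IDLE_GHZ = 0.8       # assumed low-power clock

    # (peak clock in GHz, switching latency in ms); latencies are assumptions.
    scenarios = {
        "Skylake, OS-driven (EIST)": (3.1, 30.0),
        "Skylake, Speed Shift": (3.1, 1.0),
        "Kaby Lake, Speed Shift": (3.5, 1.0),
    }
    for name, (peak_ghz, latency_ms) in scenarios.items():
        t = completion_time_ms(WORK_MCYCLES, IDLE_GHZ, peak_ghz, latency_ms)
        print(f"{name}: {t:.1f} ms")
```

Even in this crude model, the ordering matches the article's description: shortening the switching latency delivers the larger gain for a short burst, and the higher peak clock trims the remainder.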

Intel also provides unique features for mobile devices, such as its Intel Adaptive Performance Technology (APT). The implementation uses sensors that feed information back to the system to enhance power management at the hardware level. Intel concedes that vendors already employ some APT features in existing devices, but contends that Kaby Lake devices integrate the technology more tightly. This may indicate that the CPU itself uses the sensor feedback to make Turbo Boost and Speed Shift decisions, but we await further details.

The company demoed a 7mm-thick Asus Transformer 3 that adapts its frequency and performance based on sensor feedback. The "skin" temperature sensors let the device monitor its surface temperature and adjust frequencies accordingly, so it can stay in a Turbo Boost state for longer periods when thermal headroom is available. Accelerometers also allow the device to adjust performance based on its orientation. For instance, the device will switch to a higher-power mode when it is in a static 45-degree orientation (which indicates docking), as opposed to a 90-degree orientation, which indicates a user is holding it.
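Intel has not detailed how these decisions are made, so the following is only a speculative sketch of how such a sensor-driven policy could be structured. Every name and threshold here (SensorReading, choose_power_mode, the 45 C skin limit, the 5-degree tilt tolerance) is hypothetical and chosen for illustration. Only the 45-degree docked posture and the 90-degree handheld posture come from the demo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorReading:
    skin_temp_c: float  # surface temperature reported by the skin sensor
    tilt_deg: float     # device angle reported by the accelerometer
    is_static: bool     # True when the accelerometer sees no movement


# All thresholds below are assumptions for illustration only.
SKIN_TEMP_LIMIT_C = 45.0
DOCK_TILT_DEG = 45.0
TILT_TOLERANCE_DEG = 5.0


def looks_docked(reading: SensorReading) -> bool:
    """A static device resting near 45 degrees is treated as docked."""
    near_dock_angle = abs(reading.tilt_deg - DOCK_TILT_DEG) <= TILT_TOLERANCE_DEG
    return reading.is_static and near_dock_angle


def choose_power_mode(reading: SensorReading) -> str:
    """Pick a power mode: comfort limits first, then posture."""
    if reading.skin_temp_c >= SKIN_TEMP_LIMIT_C:
        return "throttle"  # no thermal headroom left at the surface
    if looks_docked(reading):
        return "high"      # nobody is holding it, so allow more heat
    return "balanced"      # handheld or moving


if __name__ == "__main__":
    docked = SensorReading(skin_temp_c=35.0, tilt_deg=45.0, is_static=True)
    handheld = SensorReading(skin_temp_c=35.0, tilt_deg=90.0, is_static=False)
    print(choose_power_mode(docked), choose_power_mode(handheld))
```

Checking the skin temperature before the posture reflects one plausible priority, namely that user comfort overrides performance. An actual implementation may weigh these inputs differently.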


Paul Alcorn
Managing Editor: News and Emerging Tech

Paul Alcorn is the Managing Editor: News and Emerging Tech for Tom's Hardware US. He also writes news and reviews on CPUs, storage, and enterprise hardware.

  • What a boring release. LGA 2011-v3 is my platform of interest.
  • dgingeri
    In other words, total yawnfest.
  • ComputerSecurityGuy
    Yup, Skylake Refresh. Higher clocks. Slightly lower power consumption. Probably saw a demo of Overwatch on Iris Pro 680 (580 with 200MHz or so higher clocks). Overall, probably not a very interesting release. The only hopeful thing is they might bring Iris/Iris Pro to lower end or lower power SKUs.
  • 80-watt Hamster
    Sheesh, why all the negativity? As someone who put together a Skylake platform with an i3, I'm looking forward to more capable -K processors being released for the same socket.
  • txhorn
    Kaby Lake's hardware encode and decode of 4k codecs is significantly better than Skylake's. VP9 4k decode is down to 10-20% cpu usage from 70-80% usage. That's pretty awesome for HTPCs and high-res portable battery life.
  • AndrewJacksonZA
    Typo on page two: "Intel is optimizing its transistors by improving their fin profile with taller fins and a wider gate pitch" That should be a NARROWER gate pitch.
  • digitalgriffin
    $389 for a 2 core 4 thread processor....thanks but no thanks.
  • goblinissimus
    Nothing about USB 3.1 Gen 2, TB3, DP1.3, HDMI 2.0b/HDCP 2.2?
    Also, 12 bit (aka Dolby Vision or DV) HEVC decode would have been nice.
  • digitalgriffin
    You know intel keeps bragging about how you get more battery life. Yet manufacturers keep shrinking the Wh on the batteries and intel keeps raising their prices. You really gain nothing in terms of battery life or cost.
  • 80-watt Hamster
    digitalgriffin said:
    $389 for a 2 core 4 thread processor....thanks but no thanks.

    4C8T perhaps you mean? I can't see Intel trying to charge nearly $400 for an i3.