Intel unwraps Lunar Lake architecture: Up to 68% IPC gain for E-cores, 14% IPC gain for P-Cores

Lunar Lake Architecture
(Image credit: Intel)

Intel pulled the covers back on its Lunar Lake architecture during its Intel Tech Tour 2024, delivering deep dive architectural details in Taipei, Taiwan in advance of the company’s Computex 2024 keynote as its newest chips race to a Q3 launch. Intel’s Lunar Lake will have significant improvements in every facet of its design. Lunar Lake will primarily target mobile designs, powering some of the best laptops, though many of the fundamental changes will likely carry over to Arrow Lake and will be in some of the best CPUs for gaming.

Every component of the Lunar Lake architecture has been optimized for a refined blend of power and performance that intel says will redefine what we expect from x86 PCs. Some of the biggest improvements come in the E-cores, with 38% and 68% IPC gains in the new Skymont architecture. There's also a 14% IPC gain for the Lion Cove P-cores — though these projections come with caveats. Graphics will see a 50% improvement in iGPU performance with the new Xe2 integrated graphics engine.

Lunar Lake incorporates Intel’s new neural processing unit (NPU) for AI workloads that delivers 48 TOPS of performance, easily providing enough performance to satisfy Microsoft’s requirement of 40 NPU TOPS for next-gen AI PCs. In fact, the Lunar Lake platform has far more AI performance under the hood — in total, it offers 120 TOPS when factoring in the CPU and iGPU.

The resulting Lunar Lake mobile chips employ an entirely new design methodology that focuses on ensuring power efficiency as a first-order priority, and this base architecture will be used as the building block for Intel’s future products, like Arrow Lake and Panther Lake. This new design focus is key to fending off a bevy of strong competitors in the laptop market from AMD, Apple, and now Qualcomm.

Surprisingly, intel turned to TSMC for its leading-edge 3nm N3B process node for its compute tile, which houses the CPU, GPU and NPU. It also uses the TSMC N6 node for the platform controller tile that houses the external I/O interfaces. In fact, the only Intel-fabbed silicon on the chip is the passive 22FFL Foveros base tile that facilitates communication between the tiles and the host system.

Intel says it chose TSMC’s nodes because they were the best available when the company began designing the chip, a nod to its delays on the manufacturing side of the operation as it looks to regain its lead in foundry technology through its five nodes in four years initiative. However, Intel designed the architectures to be easily portable to other process nodes, so we can expect it to return to using its own nodes with many of these same architectures in its future products.

Lunar Lake’s new microarchitectures pave the way for the company’s soon-to-be-announced Arrow Lake processors for the desktop, and even its Xeon lineup, too. Let’s dive into the details.

Lunar Lake SoC Overview

Intel’s Lunar Lake will come with four P-cores and four E-cores in the top-tier SKU. The chip is comprised of two logic tiles, a TSMC N3B compute tile and an N6 platform controller tile, along with a stiffener (a non-functional piece of filler silicon) placed atop a 22FFL base Foveros tile. The logic tiles are connected to the base tile with solder bonding with a 25-micron bump pitch (a critical measurement of interconnect density), an improvement over the 36-micron pitch used for Meteor Lake. This smaller pitch enables denser communication pathways between the units and helps reduce power consumption.

Intel places two stacks of LPDDR5X-8500 memory directly on the chip package, in 16GB or 32GB configurations, to reduce latency and board area while lowering the memory PHY’s power consumption by up to 40%. The memory communicates over four 16-bit channels and delivers up to 8.5 GT/s of throughput per chip.

(Image credit: Intel)

The compute tile houses the CPU P- and E-cores, Xe2 GPU, and NPU 4.0. It also incorporates a new 8MB ‘side cache’ that can be shared among all the various compute units to improve hit rates and reduce data movement, thus saving power. However, it doesn’t technically fit the definition of an L4 cache because it is shared between all of the units.

Intel has also moved the power delivery subsystem from the chip to the board, with four PMICs spread across the motherboard to provide multiple power rails and increased control. Overall, Intel claims a 40% reduction in SoC power over Meteor Lake.

Let’s dive into the core microarchitecture for the CPU, GPU and NPU, along with details about the platform controller tile, on the following pages.

TOPICS
Paul Alcorn
Managing Editor: News and Emerging Tech

Paul Alcorn is the Managing Editor: News and Emerging Tech for Tom's Hardware US. He also writes news and reviews on CPUs, storage, and enterprise hardware.

  • dimar
    I'd like to see extensive power consumption/performance benchmarks for the next gen. Intel, AMD, ARM CPUs and how slower and faster SSDs affect the battery life.
    Reply
  • cyrusfox
    Fascinating the disparity in improvement between Floating point vs integer single thread uplift. Huge FP uplift! Eye popping improvements!!! That is lame though, comparing to the gimped e-cores...
    but the comparisons are once again being made to the low-power Meteor Lake E-core instead of the full E-core
    iGPU uplift is looking really solid, this seems to check all the boxes of what I need in a device. For toting around I really don't need 24+ cores, 8 is plenty and with this much GPU as well as hopefully maturing platform. Excited to see the products and where the pricing lands. Hope Framework gets a flavor of this.
    Reply
  • jenci8888
    68% ipc gain e-core? That doesn't seem right... I think they meant 68% gain e-core on specfp. It should be 30% ipc around.
    Reply
  • usertests
    cyrusfox said:
    That is lame though, comparing to the gimped e-cores...
    They also compare it to Raptor Cove, and if I'm reading it right, claim +2% IPC over that. With the caveat repeatedly mentioned in the article that Intel is giving itself a 10% margin of error.
    Reply
  • thestryker
    Well it seems we have an answer regarding HT and that is it depends. It'll be interesting to see which version of the Lion Cove desktop ARL uses. I wouldn't be surprised if mobile ARL went without HT, but it seems like desktop could probably keep it even though they emphasize hybrid when referring to dropping it.

    Will be looking forward to seeing real world performance on LNL and ARL.
    cyrusfox said:
    That is lame though, comparing to the gimped e-cores...
    Just a guess based upon the testing Chips and Cheese did on the MTL LP E-cores removal from the ring bus and thus the L3 cache can have an outsized impact on performance. This would in theory be the closest comparison to an existing product.
    Reply
  • Giroro
    I think cheap Mini PCS using Intel's N100 all E-core processor are a perfectly usable office machine for many people, and I would love to see those upgraded with the new E cores.

    That said, I would never buy one, because they would definitely spec those machines with a worthless amount of non-upgradeable memory.
    One of the major features adding value of the N100 machines, is that you can usually upgrade the RAM.

    Now for the high end Lunar Lake products... There is no high end. If Intel doesn't convince manufacturers to keep ultrabooks with the highest available configuration under $1200, then they're going to have a problem
    Reply
  • Evildead_666
    Can people please stop using the word Architect as a verb please ?
    "Redesigned" would have been perfect for this article.
    Architected or Rearchitected do not exist.
    Cheers.
    Reply
  • Dragos Manea
    Evildead_666 said:
    Can people please stop using the word Architect as a verb please ?
    "Redesigned" would have been perfect for this article.
    Architected or Rearchitected do not exist.
    Cheers.
    That would to similar with the article from which they copied and pasted, they had to change some words even if it is with words that does not exist.
    Reply
  • bit_user
    The article said:
    38% and 68% IPC gains in the new Skymont architecture.
    This is based on a somewhat biased performance comparison (see below).

    cyrusfox said:
    Fascinating the disparity in improvement between Floating point vs integer single thread uplift. Huge FP uplift! Eye popping improvements!!!
    That is lame though, comparing to the gimped e-cores...
    Initially, I missed what you probably meant by "gimped". As I see @thestryker has pointed out, comparing to the LP E-cores is indeed quite lame of them, since its lack of L3 cache has been shown to disadvantage it relative to the Crestmont cores on Meteor Lake's CPU tile.

    Dang. I was really excited for a minute, there.
    : (
    Reply
  • bit_user
    Giroro said:
    I think cheap Mini PCS using Intel's N100 all E-core processor are a perfectly usable office machine for many people, and I would love to see those upgraded with the new E cores.
    Yeah, they seem to be somewhere around the performance of a Sandybridge or Haswell i5, which is still pretty usable. Of course, their iGPU is much better than those CPUs'.

    Giroro said:
    That said, I would never buy one, because they would definitely spec those machines with a worthless amount of non-upgradeable memory.
    One of the major features adding value of the N100 machines, is that you can usually upgrade the RAM.
    You can find some that take a DDR5 SO-DIMM. I have 32 GB in my N97 machine. It doesn't need that much, but I did it just to get dual-rank, for the small performance boost it provides.
    Reply