Benchmarks published by hardware tester David Huang reveal that Intel's latest Meteor Lake CPUs may actually have lower instructions-per-clock (IPC) than 13th and 14th Gen Raptor Lake CPUs. Meteor Lake's somewhat disappointing CPU performance already confirmed that it didn't feature an impressive IPC boost, but it's looking more like these cutting-edge chips have actually seen a regression.
IPC is a key indicator of architectural improvements when comparing CPUs of one generation to another. Being able to do more work in a clock cycle is generally good, and it indicates a design has improved over its predecessor. Measuring IPC greatly depends on the workload, however, and is difficult to nail down — cache sizes and the instruction mix can greatly impact the throughput. Huang used SPECint 2017 for his testing, where performance hinges almost entirely on raw horsepower.
Table: Performance Per GHz (SPECint 2017, single core) for the Apple M3 Pro, Apple M2 Pro, Ryzen 7 7840HS, Ryzen 7 7840U, and Core Ultra 7 155H.
Huang tested the Core Ultra 7 155H and a variety of other CPUs using just one core, leaving clock speeds at their defaults. He then derived a form of IPC by dividing each performance score by the clock speed. One caveat to this testing is that the 155H is being compared to a Core i7-13700H equipped with DDR5 memory rather than LPDDR5, which Huang admits can throw off the results. Still, that shouldn't impact things enough to change the conclusion that Meteor Lake's IPC is lower than Raptor Lake's.
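As a rough illustration of that normalization, here is a minimal Python sketch. The function name, the chip labels, and the input numbers are placeholders made up for this example, not Huang's measurements or his actual tooling; the only thing taken from the article is the idea of dividing a single-core score by the clock speed.

```python
def perf_per_ghz(score: float, clock_ghz: float) -> float:
    """Clock-normalized performance: benchmark score divided by clock speed in GHz."""
    if clock_ghz <= 0:
        raise ValueError("clock_ghz must be positive")
    return score / clock_ghz


# Placeholder inputs (score, clock in GHz). These are NOT measured results.
samples = {
    "chip_a": (8.0, 5.0),
    "chip_b": (7.2, 4.0),
}

normalized = {name: perf_per_ghz(score, ghz) for name, (score, ghz) in samples.items()}

# Rank by points per GHz, highest first.
for name, value in sorted(normalized.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.2f} points per GHz")
```

In this made-up case chip_a posts the higher raw score, but chip_b does more work per clock, which is the kind of gap a per-GHz comparison is meant to expose.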
Meteor Lake's IPC is roughly on par with AMD's Zen 4 Ryzen 7040 APUs, which is a silver lining, especially since upcoming Ryzen 8040 APUs will have identical CPU performance. However, there's a very clear gap between Meteor Lake and Apple silicon, one that Intel should be working to shrink rather than letting it grow wider.
On a technical level, it's hard to say why Meteor Lake has regressed in this test, but the CPU's performance characteristics elsewhere imply that Intel simply might not have cared as much about IPC. Meteor Lake is primarily designed to excel in AI applications and comes with the company's most powerful integrated graphics yet. It also features Foveros technology and multiple tiles manufactured on different processes. So while Intel doesn't beat AMD or Apple with Meteor Lake in IPC measurements, there's a lot more going on under the hood.
While raw CPU performance is likely a lower priority for Intel as it focuses on AI and graphics performance, the company may still improve IPC with future Lunar Lake and Arrow Lake CPUs. There may also be other scenarios where Meteor Lake does better than in the SPECint 2017 testing shown here.
Matthew Connatser is a freelancing writer for Tom's Hardware US. He writes articles about CPUs, GPUs, SSDs, and computers in general.
Shocked... I am not.
Intel's Bulldozer moment, we know what happened afterwards.
"the CPU's performance characteristics elsewhere imply that Intel simply might not have cared as much about IPC."
Rubbish. I'm sure they care enough to try and avoid regressions, even if they weren't planning on making outright improvements.
Pretty outlandish claim, right there.
"Meteor Lake is primarily designed to excel in AI applications"
No, it wasn't. That happens to be one of its big selling points, but I'm pretty sure their top priority was to improve efficiency & battery life. They're probably worried about competition from Apple, Qualcomm, and AMD on that front.
"While raw CPU performance is likely a lower priority for Intel as it focuses on AI and graphics performance"
These parts are worked on by different teams.
"There may also be other scenarios where Meteor Lake does better than in the SPECint 2017 testing shown here."
The only variables can be power & thermal thresholds. If the author divided by max boost clock, and the CPU wasn't allowed to stretch its legs very much or for very long, then testing on a laptop with more generous power limits and better cooling could produce better results.
Edit: Using Google Translate, the original webpage has this to say about heat & power:
"The CPU frequency is all default, and will be marked when the power consumption and heat dissipation performance are low enough to affect single-thread performance."
Still, there could be a time limit on boosting. Otherwise, sounds like the power & thermal parameters of the test platform probably shouldn't have much effect.
BTW, in addition to the 155H's "P-core" entry, there's another weird entry for it, called "LP P-core". I have no idea what that means, because I think there's only one set of P-cores and that result is still well above the 155H's E-core result.
Interestingly, its E-core scored 5.55 (absolute), while the i7-13700H's E-core scored 6.0. This is notable because Intel claimed the Crestmont E-cores actually did improve IPC over Gracemont. Normalizing by the specified "Efficient-core Max Turbo Frequency", which is 3.8 GHz for the 155H and 3.7 GHz for the i7-13700H, yields IPC scores of 1.46 and 1.62 for Meteor Lake and the previous-gen CPU, respectively. So, far from an improvement, it looks like quite a regression - even slightly more than the P-cores (11.1% vs. 9.1%)!
He measured the 155H's LP E-core at 3.15, but remember that its frequency is capped at just 2.5 GHz. It should have nearly identical IPC to the regular E-cores, but when I use the specified frequencies, I get 1.26 and 1.46 for the LP and regular E-cores, respectively.
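The E-core arithmetic in the comment above can be reproduced with a few lines of Python. This is only a sketch: the scores and frequencies are the ones quoted in the comment, the variable names are illustrative, and dividing by the specified maximum turbo frequency assumes each core actually held that clock for the whole run.

```python
# (score, specified max frequency in GHz), as quoted in the comment above.
e_cores = {
    "Core Ultra 7 155H E-core": (5.55, 3.8),
    "Core i7-13700H E-core": (6.0, 3.7),
    "Core Ultra 7 155H LP E-core": (3.15, 2.5),
}

per_ghz = {name: score / ghz for name, (score, ghz) in e_cores.items()}
for name, value in per_ghz.items():
    print(f"{name}: {value:.2f} per GHz")

mtl = per_ghz["Core Ultra 7 155H E-core"]
rpl = per_ghz["Core i7-13700H E-core"]

# The size of the gap depends on which chip is used as the baseline.
rpl_lead_pct = (rpl / mtl - 1) * 100     # how far ahead the 13700H E-core is
mtl_deficit_pct = (1 - mtl / rpl) * 100  # how far behind the 155H E-core is

print(f"13700H E-core lead: {rpl_lead_pct:.1f}%")
print(f"155H E-core deficit: {mtl_deficit_pct:.1f}%")
```

With unrounded inputs, the gap comes out at about 11% when measured as the 13700H's lead and about 10% when measured as the 155H's deficit, so the figure quoted in the comment is in the right range either way.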
Probably added latency due to disaggregation and its associated steps.
So long as the typical use case battery life is improved enough then it is still a win.
peachpuff said: "Shocked... I am not."
I can't see why not. It's a new node and Intel has everything to play for. No reason to think they'd take their foot off the gas.
Kamen Rider Blade said: "Intel's Bulldozer moment,"
I wouldn't go that far. Not least because its perf/W indeed appears to have improved. Being a laptop part, that + its low idle power are definitely relevant to its success.
I'd love to see some in-depth analysis of what's going on. I think it's mighty suspicious that Intel apparently didn't send out any review samples, which is probably why Toms has yet to review it. Phoronix, whom they've been quoting in other articles, bought his own Meteor Lake laptop.
rluker5 said: "Probably added latency due to disaggregation and it's associated steps."
I doubt it. AMD disaggregated its cores and memory controllers, using far more primitive interconnect technology, and didn't seem to suffer much from it. The memory controller is located in the SoC die, so it's just one hop away from the CPU tile.
LPDDR5 does have meaningfully higher latency, though. David Huang was wise to point that out. He notes several other CPUs where LPDDR5 seems to have a notable performance impact.
Former Intel contract employee here... Meteor Lake was a hassle and a half. Samsung drives wouldn't work unless you dropped the PCIe speed to 1. They were always damn slow, even the desktop CPUs (yes they existed).
bit_user said: "Rubbish. I'm sure they care enough to try and avoid regressions, even if they weren't planning on making outright improvements. Pretty outlandish claim, right there."
The next bit is important, and perhaps we should have emphasized this more but I didn't feel it was necessary: "It also features Foveros technology and multiple tiles manufactured on different processes."
We don't have a good way to measure "pure IPC" as caches, interfaces, etc. all affect instruction throughput. The cores on Meteor Lake could be identical or even improved over Raptor Lake, but all of the changes to implement the new multi-tile packaging could reduce the final real-world throughput.
Or if you want to read the statement in a different way: Intel seems to have cared more about getting Foveros working and proving it as a viable approach over just pure CPU IPC.
Thanks for considering my post and taking the time to reply. I always appreciate your dedication to the work and to us, Jarred. Also, Happy New Year!
JarredWaltonGPU said: "The next bit is important, and perhaps we should have emphasized this more but I didn't feel it was necessary: 'It also features Foveros technology and multiple tiles manufactured on different processes.'"
I understand that, but I'm also well aware that this isn't Intel's first chiplet/tile rodeo. They had Lakefield, Ponte Vecchio, and Sapphire Rapids. By now, they should have at least as much experience with disaggregated architectures as AMD had, when it did Zen 3 (i.e. Ryzen 5000).
JarredWaltonGPU said: "We don't have a good way to measure 'pure IPC' as caches, interfaces, etc. all affect instruction throughput."
For a while, now, I think industry standard practice has basically been to use a benchmark like SPEC2006, SPEC2017, or GeekBench (CPU) and normalize by clock speed. It's not a strict count of the instruction rate, but different instructions have different throughputs and latencies anyhow. The key point is just to look at clock-normalized performance, as a measure of microarchitecture width and the efficacy of things like branch predictors and prefetchers.
If you really wanted to try and isolate different aspects of the microarchitecture, one could even do this with microbenchmarks designed to stress different parts of the CPU. And some people do.
JarredWaltonGPU said: "The cores on Meteor Lake could be identical or even improved over Raptor Lake, but all of the changes to implement the new multi-tile packaging could reduce the final real-world throughput."
His data provides a couple interesting points of comparison.
Let's consider the Ryzen 5 3600 and Ryzen 7 4800U. Both are Zen 2, but the first is disaggregated and the latter is monolithic. Their raw scores were 6.57 and 5.95. As they both feature turbo frequencies of 4.2 GHz, we can just directly compare the scores, with the chiplet-based CPU coming in at 10.4% faster! Maybe there's some difference in boost behavior, but I found a review which confirms the 4800U can stay under 15W, with one thread at 4.2 GHz, so I'm assuming that was what happened. If they did both sustain 4.2 GHz, then that leaves only cache & perhaps memory latency as the distinguishing factors (both were tested using DDR4-3200). The Ryzen 5 3600 has 32 MB of L3 cache, while the Ryzen 7 4800U has only 8 MB. However, since the 3600 is subdivided into two CCXs, a single thread probably has access only to 16 MB of L3.
I think the main takeaway here is that there are some factors bigger than chiplet vs. monolithic.
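For anyone checking the Ryzen comparison above, the short Python sketch below shows why the raw scores can be compared directly when the clocks match: the shared frequency cancels out of the ratio. The scores and the 4.2 GHz turbo figure are the ones quoted in the comment; the variable names are illustrative, and the calculation assumes both chips really sustained that clock.

```python
# Scores and turbo frequency as quoted in the comment above.
ryzen_5_3600_score = 6.57   # chiplet-based Zen 2
ryzen_7_4800u_score = 5.95  # monolithic Zen 2
turbo_ghz = 4.2             # same rated turbo for both (assumed to be sustained)

raw_ratio = ryzen_5_3600_score / ryzen_7_4800u_score
normalized_ratio = (ryzen_5_3600_score / turbo_ghz) / (ryzen_7_4800u_score / turbo_ghz)

# With equal clocks, the per-GHz ratio equals the raw score ratio.
lead_pct = (raw_ratio - 1) * 100
print(f"Raw ratio: {raw_ratio:.4f}, per-GHz ratio: {normalized_ratio:.4f}")
print(f"Ryzen 5 3600 lead: {lead_pct:.1f}%")
```

If the two chips had boosted to different real-world clocks, the two ratios would diverge, which is why the sustained-frequency assumption matters here.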
JarredWaltonGPU said: "Or if you want to read the statement in a different way: Intel seems to have cared more about getting Foveros working and proving it as a viable approach over just pure CPU IPC."
I don't believe Intel would've rolled out such a technology for its bread-and-butter products that it didn't think was ready for prime time. The stakes are too high for them just to "push it out the door", before it's ready. Intel has taken its time to migrate to chiplets, and I assume that was just so they could perfect the technology. Sapphire Rapids showed they can deliver solid IPC using chiplets, and Meteor Lake is a whole generation beyond that and actually has far fewer tiles.
I'd encourage folks to read some of what David Huang said about LPDDR5, because it stands out as one of the more plausible explanations for the regression. I don't understand a word of Chinese, but Google Translate seems to work pretty well on the page.
"Zen 3 desktop performance is 12% higher than the strongest Zen 3 mobile (DDR5), and more than 30% higher than the strongest LPDDR5 mobile performance"
"when Alder Lake mainstream SoC i5-1240P is paired with LPDDR5-4800 The overall performance is only slightly better than the desktop Skylake++."
Doesn't Meteor Lake have a couple of extra-small extra-weak (extra-misleading to customers) "platform" e-cores thrown in the mix?
Meteor lake is like a weird big.LITTLE.TINY architecture as Intel tries everything to make their core count look larger, isn't it?
Maybe they're dragging down the averages, and I can't imagine today's schedulers are that great at simultaneously managing 3 different types of cores in a single processor, considering they still aren't the best at managing 2.
After seeing the Hardware Canucks preview I'm not really trusting any early release hardware results. Not to mention this reviewer's SPEC numbers appear to be off (and I don't mean just the raw numbers, but the differences between them) when compared to those AnandTech runs which they standardized a long while ago when Andrei was there.
After CES and when there are a lot more results to compare I think we'll get a better idea. Until then assumptions based on what little is available now are nebulous at best.
From everything Intel has indicated, IPC on the P-cores should be basically the same as RPL; it's the E-cores we'll want to take note of.