Ryzen 5 9600X benchmarks show doubled cache bandwidth improvements — leaked AIDA64 benchmarks point to much faster L1 and L2 cache


One of the many Zen 5 architectural changes announced at AMD's Ryzen 9000 Computex keynote was a doubling of L1 and L2 cache bandwidth. HLX on X (formerly Twitter) discovered an AIDA64 benchmark run of a Ryzen 5 9600X engineering sample that appears to confirm AMD's L1 and L2 cache tweaks.

The AIDA64 benchmark reveals that the Ryzen 5 9600X offers almost 3,800 GB/s of read bandwidth on the L1 cache. Write bandwidth was measured at nearly 1,900 GB/s, and copy speed at almost 3,800 GB/s, just like the read speed, all with a sub-1 nanosecond cache latency, which we would expect from a first-level cache.

9600X ES vs 7600X: 2x L1 and L2 bandwidth. Source: QQ pic.twitter.com/G2c1Q1bjbj (June 10, 2024)


CPU and cache level	Cache Read	Cache Write	Cache Copy
Ryzen 5 9600X - L1 Cache	3,756.4 GB/s	1,884.4 GB/s	3,755.9 GB/s
Ryzen 5 9600X - L2 Cache	1,874.6 GB/s	1,795.1 GB/s	1,859.7 GB/s
Ryzen 5 7600X - L1 Cache	2,029.6 GB/s	1,026.9 GB/s	2,048.1 GB/s
Ryzen 5 7600X - L2 Cache	1,028.5 GB/s	1,017.0 GB/s	1,017.6 GB/s

The 9600X's L2 cache was measured at almost 1,900 GB/s for read bandwidth, almost 1,800 GB/s for write, and almost 1,900 GB/s for copy. Latency was 2.8ns.

The same source that provided the AIDA64 benchmark also provided an AIDA64 memory benchmark of a Ryzen 5 7600X Zen 4 processor for comparison. The results are consistent with what AMD disclosed in its Computex announcement. The Zen 4 part offers a little over half the raw L1 and L2 cache bandwidth of the Zen 5-based Ryzen 5 9600X.

The 7600X's L1 cache delivers roughly 2,000 GB/s of read and copy bandwidth and 1,000 GB/s of write bandwidth (with the same latency). Its L2 cache is rated at just barely over 1,000 GB/s across the board for read, write, and copy (with a slightly lower but practically indistinguishable 2.6ns latency result).
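To make the generational comparison concrete, the following minimal Python sketch computes the 9600X-to-7600X ratio for each figure in the table above. The dictionary layout and the `uplift` helper are our own illustrative names, not anything from AIDA64, and the numbers are simply the leaked values as reported.

```python
# Bandwidth figures in GB/s, copied from the leaked AIDA64 results in the table.
BANDWIDTH_GBPS = {
    ("9600X", "L1"): {"read": 3756.4, "write": 1884.4, "copy": 3755.9},
    ("9600X", "L2"): {"read": 1874.6, "write": 1795.1, "copy": 1859.7},
    ("7600X", "L1"): {"read": 2029.6, "write": 1026.9, "copy": 2048.1},
    ("7600X", "L2"): {"read": 1028.5, "write": 1017.0, "copy": 1017.6},
}


def uplift(level, op):
    """Return the 9600X / 7600X bandwidth ratio for one cache level and operation."""
    new = BANDWIDTH_GBPS[("9600X", level)][op]
    old = BANDWIDTH_GBPS[("7600X", level)][op]
    return new / old


if __name__ == "__main__":
    for level in ("L1", "L2"):
        for op in ("read", "write", "copy"):
            print(f"{level} {op}: {uplift(level, op):.2f}x")
```

On these numbers every ratio lands between roughly 1.75x and 1.85x, so the measured gain on this engineering sample is close to, but a little short of, a full doubling.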

CPU cache is one of the most important components in modern CPUs. A good caching system will keep the CPU cores fed with data consistently in a number of workloads with minimal downtime. A poor caching system, or a lack thereof, will lead to poor CPU performance as the cores have to wait for data to be transferred from much slower system memory.

We have yet to see how Zen 5's L1 and L2 bandwidth improvements translate to real-world performance, or even synthetic benchmark performance. But these bandwidth improvements are likely part of what contributes to Zen 5's reported 16% IPC improvement over Zen 4.

The Ryzen 5 9600X is AMD's latest mid-range desktop CPU based on the Zen 5 architecture. The chip boasts 6 cores and 12 threads, a peak turbo frequency of 5.4GHz, and 38MB of cache.
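As a rough sanity check on the L1 read figure, the aggregate bandwidth can be converted into bytes per clock cycle per core. The sketch below rests on two assumptions the leak does not confirm: that the reported 3,756.4 GB/s is the sum across all six cores, and that every core ran at the 5.4GHz peak turbo during the test. The function name is ours, for illustration only.

```python
def bytes_per_cycle_per_core(total_gbps, cores, clock_ghz):
    """Convert aggregate bandwidth (GB/s) into bytes per clock cycle per core.

    GB/s divided by GHz cancels the 1e9 factors, leaving bytes per cycle.
    """
    return total_gbps / (cores * clock_ghz)


# Assumed inputs: leaked L1 read bandwidth, 6 cores, 5.4GHz peak turbo.
print(round(bytes_per_cycle_per_core(3756.4, 6, 5.4), 1))
```

Under those assumptions the result is about 116 bytes per cycle per core. An engineering sample may well have run at lower clocks, in which case the per-cycle figure would be higher.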


Aaron Klotz is a contributing writer for Tom’s Hardware, covering news related to computer hardware such as CPUs and graphics cards.

27 Comments from the forums

And......this is just an early engineering sample being tested here, so the final performance should end up even better, both in synthetic and real world gaming tests.

Some rumors I would like to point out here. Kind of off topic though.

First off, as per Club386, AMD is already prepping its next-gen 3D V-Cache variants of the Granite Ridge Ryzen 9000 series processors. But to me, this seems highly 'unlikely', because there would only be a two-month gap between the non-X3D variants and the next-gen 3D V-Cache chips.

It makes little practical sense, since there won't be any serious competition from Intel until late 2024: Arrow Lake-S chips aren't coming out in the next few months or so.

IDK, but this is what they had to say:

"Our source on the Computex show floor tells us AMD plans to launch 9000X3D processors in September. This matches the staggered release of X870E motherboards that we anticipate will arrive the same month.

There’s no hard indication of which CPUs will arrive first, but an educated guess from historical releases suggests it’ll likely start with Ryzen 9 9950X3D and possibly Ryzen 9 9900X3D.

There was a level of confidence behind our source’s words, and they’re trustworthy enough to put stock into. That said, familiarity with how nebulous AMD’s launch schedules are urges taking this with a pinch of salt. Nothing’s confirmed until you have a sample in your hand as dates can easily change." ----- via Club386

Next up, AMD's senior technical marketing manager, Donny Woligroski, revealed that the firm is indeed working on future "X3D" processors. It appears AMD is working on some cool new features for its next-gen Ryzen 9000X3D "3D V-Cache" family.

He mentions "cool differentiators" here. Not sure what AMD means by this, but an educated guess would imply that the firm might be working on varied and different 3D V-Cache configurations for its Ryzen 9000X3D CPU lineup.

Varied sizes of 3D V-Cache to further segment the 9000 series lineup, or something else entirely ?

The X3D stuff, we have a lot to say about it. The best part about it is we're not just resting on laurels. We're improving what we can do with X3D, it's really exciting and I'm super looking forward to talking to people about that.

It's not like, hey, we've also added X3D to a chip. We are working actively on really cool differentiators to make it even better. We're working on X3D, we're improving it.
Donny Woligroski - AMD Senior Technical Marketing Manager (via PC Gamer)

EDIT:

BTW, we have already seen LAB samples featuring dual 3D V-Cache stack Ryzen chips from AMD before, in their labs, but I doubt AMD would go this route even for the Ryzen 9000 series lineup, due to higher cost, power, and thermal drawbacks.
Reply
Evildead_666

Metal Messiah. said:
He mentions "cool differentiators" here. Not sure what AMD means by this, but an educated guess would imply that the firm might be working on varied and different 3D V-Cache configurations for its Ryzen 9000X3D CPU lineup.

Varied sizes of 3D V-Cache to further segment the 9000 series lineup, or something else entirely ?
Could they layer an AI die with the X3D die ?
Or another compute type die ?

That would certainly be a differentiator.

Companies have talked about having "extras" within memory dies, so there's that ...
Reply
TerryLaze

Metal Messiah. said:
He mentions "cool differentiators" here. Not sure what AMD means by this, but an educated guess would imply that the firm might be working on varied and different 3D V-Cache configurations for its Ryzen 9000X3D CPU lineup.

Varied sizes of 3D V-Cache to further segment the 9000 series lineup, or something else entirely ?
Senior Technical Marketing Manager

Is super looking forward to actively work on something really cool...

https://www.youtube.com/watch?v=GyV_UG60dD4
Reply
jeremyj_83

Metal Messiah. said:
BTW, we have already seen LAB samples featuring dual 3D V-Cache stack Ryzen chips from AMD before, in their labs, but I doubt AMD would go this route even for the Ryzen 9000 series lineup, due to higher cost, power, and thermal drawbacks.
Could they be doing something like putting the 3D V-Cache on the IOD instead of the cores? While that would have longer latency than standard 3D V-Cache, it would make the latency the same for dual-CCD chips, allow for clocks to stay the same, and be lower latency than going to RAM. Heck, it might even be able to act like Infinity Cache and give more effective RAM bandwidth.
Reply
TechyIT223

Superb! Keep them rumors incoming
Reply
artk2219

jeremyj_83 said:
Could they be doing something like putting the 3D V-Cache on the IOD instead of the cores? While that would have longer latency than standard 3D V-Cache, it would make the latency the same for dual-CCD chips, allow for clocks to stay the same, and be lower latency than going to RAM. Heck, it might even be able to act like Infinity Cache and give more effective RAM bandwidth.
I could see that being something that's done on the server side, but not something that they would do on the client side just yet. Those 3D V-Cache dies are not the cheapest things to manufacture; it's why it's only on some CPUs and why they cost extra. Making that part of the IO die that every CPU uses would be more expensive than it's worth for the consumer side. On the professional side they could probably make it work if they can get the margins to work out. That said, 3D V-Cache is still fickle: not everything benefits from the extra cache, and for many uses it's just something extra that isn't being used. Sure, you could offer models with it fused off, or fuse it off on dies that have a defective cache portion. But you could also just make it a separate die and only sell it on certain SKUs for people who know they need it, and for whom it would be a benefit. Maybe in the future, if it becomes cheap enough, they may throw it on every CPU, but probably not for a while.
Reply
jxdking

jeremyj_83 said:
Could they be doing something like putting the 3D V-Cache on the IOD instead of the cores? While that would have longer latency than standard 3D V-Cache, it would make the latency the same for dual-CCD chips, allow for clocks to stay the same, and be lower latency than going to RAM. Heck, it might even be able to act like Infinity Cache and give more effective RAM bandwidth.
The bandwidth of the Infinity Fabric between the IOD and CCD is not great. If 3D V-Cache is on the IOD, it won't offer much benefit compared with accessing DDR5 directly.

I believe there are a couple of improvements they can make over the 7000X3D. They can put the V-Cache under the CCD instead of on top of it. That would solve most of the thermal issues. Also, they can combine V-Cache with a Zen 5c CCD. As Zen 5c won't clock high anyway, the clock drop would be minimal.
Reply
abufrejoval

The biggest issue with V-cache (VC) at the moment is that production makes it a hard-coded choice: your use cases might benefit from higher clocks (~VC) or bigger caches (VC), but evidently you can't have both...

What we don't know is how much of the clock gap between VC and ~VC chips is a) added operational power for the VC b) impeded heat dissipation from the extra layer c) voltage restrictions imposed via VC.

So if it's mostly a) and c) but not so much b), you could turn a VC chip into pretty nearly a ~VC chip simply by deactivating the extra cache, flushing it first and then setting some magic registers to have it go passive.

It might take a moment, it might only work at the granularity of a CCD, but especially in EPYCs it would make some HPC customers really happy to have that choice.

And since that makes it scot-free also for desktops, it's another ace someone might just want to play a little earlier this time around, especially since it really would push those EPYC sales.
Reply
Lucky_SLS

The 7000X3D parts were hamstrung by voltage and thermal limitations. If AMD finds a workaround that lets the 9000X3D chips boost to their full potential, it's a great win for them!
Reply
abufrejoval

Twice the bandwidth is definitely an interesting tidbit. Now I'd only want the sizes of the L1/2 quoted inside the article or the table (not everyone has them in their head), and an honorable mention or comparison to Intel would also have the reader run around a little less.

Doubling the bandwidth clearly means doubling the wires and that is a rather horrible expense in a multi-masking process, but reduces it much more to a surface area sacrifice in EUV.

So it's one of the things where we can see how process transitions also shift chip design decisions in rather distinct manners (if I'm not totally mistaken).
Reply
