Zen 4 Microarchitecture and IPC Measurements
AMD says the Zen 4 architecture is an iterative advance over Zen 3, while Zen 5, which arrives in 2024, will be a ground-up redesign. Zen 4 still brings plenty of changes, though. As you can see throughout the album above, widening the front end to better feed the execution units and improving branch prediction account for 60% of the IPC gain. AMD also increased the op cache by 1.5x, moved to two-branch-per-cycle prediction, improved the load/store units, and doubled the L2 cache capacity. The larger L2 cache adds 2 cycles of L2 latency and 4 cycles of L3 latency. AMD says the added latency isn’t too detrimental because the larger cache delivers higher hit rates that largely offset the penalty.
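To see why a higher hit rate can offset the added latency, here is a minimal sketch of an average-access-latency model. Only the +2 cycle (L2) and +4 cycle (L3) deltas come from the text above. The baseline latencies, the memory latency, and every hit rate are hypothetical placeholders chosen for illustration, not measurements.

```python
def average_access_cycles(l1_hit, l2_hit, l3_hit, l1_lat, l2_lat, l3_lat, mem_lat):
    """Average load latency in cycles for a simple three-level cache model.

    Hit rates are local: the fraction of accesses reaching a level that hit there.
    Latencies are total load-to-use cycles when the data is found at that level.
    """
    l3_cost = l3_hit * l3_lat + (1 - l3_hit) * mem_lat
    l2_cost = l2_hit * l2_lat + (1 - l2_hit) * l3_cost
    return l1_hit * l1_lat + (1 - l1_hit) * l2_cost


# Hypothetical baseline latencies (assumptions, not figures from the article).
BASE = {"l1_lat": 4, "l2_lat": 12, "l3_lat": 46, "mem_lat": 200}
# Apply the deltas the article quotes: +2 cycles for L2, +4 cycles for L3.
BIGGER_L2 = dict(BASE, l2_lat=BASE["l2_lat"] + 2, l3_lat=BASE["l3_lat"] + 4)

# Hypothetical hit rates: the doubled L2 is assumed to lift the L2 hit rate.
old = average_access_cycles(0.95, 0.80, 0.70, **BASE)
new_same_hits = average_access_cycles(0.95, 0.80, 0.70, **BIGGER_L2)
new_more_hits = average_access_cycles(0.95, 0.88, 0.70, **BIGGER_L2)

print(f"baseline: {old:.3f} cycles")
print(f"larger L2, same hit rate: {new_same_hits:.3f} cycles")
print(f"larger L2, higher hit rate: {new_more_hits:.3f} cycles")
```

With these placeholder inputs, the added latency alone makes the average worse, while a modest rise in the L2 hit rate more than recovers it. How real workloads behave depends on their actual hit rates.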
AMD has enabled support for AVX-512 instructions, giving it a curious advantage over Intel, which pioneered the SIMD instructions but ended up disabling them with Alder Lake. AMD describes its AVX-512 implementation as a 'double-pumped' execution of 256-bit wide instructions, meaning it takes two clock cycles to execute an AVX-512 instruction. Even so, the approach provides compatibility with AVX-512 and still boosts performance. It also saves die area and mitigates the frequency and thermal penalties typically seen when Intel's processors execute AVX-512 workloads.
Speaking of the die, the new 6nm I/O die (IOD) measures 122mm^2, roughly the same size as the 124.94mm^2 12nm IOD on the Ryzen 5000 chips, and has 3.4 billion transistors. The Zen 4 compute die (CCD) measures 70mm^2, which is somewhat smaller than the 83.74mm^2 CCD on the Ryzen 5000 processors. Thanks to the much denser N5 process for Ryzen 7000 compared to the 7nm process for Ryzen 5000, the smaller die packs 6.5 billion transistors compared to 4.15 billion for the Zen 3 CCD (a roughly 57% increase for Zen 4).
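The die figures quoted above can be turned into transistor density with simple arithmetic. The sketch below uses only the transistor counts and die areas from the text. The helper names are our own, and the densities are whole-die averages rather than foundry-quoted process densities.

```python
def density_mtr_per_mm2(transistors_billions, area_mm2):
    """Average transistor density in millions of transistors per mm^2."""
    return transistors_billions * 1000 / area_mm2


def percent_change(old, new):
    """Percentage change from old to new, relative to old."""
    return (new - old) / old * 100


# Figures quoted in the article.
zen3_ccd = {"transistors_b": 4.15, "area_mm2": 83.74}
zen4_ccd = {"transistors_b": 6.5, "area_mm2": 70.0}

zen3_density = density_mtr_per_mm2(zen3_ccd["transistors_b"], zen3_ccd["area_mm2"])
zen4_density = density_mtr_per_mm2(zen4_ccd["transistors_b"], zen4_ccd["area_mm2"])

count_change = percent_change(zen3_ccd["transistors_b"], zen4_ccd["transistors_b"])
area_change = percent_change(zen3_ccd["area_mm2"], zen4_ccd["area_mm2"])

print(f"Zen 3 CCD: {zen3_density:.1f} MTr/mm^2")
print(f"Zen 4 CCD: {zen4_density:.1f} MTr/mm^2")
print(f"Transistor count change: {count_change:+.1f}%")
print(f"Die area change: {area_change:+.1f}%")
```

By this measure, the Zen 4 CCD comes out at about 92.9 million transistors per mm^2 versus about 49.6 for Zen 3.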
AMD’s double-pumped AVX-512 implementation results in lower throughput per clock than Intel's method, but the higher clocks offset at least some of the penalty. AMD says AVX-512 provides a 30% increase in multi-core FP32 workloads over Zen 3 and a 2.5X speedup for multi-core int8 operations. As we saw in our own benchmarks, the approach provides a significant performance uplift.
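As a rough illustration of what 'double-pumped' means, the toy model below processes a 512-bit vector of sixteen FP32 lanes on a 256-bit (eight-lane) datapath in two passes. This is a conceptual sketch only: the function name and lane model are our own, and real hardware scheduling is considerably more complex than one pass per cycle.

```python
def execute_double_pumped(a, b, op, datapath_lanes=8):
    """Apply op lane-wise to vectors a and b using a narrower datapath.

    Returns the full result and the number of passes needed. In this toy
    model, each pass stands in for one trip through the execution unit.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of lanes")
    result = []
    passes = 0
    for start in range(0, len(a), datapath_lanes):
        stop = start + datapath_lanes
        # Process one datapath-width slice of the vector per pass.
        result.extend(op(x, y) for x, y in zip(a[start:stop], b[start:stop]))
        passes += 1
    return result, passes


# A 512-bit register holds sixteen 32-bit lanes.
vec_a = [float(i) for i in range(16)]
vec_b = [1.0] * 16

# 256-bit datapath (eight FP32 lanes): two passes for one 512-bit operation.
narrow_result, narrow_passes = execute_double_pumped(vec_a, vec_b, lambda x, y: x + y)
# Full-width 512-bit datapath (sixteen lanes): one pass.
wide_result, wide_passes = execute_double_pumped(
    vec_a, vec_b, lambda x, y: x + y, datapath_lanes=16
)

print(narrow_passes, wide_passes)
```

Both paths produce the same result. The narrower datapath simply takes two passes instead of one, which is why per-clock throughput is lower while instruction compatibility is preserved.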
AMD says the net effect of its Zen 4 architectural enhancements is a 13% increase in IPC over Zen 3. AMD also claims to deliver better power efficiency in a much smaller package than Intel's Alder Lake processors. The company compared its Zen 4 core to Intel's Golden Cove to highlight that it is half the size at 3.84mm^2, yet delivers 47% better power efficiency.
Measuring IPC is tricky, largely because it varies based on the workload. AMD calculated its 13% IPC improvement from 22 different workloads, including gaming, which seems a curious addition due to possible graphics-imposed bottlenecks. AMD also included some multi-threaded workloads. AMD's results show that the IPC improvements vary widely, ranging from 39% in wPrime to 1% in the CPU-Z benchmark.
We tested a limited subset of single-threaded workloads to see the clock-for-clock improvements, locking all chips to a static 3.8 GHz all-core clock with the memory dialed in to the officially supported transfer rate. As you can see, Zen 4 does deliver solid IPC improvements in a multitude of workloads. The y-cruncher and Geekbench 5 crypto scores show disproportionate gains, but that is a result of Zen 4’s support for AVX-512. However, as we saw in the single- and multi-threaded y-cruncher benchmarks, this performance doesn’t scale linearly under heavier multi-core loads.
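The clock-for-clock comparison described above boils down to a ratio of scores at identical frequencies. The sketch below shows that arithmetic and one common way to average gains across workloads. The geometric mean is an assumption on our part (the text does not say how AMD averaged its 22 workloads), and the workload names and scores are made-up placeholders, not our test results.

```python
import math


def ipc_gain_percent(old_score, new_score):
    """Per-workload gain when both chips run at the same clock speed.

    With frequency held constant, the score ratio approximates the IPC ratio.
    """
    return (new_score / old_score - 1) * 100


def geomean_gain_percent(score_pairs):
    """Geometric-mean gain across (old_score, new_score) pairs, in percent."""
    ratios = [new / old for old, new in score_pairs]
    log_mean = sum(math.log(r) for r in ratios) / len(ratios)
    return (math.exp(log_mean) - 1) * 100


# Hypothetical scores at a fixed 3.8 GHz (placeholders for illustration only).
hypothetical_scores = {
    "workload_a": (100.0, 112.0),
    "workload_b": (250.0, 270.0),
    "workload_c": (80.0, 104.0),  # e.g. a test that benefits from AVX-512
}

for name, (old, new) in hypothetical_scores.items():
    print(f"{name}: {ipc_gain_percent(old, new):+.1f}%")

overall = geomean_gain_percent(list(hypothetical_scores.values()))
print(f"geomean: {overall:+.1f}%")
```

A geometric mean keeps a single outlier, such as an AVX-512-accelerated test, from dominating the average as much as it would with an arithmetic mean.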
Ryzen 9 7950X and Ryzen 5 7600X Benchmark Test Setup
We tested the Ryzen 7000 processors with an ASRock X670E Taichi motherboard. We tested all Intel configurations with DDR5 memory, but you can find performance data for DDR4 configurations in our CPU Benchmark hierarchy. We also tested with secure boot, virtualization support, and fTPM/PTT active to reflect a properly configured Windows 11 install.
Our overclocks were rather straightforward — we enabled the auto-overclocking Precision Boost Overdrive (PBO) feature with 'advanced motherboard' settings and adjusted the scalar setting to 10X. For our overclocked configurations, we enabled the DDR5-6000 EXPO profile for the memory kit. This also automatically enables the AMD-recommended Auto setting for the fabric and a 1:1 ratio for the memory frequency and memory controller (Auto:1:1 is the recommended setting for memory overclocking with Ryzen 7000).
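For reference, the 1:1 ratio mentioned above can be expressed as simple clock arithmetic. The sketch below assumes the standard DDR relationship (two transfers per memory clock); the function and parameter names are our own, and the fabric clock is left out because the Auto setting does not fix it to a single value.

```python
def ddr_clocks(transfer_rate_mts, controller_ratio=1.0):
    """Return (memory clock, memory controller clock) in MHz.

    DDR memory transfers data twice per clock, so the memory clock is half
    the transfer rate. controller_ratio is the controller-to-memory clock
    ratio (1.0 for the 1:1 setting).
    """
    memory_clock = transfer_rate_mts / 2
    controller_clock = memory_clock * controller_ratio
    return memory_clock, controller_clock


# DDR5-6000 EXPO profile at 1:1, as used for the overclocked configurations.
oc_mclk, oc_uclk = ddr_clocks(6000)
# DDR5-5200, the stock transfer rate used for the Ryzen 7000 chips.
stock_mclk, stock_uclk = ddr_clocks(5200)

print(oc_mclk, oc_uclk)        # 3000.0 3000.0
print(stock_mclk, stock_uclk)  # 2600.0 2600.0
```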
AMD Socket AM5 (X670E) | Ryzen 9 7950X, Ryzen 5 7600X |
 | ASRock X670E Taichi |
 | G.Skill Trident Z5 Neo DDR5-6000 - Stock: DDR5-5200, OC/PBO: DDR5-6000 |
Intel Socket 1700 DDR5 (Z690) | Core i9-12900K, i7-12700K, i5-12600K, i5-12400 |
 | MSI MEG Z690 Ace |
 | G.Skill Trident Z5 DDR5-6400 - Stock: DDR5-4400, OC: DDR5-6000 |
AMD Socket AM4 (X570) | Ryzen 9 5950X, 5900X, 5700X, 5600X, 5800X3D |
 | MSI MEG X570 Godlike |
 | 2x 8GB Trident Z Royal DDR4-3600 - Stock: DDR4-3200, OC/PBO: DDR4-3800 |
All Systems | Gigabyte GeForce RTX 3090 Eagle - Gaming and ProViz applications |
 | Nvidia GeForce RTX 2080 Ti FE - Application tests |
 | 2TB Sabrent Rocket 4 Plus, Silverstone ST1100-TI, Open Benchtable, Arctic MX-4 TIM, Windows 11 Pro |
Cooling | Corsair H115i, Custom loop |
Overclocking note | All configurations with overclocked memory also have tuned core frequencies and/or lifted power limits. |