New microarchitectures from Intel have been greeted with a mixture of excitement and trepidation ever since Netburst arrived on the scene. While the Pentium 4 generation was arguably successful from a revenue standpoint, it proved to be something of a heat-prone dead end. Conroe redeemed Intel’s reputation, and the next generation, Nehalem, enhanced it further. Will the third redesign since Netburst continue that streak?
When it comes to the overall product mix, Sandy Bridge, or as Intel is calling it, “Second Generation Core Architecture,” will eventually consolidate the mainstream desktop and mobile product lines into one process generation. The current mainstream desktop lineup consists of Lynnfield (45 nm quad-core) and Clarkdale (32 nm dual-core). Mobile CPUs are similarly bifurcated as Clarksfield (45 nm quad-core) and Arrandale (32 nm dual-core).
Sandy Bridge will be available in both dual-core and quad-core versions for both desktop and mobile PCs. Intel’s next-generation HD graphics will be fully integrated onto the CPU die, not just the package, as was the case with Arrandale/Clarkdale.
As with Nehalem and Westmere, Sandy Bridge has a split L1 cache with separate data and instruction caches, and a dedicated 256 KB L2 cache per core.
Built on Intel’s existing 32 nm process, the microarchitecture includes a variety of key enhancements to the current Westmere/Nehalem architecture:
- A new cache was added for decoded micro-ops (uOps). When decoded uOps are loaded from this cache, the x86 decode pipeline is turned off, saving power.
- The branch prediction engine was improved, increasing overall throughput.
- The architecture now supports two load/store ports, instead of just one. The data cache can handle two reads and one store per clock cycle.
- The out-of-order execution engine was rebuilt from scratch so that Intel could integrate support for 256-bit AVX floating-point instructions into the pipeline. The AVX pipeline now includes a physical register file, decreasing data duplication and transfers. Intel estimates that use of the new instructions will double floating-point throughput over the current SSE implementation. Note that Windows users will need Windows 7 SP1 (currently in beta) for apps to make use of AVX.
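The 2x estimate for AVX follows from register width alone: a 256-bit register holds twice as many floating-point values as a 128-bit SSE register. The short Python sketch below works through that arithmetic. The function names are illustrative, and the result is a theoretical per-instruction ratio that assumes the rest of the pipeline keeps up. It is not a measured benchmark.

```python
# Back-of-the-envelope arithmetic only: values per vector register.
# Assumes one vector instruction operates on every lane at once and that
# nothing else (memory bandwidth, decode, scheduling) limits throughput.

SSE_BITS = 128  # width of an SSE (XMM) register
AVX_BITS = 256  # width of an AVX (YMM) register


def lanes(register_bits: int, element_bits: int) -> int:
    """Number of elements of the given width that fit in one register."""
    return register_bits // element_bits


def peak_ratio(element_bits: int) -> float:
    """Theoretical AVX-to-SSE ratio of elements processed per instruction."""
    return lanes(AVX_BITS, element_bits) / lanes(SSE_BITS, element_bits)


if __name__ == "__main__":
    for name, bits in (("single precision", 32), ("double precision", 64)):
        print(
            f"{name}: SSE {lanes(SSE_BITS, bits)} lanes, "
            f"AVX {lanes(AVX_BITS, bits)} lanes, "
            f"ratio {peak_ratio(bits):.1f}x"
        )
```

For both single- and double-precision values the ratio comes out to 2.0, matching Intel's estimate. Real applications would likely see less, since only code recompiled or rewritten to use the new instructions benefits.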
The overall CPU is highly modular, allowing Intel to easily build chips with differing numbers of cores, cache sizes, and even GPU execution units.