Benchmark Results: Sandra 2011
Sandy Bridge-E has little trouble jumping to the top of the Arithmetic test, ahead of Intel’s outgoing Core i7-990X. Given Sandra 2011’s synthetic nature, it’s no surprise to see it exploiting all aspects of these eight processors.
Using SSE4.1 (integer) and SSE2 (floating-point), Core i7-3960X slides right past Core i7-990X for the number one spot. Those figures improve dramatically with the implementation of AVX, though.
One of the things I noticed in “Intel Core i7-3960X (Sandy Bridge-E) And X79 Platform Preview” was that Sandy Bridge-E enabled significantly better AES256 bandwidth than Gulftown or Sandy Bridge. That advantage persists in the C1 stepping, nearly doubling Core i7-2600K’s result in the Cryptography benchmark. Intel confirms that it made changes to enhance AES throughput, but doesn’t elaborate on what it did.
Not satisfied, I did a little digging and started pulling memory modules. With three channels of memory, Sandy Bridge-E achieves 8 GB/s AES256 bandwidth. Two channels deliver 5.43 GB/s. And a single channel of memory yields 2.72 GB/s. It seems that AES-NI is very much constrained by memory throughput (given that it's accelerated in hardware, and consequently very easy to execute), so it looks like the changes Intel suggested are tied to its memory controller, rather than its AES-NI implementation.
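To make the scaling explicit, here is a minimal sketch that compares the three Sandra AES256 results quoted above against ideal linear scaling from the single-channel figure. The function and variable names are illustrative only and are not part of Sandra or any benchmark tool.

```python
# AES256 bandwidth figures quoted in the text (GB/s), keyed by the
# number of populated memory channels.
measured_gbps = {1: 2.72, 2: 5.43, 3: 8.0}


def scaling_report(results):
    """Return (channels, measured, ideal, efficiency) rows.

    "ideal" assumes perfectly linear scaling from the single-channel
    result, which is an assumption made here for comparison only.
    """
    base = results[1]
    rows = []
    for channels in sorted(results):
        ideal = base * channels
        rows.append((channels, results[channels], ideal, results[channels] / ideal))
    return rows


for channels, measured, ideal, efficiency in scaling_report(measured_gbps):
    print(f"{channels} channel(s): {measured:.2f} GB/s measured, "
          f"{ideal:.2f} GB/s ideal, {efficiency:.1%} of linear")
```

Each added channel contributes roughly another 2.7 GB/s, and the three-channel result stays within about 2% of linear, which is what you would expect if memory bandwidth, and not the AES-NI units, were the limiting factor.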
Hoping for some correlation to real-world performance (and a reason to get more excited about four 64-bit channels on the desktop), I ran a few tests in the latest stable build of TrueCrypt using the built-in benchmark and a 1 GB buffer. The mean result climbs from 3.8 GB/s in single-channel mode to 5.2 GB/s using two channels, but performance fails to scale beyond that, indicating a bottleneck other than the speed at which the processor can encrypt and decrypt data.
The memory bandwidth advantage of a quad-channel DDR3-1600 bus is clearly evident in Sandra 2011, which manages to realize around 37 GB/s of a theoretical 51.2 GB/s maximum.
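The 51.2 GB/s ceiling follows directly from the bus configuration: four channels, each 64 bits wide, at 1600 million transfers per second. A short sketch of that arithmetic, with a hypothetical helper name chosen for illustration:

```python
def peak_bandwidth_gbps(channels, mega_transfers_per_sec, bus_width_bits=64):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10^9 bytes).

    DDR3-1600 performs 1600 million transfers per second, and each
    channel moves bus_width_bits / 8 bytes per transfer.
    """
    bytes_per_transfer = bus_width_bits // 8
    return channels * mega_transfers_per_sec * 1e6 * bytes_per_transfer / 1e9


peak = peak_bandwidth_gbps(channels=4, mega_transfers_per_sec=1600)
measured = 37.0  # approximate Sandra 2011 result quoted in the text

print(f"Theoretical peak: {peak:.1f} GB/s")
print(f"Measured:         {measured:.1f} GB/s ({measured / peak:.0%} of peak)")
```

By this math, Sandra is realizing a little over 70% of the theoretical figure. The same function gives 25.6 GB/s for a dual-channel DDR3-1600 configuration.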
Impressive though that number is, keep it in context. Sandy Bridge, with its dual-channel DDR3 memory controller, already showed that it wasn’t particularly starved for memory bandwidth in most desktop software. Practically, there won’t be many apps able to exploit those big throughput numbers. Perhaps that’ll change in the first quarter of next year when Sandy Bridge-E turns into Xeon E5 for dual-socket servers.