Benchmark Results: Sandra 2012
The theoretical gains in moving from two prior-generation Xeon 5600s to a pair of Xeon E5s are impressive, just as the shift from Xeon 5500 to Xeon 5600 was.
A single Core i7-3960X does extremely well compared to a pair of Xeon W5580s. However, the Xeon E5-2687Ws, based on the same Sandy Bridge architecture, benefit from an additional two cores each.
Sandra 2012’s multimedia suite similarly shows the Xeon E5s dominating, even though we turned AVX instructions off to make the results more comparable. Applications optimized for the AVX extensions enjoy even greater throughput.
I didn’t bother running standalone AVX numbers this time around because the core architecture we’re dealing with here is identical to the desktop implementation. If you’d like to see how Intel’s AVX implementation compares to AMD’s, check out this page in AMD Bulldozer Review: FX-8150 Gets Tested, where Cakewalk’s CTO Noel Borthwick gave us access to AVX-optimized routines from Sonar X1 for testing.
Three of the CPUs in this test should support AES-NI. As I discovered when I wrote Intel Xeon 5600-Series: Can Your PC Use 24 Processors?, though, Intel’s Xeon 5600 engineering samples didn’t yet support the feature. As a result, only the Core i7-3960X and Xeon E5s reflect acceleration.
Why the huge performance gap? Well, we have two processors cranking on cryptography versus one, for starters. What might you expect to see from a pair of retail Xeon 5600s in the same test? Lower performance than the E5s, almost certainly. A hardware-based feature like AES-NI is incredibly easy to execute, and we know from tests that I ran in Intel Core i7-3960X Review: Sandy Bridge-E And X79 Express that memory bandwidth is actually the bottleneck in measures of AES256 performance. Thus, a quad-channel memory controller with support for DDR3-1600 has an inherent advantage over a triple-channel controller limited to DDR3-1333.
And here’s a perfect illustration. Although registered DDR3-1600 modules are hard to come by, as mentioned on the previous page, Crucial sent over 64 GB (8 x 8 GB) of PC-12800 memory for our E5-based workstation, enabling close to two times the effective bandwidth on Xeon E5 compared to the Xeon 5600s.
Interestingly, the Core i7-3960X, armed with unbuffered DDR3-1600, is the second-place finisher, even though its four memory channels are theoretically less capable than a pair of triple-channel Xeons armed with DDR3-1333.
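As a rough sanity check on those theoretical figures, the short Python sketch below computes peak memory bandwidth for the three configurations discussed above. It assumes the standard 64-bit (8-byte) DDR3 channel and uses the nominal transfer rates (1600 and 1333 MT/s). The function name and configuration labels are made up for illustration. These are paper maximums, not measured results, so they won't match the benchmark numbers exactly.

```python
# Peak theoretical DDR3 bandwidth:
#   sockets * channels per socket * mega-transfers/s * 8 bytes per transfer
BYTES_PER_TRANSFER = 8  # each DDR3 channel is 64 bits wide


def peak_bandwidth_gbs(sockets, channels_per_socket, mega_transfers):
    """Return peak theoretical bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return sockets * channels_per_socket * mega_transfers * BYTES_PER_TRANSFER / 1000


# (sockets, channels per socket, nominal MT/s); labels are illustrative only
configs = {
    "2x Xeon E5, quad-channel DDR3-1600": (2, 4, 1600),
    "2x Xeon 5600, triple-channel DDR3-1333": (2, 3, 1333),
    "1x Core i7-3960X, quad-channel DDR3-1600": (1, 4, 1600),
}

results = {name: peak_bandwidth_gbs(*cfg) for name, cfg in configs.items()}

for name, gbs in results.items():
    print(f"{name}: {gbs:.1f} GB/s")
```

On paper, then, the dual-socket triple-channel setup (roughly 64 GB/s) should out-muscle the single Core i7-3960X (51.2 GB/s), which is why the desktop chip's second-place finish stands out.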
Even after back-and-forth emails with Adrian Silasi over at SiSoftware, we couldn’t figure out why the cache performance results for the Xeon 5600-series processors were turning out so low (particularly L2 cache bandwidth, which we'd expect to be far higher). One suspicion is that this routine is tripping a throttle due to repeated use of the cache and rapidly-escalating temperatures, though Intel's engineers claim the Xeon 5500s and 5600s don't have this mechanism in place.
It’s clear, however, that the Sandy Bridge-E and Sandy Bridge-EP architectures make big improvements to L3 cache throughput by virtue of their ring buses.