
Results: Sandra 2014 And 3DMark

Intel Xeon E5-2600 v2: More Cores, Cache, And Better Efficiency

In Intel Xeon E5-2600: Doing Damage With Two Eight-Core CPUs, we saw just how much faster a pair of Sandy Bridge-EP-based Xeon E5s was than Westmere-EP- or Nehalem-EP-based Xeons. Intel ramps up the core count of its business-oriented products more aggressively than on the desktop, so stepping up from four to six and then to eight cores per socket translates into big gains in threaded software.

The transition to 22 nm manufacturing allows Intel to create up to 12-core Xeon E5-2600 v2 CPUs. However, the replacement for its original Xeon E5-2687W is another eight-core model. Instead of adding more processing resources, Intel increases shared L3 cache to 25 MB and bumps up clock rates. Those alterations, folded in on top of the architectural changes to Ivy Bridge, result in a minor improvement to Sandra’s integer math benchmark, and a more marked speed-up in double-precision calculations.

Of course, both dual-processor setups demonstrate a significant advantage in raw processing power compared to one Core i7-4960X.

As we know from Intel Core i7-3770K Review: A Small Step Up For Ivy Bridge, the company didn’t make a ton of compelling architectural changes to its IA cores. The Xeon E5-2687W v2 does enjoy the advantage of more aggressive clock rates compared to its predecessor, though AVX support across the board means all three configurations benefit.

Even in single-processor configurations, Intel’s quad-channel memory controller facilitates lots of bandwidth. The Core i7-4960X manages more than 40 GB/s at DDR3-1866. Two Xeon E5-2687W CPUs almost double that number using DDR3-1600, achieving 74 GB/s. The Xeon E5-2687W v2s increase maximum throughput almost 10%, cresting 80 GB/s.
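Those measured figures line up with what the memory configuration allows on paper. As a sanity check, peak DDR3 bandwidth is just channels times transfer rate times 8 bytes per 64-bit transfer; a minimal Python sketch (the measured numbers come from the Sandra run above, the formula is standard DDR3 math):

```python
# Back-of-the-envelope check on the Sandra bandwidth numbers:
# theoretical peak = transfer rate (MT/s) x 8 bytes/transfer x channels x sockets.
def peak_gbps(mts, channels, sockets=1):
    """Theoretical peak DDR3 bandwidth in GB/s (1 GB/s = 1e9 B/s)."""
    return mts * 1e6 * 8 * channels * sockets / 1e9

i7_peak = peak_gbps(1866, 4)       # one socket, quad-channel DDR3-1866
xeon_peak = peak_gbps(1600, 4, 2)  # two sockets, quad-channel DDR3-1600

print(round(i7_peak, 1))    # 59.7 GB/s theoretical vs. ~40 GB/s measured
print(round(xeon_peak, 1))  # 102.4 GB/s theoretical vs. ~74 GB/s measured
```

In both cases Sandra achieves roughly two-thirds to three-quarters of the theoretical peak, which is typical efficiency for a streaming bandwidth test.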

We also know that the inclusion of AES-NI in all three of these workstations means that instructions are executed as fast as they’re fed from RAM, making this a bandwidth-constrained task. As we’d expect, performance scales accordingly.
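To see why AES-NI pushes this test into bandwidth-bound territory, compare aggregate encryption capacity against DRAM throughput. The cycles-per-byte and clock figures below are assumed ballpark values for illustration, not measurements from these systems:

```python
# Rough illustration of why Sandra's AES test ends up memory-bound.
# Assumption: pipelined AES-NI in a parallelizable mode costs on the
# order of ~0.65 cycles/byte (a ballpark figure, not a measured one).
CYCLES_PER_BYTE = 0.65
CLOCK_GHZ = 3.0   # assumption: a typical all-core Xeon clock
CORES = 16        # two eight-core sockets

per_core = CLOCK_GHZ / CYCLES_PER_BYTE  # GB/s one core can encrypt
aggregate = per_core * CORES            # GB/s across both sockets
print(round(aggregate))                 # compute capacity rivals DRAM bandwidth
```

With aggregate AES throughput in the same neighborhood as the 74-80 GB/s the memory subsystem delivers, the cores spend their time waiting on RAM rather than on the AES rounds themselves.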

The hashing benchmark is handled by the x86 cores, so the six-core -4960X understandably manages less than half of the throughput posted by both 16-core configurations.
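The scaling behavior is easy to demonstrate, since hashing independent buffers parallelizes trivially. A minimal Python sketch using hashlib's SHA-256 as a stand-in for Sandra's hashing workload (absolute numbers will vary by machine; the point is the near-linear growth with worker count):

```python
# Minimal sketch of core-count scaling in a hashing workload: each worker
# hashes its own buffer, so throughput grows roughly linearly with cores
# until memory bandwidth or core count runs out.
import hashlib
import time
from multiprocessing import Pool

BUF = b"\x00" * (8 * 1024 * 1024)  # 8 MB buffer hashed by each worker

def hash_buffer(_):
    # Hash one buffer; the argument is just a work-item index from Pool.map.
    return hashlib.sha256(BUF).hexdigest()

def throughput(workers, rounds=2):
    # Aggregate GB/s across `workers` processes, each hashing `rounds` buffers.
    start = time.perf_counter()
    with Pool(workers) as pool:
        pool.map(hash_buffer, range(workers * rounds))
    elapsed = time.perf_counter() - start
    return workers * rounds * len(BUF) / elapsed / 1e9

if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, "workers:", round(throughput(n), 2), "GB/s")
```

On a 16-core dual-socket box the curve keeps climbing well past four workers, which is exactly why the six-core -4960X posts less than half the Xeons' throughput here.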

Given the older workstation-oriented GPU in our test system, the only data point worth looking at from 3DMark is the threaded Physics test outcome. Clearly the benchmark doesn't scale according to core count. But the newer Xeon E5-2687W v2 does appear to gain from its larger shared L3 cache and higher stock clock rates.

Comments

  • GL1zdA1, January 7, 2014 12:54 AM
    Does this mean that the 12-core variant, with two memory controllers, will be a NUMA CPU, with cores seeing different latencies when accessing memory depending on which memory controller is nearest?
  • Draven35, January 7, 2014 1:43 AM
    The Maya playblast test, as far as I can tell, is very single-threaded, just like the other 3d application preview tests I (we) use. This means it favors clock speed over memory bandwidth.

    The Maya render test seems to be missing O.o
  • Cryio, January 7, 2014 2:43 AM
    Thank you, Tom's, for this Intel server CPU review. I sure hope you'll review AMD's upcoming 16-core Steamroller server CPU.
  • Draven35, January 7, 2014 3:08 AM
    Tell AMD that.
  • cats_Paw, January 7, 2014 3:09 AM
    Dat Price...
  • voltagetoe, January 7, 2014 3:10 AM
    If you've got 3ds Max, why don't you use something more serious/advanced like mental ray? The default renderer's tech represents the distant past, like 1995.
  • lockhrt999, January 7, 2014 3:13 AM
    "Our playblast animation in Maya 2014 confounds us." @canjelini: Apart from rendering, most of the tools in Maya are single-threaded (most of the functionality has stayed the same in this two-decade-old software). So benchmarking a Maya playblast is much like benchmarking an iTunes encode.
  • daglesj, January 7, 2014 3:33 AM
    I love Xeon machines. As they are not mainstream you can usually pick up crazy spec Xeon workstations for next to nothing just a few years after they were going for $3000. They make damn good workhorses.
  • InvalidError, January 7, 2014 6:20 AM
    @GL1zdA1: the ring bus already means every core has a different latency to any given memory controller. Memory controller latency is not as much of a problem with massively threaded applications, since there is still plenty of other work that can be done while a few threads are stalled on IO/data. Games and most mainstream applications have one or two performance-critical threads; the remainder of their 30-150 other threads are mostly non-critical automatic threading from libraries, application frameworks, and various background or housekeeping work.
  • mapesdhs, January 7, 2014 8:03 AM
    Small note: one can of course manually add the Quadro FX 1800 to the relevant file (raytracer_supported_cards.txt) in the appropriate Adobe folder, and it will work just fine for CUDA, though of course it's not a card anyone who wants decent CUDA performance with Adobe apps should use (one or more GTX 580 3GB or 780 Ti cards is best).

    Also, hate to say it, but showing results for the card with OpenCL without showing what happens to the relevant test times when the 1800 is used for CUDA is a bit odd...

    Ian.

    PS. I see the messed-up forum posting problems are back again (text all squashed up; I have to edit on the UK site to fix the layout). Really, it's been months now; is anyone working on it?

  • Draven35, January 7, 2014 1:48 PM
    Quote:
    If you've got 3ds max, why don't you use something more serious/advanced like Mental Ray ? The default renderer tech represent distant past like year 1995.


    The 3dsMax test does use mental ray. Our Maya render test also uses mr, and the other Max render test uses VRay.

  • ddpruitt, January 7, 2014 7:22 PM
    Quote:
    Our Core i7 and dual Sandy Bridge-EP-based Xeons score similarly. Meanwhile, the -2687W v2 crushes this test.

    There are a number of reasons this could occur; without knowing the exact input and output, any attempt to explain it is guessing (my first guess is that the output is different, perhaps a bug). The fact that these benchmarks are run the same way Tom's runs its desktop CPU benchmarks suggests that whoever ran them didn't really know what they were doing. These CPUs are fast, but this article does them a disservice. I would like to see power measurements with the systems running the same workload for the same period of time. True, such machines are rarely loaded 24x7, but they are generally running 24x7. The workloads should also match the hardware: these systems had 64 GB of RAM, and not a single one of the benchmarks would have stressed that amount appropriately. If you're actually reading and writing that much RAM, it can have an impact on power consumption.
  • Draven35, January 7, 2014 7:46 PM
    We don't have any tests specifically oriented toward 64 GB of RAM, but the AE test will use as much as you throw at it. At some point, though, the AE test runs into the storage limitation, which is what I suspect was happening here. I also don't know whether Chris was using the SD or HD version of the AE test. If the machine had 128 GB of RAM, he could have run the AE test from a RAM disk, thus removing storage from the equation. Without just throwing random items into a scene, it is difficult to create a test meant for 64 GB systems without a 64 GB system to create it on. Some of these tests were originally created on the HP z400 we reviewed a few years back, and the newer Maya and Max tests I use on workstations were authored on our baseline workstation. I am now developing updated tests on an HP z600 with 12 GB of RAM, but the extra RAM will be nearly inconsequential in the actual tests, because I'll still need to be able to run them on the baseline workstation (8 GB of RAM). My personal feeling is that the Premiere test is getting a little long in the tooth, as is the AE test (hence why I went back and redid it for HD).
  • oxiide, January 7, 2014 9:12 PM
    I know this chip is absolutely inappropriate for gaming, but it would still be somewhat interesting to me to see how it stacks up in the usual gaming benchmarks. Multithreading in games is getting a little better, and the results might show Intel's potential if they weren't so heavily focused on power consumption in their recent consumer-level products. On the other hand, it might be completely predictable and boring. But I think it would still be interesting to find out for sure.

    Again, obviously, I know the productivity benches are what's important here. I know no one's gaming on a server processor, like ever. But while you've got a review sample, why not experiment a little? :) 

    Great review as always.
  • puppetMaster3, January 7, 2014 9:27 PM
    What are 'segmented' models? The term is all over the article and never explained.
  • Crass Spektakel, January 8, 2014 2:07 AM
    Right now I am using a system built in 2008 with two Xeon 5450s (that's basically eight cores at 3.0 GHz from the last Core 2 Q9000 series), 16 GB of RAM, and a Radeon 280X for work, development, and also gaming. It runs quite well; even a slightly overclocked i7-4770 doesn't beat this old war horse (at least not always). The i7-49xx, on the other hand, is quite superior to my old Xeons, especially when overclocked (my Xeon board offers no overclocking at all).
  • daglesj, January 8, 2014 3:30 AM
    Yes, I'm running a 2008 Dell Precision T5400 with a single 2.8 GHz quad Xeon (soon to be a dual 2.8 GHz quad setup this week), 16 GB of ECC RAM, and 1 TB SSHD drives. Fantastic machine. It takes everything I throw at it and will take even more once I get up to eight cores. I might treat it to a matched pair of 3.3 GHz chips in the summer.
  • Draven35, January 8, 2014 4:01 AM
    Quote:
    I love Xeon machines. As they are not mainstream you can usually pick up crazy spec Xeon workstations for next to nothing just a few years after they were going for $3000. They make damn good workhorses.


    I have an HP z600 with two 2.26 GHz Xeon 5520s, 12 GB of RAM, and two 500 GB hard drives... total invested: $550. It's my personal 3D machine and benchmark development machine. Going to bump it up to 24 GB shortly.
  • DoDidDont, January 8, 2014 4:10 AM
    I think for people who already own a pair of E5-2687W processors, the upgrade to the v2 version isn't really worth it from a price standpoint. If it's purely for work and multi-threaded apps, then the E5-2690 v2, being in the same price bracket, would be a better choice: a very slight drop in clocks, but more cores. Shame Tom's didn't include the 2690 v2 and 2697 v2 in the multi-threaded tests, for people considering whether or not to upgrade their Sandy Bridge platforms. There is definitely something wrong with Tom's Cinebench results. Every benchmark I have seen online scores a pair of E5-2687Ws at around the 26 mark in the multi-CPU test, and my own E5-2687Ws' lowest score has been 25.6 and highest 26.22, so I'm guessing the results should be around 26 for the E5-2687W and 28 for the v2.
  • Michael Robinson, January 8, 2014 5:13 AM
    I still find it difficult to understand where you'd use them, though. Sure, the idea of 12 cores on each of two CPUs (and, I assume, 48 threads) sounds like a geek's dream, but with so much graphics work capable of being offloaded onto the GPU, I'm not sure what the point is. It would have been interesting to see some figures for CPU-bound games as well; I know these things aren't for gaming, but it would allow us non-workstation users to contrast and compare. It would also have been interesting to run tests with various programs (limiting the cores being used) to see whether huge cache sizes make much of a difference, and how. A similar thing could be done with the number of cores, to see which games really can use each and every core.