Sandy Bridge-E: Core i7-3960X Is Fast, But Is It Any More Efficient?

Is Core i7-3960X An Efficiency Winner?

Because we just reviewed Core i7-3960X, most of the performance results in this article come as no surprise. Intel's latest desktop processor is a beast, delivering unmatched per-clock performance with the added benefit of six cores. Thanks to its architecture and scalable ring bus, the beefed-up Sandy Bridge-E design enables two more cores on the desktop, and will soon facilitate up to two more on the Xeon E5 family. Although a quarter of its shared L3 cache is disabled, access to 15 MB still represents a substantial increase over Gulftown's 12 MB. And clock rates as high as 3.9 GHz with one or two cores active (by virtue of Turbo Boost) augment performance in less-optimized apps. Otherwise, the technology manages to push the 3.3 GHz base clock up to 3.6 GHz, even when the chip is fully loaded.
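The Turbo Boost behavior described above can be summarized as a lookup from active-core count to peak clock. A minimal sketch in Python: the one/two-core and all-core bins come from the clocks quoted in the text, while the intermediate three/four-core bins are guesses on our part and should be treated as hypothetical, not Intel's official table.

```python
# Core i7-3960X Turbo Boost bins as described in the text: 3.3 GHz base,
# up to 3.9 GHz with one or two cores active, 3.6 GHz fully loaded.
# The 3- and 4-core bins below are assumptions for illustration only.
TURBO_GHZ = {1: 3.9, 2: 3.9, 3: 3.7, 4: 3.7, 5: 3.6, 6: 3.6}

def max_clock(active_cores: int) -> float:
    """Peak sustainable clock for a given number of active cores."""
    return TURBO_GHZ.get(active_cores, 3.3)

print(max_clock(2))  # 3.9
print(max_clock(6))  # 3.6
```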

All of this is made possible at lower idle power than last generation's hexa-core flagship. And, armed with a modest Radeon HD 6850, system power use drops as low as 62 W. Even a high-end GeForce GTX 580 doesn't push the machine beyond 90 W. Considering that older systems often draw more than 120 W, it's clear that Intel's Sandy Bridge design pushed beyond simply increasing performance.

However, peak system power consumption goes up quite a bit. This is attributable to the large, complex die and the fact that it's pushed hard by technologies like Turbo Boost to maximize performance in every conceivable workload. Because we used a capable closed-loop liquid cooling system, the processor's maximum speeds were maintained for longer intervals than they would be with typical air coolers. We're not looking at an affordable quad-core LGA 1155-based chip here; expensive cooling is the price enthusiasts have to pay for the extra performance.

Sandy Bridge-E is the undisputed performance winner, and, for the first time, I'd label this high-end configuration reasonable with regard to idle power use. However, two additional cores and the extra cache on a very large die impose greater power requirements under load than they add to the performance charts. As a result, Intel's mainstream Sandy Bridge processors end up outshining Sandy Bridge-E in every measure of efficiency. Unless you really need a hexa-core platform for its raw performance (power be damned), existing Core i5 and Core i7 chips able to drop into LGA 1155 are more sensible solutions.

For obvious reasons, efficiency does not scale linearly with core count, necessitating a revised answer to my initial question: Sandy Bridge-E delivers more efficiency than other six-core processors. However, it's hard to make that -E represent efficiency. If that's the metric you're looking to optimize, drop the -E suffix altogether, and save money now (on the hardware) and over time (on power) with Intel's year-old mainstream Sandy Bridge processors.
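The efficiency trade-off described above can be made concrete as work done per watt-hour. A minimal sketch with hypothetical numbers (not the article's measurements), showing how a faster chip can still lose on efficiency if its load power grows disproportionately:

```python
# Illustrative efficiency comparison: tasks completed per watt-hour.
# All numbers below are hypothetical placeholders, not measured data.

def efficiency(task_seconds: float, avg_watts: float) -> float:
    """Tasks completed per watt-hour; higher is better."""
    watt_hours = avg_watts * task_seconds / 3600.0
    return 1.0 / watt_hours

quad_core = efficiency(task_seconds=100.0, avg_watts=150.0)  # mainstream-SB-like
hexa_core = efficiency(task_seconds=80.0, avg_watts=220.0)   # SB-E-like

# The six-core part finishes 25% faster here, but its higher load power
# still leaves it behind in tasks per watt-hour.
print(quad_core > hexa_core)  # True
```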

57 comments
  • fstrthnu
And yet more evidence that most people looking for a high-end processor will be perfectly fine with the i5-2500K or the 2600K
  • sam_fisher
fstrthnu: And yet more evidence that most people looking for a high-end processor will be perfectly fine with the i5-2500K or the 2600K


    I guess it just depends on what you're doing. If you have a high end workstation and are using programs that are going to utilise all 12 threads, quad channel memory and 40 lanes of PCIe, and you need that processing power then it's probably not a bad investment. Whereas for most users the 2500K or the 2600K will do fine.
  • benikens
    Quote:
    Ironically, when it comes to performance, Intel’s Core i7-9360X is the real Bulldozer. Since its power consumption levels are lower than the Gulftown-based Core i7, it should also deliver amazing performance per watt as well. Is that really the case?


    It's i7-3960x, not i7-9360x
  • pwnorbpwnd
Correct me if I'm wrong, but isn't the 6850 a Barts card? I own a 6850 myself.
  • one-shot
    There is a small typo on Page 9

    "Total power used drops again relative to Cor ei7-3960X's predecessor, the Core i7-980X (Gulftown)."
  • Shape
    Quote:
    Ironically, when it comes to performance, Intel’s Core i7-9360X is the real Bulldozer.



    ROFL!!! Very well said!

    Nice!
  • de5_Roy
    another informative, in-depth article about efficiency. great work guys!
    3960x might very well be the $1k cpu that's worth the (over)price unlike the older 980x.
sb-e shows that both single threaded and multi threaded performance as well as efficient power use can be achieved by a 32nm, 6 core, 130 W tdp cpu (but you gotta pay a lot for that).
when you bring price into the equation, quad core sb i5 and i7 (95w tdp) are the best way to go (i wonder how an i7 2700k would fare if it was tested alongside these cpus).
  • agnickolov
    And I was so hoping Visual C++ had made it into the regular benchmark set. Sadly, it's missing here...
  • giovanni86
    Looking forward to seeing what type of Air/liquid cooled Overclocks can be achieved with these newly released processors.
  • I wanna know how it performs on DAW apps. I hope it will be included in future benchmarks.
  • AstroTC
Excellent review, but has anyone else noticed how good looking the LGA 2011 platform setup is? I really like the look of it.
  • jemm
    Great article! I loved that mobo with 4 dimms at each side of the processor.
  • ukee1593
    I am still very pleased with my i5 2500 after reading this article. Sandy Bridge-E's efficiency might be impressive for a high end CPU ... but it still cannot beat the practicality of the Standard Sandy Bridge.

    I can't wait until Ivy Bridge!
  • gsxrme
    by no means am i replacing my 2600k @ 5ghz w/ HT ON and GSkill 2200Mhz Cas7 anytime soon.
  • aldaia
fstrthnu: And yet more evidence that most people looking for a high-end processor will be perfectly fine with the i5-2500K or the 2600K

    Agreed, 2500k is still the sweet spot in the triple trade-off performance/power/cost. This is what I will choose if i needed a replacement, considering the applications I run.
sam_fisher: I guess it just depends on what you're doing. If you have a high end workstation and are using programs that are going to utilise all 12 threads, quad channel memory and 40 lanes of PCIe, and you need that processing power then it's probably not a bad investment. Whereas for most users the 2500K or the 2600K will do fine.

Right, but if I have a truly highly parallel application, then a server with several interconnected nodes offers more bang for the buck. I would consider 4 nodes based on the 2500k, which are probably cheaper than a single 3960X and offer me much more computing power. It all depends on your application.
    But clearly the 3960X is for a niche market, either because it really fits your needs or the "bragging rights" niche market.
  • AppleBlowsDonkeyBalls
aldaia: Agreed, 2500k is still the sweet spot in the triple trade-off performance/power/cost. This is what I will choose if I needed a replacement, considering the applications I run. Right, but if I have a truly highly parallel application, then a server with several interconnected nodes offers more bang for the buck. I would consider 4 nodes based on the 2500k that probably are cheaper than a single 3960X and offer me much more computing power. It all depends on your application. But clearly the 3960X is for a niche market, either because it really fits your needs or the "bragging rights" niche market.


    The i7-3930K is pretty decent for the price, though. At the same clocks as the 3960X it's the same speed, and all reviews featuring both have it achieving the same overclocks, sometimes at lower voltage. Unless it's for bragging rights or epeen the 3930K is clearly a better choice since the extra cache seems to be useless for desktops and it isn't even better binned.
  • CaedenV
    The more benchmarks I read the happier I am with my i7 2600 :) It is right behind the new big boy, and only cost $250 at my local computer hardware store compared to $1000 to get a few extra seconds off.

    What will be really interesting to see is what happens with the IB release. Last time the mainstream SB could meet or beat the old high end chips, for 1/3 the price. I wonder if the IB release will do the same thing, or if Intel will downplay the performance so as not to piss off their high-end buyers again.
  • AppleBlowsDonkeyBalls
CaedenV: The more benchmarks I read the happier I am with my i7 2600. It is right behind the new big boy, and only cost $250 at my local computer hardware store compared to $1000 to get a few extra seconds off. What will be really interesting to see is what happens with the IB release. Last time the mainstream SB could meet or beat the old high end chips, for 1/3 the price. I wonder if the IB release will do the same thing, or if Intel will downplay the performance so as not to piss off their high-end buyers again.


Ivy Bridge is a die shrink focused mostly on lowering power consumption and improving IGP performance. CPU performance improvements will be few: according to AnandTech, 4-6% higher IPC than Sandy Bridge, and since Intel is focusing on power consumption, clock speeds won't be much higher than SB's, so about a 5% improvement there too. Expect about 10% more CPU performance at most. Sandy Bridge-E will still be significantly faster in multi-threaded workloads.
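The ~10% figure in the comment above comes from compounding the two estimated gains multiplicatively rather than adding them. A quick sketch of that arithmetic (the percentages are the commenter's estimates, not measurements):

```python
# Compound two independent speedups: IPC gain and clock-speed gain.
# Both values are the commenter's rough estimates, not measured data.
ipc_gain = 0.05    # ~4-6% IPC improvement (midpoint)
clock_gain = 0.05  # ~5% higher clocks

total = (1 + ipc_gain) * (1 + clock_gain) - 1
print(f"~{total * 100:.0f}%")  # ~10%
```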
  • TeraMedia
    Based on the results for some of the multi-threaded tests, it appears as if the turbo boost on SB-E is getting modulated more often than the turbo boost on 2600K. It would be very interesting to see a multi-threaded test in which turbo boost was turned off, and the clocks of both were set at the same rate, e.g. 3.6 or 3.9 GHz, whatever the cooler will bear. Also supporting this idea is that several of the configurations appear to max out at right around 200-210 watts peak power. So if the thermal limiter threshold is kicking in for SB-E to keep it within its power budget, that could explain the "better, but not way better" performance between SB-E and 2600K. Would such a test be feasible, Toms?
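TeraMedia's hypothesis, that SB-E throttles Turbo Boost to stay inside a roughly 200-210 W system power ceiling, can be sketched as a simple duty-cycle model. All figures below are hypothetical, and the linear power-vs-frequency assumption is a deliberate simplification (real chips scale worse, since voltage also rises with clock):

```python
# Toy model: average sustained clock under a package power cap.
# Assumes power scales linearly with frequency at fixed voltage,
# which understates the real penalty; numbers are illustrative only.

def effective_clock(base_ghz, turbo_ghz, power_at_turbo_w, power_cap_w):
    """If turbo power exceeds the cap, the chip duty-cycles between
    turbo and base clocks; return the average sustained clock."""
    if power_at_turbo_w <= power_cap_w:
        return turbo_ghz
    power_at_base = power_at_turbo_w * base_ghz / turbo_ghz
    # Fraction of time the chip can spend at turbo without tripping the cap.
    duty = (power_cap_w - power_at_base) / (power_at_turbo_w - power_at_base)
    duty = max(0.0, min(1.0, duty))
    return base_ghz + duty * (turbo_ghz - base_ghz)

# Hypothetical SB-E-like figures: 3.3 GHz base, 3.6 GHz all-core turbo.
print(effective_clock(3.3, 3.6, power_at_turbo_w=145, power_cap_w=140))
```

Under this model, a cap below the full-turbo draw pushes the average clock partway back toward base, which would show up in benchmarks exactly as "better, but not way better."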
  • danraies
    I work in engineering and many of our employees have heavily multithreaded applications running at their personal machines sometimes for days on end. This is obviously the kind of place SB-E chips will thrive unless IB blows them out of the water. Obviously these $1K chips are not the right choice for enthusiast gaming PC's and they're arguably not the best choice for servers as they get outshined by cheaper chips over several nodes. However there are certainly applications where 6+ cores at 3.3ghz+ are worth $1K and SB-E steps in where Bulldozer failed.
  • @danraies

Actually, no. Even with this mass of seething power, if it's taking all day to finish one of their runs, then this thing is only really going to shave off an hour or two; it won't make a drastic impact. And you might want to go back and revisit those Bulldozer benchies, because if I recall correctly, much to my surprise, Bulldozer did quite well in the productivity apps department.

If you really want to increase throughput and money is no object, look into a Tesla setup or offload all the work to a grid setup.
  • billj214
There are a lot more cost-conscious people reading this forum who know there's no reason to pay $1000 for a ~10% performance gain. Intel knows there are people who have money to burn and will always buy the fastest CPU just because they can.
    I have many friends who are speed junkies and want something just because it's the fastest, kind of like an addiction! Triple SLI or Crossfire with i7 990X water cooled etc.

I'm sure everyone remembers how amazed we were at the Sandy Bridge launch; just wait for Ivy Bridge, because it will be the real "Bulldozer" IMO.
  • andywork78
I made a good choice with Bulldozer....
Ivy is great, really good.
But the price fails so hard...
Because the best of the best costs more than $1k....
eeee, not for me....
  • mapesdhs
aldaia: ... Right, but if I have a truly highly parallel application, then a server with several interconnected nodes offers more bang for the buck. ...


    A few points to ponder:

a) Many 'truly' parallel apps don't scale well when running across networked nodes, i.e. clusters. It depends on the granularity of the code. Some tasks just need as much compute power as possible in a single system. This can be mitigated somewhat with InfiniBand and other low-latency network connection technologies, but the latencies are still huge compared to local RAM access. If the code on a particular chip doesn't need any or much access to the data held by a different node, then that's great, and some codes are certainly like this, but others are definitely not, i.e. a cluster setup doesn't work at all. When 2-socket and the rarer 4-socket boards can't deliver the required performance, companies use shared memory systems instead, which are already available with up to 256 sockets (2560 cores max using the Xeon E7 family), though it's often quite difficult to get codes to scale well beyond 64 CPUs (huge efforts are underway in the cosmological community atm to achieve good scaling up to 512 CPUs, e.g. with the Cosmos machine).

b) Tasks that require massive processing often require a lot of RAM, way more than is supported on any consumer board. Multi-socket Xeon boards offer the required amount of RAM at the expense of more costly CPUs, but they do deliver the required performance too if that is also important. ANSYS is probably the most extreme example; one researcher told me his ideal workstation would be a single-CPU machine with 1TB RAM (various shared memory systems can have this much RAM and more, but more CPUs is also a given). X58 suffered from this somewhat, with consumer boards only offering 24GB max RAM (not enough for many pro tasks), and the reliability of such configs was pretty poor if you remember way back to when people first started trying to use non-ECC DIMMs to max out X58 boards.

c) Many tasks are mission critical, i.e. memory errors cannot be tolerated, so ECC RAM is essential, something most consumer boards can't use (or consumer chips don't support). Indeed, some apps are not certified for use on anything other than ECC-based systems.

d) Some tasks also require enormous I/O potential (usually via huge FC arrays), something again not possible with consumer boards (1GB/sec is far too low when a GIS dataset is 500GB+). Even modern render farms have to cope with such quantities of data for single-frame renders. It's often about so much more than just the CPU muscle inside the workstation, or as John Mashey put it years ago, "It's the bandwidth, stupid!", i.e. sometimes it's not how fast one can process, it's how much one can process (raw core performance is less critical than I/O capability). Indeed, render farm management systems may even deselect cores in order to let the remaining cores make better use of the available memory I/O (depends on the job), though this was more of an issue for the older FSB-based Xeons.

And then sometimes raw performance just rules, the cost be damned. Studio frame rendering is certainly one such example; I've no doubt IB Xeon will be very popular for this. Thousands of cores is fairly typical for such places.

    SB-E is great, but it's at the beginning of the true high performance ladder, not the end.

    Ian.
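mapesdhs' point (a), that parallel code often scales poorly across networked nodes, is essentially Amdahl's law with a communication penalty added. A hypothetical sketch (the parallel fraction and per-node overhead are illustrative, not measured):

```python
# Amdahl's law extended with a per-node communication overhead term.
# All parameters are illustrative, not measurements of any real cluster.

def speedup(nodes, parallel_fraction, comm_overhead_per_node=0.0):
    """Speedup over one node; overhead grows with node count."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial
                  + parallel_fraction / nodes
                  + comm_overhead_per_node * nodes)

# Fine-grained code (2% overhead per node) stops scaling quickly...
print([round(speedup(n, 0.95, 0.02), 2) for n in (1, 2, 4, 8)])
# ...while a single big shared-memory box pays no network penalty.
print(round(speedup(8, 0.95), 2))
```

The model captures the trade-off in the comment: past a certain node count, the communication term dominates and adding cheap nodes buys nothing, which is why shared-memory systems persist for tightly coupled codes.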