Part 1: DirectX 11 Gaming And Multi-Core CPU Scaling

Ashes of the Singularity & Battlefield 4

Ashes of the Singularity

Ashes of the Singularity is a significant part of this series because it appears in both our DirectX 11 and DirectX 12 tests. For this first part, we're looking at how the game fares with DirectX 11 sitting between it and our graphics hardware. According to Brad Wardell in his postmortem on Gamasutra:

“The difference between DirectX 11 and DirectX 12 is very easy to explain: On DirectX 11, all your threads in your game can talk DirectX at once but DirectX 11 only talks to the GPU one thread at a time. By contrast, on DirectX 12, all the threads in the game can talk to the GPU at the same time. 

"In theory, a 10-Core Intel Broadwell-E, for instance, could do 10x the performance in DirectX 12 than in DirectX 11—provided the video card was fast enough to keep up.”

Ashes actually requires a quad-core CPU (though it’ll run on a dual-core processor, as you’ll soon see). Based on Wardell's interview, this appears related to the game’s AI subsystem. We don’t, however, expect sizeable gains from multiple cores at the resolutions that typically tax high-end graphics cards. At 4K, a GeForce GTX 1080 just isn’t “fast enough to keep up,” borrowing from Wardell’s caveat.

At 1920x1080, we clearly see the 10-core sample score a first-place finish. The ~2% lead may not justify a $1650 price tag, but there it is nonetheless. Intel’s eight- and six-core Broadwell-E processors take second and third place, while the quad-core Skylake-based configuration lands in fourth.

The dual-core example on our chart merits special attention, given its severe performance loss. Remember that we’re testing these CPUs with Hyper-Threading disabled. That feature would make a significant difference if we were comparing real-world configurations. Just because we’re using a Core i3 in this experiment doesn’t mean the outcome represents an actual Core i3. In reality, with HT technology enabled, you’d see a much higher number. If anything, our numbers come closest to what you might see from a 3.9 GHz Pentium.

For all of those enthusiasts who lobbied so hard for us to pull the Pentium G3258 from our Best CPUs column, you were right on the money. Average frame rate is clearly capped, and our frame time chart shows severe spikes that manifest as stuttering on-screen.

Low resolutions are great for evaluating platform bottlenecks because the graphics workload isn’t as taxing, reducing the GPU’s utilization. As you start pushing more pixels through the pipeline, however, the graphics card has a harder time keeping up.
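
The relationship is easy to model: a frame cannot finish faster than the slower of the CPU's work and the GPU's work. The toy program below uses invented, purely illustrative per-frame timings (not our measured data) to show why a faster CPU only pays off while the CPU side is the long pole.

    // Toy bottleneck model: frame time is set by the slower of CPU and GPU time.
    // All timings here are invented for illustration; they are not benchmark results.
    #include <algorithm>
    #include <cstdio>

    int main()
    {
        const double cpuMs   = 8.0;               // hypothetical CPU cost, resolution-independent
        const double gpuMs[] = {6.0, 11.0, 24.0}; // hypothetical GPU cost per resolution
        const char*  res[]   = {"1920x1080", "2560x1440", "3840x2160"};

        for (int i = 0; i < 3; ++i) {
            const double frameMs = std::max(cpuMs, gpuMs[i]); // the bottleneck sets the pace
            std::printf("%s: %5.1f FPS (%s-bound)\n", res[i], 1000.0 / frameMs,
                        cpuMs >= gpuMs[i] ? "CPU" : "GPU");
        }
        return 0;
    }

In this model, shaving CPU time (more cores, higher clocks) lifts the 1920x1080 result but leaves 3840x2160 untouched; that is the pattern our charts trace as resolution climbs.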

Shifting from 1920x1080 to 2560x1440 is enough to erode our 10-core sample’s gains (its ~2% lead shrinks to less than 1%). And instead of being more than 17% faster than the quad-core Skylake config, it’s now 8% quicker. This is a resolution the GTX 1080 is well suited for, so these results are more representative of real-world gaming.

By the time we get to 3840x2160, there’s no longer a quantifiable reason to pair the GeForce GTX 1080 with a 10-core CPU. In fact, our modified Core i7-6950X is in second-to-last place, though for all intents and purposes, the top four finishers are tied.

Only the simulated dual-core CPU at 3.9 GHz is distinctly different, and that processor hasn’t budged since 1920x1080. It is a steadfast bottleneck across resolutions.

Battlefield 4

According to DICE’s Johan Andersson, Battlefield 4’s Frostbite engine is parallelized to run on up to eight CPU cores. However, in a presentation he gave back in 2014, he conceded that it wasn’t possible to utilize all available CPU cores under the then-current implementations of DirectX and OpenGL. In the next part of this series, we’ll look at how Mantle affects performance in this older title. For now, however, we’re analyzing the single-player benchmark from our graphics card suite under DirectX 11.
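
Frostbite’s job system is DICE’s own, but the ceiling Andersson described is visible in the Direct3D 11 API itself: worker threads may record work on deferred contexts, yet only the single immediate context can execute the resulting command lists. The sketch below is our own minimal illustration of that pattern, not DICE code; the RecordPass helper is hypothetical.

    // D3D11's partial answer to multithreading: deferred contexts. Recording can
    // happen on many threads, but execution still funnels through the single
    // immediate context (error handling and Release() calls omitted).
    #include <d3d11.h>
    #include <thread>
    #include <vector>

    // Hypothetical helper: record one render pass on a deferred context.
    void RecordPass(ID3D11DeviceContext* deferred)
    {
        (void)deferred; // real code would set state and issue Draw*() calls here
    }

    void SubmitFrame(ID3D11Device* device, ID3D11DeviceContext* immediate, unsigned threadCount)
    {
        std::vector<ID3D11DeviceContext*> deferred(threadCount, nullptr);
        std::vector<ID3D11CommandList*>   lists(threadCount, nullptr);
        std::vector<std::thread>          workers;

        for (unsigned t = 0; t < threadCount; ++t) {
            device->CreateDeferredContext(0, &deferred[t]);
            workers.emplace_back([&, t] {
                RecordPass(deferred[t]);
                deferred[t]->FinishCommandList(FALSE, &lists[t]);
            });
        }
        for (auto& w : workers) w.join();

        // The serialization Andersson alluded to: one thread replays every list,
        // so the driver's submission cost lands on a single core.
        for (unsigned t = 0; t < threadCount; ++t)
            immediate->ExecuteCommandList(lists[t], FALSE);
    }

However many cores record in parallel, that final loop runs on one of them, which is consistent with the flat results above four cores that follow.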

For as long as we’ve used the Tashgar test, we’ve known that it’s graphics-bound. At the request of our readers, we explored creating a multi-player metric more indicative of how Battlefield 4 is still played years after its introduction. After exchanging emails with DICE, though, it became clear that a consistent, reproducible scenario was not in the cards.

As such, we see no scaling whatsoever above four cores, even though the game is said to utilize additional host processing resources (and it certainly does in multi-player, at least). The only data point that stands out is our simulated dual-core Skylake-based CPU, which lags the rest of the field and demonstrates much higher frame times throughout the 100-second recording.

The graphics workload at 2560x1440 is such that even the dual-core Intel processor averages frame rates similar to the other four configurations. Drill down a little further, though, and you’ll see its performance is less consistent, incurring higher frame times through more of the run. Otherwise, all of the CPUs average within 1 FPS of each other.

Stepping up to 4K is the great equalizer. Our benchmark on rails shows all five core configurations in a similar light, and even the frame time spikes aren’t as numerous or problematic. By this resolution, graphics processing is clearly the limiting factor in our performance measurement.

Comments (55)

  • ledhead11
    Awesome article! Looking forward to the rest.

    Any chance you can do a run-through with 1080 SLI or even Titan SLI? There was another article recently on Titan SLI that mentioned a 100% CPU bottleneck on the 6700K with 50% load on the Titans @ 4K/60Hz.
  • Nolonar
    Wouldn't it have been a more representative benchmark if you just used the same CPU and limited how many cores the games can use?
  • Traciatim
    Looks like even years later the prevailing wisdom of "Buy an overclockable i5 with the best video card you can afford" still holds true for pretty much any gaming scenario. I wonder how long it will be until that changes.
  • nopking
    Your GTA V is currently listing at $3,982.00, which is slightly more than I paid for it when it first came out (about 66x)
  • TechyInAZ
    Traciatim said:
    Looks like even years later the prevailing wisdom of "Buy an overclockable i5 with the best video card you can afford" still holds true for pretty much any gaming scenario. I wonder how long it will be until that changes.


    Once DX12 goes mainstream, we'll probably see a balance of "OCed Core i5 with the most expensive GPU you can afford" for FPS shooters. But for the more CPU-demanding games, it will probably be "Core i7 with the most expensive GPU you can afford" (or a Zen CPU).
  • avatar_raq
    Great article, Chris. Looking forward to part 2, and I second ledhead11's wish to see parts 3 and 4 examining SLI configurations.
  • problematiq
    I would like to see an article comparing 1st-, 2nd-, and 3rd-gen Core i series CPUs to the current generation, as far as "Should you upgrade?" goes. Still cruising on my 3770K, though.
  • Brian_R170
    Isn't it possible to use the i7-6950X for all of the 2-, 4-, 6-, 8-, and 10-core tests by just disabling cores in the OS? That eliminates the other differences between the various CPUs and shows only the benefit of more cores.
  • TechyInAZ
    Brian_R170 said:
    Isn't it possible to use the i7-6950X for all of the 2-, 4-, 6-, 8-, and 10-core tests by just disabling cores in the OS? That eliminates the other differences between the various CPUs and shows only the benefit of more cores.


    Possibly. But it would be a bit unrealistic because of all the extra cache the CPU would have on hand. No quad core has the amount of L2 and L3 cache that the 6950X has.
  • filippi
    I would like to see both the i3 w/ HT off and the i3 w/ HT on. This article would be the perfect spot to show that.
  • littleleo
    I think the price for GTA V is setting the gold standard in game pricing at $3,982, and it is a little... okay, it's a lot, lot, lot more than I would ever pay for a game. I've bought cars for less money, ouch!
  • littleleo
    I've sold more i5 gaming systems than anything else since the first Core i CPUs came out. It would have been nice to have at least one i5; I don't think we needed four i7s, since next to i3s and especially i5s, they are a much, much smaller segment.
  • artk2219
    It would be nice to see a run with AMD's FX chips in the mix, since they give you threads, but at the cost of IPC. You can get an FX 8320e for $89.99 (or an FX 6300, but why would you bother at that price) at Microcenter, for those of us lucky enough to be near one. You can spec out the main components of your build (mobo, CPU, memory, and cooler) for $200 to $220, or a full build without a great graphics card for $350 to $400. With a good graphics card it can be a great value, at least once you bump the clocks on the 8320e (4.0GHz or so).
  • footman
    Great article. Very important to add the results of the dual-core CPU with Hyper-Threading enabled; for all of the current games requiring a quad core, I believe a dual core with Hyper-Threading will work just as well.
  • littleleo
    artk2219 said:
    It would be nice to see a run with AMD's FX chips in the mix, since they give you threads, but at the cost of IPC. You can get an FX 8320e for $89.99 (or an FX 6300, but why would you bother at that price) at Microcenter, for those of us lucky enough to be near one. You can spec out the main components of your build (mobo, CPU, memory, and cooler) for $200 to $220, or a full build without a great graphics card for $350 to $400. With a good graphics card it can be a great value, at least once you bump the clocks on the 8320e (4.0GHz or so).
    Microcenter is evil, I tell you, EVIL!!! Plus they are a 2-hour drive in traffic from my house, yuk! Their in-store CPU specials are awesome. I bought my CPU there back in the day cheaper than I could get it at cost wholesale.
  • TerryLaze
    Seeing GPU being bottlenecked at lower resolutions and going on to test up to 4k ... genius!
    Also agree that the i3 should have been tested with both HT on and off.
  • whtfish
    Great article, but I too would like to see where the i3 with HT on would slot in.
  • AlistairAB
    Bizarre to not take the opportunity to show the i3 with HT on and off in each graph.
  • none12345
    I wouldn't touch a 2-core at this point for a gaming computer. Sure, you can get away with it, but no thanks.

    4-core is enough today, but it won't be tomorrow. It's not even enough today if you do something else besides one thing at a time on a computer, i.e. if you are playing a game and doing something else on a 2nd monitor, or video capturing while playing the game, or anything else. You need more cores if you multitask.

    I'm in the market for a new CPU, but I will not consider a quad core at this point either. Quad core has been milked by Intel for WAY too long; it's 6+ cores or nothing for me at this point. And seeing as how Intel loves to ream you for its enthusiast platform, I guess it's nothing for now.

    Help me, Zen Kenobi, you're my only hope! (Assuming it doesn't suck, and assuming it's priced reasonably.)
  • iam2thecrowe
    Nolonar said:
    Wouldn't it have been a more representative benchmark if you just used the same CPU and limited how many cores the games can use?


    Good point. I think it would have been an even better idea to underclock the CPUs to, say, 2-2.5GHz, and use the lowest possible resolution. That way you completely remove any GPU bottleneck, and the focus would be purely on the number of cores, to determine core scaling. In most of these cases, an 8-core could simply be GPU-bottlenecked vs. a 4-core, and that is the reason you don't see the performance increase.
  • bit_user
    Quote:
    We test five theoretical Intel CPUs in 10 different DirectX 11-based games to determine what impact core count has on performance.
    Useful data, but the interview segments make this a real gem. Thanks!!

    whtfish said:
    Great article, but I too would like to see where the i3 with HT on would slot in.
    Definitely agree. HT scaling (at least up to 4c/8t) should be the next article.
  • bit_user
    Traciatim said:
    Looks like even years later the prevailing wisdom of "Buy an overclockable i5 with the best video card you can afford" still holds true for pretty much any gaming scenario. I wonder how long it will be until that changes.
    Perhaps, but didn't you see this?
    Quote:
    As a side note - measuring only throughput/framerate is not the right thing to do for gaming. Framerate stability/smoothness is of equal priority. For example, a higher-clocked i5 can give higher average framerate, but lower-clocked i7 can deliver more even framerate, depending on the machine config of course.
  • spentshells
    F1 2015 pretty good above 1080p? I wasn't impressed with it at full settings at 1080p.
  • bit_user
    This struck me as a rather silly argument to make, in the context of gaming:
    Quote:
    It’s also worth considering this on a purely theoretical basis. If you have eight cores, you could reduce your processing time to 12.5% if everything shared perfectly, saving you 87.5% compared to one core. But if you add another eight cores, that only takes you down to 6.25%, only saving you a tiny amount. In fact, the biggest saving comes from the first few cores you add, because there will always be work for them to do.
    It's technically correct, but nobody is going to consider gaming with a single core. So, using that as the baseline is ridiculous. Secondly, it's not like this is some render which could either take 10 minutes or 5 minutes, and you just have to decide whether it's worth the $ to save that extra 5 minutes. What we're talking about is up to 2x the throughput. So, if a game is CPU-bottlenecked, then doubling the core count could mean up to 2x framerate improvement.

    That said, he's right that the benefit of adding cores decreases as a function of the number of cores, but more by virtue of the fact that scaling is always sub-linear (assuming well-written software). To his credit, he acknowledges this in his brick-layer analogy.

    BTW, the success of the i7-6700K, on Project Cars, suggests their load-balancing isn't great. Skylake cores simply aren't that much faster than Broadwell, per clock.

    Quote:
    Remember those huge pauses that plagued the i3? They’re ironed out when The Witcher 3 has four threads to work with.
    Those actually suggest lock contention or races involving lock-free data structures. Either way, I'd chalk it up to deficiencies in the software's design. That might also go some ways towards explaining why enabling HT caused the average framerate to increase quite so much.