Editor's Corner: Getting Benchmarks Right

Now We're Getting Somewhere...

So where’s the beef?

After a full day of breaking hardware down, reconfiguring software, downloading drivers, and endless loops of Far Cry 2, our results are still looking good. We ran them and re-ran them before Socket AM3 was launched, and we ran multiple iterations of them again in response to reader commentary on that story itself. AMD’s Phenom IIs are indeed running faster than the i7 920 in a number of our gaming tests where Nvidia's GeForce GTX 280 handles graphics.

| Far Cry 2 | 1920x1200, no AA | 2560x1600, no AA |
| --- | --- | --- |
| Phenom II X4 940 and GeForce GTX 280 1 GB (Numbers From The Launch) | 63.06 | 48.09 |
| Phenom II X4 940 and Radeon HD 4870 X2 (Numbers Added) | 68.95 | 65.30 |
| Core i7 920 and GeForce GTX 280 1 GB (Numbers From The Launch) | 53.23 | 41.83 |
| Core i7 920 and Radeon HD 4870 X2 (Numbers Added) | 85.87 | 74.85 |

Surely, the most interesting benchmark results come here at the end, though. Compare the scores of the two configurations with GeForce GTX 280s (Phenom II comes out ahead) with the scores of the two Radeon HD 4870 X2-equipped setups (Core i7 920 comes out in front). These are the scores everyone was expecting.
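To put the reversal in numbers, the relative lead can be worked out from the table above. What follows is a minimal Python sketch, not part of the original test suite: the names `RESULTS`, `relative_lead`, and `phenom_lead_by_gpu` are illustrative, and the figures are assumed to be average frames per second.

```python
# Far Cry 2 results copied from the table above (assumed to be average fps).
# Keys are (cpu, gpu); values are (1920x1200, 2560x1600).
RESULTS = {
    ("Phenom II X4 940", "GeForce GTX 280"): (63.06, 48.09),
    ("Phenom II X4 940", "Radeon HD 4870 X2"): (68.95, 65.30),
    ("Core i7 920", "GeForce GTX 280"): (53.23, 41.83),
    ("Core i7 920", "Radeon HD 4870 X2"): (85.87, 74.85),
}
RESOLUTIONS = ("1920x1200", "2560x1600")


def relative_lead(a, b):
    """Return how far a is ahead of b, in percent (negative if behind)."""
    return (a / b - 1.0) * 100.0


def phenom_lead_by_gpu(results):
    """Map each GPU to the Phenom II's percentage lead at each resolution."""
    leads = {}
    for gpu in ("GeForce GTX 280", "Radeon HD 4870 X2"):
        phenom = results[("Phenom II X4 940", gpu)]
        core_i7 = results[("Core i7 920", gpu)]
        leads[gpu] = tuple(
            round(relative_lead(p, c), 1) for p, c in zip(phenom, core_i7)
        )
    return leads


if __name__ == "__main__":
    for gpu, leads in phenom_lead_by_gpu(RESULTS).items():
        for res, lead in zip(RESOLUTIONS, leads):
            print(f"{gpu} @ {res}: Phenom II {lead:+.1f}% vs. Core i7 920")
```

With these figures, the Phenom II leads by roughly 18.5% and 15.0% on the GeForce GTX 280, and trails by roughly 19.7% and 12.8% on the Radeon HD 4870 X2.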

Pinpointing The Bottleneck

Could it be, then, that at 1920x1200 and 2560x1600, today's processors are so powerful that even the once-mighty GeForce GTX 280 runs headlong into a brick wall? Is the Phenom II simply more efficient than Core i7 when there's a major graphics bottleneck being presented? In order to answer that set of questions, I again built up my X58-based Core i7 platform to test with more GPU muscle: a pair of GeForce GTX 280s in SLI. Unfortunately, I couldn't run an equivalent setup using the 790FX-based AM2 or AM3 motherboards, and we don't have any SLI-capable AM2 platforms in-house, so we'll have to be content connecting a few dots with additional testing here instead.

| Far Cry 2 | 1920x1200, no AA | 2560x1600, no AA |
| --- | --- | --- |
| Core i7 920 @ 2.66 GHz with one GeForce GTX 280 (Numbers From The Launch) | 53.23 | 41.83 |
| Core i7 920 @ 2.66 GHz with two GeForce GTX 280s (New Results) | 86.68 | 70.80 |

Pre-testing Hypothesis: It's hard to draw parallels to our "Core i7: 4-Way CrossFire, 3-Way SLI, Paradise?" piece because we didn't test the Phenom X4 9950 with multiple GeForce GTX 280s there. We did test with Radeon HD 4870s, though, and noticed a curious case of reverse scaling as AMD was trying to get its optimizations in place for the game. We're hoping that a pair of GTX 280s demonstrates clear scaling here so that we're not left with a more severe issue centering on Nvidia's drivers. At least solid scaling would support the guess that an individual GTX 280 is unable to keep up with these modern processors.

Post-testing Analysis: Those numbers are much more indicative. Notice how the two GTX 280s are able to get much closer to the single Radeon HD 4870 X2 (but not beat it, strangely enough, at 2560x1600). We're just about ready to call this one a wrap. But first, a few more sets of numbers pushing resolutions and graphics settings down to see if we can accomplish the equivalent of running an AMD-based SLI platform at these more demanding benchmark options.
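The scaling described here can be quantified with a little arithmetic. The sketch below is illustrative only: the helper names are hypothetical, "efficiency" is defined here as the achieved speed-up divided by the number of cards (an assumption, not a metric from the article), and the figures are assumed to be average frames per second.

```python
# Core i7 920 results (assumed to be average fps).
SINGLE_GTX_280 = {"1920x1200": 53.23, "2560x1600": 41.83}
DUAL_GTX_280 = {"1920x1200": 86.68, "2560x1600": 70.80}
# Core i7 920 with the Radeon HD 4870 X2, from the first table on the page.
RADEON_4870_X2 = {"1920x1200": 85.87, "2560x1600": 74.85}


def scaling_factor(single, dual):
    """Speed-up from adding the second card (2.0 would be perfect scaling)."""
    return dual / single


def scaling_efficiency(single, dual, cards=2):
    """Fraction of the ideal speed-up actually achieved (1.0 is perfect)."""
    return scaling_factor(single, dual) / cards


if __name__ == "__main__":
    for res in SINGLE_GTX_280:
        factor = scaling_factor(SINGLE_GTX_280[res], DUAL_GTX_280[res])
        eff = scaling_efficiency(SINGLE_GTX_280[res], DUAL_GTX_280[res])
        gap = DUAL_GTX_280[res] - RADEON_4870_X2[res]
        print(f"{res}: {factor:.2f}x with SLI ({eff:.0%} of ideal), "
              f"{gap:+.2f} vs. Radeon HD 4870 X2")
```

On these numbers the second GTX 280 yields about 1.63x at 1920x1200 and 1.69x at 2560x1600, which puts the SLI pair 0.81 ahead of the Radeon HD 4870 X2 at the lower resolution and 4.05 behind it at the higher one.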

| Far Cry 2, High Settings | 1280x1024, no AA | 1680x1050, no AA |
| --- | --- | --- |
| Core i7 920 @ 2.66 GHz with one GeForce GTX 280 (New Data) | 84.03 | 76.50 |
| Phenom II X4 940 @ 3 GHz with one GeForce GTX 280 (New Data) | 91.73 | 86.87 |

Pre-testing Hypothesis: If the graphics-bottleneck hypothesis is correct, then we should see Intel's Core i7 920 reclaim its lead once the emphasis falls away from graphics muscle.

Post-testing Analysis: It does not. Instead, the Phenom II maintains its advantage over Intel, even when you drop down to resolutions no gamer with this sort of hardware would really want to play at. If this is still a graphical bottleneck for the GeForce GTX 280, then our conclusion from the AM3 launch holds true, and the Phenom II really is faster in real-world gaming scenarios.

| Far Cry 2, Low Settings | 640x480, no AA |
| --- | --- |
| Core i7 920 @ 2.66 GHz with one GeForce GTX 280 (New Data) | 179.01 |
| Phenom II X4 940 @ 3 GHz with one GeForce GTX 280 (New Data) | 124.69 |

Let's take it to an extreme, now. At 640x480 and ridiculously stupid-low settings, which make Far Cry 2 a synthetic metric for the most part, Core i7 finally takes its elusive lead. In other words, with absolutely all graphics load alleviated, the Core i7 starts replicating the sort of performance we saw in the A/V tests, handily trouncing the Phenom II.
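Lining up every single-GTX 280 result on this page shows where the lead changes hands. The sketch below is a rough illustration with hypothetical helper names. It assumes the figures are average frames per second, and note that the quality settings change along with the resolution, so the crossover it reports is not purely a function of pixel count.

```python
# (resolution, settings, Core i7 920, Phenom II X4 940), one GeForce GTX 280.
SWEEP = [
    ("640x480", "Low", 179.01, 124.69),
    ("1280x1024", "High", 84.03, 91.73),
    ("1680x1050", "High", 76.50, 86.87),
    ("1920x1200", "launch settings", 53.23, 63.06),
    ("2560x1600", "launch settings", 41.83, 48.09),
]


def pixels(resolution):
    """Total pixel count for a 'WIDTHxHEIGHT' string."""
    width, height = resolution.split("x")
    return int(width) * int(height)


def leader(core_i7, phenom):
    """Name of the faster processor for one pair of results."""
    return "Core i7 920" if core_i7 > phenom else "Phenom II X4 940"


def crossover(sweep):
    """Return the adjacent resolutions between which the leader changes."""
    ordered = sorted(sweep, key=lambda row: pixels(row[0]))
    for low, high in zip(ordered, ordered[1:]):
        if leader(low[2], low[3]) != leader(high[2], high[3]):
            return low[0], high[0]
    return None  # the same processor leads everywhere


if __name__ == "__main__":
    for res, settings, core_i7, phenom in SWEEP:
        print(f"{res} ({settings}): {leader(core_i7, phenom)} leads")
    print("Leader changes between:", crossover(SWEEP))
```

On this data the Core i7 920 leads only at 640x480 and Low settings, where it is about 44% faster, and the Phenom II leads at every step from 1280x1024 upward.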

Was it an Intel problem? An Nvidia problem? Simply Phenom II kicking ass and taking names in our gaming tests?

Our next order of business is to either pin this on Nvidia or vindicate the graphics vendor by running a single Radeon HD 4870 512 MB at 1920x1200, where we can expect it to get taxed pretty hard. We would have used a 1 GB card if there were any left around the lab. Leaving the 2560x1600 test out of this one should do the trick, though. 

| Far Cry 2 | 1920x1200, no AA |
| --- | --- |
| Core i7 920 @ 2.66 GHz with one Radeon HD 4870 512 MB (New Data) | 54.44 |
| Phenom II X4 940 @ 3 GHz with one Radeon HD 4870 512 MB (New Data) | 54.59 |

Pre-testing Hypothesis: If this is, in fact, a processor thing, the Radeons will demonstrate the same behavior as Nvidia's more powerful GeForce GTX 280, leaning in the direction of AMD's Phenom II X4 940.

Post-testing Analysis: An identical graphics bottleneck suggests that this is not the same behavior seen before. We're seeing a definite graphics-bound condition, where the GeForce GTX 280 is hitting its limit at different places depending on processor architecture. The implication is that Nvidia's drivers are not allowing the i7 to reach its full gaming potential.
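One way to make "identical" concrete is to compare the gap between the two processors against a tolerance. The sketch below is a hedged illustration: the function names are hypothetical, and the 2% tolerance is an assumed stand-in for run-to-run variance, which this article does not publish. Figures are assumed to be average frames per second at 1920x1200.

```python
def percent_gap(a, b):
    """Gap between two results, as a percentage of the slower one."""
    slow, fast = sorted((a, b))
    return (fast / slow - 1.0) * 100.0


def is_effective_tie(a, b, tolerance_pct=2.0):
    """True if two results differ by no more than the assumed tolerance."""
    return percent_gap(a, b) <= tolerance_pct


# (Core i7 920, Phenom II X4 940) at 1920x1200, no AA.
RADEON_HD_4870_512MB = (54.44, 54.59)
GEFORCE_GTX_280 = (53.23, 63.06)

if __name__ == "__main__":
    cards = (("Radeon HD 4870 512 MB", RADEON_HD_4870_512MB),
             ("GeForce GTX 280", GEFORCE_GTX_280))
    for name, (core_i7, phenom) in cards:
        verdict = "tie" if is_effective_tie(core_i7, phenom) else "real gap"
        print(f"{name}: {percent_gap(core_i7, phenom):.2f}% apart ({verdict})")
```

Under that assumption the Radeon results are about 0.3% apart, which is consistent with a card-limited tie, while the GTX 280 results are about 18.5% apart on the same two processors.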

Bottom Line

The bottom line here, first and foremost, is that all of the data generated and seen in the Socket AM3 launch piece was, in fact, right on the money.

The data suggests that, using an AMD Radeon-based graphics card, you'll likely see the scaling that many other sites have presented, with Intel's Core i7 besting the Phenom II right up to 2560x1600 (refer to the first chart on this page for proof there).

At 640x480, a largely synthetic measure of processor performance, the Core i7 rules the roost under the power of a GeForce GTX 280, too. But again, the graphics load here is minimal. Anything higher--even 1280x1024, another resolution you'd expect to be CPU-bound on these cutting-edge platforms--and Nvidia's card cannot translate the Core i7's microarchitecture into the same performance advantage, giving AMD's Phenom II-series chips the advantage seen in the AM3 story and in the two pages you've just read.

I want to make it a point to thank Nvidia--specifically Nick Stam--for working with me in covering every possible base as we explored where and how the GTX 280 was being affected. As soon as we get more depth from the company's driver team, I'll post an update here. But, for the time being, I hope that the testing invested here helps clarify any questions about the validity of our benchmark results in the Phenom II / Socket AM3 launch story.

Chris Angelini
Chris Angelini is an Editor Emeritus at Tom's Hardware US. He edits hardware reviews and covers high-profile CPU and GPU launches.
  • dattimr
    Nice one. Tom's is getting its act together again. Keep it up, guys.
  • Hamsterabed
    How very odd, when i saw the benches i immediately thought there was a problem. Glad you guys made an article to explain and backup you numbers and i hope we get some answers. don't have another driver fail Nvidia...
  • Tindytim
    Wow...

    Just wow.

    Right when I considering leaving this site forever for it's over Mac loving, Tom flashes me a glimmer of hope.
  • rdawise
    Thank you Chris for this follow-up article..now where is kknd to argue....

    I am sorry but we all know that at lower resolutions the Core i7 will beat the P2, but as the article states, but real world the PII is hitting the high notes. Could this be a driver screw up from Nvidia...probably since you're elimnating everything else. Are there any other x-factors out there...oh yes plenty more. However I think people will get the wrong impression if they read this and think the PII is "more powerful" than the Core i7. Some one who reads this should come away thinking that the PII will give you almost as great gaming as some of the Core i7s can for less money. (Time for a price cut intel).

    I do a question what if you tried using memory with different timings. I believe 8-8-8-24 was used last test, but how about 7-7-7-20? Just trying to help think of reasons. Either way it gives us something to look forward to in the CPU world. Good follow-up.
  • sohei
    "I believe 8-8-8-24 was used last test, but how about 7-7-7-20? Just trying to help think of reasons"

    wow 7-7-7-20? this is the performance...indeed
    P2 works with ddr2 great and you wary about timings
  • great article!!
    just a thought: what about previous generation of nvidia cards? could be this is a GTX 260/285/280/... problem. maybe you could try with one of 9xxx series.
  • StupidRabbit
    awesome article.. only two pages long but it changes the way i look at the previous benchmarks. good to see you focus not only on the hardware itself but also on the benchmarks with a real sense of objectivity.. its what makes this site great.
  • cobra420
    so it looks like a gpu issue . why not try a gtx 295 ? or is that why you set the video so low ? now you found the issue theirs no need to try a different card ? ati sure did a good job on there 4870 series . nice job toms
  • Maybe Farcry optimized it more on ATI, maybe Intel is throwing sticks at the wheels of nVidia at the hardware level, maybe, maybe ... :S
    Why is Intel supporting multi-ATI config, but not multi-nVidia? Why doesn't Intel let nVidia use its Atom freely? Why, oh why?
    There are so many factors. I think if you replace Farcry with a synthetic test, there will be less unknowns. Just maybe :)