Force Versus Finesse
Tom's Hardware's Three-Part, 3-Way Graphics Scaling Series
Part 1, The Cards: Triple-GPU Scaling: AMD CrossFire Vs. Nvidia SLI
Part 2, The Slots: GeForce And Radeon On Intel's P67: PCIe Scaling Explored
Part 3, The Chipsets: P67, X58, And NF200: The Best Platform For CrossFire And SLI
The advantages and shortcomings of Intel’s mainstream platforms are well-known to anyone who follows technology. A total of sixteen PCIe 2.0 lanes originating from the CPU reduce latency (good) and total available bandwidth (bad) compared to Intel’s high-end X58 chipset.
Fortunately, an unlocked multiplier on the K-series processors makes overclocking a piece of cake. On the other hand, the CPU supports only two graphics cards in SLI configurations. That's actually a limitation imposed by Nvidia; technically, P67 enables the processor's 16 lanes and trio of PCIe controllers.
Artificial roadblocks aside, those limitations allow Intel’s high-end X58 chipset to remain a top choice for extreme enthusiasts, given 36 PCIe 2.0 lanes supporting up to four graphics cards in x8 arrangements with four lanes to spare. Too bad unlocked multipliers for that platform are limited to very expensive Extreme Edition CPUs. Even so, overclocking via the base clock gives less expensive processors access to faster interface speeds. And of course, there's the benefit of a triple-channel memory controller, providing up to 50% more bandwidth than any of Intel’s mainstream solutions (even if the advantage is largely academic).
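The lane and memory-channel arithmetic in the paragraph above can be checked with a short sketch. The function names are our own illustration rather than anything from the test setup, and the bandwidth figure assumes every memory channel runs at the same speed; the inputs (36 lanes, four x8 cards, three channels versus two) come straight from the text.

```python
def spare_lanes(total_lanes: int, cards: int, lanes_per_card: int) -> int:
    """Return the lanes left over after each card gets a link of the given width."""
    used = cards * lanes_per_card
    if used > total_lanes:
        raise ValueError("not enough lanes for this configuration")
    return total_lanes - used


def relative_gain(new: float, old: float) -> float:
    """Fractional increase of `new` over `old` (0.5 means 50% more)."""
    return (new - old) / old


# X58: 36 PCIe 2.0 lanes shared by four cards at x8 each.
x58_spare = spare_lanes(total_lanes=36, cards=4, lanes_per_card=8)

# Triple-channel versus dual-channel memory, assuming identical per-channel speed.
channel_gain = relative_gain(3, 2)

print(x58_spare, channel_gain)
```

Under those assumptions, the sketch reproduces the four spare lanes and the 50% theoretical bandwidth advantage cited above.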
Part one of this three-part series answered questions about multiple-GPU scaling, while part two addressed PCIe bandwidth needs for a single card. Tying it all together, today we’re going to answer three questions. First, is a 32-lane (or greater) PCIe controller really a requirement for dual- and triple-GPU arrays? Second, can triple-channel memory and twice the base clock help a 4 GHz Core i7 CPU based on the Bloomfield design (an overclocked Core i7-920) overcome the architectural advancements of a 4 GHz Core i7 processor based on Sandy Bridge (an overclocked Core i7-2600K)? And third, how much of a difference does Nvidia’s lane-multiplying NF200 PCIe bridge make when 16 or 32 lanes aren’t enough?
Any comparison between slightly different platforms is sure to lead proponents of one side to scream bias against the other. These parts were carefully chosen to make this a fair fight, though.
For example, fans of the LGA 1155 interface will point out that by having triple-channel memory, X58 Express also has 50% greater memory capacity, so long as the modules are identical. But we'd argue that triple-channel support (and the extra memory) is simply a feature missing from the closer-to-mainstream platform. Anyone standing up for their LGA 1366-based board will point out that the Core i7-990X is age-appropriate for this comparison, yet we’ve found that ultra-expensive six-core CPUs offer no performance advantage in games. While a different Bloomfield model might have allowed a closer price match, we would still have picked 20 x 200 MHz (Bloomfield) vs. 40 x 100 MHz (Sandy Bridge) clock settings to squeeze the greatest performance from both processors at the resulting 4 GHz comparison frequency.
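For readers who want to see how the two clock settings land on the same frequency, here is a minimal sketch of the multiplier-times-base-clock arithmetic. The helper name is hypothetical; the multipliers and base clocks are the ones quoted above.

```python
def core_clock_mhz(multiplier: int, base_clock_mhz: int) -> int:
    """Core frequency in MHz is the CPU multiplier times the base clock."""
    return multiplier * base_clock_mhz


# Settings quoted in the text for the two test processors.
bloomfield = core_clock_mhz(multiplier=20, base_clock_mhz=200)
sandy_bridge = core_clock_mhz(multiplier=40, base_clock_mhz=100)

# Ratio of the two base clocks (Bloomfield over Sandy Bridge).
base_clock_ratio = 200 / 100

print(bloomfield, sandy_bridge, base_clock_ratio)
```

Both configurations work out to 4000 MHz, with the Bloomfield chip getting there on twice the base clock and half the multiplier.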