After having looked at AMD’s six-core Phenom II X6 across all of its possible core configurations, it’s time to do the same with Intel’s Core i7-980X flagship. How do performance, power consumption, and power efficiency change when fewer cores are utilized? Is the 32 nm Core i7 top model best with all six cores, or does some combination of fewer cores deliver the optimal experience?
The testing we did on AMD's Thuban-based six-core chip revealed two important things.
First, we had to accept that many applications still don't benefit from multiple cores. Users would see much more performance if software did a better job of supporting the available hardware. We find it frustrating to see AMD and Intel deliver extremely powerful CPUs only to have their potential remain underutilized, especially in mainstream applications.
Our second finding was about efficiency. The Phenom II X6 shows the best performance per watt with all six cores active, as performance gains are more substantial than the additional power required to operate the higher core count.
Is this also the case on Intel’s flagship? Does using all six cores provide the best power efficiency? Will power consumption in idle decrease if we switch off individual cores? Let’s find out.
Both processor flagships from AMD and Intel are equipped with performance-enhancing features that allow the CPUs to increase clock speeds when two requirements are met. First, CPU load has to go through the roof. Second, there has to be sufficient thermal headroom for increasing the clock rate. The features, however, are implemented differently.
AMD’s Turbo CORE function only knows one acceleration mode, while Intel implements two (at least on this particular model; other CPUs are more dynamic). The first mode applies when all cores are accelerated (a 133 MHz boost). The second kicks in when only one or two cores are active and can benefit from additional clock speed (up to 266 MHz). The Turbo Boost implementation is more aggressive on Intel’s 32 nm processors, whether in dual- or multi-core models, but it's still notable on the 45 nm Core i5 and i7 processors. Note that Turbo Boost accelerates cores by increasing the CPU’s multiplier within a set range, but the feature can't always take advantage of maximum acceleration if the processor is already operating close to its thermal/power limits.
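The two Turbo Boost modes described above can be sketched as a simple lookup. This is only an illustrative model: the function name, the 3333 MHz base value (the 3.33 GHz spec expressed in MHz), and the choice to treat every configuration with more than two active cores as the first mode are assumptions, and real hardware additionally weighs thermal and power headroom before granting any boost.

```python
BASE_CLOCK_MHZ = 3333  # Core i7-980X nominal clock (3.33 GHz); assumed rounding
STEP_MHZ = 133         # one Turbo Boost step as quoted in the text


def max_turbo_clock_mhz(active_cores: int, total_cores: int = 6) -> int:
    """Upper bound on the clock under the two-mode scheme described above.

    Assumption: one or two active cores may gain two steps (266 MHz);
    anything above that gains one step (133 MHz). Thermal limits can
    still prevent the CPU from reaching this bound.
    """
    if not 1 <= active_cores <= total_cores:
        raise ValueError("active_cores must be between 1 and total_cores")
    steps = 2 if active_cores <= 2 else 1
    return BASE_CLOCK_MHZ + steps * STEP_MHZ


if __name__ == "__main__":
    for cores in range(1, 7):
        print(f"{cores} active core(s): up to {max_turbo_clock_mhz(cores)} MHz")
```

The values are ceilings, not guarantees, which matches the caveat that the processor can't always reach maximum acceleration.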
AMD’s Turbo CORE basically works like a reversed implementation of Cool'n'Quiet, AMD's power-saving feature for CPUs. To make a long story short, Turbo CORE exploits thermal headroom if there's sufficient workload demand, and it does so for exactly three cores (unless you alter that through AMD's OverDrive software). In theory, this speaks for Intel’s configuration, since higher clock speeds should be available if only a few cores are required. Only limited acceleration is available if all cores are involved, since there would be little headroom left. AMD’s feature, on the other hand, also kicks in when needed, but probably reaches thermal limits quicker because all cores are involved at all times. However, this is just a theory, and we need to put it to the test and directly compare Turbo Boost against Turbo CORE in a different article.
Regarding our test platform, we found that you can't just pick any socket LGA 1366 motherboard and expect to reduce the number of active cores. Fortunately, we found a feature for switching off individual cores on Gigabyte’s EX58-UD4P with the F12 BIOS version. Although this might not be a really important BIOS switch for most users, it's worth exploring, since power consumption does decrease if you switch off Intel cores. This wasn’t the case on our Phenom II X6 1090T test system. Here, idle power remained constant whether one or six cores were used.
| System Hardware | |
|---|---|
| Hardware | Details |
| Performance Benchmarks | |
| Motherboard (LGA 1366) | Gigabyte EX58-UD4P (Rev. 1.0), Chipset: Intel X58, BIOS: F12 (02/11/2009) |
| CPU Intel | Intel Core i7-980X (32 nm, 3.33 GHz, 6 x 256KB L2 and 12MB L3 Cache, TDP 130W) |
| RAM DDR3 (Triple-Channel) | 3 x 2GB DDR3-1600 (Corsair TR3X6G-1600C8D 8-8-8-24) |
| Graphics | Sapphire Radeon HD 5850 GPU: Cypress (725 MHz), Graphics RAM: 1024 MB GDDR5 (2000 MHz), Stream Processors: 1440 |
| Hard Drive | Western Digital VelociRaptor, 300GB (WD3000HLFS), 10,000 RPM, SATA 3Gb/s, 16MB Cache |
| Power Supply | PC Power & Cooling, Silencer 750EPS12V 750W |
| System Software And Drivers | |
| Operating System | Windows 7 Ultimate x64 Updated on 2010-03-03 |
| Drivers And Settings | |
| Intel Chipset Drivers | Chipset Installation Utility Ver. 9.1.1.1025 |
| Intel Storage Drivers | Matrix Storage Drivers Ver. 8.9.0.1023 |

| Audio Benchmarks And Settings | |
|---|---|
| Benchmarks | Details |
| iTunes | Version: 9.0.3.15 Audio CD ("Terminator II" SE), 53 min. Convert to AAC audio format |
| Lame MP3 | Version 3.98.3 Audio CD "Terminator II SE", 53 min. Convert WAV to MP3 audio format Command: -b 160 --nores (160 Kbps) |
| Video Benchmarks and Settings | |
| Benchmarks | Details |
| Handbrake CLI | Version: 0.94 Video: Big Buck Bunny (720x480, 23.972 frames) 5 min. Audio: Dolby Digital, 48000 Hz, 6-channel, English to Video: AVC1 Audio1: AC3 Audio2: AAC (High Profile) |
| MainConcept Reference v2 | Version: 2.0.0.1555 MPEG-2 to H.264 MainConcept H.264/AVC Codec 28 sec. HDTV 1920x1080 (MPEG-2) Audio: MPEG-2 (44.1 kHz, 2-channel, 16-bit, 224 Kbps) Codec: H.264 Pro Mode: PAL 50i (25 FPS) Profile: H.264 BD HDMV |
| Application Benchmarks And Settings | |
| Benchmarks | Details |
| 7-Zip | Version 9.1 beta LZMA2 Syntax "a -t7z -r -m0=LZMA2 -mx=5" Benchmark: 2010-THG-Workload |
| WinRAR | Version 3.92 RAR Syntax "winrar a -r -m3" Benchmark: 2010-THG-Workload |
| WinZip 14 | Version 14.0 Pro (8652) WinZIP Commandline Version 3 ZIPX Syntax "-a -ez -p -r" Benchmark: 2010-THG-Workload |
| Autodesk 3ds Max 2010 | Version: 10 x64 Rendering Space Flyby Mentalray (SPECapc_3dsmax9) Frame: 248 Resolution: 1440 x 1080 |
| Cinebench 11.5 | Version 11.5 Build CB25720DEMO CPU Test single- and multi-threaded |
| Adobe Photoshop CS4 (64-bit) | Version: 11 Filtering a 16MB TIF (15000x7266) Filters: Radial Blur (Amount: 10; Method: zoom; Quality: good) Shape Blur (Radius: 46 px; custom shape: Trademark symbol) Median (Radius: 1px) Polar Coordinates (Rectangular to Polar) |
| Adobe Acrobat 9 Professional | Version: 9.0.0 (Extended) == Printing Preferences Menu == Default Settings: Standard == Adobe PDF Security - Edit Menu == Encrypt all documents (128 bit RC4) Open Password: 123 Permissions Password: 321 |
| Microsoft PowerPoint 2007 | Version: 2007 SP2 PPT to PDF PowerPoint Document (115 Pages) Adobe PDF-Printer |
| Fritz | Fritz Chess Benchmark Version 4.3.2 |
| Synthetic Benchmarks and Details | |
| Benchmark | Details |
| 3DMark Vantage | Version: 1.02 Patch 1901 Options: Performance Graphics Test 1 Graphics Test 2 CPU Test 1 CPU Test 2 |
| PCMark Vantage | Version: 1.0.2.0 Patch 1901 PCMark Benchmark Memories Benchmark |
| SiSoftware Sandra 2010 | Version: 2010.1.16.10 Processor Arithmetic, Cryptography, Memory Bandwidth |

Raw CPU horsepower scales perfectly. Each core you switch on adds essentially the same computing power.




The encryption tests show top performance per core when using all six cores.

If you remember the Phenom II X6's results, you'll recall that at least three cores are required to saturate the 13+ GB/s memory bandwidth of dual-channel DDR3-1333 memory. Here, the Intel Core i7 six-core processor gets rather close to the 19 GB/s throughput maximum of triple-channel DDR3-1333 memory with only two cores.
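For context, the throughput figures quoted here appear to be measured results, which sit below the theoretical peak of the memory interface. The sketch below computes that theoretical peak, assuming standard 64-bit (8-byte) DDR3 channels and a nominal 1333 MT/s transfer rate; the function name is hypothetical and the output is not a measurement from our test system.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: float,
                       bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10^9 bytes).

    Assumes every channel is bus_bytes wide (8 bytes = 64 bits for DDR3)
    and moves one bus-width of data per transfer.
    """
    return channels * mega_transfers_per_s * bus_bytes / 1000.0


if __name__ == "__main__":
    # DDR3-1333 in dual-channel and triple-channel configurations
    for channels in (2, 3):
        gbs = peak_bandwidth_gbs(channels, 1333)
        print(f"{channels} channel(s) of DDR3-1333: {gbs:.1f} GB/s theoretical peak")
```

Measured throughput always lands below these ceilings, so the comparison that matters is how many cores it takes to approach the best result each platform actually achieves.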

3DMark Vantage’s CPU test also says that the performance gain per core is slightly larger than single-core performance. This is the case because multiple cores can work on shared L3 cache data together.

The GPU score doesn’t change much. Keep in mind that this only reflects gaming performance for older titles that can't utilize more than one processing core. Modern game titles also involve more complex AI systems and physics calculations. Both require CPU power that isn’t visible in this test.

Our overall 3DMark Vantage score gets close to what you see in real life.

The PCMark Vantage memory test echoes behavior that we saw on the AMD Phenom II X6. Two and three cores deliver better memory performance than a single core. It takes four or more cores to get close to the processor’s maximum result in PCMark Vantage.

PCMark Vantage’s overall result says that cores five and six don’t add much additional performance, partially because this result includes the 3D tests, which don’t benefit from multi-core CPUs in this benchmark. It's also worth noting that PCMark is made up of apps from Windows, most of which seem optimized for four-core CPUs.

3ds Max scales extremely well with each additional core.

Similar conclusions arise with 7-Zip, although you’re not getting a lot more performance beyond four or five cores.


Cinebench in its multi-threaded run shows that real life performance doesn't scale linearly with every processing core added. There is a bit of overhead incurred with each addition. Again, we’re seeing best results from the six-core configuration.

Adobe’s Acrobat 9 always takes at least a few seconds to generate a PDF document from a complex Word or PowerPoint file. Our benchmark uses a 115-page presentation, but the time savings on multiple cores versus a single core are embarrassing for Adobe. It should be possible to parallelize this type of workload to a much greater extent. As things stand, all you really need is a fast, dual-core CPU.

Photoshop CS4 is a perfect example of how applications can take maximum advantage of modern multi-core processors.

WinRAR is thread-optimized and benefits from each CPU core enabled during testing, but benchmark variance is about as large as the performance gains witnessed once you exceed three cores.

WinZip needs a serious update. Variance is high and performance only scales if you boost clock speeds. What a disappointment for such a popular tool.

We’ve said repeatedly that iTunes wasn’t thread-optimized. It appears that this statement isn’t entirely accurate. The app is multi-threaded, but the "benefit" is pathetic.

The Lame MP3 encoder is also better off with a fast dual-core rather than a massive six-core processor.


MainConcept and HandBrake for video transcoding both scale well with additional CPU cores.

Idle power actually decreases if you decide to switch off CPU cores, but the difference isn’t large, thanks to optimizations in Intel's architecture that already shut down execution resources when they're not in use. We measured savings of 2% between six cores and just one.

Peak power scales almost linearly. If you were to switch off five of the six cores, our Core i7-980X machine would require only 122W at peak load rather than 223W. This would be a single-core 32 nm processor with 12MB L3 cache and a 3.33 GHz clock speed running at roughly 55% of the peak power of six cores. Keep in mind that performance drops much more, though. This is just a hypothetical example.
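The arithmetic behind that comparison is easy to reproduce. The sketch below uses only the two peak-power readings quoted above; the per-core increment assumes perfectly linear scaling between one and six cores, which is a simplification rather than something we measured step by step.

```python
SINGLE_CORE_PEAK_W = 122  # system peak power with one active core (from the text)
SIX_CORE_PEAK_W = 223     # system peak power with all six cores (from the text)

# Share of six-core peak power that the single-core configuration draws
ratio = SINGLE_CORE_PEAK_W / SIX_CORE_PEAK_W

# Average extra power per additional core, assuming linear scaling
per_core_w = (SIX_CORE_PEAK_W - SINGLE_CORE_PEAK_W) / 5

if __name__ == "__main__":
    print(f"Single-core share of six-core peak power: {ratio:.1%}")
    print(f"Average increase per additional core: {per_core_w:.1f} W")
```

In other words, the first core (plus the rest of the platform) accounts for more than half of peak power, while each further core adds a comparatively small slice.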

The total runtime tells us how long the systems took to complete our full efficiency workload, including most of the applications mentioned previously. Clearly, the difference between one, two, and four cores is significant, while adding cores five and six doesn't yield the same performance jumps. This is very applicable to real life and typical applications.

The average power consumption during this efficiency run increases with every core we switch on, but look at how the bars create a curve that flattens out. Average power may increase, but so does performance, and probably to a larger extent.

These results are similar to what we found on our AMD system. The total energy required to complete our efficiency workload is lowest with the maximum number of cores.

These results clearly show how performance increases significantly with additional cores while power consumption increases more moderately. If we relate performance to power used, we get confirmation that 5- and 6-core operation is most efficient.
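The relationship between runtime, average power, and total energy can be made explicit with a short calculation. The wattage and runtime values below are hypothetical placeholders chosen for illustration, not readings from our charts, and the function name is our own.

```python
def energy_wh(avg_power_w: float, runtime_s: float) -> float:
    """Energy in watt-hours: average power (W) times runtime (hours)."""
    return avg_power_w * runtime_s / 3600.0


# HYPOTHETICAL runs, not measurements from this article:
# active cores -> (average power in watts, total runtime in seconds)
hypothetical_runs = {
    1: (110.0, 7200.0),
    6: (180.0, 2400.0),
}

if __name__ == "__main__":
    for cores, (power_w, runtime_s) in hypothetical_runs.items():
        wh = energy_wh(power_w, runtime_s)
        print(f"{cores} core(s): {power_w:.0f} W for {runtime_s:.0f} s = {wh:.0f} Wh")
```

Even though the six-core example draws more power on average, it finishes so much sooner that it consumes less energy overall, which is the pattern the efficiency results describe.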



Both AMD's Phenom II X6 and Intel’s Core i7-980X prove that a larger CPU core count is by far the most reasonable technique for improving overall performance. This largely depends on software support for threading, as applications need to be able to take advantage of more than one or two processor cores. However, with this support in place, we can see from our results that you’ll not only be getting much faster performance, but also highly improved efficiency (performance per watt).
Since idle power between six, four, or only two active cores doesn't vary much, we can only recommend leaving all cores switched on all the time (as we suspected at the start of this piece). There are other, much more effective ways to reduce power consumption than disabling cores. Likewise, our results show that it makes sense to pick the highest possible core count within a processor generation when you’re looking for maximum performance in threaded environments.
Unfortunately, only professional applications are truly thread-optimized across the board. Lots of popular software, even from large software houses like Adobe, might not always be good at utilizing multi-core resources. Thus, clock speeds remain important, even though they make limited sense from an efficiency standpoint. We’ll soon be looking at the two six-core processors again to compare their Turbo features at stock speeds and at overclocked speeds, since it seems that these dynamic mechanisms are the best way to combine the best of both worlds: high clock speeds and a large core count.


