Editor’s Note: Eager to show off what it has done with Intel’s Sandy Bridge architecture, system builder CyberPower PC is offering Tom’s Hardware's audience the opportunity to win a new system based on Intel’s Core i7-2600K processor. Read through our review, and then check out the last page for more information on the system, plus a link to enter our giveaway!
The high-end desktop processor market is a one-horse race, with Intel’s LGA 1366-based Core i7-900-series CPUs pretty much tromping along uncontested. If you have the money and are building a performance-oriented machine, it’s hard to beat an overclocked Core i7-950. Power users who really need the punch of a six-core chip can go Core i7-970—just be ready to pay through the nose for the privilege of owning one.
It’s the mainstream where we see more interesting battles being waged. Funny how healthy competition has a habit of forcing more aggressive prices, isn’t it? For example, the quad-core Core i5-760 is compelling at $200. But so is AMD’s six-core Phenom II X6 1075T. And while AMD’s Black Edition parts captured the hearts of overclocking enthusiasts long ago, Intel more recently shipped a couple of K-series SKUs that bucked the company’s habit of only unlocking the multipliers on thousand-dollar Extreme Edition parts.

And now we have a new architecture from Intel, called Sandy Bridge. The last time Intel launched a processor design, it started with high-end Core i7-900-series chips and let the technology trickle down to the mainstream and entry-level derivatives. This time is different. Sandy Bridge is going to have to swim its way upstream, surfacing on the flagship LGA 2011 interface in the second half of this year for the real workstation-oriented aficionados.
Intel’s Here And Now
That’s a long way away, though. Between now and then, LGA 1366 is supposed to remain at the top of Intel’s stack, while LGA 1155-based processors centering on Sandy Bridge gobble up all of the volume as a result of what Intel claims is a ~30% performance improvement versus the Lynnfield- and Clarkdale-based processors.
Naturally, this means trouble for an AMD that continues to launch incrementally faster versions of its existing architecture—but nothing that’d give it the double-digit speed-up needed to fend off a new microarchitecture from its competition. The only way to strike back at this point is with lower prices, and that's probably not the route AMD wants to be taking. We expect Bulldozer, the company's own next-gen architecture, sometime in 2011; that launch can't come soon enough.
A large enough boost from Sandy Bridge would make Intel’s Core i7-900-series vulnerable, too. Right now, these are, at minimum, $300 parts (that’s just to get in the door with a -950) that drop into generally more expensive motherboards requiring pricier triple-channel memory kits. I’ve been saying all along that the X58 platform would remain, definitively, Intel’s crown jewel on the desktop. But after running the numbers I’ve run on Sandy Bridge, I have to wonder if X58’s days are numbered a little sooner than the company planned.
Sandy Bridge has a couple of other surprises up its sleeve—not all of them destined to go down as smoothly as a 1996 Dom Perignon on New Year’s Eve. For one, overclocking on an Intel platform is drastically different, and the LN2-drinking crowd probably won’t like it very much. There’s also a big emphasis on integrated graphics, which we’ve seen prematurely praised as a potential alternative to entry-level discrete graphics. That doesn't turn out to be the case, at least on the desktop.
Intel's year-old Clarkdale multi-chip package.
The new hotness: Sandy Bridge, now with more integration.
On the other hand, Sandy Bridge comes armed with a block of fixed-function logic that specifically addresses video encoding. AMD and Nvidia have no answer to this; both are at least a year away from a competitive solution, and both get completely outperformed in today's video workloads. We also have a couple of unlocked SKUs that really give this architecture, manufactured at 32 nm, room to stretch its legs.
Putting Sandy Bridge To The Test
Leading up to the Sandy Bridge architecture’s launch, Intel sent over four SKUs from its upcoming lineup: Core i7-2600K, Core i5-2500K, Core i5-2400, and Core i3-2100. We put all four processors through a brand new benchmark suite for 2011, along with Bloomfield-, Lynnfield-, Clarkdale-, and Yorkfield-based chips from Intel, plus Thuban- and Deneb-based CPUs from AMD.
While many of you were enjoying time away from work around Christmas and digging out of blizzard-like conditions ahead of New Year's Eve, the Tom's Hardware Bakersfield, CA lab was kept busy and warm by the latest bleeding-edge CPUs being run through their paces. Shall we?
From 10,000 feet, the Sandy Bridge die you saw on the previous page looks like a complete departure from its predecessor. After all, the mainstream Clarkdale-based CPUs consisted of two physical chips—a dual-core CPU manufactured at 32 nm and a graphics core/integrated memory controller/PCI Express controller etched at 45 nm. Now we’re looking at a single 32 nm part with all of those capabilities crammed onto one piece of silicon. Drill down, though, and the similarities reveal a design that’s more evolutionary than revolutionary.
For each piece of Sandy Bridge that you look at, keep one word in mind: integration. Intel wanted to get the most out of each of the architecture’s nearly 1 billion transistors (the official count is 995 million).
There are actually three different versions of the Sandy Bridge die shipping at launch. The quad-core configuration—the one composed of 995 million transistors—measures 216 mm². Then, there’s a dual-core die with 12 execution units making up its graphics engine. That one features 624 million transistors on a 149 mm² die. Finally, the slimmest variation sports two cores and a graphics engine composed of six EUs. Though it’s flush with 504 million transistors, you’d hardly know it given the 131 mm² die size.
| Processor | Die Size (mm²) | Transistors (million) |
|---|---|---|
| Sandy Bridge (4C) | 216 | 995 |
| Sandy Bridge (2C, HD Graphics 3000) | 149 | 624 |
| Sandy Bridge (2C, HD Graphics 2000) | 131 | 504 |
| Bloomfield (4C) | 263 | 731 |
| Lynnfield (4C) | 296 | 774 |
| Westmere (2C) | 81 | 383 |
| Gulftown (6C) | 248 | 1168 |
In comparison, the 45 nm Lynnfield design that served as the foundation for Intel’s Core i7-800- and Core i5-700-series chips measured a more portly 296 mm², despite the fact that it only consisted of 774 million transistors. Intel’s architects clearly owe much of what they were able to cram into Sandy Bridge to the engineers that brought the 32 nm node online for Westmere (tick), and then dialed in for today’s launch (tock).
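The density advantage of the 32 nm node jumps out of the table above. As a quick back-of-the-envelope check (using only the die sizes and transistor counts Intel supplied):

```python
# Transistor density, computed from Intel's published die sizes and
# transistor counts in the table above: (millions of transistors, mm^2).
dies = {
    "Sandy Bridge (4C, 32 nm)": (995, 216),
    "Lynnfield (4C, 45 nm)":    (774, 296),
    "Westmere (2C, 32 nm)":     (383, 81),
}

for name, (transistors, area) in dies.items():
    density = transistors / area
    print(f"{name}: {density:.1f} M transistors per mm^2")
```

With these figures, Sandy Bridge packs roughly 4.6 million transistors into each square millimeter versus about 2.6 million for Lynnfield—nearly a 1.8x density advantage from the process shrink, before counting any architectural cleverness.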
The Cores
As it stands, Sandy Bridge-based processors are available with four cores (with or without Hyper-Threading) or with two cores (all dual-core models have Hyper-Threading enabled). As you’ll see in the benchmarks, these cores are, clock for clock, more powerful than what we saw from Nehalem.
Still present are the 32 KB L1 instruction and data caches (along with 256 KB L2 cache per core), though Sandy Bridge now incorporates what Intel calls an L0 instruction cache that holds up to 1500 decoded micro-ops. This feature has the dual effect of saving power and improving instruction throughput. If the fetch hardware finds the instruction it needs in cache, it can shut down the decoders until they’re needed again. Intel also rebuilt Sandy Bridge’s branch prediction unit, improving its accuracy.


I ran these two single-threaded tests as a synthetic comparison of performance, clock for clock. Both quad-core chips are set to the same frequency with Turbo Boost and EIST disabled. As you can see, just the architectural shift makes a significant impact on Sandy Bridge's performance versus the Nehalem-based Lynnfield design.
Sandy Bridge-based processors are the first to support Advanced Vector Extensions (AVX), a 256-bit instruction set extension to SSE (AMD will also support AVX in its upcoming Bulldozer processor architecture). The impetus behind AVX comes from the high-performance computing world, where floating-point-intensive applications demand more horsepower than ever. Given that focus, AVX’s impact on Sandy Bridge’s desktop audience will very likely be limited. Intel does, however, expect audio processing and video editing applications to eventually be optimized to take advantage of AVX (along with the financial services analysis and engineering/manufacturing software that AVX is really designed to target). Unfortunately, there aren't any real-world apps optimized for AVX that we can test as a gauge of the capability's potential.
Naturally, a lot of implementation work went into enabling AVX, including a transition from a retirement register file to a physical register file. This allows operands to be stored in the register file, rather than traveling with micro-ops through the out-of-order engine. Intel used the power and die size savings enabled by the physical register file to significantly increase buffer sizes as well, more efficiently feeding its beefier floating-point engine.
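The basic appeal of AVX is easy to picture: a 256-bit register holds eight single-precision floats, so one vector instruction does the work of eight scalar ones (SSE's 128-bit registers manage four). The sketch below is a conceptual illustration of that lane math in plain Python, not actual AVX code:

```python
# Conceptual sketch of AVX's data parallelism (illustration only; real AVX
# work happens through compiler auto-vectorization or intrinsics in C/C++).
AVX_REGISTER_BITS = 256
SSE_REGISTER_BITS = 128
FLOAT_BITS = 32

avx_lanes = AVX_REGISTER_BITS // FLOAT_BITS   # 8 single-precision lanes
sse_lanes = SSE_REGISTER_BITS // FLOAT_BITS   # 4 single-precision lanes

# One simulated "vector add": all eight lanes updated in a single operation
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [0.5] * avx_lanes
result = [x + y for x, y in zip(a, b)]

print(avx_lanes, sse_lanes, result)
```

That doubling of lanes per instruction is where the floating-point throughput claims come from—assuming, of course, that software is rebuilt to issue AVX instructions in the first place.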
The Cache
As a consequence of increased integration, Intel had to address the ways bits and pieces of its processor were accessing the last-level cache (in Sandy Bridge, it’s the L3).
Back in the days of Bloomfield, Lynnfield, and Clarkdale, a four-core (and even six-core, in Westmere) ceiling meant that each physical core could have its own connection to that shared cache. The Xeon 7500-series processors were designed to be more scalable, though, and currently-shipping models feature as many as eight cores per CPU. Built the same way, that’d be an exorbitant number of traces between each core and the last-level cache. So, Intel adopted a ring bus that, in those enterprise environments, allows the company to keep scaling core count without the logistics getting out of control.
The ring bus, as it appears in Intel's Xeon 7500-series
Earlier this year, I had the chance to talk to Sailesh Kottapalli, a senior principal engineer at Intel, who explained that he’d seen sustained bandwidth close to 300 GB/s from the Xeon 7500-series’ LLC, enabled by the ring bus. Additionally, Intel confirmed at IDF that every one of its products currently in development employs the ring bus. Think we’re going to see a continued emphasis on adding cores and other platform components directly to the CPU die? I’d say that’s a fair assumption.
Of course, Intel wasn’t worried about higher core count on the mainstream desktop version of Sandy Bridge. Rather, it was the on-die graphics engine that compelled a similar shift to the ring bus architecture, which now connects the graphics, up to four processing cores, and the system agent (formerly referred to as uncore) with a stop at each domain. Latency is variable, since each component takes the shortest path on the bus; overall, though, it’s always going to be lower than on a Westmere-based processor.
At the end of the day, the ring bus’ most significant contribution is going to be the performance it facilitates in graphics workloads.
The System Agent
Altered principally in name, the system agent includes functionality previously associated with the uncore—that is, it includes the processor subsystems that can’t be grouped with the execution cores (and now the graphics engine, too).
In that list, you have the dual-channel memory controller (which officially supports transfer rates of up to 1333 MT/s), 16 lanes of second-generation PCI Express connectivity, the DMI, and a more advanced power control unit, responsible for managing the operation of Turbo Boost, among its other roles.
Turbo Boost 2.0?
Speaking of Turbo Boost, Sandy Bridge includes a second-generation implementation of this technology, first seen two years ago on Bloomfield-based Core i7-900-series chips, but really only throttled up, so to speak, on Lynnfield a year later.
The premise behind what I’ll call Turbo Boost 1.0 was that, in a multi-core CPU, available resources are not always in use. An application like iTunes, for instance, can only use one core at a time. And yet, the chip’s thermal ceiling is defined by a worst-case scenario of all cores fully-utilized. Turbo Boost takes advantage of the thermal headroom that exists when the chip executes a workload like iTunes, in turn accelerating the one active core to get its task completed faster.
Turbo Boost 1.0 is smart in that it dynamically ratchets up the frequency of active cores based on temperature, current, power consumption, and operating system states. But it won’t exceed programmed power limits, even if thermal headroom exists to push performance harder.
In the real world, a processor doesn’t heat up right away, though. From idle, it takes time to reach its thermal ceiling. Turbo Boost 2.0 (or next-gen Turbo Boost, whatever you want to call it) allows the processor to exceed its power ceiling until it reaches its thermal limit, at which point it drops power to conform to those same programmed limits.
Turbo Boost 2.0 does not mean the CPU will exceed its maximum Turbo Boost frequency. If you have a Core i7-2600K with a 3.4 GHz base clock and 3.8 GHz maximum Turbo clock, 3.8 is as fast as it’ll go in its stock trim. It’ll simply stay there longer—until the CPU heats up to its thermal limit—before backing down.
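The behavior described above can be modeled loosely. In this toy simulation, every number is invented for illustration—Intel hasn't published the power control unit's actual algorithm or thresholds—but the shape of the curve matches what the company describes: the chip bursts above its sustained power limit from a cold start, then settles back once accumulated heat catches up.

```python
# Toy model of Turbo Boost 2.0 (all values hypothetical; the real PCU's
# algorithm and limits are not public).
SUSTAINED_POWER_W = 95.0    # programmed (TDP-like) power limit
BURST_POWER_W     = 120.0   # short-term power allowed while the chip is cool
THERMAL_LIMIT_J   = 500.0   # heat the package can absorb before throttling
COOLING_W         = 95.0    # heat the cooler removes per second

heat = 0.0                  # accumulated thermal energy above idle
power_trace = []
for second in range(30):
    # Exceed the sustained limit only while thermal headroom remains
    power = BURST_POWER_W if heat < THERMAL_LIMIT_J else SUSTAINED_POWER_W
    heat = max(0.0, heat + power - COOLING_W)   # net heat per one-second step
    power_trace.append(power)

burst_seconds = sum(1 for p in power_trace if p > SUSTAINED_POWER_W)
print(f"Ran above the sustained power limit for {burst_seconds} s")
```

With these made-up figures, the chip spends 20 seconds above its programmed limit before falling back—long enough to finish a bursty desktop task at full tilt, which is exactly the responsiveness benefit Intel claims.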
Unfortunately, it’s not really possible to quantify the benefits of this capability. The best I could get out of Intel was that it helped improve responsiveness. On the desktop, I frankly wasn’t able to tell a difference, and as a result, Turbo Boost 2.0 comes across as somewhat gimmicky.
To be fair, it’s going to mean more in the mobile space, where base clocks start off a lot lower to save power, and Turbo Boost ceilings scale significantly higher. We have a Sandy Bridge-based notebook in the office and will be putting it through its paces this month, too.
Also more impactful in the mobile space is Sandy Bridge’s ability to share thermal budget between graphics and processor cores. Previous-generation Arrandale cores were able to do this, applying the Turbo Boost concept to both components. Now Sandy Bridge enables the same capability on the desktop. Intel says that, in 3D-heavy workloads, the power control unit will bias to the graphics core, as it stands to improve performance more than faster CPU cores.
I don’t think it has ever been said that Intel caught AMD and Nvidia off guard in the graphics department. And yet, the Quick Sync engine remained an unknown to everyone outside of Intel, right up until IDF 2010. Would you believe that it was first conceptualized five years ago? Talk about keeping a secret!
At the time, the first BD-ROM drives were starting to ship, representing the shift from SD to HD media. Additionally, there was more growth in mobility than the desktop space. Finally, Intel recognized that the PC was the sole platform for content creation—and the fact that editing a video could gobble up an entire weekend was flat-out unacceptable. It was at that point that Intel’s engineers decided to tackle decoding and encoding performance in Sandy Bridge—both pain points for content creators. They approached the video pipeline using dedicated fixed-function logic, which serves two purposes. First, it enables very compelling performance. And second, it keeps energy use to a minimum.
Of course, that fixed-function logic later came to be known as Quick Sync—a blanket marketing name for Sandy Bridge’s ability to accelerate decoding and encoding/transcoding.
“But wait,” you say. “AMD and Nvidia already accelerate those things using CUDA and Stream (now referred to as APP).” That’s true. But both companies are using general-purpose hardware to improve performance beyond what a software-only implementation can do. And while we’ve all been trained to think that general-purpose GPU computing is the future, at least relative to the more limited parallelism offered by a CPU, the tasks we’re talking about here simply cannot run as quickly or as efficiently (power-wise) in general-purpose logic circuits.
So, what’s the thinking here? We know that video—whether you’re talking about playback or encoding—is a common use case. Dedicating processing cores to that workload ties them up and uses a lot of power. We’ve seen this in our CPU reviews for years now (think about the MainConcept and HandBrake metrics). Software developers have had to parallelize their applications to make video-related workloads finish faster. And that means higher utilization, more power, more heat, and so on. I mean, really, video is one of the most demanding benchmark scenarios we regularly throw at a new chip.
Programmable EUs surrounded by Intel's fixed-function logic blocks
Intel’s answer was to build a dedicated block of silicon onto Sandy Bridge-based processors that does nothing but video. According to Dr. Hong Jiang, the senior principal engineer and chief media architect of Sandy Bridge, this decision was based on the pervasiveness of video. Intel is quite literally betting precious die space that video applies to a broader range of its customers than if it burnt transistor budget on more gaming performance. Of course, it helps that video is one of Intel’s competencies. The investment into Quick Sync ends up going a lot further than a more modest gain in 3D alacrity.
Needless to say, once word of Quick Sync spread, both AMD and Nvidia started burning rubber right away, working on their own answers to the fixed-function hardware built onto Sandy Bridge-based processors. But everything I’m hearing puts both companies a year away from having something able to compete. It’s like AMD with Eyefinity in that way—Intel took a major leap on the down-low, a number of ISVs were willing to play ball, seeing value added to their own products, and now the company has a major competitive advantage that’ll take a comparable effort to match.
What Does It Do?
There are two encompassing ideas here: encode and decode.
Intel already had a strong position on the decode front—its existing graphics-equipped processors are able to handle MPEG-2, VC-1, and AVC. However, motion compensation (the most complex piece of the decode pipeline) and loop filtering (applicable to VC-1 and AVC) have to be handled by the general-purpose execution units, eating up more power than necessary. Sandy Bridge rectifies this by moving the complete decode pipeline to an efficient fixed-function multi-format codec. It also adds MVC support, enabling Blu-ray 3D playback, too. Video scaling, denoise filtering, deinterlacing, skin tone enhancement, color control, contrast enhancement—all of those capabilities are addressed by blocks of logic in the graphics engine.
On the encode side, you have fixed-function logic working in concert with the programmable execution units. There’s a media sampler block attached to the EUs (Intel calls this a co-processor) that handles motion estimation, augmenting the programmable logic. Of course, the decoding tasks that happen during a transcode travel down the same fixed-function pipeline already discussed, so there’s additional performance gained there. Feed in MPEG-2, VC-1, or AVC, and you get MPEG-2 or AVC output from the other side.
Now, the way each software vendor employs Quick Sync is naturally going to be different, depending on the application in question. Take CyberLink, for example. PowerDVD 10 capitalizes on the pipeline’s decode acceleration. A MediaEspresso project is going to be significantly more involved—it’ll read the file in, decode, encode, and write the output stream back out. Then, in PowerDirector, a video editing app, you have to factor in post-processing—the effects and compositing that happen before everything gets fed into the encode stage.
For now, we have a few media playback apps (decoders) and a couple of media conversion titles (encoders/transcoders) to play with.

CyberLink’s MediaEspresso has already been optimized to take advantage of AMD’s Stream (now called APP) and Nvidia’s CUDA API. Taken in the context of a machine without any hardware-based acceleration at all, you get a really freaking nice speed-up.
But the Quick Sync optimizations put Sandy Bridge in another league entirely. Converting an almost-500 MB source to 1024x768 for playback on an iPad takes a scant 22 seconds.

MediaConverter 7 presented more of a challenge. The pre-release version optimized for Quick Sync also accelerated AMD’s Stream API, but it wouldn’t recognize our GeForce GTX 570. Swapping out to the currently-shipping demo version didn’t give us a drop-down to turn on CUDA either, but it at least yielded results that show acceleration is definitely turned on. Arcsoft’s app has the added benefit of giving you a utilization monitor.
The result is fairly compelling. Without acceleration enabled, the iPad profile transcode took 1:35, tying up a Core i7-2600K at roughly 30% utilization. With Nvidia’s card installed, utilization jumped to 50%, but the job finished almost 20 seconds faster. A Radeon HD 6870 turns back lower utilization numbers and better performance results. But the Quick Sync-optimized path is most impressive, wrapping up in 41 seconds and barely touching the processor cores at all.
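Normalizing those MediaConverter times makes the gap obvious. This is just simple arithmetic on the numbers reported above (the GTX 570 time is approximate, since the job finished "almost 20 seconds" ahead of the software path):

```python
# Speedups over the software-only path in Arcsoft MediaConverter 7,
# using the iPad-profile transcode times measured above (in seconds).
times = {
    "Software only (Core i7-2600K)": 95,   # 1:35
    "CUDA (GeForce GTX 570)":        76,   # approx.; "almost 20 s faster"
    "Quick Sync (HD Graphics 3000)": 41,
}

baseline = times["Software only (Core i7-2600K)"]
for path, seconds in times.items():
    print(f"{path}: {baseline / seconds:.2f}x the software-only speed")
```

Quick Sync comes out roughly 2.3x faster than the software-only path—while leaving the CPU cores nearly idle, which the raw times alone don't convey.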
Unfortunately, you have to be using Intel's integrated graphics core in order to take advantage of Quick Sync. Neither MediaEspresso nor MediaConverter is able to recognize the pipeline with a discrete card installed. So, if you're doing media work on a gaming PC, Quick Sync might not be an option for you.
That takes care of the encode/transcode optimizations, but what about decode? I really wanted to know what Arcsoft and CyberLink were doing with Quick Sync leading up to launch, so I spent time with both companies talking about their work.
Sub-10 percent utilization on a mobile CPU
It turns out that the decode pipeline on Sandy Bridge is so complete that even the AACS decryption is offloaded to fixed-function hardware. AACS employs AES encryption, which most Sandy Bridge-based CPUs accelerate, so that’s rather convenient.
In a best-case scenario, Arcsoft’s reps say that you bitstream encoded Dolby TrueHD or DTS-HD Master Audio to an HDMI 1.3- or 1.4-capable receiver (meaning there is no audio decoding for the CPU to do) and see CPU utilization as low as 0% playing back Blu-ray content.
Bitstreaming DTS-HD Master Audio
I used CyberLink’s PowerDVD 10 bitstreaming audio and didn’t get results that stunning. However, CPU utilization did hover around 10% on a Core i7-2820QM-equipped notebook while watching Quantum of Solace, which employs AVC.
Early previews benchmarking Sandy Bridge’s graphics suggested that we might have a solution capable of displacing some of the entry-level discrete market. Of course, Intel excitedly included that in the channel-oriented marketing material that press guys like me aren’t supposed to see.
As it pertains to the desktop market, though, you’re going to be disappointed for a few different reasons.
One Giant Leap…For Intel
Let’s start from the beginning—or at least the previous-generation implementation. As you know, the Clarkdale-based processors that launched one year ago were the first to benefit from Intel’s manufacturing leadership. Specifically, the dual-core processor was manufactured at 32 nm. But Intel used 45 nm lithography for a second on-package die, composed of a memory controller, PCI Express controller, and its Ironlake graphics engine.
That was a solid first step toward integrating even more functionality into the CPU, but it wasn’t ideal. Graphics performance was better than previous chipset-based implementations, yes. Memory performance dropped compared to Lynnfield and Bloomfield, though, since the controller migrated off-die.
With Sandy Bridge, all of that logic gets glued together, giving Intel a lot more control over its behavior. For example, the graphics core now has access to last-level cache, so the architecture has a mechanism to prevent thrashing between the cores and graphics engine. As mentioned, in 3D-heavy workloads, the power control unit can bias toward the graphics core, allocating it more thermal budget to run at frequencies of up to 1350 MHz.
The nomenclature Intel uses is similar from last generation to this one. Its HD Graphics engine still employs 12 scalar execution units (or EUs, for short) with DirectX 10.1 compatibility. However, a number of architectural improvements, like larger registers, mathbox integration, and new instruction support purportedly double the instruction throughput compared to the Ironlake GPU on Clarkdale. Factor in significant frequency increases and you’re looking at the potential for big performance gains.
A Tale Of Two GPUs
So far so good, right? Well, here’s where things start getting a little more…uh, weird.
There are two versions of the graphics core, inconspicuously dubbed HD Graphics 3000 (GT2) and HD Graphics 2000 (GT1). The former features all 12 EUs, while the latter is limited to six.
Of the 15 mobile Sandy Bridge-based SKUs being announced, all of them offer HD Graphics 3000. Some models run at up to 1300 MHz, others run at 1100 MHz, one does 950 MHz, and another tops out at 900 MHz. As you might guess, the final specification is largely dependent on TDP.
The desktop is a different story entirely. Of 14 new Sandy Bridge-based CPUs, only two of them come equipped with HD Graphics 3000. Almost humorously, those two chips are the K-series SKUs—enthusiast-oriented parts that I’m willing to bet will never get called on to perform a 3D task. The other 12 models—the ones that’ll go into more mainstream home and office desktops—get HD Graphics 2000. Those are the processors whose owners would actually appreciate saving $50 on an entry-level discrete card. And they get pegged with the handicapped version.
Power users spending an extra $20 on a K-series chip also buy discrete graphics. Period. This point gets hammered home even harder on the next page, where you learn that the H67 chipset needed to utilize on-die graphics doesn't support processor overclocking, so there's really no reason for a K-series/H67 combination.
There’s a fair chance my assessment will be different when it comes to testing Sandy Bridge in a mobile environment. LCD screens running lower native resolutions are a much better match for the more powerful HD Graphics 3000 engine. But before I even get into the benchmarks, it looks like Intel missed a real opportunity to show off its efforts on the desktop.
It’s A Numbers Game

Let’s kick things off with a title right up Intel’s alley: World of Warcraft: Cataclysm. I’m using the same benchmark seen in our recent performance evaluation of the game—a flight from Crushblow to The Krazzworks in Twilight Highlands.
The HD Graphics 3000 engine actually stands up impressively to AMD’s Radeon HD 4550 512 MB—a card you can find for roughly $25 online. The HD Graphics 2000 core doesn’t fare well at all, even using the second-lowest detail preset available. It’s faster than Clarkdale’s on-package solution using half as many EUs, which I suppose says something. But if you want to play this game (even at very modest settings), you’re not using a desktop Sandy Bridge processor.
I threw in a Radeon HD 5550 with 1 GB of DDR3 memory for the sake of comparison, found online for $55 or so. Offering more than twice the HD Graphics 3000's performance, that’s a pretty sizable upgrade if you want to bump the detail slider up a notch or run at 1920x1080.

Nobody ever said integrated graphics were supposed to handle first-person shooters, but CoD is notoriously graphics-light, so I figured I’d give it a shot.
Once again, to Intel’s credit, HD Graphics 3000 stands up well to a very entry-level add-in card. But the HD Graphics 2000 implementation’s only real bragging point is beating Clarkdale’s Ironlake engine. In comparison, a relatively affordable Radeon HD 5550 is actually playable at 1680x1050.


The outcome is as-expected in these two tests—HD Graphics 3000 shows promise against entry-level AMD gear, but unless you’re going to spend extra on a K-series processor and use its graphics engine, you have to look at the HD Graphics 2000 results instead. And those are only marginally better than Clarkdale (which doesn't even complete the Left 4 Dead 2 benchmark successfully).
At least on the desktop, integrated graphics maintains the status quo. It’s fine for elementary tasks, and it’s too anemic for gaming. What a let-down.
Sandy Bridge processors are not compatible with Intel’s 5-series chipsets. I guess this is fine, since you already have to buy a new motherboard as a result of LGA 1156 getting abandoned. But that doesn’t make another platform upgrade an easy pill to swallow for the mainstream market—presumably folks who don’t have two or three grand to spend on technology every time “next-gen” becomes “current-gen” hardware.
At launch, there are two desktop-oriented chipsets to go with Sandy Bridge: P67 and H67. The former is intended for use with discrete graphics. To that end, P67 is your only option for dividing the 16 lanes of processor-based PCIe connectivity between multiple graphics cards. For a majority of enthusiasts, P67 is the way to go. The latter option, H67, is the only way for you to take advantage of Sandy Bridge’s integrated graphics engine.
Worried about P67's performance with multiple graphics cards installed? Don't be. We've already shown that you can get very X58-like performance out of a P55 board armed with Nvidia's NF200 bridge chip, even using a trio of Radeon HD 5870s.
Both platform controller hubs serve up as many as 14 USB 2.0 ports. Neither of them supports USB 3.0. The pair also exposes as many as six SATA ports, two of which run at 6 Gb/s transfer rates (the other four are limited to 3 Gb/s). Neither extends support for legacy PCI.
In return, though, the two chipsets finally offer 5 GT/s signaling, enabling 500 MB/s per direction, per lane. P67 and H67 both include eight lanes, just like P55/H57. Presumably, that’ll be nice for add-on USB 3.0 and SATA 6Gb/s controllers, though Intel’s data sheet still lists the DMI interface at 1 GB/s in each direction, which could cause congestion.
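Whether that DMI link becomes a bottleneck is easy to reason about with peak numbers (real-world throughput will be lower on both sides of the comparison):

```python
# Peak bandwidth math for the 6-series PCH's uplink. Eight PCIe 2.0 lanes
# at 5 GT/s give 500 MB/s per direction per lane, but everything funnels
# through a DMI link Intel's data sheet rates at 1 GB/s per direction.
PCIE_LANES = 8
PCIE_MBPS_PER_LANE = 500      # MB/s per direction, per lane (5 GT/s)
DMI_MBPS = 1000               # MB/s per direction, per the data sheet

pch_peak = PCIE_LANES * PCIE_MBPS_PER_LANE    # aggregate PCH PCIe bandwidth
oversubscription = pch_peak / DMI_MBPS

print(f"PCH PCIe peak: {pch_peak} MB/s per direction")
print(f"DMI uplink: {DMI_MBPS} MB/s; oversubscription: {oversubscription:.0f}x")
```

On paper, the PCH's PCIe lanes can move four times what the DMI uplink carries. In practice you'd need several fast devices hammering the link at once—a couple of SATA 6Gb/s SSDs plus a USB 3.0 controller, say—before that ceiling matters.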
| | H67 Express | P67 Express | P55 Express |
|---|---|---|---|
| Interface | LGA 1155 | LGA 1155 | LGA 1156 |
| Memory Channels / DIMMs Per Channel | 2/2 | 2/2 | 2/2 |
| USB 2.0 | 14 | 14 | 14 |
| Total SATA (6 Gb/s) | 6 (2) | 6 (2) | 6 (0) |
| PCIe | 8 (5 GT/s) | 8 (5 GT/s) | 8 (2.5 GT/s) |
| PCI Slot Support | None | None | 4 |
| Independent Display Outputs | 2 | 0 | 0 |
| Protected Audio/Video Path | Yes | No | No |
| Rapid Storage Technology | Yes | Yes | Yes |
| Overclocking | Graphics-only | Processor ratio-only | Processor ratio / BCLK |
Being the only chipset able to expose Sandy Bridge’s graphics capabilities, H67’s differentiators are naturally graphics-oriented. The PCH can do dual independent display outputs, for starters. It’s also the key to a protected audio/video path—mandatory for Blu-ray playback and bitstreaming high-def audio to a receiver. Finally, H67 lets you manually overclock on-die graphics.
How about H67’s limitations? Well, H67 does not support processor overclocking. If you pay a premium for a K-series SKU to get the faster graphics engine, you’re limited to the chip’s highest Turbo Boost setting as its frequency ceiling. H67 is also locked to Sandy Bridge’s programmed memory and power limits. To get unlocked core, power, and memory settings, you have to use P67. More on overclocking after the jump…
Later in 2011, Intel will release a chipset called Z68, which will facilitate core and graphics overclocking on the same board. That’s not to be confused with X78—Intel’s next-gen flagship chipset, set to replace X58.
As you probably already know, Sandy Bridge dramatically alters the way enthusiasts can approach overclocking. For a great many, the days of wringing massive gains out of a scalable architecture like Nehalem are over. Options do still exist, though.
The back-story is already pretty well known. In an effort to simplify its design (which really does make sense from an engineering perspective), Intel integrated the clock generator into the 6-series chipsets. Now, one clock affects the entire system, meaning you can’t independently set the frequencies of various subsystems like PCI Express and the DMI.
Unfortunately, PCI Express doesn’t like to operate very far outside of its specification, so any significant deviation beyond the new 100 MHz BCLK causes problems. Though there are generally a few percentage points’ worth of wiggle room, the days of taking Nehalem’s 133 MHz BCLK up to 200+ MHz are history. Overclockers are basically losing one of the two variables that previously affected processor performance. Intel addresses this in two ways.
First, it carries over the unlocked K-series that first surfaced back in May of last year. These parts top out at a 57x ratio multiplier, enabling frequencies of up to 5.7 GHz without touching the BCLK. Intel says the 57x cap is largely a “design consideration,” whatever that means. The good news for the LN2 crowd is that the company is working on a BIOS that’ll go higher and apply to today’s CPUs. The K-series chips also offer "unlocked" DDR3 memory ratios, which aren't literally unlocked, but rather exposed up to DDR3-2133 (higher than most kits are capable of going anyway). Power and current limits can be custom-specified, too.
There are only two K-series parts at launch: the Core i7-2600K and the Core i5-2500K. The unlocked i7 costs $23 more than the partially-unlocked version of the same chip, while the i5 runs $11 more expensive than its less-flexible equivalent. When you consider that, at its default settings, the Core i5-2500K runs at 3.3 GHz and Turbo Boosts up to 3.7 GHz, compared to the Core i5-760 at 2.8 GHz, an unlocked Sandy Bridge chip for $11 extra is actually pretty damn sexy.
If you don’t buy a K-series chip and instead grab a Core i7-2600, Core i5-2500, -2400, or -2300 (along with a P67-based motherboard), you’ll still have access to “limited unlocking.” This basically means you can set clock rates up to four speed bins above the highest Turbo Boost frequency setting available at any given level of processor activity.
So, take a Core i7-2600 as an example. The chip’s base clock is 3.4 GHz. With four cores active, it gets one bin worth of additional performance—3.5 GHz. Four bins above that would be 3.9 GHz. With two cores active, Turbo Boost bumps it up two bins, to 3.6 GHz. Limited overclocking makes 4.0 GHz available in that case. In a best-case scenario, only one core is active. Turbo Boost adds four bins of frequency, yielding 3.8 GHz, and Intel’s overclocking scheme lets you run at up to 4.2 GHz.
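That bin arithmetic is easy to sketch (this assumes the i7-2600’s published 3.4 GHz base clock, 100 MHz bins, and a +1/+2/+4-bin Turbo schedule for four/two/one active cores):

```python
# Sketch of Sandy Bridge "limited unlocking": up to four speed bins above the
# Turbo Boost frequency for a given number of active cores.
# Assumptions: Core i7-2600's published 3.4 GHz base clock, 100 MHz bins,
# and Turbo bins of +1 (4 cores), +2 (2 cores), +4 (1 core) active.

BASE_MHZ = 3400
BIN_MHZ = 100
TURBO_BINS = {4: 1, 2: 2, 1: 4}   # active cores -> Turbo Boost bins
EXTRA_BINS = 4                    # the "limited unlock" headroom on non-K chips

for active, bins in sorted(TURBO_BINS.items(), reverse=True):
    turbo_mhz = BASE_MHZ + bins * BIN_MHZ
    ceiling_mhz = turbo_mhz + EXTRA_BINS * BIN_MHZ
    print(f"{active} cores active: Turbo {turbo_mhz} MHz, OC ceiling {ceiling_mhz} MHz")
# 4 cores active: Turbo 3500 MHz, OC ceiling 3900 MHz
# 2 cores active: Turbo 3600 MHz, OC ceiling 4000 MHz
# 1 cores active: Turbo 3800 MHz, OC ceiling 4200 MHz
```

The ceiling tracks the Turbo schedule rather than the base clock, which is why the single-core case ends up with the highest attainable frequency.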
Anyone with a K-series CPU overclocking on air is going to be in good shape. Thomas and I both have Core i7-2600Ks that’ll do 4.7 GHz at 1.35 V all day long. More mainstream folks with non-K i5s and i7s will at least have an extra 400 MHz to milk from their chips. It’s the value-oriented buyers with processor budgets between $100 and $150 (where AMD offers some of its best deals) who get screwed. The only two Sandy Bridge-based options under $175 are the Core i3-2100 and -2120 at 3.1 and 3.3 GHz, respectively. No Turbo, no BCLK option, no limited unlock—those chips are quite literally stuck.
As with the integrated graphics situation, I think that Intel missed the boat by trying to use overclocking as a differentiating feature. The guys who hit 7 GHz+ in our recent K-series overclocking contest are getting artificially capped. The folks buying at the bottom end of the mainstream stack can’t touch their BCLK or multiplier settings. And unless you buy one of two K-series SKUs, you’re on a Turbo Boost + 400 MHz leash.
Hopefully, AMD is taking notes. Though most of its newest 45 nm processors don’t offer a ton of headroom, shifting to 32 nm manufacturing later this year could make flexible Bulldozer-based CPUs very attractive to anyone who feels like Intel is muscling them out of overclocking at the high- and low-end.
New Names, Of Course
There are a total of 14 new desktop CPUs launching today (an additional 15 are being made available in the mobile space). The Core i3, i5, and i7 brands persist, roughly denoting entry-level, mainstream, and enthusiast parts. However, the modifiers are changing. Also, Intel is making more rampant use of suffixes at the end of the model names.
| i7-2600K | i7-2600 | i5-2500K | i5-2500 | i5-2400 | i5-2300 | i3-2120 | i3-2100 | |
|---|---|---|---|---|---|---|---|---|
| Price | $317 | $294 | $216 | $205 | $184 | $177 | $138 | $117 |
| TDP | 95 W | 95 W | 95 W | 95 W | 95 W | 95 W | 65 W | 65 W |
| Cores / Threads | 4/8 | 4/8 | 4/4 | 4/4 | 4/4 | 4/4 | 2/4 | 2/4 |
| Base Clock | 3.4 GHz | 3.4 GHz | 3.3 GHz | 3.3 GHz | 3.1 GHz | 2.8 GHz | 3.3 GHz | 3.1 GHz |
| Max. Turbo Clock | 3.8 GHz | 3.8 GHz | 3.7 GHz | 3.7 GHz | 3.4 GHz | 3.1 GHz | N/A | N/A |
| Memory (MT/s) | 1333 | 1333 | 1333 | 1333 | 1333 | 1333 | 1333 | 1333 |
| L3 Cache | 8 MB | 8 MB | 6 MB | 6 MB | 6 MB | 6 MB | 3 MB | 3 MB |
| HD Graphics | 3000 | 2000 | 3000 | 2000 | 2000 | 2000 | 2000 | 2000 |
| Max. Graphics Clock | 1350 MHz | 1350 MHz | 1100 MHz | 1100 MHz | 1100 MHz | 1100 MHz | 1100 MHz | 1100 MHz |
| Hyper-Threading | Yes | Yes | No | No | No | No | Yes | Yes |
| AVX Support | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Quick Sync Support | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| AES-NI Support | Yes | Yes | Yes | Yes | Yes | Yes | No | No |
| Interface | LGA 1155 | LGA 1155 | LGA 1155 | LGA 1155 | LGA 1155 | LGA 1155 | LGA 1155 | LGA 1155 |
Consistent across the new models is the ‘2’ leading each model designator. Of course, this represents Intel’s second-generation Core processors and, almost humorously, is the only component of Intel’s naming scheme that actually means something. The three numbers that follow are arbitrary performance indicators—exactly what you grew accustomed to from the Nehalem-era CPUs. Intel uses clock rate, L3 cache, Hyper-Threading, and Turbo Boost to differentiate one model from another. It’s a safe guess, though, that -2600 is faster than -2500 and so on.
Some of the model numbers end with four digits. Others are succeeded by a K, S, or T. We already know from the Core i7-875K and Core i5-655K that the K denotes an unlocked clock multiplier. Intel is offering two K-series SKUs—the Core i7-2600K and Core i5-2500K—both hit with a premium over the non-K versions. If you’re an enthusiast planning to overclock, it’s worth ponying up the extra cash for the more flexible parts.
S-series parts should be familiar as well. We’ve seen Intel play games with the S designator in the past, dropping performance on its Core i5-750S to hit an 82 W TDP and simultaneously raising its price. The company isn’t giving out prices on its S-class models prior to launch, claiming these are going to be channel-oriented CPUs that you won't be buying online. We do know these “lifestyle” parts feature lower 65 W TDPs, though, and will still hit the same maximum Turbo Boost levels when the thermal headroom exists.
The ‘T’ suffix is new, denoting a handful of low-power 35 and 45 W desktop processors that employ reduced voltages and base clock rates to hit more aggressive thermal profiles. The only model that defies Intel’s established nomenclature is the Core i5-2390T, which doesn’t feature four cores, as the i5 brand would suggest, but instead offers two cores with Hyper-Threading. Why this couldn’t have just been a Core i3, I’m not sure.
A New Interface, Too
This one is bound to rile up anyone who recently spent their Christmas cash on a new Lynnfield- or Clarkdale-based platform. Yes, Sandy Bridge employs a new processor interface called LGA 1155. Yes, that’s one pin off from the existing LGA 1156 interface, breaking compatibility with a socket that’s just over one year old. The physical package looks nearly identical, but the pin-out is different, and the socket is keyed to prevent you from dropping in a Lynnfield- or Clarkdale-based CPU.
LGA 1156 (left) and LGA 1155 (right): Completely different pin-outs
Intel says the move to LGA 1155 couldn’t be helped. Sandy Bridge revolves around the idea of integration. Things moved onto the processor die that weren’t there before and, as a result, pins had to be reassigned. The folks we talked to at Intel insisted that, had it been possible to make Sandy Bridge LGA 1156-compatible, the company would have done so, as it doesn’t make any money on an interface transition (that’s only partially true—it’s still selling the chipsets that go onto new motherboards).
But before you jump all over Intel for this one, realize that AMD faced the same challenges with its Bulldozer-based Zambezi, expected later this year. The company has gone on record saying it could have made the next-gen processor AM3-compatible, giving up architectural capabilities in the process. The smart move, however, was to simply transition to Socket AM3+, enabling the architecture’s full complement of features.
Bottom line: LGA 1155 breaks compatibility with the existing infrastructure, necessitating a platform upgrade. Unfortunately, the P67/H67 chipsets don’t really give you any features that weren’t already available on high-end P55-based motherboards, so the value proposition takes a substantial hit if you’re already rocking a decent mid-range machine.
| Test Hardware | |
|---|---|
| Processors | Intel Core i7-2600K (Sandy Bridge) 3.4 GHz (34 * 100 MHz), LGA 1155, 8 MB Shared L3, Hyper-Threading enabled, Turbo Boost enabled, Power-savings enabled |
| Intel Core i5-2500K (Sandy Bridge) 3.3 GHz (33 * 100 MHz), LGA 1155, 6 MB Shared L3, Turbo Boost enabled, Power-savings enabled | |
| Intel Core i5-2400 (Sandy Bridge) 3.1 GHz (31 * 100 MHz), LGA 1155, 6 MB Shared L3, Turbo Boost enabled, Power-savings enabled | |
| Intel Core i3-2100 (Sandy Bridge) 3.1 GHz (31 * 100 MHz), LGA 1155, 3 MB Shared L3, Hyper-Threading enabled, Power-savings enabled | |
| Intel Core i7-875K (Lynnfield) 2.93 GHz (22 * 133 MHz), LGA 1156, 8 MB Shared L3, Hyper-Threading enabled, Turbo Boost enabled, Power-savings enabled | |
| Intel Core i5-655K (Clarkdale) 3.2 GHz (24 * 133 MHz), LGA 1156, 4 MB Shared L3, Hyper-Threading enabled, Turbo Boost enabled, Power-savings enabled | |
| Intel Core i7-950 (Bloomfield) 3.06 GHz (23 * 133 MHz), LGA 1366, 8 MB Shared L3, Hyper-Threading enabled, Turbo Boost enabled, Power-savings enabled | |
| Intel Core 2 Quad Q9550 (Yorkfield) 2.83 GHz (8.5 * 333 MHz), LGA 775, 12 MB L2, Power-savings enabled | |
| AMD Phenom II X6 1100T (Thuban) 3.3 GHz (16.5 * 200 MHz), Socket AM3, 6 MB Shared L3, Turbo CORE enabled, Power-savings enabled | |
| AMD Phenom II X4 970 (Deneb) 3.5 GHz (17.5 * 200 MHz), Socket AM3, 6 MB Shared L3, Power-savings enabled | |
| Motherboard | Gigabyte P67A-UD7 (LGA 1155) Intel P67 Express, BIOS F6a |
| Gigabyte H67MA-UD2 (LGA 1155) Intel H67 Express, BIOS F6a | |
| Gigabyte P55A-UD7 (LGA 1156) Intel P55 Express, BIOS F8b | |
| Gigabyte X58A-UD7 (LGA 1366) Intel X58 Express/ICH10R, BIOS FC | |
| Gigabyte 890FXA-UD5 (Socket AM3) AMD 890FX/DB850, BIOS F6 | |
| Intel DX48BT2 (LGA 775) Intel X48 Express/ICH10R, BIOS 2006 | |
| Asus P7H57D-V EVO (LGA 1156) Intel H57 Express, BIOS 1606 | |
| Memory | Kingston 8 GB (4 x 2 GB) DDR3-2133, KHX2133C9AD3W1K2/4GX x 2 @ DDR3-1333, 7-7-7-20 and 1.65 V |
| Crucial 12 GB (3 x 4 GB) DDR3-1333, MT16JTF51264AZ-1G4D1 @ DDR3-1333, 7-7-7-20 and 1.65 V | |
| Hard Drive | OCZ RevoDrive X2 240 GB PCI Express x4 (Main Test Bed) |
| Intel SSDSA2M160G2GC 160 GB SATA 3Gb/s (Graphics/Quick Sync Test Bed) | |
| Graphics | Nvidia GeForce GTX 580 1.5 GB |
| AMD Radeon HD 5550 1 GB DDR3 | |
| AMD Radeon HD 4550 512 MB DDR3 | |
| Power Supply | Cooler Master UCP-1000 W |
| System Software And Drivers | |
| Operating System | Windows 7 Ultimate 64-bit |
| DirectX | DirectX 11 |
| Graphics Driver | Nvidia GeForce Release 263.09 (For GTX 580) |
| Intel GFX_Vista64_Win7_64_8.15.10.2266_PV (For Sandy Bridge and Clarkdale) | |
| AMD Catalyst 10.12 (For Radeon HD 6870 1 GB) | |
That's a ton of hardware, right? Well, not all of it was used for all of the tests.
All 10 processors are represented in the bulk of the benchmarks. However, for the entry-level gaming metrics, we used the H67- and H57-based motherboards hosting a Core i7-2600K and Core i5-661 CPU. Onto the H67 platform, we dropped two AMD discrete cards for comparison to Intel's on-die and on-package solutions.
We chose the Radeon HD 6870 1 GB and GeForce GTX 570 1.25 GB as best-case examples of what each respective company's GPGPU technologies could do versus Quick Sync. That platform consisted of Gigabyte's H67 board hosting a Core i7-2600K processor.
OCZ's RevoDrive X2 240 GB was used exclusively on the main test bench, while we leaned on an Intel SSD for the platform that was separately running the gaming/Quick Sync tests.

Three Sandy Bridge-based CPUs top the Overall chart, followed by the Core i7-875K, which benefits from its ability to Turbo Boost up to 3.6 GHz.
Although Intel’s Core i7-950 runs at a faster base clock rate and enjoys the throughput of a triple-channel DDR3 memory interface, its highest official memory data rate is 1066 MT/s (compared to Lynnfield’s 1333 MT/s). And it probably doesn’t help that the 900-series processor can only Boost to 3.33 GHz.
A lack of software able to fully utilize six cores and a lower base clock rate hurts AMD’s Phenom II X6 1100T overall. You do see that, in certain disciplines, like TV and Movies and Communications, the hexa-core chip’s standing improves. Meanwhile, the Phenom II X4 970 sees a comparative advantage thanks to its fixed 3.5 GHz clock rate.
This won’t be the first time I say this in my benchmark analysis: if Phenom II were living in a Core 2 world, it’d be sitting pretty. Up against Nehalem, it simply gets outclassed. Compared to Sandy Bridge, it’s just not close. But then again, PCMark Vantage is technically a synthetic. Let’s keep moving and see if more specific workloads change the story our data tells.








A more recent addition to our benchmark suite, 3DMark11 is principally a gaming metric—the dedicated Graphics score clearly reflects this in a very tight grouping of results using our GeForce GTX 580 reference board.
However, several components of this test also employ CPU-based physics—specifically, the Bullet library. The result is a more spread-out grouping in the overall Performance test. Even so, it’s hard to start declaring winners with fewer than 1000 points separating 10 different contenders.
Drop down to the broken-out Physics test, though, and it’s clear that the high-frequency quad-core Sandy Bridge- and Nehalem-based chips get favored. In fact, it looks like plenty of Turbo Boost headroom gets the Core i7-2600K and Core i7-875K their first- and second-place finishes, followed by the quad-core (it doesn’t seem like scaling to eight threads matters much here) Core i5-2500K.
What does hurt is having two physical cores—even aided by Hyper-Threading. The Core i3-2100 and Core i5-655K are soundly beaten by the older Core 2 Quad Q9550.








As we’ve come to expect, Intel’s processors deliver solid integer performance, while the AMD CPUs do well in floating point math. These same results are reflected in both the Arithmetic and Multimedia benchmarks.
The Cryptography test naturally favors the Intel CPUs with AES-NI support. It’s notable that AES encryption bandwidth is actually higher on Sandy Bridge than it was back on Clarkdale. This was something that Intel announced back at IDF 2010, and we see that claim quantified here. Also, bear in mind that the new second-gen Core i3s—just like the first generation—ship with AES-NI disabled. That’s why the Core i3-2100 falls into last place.
Finally, Intel’s dual-channel memory controller appears to be improved, delivering up to 1 GB/s more bandwidth using the exact same modules set to the exact same latencies compared to Lynnfield. The Clarkdale-based Core i5-655K is naturally much slower, since its memory controller is off-die/on-package. Core 2 Quad pulls up the rear, complemented by the last of a dying breed of platforms with chipset-based memory controllers.

When it comes to professional content creation, it’s more common to find apps optimized for as many processing cores as you make available. Given AMD’s Phenom II X6 1100T placement, 3ds Max 2010 is employing all six of that chip’s execution cores. But Intel still manages to swipe a first-place finish using its Core i7-2600K. That’s a roughly $320 Intel processor going up against a roughly $270 AMD chip. Overall, not a bad showing for AMD…
…that is, until you get to Intel’s Core i5-2500K, which follows right behind AMD’s flagship, offering an unlocked multiplier and 95 W TDP for just under $220. Note that it’s beating Intel’s Core i7-875K—a Hyper-Threading-enabled CPU that sells for $100 more at the time of publication.
The rest of the field falls in place behind, for the most part tightly grouped. Check out the Core i5-2400 and Core i7-950 going head-to-head. That’s a $184 CPU matching stride with a $300 chip.

Seemingly bottlenecked by something other than threading, Photoshop is almost as kind to the Core i5-2500K as it is to the -2600K, both of which slide past the Core i7-875K.
Surprisingly, the Bloomfield-based Core i7-950 falls to the middle of the pack, just ahead of AMD’s hexa-core flagship. Confused as to why that part is able to best the 3.5 GHz Phenom II X4 970? Either Turbo CORE is kicking in, allowing the X6 1100T to jump to 3.7 GHz, or Photoshop really can put all six cores to work.
At least we know it’s able to use more than two cores—both dual-core Intel parts bring up the rear, including the Core i3-2100 in last place.

We use this Paladin sequence for benchmarking graphics. However, Premiere Pro CS5 doesn’t officially support CUDA acceleration on the GeForce GTX 580 without a little software hack. Otherwise its Mercury Playback Engine leans on CPU muscle.
More than 40 minutes of rendering time separates the first- and last-place finishers. We again see the Core i7-2600K up top—likely a result of heavy parallelism, which gives all three eight-thread CPUs first, second, and third spots. Two four-core Sandy Bridge chips follow, and AMD’s Phenom II X6 1100T falls behind them.
The message here is clear. First, if you’re doing heavy lifting in Premiere Pro, make sure you have CUDA support. We’ve seen this test finish in less than two minutes running on a mid-range GeForce card. Second, if you choose to ignore us on point one, throw as much CPU horsepower as possible at the app—it’ll use it.

Our custom After Effects benchmark isn’t as demanding as the Premiere Pro test. It’s hardly a surprise to see a trio of Sandy Bridge-based CPUs take the top three places, followed by Lynnfield and Bloomfield.
The six-core Phenom II X6 1100T’s extra processing resources give it an advantage over the Phenom II X4 970’s higher clock rate. And both AMD chips outmaneuver the older Core 2 Quad Q9550, along with Intel’s dual-core offerings.

By popular request, we’ve incorporated Blender into the test suite with a custom image rendering.
Starting from the bottom, both dual-core processors get embarrassed, despite the fact that Hyper-Threading allows each to operate on four threads at a time. Loud and clear, we’re hearing that dual-core chips aren’t the way to go for content creation.
Above that, you’re looking at a two-generation-old Core 2 Quad CPU and AMD’s fastest quad- and hexa-core processors. AMD’s placement on the charts is a little deceptive, though, since the Phenom II X6 1100T is only six seconds slower than Intel’s Core i5-2400.
Still, the X6 1100T is priced at $265, while Intel asks $184 for its quad-core part.

Although the OpenGL test was run using a GeForce GTX 580 on all of these platforms, its results look like a shotgun blast on the wall. More relevant here are the CPU ratings, which put the Core i7-2600K in its familiar first-place spot, followed by AMD’s Phenom II X6 1100T.

ABBYY’s FineReader 10, an optical character recognition app, was another requested benchmark. We’ve automated the scanning of a 111-page document for testing—a task that apparently really appreciates parallelism.
The top two finishers are quad-core, Hyper-Threading-enabled CPUs, and third place goes to AMD’s six-core entrant. Fourth belongs to the Core i7-950 (Bloomfield), also able to work on eight threads concurrently.
It takes more than twice as long to get this workload completed on either of the dual-core CPUs compared to Intel’s Core i7-2600K.
Another significant comparison here is between the Core i5-2400 and Phenom II X4 970 (both $185 processors). In case you haven’t been keeping score, the i5 has beaten the Phenom II in every single test. It looks like AMD is going to have to drop prices to make its fastest quad-core chip competitive.

I stopped using the Lame benchmark a while back, but it makes for yet another point of comparison (and a decent indication of performance on a clock-for-clock basis, given its single-threaded operation), so I’m including it here.
The scaling falls in line with what we’d expect. As you get down to the Phenom IIs, bear in mind that the X6 1100T has Turbo CORE, which is able to send it to 3.7 GHz in a workload like Lame.
The only other anomaly would seem to be the Core i7-875K. But its 3.6 GHz Turbo Boost ceiling is undoubtedly the reason it beats the Core i5-655K, despite a sizable base clock disparity.

We phased WinZip out a while ago. But with the release of WinZip 14, Intel got the developer to include AES-NI support. So, we’re putting it back into rotation, alongside the latest versions of WinRAR (no AES-NI support) and 7-Zip (free to use; includes AES-NI).
If you didn’t know any better, you’d think we duplicated the charts for Lame and WinZip. Indeed, it looks like the fine folks at WinZip still haven’t optimized for threading, and so performance is based almost exclusively on IPC throughput and clock rate, rather than parallelism. Boooring.

Thanks to all of our readers who suggested adopting WinRAR 4.00 and 7-Zip 9.20—we’ve upgraded both utilities to the latest versions.
While the results of our WinRAR compression routine don’t look significantly different than the WinZip charts, you will notice that the dual-core Core i5-655K gets knocked down to last place and the Core i3-2100 falls behind the Core i7-950 and Core i7-875K.
Unfortunately for AMD, the six-core 1100T and four-core 970 don’t move up in the standings, despite the threading optimizations in WinRAR. We’re not sure if this is a development issue or not, but there’s a pretty clear tendency toward the new Sandy Bridge processors here.

Rather than run the same set of files through a third compression utility, we took advantage of 7-Zip’s built-in ability to measure each platform’s performance in millions of instructions per second.
Parallelism really benefits the 7-Zip metric, which we set to take advantage of all available threads on each CPU.
We’ve already looked at two applications that were written to take advantage of Intel’s Quick Sync pipeline prior to the Sandy Bridge launch. Let's see those results one more time, as a reminder of what's possible when the processing load shifts from general-purpose hardware to specialized fixed-function logic:


Of course, there are more programs still lacking the requisite hooks—these have to employ general-purpose execution resources to get their workloads finished. As such, they reflect the performance of the CPU cores.

An upgrade to our iTunes benchmark makes us current yet again, but it’s really of little consequence since Apple still runs everything on a single thread. He who sits down to this table with the most aggressive turbo implementation wins.
Intel’s Core i3-2100 and Core i5-655K swap places compared to the Lame and WinZip charts on the previous page. Otherwise, they’re identical. Without more in the way of developer attention, that’s the way every single-threaded title is going to end.

Fortunately, not every application is as poorly optimized as iTunes. MainConcept uses as many processor threads as it can get its hands on. Moreover Sonic Solutions recently launched version 1.1 of its CUDA SDK, facilitating transcoding from MPEG-2, VC-1, or H.264 to H.264 in hardware. Hopefully, the company will update the software to exploit Quick Sync as well.
Intel’s Core i7-2600K is the lone processor to complete our test in under a minute. From there, the Lynnfield-based Core i7-875K takes second, followed by AMD’s Phenom II X6 1100T. Intel’s Core i5-2500K is just one second behind, tying the pricier Core i7-950, which operates on eight threads concurrently, but runs at a slower clock rate.

HandBrake also makes good use of parallelism, handing the Core i7-2600K a massive win. The Phenom II X6 1100T shows well, edging out the Core i7-875K by one second for second place. From there, things look very similar to the MainConcept benchmark, with the Core i5-2500K, Core i7-950, and Core i5-2400 landing close together.
AMD’s quad-core Phenom II X4 970 trails further back, besting the aged Core 2 Quad. Naturally, the two dual-core models bring up the rear in any benchmark that emphasizes threading.



We used the fastest single-GPU graphics card available in order to expose any platform-oriented bottlenecks in Metro 2033. With that said, it’s hard to imagine anyone buying a GeForce GTX 580 and gaming at 1680x1050. If they did, they’d see performance start to drop off in a noticeable way starting with AMD’s Phenom II X4 970, continuing on through Intel’s dual-core offerings, and ending with an older Core 2 Quad Q9550.
The moral of the story here seems to be that, as you step up to higher-end graphics, a dual-core processor simply isn’t fast enough.
Also interesting is that the six-core Phenom II X6 1100T, though not the fastest offering, opens up enough headroom to enable the highest minimum frame rate in our Metro 2033 benchmark. That advantage shrinks as you crank resolution up, though, shifting more demand onto the GPU. By the time you hit 2560x1600, eight of 10 platforms fall within one frame per second of each other.



F1 2010 is far more CPU-dependent than Metro 2033, a notoriously graphics-heavy game. In fact, even with the same GeForce GTX 580 installed through our run of 10 different platforms, the difference between first and last place is more than double!
Three Sandy Bridge-based CPUs take the top three spots at 1680x1050 and 1920x1080. Things get a lot closer at 2560x1600 though, where graphics take more of the load. Ignoring 1680x1050 (nobody with high-end graphics and a respectable processor is gaming on a 17” display, right?), the AMD processors get pounded. A $185 Core i5-2400 does 83.3 FPS at 1920x1080, while the $185 Phenom II X4 970 manages just under 50. You really have to crank the resolution up another notch to 2560x1600 (or dial up anti-aliasing) to get these CPUs on more equal footing.



This one’s easy. Aliens Vs. Predator doesn’t give a damn about the CPU under your hood. Even at 1680x1050, there are about four frames per second difference between the 10 tested configurations. Feed this one GPU power and it’s good to go.

I’ve been having fun logging power in my graphics card reviews, so I’m going the same route here. I took data for all 10 configurations, but that turned out to be very messy on a single chart. So I left out the two lower-end Sandy Bridge chips, along with the Lynnfield and Clarkdale processors.
With six different series on the graph, there are some interesting observations to make. First, the Phenom II X6 1100T sucks down a lot of power, seemingly followed by the Phenom II X4 970.
In actuality, when you run the averages, Intel’s Core i7-950 turns out to be the second most power-hungry processor (the X6 1100T winds up at 197 W, the i7-950 sits around 181 W, and the X4 970 averages 180 W).
| PCMark Vantage Complete Run | Core i7-2600K (Sandy Bridge) | Core i5-2500K (Sandy Bridge) | Core i7-950 (Bloomfield) | Core 2 Quad Q9550 (Yorkfield) | Phenom II X6 1100T (Thuban) | Phenom II X4 970 (Deneb) |
|---|---|---|---|---|---|---|
| Average System Power | 163.99 W | 164.34 W | 181.73 W | 161.56 W | 197.12 W | 180.91 W |
How do Intel’s two fastest Sandy Bridge-based chips fare? The Core i7-2600K sits at 164 W. So does the Core i5-2500K. Compare those figures to the Core 2 Quad Q9550, which averages 161 W. Then go back and look at the PCMark Vantage results page. The Core i7-2600K pulls a first-place finish. The Core 2 Quad winds up last. Are these 32 nm chips more efficient (getting more work done within a similar power profile)? Yeah, we’d say so. We'll be following up in the next couple of days with a story dedicated to comparing Sandy Bridge's efficiency to a number of other platforms. More on that soon.
No doubt, there’s a lot going on in this launch. The Sandy Bridge introduction hits a number of high notes that have me dusting off an award, while simultaneously compelling me to cringe at a couple of Intel’s clumsier moves.
Let’s start with the bad, so I can wrap up on a positive note for the New Year.
Overclocking isn’t handled well at all. Really, the only viable option for power users is a K-series SKU. That’s not entirely bad, of course. Less than one year ago, the only unlocked option in Intel’s portfolio was priced at $999. The fact that we have a couple of choices in the $200 and $300 ranges is great. But the limited overclocking (Core i5/i7) and outright lack of options (Core i3) strikes a sour chord sure to burn off a lot of the enthusiast equity Intel earned by launching the K-series chips last year.
The graphics situation, at least on the desktop, is also pretty wacky. Of the 14 models introduced at launch, the two best suited to enthusiast-oriented gaming machines with discrete GPUs are the ones armed with Intel’s HD Graphics 3000 engine. The other 12—conceivably candidates for more mainstream gaming builds, office desktops, and HTPCs—sport the downright average HD Graphics 2000 implementation.
Those two gripes out of the way, how could we not be impressed by Sandy Bridge’s performance? Existing Lynnfield- and Clarkdale-based processors already offer strong performance compared to AMD’s lineup. Significant gains, clock-for-clock, compound in the face of notable frequency increases across the board (thanks to a mature 32 nm process), giving Sandy Bridge an even more commanding position.
I’m also a big fan of Quick Sync. Neither AMD nor Nvidia have an answer to Intel’s decode/encode acceleration, and they’re not expected to any time soon. If you do a lot of video editing or transcoding, an upgrade to Sandy Bridge might be warranted based solely on the time you’ll save by virtue of this feature. Kudos to Intel for getting developer support lined up right out of the gate, too. If the graphics guys could rally the software industry as quickly, we'd already be swimming in CUDA- and APP-accelerated titles.
If there was one Sandy Bridge-based SKU that I’d personally recommend to friends and family building new PCs, it’d be the Core i5-2500K. Its performance relative to AMD’s lineup and the rest of Intel’s stack is noteworthy—especially given its price tag just north of $200. The i5-2500K circumvents Sandy Bridge's overclocking challenges with an unlocked multiplier, and I'm counting on gamers to drop it onto a P67-based motherboard, skirting the integrated graphics debate entirely.

And while this is only the second time in two and a half years that I’ve dusted off the Recommended Buy award for a very deserving processor, you’d better believe I have an eye to the future, waiting to see how AMD’s Bulldozer architecture contends with Intel’s ever-plodding tick-tock cadence.
For a chance at winning your own Core i7-2600K-based PC, please click this link to enter our CyberPower PC/Tom's Hardware contest. The system's specs are as follows:
- Intel Core™ i7-2600K LGA 1155
- Thermaltake Armor A60 Mid-Tower Chassis
- Asus P8P67 Pro LGA 1155 ATX Mainboard
- Asus ENGTX 460 Video Card
- Asus BC-08B1LT 8x Blu-ray Player & DVD-RW Combo
- Kingston 4 GB (2 x 2 GB) DDR3-1600 Dual-Channel Memory Kit
- Maximum 120 mm Case Cooling Fans
- Thermaltake Frio CPU cooler
- 1 TB SATA 6 Gb/s 64 MB Cache 7200 RPM HDD
- Microsoft® Windows® 7 Home Premium
- Thermaltake 600 W Power Supply
Contest is limited to residents of the USA (excluding Rhode Island) 18 years of age and older. Contest starts on January 2, 2011 9:00 PM, Pacific Standard Time and closes on January 17, 2011 11:59 PM, Pacific Standard Time.
Results will be announced by January 21, 2011.
The information you provide will only be used to contact you in relation to this contest.
YOU MAY SUBMIT ONLY ONE ENTRY. MULTIPLE ENTRIES FROM THE SAME PERSON WILL ALL BE DISCARDED.















