AMD's Bulldozer Architecture: Overclocking Efficiency Explored

Overclocking And Undervolting AMD's FX Family

AMD's new FX family was highly anticipated, but its performance underwhelmed us. Rather than leapfrogging Intel's mainstream CPUs, it managed to match them at best and fall behind them at worst. That is a consequence of a ground-up redesign, in which some decisions cost performance and others were made with power efficiency in mind. In theory, those decisions should make the FX family more efficient than its predecessor, and the move to a 32 nm manufacturing node should help, too.

Just how does the design fare with regard to power as you move frequency around? That's what we're aiming to find out.

There are seven models based on the Bulldozer architecture, presenting a range of clock rates and prices to the folks interested in dropping one of these chips into a Socket AM3+-based motherboard. For more information about them, check out AMD Bulldozer Review: FX-8150 Gets Tested.

Better Utilization Thanks To Second-Gen Turbo Core

Turbo Core, similar to Intel's Turbo Boost technology, tries to optimize processor performance by evaluating several power-related variables in real time and adjusting clock rate in response. When thermal headroom permits, the feature increases frequency, completing workloads faster and ideally dropping you back to idle more quickly.

From our FX launch story:

"Application Power Management (APM) describes Zambezi/Valencia/Interlagos’ ability to monitor (in real-time) the amount of power each core consumes. Rather than taking thermal or current measurements, the activity of each Bulldozer module is tracked. AMD knows how much power each operation requires and is able to come up with instantaneous power use on a per-module basis. A quick comparison between real consumption and maximum TDP indicates whether or not there’s headroom to increase performance. In an example where you’re running an application that doesn’t tax the processor’s resources, Turbo Core dithers between the processor’s base frequency and a higher clock rate, jumping between them to average better overall performance at the defined TDP.

Turbo Core isn’t limited to just a base and some arbitrarily higher frequency, either. It’s actually implemented in three p-states: the base (referred to as P2), an intermediate state (P1), and a higher state (P0). That’s an improvement over the first-gen version of Turbo Core, which AMD says only switched between two p-states. And it’s significant, too, because you can enter P1 with all eight cores active, so long as the headroom is there. Stepping up to P0 requires at least two of four modules to idle. AMD does allow the chip’s TDP to be exceeded instantaneously, but of course it can’t hold that for any thermally significant amount of time.

As such, when you look at the specs for an FX processor and see CPU Base, CPU Turbo Core, and CPU Max. Turbo, you are guaranteed to always get at least that base frequency. You’ll see the Turbo Core clock rate so long as TDP is in check (as it would be in a well-threaded workload that doesn’t exceed the processor’s thermal ceiling). And, whenever half of the chip’s cores are idle, it’s possible to realize maximum Turbo Core speeds."
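The p-state rules quoted above can be sketched in a few lines of Python. This is only an illustration of the decision logic as described, not AMD's implementation: the `ChipState` type, its field names, and the `select_pstate` function are hypothetical, and the real hardware dithers between states over time rather than making a single choice.

```python
from dataclasses import dataclass


@dataclass
class ChipState:
    """Hypothetical snapshot of the values Turbo Core is described as using."""
    estimated_power_w: float  # APM's activity-based power estimate (assumed name)
    tdp_w: float              # the processor's rated TDP
    active_modules: int       # Bulldozer modules currently doing work
    total_modules: int = 4    # an eight-core FX chip has four modules


def select_pstate(state: ChipState) -> str:
    """Pick a p-state following the rules in the quoted launch story."""
    # No headroom under TDP: stay at the guaranteed base frequency (P2).
    if state.estimated_power_w >= state.tdp_w:
        return "P2"
    # P0 (maximum Turbo) requires at least two of the four modules to idle.
    idle_modules = state.total_modules - state.active_modules
    if idle_modules >= 2:
        return "P0"
    # Otherwise P1 is reachable, even with every core active.
    return "P1"


# Usage: a fully loaded chip with headroom lands in the intermediate state.
print(select_pstate(ChipState(estimated_power_w=100.0, tdp_w=125.0, active_modules=4)))
```

The sketch treats headroom as a simple comparison against TDP. The quoted text notes that the chip may exceed TDP momentarily, which this model deliberately ignores.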

How Efficient is Bulldozer?

Although a superficial examination of AMD's architecture sets some pretty lofty expectations on the efficiency front, enthusiasts only really care about how those expectations translate to the real world. We answered a lot of questions in AMD FX: Energy Efficiency Compared To Eight Other CPUs. However, in that story, we limited ourselves to stock clocks. Here, we're expanding our analysis to overclocking.

We also want to find out where the Bulldozer architecture achieves a balance between low voltage, low power, and decent performance. It's particularly convenient, then, that all of the FX-based processors feature unlocked multiplier ratios. Combined with our test bench's firmware, which lets us easily modify voltage and performance, we're able to fine-tune performance very flexibly. We have six different combinations of clock rate and voltage to explore, so let's get to it.
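One simple way to compare clock and voltage combinations is the energy consumed per completed workload: average power multiplied by runtime. The sketch below assumes both values have been measured for each setting. The function names, the setting labels, and the sample numbers are placeholders chosen for illustration, not results from our testing.

```python
def energy_wh(avg_power_w: float, runtime_s: float) -> float:
    """Energy used to finish a workload, in watt-hours."""
    return avg_power_w * runtime_s / 3600.0


def most_efficient(settings: dict[str, tuple[float, float]]) -> str:
    """Return the label of the setting that uses the least energy per run.

    `settings` maps a label to (average power in watts, runtime in seconds).
    """
    return min(settings, key=lambda label: energy_wh(*settings[label]))


# Placeholder values only; substitute real measurements.
example = {
    "setting A": (150.0, 600.0),  # 25 Wh
    "setting B": (200.0, 500.0),  # faster, but uses more energy overall
    "setting C": (110.0, 700.0),  # slower, but uses the least energy
}
print(most_efficient(example))
```

A faster setting wins on this metric only if the time it saves outweighs the extra power it draws, which is the trade-off the overclocking and undervolting runs are meant to expose.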

  • aznshinobi
    Reading the conclusion paragraph, I'd have to agree. I think they probably would've been better off using the STARS arch and just die shrinking it to 32nm.
  • Darkerson
    I know I have been critical in my comments here and there, but I really do hope Bulldozer helps AMD learn and refine Piledriver and future CPUs so that they are all better as a result. I know I will be skipping BD, but that doesnt mean I dont ever want to use AMD again. I will always root for the underdog, in hopes that we have another Athlon 64 on our hands again.
  • hellfire24
    gulftown=expensive and useless.
    Sandybridges=king of the hill(price to performance)
    Sandybridge-E=expensive sandybridge.
    Bulldozer=budget cpu with multitasking capabilities.
  • deadon2
    Fehh... did my build on a 990fx platform with a 955be CPU. Runs plenty fast, and I can upgrade the AM3+ in a year when AMD gets it right.

    Although I appreciate the work done on this article...

    Nothing to see here folks, move along...
  • dontcrosthestreams
    im just fine with my 110$ 955be.... 29 deg idle at 3.7ghz
  • noob2222
    Is that a typo on page 7 and 8? "Clock Frequency: 4.5 GHz, Multiplier: 22.5x, CPU Voltage: 1.428 V" cpu-z shows 1.380? page 8 cpu z shows 1.44 and not 1.5.

    As for my own efficiency testing, I achieved 1.375V (cpu z), 4.4Ghz out of my 8120 with ease. I upped the NB to 1.115v (+.015V), which added more stability, and clocked the NB to 2600 to match HTT, which brought another 1gb/s on sandra's memory test. All without disabling C1E or C3 states.

    Would be nice to see some followups with memory testing, BD responds to fast speeds. Hard to read since its in a different language but the graphs are easy enough to see
    http://www.planet3dnow.de/vbulletin/showthread.php?t=401023&garpg=13
  • Tom's Hardware finds that overclocking increases speed, power requirements. Film at 11.
  • de5_Roy
    yay! another efficiency article from toms. :love:
    sad to see amd's claims about efficiency turn out to be (much) less than accurate.
    some people are definitely gonna complain about the ram used (ddr3 1333) and windows 8 or lack of highly threaded benchmarks like truecrypt encryption or pov ray tracing (as if those are always used by regular users lol) and stuff.
    undervolting does look promising...but it doesn't seem to make any difference compared to sandy bridge systems. worse, bulldozer needs voltage increase to get more clockspeed.. i guess it will be more evident in fx 4100 and 6100 where substantial core voltage increase is necessary to get stock sandy bridge level performance out of them. that's just disappointing.
  • memadmax
    It seems to me that Bulldozer is either a AMD bastard child chip, or it's a first gen chip of which subsequent generations of the architecture will be playing "catch up" performance wise. Otherwise, it's typical AMD trying to be efficient rather than a heavy hitter.

    But if you ask me, this is a "defensive" chip in the processor wars. And no war has been won playing defense.
  • shinkueagle
    memadmax wrote: It seems to me that Bulldozer is either a AMD bastard child chip, or it's a first gen chip of which subsequent generations of the architecture will be playing "catch up" performance wise. Otherwise, it's typical AMD trying to be efficient rather than a heavy hitter. But if you ask me, this is a "defensive" chip in the processor wars. And no war has been won playing defense.
    Meaning this war is a TOTAL loss to AMD... SADLY... AMD - ABSURDLY MORONIC DEVICES.