Enabling Turbo Core
Back when AMD launched its Phenom II X6 1090T, it debuted a feature called Turbo Core. It was supposed to be an answer to Intel’s Turbo Boost capability, which would capitalize on available TDP in poorly-threaded workloads (where other cores simply sat idle) to increase clock rate.
As you know, Turbo Boost (Intel’s feature) employs an on-die power controller that evaluates temperature, current, power consumption, and operating system states. With all of that information, it can shut down idle cores, freeing up thermal headroom to accelerate active cores. The degree of acceleration is contingent on how many cores are in use. Obviously, there’s a lot more room to ratchet up clock rate in a single-threaded application. As a result, you end up with a frequency map of sorts that scales up and down depending on the parallelism of any given application. As a quick example, from Intel Core i5 And Core i7: Intel’s Mainstream Magnum Opus:
Turbo Boost: Available Bins (Under TDP/A/Temp)
| Processor Number | Frequency | 4 Cores Active | 3 Cores Active | 2 Cores Active | 1 Core Active |
|---|---|---|---|---|---|
| Core i7-870 | 2.93 GHz | 2 | 2 | 4 | 5 |
| Core i7-860 | 2.8 GHz | 1 | 1 | 4 | 5 |
| Core i5-750 | 2.66 GHz | 1 | 1 | 4 | 4 |
| Core i7-975 | 3.33 GHz | 1 | 1 | 1 | 2 |
| Core i7-950 | 3.06 GHz | 1 | 1 | 1 | 2 |
| Core i7-920 | 2.66 GHz | 1 | 1 | 1 | 2 |
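To make the bin counts concrete, here is a rough sketch that converts them into peak clock rates. It assumes each bin is one multiplier step on a 133.33 MHz base clock, which the table itself does not state, and it starts from the rounded base frequencies listed above, so the results are approximate. The names `BIN_GHZ`, `TURBO_BINS`, and `max_turbo_ghz` are made up for this example.

```python
# Assumption (not stated in the table): one Turbo Boost bin equals one
# multiplier step of a 133.33 MHz base clock.
BIN_GHZ = 0.4 / 3

# Base clock (GHz) and available bins per active-core count, copied from the table.
TURBO_BINS = {
    "Core i7-870": (2.93, {4: 2, 3: 2, 2: 4, 1: 5}),
    "Core i7-860": (2.80, {4: 1, 3: 1, 2: 4, 1: 5}),
    "Core i5-750": (2.66, {4: 1, 3: 1, 2: 4, 1: 4}),
    "Core i7-975": (3.33, {4: 1, 3: 1, 2: 1, 1: 2}),
    "Core i7-950": (3.06, {4: 1, 3: 1, 2: 1, 1: 2}),
    "Core i7-920": (2.66, {4: 1, 3: 1, 2: 1, 1: 2}),
}


def max_turbo_ghz(model, active_cores):
    """Return the approximate peak clock for a model with N cores active."""
    base_ghz, bins = TURBO_BINS[model]
    return round(base_ghz + bins[active_cores] * BIN_GHZ, 2)
```

Under that assumption, `max_turbo_ghz("Core i7-870", 1)` works out to about 3.6 GHz, five bins above the 2.93 GHz base.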
Turbo Core (AMD’s feature), in comparison, was presented as a deterministic feature that turns on in lightly-threaded workloads where three or fewer cores are active, or off altogether when an application taxes anything more than three cores. In practice, it didn’t seem nearly as binary as AMD described. What I saw in AMD Phenom II X6 1090T And 890FX Platform Review: Hello, Leo was that cores would jump to many different frequencies, never really settling on what was suggested as the top Turbo Core clock rate. As a result, performance gains attributable to Turbo Core seemed more modest than what I expected.
Fortunately, AMD says it made some changes to the technology for Bulldozer that should improve its effectiveness compared to Thuban.
FX Does Turbo Core A Little Differently
Application Power Management (APM) describes Zambezi/Valencia/Interlagos’ ability to monitor, in real time, the amount of power each core consumes. Rather than taking thermal or current measurements, the activity of each Bulldozer module is tracked. AMD knows how much power each operation requires and is able to estimate instantaneous power use on a per-module basis. A quick comparison between that estimated consumption and maximum TDP indicates whether or not there’s headroom to increase performance. In an example where you’re running an application that doesn’t tax the processor’s resources, Turbo Core dithers between the processor’s base frequency and a higher clock rate, jumping between them to average better overall performance at the defined TDP.
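The activity-based estimate described above can be sketched roughly as follows. This is not AMD’s implementation: the event names, the per-event energy weights in `ENERGY_NJ`, and both function names are invented for illustration, and the real hardware’s counters and weights are not given here.

```python
# Hypothetical energy cost per tracked event, in nanojoules.
# These weights are invented for illustration only.
ENERGY_NJ = {"int_op": 0.5, "fp_op": 1.2, "l2_access": 2.0}


def module_power_w(event_counts, interval_s):
    """Estimate one module's power from its activity over an interval."""
    energy_j = sum(ENERGY_NJ[name] * count
                   for name, count in event_counts.items()) * 1e-9
    return energy_j / interval_s


def headroom_w(per_module_counts, interval_s, tdp_w):
    """Return TDP minus the summed per-module estimates (positive = headroom)."""
    estimated_w = sum(module_power_w(counts, interval_s)
                      for counts in per_module_counts)
    return tdp_w - estimated_w
```

A positive result from `headroom_w` corresponds to the case where a higher clock rate could be tried; a negative one means the estimate already exceeds TDP.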
Turbo Core isn’t limited to just a base and some arbitrarily higher frequency, either. It’s actually implemented in three p-states: the base (referred to as P2), an intermediate state (P1), and a higher state (P0). That’s an improvement over the first-gen version of Turbo Core, which AMD says only switched between two p-states. And it’s significant, too, because you can enter P1 with all eight cores active, so long as the headroom is there. Stepping up to P0 requires at least two of four modules to idle. AMD does allow the chip’s TDP to be exceeded instantaneously, but of course it can’t hold that for any thermally significant amount of time.
As such, when you look at the specs for an FX processor and see CPU Base, CPU Turbo Core, and CPU Max. Turbo, you are guaranteed to always get at least that base frequency. You’ll see the Turbo Core clock rate so long as TDP is in check (as it would be in a well-threaded workload that doesn’t exceed the processor’s thermal ceiling). And, whenever at least half of the chip’s modules are idle, it’s possible to realize maximum Turbo Core speeds.
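The three-state rule can be summarized in a few lines. This is a simplified sketch that treats headroom as a yes/no input and uses the FX-8150 clock rates quoted in the benchmark discussion on this page; the function name and its parameters are hypothetical, and the real controller is certainly more nuanced.

```python
# FX-8150 clock rates quoted in this article, in MHz.
P_STATE_MHZ = {"P2": 3600, "P1": 3900, "P0": 4200}


def highest_allowed_pstate(idle_modules, has_headroom, total_modules=4):
    """Pick the highest P-state the rules described above would permit."""
    if not has_headroom:
        return "P2"  # base clock is always guaranteed
    if idle_modules * 2 >= total_modules:
        return "P0"  # at least half of the modules are idle
    return "P1"      # headroom exists, even with every core busy
```

For instance, `highest_allowed_pstate(0, True)` returns `"P1"`, matching the claim that the intermediate state is reachable with all eight cores active.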
In the top chart, we see Turbo Core’s effect on iTunes, a single-threaded title. Because seven of its eight cores are essentially idle in this metric, the FX-8150 is allowed to dither at up to 4.2 GHz (it doesn’t hold that frequency constant; rather, it bounces between P1 and P0, or 3.9 and 4.2 GHz). The result is a 10-second shave compared to the same test with Turbo Core turned off, where the chip runs at a flat 3.6 GHz.
The chart below shows 7-Zip, a more thoroughly threaded application able to tax all of the FX-8150’s resources. Again, you don’t get a constant 3.9 GHz. With Turbo Core enabled, the FX-8150 dithers between 3.9 and 3.6 GHz (versus a straight 3.6 GHz with the feature disabled). The resulting two-second speed-up is pretty modest. Even so, we have to appreciate the “free” performance that wouldn’t have been possible in first-gen Turbo Core limited to two p-states.
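Dithering between two clock rates amounts to a time-weighted average. The sketch below shows the arithmetic; the residency fraction is a hypothetical input, since the share of time spent in each state was not measured here.

```python
def average_clock_ghz(low_ghz, high_ghz, high_residency):
    """Time-weighted average clock when dithering between two P-states.

    high_residency is the fraction of time (0.0 to 1.0) spent at high_ghz.
    """
    if not 0.0 <= high_residency <= 1.0:
        raise ValueError("high_residency must be between 0.0 and 1.0")
    return low_ghz + high_residency * (high_ghz - low_ghz)
```

If the chip spent half its time at 3.9 GHz and half at 3.6 GHz, for example, the effective rate would be 3.75 GHz, which helps explain why a well-threaded test sees only a small gain.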