Mullins And Beema APUs: AMD Gets Serious About Tablet SoCs

Meet The Mullins And Beema Tablet APUs

When we wrote AMD's Kabini: Jaguar And GCN Come Together In A 15 W APU almost one year ago, AMD told us that it hoped its Kabini and Temash APUs would bridge the gap between low-power ARM-based tablets and higher-performance notebooks. While the SoCs arguably achieved their goal, the most popular x86 tablets (Lenovo's ThinkPad, and Dell's Latitude and Venue Pro) employ Intel's Atom platform. The Temash APU never delivered a sub-4 W sweet spot the way Bay Trail did, and the Atom Z3770/D and Z3740/D became the weapons of choice for Windows-based tablets for good reason.

That's not to say AMD's Jaguar architecture isn't perfectly capable (both Microsoft and Sony would argue it is). And at higher thermal ceilings, I'd even suggest that the desktop-oriented version of Kabini is superior to the Bay Trail-D design. But to be truly competitive in the mobile space, AMD needs to do more with less power.

And that was the company's mantra as it created the Mullins and Beema APUs, both low-power SoCs destined to replace the Temash and Kabini solutions.

In presenting its two newest processors, AMD makes some bold claims. For instance, the company says that Mullins boasts two times the graphics performance per watt, and twice the system productivity per watt, compared to Temash. Beema is purported to offer a greater-than-10% graphics performance improvement over Kabini at a TDP that's 40% lower. And compared to its competition, AMD says that Beema serves up better graphics performance than both Bay Trail-T and Haswell-Y. The recurring theme is significantly lower power consumption and more speed. Exactly what kind of magic is involved in bringing these APUs to life?

Perhaps it makes the most sense to talk about what doesn't change. Beema and Mullins are manufactured on a 28 nm node, just like Kabini and Temash. As for the underlying architecture, it turns out that Puma+ offers the same IPC as Jaguar, the design that precedes it. Despite the nomenclature change, cores, caches, and schedulers remain the same. The graphics complex is also similar to the previous generation; the newest APUs similarly sport 128 GCN-based shaders.

Put simply, if you run the Beema/Mullins chips at the same frequencies as Kabini/Temash, you get identical performance.

Of course, that means speed increases must come from higher clock rates. How can that be, when we're talking about lower power and the same manufacturing process, all in the same breath? Fortunately, AMD has a lot to talk about on the subject. 

The bummer is that it won't tell us where its newest parts are being etched. What we can relay are boasts of impressive process improvements yielding up to 38% lower leakage from the graphics transistors and 19% less leakage from the CPU cores. Company representatives also cite significant power savings attributable to I/O enhancements like an optimized DDR3L-1333 interface, which is responsible for a 500 mW reduction in draw. There's also a 200 mW savings that comes from a more efficient display engine.

Additionally, system-aware power management reportedly enables up to 50% more frequency at nearly half the TDP of AMD's Temash and Kabini APUs. Indeed, the top-of-the-line Mullins A10 Micro-6700T has a maximum 2.2 GHz clock rate and a 4.5 W TDP. Compare that to the fastest Temash-based A6-1450, which capped out at 1.4 GHz with an 8 W TDP. Of course, real-world frequencies are going to depend on the workload you're running. But AMD's saying it's better able to balance between high clock rates in single-threaded apps and lower frequencies in more parallelized tasks. The 15 W Beema-based A6-6310 tops out at 2.4 GHz, compared to the 25 W Kabini A6-5200 at 2 GHz.
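To put those headline figures side by side, here is a quick back-of-the-envelope sketch in Python. It only uses the peak clocks and TDPs quoted above. The "peak GHz per TDP watt" ratio is our own crude yardstick, not an AMD metric, and TDP is a thermal design ceiling rather than measured power draw.

```python
# Peak clock and TDP figures quoted in the article (boost clocks, not base).
PARTS = {
    "Temash A6-1450": {"ghz": 1.4, "tdp_w": 8.0},
    "Mullins A10 Micro-6700T": {"ghz": 2.2, "tdp_w": 4.5},
    "Kabini A6-5200": {"ghz": 2.0, "tdp_w": 25.0},
    "Beema A6-6310": {"ghz": 2.4, "tdp_w": 15.0},
}


def compare(old, new):
    """Return (clock gain %, TDP reduction %, peak-GHz-per-TDP-watt ratio)."""
    o, n = PARTS[old], PARTS[new]
    clock_gain = (n["ghz"] / o["ghz"] - 1) * 100
    tdp_cut = (1 - n["tdp_w"] / o["tdp_w"]) * 100
    # Illustrative ratio only: peak clock divided by rated TDP, new vs. old.
    ghz_per_watt_ratio = (n["ghz"] / n["tdp_w"]) / (o["ghz"] / o["tdp_w"])
    return clock_gain, tdp_cut, ghz_per_watt_ratio


for old, new in [
    ("Temash A6-1450", "Mullins A10 Micro-6700T"),
    ("Kabini A6-5200", "Beema A6-6310"),
]:
    gain, cut, ratio = compare(old, new)
    print(f"{new} vs. {old}: +{gain:.0f}% clock, -{cut:.0f}% TDP, "
          f"{ratio:.1f}x peak GHz per TDP watt")
```

On these numbers, the top Mullins part peaks roughly 57% higher than the A6-1450 with a TDP about 44% lower, while the A6-6310 peaks 20% higher than the A6-5200 with a TDP 40% lower.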

As far as graphics go, the highest-clocked Beema and Mullins APUs run as fast as 800 and 500 MHz, respectively. Kabini and Temash topped out at 600 and 400 MHz.

If all of this sounds too good to be true, note that AMD is listing the highest possible clocks for Mullins and Beema, not their base frequencies. That'd be like Intel rating its processors at their peak Turbo Boost settings, or Nvidia marketing graphics cards at their typical GPU Boost figures. This isn't new behavior from AMD, though; the company already takes the same approach with some of its newer graphics and general-purpose processing products.

The company claims that intelligent power control avoids waste by boosting only the applications that benefit from it, tying in with thermal management. In the case of the platform AMD gave us to play with, I saw the A10 Micro-6700T bounce between 1 GHz and a maximum 2.2 GHz clock rate. I fired up a single thread of Prime95 and recorded a 2.2 GHz frequency. Repeating the experiment with two, three, and four threads, I came up with 1.6, 1.4, and 1.2 GHz ceilings. As the SoC heated up, though, even those settings slid.
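One rough way to read the clock ceilings above is to multiply each by its thread count. The sketch below does that, assuming one Prime95 thread per core and perfect scaling with frequency. Both are simplifications on our part, and the resulting "aggregate GHz" is a crude proxy for throughput, not a benchmark result.

```python
# Clock ceilings recorded on the A10 Micro-6700T under Prime95,
# keyed by the number of active threads.
OBSERVED_CEILING_GHZ = {1: 2.2, 2: 1.6, 3: 1.4, 4: 1.2}


def aggregate_ghz(threads):
    """Sum of per-core clocks (assumes one thread per core)."""
    return threads * OBSERVED_CEILING_GHZ[threads]


for n in sorted(OBSERVED_CEILING_GHZ):
    per_core = OBSERVED_CEILING_GHZ[n]
    print(f"{n} thread(s): {per_core:.1f} GHz each, "
          f"{aggregate_ghz(n):.1f} GHz aggregate")
```

Even though each core slows as more threads pile on, the aggregate figure keeps climbing (2.2, 3.2, 4.2, and 4.8 GHz), which is consistent with AMD's pitch of trading per-core frequency for parallel work inside a fixed power budget.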

AMD also spoke to us at length about Skin Temperature Aware Power Management, or STAPM. The thermal limit of a tablet is often constrained by the temperature of its chassis, rather than the SoC's ceiling, since a piece of silicon withstands higher heat levels than your lap. Most devices are bound by the highest clock rate a processor can sustain without pushing the skin temperature beyond the user's sensitivity limit. Using STAPM, an APU ramps up aggressively until its host device's enclosure reaches a defined maximum, allowing higher performance for brief periods. Since many mobile applications involve holding onto a tablet for short durations, it's easier to get a speedier experience this way.
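To illustrate the idea, here is a toy simulation of a skin-temperature-limited governor. This is not AMD's STAPM algorithm, which the company did not detail to that level. Every constant (the 40-degree skin limit, the 2 W fallback power, the thermal resistance, and the time constant) is a hypothetical value picked to make the behavior visible, and the chassis is modeled as a simple first-order lag.

```python
def simulate(seconds, skin_limit_c=40.0, ambient_c=25.0,
             boost_w=4.5, sustained_w=2.0,
             c_per_watt=6.0, tau_s=120.0, dt=1.0):
    """Toy skin-temperature governor (all parameters are hypothetical).

    Returns a list of (time_s, power_w, skin_temp_c) samples.
    """
    temp = ambient_c
    trace = []
    t = 0.0
    while t < seconds:
        # Boost while the chassis is below its limit, otherwise fall back
        # to a power level the enclosure can dissipate indefinitely.
        power = boost_w if temp < skin_limit_c else sustained_w
        # First-order lag: skin temperature drifts toward the steady-state
        # value implied by the current power level.
        target = ambient_c + power * c_per_watt
        temp += (target - temp) * dt / tau_s
        trace.append((t, power, temp))
        t += dt
    return trace


short_burst = simulate(60)
long_load = simulate(600)
print("60 s burst stays at boost power:",
      all(p == 4.5 for _, p, _ in short_burst))
print("600 s load peak skin temp: "
      f"{max(temp for _, _, temp in long_load):.1f} C")
```

In this model, a one-minute burst never leaves the boost state because the chassis has not had time to warm up, whereas a sustained load eventually hits the skin limit and toggles between boost and fallback power to hold the enclosure near it. That is the qualitative behavior the STAPM description implies: full speed for short interactions, with the chassis temperature rather than the die temperature setting the long-term ceiling.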

Finally, memory support evolves, allowing the top-end Beema APU to handle DDR3L-1866. Previously, the top-end mobile Kabini APU peaked at DDR3L-1600. 
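For a sense of scale, theoretical peak memory bandwidth is the transfer rate multiplied by the bus width. The sketch below assumes a single 64-bit channel, which is our assumption about these APUs' memory controller rather than something stated above, and it ignores real-world efficiency losses.

```python
def peak_bandwidth_gbs(mega_transfers_per_s, bus_bits=64, channels=1):
    """Theoretical peak DDR bandwidth in GB/s (decimal gigabytes).

    bus_bits and channels default to a single 64-bit channel, an assumption.
    """
    bytes_per_transfer = bus_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9


kabini = peak_bandwidth_gbs(1600)  # DDR3L-1600
beema = peak_bandwidth_gbs(1866)   # DDR3L-1866
uplift = (beema / kabini - 1) * 100
print(f"DDR3L-1600: {kabini:.1f} GB/s, DDR3L-1866: {beema:.1f} GB/s "
      f"(+{uplift:.1f}%)")
```

Under that assumption, the move from DDR3L-1600 to DDR3L-1866 raises the theoretical ceiling from 12.8 to about 14.9 GB/s, or roughly 17%, which matters most to the bandwidth-hungry graphics engine.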

But before we tackle the performance implications of AMD's adjustments, let's take a closer look at the new on-die Platform Security Processor and the specific models planned for introduction.