AMD's Kabini: Jaguar And GCN Come Together In A 15 W APU
1. Temash And Kabini: AMD's Mobile Future

A little over a year ago, we sat in an auditorium in AMD's Sunnyvale, CA office to hear Rory Read and his executive team explain how the company planned to stay competitive, despite what we saw as lagging positions in the client and server computing segments. At no point did he mention taking back the high-end x86 CPU crown from Intel. Rather, the rallying cry revolved around APUs: deliver a compelling user experience across device categories using what was lauded as disruptive APU technology, and propel these devices into ultra-low-power markets.

The three keys to executing this vision were listed as the reuse of SoC IP, an improved design methodology, and better time-to-market. Based on the roadmap AMD showed off at that event, it hasn't quite found its stride yet. Most notably, Sea Islands didn't become the new architecture with HSA-oriented features we were expecting, so it looks like we'll be testing with GCN-based boards for the rest of 2013.

But AMD is delivering on the Temash and Kabini designs it outlined at that analyst day, the former an ultra-low-power APU diminutive enough to drive tablets and the latter a low-power APU powering notebooks. Both feature AMD's Jaguar x86 core design and the already-familiar Graphics Core Next GPU architecture.

These aren't the only Jaguar-based SoCs being talked about lately. The PlayStation 4 and Xbox One center on eight-core Jaguar-based APUs, too. Reuse IP? Leverage the company's GPU architecture in new markets? Deliver a compelling experience across device categories? Check, check, and check. Although Rory's team looks a little different today than it did in 2012, the company appears to be satisfying some of the important goals it set forth. 

The fact that Microsoft and Sony are leaning on AMD's Jaguar design is pretty telling. But of course, we don't have access to the next-generation PlayStation or Xbox. We do, however, have a prototype notebook powered by Kabini in our possession. We can also talk about the details surrounding AMD's Temash SoC.

To give you an idea of the range we're talking about, the highest-power Kabini APU is a 25 W part, while the lowest-power Temash-based chip uses up to 3.9 W.

These processors are destined for tablets, convertibles, and ultra-thin notebooks. AMD intends to fill the gap between low-power ARM-based tablets and high-performance laptops with silicon that seems to slot in between the Silvermont-based Atom architecture Intel just announced and mid-range mobile CPUs based on the same company's Ivy Bridge design.

If you ever wanted a decent Windows-based tablet, and hoped to pay less than the $1,000 Microsoft charges for a Surface Pro, Temash could be promising. How about a desire for a low-cost ultra-thin notebook with great battery life and graphics performance that shames Intel's Atom? If AMD's claims are to be believed, Kabini is the answer there.

Let's have a look inside both APUs to see if the specs tell us a compelling story.

2. Jaguar: A Low-Power x86 Core

We've already introduced you to a number of AMD's APU designs, which combine general-purpose and graphics processing resources onto a single die. First it was Llano in the mobile space with The AMD A8-3500M APU Review: Llano Is Unleashed. Then it was Trinity on the desktop in AMD Trinity On The Desktop: A10, A8, And A6 Get Benchmarked! But both of those APU designs followed AMD's more performance-oriented roadmap with the Stars- and Piledriver-derived CPU architectures.

For an example of the company's low-power efforts, we have to go all the way back to January of 2011 for ASRock's E350M1: AMD's Brazos Platform Hits The Desktop First. The Brazos platform came armed with a Zacate APU. Within Zacate, AMD integrated two 1.6 GHz Bobcat-based x86 cores and its Cedar (Radeon HD 5450ish) GPU. 

The Jaguar architecture we're looking at today is an iterative improvement over Bobcat. In approaching Jaguar, AMD says it had three design goals. First, improve IPC. Bobcat was (in)famously slow, barely outperforming Intel's 2008-era Atom 330. Second, bring the ISA's functionality up to more modern standards, introducing instruction sets like SSE4.1/4.2 and AVX. Third, augment portability for the future, making Jaguar easier to take to new process technologies and fab partners.

As end users, that last point isn't our problem. The modern list of features is nice, but once you know what Jaguar supports, it's easy to anticipate the gains in specific, optimized workloads. AMD's efforts to improve IPC are much more interesting, though.

Let's start with the basics. Jaguar (as it shows up in the SoCs we're talking about today) is available in dual- and quad-core configurations. Bobcat-based SoCs were limited to dual-core arrangements. The quad-core variants based on Jaguar require active cooling, while the dual-core chips should run cool enough for passive cooling. 

The CPU core is manufactured using 28 nm technology, and AMD's chief technology officer, Joe Macri, points out that the x86 design team leveraged some of the software tools used to build GPUs, squeezing more resources into a smaller area than the more custom previous-gen cores allowed. As a result, each Jaguar core occupies 3.1 square millimeters of die space. That's notably smaller than the 4.9 square millimeters each Bobcat core monopolized.
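To put those two area figures in perspective, here is a minimal sketch of the arithmetic. It only uses the per-core numbers quoted above; the helper name `area_reduction` is ours, and the comparison covers core logic alone, ignoring cache, graphics, and the rest of the SoC.

```python
# Per-core die area figures quoted above, in square millimeters.
BOBCAT_CORE_MM2 = 4.9
JAGUAR_CORE_MM2 = 3.1


def area_reduction(old_mm2: float, new_mm2: float) -> float:
    """Return the fractional shrink when going from old_mm2 to new_mm2."""
    return (old_mm2 - new_mm2) / old_mm2


shrink = area_reduction(BOBCAT_CORE_MM2, JAGUAR_CORE_MM2)

# Core logic only: caches, GPU, and uncore are deliberately left out.
quad_jaguar_mm2 = 4 * JAGUAR_CORE_MM2
dual_bobcat_mm2 = 2 * BOBCAT_CORE_MM2

print(f"Per-core shrink: {shrink:.1%}")
print(f"Four Jaguar cores: {quad_jaguar_mm2:.1f} mm^2")
print(f"Two Bobcat cores:  {dual_bobcat_mm2:.1f} mm^2")
```

By this rough count, each Jaguar core is a little over a third smaller than a Bobcat core, and four of them need only modestly more silicon than two Bobcat cores did.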

Now, where does Jaguar improve over Bobcat? In the front-end, Jaguar's instruction cache offers similar throughput, though it delivers this bandwidth at a lower power cost thanks to a selective read process that only activates one-fourth of the banks. A 4x32B loop buffer is also added; when the execution pipelines can use information stored there, the instruction cache can stay powered-down, yielding the double benefit of lower power and lower latency.

In addition, the instruction buffer is about 30% larger than it was on Bobcat, circumventing some of the hit you might take after a cache miss.

Finally, the execution pipeline grows by one decode stage. As we saw so painfully when Intel introduced Pentium 4, longer pipelines are actually detrimental to IPC. However, breaking the pipeline up does help improve scalability. The assumption is that AMD is countering the IPC hit with higher clock rates.

The integer pipeline is augmented with a divider unit pulled over from Llano's Stars architecture and modified for Jaguar. Support for a number of familiar complex operation (cops) instructions is included, along with hardware CRC units, to improve the efficiency of x86 code execution. Schedulers and re-order buffers are anywhere from 30 to 70% larger, improving the parallelism of code executed out-of-order.

The L2 cache and its interface with the execution cores is completely redesigned. It is now shared, 2 MB-large (broken up into 512 KB banks), and 16-way associative, no longer 512 KB dedicated to each core. AMD says this is a nod to efficiency, as software can take advantage of a little or a lot, depending on a thread's needs. 

Bobcat's L2 cache ran at half of the CPU's clock rate. Jaguar's interface runs at full processor frequency. Pre-fetching is improved; AMD's algorithm pays better attention to data patterns, assisting the predictor in making better choices. Sixteen additional L2 snoop entries act as a probe filter to avoid look-ups whenever possible, again, saving power and improving latencies. According to AMD, its shared L2 is one of the greatest contributors to IPC improvements in Jaguar compared to Bobcat.

The load/store unit between the execution pipeline and L2 cache, and the data cache, are improved to help make AMD's L2 enhancements more tangible. Jaguar combines loads, utilizing a much bigger buffer to avoid store data shuffling and perform load bypasses at lower latencies.

The sum of AMD's changes to Jaguar adds up to a 22% single-threaded IPC increase over Bobcat, the company says. That's a per-clock improvement, so optimizations for clock rate should push that number upwards as this architecture hits higher frequencies. Naturally, we'll be putting those claims to the test in just a few pages...
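Because the 22% figure is per clock, actual single-threaded speed also depends on frequency. A first-order sketch, assuming performance scales as IPC times clock rate (a simplification that ignores memory and cache effects), looks like this. The function name `relative_performance` is ours; the 1.6 GHz Bobcat clock comes from the Zacate description above, and 1.5 GHz is the base clock of the A4-5000 tested later in this article.

```python
def relative_performance(ipc_gain: float,
                         new_clock_ghz: float,
                         old_clock_ghz: float) -> float:
    """First-order estimate: performance ~ IPC x clock rate.

    ipc_gain is a fraction (0.22 means 22% more work per clock).
    Returns the new core's speed relative to the old one (1.0 = equal).
    """
    return (1.0 + ipc_gain) * (new_clock_ghz / old_clock_ghz)


# AMD's claimed IPC gain, a 1.5 GHz Jaguar core vs. a 1.6 GHz Bobcat core.
estimate = relative_performance(0.22, new_clock_ghz=1.5, old_clock_ghz=1.6)
print(f"Estimated single-threaded speed vs. Bobcat: {estimate:.2f}x")
```

Under those assumptions, a slightly lower clock eats into the per-clock gain, which is why the IPC claim alone does not predict benchmark results.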

3. The First APUs With AMD's GCN Architecture, Plus Power Management

In addition to a redesigned x86 core architecture, Temash and Kabini are also AMD's first APUs sporting the Graphics Core Next architecture.

Insofar as it applies to API support, the GCN-based logic built in to Kabini and Temash is identical to AMD's discrete parts. DirectX 11.1, OpenGL 4.3, and OpenCL 1.2 are all supported. The fixed-function Video Codec Engine is present, accelerating video decode and H.264-based encoding. Of course, this feature requires third-party developer support, and adoption has been slow thus far.

A new component of the VCE is called scalable video encoding, or SVC. This is able to encode multiple streams in one output pass, creating content that can be pushed to backwards-compatible devices. In other words, you're able to scale temporally and spatially, enabling playback at less demanding bitrates on lower-end hardware.

Like the Zacate-based APUs with the Cedar graphics core, this APU's graphics engine is identical across the line-up, differentiated only by clock rate. Every Kabini and Temash processor comes equipped with two compute units, each with four texture and four vector units. As you can see in the visualization above, a vector unit contains 16 ALUs and a register file. All told, one APU plays host to 128 ALUs and eight texture units. A single render back-end facilitates four full-color raster operation pipelines. Put more simply, think of this as one-fourth of a Radeon HD 7750, with lower clock rates and less memory bandwidth.
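The unit counts in that paragraph can be checked with a few lines of arithmetic. This sketch takes the per-CU figures from the text. Two inputs are our own assumptions rather than figures from this article: the Radeon HD 7750's 512 ALUs, and one fused multiply-add (two floating-point operations) per ALU per clock for the peak-throughput estimate. The 500 MHz clock is the Radeon HD 8330 figure from our test system.

```python
# Figures from the text above.
COMPUTE_UNITS = 2
VECTOR_UNITS_PER_CU = 4
ALUS_PER_VECTOR_UNIT = 16
TEXTURE_UNITS_PER_CU = 4

# Assumptions not stated in this article.
HD_7750_ALUS = 512            # public spec of the discrete card
FLOPS_PER_ALU_PER_CLOCK = 2   # one fused multiply-add per clock

total_alus = COMPUTE_UNITS * VECTOR_UNITS_PER_CU * ALUS_PER_VECTOR_UNIT
total_texture_units = COMPUTE_UNITS * TEXTURE_UNITS_PER_CU
fraction_of_7750 = total_alus / HD_7750_ALUS


def peak_gflops(alus: int, clock_ghz: float) -> float:
    """Theoretical single-precision peak, in GFLOPS."""
    return alus * FLOPS_PER_ALU_PER_CLOCK * clock_ghz


print(f"ALUs: {total_alus}, texture units: {total_texture_units}")
print(f"Share of a Radeon HD 7750's ALUs: {fraction_of_7750:.0%}")
print(f"Peak at 500 MHz: {peak_gflops(total_alus, 0.5):.0f} GFLOPS")
```

The peak number is a theoretical ceiling, not something a game or compute workload will sustain.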

There are some notable differences between these APUs and AMD's discrete GPUs, though. For example, the GCN-based GPUs we've reviewed thus far all employed two asynchronous compute engines, which dispatch work to the compute units. In Tahiti, two ACEs served 32 CUs. Here four ACEs serve two CUs. Also new is a set of flat instruction accesses that allow an address to be issued in a load/store operation. This purportedly makes function calls simpler.

AMD says the new APUs support Ultra HD (2160p) output over HDMI and DisplayPort, Wi-Fi-certified Miracast, DisplayPort panel self-refresh to cut power consumption on compatible displays, dynamic refresh rates to save power when screen updates aren't necessary, and dual-display Eyefinity.

Power Management

Temash and Kabini use logic in each x86 core to calculate instantaneous power based on weight events and leakage. That result is fed into a power control unit called the Turbo Core Manager, along with GPU power and the on-die Fusion Controller Hub's consumption. A fourth input from the display interface yields a pretty complete picture of what each APU subsystem needs from the total available TDP. Using a credits-based system, Turbo Core can then change the chip's P-states, optimizing performance within a thermal ceiling.

AMD adds even more practicality to its power monitoring capabilities with the Turbo Dock concept. This hybrid form factor leverages active cooling inside a detachable keyboard to increase cooling performance and potentially double the platform's thermal ceiling.

As a result, you can use an APU-powered tablet on its own and still get a reasonably-fast experience, or dock with the keyboard to improve performance substantially.
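To picture how a credits-based governor and a higher docked ceiling might interact, here is a toy model. It is not AMD's algorithm: the class names, the credit rule, and every wattage in the example load are hypothetical, chosen only to illustrate the idea of summing the four power inputs, comparing them against a budget, and moving between P-states.

```python
from dataclasses import dataclass


@dataclass
class PowerSample:
    """One snapshot of the four inputs described above (watts, hypothetical)."""
    cpu_w: float      # sum of the per-core estimates
    gpu_w: float
    fch_w: float      # Fusion Controller Hub
    display_w: float

    def total(self) -> float:
        return self.cpu_w + self.gpu_w + self.fch_w + self.display_w


class ToyTurboManager:
    """Hypothetical credit-based governor; NOT AMD's actual Turbo Core logic."""

    def __init__(self, tdp_w: float, num_pstates: int, boost_cost: float = 1.0):
        self.tdp_w = tdp_w
        self.num_pstates = num_pstates
        self.pstate = num_pstates - 1   # P0 is fastest; start at the slowest
        self.credits = 0.0
        self.boost_cost = boost_cost

    def step(self, sample: PowerSample) -> int:
        headroom = self.tdp_w - sample.total()
        if headroom < 0:
            # Over budget: forfeit saved credits and slow down one state.
            self.credits = 0.0
            self.pstate = min(self.pstate + 1, self.num_pstates - 1)
        else:
            # Under budget: bank the headroom, spend it to speed up.
            self.credits += headroom
            if self.credits >= self.boost_cost and self.pstate > 0:
                self.credits -= self.boost_cost
                self.pstate -= 1
        return self.pstate


# Same made-up 10 W load on an 8 W tablet budget and a doubled, docked budget.
load = PowerSample(cpu_w=5.0, gpu_w=3.0, fch_w=1.0, display_w=1.0)
tablet = ToyTurboManager(tdp_w=8.0, num_pstates=3)
docked = ToyTurboManager(tdp_w=16.0, num_pstates=3)
for _ in range(3):
    tablet.step(load)
    docked.step(load)
print(f"Tablet P-state: P{tablet.pstate}, docked P-state: P{docked.pstate}")
```

In this sketch the undocked device stays pinned at its slowest state while the docked one climbs to its fastest, which is the behavior the Turbo Dock concept is meant to enable.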

4. AMD's E-Series and A-Series APUs, Along With Their Bundles

So, let's have a look at the individual SKUs that AMD is announcing. Note that there are some new Richland-based ULV APUs on the list, too.

That's a diverse range of APUs from 3.9 to 25 W. The quad-core Temash-based A6-1450 is particularly interesting at 8 W, and we'd like to see how that solution might fare in a tablet (Ed.: though that thermal ceiling is pretty high for a tablet).

AMD Elite Experience Program

As a minimalist, I'm no fan of most value-added software. Often, those apps are included free for a good reason. With that said, AMD is both creating and licensing a lot of software it plans to use as a means of creating baseline experiences on devices powered by its hardware. It's not uncommon to find mobile devices loaded up with software able to expose the product's differentiating capabilities. And the idea here appears to be similar.

AMD's bundle is tiered according to APU hierarchy. The E2 and A4 families reside at the bottom of the stack, and include Steady Video (an application for smoothing out sudden movements in shaky video clips; this is already available in the Catalyst driver package), Perfect Picture HD (image quality enhancements for video playback, also available in the Catalyst driver already), and Quick Stream technology (an Internet QoS app, again, already value-added by AMD).

Stepping up to an A6-class APU adds Screen Mirror (powered by ArcSoft) to the bundle, allowing you to broadcast your system's display output across your home network. An A8 APU piles Face Login on top of the other features, delivering facial recognition capabilities that take the place of typing in a password by using a webcam. Gesture Control is also included, yielding Microsoft Kinect-like control over certain applications.

Systems with A10 APUs get a "regionally-assorted game bundle." Given fairly modest graphics engines, we're still unsure of what this really means. Surely you can't expect the pricey bundles shipping with AMD's discrete cards.

5. AMD's Kabini-Based Prototype And Our Benchmarks

Prototypes generally don't emphasize industrial design. However, our Kabini-based ultra-thin notebook with an A4-5000 in it is surprisingly svelte.

Although we used a number of different notebooks for our benchmarks, we used the same hard drive and memory in all of them to keep our comparisons as valid as possible. We want to zero in on platform performance after all, and not the difference between a mechanical disk and SSD.

AMD suggests that Kabini's prime adversary will be low-priced Pentium laptops. To that end, we secured a Pentium B960-equipped notebook for testing as the most inexpensive platform we could find with Intel's CPU inside. It sells for $350 on Newegg.

At the other end of the spectrum, Core i3-3217U is another good match-up. This processor sports a similar power envelope and is available in notebooks starting around $400.

Laptop Comparison Test Settings
Platform: Kabini Prototype Laptop | Acer Aspire V3 | HP Pavilion Sleekbook 15
Processor: A4-5000: 1.5 GHz Base Clock Rate, 2 MB Shared L2 Cache, 15 W (Kabini) | Pentium B960: 2.2 GHz Base Clock Rate, 2 MB Shared L3 Cache, 35 W (Sandy Bridge) | Core i3-3217U: 1.8 GHz Base Clock Rate, 3 MB Shared L3 Cache, 17 W (Ivy Bridge)
Memory (all platforms): Hynix 8 GB (2 x 4 GB) DDR3-667 @ CAS 9-9-9-24-1T
Graphics: Radeon HD 8330, 128 ALUs, 500 MHz core | Intel HD Graphics, 6 EUs, 350 to 1,100 MHz core | Intel HD 4000 Graphics, 16 EUs, 350 to 1,350 MHz core
Hard Drive (all platforms): Toshiba MQ01ABD100H 1 TB, 5,400 RPM
Graphics Driver: AMD Catalyst 13.101_Beta3 | Intel HD Graphics Driver 15.28.15.64.3062 | Intel HD Graphics Driver 15.31.3.64.3071


And here are the benchmark details:

Benchmark Configuration

3D Games
Metro: Last Light: Version 1.0.0.0, DirectX 10, Built-in Benchmark
The Elder Scrolls V: Skyrim: Version 1.6.89.06, Version 1.5.26.05, 25-Sec. Fraps
Tomb Raider: Version 1.04, Built-in Benchmark
F1 2012: Version 1.2, DirectX 11, Built-in Benchmark

Audio/Video Encoding
HandBrake CLI: Version: 0.98, Video: Video from Canon EOS 7D (1920x1080, 25 frames) 1 Minute 22 Seconds, Audio: PCM-S16, 48,000 Hz, Two-Channel, to Video: AVC1 Audio: AAC (High Profile)
iTunes: Version 10.4.1.10 x64: Audio CD (Terminator II SE), 53 minutes, default AAC format
Lame MP3: Version 3.98.3: Audio CD "Terminator II SE", 53 min, convert WAV to MP3 audio format, Command: -b 160 --nores (160 Kb/s)
TotalCode Studio 2.5: Version: 2.5.0.10677, MPEG-2 to H.264, MainConcept H.264/AVC Codec, 28 sec HDTV 1920x1080 (MPEG-2), Audio: MPEG2 (44.1 kHz, Two-Channel, 16-Bit, 224 Kb/s) Codec: H.264 Pro, Mode: PAL 50i (25 FPS), Profile: H.264 BD HDMV

Productivity
ABBYY FineReader: Version 10.0.102.82: Read PDF save to Doc, Source: Political Economy (J. Broadhurst 1842) 111 Pages
Adobe Photoshop CS6: Version 13 x64: Filter 15.7 MB TIF Image: Radial Blur, Shape Blur, Median, Polar Coordinates
Autodesk 3ds Max 2012: Version 14.0 x64: Space Flyby Mentalray, 248 Frames, 1440x1080
7-Zip: Version 9.28, LZMA2, Syntax "a -t7z -r -m0=LZMA2 -mx=5", Benchmark: THG-Workload-2012
WinRAR: Version 4.2, RAR, Syntax "winrar a -r -m3", Benchmark: THG-Workload-2012
WinZip: Version 17.0 Pro, Best Method, ZIPX, Benchmark: THG-Workload-2012

Synthetic Benchmarks and Settings
3DMark 11: Version: 1.0.1, Entry, Performance, Extreme Suite
PCMark 7: Version: 1.0.4, System, Productivity, Hard Disk Drive benchmarks
SiSoftware Sandra 2012: Version: 2012 SP5c-1872, CPU Test = CPU Arithmetic / MultiMedia, Memory Test = Bandwidth Benchmark

6. Results: Synthetics

Although synthetic metrics aren't representative of real-world performance, they do help us drill down into specific subsystems. Let's start by looking at graphics.

The HD Graphics engine in Intel's Pentium B960 does not support DirectX 11, so we have 3DMark Vantage (the green bar) as a secondary measurement. 3DMark 11 does yield viable results on the other two platforms.

Obviously, the Pentium B960 gets outclassed in 3DMark Vantage. Intel's HD Graphics 4000 engine is quite a bit faster in Vantage, though it's only slightly faster in 3DMark 11. AMD's GCN architecture tends to fare best in more modern titles, so this really isn't a surprise to us.

PCMark 7 yields conflicting results. The Pentium gives us the best Overall and Productivity suite scores. AMD's A4-5000 leads in the storage test. And the Core i3-3217U performs best in the Creativity suite.

Cinebench doesn't do AMD any favors; regardless of whether you're looking at single- or multi-threaded performance, the Intel cores are quickest.

The A4-5000 fares well against the Pentium in Sandra's floating-point benchmark. However, it's beaten in raw measures of integer performance.

With support for AES acceleration, the A4-5000 achieves a great result in Sandra's encryption/decryption subtest, moving data as fast as its memory subsystem allows. This is one of those features that Intel strips off for the sake of differentiation. As such, Kabini is handed an easy win.

Intel's Sandy Bridge architecture only supports OpenCL on its CPU cores. Ivy Bridge added support for HD Graphics, though the test only ran in Compute Shader mode for us. Meanwhile, AMD's A4-5000 is able to tackle Sandra's OpenCL workload across its x86 and graphics resources.

LuxMark tells a different story, though. We expect Intel to serve up potent performance from its x86 cores. However, the HD Graphics engine serves up great results as well compared to Kabini's 128 ALUs. It's not exactly clear why AMD's architecture, which is known for its compute alacrity, suffers so badly in this test. The Pentium-based notebook does not work in LuxMark, though its general-purpose cores should support OpenCL.

7. Results: F1 2012 And The Elder Scrolls V: Skyrim

Codemasters' racing games are notorious for scaling down to lower-end hardware, so we're testing F1 2012 at 1280x720 using the most entry-level detail settings available.

The Core i3's HD Graphics 4000 component maintains frame rates in excess of 30 FPS, while the A4-5000 dips as low as 23 FPS. This isn't a smooth result, but it's still fairly playable. That's more than we can say for the Pentium.

A small step up in quality, enabling shadows, hits all processors equally. The Pentium is still not playable. A4-5000 is marginal. And the Ivy Bridge-based Core i3 is still smooth enough to enjoy.

We couldn't get any benchmark numbers from the Core i3 due to an incompatibility between the CPU, driver, and game at the 1024x600 resolution we wanted to use. No worries, though. We'll make up for this in the next chart.

Both the A4 and Pentium are playable, though.

Our resolution issue goes away when we set all systems to 1280x720. Both the A4 and Pentium are now too slow to claim viability, while the Core i3 serves up a respectable performance level.

8. Results: Tomb Raider And Metro: Last Light

The previous two games were fairly lightweight titles, and we frankly expected them to run on Kabini at 15 W. Tomb Raider is more demanding, so we're not sure what's going to happen.

At the lowest detail settings, using a resolution of 1024x768, Tomb Raider is barely playable on the A4-5000, completely unplayable on the Pentium B960, and quite smooth on Intel's Core i3-3217U.

Bumped up to 1280x720, AMD's APU is no longer playable, while the Core i3 is still fast enough to enjoy.

Even at the bottom-end detail settings and a modest 1024x768 resolution, Metro: Last Light is unplayable on these low-power parts. There's really no point to pushing a more taxing configuration.

9. Results: Media Encoding

The game results caught us a bit off-guard; based on what we saw from Llano and then Trinity, AMD had a clear bias towards graphics performance that lent itself to gaming. With Kabini, that seems to be lost, as Intel's HD Graphics 4000 component on the 17 W Core i3 is quite a bit faster.

On the flip side, we're hoping to see AMD's improvements to its x86 hardware materialize as more competitive performance in more general computing apps.

iTunes is single-threaded, so clock rate and IPC win this one. The 2.2 GHz Pentium B960 scores a first-place finish, followed by Intel's Core i3. AMD's A4-5000 trails by a substantial margin.

Also single-threaded, Lame yields the same outcome.

HandBrake is threaded, so the Core i3 with its Hyper-Threading technology manages to outperform the Pentium. AMD's quad-core A4-5000-based platform finishes in last place again, though the outcome isn't as severe.

Although the Intel-based chips trade places once again, one outcome that doesn't change is Kabini's inability to keep up at what we anticipate to be a competitive price point.

10. Results: Adobe CS6 Suite

Time and again, After Effects demonstrates a sensitivity to available memory, particularly as core count increases. In this case, the quad-core Kabini-based A4 turns in the last-place finishing time, a ways behind the Pentium and Core i3.

Hyper-Threading again propels the Core i3-3217U into a first-place finish in this very well-threaded workload. The dual-core Pentium achieves its second-place result through more aggressive clock rates. Even with four physical cores, though, the 1.5 GHz A4-5000 just can't keep up.

11. Results: Productivity

This next test is a single-threaded workload that sees us take a PowerPoint presentation and print out a PDF file of it.

Unable to fully utilize its four cores, the 1.5 GHz A4-5000 gets worked over pretty badly. Perhaps 3ds Max can better demonstrate the benefits of a quad-core APU.

We thought this might have been Kabini's chance to shine, but the higher-clocked dual-core Pentium scores a first-place finish.

The Core i3's Hyper-Threading capability is more of a boon in Blender. The A4 can't catch a break, though.

The same goes for ABBYY's FineReader OCR application.

12. Results: Compression

The outcome in WinRAR appears pretty familiar after the past several pages of benchmark results.

7-Zip puts the A4 and Pentium much closer together, though the Core i3 wraps up our workload almost three minutes faster.

Regardless of how we benchmark WinZip, using the same folder full of data, the finishing order remains the same: Core i3, Pentium, A4-5000.

13. Power Consumption

AMD's Kabini-based APUs cannot be evaluated based on performance alone. This 15 W processor is meant to go into mobile devices driven by batteries, and lower power use translates to more compact and lighter platforms.

Let's measure consumption in three disciplines: gaming, Web browsing, and HD video playback. The following tests are run by removing each notebook's battery, plugging the notebook into the wall, and logging power use. In order to factor out each system's LCD, we turn it off in favor of an external monitor.

Although the A4-5000-based notebook is soundly bested in our F1 2012 benchmark, it also uses a lot less power than the Core i3- and especially the Pentium-based laptop. 

The 14 W delta between A4 and Core i3 is particularly notable since the processors driving those platforms have TDPs only 2 W apart.

Web browsing doesn't apply as much of a graphics load, so we're seeing more impact from the x86 cores. The field narrows substantially, but A4-5000 remains the most power-friendly option.

All three of these platforms feature fixed-function logic able to accelerate H.264 playback in hardware. So, the workload isn't much more demanding than simply browsing the Web. 

Each solution appears equally adept at offloading the decode process, so we again see the A4-5000 in first place, with Intel's Core i3 just 2 W higher.

14. The Kabini-Based A4-5000: Mediocre Performance, But Great Efficiency

Before we try to reach any conclusions, check out the following chart summarizing performance in each of our benchmarks. Pay particular attention to those orange bars, which represent power efficiency. This is the relationship between each platform's power draw and average performance.
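For readers who want to reproduce an efficiency figure of this kind, here is a minimal sketch of one way to compute it, assuming efficiency is simply average relative performance divided by average power draw, then normalized to the best result. The function names are ours, and the two platforms and their numbers are made-up placeholders, not our measurements.

```python
def efficiency(relative_perf: float, avg_power_w: float) -> float:
    """Performance per watt; relative_perf is unitless (1.0 = reference)."""
    return relative_perf / avg_power_w


def normalize(scores: dict[str, float]) -> dict[str, float]:
    """Scale scores so the most efficient platform reads 1.0."""
    best = max(scores.values())
    return {name: value / best for name, value in scores.items()}


# Placeholder inputs for illustration only (not measured data).
raw = {
    "platform_a": efficiency(relative_perf=1.0, avg_power_w=20.0),
    "platform_b": efficiency(relative_perf=0.6, avg_power_w=10.0),
}
normalized = normalize(raw)
for name, value in normalized.items():
    print(f"{name}: {value:.2f}")
```

Note how the slower placeholder platform still comes out ahead once power is taken into account, which is the same kind of reversal the efficiency bars are meant to reveal.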

On one hand, the Kabini-based A4-5000 doesn't fare very well when it comes to charting frame rates in games and the time it takes to complete our wide range of desktop workloads. It gets beaten by the dual-core Pentium B960 in every discipline except gaming. But the efficiency bar tells a very different story. Even if you can get Pentium B960-based notebooks fairly inexpensively, you certainly cannot expect them to deliver great battery life with that 35 W processor in there.

So what about the Core i3-3217U, a 17 W processor? Surely that one is a more virile competitor, and not much more expensive than the Pentium. Core i3's on-die HD Graphics 4000 engine with its 16 EUs stomps all over the A4's 128 ALUs, despite the backing of AMD's capable Graphics Core Next architecture. Now, AMD claims that Kabini isn't meant to go up against Core i3. But we found notebooks with this exact CPU selling for as little as $360 on Newegg. It may turn out that the free market doesn't let AMD choose which Intel-based platforms its Kabini-based APUs contend with. 

Fortunately, the A4-5000 roughly matched the Core i3-3217U's efficiency. More than likely the A4 is going to give you slightly better battery life at the expense of performance. The bad news for AMD is that 17 W CPUs from Intel already pin Kabini into the budget notebook space, and that's before an onslaught of Haswell-based parts from the top and Silvermont-based options down below.

Truth be told, though, Kabini isn't the solution that has us most interested. Sure, it's great to see some compelling hardware from AMD up into the 25 W range. However, we really want to get our hands on Temash. AMD needs to work with OEMs to enable compelling form factors that change the way we work. We have several Atom Z2760-based tablets in the lab. It's great to have x86 compatibility on a handheld device with a full copy of Windows 8, but the build quality of those things frankly sucks. We're talking bending, flexing, intermittent dock connections, and cheap plastic. That's no way to tackle the tablet space. Oh, and we'd really like the flexibility to play something other than Angry Birds, too (though, based on the lackluster gaming numbers in the 15 W range, consider our expectations tempered down at 8 W).

At the end of the day, AMD's next-gen APUs likely have the best chance of success in well-built Windows 8 tablets at the right price. Given a choice between something running Android, an iPad with iOS, or a Surface with Windows RT, the hamstrung Microsoft option seems to get beat up on pretty savagely. But show me an unconstrained Windows 8 device for $100 more and I'll pull my wallet out for you. It's really a shame that AMD talked up what it has planned in this quickly-maturing space and then sent over a notebook that was outgunned in a segment Intel already has saturated with options. 

Lastly, it's a bit of a bummer that neither Temash nor Kabini incorporates AMD's heterogeneous unified memory access (hUMA). This is the feature that will allow the GPU and CPU to share system memory without copying it back and forth, eliminating a massive source of latency in today's APUs. This is where we expect the company's SoCs to stand apart from some of the other highly-integrated processors being designed. Unfortunately, we won't see hUMA in a shipping APU until Kaveri is released later this year.