AMD's Trinity APU Efficiency: Undervolted And Overclocked
1. Trinity: Great Gamer, But What About Power?

By now, you’ve read all about AMD’s unconventional approach to introducing its Trinity APU architecture, peeling back the embargo on gaming-oriented performance first, and then granting permission to talk about pricing, overclocking, and alacrity in x86-based apps via benchmarks a few days later.

I warned the company that splitting Trinity’s debut into two days made it look like AMD wasn’t particularly proud of its showing in productivity and content creation apps—and it had no reason to do this, in my mind. As far back as June, we had already shown that a Piledriver module was performing about 15% better than Bulldozer in single- and multi-threaded apps at the same clock rate. We knew that Trinity would be faster than Llano in some situations, but slower in others.

AMD went ahead with its plan. But because we got our hands on Trinity-based A10, A8, and A6 processors right after they were announced at Computex, the performance data that AMD wanted to keep under wraps until now was already available in "AMD Trinity On The Desktop: A10, A8, And A6 Get Benchmarked!" and "AMD Desktop Trinity Update: Now With Core i3 And A8-3870K". If you want to know how Piledriver does at the same clock as Bulldozer, how memory bandwidth scaling affects graphics performance, how effective Dual Graphics is, and how power changes from one generation to the next, all of that information is available between those two stories, and was up in our top carousel last week.

But back in June, I was missing performance data from Intel’s Ivy Bridge-based Core i3s. They weren’t available yet. So, as I was trying to come up with an interesting angle for today’s story, I knew we needed results from the Core i3-3220 and -3225, at least.


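| APU | Radeon HD | GPU (MHz) | Shaders | TDP | Cores | Base CPU | Turbo Core | L2 Cache | Price |
|---|---|---|---|---|---|---|---|---|---|
| A10-5800K | 7660D | 800 | 384 | 100 W | 4 | 3.8 GHz | 4.2 GHz | 4 MB | $122 |
| A10-5700 | 7660D | 760 | 384 | 65 W | 4 | 3.4 GHz | 4.0 GHz | 4 MB | $122 |
| A8-5600K | 7560D | 760 | 256 | 100 W | 4 | 3.6 GHz | 3.9 GHz | 4 MB | $101 |
| A8-5500 | 7560D | 760 | 256 | 65 W | 4 | 3.2 GHz | 3.7 GHz | 4 MB | $101 |
| A6-5400K | 7540D | 760 | 192 | 65 W | 2 | 3.6 GHz | 3.8 GHz | 1 MB | $67 |
| A4-5300 | 7480D | 724 | 128 | 65 W | 2 | 3.4 GHz | 3.6 GHz | 1 MB | $53 |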


Also, it was apparent that a lot of work went into improving Trinity’s idle power consumption compared to Llano. But the APU’s 100 W TDP is still significantly higher than Core i3’s 55 W ceiling. Almost certainly, Trinity can’t be made to outperform and under-consume Ivy Bridge. We're nevertheless testing our A10-5800K at lower voltages and higher clock rates to see if it can be made to shine more brightly against the backdrop of Intel’s entry-level chip, which is far less flexible.

As a result, we get to add overclocked and undervolted performance to our existing library of Trinity-based data, along with a comparison to Intel’s Core i3 competition. And with pricing information in our hands, it becomes pretty easy to set AMD and Intel up head-to-head and declare one company’s solution better than the other’s. Are you ready for the big judgement?

2. A10-5800K: The Undervolt And Overclock

It would have been easy enough to drop the clock rate on our A10-5800K and nudge its voltage down as well, triggering lower power consumption. But this exercise wasn’t about cramming a 100 W APU into a mini-ITX chassis. Rather, we wanted to maintain stock performance at the lowest voltage possible.

A BIOS-set 1.25 V seemed like it was going to be stable. After the system hung a couple of times in our multi-hour benchmark suite, though, we settled on 1.275 V instead. With all of the platform’s power-saving features already enabled, further cuts would have taken a more efficient power supply, different memory, perhaps, or maybe a minimalist motherboard. Really, though, we were most interested in power cuts directly attributable to the processor.

Overclocking was a little more exciting. Based on our conversation with Sami Mäkinen in "Professional Help: Getting The Best Overclock From AMD's A8-3870K", we began our quest with a graphics tweak, pushing the integrated Radeon HD 7660D core from 800 MHz all the way up to 1083 MHz in AMD’s OverDrive utility (this required a 1.275 V northbridge setting). From there, we edged the processor clock up to a stable 4.4 GHz at 1.5 V.

The A10-5800K in our preview story managed a stable 4.5 GHz with all four cores under full load. We tried the same thing this time around, but discovered that, at just under 70 degrees Celsius, cores would drop back down to 1.4 GHz at 0.91 V to shed heat if we were using AMD’s reference FX cooler. Even switching over to AMD’s Asetek-designed closed-loop liquid cooler wasn’t enough to get the chip stable at 4.5 GHz this time.

So, we settled for 4.4 GHz across all cores—as high as we could go without triggering performance-debilitating issues throughout our suite. But we noticed another strange behavior that might affect the peak overclock of an aggressive enthusiast. As soon as we crested 4.5 GHz and started trying to push 4.6 and 4.7 GHz, slowly increasing voltage along the way, MSI’s FM2-A85XA-G65 motherboard forcibly pushed down our multiplier (as low as 29x in some cases), despite UEFI and OverDrive ratios that read otherwise.
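To put the forced-down multiplier in perspective, here is a minimal sketch of the multiplier arithmetic. It assumes the 100 MHz reference clock shown in our test setup listing (38 * 100 MHz for the stock 3.8 GHz); the function name is ours, chosen purely for illustration.

```python
# Assumption: a 100 MHz reference clock, as in the test setup listing
# (38 * 100 MHz = 3.8 GHz for the stock A10-5800K).
REFERENCE_CLOCK_MHZ = 100


def core_clock_ghz(multiplier, reference_mhz=REFERENCE_CLOCK_MHZ):
    """Return the effective core clock in GHz for a given multiplier."""
    return multiplier * reference_mhz / 1000


for label, multiplier in [
    ("stock base clock", 38),
    ("our stable overclock", 44),
    ("forced-down ratio", 29),
]:
    print(f"{label}: {multiplier}x -> {core_clock_ghz(multiplier):.1f} GHz")
```

In other words, a 29x ratio works out to roughly 2.9 GHz, well below the chip's stock base clock, which is why the behavior hurts benchmark results so badly.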

We’re not sure if this is a deliberate mechanism to protect the motherboard’s power logic, but it’d make sense when you get to our power analysis and see how quickly consumption ramps up as you increase clock rate and voltage.

At the end of the day, it looks like there might be a couple of different protection mechanisms in play: AMD’s thermal monitor keeping the APU from exceeding a temperature ceiling, and what appears to be MSI’s motherboard keeping the platform from exceeding a certain power level, even with temperatures well under the aforementioned limit.

3. Test Setup And Software
Test Hardware
Processors
AMD A10-5800K (Trinity) 3.8 GHz (38 * 100 MHz), Four Cores, Socket FM2, 4 MB Total L2 Cache, Turbo Core enabled, Power-savings enabled

AMD A8-5600K (Trinity) 3.6 GHz (36 * 100 MHz), Four Cores, Socket FM2, 4 MB Total L2 Cache, Turbo Core enabled, Power-savings enabled

Intel Core i3-3225 (Ivy Bridge) 3.3 GHz (33 * 100 MHz), Two Cores, LGA 1155, 3 MB Shared L3 Cache, Hyper-Threading enabled, Power-savings enabled

Intel Core i3-3220 (Ivy Bridge) 3.3 GHz (33 * 100 MHz), Two Cores, LGA 1155, 3 MB Shared L3 Cache, Hyper-Threading enabled, Power-savings enabled
Thermal Paste
Zalman ZM-STG1
Motherboard
MSI FM2-A85XA-G65 (Socket FM2) AMD A85X FCH, Beta BIOS

Gigabyte Z77X-UD3H (LGA 1155) Intel Z77 Express, BIOS F17
Memory
G.Skill 16 GB (4 x 4 GB) DDR3-1600, F3-12800CL9Q2-32GBZL @ 9-9-9-24 and 1.5 V
Hard Drive
Crucial m4 256 GB, SATA 6 Gb/s
Graphics
AMD Radeon HD 7660D

AMD Radeon HD 7560D

Intel HD Graphics 4000

Intel HD Graphics 2500
Power Supply
Cooler Master UCP-1000 W
System Software And Drivers
Operating System
Windows 7 Ultimate 64-bit
DirectX
DirectX 11
Graphics Driver
Catalyst 12.8

HD Graphics Driver For Windows 7 (15.26.8.64.2696)


We were forced to cut a couple of different benchmarks from our suite because of stability issues on AMD’s Trinity platform. The first was PCMark 7, which hung up during the first video playback and transcoding test with a white screen where the output should have been. MSI and AMD both claim that nobody else has reported this, so it's possible we have a one-off issue. The second was Blender, which crashed shortly after starting up. We re-imaged our drives, uninstalled and reinstalled the software, and reinstalled drivers and codec packs to no avail.

Audio Benchmarks and Settings
iTunes
Version: 10.4.10, 64-bit
Audio CD ("Terminator II" SE), 53 min., Convert to AAC audio format
Lame MP3
Version 3.98.3
Audio CD "Terminator II SE", 53 min, convert WAV to MP3 audio format, Command: -b 160 --nores (160 Kb/s)
Video Benchmarks and Settings
HandBrake CLI
Version: 0.9.8
Video: Big Buck Bunny (720x480, 23.972 frames) 5 Minutes, Audio: Dolby Digital, 48 000 Hz, Six-Channel, English, to Video: AVC Audio: AC3 Audio2: AAC (High Profile)
MainConcept Reference v2.2
Version: 2.2.0.5440
MPEG-2 to H.264, MainConcept H.264/AVC Codec, 28 sec HDTV 1920x1080 (MPEG-2), Audio: MPEG-2 (44.1 kHz, 2 Channel, 16-Bit, 224 Kb/s), Codec: H.264 Pro, Mode: PAL 50i (25 FPS), Profile: H.264 BD HDMV
Application Benchmarks and Settings
WinRAR
Version: 4.20
RAR, Syntax "winrar a -r -m3", Benchmark: 2010-THG-Workload
WinZip 16.5
Version: 16.5
WinZip GUI, Benchmark: 2010-THG-Workload
7-Zip
Version 9.22 beta
LZMA2, Syntax "a -t7z -r -m0=LZMA2 -mx=5", Benchmark: 2010-THG-Workload
Adobe Premiere Pro CS6
Hollywood Sequence to H.264 Blu-ray
Output 1920x1080, Maximum Quality, Mercury Playback Engine: Software Mode
Adobe After Effects CS6
Version: CS6
Tom's Hardware Workload, SD project with three picture-in-picture frames, source video at 720p, Render Multiple Frames Simultaneously
Adobe Photoshop CS6 (64-Bit)
Version: 11
Filtering a 16 MB TIF (15 000x7266), Filters: Radial Blur (Amount: 10, Method: zoom, Quality: good), Shape Blur (Radius: 46 px; custom shape: Trademark symbol), Median (Radius: 1px), Polar Coordinates (Rectangular to Polar)
ABBYY FineReader
Version: 10 Professional Build (10.0.102.82)
Read PDF save to Doc, Source: Political Economy (J. Broadhurst 1842) 111 Pages
3ds Max 2012
Version: 10 x64
Rendering Space Flyby Mentalray (SPECapc_3dsmax9), Frame: 248, Resolution: 1440 x 1080
Adobe Acrobat X Professional
PDF Document Creation (Print) from Microsoft PowerPoint 2010
Visual Studio 2010
Compile Chrome project (1/31/2012) with devenv.com /build Release
Synthetic Benchmarks and Settings
3DMark 11
Version 1.0.3
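
Several of the tests above are plain command-line jobs, so timing them can be scripted. The following is a minimal sketch of how such a run might be timed, not the harness we actually use. The `lame` and `7z` executable names and every file name are assumptions; only the switches come from the listings above.

```python
import subprocess
import sys
import time


def time_command(args):
    """Run a command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(args, check=True)
    return time.perf_counter() - start


# Switches are taken from the benchmark listing; the executable names and
# the input/output file names are hypothetical placeholders.
LAME_CMD = ["lame", "-b", "160", "--nores", "input.wav", "output.mp3"]
SEVENZIP_CMD = ["7z", "a", "-t7z", "-r", "-m0=LZMA2", "-mx=5",
                "archive.7z", "workload"]

if __name__ == "__main__":
    # Demonstrate the timer with a trivial command that exists everywhere;
    # swap in LAME_CMD or SEVENZIP_CMD on a machine with those tools installed.
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"Elapsed: {elapsed:.3f} s")
```

Wall-clock time is the right measure here because our results are reported as time to completion, not CPU time.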
4. Benchmark Results: 3DMark 11

When we talked to Sami about overclocking Llano, he made it clear that his first order of business was tuning the APU’s graphics engine. Our 3DMark results make it clear why. A roughly 15% speed-up would be great news in the games that we tested last week that might have been barely-playable at 1920x1080.

Even at stock settings, the A10 and A8 destroy Intel’s Ivy Bridge-based Core i3s. This is the data we were missing in our June preview, though these numbers and the gaming data we already ran confirm what we hypothesized back then. 

It’s only when we start looking at the Physics subtest, which taxes x86 hardware, that we see Intel’s two physical cores with Hyper-Threading besting AMD’s dual Piledriver modules with four integer cores. 

5. Benchmark Results: Adobe CS6

Our Photoshop test employs a handful of filters optimized for multi-core processors. It’s purely x86-based, though, so the latest OpenCL-based enhancements aren’t reflected.

Nevertheless, AMD’s quad-core APUs outperform Intel’s Core i3s in the first example of what we’re going to see as a trend throughout testing: that is, threaded benchmarks favor AMD’s design, while an ever-shrinking collection of single-threaded metrics gives Ivy Bridge’s tremendous IPC advantage a stage on which to shine.

We have a completely different test able to benefit from OpenCL acceleration, too.

Because the HD Graphics 4000 and 2500 engines are OpenCL-capable, Intel actually improves its position against AMD, outperforming the A10-5800K. It takes an aggressive overclock for the flagship APU to secure a first-place spot.

This brings up an interesting question. Although AMD currently seems like the biggest proponent of addressing parallelized workloads with its graphics hardware, Intel has the same capability. Will we simply see the competition speed up alongside AMD, or does AMD have a genuine advantage? We'll see one way the company is addressing Intel's ability to follow its path to glory in our WinZip testing.

The finishing order in Premiere Pro is fairly close, with 30 seconds or so separating the first- and last-place finishers.

Both Intel chips edge past the stock A10-5800K, though, and it again takes an overclock to 4.4 GHz in order for the APU to take the lead.

After Effects CS6 is an even closer race, with the Core i3s and A10 two seconds apart. Overclocked, the APU does enjoy a five-second lead, trivial though that may be.

6. Benchmark Results: Content Creation

Able to use as many x86 cores as you throw at it, 3ds Max gives the edge to AMD’s A10 and A8, though Intel’s Core i3s are only seconds behind.

Overclocking has a profound impact on performance in this test because, normally, the A10’s Turbo Core technology isn’t able to scale all the way up to 4.2 GHz when both modules are active. By setting a static 4.4 GHz clock rate, the APU’s x86 resources operate at a higher frequency even in the face of a taxing workload.
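
As a rough guide to how much extra frequency the static overclock buys, the sketch below compares 4.4 GHz against the A10-5800K's 3.8 GHz base and 4.2 GHz peak Turbo Core clocks from the specification table. The real-world gain under an all-core load should fall somewhere between the two figures, since we don't know exactly where Turbo Core settles in each test.

```python
def percent_gain(old, new):
    """Return the percentage increase from old to new."""
    return (new - old) / old * 100


BASE_GHZ = 3.8       # A10-5800K base clock
MAX_TURBO_GHZ = 4.2  # A10-5800K peak Turbo Core clock
OVERCLOCK_GHZ = 4.4  # our static overclock

gain_over_base = percent_gain(BASE_GHZ, OVERCLOCK_GHZ)
gain_over_turbo = percent_gain(MAX_TURBO_GHZ, OVERCLOCK_GHZ)

print(f"Versus base clock:  +{gain_over_base:.1f}%")
print(f"Versus peak Turbo:  +{gain_over_turbo:.1f}%")
```

That puts the frequency advantage at just under 16% in the best case and just under 5% in the worst.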

Based on Maxon’s Cinema 4D software, Cinebench is unique in that it allows us to isolate single- and multi-threaded performance.

Using it, we’re able to clearly see that Intel’s Ivy Bridge-based cores achieve much better performance than AMD’s Piledriver modules, even at significantly lower clock rates.

Truly, it takes parallelization to even out the field. Intel’s Hyper-Threading technology is designed to better-exploit underutilized processing resources, but it cannot overcome AMD’s approach, which exposes two complete integer cores per module.

In contrast, turning a PowerPoint document into an Adobe Acrobat file is not a task that gets divvied up across multiple cores. Intel’s powerful architecture consequently secures a victory that not even a Trinity-based APU at 4.4 GHz can overcome.

7. Benchmark Results: Productivity

This well-threaded optical character recognition task hands AMD’s A10 and A8 a victory that gets extended through overclocking. Intel’s Core i3s are effectively not overclockable, so what you see is what you get.

The Ivy Bridge-based parts don’t need a clock rate advantage in single-threaded benchmarks, though. A file conversion from WAV to MP3 formats in Lame uses just one core in each chip, again illustrating how much work each Core i3 core can get done per clock cycle. If only these were quad-core parts…

There’s a similar situation in iTunes, where even a boost to 4.4 GHz leaves the A10-5800K behind Intel’s 3.3 GHz Core i3-3225 and -3220.

The AMD APUs come out ahead in Fritz, which could be due to a combination of higher clock rates, more cache, and improvements to how branch mispredicts are handled.

In "AMD Desktop Trinity Update: Now With Core i3 And A8-3870K", we saw interesting results from Visual Studio 2010. First of all, the A8-3870K scored a first-place finish, letting us know that the old Stars-based architecture is able to outmaneuver Trinity’s Piledriver x86 cores in specific tasks.

Also, the Sandy Bridge-based Core i3-2100 managed to edge out A10-5800K. It comes as no surprise, then, that a Core i3 centering on Ivy Bridge and running 200 MHz faster extends Intel’s lead in this very real-world benchmark.

Even overclocked to 4.4 GHz, the A10-5800K can’t quite keep up.

8. Benchmark Results: Compression Utilities

WinZip used to be decidedly single-threaded. Slowly, Corel has done a better job of optimizing the compression utility’s engine, and it’s now able to utilize multiple cores to some degree. Nevertheless, Intel’s chips take the win ahead of AMD’s Trinity-based APUs. This is probably because even the "optimized" WinZip 16.5 still doesn't take full advantage of multi-core chips, as I showed in our Radeon HD 7970 GHz Edition review.

That’s not the whole story, though. AMD and Corel worked together to enable OpenCL support in WinZip 16.5, which you can enable only on platforms with AMD’s graphics hardware installed (even though Nvidia and Intel support OpenCL on their respective products).

So, we have benchmark results under the same workload with OpenCL turned on. It’s clear that any disadvantage AMD might have suffered in the previous chart is more than made up for with OpenCL enabled.

Of course, if the Photoshop CS6 benchmark is any indication, Intel’s Core i3s will quickly regain ground as soon as Corel exposes support for competing platforms, which it plans to do.

WinRAR is unable to tax AMD’s hardware fully, resulting in a win for Intel. The difference in completion time isn’t particularly bothersome on its own. Given the amount of power AMD’s chips consume while active, though, the outcome looks more severe once it is converted to efficiency.

We use 7-Zip on our own workstations, not only because it’s freely available, but also because the utility effectively utilizes our hardware. A fairly even finish shows that AMD’s additional processing resources and notably higher clock rates more than make up for a loss in instruction throughput per clock cycle, if just barely.

9. Benchmark Results: Media Encoding

The field is tight in our MainConcept benchmark, with AMD’s dual-module APUs edging out Intel’s dual-core processors. An overclock helps the A10-5800K, which sports a 3.8 GHz base frequency, shave off an additional seven seconds at 4.4 GHz.

HandBrake similarly favors the AMD APUs. Undervolting A10-5800K has no negative performance effect at all, and we enjoy the benefit of reduced power consumption from this taxing workload. Meanwhile, overclocking has a notable impact on performance (so long as you’re willing to tolerate a significant jump in power use).

10. Power Consumption

We didn’t have an opportunity to compare the power profiles of Intel’s 65 W Core i3-2100 and AMD’s Trinity-based APUs in our preview stories. However, now we have Core i3-3000-series chips to include instead.

The following chart is a power log of our entire test suite, which is scripted. That long stretch of fairly constant consumption in the middle is our Visual Studio 2010 benchmark, which lasts for almost an hour. Normally we’d have Blender and PCMark 7 as part of this capture, but stability issues on the Trinity-based system forced us to remove them from our batch file.

Here’s where AMD gets hammered. This power chart from June’s Trinity preview showed that, although the new architecture is effectively able to idle at lower power than Llano, it still butts right up to its 100 W thermal ceiling under load. That's a problem because AMD's competition is rated for 55 W.

Now, when we put A10-5800K overclocked, at its stock settings, and undervolted in the same chart as a Core i3-3225, we come away with a couple of different observations.

First, there’s an almost 50 W difference in average system power between the overclocked and undervolted A10-based configurations. That’s a greater-than 43% jump in overall consumption. Did performance increase commensurately? We doubt it, but we’ll get an exact answer on the next page.

Second, Intel’s advantage in manufacturing technology translates directly to its rated thermal design power, and we can see the result of that in an average system power consumption of 80 W. This says nothing of efficiency, of course, which involves a performance component. However, given what we now know about power (and what we saw on the preceding benchmark pages), it’d be almost impossible for AMD to catch up in the applications we’re testing today.

11. Efficiency

Averaging together system power use from the previous page, the overclocked A10-5800K uses more than 155 W, which is 33 W higher than an A10-5800K at its stock settings.

Undervolting the APU to 1.275 V helps cut consumption by 14.3 W on average, though there is a cumulative performance hit of about two minutes (hardly anything when you’re talking about an almost two-hour run).

But none of the APUs finish the suite as quickly, or average power consumption as low, as Intel’s Core i3-3225, which averages 80 W.
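
The averages quoted on this page and the previous one can be cross-checked with a few lines of arithmetic. This sketch treats 155 W as the overclocked figure even though the measured average is slightly above it, so the results are approximate.

```python
# Average system power figures quoted in the text (watts).
OVERCLOCKED_W = 155.0                # "more than 155 W", treated as a floor
STOCK_W = OVERCLOCKED_W - 33.0       # overclocked is 33 W above stock
UNDERVOLTED_W = STOCK_W - 14.3       # undervolting saves 14.3 W on average
CORE_I3_3225_W = 80.0

spread_w = OVERCLOCKED_W - UNDERVOLTED_W
spread_pct = spread_w / UNDERVOLTED_W * 100
stock_vs_intel_w = STOCK_W - CORE_I3_3225_W

print(f"Stock A10-5800K:        {STOCK_W:.1f} W")
print(f"Undervolted A10-5800K:  {UNDERVOLTED_W:.1f} W")
print(f"Overclocked vs. undervolted: {spread_w:.1f} W ({spread_pct:.1f}%)")
print(f"Stock A10 vs. Core i3-3225:  {stock_vs_intel_w:.1f} W")
```

The spread between the overclocked and undervolted configurations comes out to about 47 W, or roughly 44%, which lines up with the "almost 50 W" and "greater-than 43%" figures from the power page.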

When you break down the time it takes to complete the many benchmarks in our suite, the difference between the fastest and slowest chip is less than six minutes.

This chart is an unlikely representation of something AMD keeps trying to pound into our heads: the nebulous idea of experience. Will you notice six minutes over the course of 20 back-to-back demanding tasks? Almost certainly, no. That’s the idea of “good enough” x86 performance. Will you notice the difference in gaming performance illustrated last week, though? When it means the difference between playable frame rates at 1920x1080 or choppiness, then yes.

That doesn’t make the next chart any easier to swallow, though.

In watt-hours, an overclocked A10-5800K uses almost twice as much energy as a Core i3-3225 to complete the same workloads. Enthusiasts in AMD’s camp are going to look at those numbers and claim they don’t care about a marginally-higher power bill (the light bulbs on either side of your garage, together, probably use as much power), so long as they get usable 3D performance, while the cool-and-quiet crowd will remind us that a 100 W APU requires more cooling. That could mean a faster-spinning fan or a larger heat sink. Either way, that piece of logic that shifts balance from x86 performance to graphics alacrity is going to cost you.
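
For anyone who wants to reproduce the watt-hour math, energy is simply average power multiplied by run time. The sketch below uses the average power figures from this page; the 120-minute run time is a placeholder based on our "almost two-hour" description of the suite, not a measured value.

```python
def energy_wh(avg_watts, minutes):
    """Return energy in watt-hours for an average power draw over a run time."""
    return avg_watts * minutes / 60


# Placeholder run time: the suite takes "almost two hours", so 120 minutes
# is an illustrative assumption, not a measurement.
RUN_MINUTES = 120

overclocked_wh = energy_wh(155, RUN_MINUTES)
core_i3_wh = energy_wh(80, RUN_MINUTES)
ratio = overclocked_wh / core_i3_wh

print(f"Overclocked A10-5800K: {overclocked_wh:.1f} Wh")
print(f"Core i3-3225:          {core_i3_wh:.1f} Wh")
print(f"Ratio:                 {ratio:.2f}x")
```

With identical run times the ratio is simply 155/80, or about 1.94x. Because the Core i3-3225 also finishes the suite sooner, the actual energy gap is a little wider than that.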

12. The Pursuit Of Balance Warms Our Hearts

Because AMD split its Trinity architecture introduction up into two days of coverage, I’m forced to draw a conclusion today that runs counter to the efficiency data we just presented.

At its stock settings, the company’s flagship A10-5800K is generally faster than Intel’s Ivy Bridge-based Core i3-3220/3225 in heavily-threaded applications and slower in x86-oriented tasks that only run on one core. Some of our benchmarks fall somewhere in between, and the results reflect as much.

Although HD Graphics 4000 is clearly an improvement over HD Graphics 3000, Intel cannot touch AMD in gaming. The difference is significant enough to split the two solutions across resolutions and detail settings. Where AMD is viable, Intel is not. And if you step down to the HD Graphics 2500-equipped Core i3-3220, well, we hope you like spreadsheets, word processing, and Web browsing.

The price you pay for AMD’s heavy emphasis on graphics performance is power consumption, heat, and, at least on our test bench, noise. Intel’s Core i3-3225 can be used to drive a very fast desktop machine that, completely built-up, uses far less power than the TDP of just AMD’s APU. And it does so without as much as a whisper.

Of course, now we have pricing details to consider as well. AMD plans to sell its A10-5800K for $122 and its A8-5600K for $101. The Core i3-3220 sells for $130 and the -3225 is $145. Frankly, neither Intel option is very attractive to us. We’d rather go for a Pentium G2120 for $100 with entry-level discrete graphics. On the AMD side, the A10-5800K touches the performance we’d want from an on-die GPU to feel comfortable recommending to a friend with modest gaming ambitions. The A8-5600K gives up too much ground in that regard, and the A8-3870K couldn't quite get there last generation, either.

In the end, then, both Intel and AMD are offering you an experience. Which one do you pick?

Intel gives you great performance in productivity and content creation apps, with a fantastic thermal envelope. But any aspiration for gaming necessitates discrete graphics, putting you in the $200 range.

AMD counts on a “good enough” showing in x86-based applications and ample 3D muscle to play a number of modern games at mainstream resolutions. In exchange, you’re asked to accept comparatively high power use. But it’s a price point below what Intel charges for its neutered Core i3-3220 that swings favor toward the A10-5800K for enthusiasts on a strict budget.

We’re power users, after all. We know how to cope with heat and noise; we can deal with a 100 W chip, even in an HTPC. But there’s no way to make the Core i3 look better unless you spring for an add-in card. AMD’s emphasis on balance makes the A10-5800K a better platform for more people than Intel’s closest competition.