The CPU Side: An All-New Piledriver Core
An APU is an amalgamation of x86 cores and graphics resources. So, let’s start by exploring the component of the die traditionally referred to as the CPU.
When Llano was introduced a year ago, we already knew that its Stars architecture was on its last legs. AMD’s plans for the future clearly centered on Bulldozer, a design that wouldn’t make it into a desktop-oriented product until last October.
Well, the situation is reversed for Trinity’s introduction. This time, AMD’s most modern processor architecture is being shown off in an APU, and a mobile APU at that. Dubbed Piledriver, it is the update to Bulldozer that won’t find its way onto the desktop until later in 2012.
What are the main differences between the Husky cores in AMD’s Llano architecture and the Piledriver-based cores in Trinity? Whereas a quad-core APU built on the Llano design employs four distinct execution cores, quad-core Trinity chips feature two Bulldozer modules. Each module boasts two integer cores. However, they share some of the resources that you’d typically find duplicated on more traditional multi-core implementations, such as the fetch and decode stages, floating point units, and L2 cache. Again, you can read more about the Bulldozer architecture in AMD Bulldozer Review: FX-8150 Gets Tested.
The most obvious difference between AMD’s desktop FX processors and the CPU component of Trinity is cache. While each of the APU’s modules still shares 2 MB of L2, Trinity lacks the 8 MB shared L3. That leaves a dual-module Trinity chip with 4 MB of L2 and no L3, matching Llano’s on-die memory.
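The cache totals described above can be tallied in a few lines. This is only a bookkeeping sketch: the `Chip` record and its field names are made up for illustration, the FX-8150 entry assumes four modules with 2 MB of L2 each, and the Llano entry assumes 1 MB of L2 per core (which yields the 4 MB total mentioned above).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    """Hypothetical record describing on-die cache, in MB."""
    name: str
    l2_blocks: int        # modules (Bulldozer/Piledriver) or cores (Llano)
    l2_mb_per_block: int  # L2 attached to each module or core
    l3_mb: int            # shared L3, 0 if absent

    @property
    def l2_total_mb(self) -> int:
        return self.l2_blocks * self.l2_mb_per_block

    @property
    def total_mb(self) -> int:
        return self.l2_total_mb + self.l3_mb


# Assumed configurations; see the caveats in the paragraph above.
FX_8150 = Chip("FX-8150 (4 modules)", 4, 2, 8)
TRINITY = Chip("Trinity (2 modules)", 2, 2, 0)
LLANO = Chip("Llano (4 cores)", 4, 1, 0)

for chip in (FX_8150, TRINITY, LLANO):
    print(f"{chip.name}: L2 {chip.l2_total_mb} MB, "
          f"L3 {chip.l3_mb} MB, total {chip.total_mb} MB")
```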
AMD engineers made it clear that one of their main design goals for Piledriver was to improve IPC compared to Bulldozer. We knew this as far back as AMD’s original Bulldozer briefing, so it’s not a surprise. With FX, we saw that the architecture gave up significant per-clock performance compared to its predecessor, and that clearly needed to be addressed. The engineering team didn’t use just one magic bullet in its quest, but rather a variety of strategies that result in improved performance per clock.
Here are the main improvements implemented in the Piledriver core:
First, the branch predictor was significantly revamped and split into a two-level structure. Keeping the instruction pipeline flowing is critical when performance is the target, and while AMD didn’t disclose anything more specific, it did make it clear that branch prediction plays a significant role.
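AMD did not describe how Piledriver's predictor works, so the following is not a model of it. It is a minimal sketch of two textbook schemes (a 1-bit "last outcome" predictor and a 2-bit saturating counter) that shows why predictor design affects how often the pipeline has to be flushed. The function names and the loop pattern are illustrative assumptions.

```python
def mispredictions_one_bit(outcomes, start_taken=True):
    """Predict that each branch repeats its previous outcome."""
    prediction = start_taken
    misses = 0
    for taken in outcomes:
        if prediction != taken:
            misses += 1
        prediction = taken  # remember only the last outcome
    return misses


def mispredictions_two_bit(outcomes, start_state=2):
    """2-bit saturating counter: states 0-1 predict not taken, 2-3 taken."""
    state = start_state
    misses = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            misses += 1
        # Move one step toward the observed outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses


# Hypothetical loop branch: taken nine times, then falls through, ten times over.
loop_branch = ([True] * 9 + [False]) * 10

print("1-bit mispredictions:", mispredictions_one_bit(loop_branch))
print("2-bit mispredictions:", mispredictions_two_bit(loop_branch))
```

On this pattern the 2-bit counter mispredicts only the loop exit, while the 1-bit scheme also mispredicts the first iteration after every exit.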
Engineers also increased the size of the instruction window to allow a larger group of instructions to be processed; this improves performance and helps process operating system-level code more efficiently. New ISA instructions were added as well, including a fused multiply-add (FMA3) and a 16-bit floating-point convert (F16C). The Bulldozer architecture already supported FMA4, so the inclusion of FMA3 enables support for a capability that Intel will introduce in its next-gen architecture as well. According to AMD, instruction execution times were improved, resulting in faster floating-point and integer divides as well as faster calls and returns, which are critical for getting in and out of subroutines quickly. Page translation has also been optimized.
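To make the two instruction additions more concrete, here is a small software sketch of what they compute. It does not execute FMA3 or F16C instructions; it emulates a fused multiply-add (one rounding of the exact a*b+c) with Python's `fractions` module and a half-precision round trip with the `struct` module's `e` format. The function names and the sample operands are illustrative assumptions.

```python
import struct
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add: compute a*b+c exactly, round once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # single, correctly rounded conversion to double


def unfused_multiply_add(a: float, b: float, c: float) -> float:
    """Separate multiply and add: the product is rounded before the add."""
    return a * b + c


def half_round_trip(x: float) -> float:
    """Convert to IEEE 754 half precision (16 bits) and back."""
    (y,) = struct.unpack("<e", struct.pack("<e", x))
    return y


# Operands chosen so that the rounding of the product matters.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

print("unfused:", unfused_multiply_add(a, b, c))
print("fused:  ", fused_multiply_add(a, b, c))
print("0.1 as half precision:", half_round_trip(0.1))
```

With these operands the separate multiply rounds the product to 1.0 and the small residual is lost, whereas the fused form keeps it. The half-precision round trip shows the precision traded away for a 16-bit storage format.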
The memory subsystem is another key component of performance, and we saw early on that high cache latencies were one of Bulldozer’s key weaknesses. AMD engineers claim to have invested a lot of effort to improve Piledriver’s L2 cache and hardware prefetcher, purportedly reducing latencies when memory is read. AMD also says stream prediction is significantly better than in the previous generation of APUs.
The load/store unit has also been targeted as a place where latency can be reduced, so store-to-load reordering has been improved with follow-up reads to better anticipate compiler requests and reduce load latency. The L1 translation lookaside buffer (TLB) has been doubled to 64 entries, so more address translations hit in the TLB and fewer incur the latency of a miss. Finally, both the integer and floating-point schedulers have been improved to better utilize all of the hardware units that Piledriver has to offer.
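The effect of doubling a TLB can be illustrated with a toy model. The sketch below assumes a fully associative TLB with LRU replacement and 4 KB pages; none of these details come from AMD, and the `ToyTLB` class, the 48-page working set, and the access pattern are invented for illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed 4 KB pages


class ToyTLB:
    """Fully associative TLB with LRU replacement (illustrative only)."""

    def __init__(self, entries: int):
        self.entries = entries
        self.cached = OrderedDict()  # virtual page number -> placeholder
        self.hits = 0
        self.misses = 0

    def access(self, vaddr: int) -> None:
        vpn = vaddr // PAGE_SIZE
        if vpn in self.cached:
            self.cached.move_to_end(vpn)  # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            if len(self.cached) >= self.entries:
                self.cached.popitem(last=False)  # evict least recently used
            self.cached[vpn] = True


def run(entries: int, pages: int = 48, passes: int = 10):
    """Sweep repeatedly over a working set of `pages` distinct pages."""
    tlb = ToyTLB(entries)
    for _ in range(passes):
        for page in range(pages):
            tlb.access(page * PAGE_SIZE)
    return tlb.hits, tlb.misses


for size in (32, 64):
    hits, misses = run(size)
    print(f"{size}-entry TLB: {hits} hits, {misses} misses")
```

In this contrived case the 48-page working set thrashes a 32-entry TLB but fits entirely in 64 entries, so only the first pass misses. Real workloads fall somewhere in between.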
With improvements in clock rate (something we’ll talk about a little later), AMD claims its Trinity-based A10-5800K offers a 26% improvement over the Llano-based A8-3850 on the desktop, and that its A10-4600M shows a 29% improvement over the A8-3500M in notebooks.
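AMD did not break these claims down into clock and per-clock contributions. As a rough way to reason about them, single-threaded performance is often approximated as IPC times clock rate, so the two gains multiply. The sketch below uses that approximation; the 15% clock gain is a placeholder chosen for illustration, not a figure from AMD.

```python
def combined_speedup(ipc_gain: float, clock_gain: float) -> float:
    """Overall gain when performance is modeled as IPC x clock rate."""
    return (1.0 + ipc_gain) * (1.0 + clock_gain) - 1.0


def required_ipc_gain(total_gain: float, clock_gain: float) -> float:
    """IPC gain needed to reach total_gain for a given clock gain."""
    return (1.0 + total_gain) / (1.0 + clock_gain) - 1.0


claimed_total = 0.26     # AMD's desktop claim quoted above
assumed_clock = 0.15     # hypothetical clock-rate gain, for illustration only

ipc = required_ipc_gain(claimed_total, assumed_clock)
print(f"IPC gain implied by the assumed clock gain: {ipc:.1%}")
print(f"Check: combined gain = {combined_speedup(ipc, assumed_clock):.1%}")
```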
Those are pretty aggressive improvements, and we’ll be keeping them in mind as we run through our tests. But first, let’s take a look at the graphics segment of Trinity.
Hope its only the beginning of much more
Recently Charlie at semiaccurate (a massive amd fanboy) hinting an upcoming apple products, then I saw an article in thg that tells an upcoming mbp will using retina display... 15 inch retina will require huge gpu horsepower, my wild guess is mbp will use trinity as it's CPU.
Based on this, gaming is much better than old i5, but everything else including application performance is still better on the old Sandy architecture. I'm not really sure why I would buy a Trinity other than for a casual gaming laptop. Unfortunately, budget says that my laptops have to be used for business first, play time later.
Nice to see that Trinity and AMD have delivered the goods. I want a Trinity powered Ultrathin. Intel can stick their crap where the Sun don't shine.
BTW, Charlie @ SemiAccurate is not an AMD fanbois IME. He just calls it like it is. Reality bites sometimes be it Nvidia, AMD or Intel's problems. Denial never changes reality. It is what it is.
duckwithnukes: Where is the Intel HD 4000 vs. AMD Trinity comparison? Lazy reviewing at its finest.
A10-4600M laptops will be int eh $600-$700 neighborhood, and we're still waiting for Ivy bridge Core i5 to arrive in this price range.
We go over this. We also talk about how we'll do a follow up as soon as an appropriate product is available.
You need to read for it to make sense.
FlippyFlap, Apple doesn't use AMD and an HD4000 can power a retina display. I'm sure Apple has worked with Intel engineers to get the drivers right for retina displays which is HD4000's problem. HD4000 is still lacking in terms of driver support (one can see that from the OpenCL benches around the net where only 1/2 get acclerated on HD4000). When the drivers work right, there isn't much difference between Ivy and Trinity.
I agree with Cleeve and I personally hate comparing a reference system to a selling system anyway. Review 2 actual selling systems with similar parts and that gives you the benchmark.
This looks like a very nice effort from AMD. I really, really need to replace my notebook. It's a six year old Toshiba Satelite with an AMD 1.9 GHz Turion 64 X2 with intergrated X2100 graphics.... yeah. Ancient now, I know. I've been trying to figure out a sweet spot in power since my needs are kind of complex. Typically I don't need it to do much more than handle MSOffice and web surfing. But I also tend to use it for video gaming when am interesting game comes around and some work in PaintShop when I'm out of the house, or don't feel like sitting at my desktop. This may be a little closer to what I'd like. It would be nice to get a notebook that combines this with a really good discrete card (sort of like how some MacBook Pros have their dual graphics setup). Nevertheless, Trinity looks to be just about enough power and performance, but the question is price. If tradition holds, it should be a good price competitor with Intel, which is the most important part, otherwise I'd just buy a core I7 already.
In a related question, does Trinity's details and specs lead to any conclusions about what Piledriver desktop processors will be like?
So this means that AMD can kick Intel's ass in the gpu department for the moment while AMD suffers greatly in the CPU portion of the apu battle. Didn't I said before that Intel is trying to make an (proprietary) Intel only PC with no third party strings attached? We all know that there is no competition in the CPU battle when it comes to Intel. Still, i would like to see that the morons of intel to drop the price of their hardware for once and for all and drop ridiculously low end hardware out of production.
No WoW benchmarks this time? I was wondering if this might make a good laptop for WoW, but you guys failed me. :(