The Xbox One CPU: Compliments Of AMD's Jaguar µArch
A Familiar CPU Architecture
The Xbox One's CPU component brings Microsoft's architecture of choice back to x86, just like the first-generation Xbox. This time, the company kicked IBM to the curb and opted not to jump into another Intel/Nvidia love triangle. Instead, it put a ring on AMD's finger, tapping the same source for general-purpose and graphics computing IP. At the same time, Sony was courting AMD's semi-custom division as well. Both competitors ended up sharing a bed, but due to NDAs were never caught lying down.
When AMD acquired ATI back in 2006, the company promised integrated graphics paired with x86 resources. It took a while, but we saw the first deeply collaborative effort in AMD's Brazos platform for very low-power devices. While it proved to offer notable advantages over Intel's slow Atom, even the Zacate-based E-350 wasn't a performance powerhouse. Not long after, we were introduced to the Llano-based design, which existed primarily on the desktop between 65 and 100 W. But it was still a far cry from achieving the numbers we wanted to see in any one discipline.
Although even today's Richland-based APUs don't get us where one capable CPU and even mainstream discrete graphics can go, AMD did gain the experience needed to create a modular architecture with right-sized CPU and GPU resources to build more customized processors. And that's exactly what it did for Microsoft's Xbox One.
At the heart of AMD's Temash and Kabini APUs (covered in AMD's Kabini: Jaguar And GCN Come Together In A 15 W APU) is the energy-efficient Jaguar CPU architecture. It's an enhancement of the previous-gen Bobcat design, which powered the aforementioned E-350 and went up against Intel's Atom.
Jaguar maintains many of Bobcat's features, such as some of its cache structures, the ability to decode two instructions per cycle, and a similar branch predictor, but adds a number of enhancements covered in our linked launch coverage and summarized in the slides on this page. There's also an extra decode stage in the front-end, which lengthens the pipeline but plays a role in achieving higher operating clock rates.
In AMD's APU portfolio, Jaguar has access to a shared cache unit with 2 MB of 16-way associative L2 in quad-core configurations. The L2 is broken up into four 512 KB banks, but those banks are no longer associated with individual cores, as they were before. Both the Xbox One and Sony's PS4 employ eight Jaguar-based cores. Because Jaguar is organized into four-core arrays, these next-gen consoles actually implement two modules, doubling total L2 to 4 MB.
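The cache arithmetic in that paragraph can be checked with a short sketch. The figures come from the text; the constant and function names are illustrative only, and the count of four banks per module is simply inferred from 2 MB divided by 512 KB.

```python
# Figures from the article; names are hypothetical, chosen for illustration.
L2_BANK_KB = 512          # size of one L2 bank
BANKS_PER_MODULE = 4      # inferred: 2 MB per module / 512 KB per bank
CORES_PER_MODULE = 4      # Jaguar's quad-core configuration
CONSOLE_CORES = 8         # Xbox One and PS4


def module_count(cores: int) -> int:
    """Number of quad-core modules needed for the given core count."""
    return cores // CORES_PER_MODULE


def l2_total_kb(cores: int) -> int:
    """Total shared L2 in KB across all modules."""
    return module_count(cores) * BANKS_PER_MODULE * L2_BANK_KB


print("modules:", module_count(CONSOLE_CORES))
print("total L2 (MB):", l2_total_kb(CONSOLE_CORES) // 1024)
```

Note that each module's 2 MB is shared only among its own four cores, so the 4 MB total is really two separate 2 MB pools rather than one unified cache.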
Compared to what we're used to seeing in the desktop PC space, Jaguar is simple. Each core occupies 3.1 square millimeters of die space. Even with eight of these on a die, that's only 24.8 square millimeters. It's this efficient design that makes Jaguar viable in such a parallelized SoC, particularly when matched up to a much more complex graphics component. Programmed properly, the Xbox One should be capable of great multi-tasking, running social media, Skype, and Kinect-oriented tasks with minimal impact on gaming.
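As a quick check on the die-area figure, here is a minimal sketch using only the per-core number quoted above. It covers the cores alone; L2 cache, the graphics engine, and the rest of the SoC are not included, and the names are illustrative.

```python
# Per-core area quoted in the article; everything else on the die is excluded.
CORE_AREA_MM2 = 3.1


def cpu_core_area_mm2(cores: int) -> float:
    """Combined area of the CPU cores only, rounded to one decimal place."""
    return round(cores * CORE_AREA_MM2, 1)


print(cpu_core_area_mm2(8))
```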
In fact, we recently spent some time on the phone with Johan Andersson of DICE, who mentioned that his team is seeing 85-95% CPU utilization on next-gen consoles by virtue of optimizations for scaling in its Frostbite 3 game engine. It looks like the developer community is really ramping up efforts to more thoroughly program to many cores, which could yield great returns on the PC, too.
Microsoft's first developer kits featured 1.6 GHz x86 cores. For production, frequencies were boosted to 1.75 GHz. Although combining a less potent architecture with fairly conservative clock rates might appear to leave the Xbox One underpowered, there's a lot of emphasis on parallelism, not only in the CPU structure, but also in offloading compute tasks to the graphics engine. This is where AMD's HSA efforts will hopefully bear fruit.
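To put the clock bump in perspective, the sketch below works out the relative increase from the two frequencies quoted above, along with a naive aggregate across eight cores. The aggregate assumes perfect scaling, which real workloads won't reach, so treat it as an upper bound rather than a performance estimate. Names are illustrative.

```python
# Clock rates from the article; function names are hypothetical.
DEV_KIT_GHZ = 1.6
PRODUCTION_GHZ = 1.75
CORES = 8


def uplift_percent(old_ghz: float, new_ghz: float) -> float:
    """Relative clock increase, in percent."""
    return (new_ghz / old_ghz - 1.0) * 100.0


def aggregate_ghz(cores: int, ghz: float) -> float:
    """Naive sum of per-core clocks, assuming perfect parallel scaling."""
    return cores * ghz


print(f"clock uplift: {uplift_percent(DEV_KIT_GHZ, PRODUCTION_GHZ):.1f}%")
print(f"aggregate: {aggregate_ghz(CORES, PRODUCTION_GHZ)} GHz across {CORES} cores")
```

The increase works out to a little over 9 percent, which helps explain why the emphasis falls on using all eight cores and the GPU rather than on per-core speed.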