AMD to Make Hybrid CPUs, Also Using AI for Chip Design: CTO Papermaster at ITF World

AMD CTO Mark Papermaster at ITF World 2023
(Image credit: Tom's Hardware)

I met with AMD CTO Mark Papermaster on the sidelines of ITF World, a conference hosted by semiconductor research firm imec in Antwerp, Belgium, to discuss some of AMD’s plans for the future. The highlights include Papermaster’s revelation that AMD will, for the first time, bring hybrid architectures to its lineup of consumer processors. These designs mix larger cores built for performance with smaller efficiency cores, much like Intel’s competing 13th-Gen chips. Papermaster also spoke about AMD’s current use of AI in its semiconductor design, testing, and verification phases, and about the challenges associated with the company’s plans to use generative AI more extensively for chip design in the future. The full conversation is below.

Mark Papermaster has served as AMD’s Chief Technology Officer (CTO) and SVP/EVP of Technology and Engineering since 2011. He's directed AMD's technology development for over a decade, laying the technological cornerstones of the company’s resurgence against industry stalwart Intel, giving him incredible insight into the company's past, present, and future.

Imec’s ITF World 2023 conference featured a string of keynotes from luminaries in the semiconductor industry, like AMD’s Mark Papermaster, Intel’s Ann Kelleher, Nvidia’s Jensen Huang, imec’s Luc Van den hove, and ASML’s Christophe Fouquet.

Papermaster’s presentation centered on the fact that computing is now being gated by power efficiency as Moore’s Law slows. We’ll cover that and many of the other highlights of the event over the coming days. First, here’s our interview with Papermaster before his keynote:

More Cores, With a New Twist

Paul Alcorn: I interviewed you back in 2019 at the Supercomputing conference and we talked about increasing CPU core counts. You said at the time that you see a runway for more cores, and that you don't see a saturation point in the foreseeable future. At the time AMD was at a peak of 64 cores [Rome data center chips], and now you're at 96 for Genoa. AMD was also at 16 cores for Ryzen [3000] for desktop PCs, and now you're still at 16 cores for Ryzen [7000] - so two generations of 16-core chips for client [PCs].

So now, today, do you still see a runway for more cores in data center chips? Additionally, do you see the need for more cores in the client space [PCs] now that it's at 16 cores? It's been two generations with 16, is that going to be a sweet spot moving forward?

Mark Papermaster: What you're going to see in PCs, as well as in the data center, is more bifurcation of tailored SKUs and processors coming out. Because now, one size really doesn't fit all; we're not even remotely close to that. You're going to have a set of applications that are actually just fine with today's core count configurations, because certain software and applications are not rapidly changing. But what you're going to see is that you might need, in some cases, static CPU core counts, but additional acceleration.

So, if you look at what we've done in desktop for Ryzen, we've actually added a GPU with our CPU. And that's because it really creates a very dense and power-efficient offering, and if you don't need a high-performance GPU, you can save energy with that sort of tailored configuration. If you do need tailored, extensive acceleration, you can still bolt on a discrete GPU. And the other example in PCs is the Ryzen 7040; we've actually added AI acceleration right into the APU.

But what you'll also see is more variations of the cores themselves, you'll see high-performance cores mixed with power-efficient cores mixed with acceleration. So where, Paul, we're moving to now is not just variations in core density, but variations in the type of core, and how you configure the cores. It's not only how you've optimized for either performance or energy efficiency, but stacked cache for applications that can take advantage of it, and accelerators that you put around it.

When you go to the data center, you're also going to see a variation. Certain workloads move more slowly [...] You might be in that sweet spot of 16 to 32 cores on a server. But many businesses are indeed adding point AI applications and analytics. AI is not only going to be in the cloud, where the heavy training and large language model inferencing will continue; you're going to see AI applications at the edge. And it's going to be in enterprise data centers as well. They're also going to need different core counts and accelerators.
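Hybrid designs like these lean on the operating system's scheduler to match threads to core types, but software can steer work explicitly, too. As a rough illustration - not AMD's implementation, and with an invented core numbering - here's a minimal Linux Python sketch that pins latency-sensitive work to hypothetical performance cores and batch work to efficiency cores:

```python
import os

# Hypothetical core layout: on a real hybrid chip you would discover this
# via lscpu or /sys/devices/system/cpu; these index sets are invented.
PERFORMANCE_CORES = {0, 1, 2, 3, 4, 5, 6, 7}       # "big" cores
EFFICIENCY_CORES = {8, 9, 10, 11, 12, 13, 14, 15}  # "little" cores

def run_pinned(cores, fn, *args):
    """Run fn with this process constrained to the given cores (Linux-only)."""
    previous = os.sched_getaffinity(0)   # 0 = the current process
    os.sched_setaffinity(0, cores)       # steer subsequent work onto this set
    try:
        return fn(*args)
    finally:
        os.sched_setaffinity(0, previous)  # restore the original affinity

def latency_sensitive_task():
    return sum(i * i for i in range(1_000_000))    # stand-in for hot-path work

def background_task():
    return sorted(range(1_000_000), reverse=True)  # stand-in for batch work

run_pinned(PERFORMANCE_CORES, latency_sensitive_task)
run_pinned(EFFICIENCY_CORES, background_task)
```

In practice, the OS scheduler makes these placement decisions automatically; explicit pinning like this is mostly useful for benchmarking or tightly controlled server workloads.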

Paul Alcorn: So, it's probably safe to say that a hybrid architecture will be coming to client [consumer PCs]?

Mark Papermaster: Absolutely. It's already there today, and you'll see more coming.

AMD Already Using AI to Design Chips, Moving Into Generative AI

Paul Alcorn: We're starting to see a lot of AI used in chip design, a lot of use of AI and machine learning for certain things like macro placement, which is where most of the public-facing research and developments are happening. Can you tell me a little bit about AMD's design efforts? Is AMD using any type of AI or exploring that area for certain functions of chip design?

Mark Papermaster: We absolutely are. [...] How we think about it internally at AMD is, we need to practice what we preach. So we are applying AI today in chip design. We're using it in 'place and route,' both in how we position and optimize the sub-blocks of each of our chip designs to get more performance and to lower the energy [consumption]. AI does an amazing job of having an infinite appetite to iterate, iterate, and iterate until you have a truly optimal solution. But it's not just iterating; we could do that before. It's iterating and learning. It's looking at what patterns created the best design, and so it's actually speeding the rate at which you arrive at an optimized layout of your chip design elements, and therefore giving you higher performance and lower energy - just like we're doing with how we optimize our chiplet partitioning and chiplet placement.
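To make the "iterate and learn" idea concrete, here's a toy placement loop in Python - a generic simulated-annealing sketch, not AMD's tooling - that nudges a handful of blocks around a grid to shrink total wirelength. The ML-guided flows Papermaster describes additionally learn from past runs which candidate moves tend to pay off:

```python
import math
import random

NETS = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]  # made-up block connectivity

def wirelength(pos):
    # Total Manhattan distance across all nets -- the cost to minimize.
    return sum(abs(pos[a][0] - pos[b][0]) + abs(pos[a][1] - pos[b][1])
               for a, b in NETS)

def anneal(pos, steps=20_000, temp=5.0, cooling=0.9995):
    cost = wirelength(pos)
    for _ in range(steps):
        blk = random.randrange(len(pos))
        old = pos[blk]
        pos[blk] = (old[0] + random.randint(-1, 1),
                    old[1] + random.randint(-1, 1))
        new_cost = wirelength(pos)
        # Always keep improvements; keep regressions with shrinking odds.
        if new_cost > cost and random.random() >= math.exp((cost - new_cost) / temp):
            pos[blk] = old  # reject the move
        else:
            cost = new_cost
        temp *= cooling
    return pos, cost

blocks = [(random.randint(0, 20), random.randint(0, 20)) for _ in range(4)]
print(anneal(blocks))  # final placement and wirelength
```

Real place-and-route operates on millions of cells with far richer cost functions, but the loop structure - propose, evaluate, keep or reject, repeat - is the part the learning accelerates.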

We're also using it across our verification suites to drive a reduction in the time it takes to find any bugs in the typical iterative process of bringing a chip from concept through the whole verification and validation phase. And we're even using it in test pattern generation. When you have many billions of transistors in a chip design, getting the test coverage to ensure your product is flawless as it leaves the manufacturing floor is a real challenge. It turns out you can leverage AI to look at the learning: How did I get test coverage? Where are the gaps? What's most effective? And get the controllability and observability to maximize test coverage. AI can really speed this up by building off of each successive run, learning from it, and shortening the time it takes to go after any remaining holes in test coverage.
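The coverage-closure loop he describes can be sketched in a few lines. This is a hypothetical illustration, not AMD's flow: generate stimulus, record which coverage bins each test hits, and bias the next round toward seeds that previously found new coverage:

```python
import random

N_BINS = 64  # hypothetical functional-coverage points in the design

def run_test(seed):
    """Stand-in for a simulation run: return the set of coverage bins hit."""
    rng = random.Random(seed)
    return {rng.randrange(N_BINS) for _ in range(5)}

covered = set()
fruitful_seeds = []  # stimuli that previously uncovered new bins

for trial in range(500):
    if fruitful_seeds and random.random() < 0.5:
        # Mutate a seed that paid off before (purely illustrative here;
        # a real flow mutates stimulus actually correlated with coverage).
        seed = random.choice(fruitful_seeds) + random.randrange(1, 100)
    else:
        seed = random.randrange(1 << 30)  # fresh random stimulus
    new_bins = run_test(seed) - covered
    if new_bins:
        covered |= new_bins
        fruitful_seeds.append(seed)  # remember what found new coverage
    if len(covered) == N_BINS:
        print(f"full coverage after {trial + 1} tests")
        break
```

The payoff of learning from each run is fewer wasted simulations: instead of hammering already-covered behavior, the generator concentrates effort on the remaining gaps.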

Paul Alcorn: Do you see a future where it could design microarchitectures, or certain chip functions?

Mark Papermaster: There's no question that we're just at such an early phase where you can start thinking about generative AI to actually create new approaches and build upon existing designs you have. We're doing that today with software. So, who would have thought the GitHub Copilot application could get such rapid adoption? Usually, new design tools take quite some time to permeate, but we're already experimenting with Copilot and looking at how it could best be deployed.

Like everyone else in the industry thinking about code generation, we have to first assure ourselves that we're protecting IP. If you're drawing upon an existing knowledge base, are you protected - is that code safe to use? Is it protected from an IP reuse standpoint? So the biggest barriers to adoption right now are, frankly, ensuring that you have the right model, that the source of the data used to train that model has you in the clear, and that the iterations you do with generative AI to create next designs remain your IP - your company's learnings, which your engineers are using to iterate to [garbled].

It is absolutely capable today. But you have to make sure that you have an environment where you're protecting the ideas that you have, that creation process. If you use public models and you build upon them, you're actually donating your creative process back to the community. So, you actually have IP considerations as well in how you deploy.

The short answer to your question is, we're going to solve all of those constraints, and you'll see more and more generative AI used in the chip design process itself. It is being used in point applications today. But as an industry, over the next one to two years, I think that we'll have the proper constraints to protect IP, and you're going to start seeing production applications of generative AI to speed the design process.

Paul Alcorn: And even create designs.

Mark Papermaster: It won't replace designers, but I think it has a tremendous capability to speed design. It's no different than when you think about the broader creative process. Will generative AI replace artistry? Will it replace novels? No. But can it assist the process? Can it speed a student in creating some of the base elements, and they can then add their own thinking on top of it? Will it add to the artists' and novelists' creative process? Yes. And will it speed future chip designs? Absolutely. But we have a few hurdles that we have to get our arms around in the short term.

Sustainable Chip Manufacturing

Paul Alcorn: There's a big focus on power efficiency, and everyone focuses on reducing the carbon footprint when the chips are actually operating, and that is extremely important. But there's also a substantial, and probably growing, carbon footprint associated with actually manufacturing the chips. There’s EUV’s higher power consumption, a lot of new materials involved, and more processes, more steps - it's becoming increasingly energy intensive. How does AMD deal with that, to minimize it? Is that something you do, or is that entirely up to TSMC? Or is this a collaborative effort?

Mark Papermaster: We’re supportive of those efforts, and I think they'll grow over time. Today, we are largely reliant on our foundry partners who are the experts in driving the best use of materials and the manufacturing processes. But we work with them to understand those techniques and we're constantly asking what we can do with our design approaches to help reduce the carbon footprint.

In addition, we are members of imec, right here in Belgium, so there's a tie-in to the conference here. And imec actually spans the disciplines. If you look at what imec does, it's usually five to 10 years in advance of mass production. So it's able to intercept the industry while the new technology nodes are in development, and all the key players doing the silicon, the materials and the equipment, or the packaging are involved in imec, as are the fabless design houses such as AMD. So they're in a perfect spot to intercept in that early ideation phase and help drive more sustainable solutions, and they are doing that.

Imec has kicked off a sustainable design initiative across its members. Stay tuned - I can't say more on that, but you might hear more at tomorrow's conference. And when you do, know that it is just kicking off, but through our design efforts we intend to support imec as a central point to help identify, across the semiconductor ecosystem, the levers we can pull to drive more sustainable solutions and reduce the carbon footprint of our industry.

[Papermaster was referring to this announcement: GlobalFoundries, Samsung Electronics, and TSMC Join Imec’s “Sustainable Semiconductor Technologies & Systems” (SSTS) Program]

The Impact of Increased Power Consumption on Chip Cooling

Paul Alcorn: Power consumption has become quite the challenge - the power consumption of the latest Genoa chips jumped up quite a bit. And while some people look on the surface and say, 'wow, that's so much power,' it's actually very power efficient and very dense - and that's good. You're getting more [work] done in a smaller space. But that has created some cooling challenges.

I was really impressed with some of the custom coolers made for Genoa servers to deal with that, and the innovations around them. Do you see future generations of chips requiring even more robust cooling solutions? Are we reaching the point where air cooling hits an effective limit, and we'll need to start moving to liquid or maybe immersion cooling?

Mark Papermaster: We don't see that limit yet. At AMD, we're going to continue to offer air-cooled solutions. But what you're going to see more and more is a subset of customers that really want the highest compute density, really leveraging the efficiency and the core density we have. They'll be optionally choosing to do closed-loop cooling solutions. We see that already today with Genoa-based systems. 

I think what you'll see in the future is a real choice based on how you're choosing to optimize your data center. Are you legacy; do you want to leverage the legacy infrastructure you have? If so, we absolutely will support that, we're going to have air-cooled solutions for that. Are you building a new data center, and do you want to have it outfitted for the highest density that you can achieve? If so, you're going to adopt closed-loop cooling solutions.

Paul Alcorn
Managing Editor: News and Emerging Tech

Paul Alcorn is the Managing Editor: News and Emerging Tech for Tom's Hardware US. He also writes news and reviews on CPUs, storage, and enterprise hardware.

  • Metal Messiah.
    When it comes to the latest XDNA AI Engine (not specifically generative AI), the Ryzen AI co-processor might boost AI capabilities even on Ryzen consumer CPUs. But one major hurdle/red flag is the cost and overall value proposition of the chip, and using AI in consumer space makes little sense.

    The major barrier to implementing Ryzen AI is the cost & that there has to be a good enough and actual reason to put Ryzen AI within budget chips & even desktop SKUs.

    But adding Ryzen AI to a Threadripper chip with high core counts might be a good idea; even then, while it might be usable for training purposes, software may not necessarily use it.

    I think the main driving force that can overcome the value & cost barrier will be the "software". As software evolves and makes use of AI better and adds more value, then there definitely becomes a good reason to have dedicated AI hardware blocks on your chips. Otherwise not.
  • Amdlova
    Welcome to the FX era. My AI prediction: cut design costs, throw something so-so on the market, and brag that it's the ultimate product. :)
  • Makaveli
    Metal Messiah. said:
    When it comes to the latest XDNA AI Engine (not specifically generative AI), the Ryzen AI co-processor might boost AI capabilities even on Ryzen consumer CPUs. But one major hurdle/red flag is the cost and overall value proposition of the chip, and using AI in consumer space makes little sense.

    The major barrier to implementing Ryzen AI is the cost & that there has to be a good enough and actual reason to put Ryzen AI within budget chips & even desktop SKUs.

    But adding Ryzen AI to a Threadripper chip with high core counts might be a good idea; even then, while it might be usable for training purposes, software may not necessarily use it.

    I think the main driving force that can overcome the value & cost barrier will be the "software". As software evolves and makes use of AI better and adds more value, then there definitely becomes a good reason to have dedicated AI hardware blocks on your chips. Otherwise not.
    Great post as usual, Metal. I like your analysis and projection of where this is going.
  • PlaneInTheSky
    AI has now been in a 10 year long hype cycle, and it's still dumb as a rock.

    [Image: https://i.postimg.cc/MZjNXtpc/Screenshot-2023-05-16-at-10-25-45-AM.jpg]
  • Eximo
    That seems more like a string error - looking for "e" rather than e inside of another string.
  • helper800
    Eximo said:
    That seems more like a string error - looking for "e" rather than e inside of another string.
    Exactly. "e" is not a letter if the quotations are maintained in the query.
  • -Fran-
    More math co-processors!

    Regards :P
  • ezst036
    I'm glad AMD has officially decided to embrace big.LITTLE for future processors. It was kind of inevitable, and with big.LITTLE now being mainstreamed across the board by all consumer CPU producers, this will further delay a post-x86 future.

    I'm not entirely put off by it. I'd much rather move directly from x86 --> RISC-V than have to move to ARM and then move again to RISC-V at some point later. The RISC-V platform just isn't ready yet and it doesn't seem to be accelerating in our favor.
  • Giroro
    I don't like how this guy talks. He would rather use big words and buzzwords than the correct words. That's just poor communication.
  • RichardtST
    I'm *NOT* paying for useless little slow cores. Forget it. I do not want them. I do not want to be charged for them. They are completely worthless in every last one of my applications both at home and at the office. Give me my 16 fast cores and bugger-off with the rest.