AMD announces unified UDNA GPU architecture — bringing RDNA and CDNA together to take on Nvidia's CUDA ecosystem

Radeon RX 7900 XTX (Image credit: AMD)

Here in Berlin, Germany, at IFA 2024, AMD's Jack Huynh, the senior vice president and general manager of the Computing and Graphics Business Group, announced that the company will unify its consumer-focused RDNA and data center-focused CDNA architectures into a single microarchitecture, named UDNA. The unified design will set the stage for the company to tackle Nvidia's entrenched CUDA ecosystem more effectively. The announcement comes as AMD has decided to deprioritize high-end gaming graphics cards in order to accelerate market share gains.

When AMD moved on from its GCN microarchitecture back in 2019, the company split its graphics designs in two: RDNA powers gaming graphics products for the consumer market, while CDNA caters specifically to compute-centric AI and HPC workloads in the data center.

Huynh explained the reasoning behind the original split, as well as the rationale for moving forward with a new unified design, in a Q&A session with the press. We also followed up for more details about the forthcoming architecture. Here's a lightly edited transcript of the conversations:

Jack Huynh [JH], AMD: So, part of a big change at AMD is today we have a CDNA architecture for our Instinct data center GPUs and RDNA for the consumer stuff. It’s forked. Going forward, we will call it UDNA. There'll be one unified architecture, both Instinct and client [consumer]. We'll unify it so that it will be so much easier for developers versus today, where they have to choose and value is not improving.

We forked it because then you get the sub-optimizations and the micro-optimizations, but then it's very difficult for these developers, especially as we're growing our data center business, so now we need to unify it. That's been a part of it. Because remember what I said earlier? I'm thinking about millions of developers; that’s where we want to get to. Step one is to get to the hundreds, thousands, tens of thousands, hundreds of thousands, and hopefully, one day, millions. That's what I'm telling the team right now. It’s that scale we have to build now.

Tom's Hardware, Paul Alcorn [PA]: So, with UDNA bringing those architectures back together, will all of that still be backward compatible with the RDNA and the CDNA split?

JH: So, one of the things we want to do is... we made some mistakes with the RDNA side; each time we change the memory hierarchy, the subsystem, it has to reset the matrix on the optimizations. I don't want to do that.

So, going forward, we’re thinking about not just RDNA 5, RDNA 6, RDNA 7, but UDNA 6 and UDNA 7. We plan the next three generations because once we get the optimizations, I don't want to have to change the memory hierarchy, and then we lose a lot of optimizations. So, we're kind of forcing that issue about full forward and backward compatibility. We do that on Xbox today; it’s very doable but requires advanced planning. It’s a lot more work to do, but that’s the direction we’re going.

PA: When you bring this back to a unified architecture, this means, just to be clear, a desktop GPU would have the same architecture as an MI300X equivalent in the future? Correct?

JH: It's a cloud-to-client strategy. And I think it will allow us to be very efficient, too. So, instead of having two teams do it, you have one team. It’s not doing something that's that crazy, right? We forked it because we wanted to micro-optimize in the near term, but now that we have scale, we have to unify back, and I believe it's the right approach. There might be some little bumps.

PA: So, this merging back together, how long will that take? How many more product generations before we see that?

JH: We haven’t disclosed that yet. It’s a strategy. Strategy is very important to me. I think it’s the right strategy. We’ve got to make sure we’re doing the right thing. In fact, when we talk to developers, they love it because, again, they have all these other departments telling them to do different things, too. So, I need to reduce the complexity.

[...] From the developer's standpoint, they love this strategy. They actually wish we did it sooner, but I can't change the engine when a plane’s in the air. I have to find the right way to setpoint that so I don’t break things.

[End of Huynh's comments]

Yes, high-end silicon can build markets, but ultimately, software support tends to define the winners and losers. Nvidia has taught a master class in how to build a seemingly impenetrable moat with its unparalleled proprietary CUDA ecosystem.

Nvidia began laying the foundation of its empire when it started with CUDA eighteen long years ago, and perhaps one of its most fundamental advantages is signified by the 'U' in CUDA, the Compute Unified Device Architecture. Nvidia has but one CUDA platform for all uses, and it leverages the same underlying microarchitectures for AI, HPC, and gaming.

Huynh told me that CUDA has four million developers, and his goal is to pave the path for AMD to see similar success. That's a tall order. AMD continues to rely on the open source ROCm software stack to counter Nvidia, but that requires buy-in from both users and the open source community that will shoulder some of the burden of optimizing the stack. Anything AMD can do to simplify that work, even if it comes at the cost of some micro-optimizations for certain types of applications/games, will help accelerate that ecosystem.
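
To illustrate what that developer work looks like, consider how closely ROCm's HIP runtime mirrors CUDA at the source level. Below is a minimal HIP vector-add sketch, our illustration rather than AMD sample code; it assumes a working ROCm install and the hipcc compiler, but the API names shown (hipMalloc, hipMemcpy, the triple-chevron kernel launch) are the genuine HIP counterparts to their CUDA equivalents.

    // Minimal HIP vector add: the ROCm analog of an introductory CUDA program.
    #include <hip/hip_runtime.h>
    #include <cstdio>
    #include <vector>

    __global__ void vec_add(const float* a, const float* b, float* c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
        if (i < n) c[i] = a[i] + b[i];
    }

    int main() {
        const int n = 1 << 20;
        std::vector<float> ha(n, 1.0f), hb(n, 2.0f), hc(n);

        float *da, *db, *dc;
        hipMalloc((void**)&da, n * sizeof(float));  // cf. cudaMalloc
        hipMalloc((void**)&db, n * sizeof(float));
        hipMalloc((void**)&dc, n * sizeof(float));
        hipMemcpy(da, ha.data(), n * sizeof(float), hipMemcpyHostToDevice);
        hipMemcpy(db, hb.data(), n * sizeof(float), hipMemcpyHostToDevice);

        vec_add<<<(n + 255) / 256, 256>>>(da, db, dc, n);  // same launch syntax as CUDA

        hipMemcpy(hc.data(), dc, n * sizeof(float), hipMemcpyDeviceToHost);
        printf("hc[0] = %f\n", hc[0]);  // expect 3.0
        hipFree(da); hipFree(db); hipFree(dc);
        return 0;
    }

Swap the hip prefixes for cuda and essentially the same file compiles as CUDA; that near one-to-one mapping is the premise of AMD's HIPIFY porting tools, and it is the mechanical part. The hard part, the one UDNA targets, is keeping performance optimizations valid from one hardware generation to the next.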

AMD has taken its fair share of criticism for the often scattered efficacy of the ROCm stack. When it bought Xilinx in 2022, AMD even announced that it would put Victor Peng, the then-CEO of Xilinx, in charge of a unified ROCm team to bring the project under tighter control (Peng recently retired). That effort has yielded at least some fruit, but AMD continues to receive criticism for the state of its ROCm stack — it's clear the company has plenty of work ahead to fully put itself in a position to take on Nvidia's CUDA.

The company also remains focused on ROCm despite the emergence of the UXL Foundation, an open software ecosystem for accelerators that is getting broad support from other players in the industry, like Qualcomm, Samsung, Arm, and Intel.

What precisely will UDNA change compared to the current RDNA and CDNA split? Huynh didn't go into much detail, and obviously there's still plenty of groundwork to be laid. But one clear potential pain point has been the lack of dedicated AI acceleration units in RDNA. Nvidia brought tensor cores to the entire RTX line starting in 2018. AMD has only limited AI acceleration in RDNA 3, which basically accesses the FP16 units in a more optimized fashion via WMMA instructions, while RDNA 2 depends purely on the GPU shaders for such work.
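
For readers unfamiliar with what those units compute: a WMMA (Wave Matrix Multiply Accumulate) instruction, like an Nvidia tensor core operation, performs a small fixed-size matrix multiply-accumulate per wavefront, commonly D = A x B + C on 16 x 16 tiles with FP16 inputs and FP32 accumulation. The plain C++ sketch below is only a CPU reference for those semantics, not the RDNA 3 intrinsic itself, and the tile size and formats reflect the commonly documented configuration.

    // CPU reference for the 16x16x16 multiply-accumulate (D = A*B + C) that
    // WMMA instructions and tensor cores execute in hardware.
    constexpr int M = 16, N = 16, K = 16;

    // _Float16 requires a reasonably recent GCC/Clang; substitute your own
    // half-precision type if your compiler lacks it.
    void wmma_reference(const _Float16 A[M][K], const _Float16 B[K][N],
                        const float C[M][N], float D[M][N]) {
        for (int m = 0; m < M; ++m) {
            for (int n = 0; n < N; ++n) {
                float acc = C[m][n];                         // FP32 accumulator
                for (int k = 0; k < K; ++k)
                    acc += float(A[m][k]) * float(B[k][n]);  // FP16 inputs, FP32 math
                D[m][n] = acc;
            }
        }
    }

Without such instructions, RDNA 2 grinds through the inner loop above on its general-purpose shader ALUs; WMMA lets RDNA 3 feed its FP16 pipes whole tiles at a time, and CDNA's dedicated matrix cores go further still.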

Our assumption is that, at some point, AMD will bring full stack support for tensor operations to its GPUs with UDNA. CDNA has had such functional units since 2020, with increased throughput and number format support being added with CDNA 2 (2021) and CDNA 3 (2023). Given the preponderance of AI work being done on both data center and client GPUs these days, adding tensor support to client GPUs seems like a critical need.

The unified UDNA architecture is a good next logical step on the journey to competing with CUDA, but AMD has a mountain to climb. Huynh wouldn't commit to a release date for the new architecture, but given the billions of dollars at stake in the AI market, it's obviously going to be a top priority to execute the new microarchitectural strategy. Still, with what we've heard about AMD RDNA 4, it appears UDNA is at least one more generation away.

Paul Alcorn
Managing Editor: News and Emerging Tech

Paul Alcorn is the Managing Editor: News and Emerging Tech for Tom's Hardware US. He also writes news and reviews on CPUs, storage, and enterprise hardware.

  • Elusive Ruse
I can’t fault the general message and their strategy in going unified, but considering Huynh was evasive when you asked about a clear timeline for implementation, I guess I’ll believe it when I see it.
  • -Fran-
    So... They're bringing GCN back from the dead? LOL.

    Christ...

    EDIT: Just to add a bit more to my knee-jerk reaction to the overall information (thanks for the interviews, BTW!) in regards to my comment...

AMD is missing something crucial, which was very succinctly pointed out in the article: longevity. It doesn't matter what you call it, how you market it, or what you tell the world you'd be doing technically. The reason why CUDA is king is longevity and support. AMD needs to stop screwing around with the long-term strategy and flip-flopping too much, and stick to something for longer than 3 generations. Whatever they create, please do stick to it and support it. Anyone remember HSA? What about Audio Acceleration? And a few other techs which they put out that didn't get adoption and were dropped, but were good ideas. Just, in very AMD fashion, pushed terribly into the market.

    Regards.
  • Pierce2623
    Could Nvidia legally prevent AMD from making their GPUs capable of running CUDA code? If AMD could run CUDA code natively, they’d literally be right back in the game in the workplace.
  • ET3D
    Elusive Ruse said:
I can’t fault the general message and their strategy in going unified, but considering Huynh was evasive when you asked about a clear timeline for implementation, I guess I’ll believe it when I see it.
Huynh said: "we’re thinking about not just RDNA 5, RDNA 6, RDNA 7, but UDNA 6 and UDNA 7", which I think is indicative of the time frame. We won't get to see the fruits of this until the RDNA 6 generation at least, so it's a few years down the road. Then again, it's not clear what he means, as it implies that we will have RDNA 6 alongside UDNA 6.
  • rluker5
    ET3D said:
Huynh said: "we’re thinking about not just RDNA 5, RDNA 6, RDNA 7, but UDNA 6 and UDNA 7", which I think is indicative of the time frame. We won't get to see the fruits of this until the RDNA 6 generation at least, so it's a few years down the road. Then again, it's not clear what he means, as it implies that we will have RDNA 6 alongside UDNA 6.
I kind of saw it as him letting us know that they are considering replacing RDNA 6 with UDNA 6, or they might do it at RDNA 7. They are weighing their options but don't know for sure when the replacement will occur yet. Since he announced it, they must have a good degree of certainty that they plan to proceed, but since it seems like a bad idea, I will also believe it when I see it.

    Maybe it is what AMD needs to do to effectively get tensor type cores in their gaming GPUs to catch up with Nvidia and Intel in this regard.
  • Makaveli
    Elusive Ruse said:
I can’t fault the general message and their strategy in going unified, but considering Huynh was evasive when you asked about a clear timeline for implementation, I guess I’ll believe it when I see it.
I think RDNA 5 is already in the pipeline, so I would assume it's GPUs after that gen; maybe that is why he is being vague.

    Pierce2623 said:
    Could Nvidia legally prevent AMD from making their GPUs capable of running CUDA code? If AMD could run CUDA code natively, they’d literally be right back in the game in the workplace.
Native will never happen; there are, however, other options:

    https://www.tomshardware.com/tech-industry/new-scale-tool-enables-cuda-applications-to-run-on-amd-gpus
    https://www.xda-developers.com/nvidia-cuda-amd-zluda/
  • stuff and nonesense
    Pierce2623 said:
    Could Nvidia legally prevent AMD from making their GPUs capable of running CUDA code? If AMD could run CUDA code natively, they’d literally be right back in the game in the workplace.
AMD GPUs can run CUDA; AMD pulled the plug on the software project. Technically, there is no physical reason AMD couldn’t make the hardware interface CUDA compliant, but Nvidia has the user/license agreement tied down such that CUDA code can only be run on Nvidia hardware… the lawyers would get richer.
  • jlake3
    Pierce2623 said:
    Could Nvidia legally prevent AMD from making their GPUs capable of running CUDA code? If AMD could run CUDA code natively, they’d literally be right back in the game in the workplace.
Nvidia’s EULA/ToS and their aggressive enforcement thereof have made it legally risky to sell or use a translation layer in a corporate environment, and have made native CUDA support from AMD (or Intel) effectively impossible.
  • -Fran-
Playing devil's advocate: Nvidia can license the use of CUDA. They more than likely won't, much like Intel did not want AMD to use x86 (slightly different, but it applies here). Both AMD and Intel could just help OpenCL be relevant, but they aren't, because they each want their own stuff to be relevant, which is hilarious to watch (how they fail).

I'd even say Intel has seen more success than AMD on that front with oneAPI. ROCm has seen adoption, but in the end it's just not as good as a common open standard, even if they rely on or use OpenCL heavily (BLAS, for instance). Expand OpenCL, IMO, but they won't. Maybe Khronos would be to blame there? Not sure. Just throw money at the problem, I guess.

    Regards.
  • hotaru251
    Pierce2623 said:
    Could Nvidia legally prevent AMD from making their GPUs capable of running CUDA code?
Yes. In fact, ZLUDA was able to run CUDA on AMD GPUs... and Nvidia, within I think a month of the dev making it public on GitHub, updated their terms of use so that CUDA can only be used on Nvidia hardware.

Now this "could" change, as France is taking a legal stance against Nvidia over its CUDA dominance, but that won't play out for years, and nothing may change.