AMD strikes back at Nvidia with new MI300X benchmarks — MI300X shows 30% higher performance than H100, even with an optimized software stack

AMD Instinct MI300X GPU for AI and High-Performance Computing Workloads

(Image credit: AMD)

Neither AMD nor Nvidia intends to back out of this argument involving the performance difference between the Instinct MI300X and the H100 (Hopper) GPUs. But AMD does make some strong points while comparing FP16 using vLLM, which is a more popular choice against FP8, which works only with TensorRT-LLM.

The red team announced the MI300X graphics accelerator early this December, claiming up to 1.6X lead over Nvidia's H100. Two days ago, Nvidia fired back by saying AMD did not use its optimizations when comparing the H100 with TensorRT-LLM. The reply reached a single H100 against eight-way H100 GPUs while running the Llama 2 70B chat model.

The Continued War of Benchmark Results and Test Scenarios

In this latest response, AMD said that Nvidia used a selective set of inferencing workloads. It further identified that Nvidia benchmarked these using its in-house TensorRT-LLM on H100 rather than vLLM, an open-source and widely used method. Furthermore, Nvidia used the vLLM FP16 performance datatype on AMD while comparing its results with DGX-H100, which used the TensorRT-LLM with FP8 datatype to display these alleged misconstrued results. AMD stressed that in its test, it used vLLM with the FP16 dataset due to its widespread use, and vLLM does not support FP8.

There's also the point that servers will have latency, but instead of accounting for that, Nvidia showed its throughput performance, not emulating the real-world situation, according to AMD.

AMD's Updated Test Results with More Optimizations and Accounting for Latency with Nvidia's Testing Method

AMD made three performance runs using Nvidia's TensorRT-LLM, the last notable one having measured latency results between MI300X and vLLM using the FP16 dataset against H100 with TensorRT-LLM. But the first test involved a comparison between the two using vLLM on both, hence FP16, and for the second test, it compared its MI300X's performance with vLLM while comparing TensorRT-LLM.

AMD's second Inference Performance bench on Llama 2 70B using three test scenarios — (Image credit: AMD)

So, AMD used the same selected testing scenario Nvidia did with its second and third test scenarios, showing higher performance and reduced latency. The company added more optimizations when compared to H100 while running vLLM on both, offering a 2.1x boost in performance.

It is now up to Nvidia to evaluate how it wants to respond. But it also needs to acknowledge that this would require the industry to ditch FP16 with TensorRT-LLM's closed system for using FP8, essentially ditching vLLM for good. While referring to Nvidia's premium, a Redditor once said, "TensorRT-LLM is free just like the things that come free with a Rolls Royce."

TOPICS

Roshan Ashraf Shaikh has been in the Indian PC hardware community since the early 2000s and has been building PCs, contributing to many Indian tech forums, & blogs. He operated Hardware BBQ for 11 years and wrote news for eTeknix & TweakTown before joining Tom's Hardware team. Besides tech, he is interested in fighting games, movies, anime, and mechanical watches.

11 Comments Comment from the forums

thisisaname

My benchmark can beat up your benchmark!! :ROFLMAO:
Reply
prtskg

It's good to see AMD reach a level where Nvidia is bothered to respond to its claims.
Reply
OneMoreUser

I know who is trustworthy and it isn't the green team.

Nvidia may or may not be as bad a Intel when it comes to what is essentially alternative facts, but the track record of creative bs isn't pretty.
Reply
GustavoVanni

Wait a minute! My Rolls Royce didn't come with any freebies. I demand a refund!!!
Reply
dalek1234

prtskg said:
It's good to see AMD reach a level where Nvidia is bothered to respond to its claims.
I know, right.
Reply
KyaraM

OneMoreUser said:
I know who is trustworthy and it isn't the green team.

Nvidia may or may not be as bad a Intel when it comes to what is essentially alternative facts, but the track record of creative bs isn't pretty.
But also not the red team, lmfao...

Only trustworthy source is third party, always was, always will be. Anyone claiming otherwise needs their head checked.
Reply
-Fran-

Slap eachother silly, please. This is amusing.

Regards.
Reply
KyaraM

-Fran- said:
Slap eachother silly, please. This is amusing.

Regards.
I thought AMD and Nvidia were doing that for some time now 😜
Reply
-Fran-

KyaraM said:
I thought AMD and Nvidia were doing that for some time now 😜
More often than not it's one sided in favour of nVidia. This is one of those rare ocassions where nVidia is actually, from what I can read between the lines, bothered.

I can't even remember when it was the last time nVidia bothered to "challenge" AMD numbers.

Regards.
Reply
KyaraM

-Fran- said:
More often than not it's one sided in favour of nVidia. This is one of those rare ocassions where nVidia is actually, from what I can read between the lines, bothered.

I can't even remember when it was the last time nVidia bothered to "challenge" AMD numbers.

Regards.
Didn't they take potshots at AMD drivers not too long ago?
Reply

Show more comments