Nvidia addresses significant Blackwell yield issues, production ramps in Q4

On Wednesday, Nvidia admitted that its upcoming Blackwell-based products suffer from low yields, which forced the company to re-spin some layers of the B200 processor. Nvidia will ramp up production of Blackwell in Q4 2024 and expects to ship several billion dollars' worth of Blackwell GPUs in the last quarter of this year.

"We executed a change to the Blackwell GPU mask to improve production yield," a statement by Nvidia reads. "Blackwell production ramp is scheduled to begin in the fourth quarter and continue into fiscal 2026. In the fourth quarter, we expect to ship several billion dollars in Blackwell revenue."

Earlier, it was reported that Nvidia's B100 and B200 GPUs are the first processors to use TSMC's CoWoS-L packaging, which connects chiplets using an RDL interposer with local silicon interconnect (LSI) bridges that enable a transfer rate of around 10 TB/s. These bridges must be placed precisely. However, an alleged mismatch in the coefficient of thermal expansion (CTE) among the GPU chiplets, LSI bridges, RDL interposer, and motherboard substrate led to warping and system failure. According to reports, Nvidia had to redesign the GPU silicon's top metal layers and bumps to improve yields. The company didn't provide details on the fix, though; it simply said it had to create new masks.

Nvidia says that no functional changes to Blackwell silicon were required, so all changes were made to improve yields and ensure a steady supply of B100 and B200 GPUs.

It is hard to tell how many Blackwell GPUs Nvidia will ship in the fourth quarter of 2024. Nvidia is rumored to charge around $70,000 per module, though pricing of data center hardware depends on volumes and demand, so take this number with a grain of salt. Given that the company expects to post 'several billion dollars in Blackwell revenue' (i.e., more than $2 billion, but less than $10 billion) in Q4 2024, it's clear that it will ship a substantial number of chips in the final quarter of this year. Naturally, the company won't disclose its actual shipment volumes.
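For a rough sense of scale, the figures above can be turned into a back-of-the-envelope range. The sketch below is only illustrative: it assumes a flat rumored price of $70,000 per module and reads "several billion dollars" as $2 billion to $10 billion, neither of which Nvidia has confirmed. The function and constant names are made up for this example.

```python
# Rough bounds on Blackwell module shipments implied by the figures above.
# Assumptions (not Nvidia disclosures): a flat rumored price of $70,000 per
# module, and "several billion dollars" read as between $2B and $10B.

PRICE_PER_MODULE_USD = 70_000
REVENUE_LOW_USD = 2_000_000_000
REVENUE_HIGH_USD = 10_000_000_000


def implied_modules(revenue_usd: int, price_usd: int = PRICE_PER_MODULE_USD) -> int:
    """Return the whole number of modules that a revenue figure would cover."""
    return revenue_usd // price_usd


low = implied_modules(REVENUE_LOW_USD)
high = implied_modules(REVENUE_HIGH_USD)
print(f"Implied shipments: roughly {low:,} to {high:,} modules")
```

Under those assumptions the range works out to somewhere between about 29,000 and 143,000 modules. Real-world pricing varies with volume and configuration, so the true figure could fall well outside this range.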

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.

16 Comments from the forums

Lucky_SLS

All I am hearing: low Blackwell yields -> cut down Blackwell gaming GPUs (a man can hope!)

KyaraM

Lucky_SLS said:
All I am hearing: low Blackwell yields -> cut down Blackwell gaming GPUs (a man can hope!)

What exactly do you mean by 'cut down'? That they release the 5060 and 5070 first? I don't think that will happen...

toffty

I think he means that if not all transistors are working, see if they can be used in GPUs.

Won't happen, two different use cases / architectures.

thestryker

Lucky_SLS said:
All I am hearing: low Blackwell yields -> cut down Blackwell gaming GPUs (a man can hope!)

It's not the singular core with the problems, it's the dual core parts, which are the super expensive halo ones.

Amdlova

They have a problem (want more money)
NAND and DDR products (we have a problem in production)
Ramp the prices for everyone :]

Lucky_SLS

toffty said:
I think he means that if not all transistors are working, see if they can be used in GPUs.

Won't happen, two different use cases / architectures.

It might very well be like Hopper, but there is no confirmation so far. If it is like Ada Lovelace, the GPU wafer can be used in both data center and gaming. Like I said, a man can hope XD

https://www.techpowerup.com/gpu-specs/nvidia-gb202.g1072
https://www.techpowerup.com/gpu-specs/nvidia-gb100.g1069

Pierce2623

An 800mm^2 die had yield problems? WHO could’ve predicted that? Oh wait….we ALL did.

Pierce2623

toffty said:
I think he means that if not all transistors are working, see if they can be used in GPUs.

Won't happen, two different use cases / architectures.

Blackwell is equivalent to Ada and is being used in both gaming and data center. They may do another data center only architecture like Hopper, but it's not Blackwell. Data centers that aren't focused on AI had no motivation to select Hopper over Ada, as it basically only had advantages in AI. If you have traditional computing to do, you need general purpose ALUs, not fixed function matrix math accelerators. The H100 tripled the price of getting roughly 17000-18000 CUDA cores compared to an a100 ada ($7000 vs $20,000). The AI fad chasers happily paid up for the extra matrix accelerators though.

BearRaid

Lucky_SLS said:
All I am hearing: low Blackwell yields -> cut down Blackwell gaming GPUs (a man can hope!)

The low yields they are talking about are for the datacenter parts, which would not be downbinned as gaming parts. They are not the same. We are still waiting for any kind of news on the gaming parts.

edzieba

BearRaid said:
The low yields they are talking about are for the datacenter parts, which would not be downbinned as gaming parts. They are not the same. We are still waiting for any kind of news on the gaming parts.

Bingo. The chances of the Blackwell dies for consumer cards using TSMC's not-EMIB process at all are slim to none. They will be good ol' monolithic dies with offboard GDDR as usual; complex packaging would shove the price up far more than any potential savings from modular dies.

I'm guessing Nvidia are probably none too happy about being the guinea pigs for CoWoS-L rather than sticking with CoWoS-S. Particularly as they were looking at Intel's fabs for packaging at the start of this year, who have years of experience with EMIB - bet there's behind-closed-doors 'well, we could have told ya that would happen's going on.