Nvidia addresses significant Blackwell yield issues, production ramps in Q4

On Wednesday, Nvidia admitted that its upcoming Blackwell-based products suffer from low yields, which required the company to re-spin some layers of the B200 processor. Nvidia will ramp up production of Blackwell in Q4 2024 and expects to ship several billion dollars' worth of Blackwell GPUs in the last quarter of this year.

"We executed a change to the Blackwell GPU mask to improve production yield," a statement by Nvidia reads. "Blackwell production ramp is scheduled to begin in the fourth quarter and continue into fiscal 2026. In the fourth quarter, we expect to ship several billion dollars in Blackwell revenue." 

Nvidia reaffirmed that it sampled Blackwell GPUs with its customers in the second quarter but admitted that it had to produce 'low-yielding Blackwell material' to meet demand for its Blackwell processors, which impacted its gross margins. 

Nvidia's chief executive, Jensen Huang, said during the earnings call that the company has implemented all the necessary changes to the design of its Blackwell B100 and B200 GPUs and is on track to mass produce them in the fourth quarter.

Earlier, it was reported that Nvidia's B100 and B200 GPUs are the first processors to use TSMC's CoWoS-L packaging, which connects chiplets using an RDL interposer with local silicon interconnect (LSI) bridges that enable a transfer rate of around 10 TB/s. These bridges must be placed precisely. However, an alleged mismatch in the coefficient of thermal expansion (CTE) among the GPU chiplets, LSI bridges, RDL interposer, and motherboard substrate led to warping and system failure. According to reports, Nvidia had to redesign the GPU silicon's top metal layers and bumps to improve yields. However, the company didn't provide details on the fix — instead, it simply said it had to create new masks. 

Nvidia says that no functional changes to Blackwell silicon were required, so all changes were made to improve yields and ensure a steady supply of B100 and B200 GPUs.  

It is hard to tell how many Blackwell GPUs Nvidia will ship in the fourth quarter of 2024. Nvidia is rumored to charge around $70,000 per module, though pricing of data center hardware depends on volumes and demand, so take this number with a grain of salt. Given that figure and Nvidia's expectation of 'several billion dollars in Blackwell revenue' (i.e., more than $2 billion, but less than $10 billion) in Q4 2024, it's clear that the company will ship a substantial number of chips in the final quarter of this year. Naturally, the company won't disclose its actual shipment volumes.
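To put rough bounds on that, here is a back-of-envelope sketch in Python. It assumes, purely for illustration, that all of the expected Blackwell revenue comes from modules sold at the rumored flat price of $70,000, and it reads 'several billion dollars' as the $2 billion to $10 billion range mentioned above. None of these inputs are Nvidia disclosures, and the function name is hypothetical.

```python
# Back-of-envelope estimate. Every input is a rumored or interpreted figure,
# not a number disclosed by Nvidia.
PRICE_PER_MODULE_USD = 70_000        # rumored price per module (assumption)
REVENUE_LOW_USD = 2_000_000_000      # lower reading of "several billion dollars"
REVENUE_HIGH_USD = 10_000_000_000    # upper reading of "several billion dollars"


def estimate_modules(revenue_usd: int, price_usd: int) -> int:
    """Return how many whole modules the revenue would cover at a flat price."""
    if price_usd <= 0:
        raise ValueError("price must be positive")
    return revenue_usd // price_usd


low = estimate_modules(REVENUE_LOW_USD, PRICE_PER_MODULE_USD)
high = estimate_modules(REVENUE_HIGH_USD, PRICE_PER_MODULE_USD)
print(f"Roughly {low:,} to {high:,} modules")
# prints "Roughly 28,571 to 142,857 modules"
```

Under these assumptions the answer lands somewhere between about 29,000 and 143,000 modules. Real pricing varies by customer and volume, so the true figure could sit well outside that range.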

Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.

  • Lucky_SLS
    All I am hearing : Low Blackwell yields -> cut down blackwell gaming GPUs (a man can hope!)
  • KyaraM
    Lucky_SLS said:
    All I am hearing : Low Blackwell yields -> cut down blackwell gaming GPUs (a man can hope!)
    What exactly do you mean by 'cut down'? That they release the 5060 and 5070 first? I don't think that will happen...
  • toffty
    I think he means that if not all transistors are working, see if they can be used in GPUs.

    Won't happen, two different use cases / architectures.
  • thestryker
    Lucky_SLS said:
    All I am hearing : Low Blackwell yields -> cut down blackwell gaming GPUs (a man can hope!)
    It's not the singular core with the problems, it's the dual core parts, which are the super expensive halo ones.
  • Amdlova
    They have problem (want more money)
    Nand and ddr products (we have problem on production)
    Ramp the prices for everyone :]
  • Lucky_SLS
    toffty said:
    I think he means that if not all transistors are working, see if they can be used in GPUs.

    Won't happen, two different use cases / architectures.

    It might very well be like Hopper, but no confirmation till now. If it is like Ada Lovelace, the GPU wafer can be used in both data center and gaming. Like I said, a man can hope XD

    https://www.techpowerup.com/gpu-specs/nvidia-gb202.g1072
    https://www.techpowerup.com/gpu-specs/nvidia-gb100.g1069
  • Pierce2623
    An 800mm^2 die had yield problems? WHO could’ve predicted that? Oh wait….we ALL did.
  • Pierce2623
    toffty said:
    I think he means that if not all transistors are working, see if they can be used in GPUs.

    Won't happen, two different use cases / architectures.
    Blackwell is equivalent to Ada and is being used in gaming and data center. They may do another data center only architecture like Hopper but it's not Blackwell. Data centers that aren't focused on AI had no motivation to select Hopper over Ada as it basically only had advantages in AI. If you have traditional computing to do, you need general purpose ALUs, not fixed function matrix math accelerators. The H100 tripled the price of getting roughly 17000-18000 CUDA cores compared to an a100 ada ($7000 vs $20,000). The AI fad chasers happily paid up for the extra matrix accelerators though.
  • BearRaid
    Lucky_SLS said:
    All I am hearing : Low Blackwell yields -> cut down blackwell gaming GPUs (a man can hope!)
    The low yields they are talking about are for the datacenter parts, which would not be downbinned as gaming parts. They are not the same. We are still waiting for any kind of news on the gaming parts.
  • edzieba
    BearRaid said:
    The low yields they are talking about are for the datacenter parts, which would not be downbinned as gaming parts. They are not the same. We are still waiting for any kind of news on the gaming parts.
    Bingo. The chances of the Blackwell dies for consumer cards using TSMC's not-EMIB process at all are slim to none. They will be good ol' monolithic dies with offboard GDDR as usual; complex packaging would shove the price up far more than any potential savings from modular dies.

    I'm guessing Nvidia are probably none too happy about being the guinea pigs for CoWoS-L rather than sticking with CoWoS-S. Particularly as they were looking at Intel's fabs for packaging at the start of this year, who have years of experience with EMIB. Bet there's behind closed doors 'well, we could have told ya that would happen's going on.