Micron says high-bandwidth memory is sold out for 2024 and most of 2025 — intense demand portends potential AI GPU production bottleneck

Micron DRAM fab, Taichung
(Image credit: Micron)

Micron is currently an underdog in the high-bandwidth memory (HBM) market, but that appears to be changing rapidly: the company says its HBM3E supply is sold out for 2024 and mostly allocated for 2025. Micron's HBM3E will debut in Nvidia's H200 GPU for artificial intelligence and high-performance computing, so the company is poised to grab a sizeable share of the HBM market. 

"Our HBM is sold out for calendar 2024, and the overwhelming majority of our 2025 supply has already been allocated," said Sanjay Mehrotra, chief executive of Micron, in prepared remarks for the company's earnings call this week. "We continue to expect HBM bit share equivalent to our overall DRAM bit share sometime in calendar 2025." 

Micron's initial HBM3E stacks are 24 GB 8-Hi modules with a data transfer rate of 9.2 GT/s and a peak memory bandwidth of over 1.2 TB/s per device. Nvidia's H200 GPU for AI and HPC uses six of these stacks, for 144 GB of physical capacity, of which 141 GB is enabled. Since Micron is the first company to commercially ship HBM3E, it stands to sell a boatload of HBM3E packages. 
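As a sanity check, those figures are straightforward to reproduce. The sketch below assumes the standard 1024-bit per-stack HBM3/HBM3E interface, which the article does not state; note that at a nominal 9.2 GT/s the math lands just under 1.2 TB/s, so Micron's "over 1.2 TB/s" figure implies pin rates slightly above nominal.

```python
# Back-of-the-envelope check of the figures above.
PIN_RATE_GTS = 9.2      # per-pin data rate from the article, GT/s
BUS_WIDTH_BITS = 1024   # per-stack interface width (assumed: HBM3/HBM3E standard)
STACK_CAPACITY_GB = 24  # 8-Hi stack capacity from the article
STACKS_PER_GPU = 6      # stacks on an H200

per_stack_bw = PIN_RATE_GTS * BUS_WIDTH_BITS / 8  # GB/s per stack
print(f"Per-stack bandwidth: {per_stack_bw:.1f} GB/s")  # 1177.6 GB/s, ~1.2 TB/s

total_gb = STACKS_PER_GPU * STACK_CAPACITY_GB
print(f"Physical capacity: {total_gb} GB")  # 144 GB; the H200 exposes 141 GB
```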

"We are on track to generate several hundred million dollars of revenue from HBM in fiscal 2024 and expect HBM revenues to be accretive to our DRAM and overall gross margins starting in the fiscal third quarter," said Mehrotra.  

Mehrotra also said that Micron has started sampling its 12-Hi HBM3E cubes, which increase capacity by 50% to 36 GB and thereby enable the training of larger language models. These 36 GB cubes will be used in next-generation AI processors, with production ramping up in 2025. 
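The 12-Hi capacity follows directly from the per-die density implied by the 8-Hi part; here's a quick check in the same vein:

```python
# 24 GB across 8 layers implies 3 GB (24 Gb) per DRAM die.
per_die_gb = 24 / 8         # GB per layer, implied by the 8-Hi stack
cap_12hi = 12 * per_die_gb  # 12 layers of the same die
print(f"12-Hi capacity: {cap_12hi:.0f} GB "
      f"({cap_12hi / 24 - 1:.0%} more than 8-Hi)")  # 36 GB (50% more)
```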

Because HBM is built from specialty DRAM dies, the HBM ramp-up will significantly affect Micron's ability to make DRAM ICs for mainstream applications. 

"The ramp of HBM production will constrain supply growth in non-HBM products," Mehrotra said. "Industrywide, HBM3E consumes approximately three times the wafer supply as DDR5 to produce a given number of bits in the same technology node."

Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers, and from modern process technologies and the latest fab tools to high-tech industry trends.

  • DavidLejdar
    Somewhat ironic, for memory to be the bottleneck, isn't it? :)
  • bit_user
    This could be just the window of opportunity needed for competitors like Tenstorrent and Cerebras to gain a meaningful foothold in the market. Their solutions rely primarily on SRAM that's intermingled with the compute engines, on the same die. I'm not sure if either use HBM at all, but I know Tenstorrent's earlier hardware didn't.
  • Co BIY
    3x the wafer for the same bits ?

    That explains the expense involved.
  • bit_user
    Co BIY said:
    3x the wafer for the same bits ?
    Yeah, I wonder how much of that is from the TSVs...

    Co BIY said:
    That explains the expense involved.
    I wonder how much expense is added by the stack assembly, itself (as well as post-assembly yield).