HBM2 Standard Beefed Up to Support 24GB, 307 GB/s per Stack

Credit: JEDEC

The JEDEC Solid State Technology Association has updated the existing HBM2 (High Bandwidth Memory 2) standard to support capacities of up to 24GB per stack and bandwidth of up to 307 GB/s per stack.

HBM is commonly employed in graphics cards and in the high-performance computing (HPC), server, networking and client markets. The standard stacks memory dies vertically, with each die connected to the next by through-silicon vias (TSVs) and microbumps. Compared to DDR4 and GDDR5, HBM chips are considerably smaller, faster and more power efficient. These traits are what prompted graphics card makers, such as AMD and Nvidia, to adopt HBM in their products.

The original HBM2 standard (JESD235A, which succeeded the first-generation HBM spec, JESD235) supports up to eight dies per stack for a maximum capacity of 8GB and a memory bandwidth of 256 GB/s per package. JEDEC's latest update to the specification (JESD235B) opens the door for manufacturers to pack up to 12 dies per stack, for a maximum capacity of 24GB. JEDEC has also raised the per-stack memory bandwidth to 307 GB/s, delivered across a 1,024-bit memory interface that is split into eight independent channels.
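For reference, the headline figures fall straight out of the pin rate and stack height. Here is a minimal back-of-the-envelope sketch in Python (the 2.4 Gb/s-per-pin data rate is the JESD235B maximum, up from 2.0 Gb/s; the 2GB die density is simply what the 24GB total implies for a 12-die stack):

    # Back-of-the-envelope check of the JESD235B per-stack maximums.
    BUS_WIDTH_BITS = 1024      # 1,024-bit interface: eight 128-bit channels
    PIN_RATE_GBPS = 2.4        # max data rate per pin (JESD235B; was 2.0)
    DIES_PER_STACK = 12        # new 12-Hi stacking option (was 8)
    GB_PER_DIE = 2             # 16Gb dies, implied by 24GB / 12 dies

    bandwidth_gb_s = BUS_WIDTH_BITS * PIN_RATE_GBPS / 8  # bits -> bytes
    print(bandwidth_gb_s)                  # 307.2, the quoted 307 GB/s
    print(DIES_PER_STACK * GB_PER_DIE)     # 24, i.e. 24GB per stack
    print(round(bandwidth_gb_s / 256, 2))  # 1.2: a ~20% uplift over the old spec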

AMD's Radeon Instinct MI60 and Nvidia's Quadro GV100, Titan V CEO Edition and Tesla V100 graphics cards already feature 32GB of HBM2 memory. It will certainly be interesting to see whether future professional and mainstream graphics cards take advantage of JEDEC's update.

Comments from the forums
  • AgentLozen
    Does this bandwidth increase directly translate to a 20% improvement in total HBM2 bandwidth?
    I looked up the memory bandwidth on Wikipedia for the AMD Vega cards and it seems to be tied to a clock speed. Vega 56 is rated at 1600 MT/s and Vega 64 is rated at 1890 MT/s.
  • lorfa
    Anyone know what the latency is on HBM? I've seen many questions about how HBM would fare as main memory, but no one seems to know for sure what the deal breaker would be, besides the price of course.
  • bit_user
    717699 said:
    Anyone know what the latency is on HBM? I've seen many questions about how HBM would fare as main memory, but no one seems to know for sure what the deal breaker would be, besides the price of course.

    I think it's just DRAM. I know the power savings come from the short traces and from running its wide interface at a lower clock.

    However, to use it efficiently, I think you might need to do a bit of rearchitecting. If you naively replace the 128-bit memory interface with HBM2's 1024-bit bus, the limit you'd probably run into is the 64-byte cacheline size. That's only 512 bits. So, I guess if you split the stack into two banks, then you could independently fetch two non-contiguous cachelines at a time (see the sketch below the thread).

    Of course, since it sounds like there are natively 8 independent banks, one could go further and probably trade some latency for a bit more throughput. On the other hand, if you can get the latency low enough, perhaps you can do away with L3 cache and save some money by reducing the die size of the SoC.

    I've been waiting to see HBM show up in smartphones. Perhaps we need Apple to do it first.
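To make the width arithmetic in bit_user's comment concrete, here is a minimal sketch (illustrative assumptions only; the grouping of channels into two halves is a hypothetical arrangement for the example, not a description of any shipping memory controller):

    # Cacheline width vs. HBM2 bus width (illustrative assumptions).
    CACHELINE_BITS = 64 * 8             # 64-byte cacheline = 512 bits
    STACK_BUS_BITS = 1024               # full per-stack HBM2 interface
    CHANNEL_BITS = STACK_BUS_BITS // 8  # eight independent 128-bit channels

    # One full-width transfer moves two cachelines' worth of data:
    print(STACK_BUS_BITS // CACHELINE_BITS)  # 2

    # Four channels together span one 512-bit cacheline, so splitting the
    # stack into two 512-bit halves could fetch two non-contiguous lines:
    print(CACHELINE_BITS // CHANNEL_BITS)    # 4 channels per cacheline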