Samsung Soups Up 96 AMD MI100 GPUs With Radical Computational Memory

Samsung has built the world's first large-scale computing system using GPUs with built-in processing-in-memory (PIM) chips. These memory modules, which were loaded onto 96 AMD Instinct MI100 GPUs, increased AI training performance by 2.5x, according to a report by Business Korea.

PIM is a new generation of computer memory that can speed up computationally complex workflows handled by processors such as CPUs and GPUs. As the name suggests, each memory module is capable of processing data on its own, reducing the amount of data that needs to travel between the memory and the processor.
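
To make the data-movement idea concrete, here is a minimal, purely conceptual Python sketch. It is not Samsung's actual PIM programming model; the PimBank class and local_multiply_accumulate method are hypothetical names invented for illustration. It contrasts a conventional dot product, where both operands cross the memory bus to the processor, with an in-memory version where the operand stays resident in the bank and only the scalar result travels back.

```python
# Conceptual illustration of processing-in-memory (PIM), not a real PIM API.
# PimBank and local_multiply_accumulate are hypothetical names.
import numpy as np

def conventional_dot(weights: np.ndarray, activations: np.ndarray) -> float:
    """Conventional path: both operands cross the memory bus to the processor."""
    bytes_moved = weights.nbytes + activations.nbytes   # full operands travel
    result = float(np.dot(weights, activations))        # computed on the "processor"
    print(f"conventional: ~{bytes_moved} bytes moved over the bus")
    return result

class PimBank:
    """Hypothetical memory bank with a small ALU sitting next to the cells."""
    def __init__(self, weights: np.ndarray):
        self.weights = weights  # this operand stays resident in the bank

    def local_multiply_accumulate(self, activations: np.ndarray) -> float:
        # The multiply-accumulate happens "inside" the memory; only the incoming
        # activations and the 8-byte scalar result cross the bus.
        result = float(np.dot(self.weights, activations))
        print(f"PIM: ~{activations.nbytes + 8} bytes moved over the bus")
        return result

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal(1_000_000)
    x = rng.standard_normal(1_000_000)
    assert np.isclose(conventional_dot(w, x), PimBank(w).local_multiply_accumulate(x))
```

In this toy model the PIM path roughly halves bus traffic for a dot product; real workloads with many reused, memory-resident operands stand to save considerably more.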

Samsung has been developing PIM for some time now. The company demoed several implementations in 2021, spanning memory types including DDR4, LPDDR5X, GDDR6, and HBM2. In LPDDR5 form, Samsung saw a 1.8x increase in performance, a 42.6% reduction in power consumption, and a 70% reduction in latency on a test program involving a Meta AI workload. Even more impressively, these results came from a standard server system with no modifications to the motherboard or CPU (all that changed was a swap to PIM-enabled LPDDR5 DIMMs).
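
As a back-of-the-envelope check (my arithmetic, not a figure Samsung quoted, and assuming both numbers apply to the same workload): a 1.8x speedup at 57.4% of the original power works out to roughly 1.8 / 0.574 ≈ 3.1x better performance per watt.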

Regardless, PIM looks like a potent solution for speeding up AI workloads. "As the head of the AI research center, I want to make Samsung a semiconductor company that uses AI better than any other company," Choi Chang-kyu, vice president and head of the AI Research Center at Samsung Electronics Advanced Institute of Technology, told Business Korea.

Aaron Klotz
Contributing Writer

Aaron Klotz is a contributing writer for Tom’s Hardware, covering news related to computer hardware such as CPUs and graphics cards.

  • rluker5
    Nice. I want some.
  • Co BIY
    But what Tom's wants to know:

    Will it run Crysis? etc ...

    Can it be GAMED?
  • bit_user
    Co BIY said:
    But what Tom's wants to know:

    Will it run Crysis? etc ...

    Can it be GAMED?
    No, the MI100 has no texture engines, ROPs, or display controllers, nor does it support graphics APIs like OpenGL or Direct3D.

    The MI100 was AMD's first CDNA product. None of the CDNA products do graphics. They do have video decode acceleration, but that's just for the benefit of analyzing video streams with AI.
  • bit_user
    This is just a taste. Wait until someone actually designs an entire accelerator around this stuff!

    The reason I say that is that the PIM modules duplicate some functionality that already exists in the core compute die. So, if you removed that redundancy, it would free up die area for more of the compute that the PIM modules don't accelerate. The end result would be an even greater speedup!