Samsung Soups Up 96 AMD MI100 GPUs With Radical Computational Memory

Samsung has built the world's first large-scale computing system using GPUs with built-in processing-in-memory (PIM) chips. These memory modules, which were loaded onto 96 AMD Instinct MI100 GPUs, increased AI training performance by 2.5x, according to a report by Business Korea.

PIM is a new generation of computer memory that can speed up computationally complex workflows handled by processors such as CPUs and GPUs. As the name suggests, each memory module is capable of processing data on its own, reducing the amount of data that needs to travel between the memory and the processor.
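
To make the data-movement idea concrete, here is a minimal, purely conceptual Python sketch. It is not Samsung's actual PIM programming model; the PimBank class and local_multiply_accumulate method are hypothetical names invented for illustration. It contrasts a conventional dot product, where both operands cross the memory bus to the processor, with an in-memory version where the operand stays resident in the bank and only the scalar result travels back.

```python
# Conceptual illustration of processing-in-memory (PIM), not a real PIM API.
# PimBank and local_multiply_accumulate are hypothetical names.
import numpy as np

def conventional_dot(weights: np.ndarray, activations: np.ndarray) -> float:
    """Conventional path: both operands cross the memory bus to the processor."""
    bytes_moved = weights.nbytes + activations.nbytes   # full operands travel
    result = float(np.dot(weights, activations))        # computed on the "processor"
    print(f"conventional: ~{bytes_moved} bytes moved over the bus")
    return result

class PimBank:
    """Hypothetical memory bank with a small ALU sitting next to the cells."""
    def __init__(self, weights: np.ndarray):
        self.weights = weights  # this operand stays resident in the bank

    def local_multiply_accumulate(self, activations: np.ndarray) -> float:
        # The multiply-accumulate happens "inside" the memory; only the incoming
        # activations and the 8-byte scalar result cross the bus.
        result = float(np.dot(self.weights, activations))
        print(f"PIM: ~{activations.nbytes + 8} bytes moved over the bus")
        return result

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal(1_000_000)
    x = rng.standard_normal(1_000_000)
    assert np.isclose(conventional_dot(w, x), PimBank(w).local_multiply_accumulate(x))
```

In this toy model the PIM path roughly halves bus traffic for a dot product; real workloads with many reused, memory-resident operands stand to save considerably more.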

Samsung has been developing PIM for some time now. The company demoed several implementations in 2021, spanning memory types including DDR4, LPDDR5X, GDDR6, and HBM2. In LPDDR5 form, Samsung saw a 1.8x increase in performance, a 42.6% reduction in power consumption, and a 70% reduction in latency on a test program involving a Meta AI workload. Even more impressively, these results came from a standard server system with no modifications to the motherboard or CPU (all that changed was a swap to PIM-enabled LPDDR5 DIMMs).
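
As a back-of-the-envelope check (my arithmetic, not a figure Samsung quoted, and assuming both numbers apply to the same workload): a 1.8x speedup at 57.4% of the original power works out to roughly 1.8 / 0.574 ≈ 3.1x better performance per watt.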

Regardless, PIM looks like a potent solution for speeding up AI workloads. "As the head of the AI research center, I want to make Samsung a semiconductor company that uses AI better than any other company," Choi Chang-kyu, vice president and head of the AI Research Center at Samsung Electronics Advanced Institute of Technology, told Business Korea.

Aaron Klotz
Contributing Writer

Aaron Klotz is a contributing writer for Tom’s Hardware, covering news related to computer hardware such as CPUs and graphics cards.

  • rluker5
    Nice. I want some.
  • Co BIY
    But what Tom's wants to know:

    Will it run Crysis? etc ...

    Can it be GAMED?
  • bit_user
    Co BIY said:
    But what Tom's wants to know:

    Will it run Crysis? etc ...

    Can it be GAMED?
    No, the MI100 has no texture engines, ROPs, or display controllers, nor does it support graphics APIs like OpenGL or Direct3D.

    The MI100 was AMD's first CDNA product. None of the CDNA products do graphics. They do have video decode acceleration, but that's just for the benefit of analyzing video streams with AI.
  • bit_user
    This is just a taste. Wait until someone actually designs an entire accelerator around this stuff!

    The reason I say that is that the PIM modules duplicate some functionality that already exists in the core compute die. So, if you removed that redundancy, it would free up die area for more of the compute that the PIM modules don't accelerate. The end result would be an even greater speedup!