SMI CEO claims Nvidia wants SSDs with 100 million IOPS — up to 33X performance uplift could eliminate AI GPU bottlenecks


Now that the AI industry has exceptionally high-performance GPUs with high-bandwidth memory (HBM), one of the bottlenecks that AI training and inference systems face is storage performance. To that end, Nvidia is working with partners to build SSDs that can hit random read performance of 100 million input/output operations per second (IOPS) in small-block workloads, according to Silicon Motion (SMI) CEO Wallace C. Kuo, who spoke with Tom's Hardware in an exclusive interview.

"Right now, they are aiming for 100 million IOPS — which is huge," Kuo told Tom's Hardware.

Modern AI accelerators, such as Nvidia's B200, feature HBM3E memory bandwidth of around 8 TB/s, which far exceeds what modern storage subsystems can offer in both throughput and latency. Modern PCIe 5.0 x4 SSDs top out at around 14.5 GB/s and deliver 2–3 million IOPS for both 4K and 512B random reads. Although 4K blocks are better suited for bandwidth, AI models typically perform small, random fetches, which makes 512B blocks a better fit for their latency-sensitive patterns. However, increasing the number of I/O operations per second by 33 times is hard, given the limitations of both SSD controllers and NAND memory.
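As a rough back-of-envelope check, the gap can be sketched from the figures cited above (a minimal sketch; the per-drive numbers are the approximate figures in this article, not vendor specifications):

```python
# Back-of-envelope math using the approximate figures cited above (not vendor specs).
HBM_BW_TBPS = 8.0                # B200-class HBM3E bandwidth, ~8 TB/s
SSD_SEQ_GBPS = 14.5              # PCIe 5.0 x4 SSD sequential read ceiling, ~14.5 GB/s
SSD_RAND_IOPS = 3_000_000        # upper end of today's ~2-3M random-read IOPS
TARGET_IOPS = 100_000_000        # the reported 100M IOPS goal
BLOCK_BYTES = 512                # small-block reads favored by AI fetch patterns

uplift = TARGET_IOPS / SSD_RAND_IOPS
target_bw_gbps = TARGET_IOPS * BLOCK_BYTES / 1e9

print(f"Required uplift over a single drive: ~{uplift:.0f}x")                 # ~33x
print(f"100M IOPS at 512B works out to ~{target_bw_gbps:.1f} GB/s of reads")  # ~51 GB/s
print(f"That exceeds one Gen5 x4 link (~{SSD_SEQ_GBPS} GB/s) by ~{target_bw_gbps / SSD_SEQ_GBPS:.1f}x")
print(f"HBM still outpaces it by ~{HBM_BW_TBPS * 1000 / target_bw_gbps:.0f}x")
```

In other words, even at tiny 512B blocks, 100 million IOPS implies more random-read bandwidth than a single PCIe 5.0 x4 drive can move sequentially today, while HBM remains two orders of magnitude faster still.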

In fact, Kioxia is already working on an 'AI SSD' based on its XL-Flash memory designed to surpass 10 million IOPS in 512B random reads. The company currently plans to release this drive during the second half of next year, possibly to coincide with the rollout of Nvidia's Vera Rubin platform. To get to 100 million IOPS, one might use multiple such 'AI SSDs.'
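A hypothetical tally shows why aggregating drives is plausible on paper (the scaling-efficiency figures below are illustrative assumptions, not measurements):

```python
import math

# Hypothetical aggregation math: how many ~10M-IOPS "AI SSDs" a host would need
# to present 100M IOPS, at assumed (not measured) striping/software efficiencies.
PER_DRIVE_IOPS = 10_000_000
TARGET_IOPS = 100_000_000

for scaling_efficiency in (1.0, 0.8, 0.6):
    drives = math.ceil(TARGET_IOPS / (PER_DRIVE_IOPS * scaling_efficiency))
    print(f"{scaling_efficiency:.0%} scaling -> {drives} drives")  # 10, 13, 17
```

Perfect scaling would need roughly ten such drives per target; any realistic overhead in the I/O stack pushes the count, cost, and power higher, which is the crux of Kuo's argument for a media change rather than brute-force aggregation.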

However, the head of SMI believes that achieving 100 million IOPS on a single drive featuring conventional NAND with decent cost and power consumption will be extremely hard, so a new type of memory might be needed.

"I believe they are looking for a media change," said Kuo. "Optane was supposed to be the ideal solution, but it is gone now. Kioxia is trying to bring XL-NAND and improve its performance. SanDisk is trying to introduce High Bandwidth Flash (HBF), but honestly, I don't really believe in it. Right now, everyone is promoting their own technology, but the industry really needs something fundamentally new. Otherwise, it will be very hard to achieve 100 million IOPS and still be cost-effective."

Currently, many companies, including Micron and SanDisk, are developing new types of non-volatile memory. However, when these new types of memory will be commercially viable is something that even the head of Silicon Motion is not sure about.


Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers, and from modern process technologies and the latest fab tools to high-tech industry trends.

  • John Nemesh
    Well, I want an affordable gaming GPU that doesn't suck...I guess we all can't get what we want now, can we?
    Reply
  • Notton
    Nvidia should buy Optane from Intel then.
    Reply
  • hotaru251
    Notton said:
    Nvidia should buy Optane from Intel then.
    pretty sure they sold that off ages ago, didn't they??
    Reply
  • Notton
    hotaru251 said:
    pretty sure they sold that off ages ago, didn't they??
    No, Intel only sold off their NAND division to SK Hynix.
    I'm pretty sure Intel still owns the IP rights to Optane.
    Reply
  • usertests
    Notton said:
    Nvidia should buy Optane from Intel then.
    Was high IOPS one of the benefits of Optane? I thought it was mostly about latency, low queue depth performance, and cost-per-bit being less than DRAM.

    There have been companies searching/working on would-be NAND and DRAM replacements for decades. If the hundreds of billions flowing into AI gets one of those technologies past the vaporware stage, that could have immense benefits for everyone.

    We don't even need a universal memory necessarily. You could kick NAND to the curb if you could match/beat it at some combination of latency, performance and endurance (which suffer as you go to QLC and beyond), and density/cost. Cost can be higher but fall as production scales up.
    Reply
  • JRStern
    Notton said:
    Nvidia should buy Optane from Intel then.
    Probably still find it at a Cupertino Goodwill store for a dollar.
    Reply
  • JRStern
    Is this really so hard? I mean, to fake? Get thirty-three slower drives, and a boatload of DRAM for buffers, and a pool of processors, and a little hack code to assure transaction consistency, and there you are. Sounds like a Google interview question.
    Reply
  • JRStern
    usertests said:
    Was high IOPS one of the benefits of Optane?
    read yes, down to bit or byte level.
    write, not so much, slow and power hungry and overheated chip.

    No flash SSD is going to enjoy being written for hours at ludicrous speed, either, but that shouldn't be a problem I don't think, need major clarification of the requirements on that point.
    Reply