Custom PCIe 5.0 SSD with 3D XL-Flash debuts — special Optane-like flash memory delivers up to 3.5 million random IOPS

InnoGrit's XL-Flash-based SSD prototype
(Image credit: Tom's Hardware)

Kioxia has introduced two generations of its XL-Flash storage class memory (SCM), a high-speed type of NAND memory originally designed to compete with Optane SSD technology by offering lower latency, higher performance, and better endurance than standard flash-based storage. So far, however, only a few companies have released solid-state drives based on this type of NAND. That might change: at Computex, SSD controller maker InnoGrit demonstrated a reference design PCIe 5.0 SSD based on XL-Flash memory that delivers up to 3.5 million random read IOPS. With a reference design available, independent makers of solid-state drives have a clearer path to releasing drives based on Kioxia's SCM.

InnoGrit's N3X SSD, based on the InnoGrit Tacoma IG5669 controller and Kioxia's 2nd Generation XL-Flash memory in SLC mode, is designed for latency-sensitive enterprise workloads that demand the utmost reliability. The NVMe 2.0-compliant controller supports a PCIe 5.0 x4 interface and enables sustained performance of up to 14 GB/s sequential read speed and up to 12 GB/s sequential write speed, as well as up to 3.5 million random read IOPS and 700 thousand random write IOPS.
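
As a rough sanity check on those figures, the sketch below converts the random IOPS ratings into bandwidth. The 4 KiB transfer size is an assumption (the block size behind the IOPS numbers has not been stated), and the function name is purely illustrative.

```python
# Convert an IOPS rating into bandwidth, assuming a fixed transfer size.
# ASSUMPTION: 4 KiB per I/O; the block size behind the ratings is not stated.
BLOCK_SIZE_BYTES = 4 * 1024


def iops_to_gb_per_s(iops: float, block_size: int = BLOCK_SIZE_BYTES) -> float:
    """Return bandwidth in decimal GB/s for a given IOPS figure."""
    return iops * block_size / 1e9


read_bw = iops_to_gb_per_s(3_500_000)  # rated random read IOPS
write_bw = iops_to_gb_per_s(700_000)   # rated random write IOPS

print(f"Random read:  {read_bw:.2f} GB/s")
print(f"Random write: {write_bw:.2f} GB/s")
```

Under that assumption, the random read rating works out to roughly 14.3 GB/s, in the same range as the quoted 14 GB/s sequential read speed, while the random write rating corresponds to roughly 2.9 GB/s.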

Perhaps more importantly, there's less than 13 microseconds of read latency, a huge reduction compared to the ~50 – 100 µs for 3D TLC NAND, and 4 µs write latency, another massive decrease compared to ~200 – 400 µs for 3D TLC NAND. Latencies this low are particularly useful for caching, AI inference, in-memory computing, and real-time analytics.
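
To put those latency figures in perspective, the sketch below computes the speedup over the quoted TLC ranges and uses Little's Law (requests in flight = throughput × latency) to estimate how many reads must be outstanding to reach the rated 3.5 million IOPS. It treats 13 µs as the average latency under full load, which is an assumption; real drives usually show higher latency at high queue depths. All function names are illustrative.

```python
# Latency figures quoted above, in microseconds.
XL_READ_US = 13.0
XL_WRITE_US = 4.0
TLC_READ_US = (50.0, 100.0)
TLC_WRITE_US = (200.0, 400.0)


def speedup_range(baseline_range, new_latency):
    """Return (min, max) speedup of new_latency over a baseline latency range."""
    low, high = baseline_range
    return low / new_latency, high / new_latency


def qd1_iops(latency_us):
    """IOPS achievable with exactly one outstanding request (queue depth 1)."""
    return 1e6 / latency_us


def required_concurrency(target_iops, latency_us):
    """Little's Law: requests in flight = throughput * latency."""
    return target_iops * latency_us / 1e6


read_speedup = speedup_range(TLC_READ_US, XL_READ_US)
write_speedup = speedup_range(TLC_WRITE_US, XL_WRITE_US)
# ASSUMPTION: 13 us is treated as the mean read latency at full load.
in_flight = required_concurrency(3_500_000, XL_READ_US)

print(f"Read speedup:  {read_speedup[0]:.1f}x to {read_speedup[1]:.1f}x")
print(f"Write speedup: {write_speedup[0]:.0f}x to {write_speedup[1]:.0f}x")
print(f"QD1 read IOPS at 13 us: {qd1_iops(XL_READ_US):,.0f}")
print(f"Reads in flight for 3.5M IOPS: {in_flight:.1f}")
```

Under these assumptions, reads come out roughly 3.8x to 7.7x faster and writes 50x to 100x faster than the quoted TLC ranges, and about 45 reads would need to be in flight to sustain 3.5 million IOPS.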


These N3X SSDs offer capacities from 400GB to 3.2TB using 2nd Gen XL-Flash in SLC mode (with configurations spanning from 32 to 256 NAND dies), making these SSDs a compelling alternative to discontinued Intel Optane solutions.
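
The die counts and capacities can be cross-checked with some simple arithmetic. The sketch below assumes that a 256Gb 2nd Gen XL-Flash die stores half as much (128Gb) when run in SLC mode, and that the gap between raw and usable capacity is spare area reserved by the drive. Neither assumption is confirmed by InnoGrit, and the calculation treats all units as decimal for simplicity.

```python
# ASSUMPTION: a 256 Gb MLC die holds 128 Gb when operated in SLC mode
# (one bit per cell instead of two). InnoGrit has not confirmed this.
SLC_DIE_GBIT = 256 / 2
SLC_DIE_GB = SLC_DIE_GBIT / 8  # gigabits -> gigabytes


def raw_capacity_gb(die_count):
    """Raw NAND capacity in GB for a given number of SLC-mode dies."""
    return die_count * SLC_DIE_GB


def spare_fraction(raw_gb, usable_gb):
    """Fraction of raw capacity that is not exposed to the user."""
    return (raw_gb - usable_gb) / raw_gb


# Die count -> advertised usable capacity in GB (smallest and largest models).
configs = {32: 400, 256: 3200}

for dies, usable in configs.items():
    raw = raw_capacity_gb(dies)
    spare = spare_fraction(raw, usable)
    print(f"{dies:3d} dies: {raw:.0f} GB raw, {usable} GB usable, {spare:.1%} spare")
```

Both ends of the range come out at about 21.9% spare area (512 GB raw for 400 GB usable, 4,096 GB raw for 3.2 TB usable), which is at least consistent with the SLC-mode assumption.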

The endurance rating of the drive is particularly impressive: 50 DWPD over 5 years, far exceeding standard enterprise NAND-based SSD endurance levels. That makes it well suited to write-intensive tasks such as caching, inference, and transactional workloads where longevity and consistent performance are critical.
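
For a sense of scale, the sketch below turns the 50 DWPD rating into total data written over the rating period and into the average write rate needed to actually use it up. It assumes 365-day years and decimal units; the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def total_writes_pb(capacity_tb, dwpd, years):
    """Total data written over the rating period, in petabytes."""
    days = years * 365  # ASSUMPTION: 365-day years, leap days ignored
    return capacity_tb * dwpd * days / 1000


def sustained_write_gb_per_s(capacity_tb, dwpd):
    """Average write rate needed to hit the DWPD rating every day, in GB/s."""
    return capacity_tb * 1000 * dwpd / SECONDS_PER_DAY


# Smallest and largest capacities in the lineup, in TB.
for capacity_tb in (0.4, 3.2):
    pb = total_writes_pb(capacity_tb, dwpd=50, years=5)
    rate = sustained_write_gb_per_s(capacity_tb, dwpd=50)
    print(f"{capacity_tb} TB: {pb:.1f} PB written, {rate:.2f} GB/s around the clock")
```

For the 3.2TB model this works out to about 292 PB over five years, or roughly 1.85 GB/s of writes sustained around the clock; the 400GB model comes to about 36.5 PB.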

InnoGrit envisions that its partners might build XL-Flash and IG5669-based SSDs in U.2 or E3.S form factors. Yet, they could also build drives in an add-on card form factor to cater to the needs of desktop workstations or even high-end desktops.

Kioxia's XL-Flash is a high-performance NAND technology designed to bridge the latency gap between DRAM and conventional flash memory, making it a viable SCM option for enterprise applications that require persistent memory with DRAM-like responsiveness. Kioxia's 2nd Generation XL-Flash doubles density by using an MLC (multi-level cell) architecture, raising die capacity from 128Gb to 256Gb. However, InnoGrit believes that it makes sense to run this memory in SLC mode for higher performance, since performance is what users of XL-Flash demand the most.

Kioxia originally positioned XL-Flash as a competitor to Intel's Optane memory. With Optane now discontinued, XL-Flash is effectively taking over where Optane left off.

Kioxia has been the main supplier of XL-Flash-based solid-state drives, though we have seen SSDs featuring this type of SCM from companies like Memblaze, albeit based on the very expensive Microsemi Flashtec NVMe2016 controller. InnoGrit is a more accessible supplier of SSD controllers, so one can expect N3X-style SSDs to become more widespread, assuming that drive manufacturers adopt this design. Keep in mind, however, that XL-Flash is a niche type of memory, so it is unlikely to become truly widespread.


Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers, and from modern process technologies and the latest fab tools to high-tech industry trends.

  • bit_user
    The article said:
    Perhaps more importantly, there's less than 13 microseconds of read latency, a huge reduction compared to the ~50 – 100 µs for 3D TLC NAND, and 4 µs write latency, another massive decrease compared to ~200 – 400 µs for 3D TLC NAND
    Your range for read latency is skewed a little high. The best PCIe 4.0 drives dipped just below 40 microseconds. I think we've seen even lower, for PCIe 5.0 drives.

    Source: https://www.tomshardware.com/reviews/samsung-990-pro-ssd-review/2
    From the same review, we see write latencies a hair over 10 microseconds:

    Of course, those writes are to the SLC buffer, not at TLC density. So, that probably explains the 2 orders of magnitude difference. However, there are some TLC drives (IIRC faster Crucial models) which can sustain writes at like several GB/s and I'm not sure they could manage that if their tail latencies were so long.

    That said, SCM drives are targeted at enterprise use cases, which often involve sustained writes and require low tail latencies. So, tricks like SLC write-buffering don't really work for them. Instead, this new drive seems to run its entire capacity in SLC mode (which makes it correspondingly more expensive, per bit).

    The article said:
    Keep in mind, however, that XL-Flash is a niche type of memory, so it is unlikely to become truly widespread.
    Are there any consumer SSDs using it in TLC mode?
  • jeremyj_83
    bit_user said:
    Are there any consumer SSDs using it in TLC mode?
    The link to the XL-Flash shows that Gen 2 is MLC not TLC.
  • jeremyj_83
    700k random write IOPS isn't really any better than a mixed use PCIe 5.0 drive like the Micron 9550 MAX. Those have 540k - 720k random write IOPS depending on the size of the drive. However, I guess this might be able to do it at QD4 instead of QD32. Granted they both pale in comparison to the P5800X with 1.5M random write IOPS but only on a PCIe 4 bus.
  • JRStern
    Well sure, anyone could have built this kind of stuff at any point, but it's all about pricing, so what's the pricing?
  • jeremyj_83
    JRStern said:
    Well sure, anyone could have built this kind of stuff at any point, but it's all about pricing, so what's the pricing?
    If you have to ask how much it costs you cannot afford it.
  • bit_user
    jeremyj_83 said:
    700k random write IOPS isn't really any better than a mixed use PCIe 5.0 drive like the Micron 9550 MAX.
    I also thought the write IOPS sounded low, but you can't argue with 50 DWPD for 5 years!
  • Co BIY
    jeremyj_83 said:
    If you have to ask how much it costs you cannot afford it.

    Toms could afford to ask for us though!
  • thestryker
    jeremyj_83 said:
    700k random write IOPS isn't really any better than a mixed use PCIe 5.0 drive like the Micron 9550 MAX. Those have 540k - 720k random write IOPS depending on the size of the drive. However, I guess this might be able to do it at QD4 instead of QD32. Granted they both pale in comparison to the P5800X with 1.5M random write IOPS but only on a PCIe 4 bus.
    Gen 1 XL-Flash was fairly close to Gen 2 3D XPoint in performance so I'm wondering if the lower write IOPS here is due to the controller. In SLC mode XL-Flash should absolutely be capable of over 1M IOPS.
  • thestryker
    I would love to see consumer oriented SSDs using XL-Flash (much like I would have 3D XPoint), but I'm guessing the high endurance is inherent to the technology, as it was for 3D XPoint, which makes that virtually impossible to execute. If there's high endurance on the consumer parts everyone will just buy those unless they need specific support contracts. It seems to be the only way to get latencies down and in turn low queue depth performance up.
  • bit_user
    thestryker said:
    I would love to see consumer oriented SSDs using XL-Flash (much like I would have 3D XPoint),
    Well, in SLC mode, it should be > 3x as expensive per bit as TLC and > 4x as expensive per bit as QLC. That's going to limit the market, quite heavily. Even in MLC mode, it would be > 1.5x as expensive per bit as TLC NAND, yet not as fast as the SLC numbers quoted in the article. Not to mention that max capacity will be down by corresponding amounts.

    The part I don't know about is how much overhead the XL part accounts for. The original version touted shorter bitlines and wordlines, as well as more planes. That translates into lower density, while possibly also requiring a more expensive controller that features more channels.

    thestryker said:
    If there's high endurance on the consumer parts everyone will just buy those
    Well, the high endurance should mainly come from running it in SLC mode. And if Toshiba is selling XL on the open market, then there's no reason someone couldn't use it in a consumer drive. So, I don't think the lack of such a product is a sign of intentional market segmentation.

    People buy datacenter SSDs not just due to the endurance, but the capacity (which is enabled by the form factor), the low tail-latencies, and enterprise-oriented features like out-of-band management, end-to-end data protection, power-loss protection, and more.

    Look at it this way: there are datacenter TLC and QLC SSDs that are definitely drawing from the same NAND pools as high-end consumer drives. If you were right that endurance is the only thing that matters, then there's no way those should coexist in the market with consumer TLC and QLC SSDs.

    thestryker said:
    It seems to be the only way to get latencies down and in turn low queue depth performance up.
    I'm not sure that SLC benefits read latencies. I think the structure of XL-Flash is probably the main thing responsible for those.