Samsung Outs Z-NAND's Performance; Intel's Optane On Notice

Samsung released a performance document for its SZ985 Z-NAND SSD. The new SSD uses a special flavor of Samsung's flash to offer unheard-of performance from a NAND-based SSD, with the obvious intention of offering good-enough performance at a much lower price than Intel's DC P4800X Optane SSDs. For now, Samsung is pushing the SZ985 to its data center customers, but much as Intel does with its 3D XPoint-based SSDs, Samsung could bring a consumer version to market for the enthusiast crowd.

Samsung's document laid out some rather impressive performance specifications for the new drive, but they're limited in scope. The Z-NAND SSD boasts up to 3.2 GB/s of sequential read/write throughput and 750,000/170,000 random read/write IOPS. The random read specifications are impressive compared to other NAND-based SSDs, but we have to take the measurements with a grain of salt.



Samsung spec'd the 3D TLC PM1725a at over 1 million random read IOPS but has never submitted the SSD for review and doesn't specify the type of workload it used to measure random read performance. We've seen independent third-party tests that put the number closer to ~700,000 IOPS with an industry-standard 4K random test.


| | Samsung SZ985 Z-NAND SSD | Intel Optane (3D XPoint) | Samsung PM1725a | Intel DC P3700 |
|---|---|---|---|---|
| Interface | PCIe 3.0 x4 | PCIe 3.0 x4 | PCIe 3.0 x8 | PCIe 3.0 x4 |
| Media | Z-NAND | 3D XPoint | 48-layer 3D TLC NAND | 20nm MLC NAND |
| Sequential Read/Write (GB/s) | 3.2 / 3.2 | 2.4 / 2 | 6.4 / 3 | 2.8 / 2 |
| Random Read/Write IOPS | 750,000 / 170,000 | 550,000 / 500,000 | 1,080,000* / 170,000 | 460,000 / 175,000 |
| Random Read Latency | 12 - 20µs | 10µs | 90µs | 115µs |
| Random Write Latency (typ) | 16µs | 10µs | 20µs | 25µs |
| Drive Writes Per Day (DWPD) | 30 | 30 | 5 | 17 |
| Capacity | 800 GB | 350 / 750 GB | 1.6 / 3.2 / 6.4 TB | 400 - 800 GB / 1.6 - 2 TB |

However, Samsung's claimed latency reduction is striking. The company said the SZ985 delivers 12-20µs latency for random reads and 16µs for random writes. These latency numbers are particularly strong compared to the other NAND-based SSDs in the table, although they aren't quite as mind-bending as Intel's Optane. The SZ985's random write performance also lags behind the Optane DC P4800X, but Samsung's strategy is to offer good-enough performance at a much lower price point. In the enterprise, cheap and "good-enough" almost always wins.  

Samsung also provided performance results compared to its PM1725a in RocksDB workloads, along with a few synthetic performance measurements. We also included a picture of the company's performance data displayed at the Flash Memory Summit. The latency-to-IOPS comparisons highlight similarly solid low-QD performance scaling compared to Intel's Optane SSDs.
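To make the latency-to-IOPS relationship concrete, here is a rough sketch based on Little's Law (requests in flight = throughput × latency). It uses the random read latencies quoted in the table and assumes, unrealistically, that latency stays flat as queue depth rises, so treat the results as upper bounds rather than predictions. The function names are ours, not Samsung's.

```python
def iops_at_queue_depth(latency_us: float, queue_depth: int = 1) -> float:
    """IOPS sustained when `queue_depth` requests are always in flight
    and each one completes in `latency_us` microseconds (Little's Law)."""
    return queue_depth * 1_000_000 / latency_us


def queue_depth_for_iops(target_iops: float, latency_us: float) -> float:
    """Outstanding requests needed to reach `target_iops` at a fixed latency."""
    return target_iops * latency_us / 1_000_000


# Random read latencies (µs) as quoted in the spec table above.
read_latency_us = {
    "SZ985 (12µs case)": 12,
    "SZ985 (20µs case)": 20,
    "Optane": 10,
    "PM1725a": 90,
    "DC P3700": 115,
}

for name, latency in read_latency_us.items():
    print(f"{name}: ~{iops_at_queue_depth(latency):,.0f} IOPS at QD1")

# Requests that must be in flight to hit the SZ985's 750,000 IOPS spec
# if every read took the full 20µs.
print(queue_depth_for_iops(750_000, 20))
```

Under these assumptions, a drive with 10-20µs reads delivers tens of thousands of IOPS with a single outstanding request, while a 90-115µs drive needs a much deeper queue to reach the same figure. That is why low-QD scaling matters for real applications, which rarely keep deep queues full.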

From the limited data, the SSD looks impressive, but there are still several unanswered questions. Samsung hasn't released QoS metrics, such as 99th percentile performance, that are a strength of 3D XPoint-powered SSDs. The company also hasn't released mixed random workload specifications, which are a critical aspect of real-world application performance. Given the SZ985's NAND-like random write IOPS specification, we imagine it wouldn't fare as well during mixed workloads. The latency scaling chart, for instance, only measures performance with a random read workload, but the SSD might not scale as well when random writes are added to the mix. We expect more information to trickle out as the SSDs come to market.
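A short sketch shows why a 99th percentile figure can tell a different story than an average. The latency samples below are invented purely for illustration and are not measurements of any drive; the percentile uses the simple nearest-rank method.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of `samples` using the nearest-rank method."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in µs: 98 fast reads plus two slow outliers,
# the kind of stall a NAND SSD can show during background activity.
latencies_us = [15] * 98 + [400, 900]

mean_us = sum(latencies_us) / len(latencies_us)
p99_us = percentile_nearest_rank(latencies_us, 99)

print(f"mean: {mean_us:.1f}µs, 99th percentile: {p99_us}µs")
```

In this made-up sample the mean looks healthy while the 99th percentile is more than an order of magnitude worse, which is the kind of gap a headline latency spec can hide.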

The SZ985's endurance is impressive. In fact, its 30 DWPD of warrantied endurance rivals Intel's Optane SSDs. Samsung hasn't released detailed information about its Z-NAND, but it has confirmed our early speculation that it uses MLC NAND in an SLC configuration. That provides both performance and endurance boosts, but it's likely that Samsung's references to a "unique circuit design" mean shorter bitlines and wordlines in the NAND package, which would improve performance. It's also possible that the NAND dies have more planes, which are sections of the die that respond independently to data requests, to boost performance. We also know the company is using a new controller.
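For a sense of scale, DWPD converts to a write budget by multiplying by capacity. The sketch below uses the DWPD and capacity figures from the table; the five-year warranty term is our assumption for illustration only, as Samsung's document does not state one here.

```python
def daily_write_budget_tb(dwpd: float, capacity_gb: float) -> float:
    """Terabytes that can be written per day at the rated DWPD."""
    return dwpd * capacity_gb / 1000


def lifetime_writes_pb(dwpd: float, capacity_gb: float, warranty_years: float) -> float:
    """Petabytes written over the warranty term at the rated DWPD."""
    return daily_write_budget_tb(dwpd, capacity_gb) * 365 * warranty_years / 1000


ASSUMED_WARRANTY_YEARS = 5  # hypothetical; not taken from the article

# SZ985: 30 DWPD at 800 GB. PM1725a: 5 DWPD at its smallest 1.6 TB capacity.
for name, dwpd, capacity_gb in [("SZ985", 30, 800), ("PM1725a", 5, 1600)]:
    per_day = daily_write_budget_tb(dwpd, capacity_gb)
    lifetime = lifetime_writes_pb(dwpd, capacity_gb, ASSUMED_WARRANTY_YEARS)
    print(f"{name}: {per_day:.0f} TB/day, {lifetime:.1f} PB over {ASSUMED_WARRANTY_YEARS} years")
```

At 30 DWPD, the 800 GB SZ985 is rated for 24 TB of writes per day, three times the daily budget of a 1.6 TB PM1725a despite having half the capacity.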

Samsung also revealed that it has an MLC variant in the works that will offer slightly less performance but much more capacity. That indicates the SZ985 isn't a one-off product and that we'll see further development in the future.

Samsung's strategy of leveraging proven and mature NAND technology is smart, particularly due to economics. Lowering the price of any new memory requires scale, and that's a process that can take years. In fact, Intel and Micron just announced last week that they have expanded 3D XPoint production, although we suspect that some of the expansion is targeted at the second generation of 3D XPoint, which many expect to come along sometime next year. In contrast, Samsung already has copious NAND fabrication capabilities, so spinning up a new version of NAND likely doesn't require extensive investments and should foster widespread availability.

Price will be key to Z-NAND's success. Most existing applications cannot fully utilize the performance of Intel's 3D XPoint-based SSDs, so providing enough performance to get within range of Optane's usable performance at a lower price point could be disruptive to Intel's ambitions. Unfortunately, we aren't sure of pricing or availability yet.

Comment from the forums
  • tsnor
    3dxp (optane) drives get to 10 usec read times with 1-2 usec of media read time.

    Typical NAND flash read time is 50 usec.

    It'll be interesting to see how Samsung gets to 12 - 20µs read times with zFlash -- if this is just a DRAM buffer in front of FLASH number that would be very disappointing and applications with random read patterns would see bad read performance.

    If this is truly a consistent 12 - 20µs read time then this is something new for NAND. (aside: think this is what the author Paul Alcorn meant by saying "Samsung hasn't released QoS metrics, such as 99th percentile performance".)
  • derekullo
    A DRAM buffer is actually very good for random reads.

    In fact Tom's has mentioned in previous articles, ssds without DRAM buffers tend to not have great performance.

    The main factors are:
    1. The speed of the DRAM buffer, not all emulated slc is the same.
    2. The algorithm used, magical black box that turns data into awesome data.
    3. Size of DRAM buffer, if you only have a 5 gigabyte buffer it doesn't matter what values you use for the first 2.

    A best case scenario, but still valid, the Samsung 850 Evo 4 Terabyte has 96 gigabytes of buffer before performance drops to tlc levels.

    http://www.tomshardware.com/reviews/samsung-850-evo-4tb-ssd-review,4623-3.html
  • tsnor
    "..ssds without DRAM buffers tend to not have great performance..." agree completely for NAND flash based SSDs today, this is not true for 3DXP based drives. THIS is exactly why the samsung z-nand flash is interesting. Hope it is *not* using a cache to get its published numbers.

    "...A DRAM buffer is actually very good for random reads...." unless you get some form of locality (non-random) behavior a read cache is useless. For example, you do email for a while then shift to loading a game. The records you need to load the game are not in the cache. Unless you detect sequential (non-random) patterns and pre-stage, none of the records you need will be in the cache. For random read workloads a 3dxp drive will have an advantage over a nand flash drive with a dram cache. For sequential and predictable workloads the NAND flash drive might do OK.

    "...has 96 gigabytes of buffer before performance drops to tlc levels..." This is for WRITES, not reads. It allows the drive to quickly consume a lot of writes (much faster response than if it tried to write to TLC flash). But that is not used for reads (unless the data happens to have just been written to the buffer and has not yet been destaged).
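
The read-cache argument in this thread can be put into rough numbers. The sketch below assumes reads are spread uniformly at random across the data set, so the cache hit rate is simply cache size divided by data set size. The 50µs media read time is the typical NAND figure quoted in the thread; the 4 GB cache size and the 10µs cache-hit time are hypothetical values chosen only for illustration.

```python
def expected_read_latency_us(cache_gb: float, working_set_gb: float,
                             cache_hit_us: float, media_read_us: float) -> float:
    """Average read latency for uniformly random reads with a read cache.

    With no locality, the chance a requested block is cached equals the
    fraction of the working set that fits in the cache.
    """
    hit_rate = min(cache_gb / working_set_gb, 1.0)
    return hit_rate * cache_hit_us + (1 - hit_rate) * media_read_us


# Hypothetical 4 GB DRAM cache in front of 800 GB of NAND with 50µs reads.
uniform_random = expected_read_latency_us(4, 800, cache_hit_us=10, media_read_us=50)

# The same cache when the active data happens to fit inside it (high locality).
fits_in_cache = expected_read_latency_us(4, 2, cache_hit_us=10, media_read_us=50)

print(f"uniform random over 800 GB: {uniform_random:.1f}µs")
print(f"working set fits in cache:  {fits_in_cache:.1f}µs")
```

Under these assumptions the cache barely moves the average for truly random reads, because almost every request still goes to the NAND. It only helps when the workload has locality.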