To read more on our test methodology, visit How We Test Enterprise SSDs, which explains how to interpret our charts. Accurate, repeatable measurements start with a solid preconditioning methodology, which is covered on page three. We discuss 4KB random performance measurements on page four, explain latency metrics on page seven, and introduce QoS testing and the QoS domino effect on page nine.
We conducted our DC P4800X latency testing using a format that conforms to our earlier test data for the other samples. Our QD4 to QD256 data consists of a varying number of threads, as outlined in our How We Test article, whereas QD1 values are measured with a single thread. These allocations are representative of actual application environments. We also tailored some tests specifically for the DC P4800X's unique characteristics. We have yet to chart most of that data, so there is much more that we aren't presenting here. Also, due to time constraints and platform disparities, we don't yet have comparative data for our other samples in some tests. We've included a snippet of the raw data in the DC P4800X latency matrix (the last image, values in ms).
Mixed workloads are the most important measure of performance, and the 70/30 read/write latency chart highlights a commanding average latency lead for the DC P4800X (get used to seeing that). The gap widens into a chasm with 99.99th percentile measurements, which quantify the 0.01% worst-case latencies. This is crucial to ensuring consistent performance during transactional workloads. We dialed in more stringent percentile measurements (99.999/99.9999) in our dedicated latency chart, and it's notable that the DC P4800X achieves a lower 99.9999 measurement than the competing SSDs' average latency measurements. We don't test NAND-based SSDs to that level of granularity because, frankly, their results are atrocious. The DC P4800X's 0.145ms 99.9999% measurement is almost unbelievable, but we tested several times to confirm.
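For readers who want to reproduce this kind of percentile math on their own latency logs, here is a minimal Python sketch. It assumes a plain list of per-I/O completion latencies in milliseconds and uses the nearest-rank definition of a percentile; the function names `percentile_latency` and `summarize` are hypothetical, and this is not the tooling used to generate our charts.

```python
import math
from fractions import Fraction


def percentile_latency(samples_ms, pct):
    """Return the nearest-rank percentile of a list of latencies (ms).

    Assumption: nearest-rank is used here for simplicity; benchmark
    tools may bucket or interpolate values differently.
    """
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples_ms)
    # Exact rational arithmetic avoids float rounding at 99.99, 99.9999, etc.
    rank = math.ceil(Fraction(str(pct)) * len(ordered) / 100)
    return ordered[rank - 1]


def summarize(samples_ms, percentiles=(99.99, 99.999, 99.9999)):
    """Report the average latency plus the requested tail percentiles."""
    summary = {"avg": sum(samples_ms) / len(samples_ms)}
    for pct in percentiles:
        summary[f"p{pct}"] = percentile_latency(samples_ms, pct)
    return summary


if __name__ == "__main__":
    # Synthetic, made-up data: 9,999 fast I/Os and a single slow outlier.
    samples = [0.010] * 9_999 + [1.0]
    for name, value in summarize(samples).items():
        print(f"{name}: {value:.6f} ms")
```

Note that under this definition a 99.9999th percentile only says something beyond the single worst sample when the run contains more than one million I/Os. With fewer samples, as in the synthetic data above, the tightest percentiles simply collapse to the maximum observed latency.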
The average 4K read latency results are interesting, as they reveal the DC P4800X's tuning for low-QD workloads. The DC P4800X is brutally efficient at lower QD, but some NAND-based SSDs challenge it during unrealistically heavy loads. The 99.99th percentile 4K read chart shows the incredible read latency, which is staggeringly good from low to high QDs. The DC P4800X follows the same trend in the 99.99th percentile 4K write results.
The DC P4800X uses new media that's faster, but unleashing its performance requires new technology at every point in the architecture. For its 3D XPoint-powered QuantX products, Micron transitioned to a DDR-like interface between the 3D XPoint packages and SSD controller. The new interface is faster than the ONFI/Toggle industry standard protocols, which helps reduce communication latency to the controller. Intel also uses the same proprietary BGA alignment for 3D XPoint packages as Micron, so we suspect it is using a similar technique.
We're accustomed to seeing massive, fire-breathing controllers with up to 16 channels on standard enterprise PCIe SSDs, and even multiple beefy FPGAs working in parallel. Intel wrings this incredible performance out of a single proprietary seven-channel ASIC, which implies that the controller is incredibly powerful. ASICs are also more power-efficient than FPGAs. Intel indicated that it uses hardware-accelerated read/write paths, so there is no firmware to interfere with the data path.
Hat tip to Micron; the 9100 has the closest latency measurements to the DC P4800X under light read workloads, and actually beats the DC P4800X during a 4K write workload at QD1. That's really impressive for a NAND-based SSD, but its lead will evaporate quickly when we test in more diverse conditions.