There's been plenty of speculation that 3D XPoint isn't hitting Intel's initial endurance projections, and there's ample evidence of this.
Intel and Micron already walked back their initial performance claims of "1000x faster than NAND" by clarifying that the comparisons were for the media itself, not the end device. This makes sense because there are several layers between the media and the application, such as drivers, the operating system, and the file system. We have covered those challenges previously.
However, both companies also walked back endurance projections without a clear explanation. We dug in on the issue and found a few answers.
Intel also has Optane DIMMs in development. These will be used specifically as a memory device that leverages memory semantics to serve as an adjunct to DRAM. Using 3D XPoint as a dedicated memory device would require much more endurance. Outside of a brief mention in an earnings call, the company has remained curiously silent on the revised DIMM schedule, which has led to more speculation.
We boil SSD endurance metrics down to three values to help demystify longevity, cost, and the advantage of extra capacity. We derive pricing information from vendor-provided MSRPs for unreleased products, but use average retail prices for shipping products (YMMV). We calculate the price-per-GB to give an unfiltered view of cost, and then calculate the number of terabytes that you can write to the SSD per dollar spent (Endurance Per Dollar). This provides insight into the value of endurance that is far more indicative of reality, whereas DWPD (Drive Writes Per Day) metrics can be muddy due to varying warranty periods.
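To see why DWPD can mislead when warranty periods differ, here is a minimal sketch of the usual conversion from a DWPD rating to total terabytes written. The function name and the example drive (1.6TB at 10 DWPD) are hypothetical and chosen only for illustration; they are not figures from our comparison.

```python
def total_terabytes_written(dwpd: float, capacity_gb: float, warranty_years: float) -> float:
    """Convert a DWPD rating into total terabytes written over the warranty.

    Assumes 365-day years and decimal units (1TB = 1000GB).
    """
    warranty_days = warranty_years * 365
    return dwpd * capacity_gb * warranty_days / 1000


# Hypothetical drive: identical DWPD rating, different warranty periods.
three_year = total_terabytes_written(dwpd=10, capacity_gb=1600, warranty_years=3)
five_year = total_terabytes_written(dwpd=10, capacity_gb=1600, warranty_years=5)

print(f"3-year warranty: {three_year:.0f} TB written")
print(f"5-year warranty: {five_year:.0f} TB written")
```

The same DWPD figure covers very different amounts of total data depending on the warranty length, which is why normalizing to terabytes written per dollar gives a cleaner comparison.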
We also consider the advantages of additional capacity. For instance, Micron's 2.4TB 9100 Max weighs in with an endurance/price measurement of 2TB per dollar, which is less than the DC P3608's 2.42TB per dollar. However, it features an additional 800GB of usable space, which adds value. We utilize our Cost Efficiency Index to provide an amalgamation of the available capacity, cost, and endurance metrics by simply multiplying the capacity by the endurance-per-dollar.
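The arithmetic behind these metrics can be sketched in a few lines. The function names are our own for illustration, and we assume capacity is expressed in terabytes for the index, since the unit only changes the scale of the score, not the ranking. The endurance-per-dollar inputs are the figures quoted above.

```python
def endurance_per_dollar(total_endurance_tb: float, price_usd: float) -> float:
    """Terabytes you can write to the SSD per dollar spent."""
    return total_endurance_tb / price_usd


def cost_efficiency_index(capacity_tb: float, endurance_per_dollar_tb: float) -> float:
    """Capacity multiplied by endurance-per-dollar (capacity unit assumed to be TB)."""
    return capacity_tb * endurance_per_dollar_tb


# Endurance-per-dollar values quoted in the text above.
micron_9100_max = cost_efficiency_index(capacity_tb=2.4, endurance_per_dollar_tb=2.0)
intel_dc_p3608 = cost_efficiency_index(capacity_tb=1.6, endurance_per_dollar_tb=2.42)

print(f"Micron 9100 Max index: {micron_9100_max:.2f}")
print(f"Intel DC P3608 index:  {intel_dc_p3608:.2f}")
```

Under these assumptions the 9100 Max scores higher than the DC P3608 despite its lower endurance-per-dollar, which is the effect of the extra 800GB of usable capacity.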
The DC P4800X has the highest dollar-per-GB rating at $4.05, but it provides more than double the endurance-per-dollar of many of the competing SSDs. The DC P3700 provides the best endurance-per-dollar, which earns it the highest cost efficiency score. Intel nearly doubled the DC P3700's endurance rating after a long-term study of reliability in the field, and the company indicates it will also increase the DC P4800X's endurance rating if it follows a similar trajectory. The inaugural 375GB model carries only a three-year warranty, but the high-capacity models will provide the normal five-year warranty period.
Due to its small capacity and high dollar-per-GB metrics, the DC P4800X yields the lowest cost efficiency score, but we aren't comparing "like" products. The NAND-based enterprise SSDs all offer similar performance in applications, but the DC P4800X brings exponential performance gains that justify the premium.
All SSDs have spare area, which is a portion of the capacity that the user cannot address. Some of the spare area is dedicated to mitigating ECC overhead and device-level RAID implementations that boost reliability. The remainder of the spare area is usually dedicated to over-provisioning, which is traditionally used to increase performance and endurance. We don't know the actual resource provisioning for all of the SSDs in our comparison table, so we are simply boiling down all 'unused' capacity and classifying it as spare area.
Optane SSDs have spare area, but they don't use it in the traditional sense because many of the normal SSD rules don't apply. Instead, they use a small amount of extra space for ECC and metadata at the device level, and unlike NAND, the media doesn't include any built-in over-provisioning at the die level. This tracks well with information we learned last year. We can also assume that some of the extra capacity is used to replace failed cells.
| | 375GB Intel DC P4800X | 1.6TB Intel DC P3700 | 1.6TB Intel DC P3608 | 2.4TB Micron 9100 Max | 2.7TB Mangstor MX6300 |
| --- | --- | --- | --- | --- | --- |
| Endurance Per Usable GB | 32.8TB | 27.35TB | 5.45TB | 2.73TB | 12.77TB |
| Spare Area / % | 73GB / 16.3% | 400GB / 20% | 700GB / 30.4% | 1600GB / 40% | 1300GB / 32.5% |
| Media Endurance Per Raw GB | 27.4TB | 21.9TB | 3.8TB | 1.6TB | 8.63TB |
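For readers who want to check how the rows of the table relate, here is a rough sketch. It assumes that raw capacity is simply usable capacity plus spare area, and that media endurance per raw GB is the drive's total rated endurance spread across that raw capacity. Both are our reading of the table rather than vendor-confirmed formulas, and the names are illustrative.

```python
# (usable GB, spare GB, endurance per usable GB in TB), taken from the table.
drives = {
    "375GB Intel DC P4800X": (375, 73, 32.8),
    "1.6TB Intel DC P3700": (1600, 400, 27.35),
    "1.6TB Intel DC P3608": (1600, 700, 5.45),
    "2.4TB Micron 9100 Max": (2400, 1600, 2.73),
    "2.7TB Mangstor MX6300": (2700, 1300, 12.77),
}


def spare_percent(usable_gb: float, spare_gb: float) -> float:
    """Spare area as a percentage of raw capacity (usable + spare)."""
    return 100 * spare_gb / (usable_gb + spare_gb)


def endurance_per_raw_gb(usable_gb: float, spare_gb: float, tb_per_usable_gb: float) -> float:
    """Total rated endurance divided across all raw media, in TB per GB."""
    total_endurance_tb = tb_per_usable_gb * usable_gb
    return total_endurance_tb / (usable_gb + spare_gb)


results = {
    name: (spare_percent(usable, spare), endurance_per_raw_gb(usable, spare, per_gb))
    for name, (usable, spare, per_gb) in drives.items()
}

for name, (pct, raw_tb) in results.items():
    print(f"{name}: {pct:.1f}% spare, {raw_tb:.2f} TB per raw GB")
```

The results land close to the table's values, with small differences that come from rounding in the published per-GB figures.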
Unlike NAND, 3D XPoint is a write-in-place memory, so the SSD controller doesn't have to erase existing data before writing new data. Eliminating the extra write cycle increases endurance and performance. It also simplifies device management by removing the read-modify-write garbage collection process. The underlying storage medium is also bit-addressable, so in contrast to NAND-based SSDs that write in larger chunks, the DC P4800X can break small amounts of 4K data into several chunks and spread them across multiple die, which boosts performance. The DC P4800X uses wear-leveling algorithms to ensure consistent wear across the media.
The DC P4800X offers 32.8TB of endurance-per-usable-GB, and if we calculate the endurance of all the raw media on the device it falls to 27.4TB. Intel's own DC P3700 is the only SSD in our comparison pool that can challenge those metrics, but there's an easy explanation. The DC P3700 uses LDPC error correction, which boosts error recovery, thus increasing endurance, but it also incurs a significant performance overhead. LDPC can create severe I/O outliers, which are errant data requests that take more time to complete than a normal operation.
LDPC performs well during normal "hard decision" error processing, when the errors are easy to correct, but it incurs a performance penalty when the code transitions into "soft decision" mode for difficult-to-recover bits. Soft decision decoding re-reads the cell and surrounding areas to determine the contents of the cell, which can cause latency to spike unpredictably into the hundreds of microseconds for some operations. LDPC has adjustable parameters that allow the vendor to control error correction intensity. More robust error correction offers more endurance but hurts performance and, most importantly, consistency. If Intel is using LDPC in the DC P4800X, which isn't confirmed, then it is likely a streamlined implementation. It's more likely that Intel, like Micron, is using low-overhead ECC, which doesn't confer as much endurance but has minimal performance overhead.
This means that Intel and Micron can both increase endurance if needed through more robust error correction algorithms, even if that doesn't please the high-performance target market. We still think that 3D XPoint can offer enough endurance for its intended applications as-is, though.
Endurance is like any other specification. Intel has a strong reliability track record and is known for conservative endurance specifications. As long as the vendor clearly states the expectations, and then meets them, the customer is satisfied. Endurance does have implications for Intel's profit margins, but most customers aren't concerned with the endurance of the underlying material as long as it meets the specifications. A quick glance at endurance-per-raw-GB, and how much it outstrips non-Intel competitors, should put the endurance fears to rest.