Storage Or Memory, Or Both?
Earlier, we covered how 3D XPoint removes much of the internal device latency associated with accessing NAND storage. The Optimized System Interconnect slide (see slide album below) contains much of the same data, but it also brings the Intel DIMM into the picture; the DIMM incurs almost no platform overhead at all.
The NVM Tread and NVM xfer phases, which represent reading the NAND media and transferring the data to the SSD controller, respectively, are almost completely removed with a 3D XPoint SSD. The new non-ONFI media interface also helps deliver that performance to the controller faster. The remaining overhead comes from the controller ASIC (ECC, etc.) and firmware (wear leveling, GC, etc.). 3D XPoint greatly reduces those costs, but the platform "link xfer & protocol" (NVMe), driver, storage stack, and file system still stand in the way.
Removing these last obstacles will unlock 3D XPoint's full capabilities. One of the key differentiators between memory and storage is how the system accesses the data. Storage access requires comparatively slow read/write commands that must traverse the driver and storage stack before the data reaches the application, whereas memory access uses load/store instructions that take a much shorter path and eliminate that overhead.
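To make the distinction concrete, here is a minimal C sketch (the file path and sizes are illustrative, not from any vendor) that fetches the same data both ways: a pread() that must traverse the file system, storage stack, and driver, versus a plain CPU load from a memory-mapped region.

```c
/* Illustrative only: contrast storage (read/write) and memory (load/store)
 * access to the same data. The path is hypothetical. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];

    /* Storage semantics: the read command traverses the file system,
     * storage stack, and driver before the data lands in the buffer. */
    int fd = open("/mnt/data/block.bin", O_RDONLY);
    if (fd < 0)
        return 1;
    if (pread(fd, buf, sizeof(buf), 0) < 0)
        return 1;

    /* Memory semantics: once the region is mapped, an ordinary CPU load
     * reaches the data with no per-access trip through the storage stack. */
    uint8_t *map = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return 1;
    printf("first byte: %u\n", map[0]);   /* a plain load instruction */

    munmap(map, 4096);
    close(fd);
    return 0;
}
```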
Storage semantics usually require interrupts, but there is a growing movement toward polling instead. Polling simply checks for outstanding requests on a regular cadence rather than waiting for an interrupt to begin processing. Issuing and servicing an interrupt consumes roughly 6.9 microseconds of OS time, whereas polling requires only 4.4 microseconds. Polling is an emerging technique to boost low-QD performance, but it comes at a significant CPU expense. Intel already has polling provisions in its SPDK (Storage Performance Development Kit), so we expect it to become more commonplace, particularly with persistent memories. There are also Linux developments that only poll when the queue is active, which should help reduce the overhead.
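A rough sketch of the polling idea in C is shown below. The request structure and completion flag are hypothetical stand-ins, not SPDK or kernel APIs; the point is simply that the CPU spins on a completion indicator instead of sleeping until an interrupt arrives.

```c
/* Illustrative only: poll a completion flag instead of waiting for an
 * interrupt. The structure is a hypothetical stand-in for a real
 * driver or SPDK queue entry. */
#include <stdatomic.h>
#include <stdbool.h>

struct io_request {
    atomic_bool done;   /* set by the device/driver when the I/O completes */
    void       *buffer;
};

/* Spin until the request completes. This burns CPU cycles, but it skips the
 * interrupt delivery and context switch of the sleeping path. */
static void wait_by_polling(struct io_request *req)
{
    while (!atomic_load_explicit(&req->done, memory_order_acquire)) {
        /* optionally issue a pause/yield hint here to ease pressure on the core */
    }
}
```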
The Support Ecosystem
Intel is bringing its 3D XPoint DIMMs to market, and though the silence around them has fostered suspicions of a delay, development of the persistent memory ecosystem continues unabated. We don't know whether Intel will use NVDIMM approaches, simply position its DIMMs as a standard memory replacement, or both.
Vendors can use either the PCIe bus or DIMM slots to deploy 3D XPoint as memory or storage, with the DIMM slot being the most desirable option due to its faster interface. In-memory compute has roared to the forefront, and the industry is already developing NVDIMM technology with NAND-based DIMMs. NVDIMMs (Non-Volatile DIMMs) are electrically and physically DDR4-compatible modules that support either storage or persistent memory use cases with NAND. 3D XPoint could serve the same purpose, but with more speed. Using NVDIMMs requires some reworking of the existing software stack, and that work is already well underway.
NVDIMMs come in several flavors: NVDIMM-N uses non-volatile memory to back volatile DRAM (byte- or block-addressable), NVDIMM-F addresses NAND flash (or other non-volatile memories) as block-accessed memory, and NVDIMM-P addresses both DRAM and NAND as a combined pool that serves as either persistent DRAM or block storage.
There are a number of companies working on blended software-defined memory (SDM) initiatives that unlock a new wave of combined memory and storage pools. Plexistor is top of mind, largely because Micron announced during the Flash Memory Summit that it is working with the company. There are also other techniques, like Diablo's Memory1, that simply use NAND as the primary memory pool.
Microsoft Dances An NVDIMM-N Jig
Much of the leading-edge work for persistent memory programming is still underway, but NVDIMMs have already hit prime time. Windows Server 2016 and a forthcoming Windows 10 build support using NVDIMM-N either as block storage (with no application changes) or as Direct Access (DAX) volumes for byte-addressable, memory-mapped use cases. Microsoft even baked the functionality into Storage Spaces, which simplifies and expands the use cases and allows common management tasks such as volume mirroring, striping, and write-back caching with the NTFS and ReFS file systems.
The OS abstracts away the complexity and speaks to the underlying media with standard memory semantics, such as loads, stores, and memory copies. Microsoft demonstrated an NVMe SSD against an NVDIMM-N block device: the SSD scored 14,177 IOPS at QD1, compared to 179,276 IOPS for the NVDIMM. Another interesting detail is the new "Storage Class Memory disks" entry in the Device Management pane, which we can't wait to see on our own computers.
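As a rough illustration of the memory-mapped path, the C sketch below maps a file with the standard Windows APIs and updates it with ordinary stores. The drive letter and path are hypothetical; on a DAX-formatted volume the mapped view reaches the persistent media directly, while on a conventional volume the same code still runs, just through the paging and storage stack.

```c
/* Illustrative only: map a file and update it with plain stores using the
 * standard Windows file-mapping APIs. The path is hypothetical. */
#include <windows.h>
#include <string.h>

int main(void)
{
    HANDLE file = CreateFileA("D:\\pmem\\log.bin", GENERIC_READ | GENERIC_WRITE,
                              0, NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (file == INVALID_HANDLE_VALUE)
        return 1;

    HANDLE mapping = CreateFileMappingA(file, NULL, PAGE_READWRITE, 0, 4096, NULL);
    if (mapping == NULL)
        return 1;

    char *view = MapViewOfFile(mapping, FILE_MAP_WRITE, 0, 0, 4096);
    if (view == NULL)
        return 1;

    memcpy(view, "written with plain stores", 26);  /* load/store, no read/write command */

    FlushViewOfFile(view, 4096);   /* ask for the range to be made durable */
    FlushFileBuffers(file);

    UnmapViewOfFile(view);
    CloseHandle(mapping);
    CloseHandle(file);
    return 0;
}
```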
Proprietary Interconnects Go Big
Intel's Optane DIMM strategy is to use 3D XPoint as its own flavor of memory, which will bring the benefit of increased density. Intel disclosed that this will require proprietary extensions to the DDR4 interface due to the long "outliers," or delayed responses, that are inherent to any non-volatile storage medium. For instance, an ECC event can keep a read or write command from completing in the fixed, orderly fashion that the DDR4 interface requires.
The DDR4 spec requires data to return within a fixed number of clock cycles and has no provision for commands with variable latency. Intel's proprietary DDR4 extensions address that need. Intel has noted that the DIMMs will require processor support, which is likely the result of adjustments to the IMC (Integrated Memory Controller) to accommodate the new extensions.
The memory system is a complex tiered model of L1, L2, L3, and other volatile caches, so ensuring that data written to persistent memory actually lands in a non-volatile location becomes a critical requirement. The operating system and applications have to know where data resides and be able to steer it to the correct location.
The SNIA NVM Programming Model, developed with multiple industry partners and covering both Linux and Windows, defines new libraries and x86 instructions designed to speed access and ensure that "persisted" data is actually committed to the non-volatile portion of the memory hierarchy (the persistent memory DIMMs). The Linux pmem.io set of NVM libraries enables memory-mapped persistent memory, replacing the storage-centric read/write commands with load/store access to boost performance even on the same underlying medium (such as NAND).
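A minimal sketch of that model, using the libpmem library from pmem.io, might look like the following. The mount point and file name are assumptions, and the code presumes a DAX-capable file system; compile with -lpmem.

```c
/* Illustrative only: store a string through a memory-mapped persistent
 * memory file using libpmem from pmem.io. The mount point is hypothetical. */
#include <libpmem.h>
#include <string.h>

int main(void)
{
    size_t mapped_len;
    int is_pmem;

    /* Create (if needed) and map a 4 KiB file on a DAX-capable file system. */
    char *addr = pmem_map_file("/mnt/pmem/example", 4096, PMEM_FILE_CREATE,
                               0666, &mapped_len, &is_pmem);
    if (addr == NULL)
        return 1;

    /* An ordinary store; no read/write storage command is issued. */
    strcpy(addr, "hello, persistent memory");

    /* Make it durable: flush CPU caches on real persistent memory,
     * or fall back to msync-style flushing on conventional storage. */
    if (is_pmem)
        pmem_persist(addr, mapped_len);
    else
        pmem_msync(addr, mapped_len);

    pmem_unmap(addr, mapped_len);
    return 0;
}
```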
The Windows model also leverages new cache and memory management instructions, such as CLFLUSHOPT (Optimized Cache Line Flush) and CLWB (Cache Line Write-Back), which supplement CLFLUSH (Cache Line Flush) and ensure that data is flushed out of the volatile caches. These software approaches apply to either persistent NVDIMM or standard persistent DIMM use cases.
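In practice those instructions reach application code through compiler intrinsics. The hedged C sketch below writes a buffer back to the persistence domain one cache line at a time; the 64-byte line size and the helper name are assumptions, and it requires a CPU and compiler that expose CLWB (for example, gcc built with -mclwb).

```c
/* Illustrative only: write back every cache line covering a buffer so the
 * data reaches the persistence domain. Assumes a 64-byte cache line. */
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64

static void flush_to_persistence(const void *addr, size_t len)
{
    uintptr_t line = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end  = (uintptr_t)addr + len;

    for (; line < end; line += CACHE_LINE) {
        /* CLWB writes the line back but may keep it cached; CLFLUSHOPT
         * (_mm_clflushopt) would also evict it, and the older CLFLUSH
         * (_mm_clflush) remains available as a fallback. */
        _mm_clwb((void *)line);
    }

    _mm_sfence();   /* order the write-backs before subsequent stores */
}
```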
However, this is a fast-moving space. The PCOMMIT instruction, which was a key portion of the strategy and one of the early signs that new persistent memories were on the horizon, was recently deprecated with little notice. There will likely be more changes as the industry hammers out the final specifications.
The initial indicators suggest that 3D XPoint's endurance might not be suitable for use as a pure memory replacement, but there are ways to mitigate the lower endurance. Optane DIMMs could work in numerous ways, including with DDR4 used as a fast front-end cache for vast 3D XPoint memory pools (managed by the IMC), or even hybrid DIMMs that pair DDR4 and 3D XPoint together. In either case, Intel's DIMMs may have hit a roadblock, but there is always the option to simply use first-gen 3D XPoint with existing NVDIMM technology if endurance and thermals are concerns.