The Week In Storage: Self-Serving Survey Says SSDs Are Screwy, NVMe Over Fabrics Released

This week in storage (well, the week prior) found us in hot, humid, steamy, rainy, sweaty and wonderful Taipei for Computex 2016. All brutal combinations of heat and precipitation aside, Computex was yet another exciting affair for storage, even though the show overall felt a bit wilted this year. Perhaps my personal perception is due to a seemingly flat year in terms of whizbang product announcements on the PC side. It might be that CES Asia, which occurred in Shanghai a few weeks prior, stole some of the Computex thunder from the cloudy Taipei skies.

There was still plenty to gawk at whilst rudely blocking the aisle, such as the Adata DRAMless SSDs or its latest Tigershark SSDs. True to Chris Ramseyer's Computex predictions, several vendors uncorked 2 TB SSDs, including Micron's 3D NAND-powered 2100 series, the Patriot Ignite, and the Corsair Neutron XTi. Micron also unveiled its forthcoming high-performance Ballistix M.2 SSDs, and Areca showed off its new high-capacity Thunderbolt 3 products.

Intel also joined in with a 3D XPoint demo, and as usual, Intel's as-yet undefined type of memory impressed us with its speed. More details also emerged on the Kaby Lake/NVDIMM front, which will be one of the vehicles that speed 3D XPoint into the system.

Self-Serving Survey Says Your SSD Is Going To Die And Take Your Data With It

Another week, another self-serving survey. Self-serving surveys are (unfortunately) becoming increasingly popular among the storage marketing folk, and the more panic-inducing the survey, the better.

The process is simple: A company typically commissions a survey that justifies some aspect of its business model, thus drawing attention to the relevance of its products or services. The company then sends out a culled portion of the responses, preferably something shocking, as a press release to the awaiting news world. The media regurgitates the data, usually with little to no fact checking, thus placing said company directly in the limelight. Advertising by way of survey, what wonders the world beholds.

The latest hot topic is a press release from Kroll Ontrack titled "Data Loss from SSD Technology Increases with Wider Adoption." My eyes instinctively narrow at the sight of the title, as it is akin to saying "The Sky Is Blue, and Other Assorted Painfully Obvious Statements of Fact." There is no perfect data storage device known to man, so data loss is a given. Naturally, there will be no data loss if no one uses the storage device, and of course, data loss from any storage medium will increase with more use.

The survey goes on to reveal that, of nearly 2,000 respondents, 38 percent had an SSD failure. Of the 38 percent that experienced a failure, two-thirds permanently lost the data stored on the SSD. This means that roughly 24 percent of the respondents to the survey had lost data on an SSD. Curiously, there is no mention of how many respondents reported losing data on an HDD.
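For readers who want to check that arithmetic, here is a minimal Python sketch. The 38 percent and "two-thirds" figures come from the survey as quoted above; the function name and the back-calculated share are purely illustrative, and the sketch makes no claim about the unrounded numbers in the press release.

```python
def share_of_all_respondents(failure_share, loss_share_among_failures):
    """Fraction of all respondents who had a failure AND lost data."""
    return failure_share * loss_share_among_failures


failure_share = 0.38  # respondents reporting an SSD failure (from the survey)

# Taking "two-thirds" literally gives a figure slightly above 24 percent.
exact_two_thirds = share_of_all_respondents(failure_share, 2 / 3)

# Working backwards: what loss share among failures would yield 24 percent?
implied_loss_share = 0.24 / failure_share

print(f"Literal two-thirds: {exact_two_thirds:.1%} of all respondents")
print(f"Loss share implied by 24 percent: {implied_loss_share:.1%}")
```

Either way, about a quarter of the respondents reported losing data on an SSD; the small gap is consistent with "two-thirds" being a rounded figure.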

Cue the end-of-the-SSD-world headlines as some rush to proclaim that SSDs are failing at spectacular rates and that users likely will not be able to retrieve the data. The first tip-off that the survey is somewhat skewed is that 92 percent of respondents use SSDs, which is far above the share of typical users that have SSDs in their systems. (SSDs make up only 10 percent of brand new desktop PCs, for instance.)

If we take a second to dig into the single annotation on the survey, we see that the survey pool consists of 1,849 Kroll Ontrack data recovery customers. This is a bit like shooting fish in a barrel, as data recovery customers have already experienced a failure and the resultant data recovery or loss.

In Kroll Ontrack's defense, it does note in the press release that the survey consisted of its own customers. Self-serving surveys can backfire though, as it is easy to misunderstand the context of a survey. For instance, if we consider that the respondents are all Kroll Ontrack customers, one might surmise (perhaps incorrectly) that Kroll Ontrack was unable to recover data from failed SSDs two-thirds of the time. This does not engender much faith; perhaps other companies are more successful.

In either case, it is important to note that the responses are not indicative of either the failure rate or the recoverability of SSDs in the general population.

All storage devices suffer failures. There are significant hurdles to recovering data from SSDs, as there are with all forms of data storage. Fortunately, most SSD failures are in the SSD controller or logic and not a physical failure of the actual medium (NAND). This increases recoverability. Clever data recovery vendors work with SSD manufacturers to understand the device, and thus recover the data.

HDDs (arguably) suffer a higher rate of physical damage to the actual storage medium (platters) due to head crashes. Head crashes consist of the heads colliding with the platter, usually due to shock or vibration. This literally scrapes away the data, which renders it unrecoverable (we are only scratching the surface of this topic, as it would take several pages to explain SSD vs. HDD failures).

SSDs are nearly impervious to shock and vibration, and publicly released SSD failure rate data from vendors indicates that SSD failure rates hover in the tenths of one percent, whereas the HDD AFR (Annual Failure Rate) is usually in the 4 to 8 percent range. It is hard to accurately gauge the typical HDD failure rate due to a lack of industry disclosure, which alone speaks volumes.
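To put those annual rates in perspective, the following Python sketch compounds them over a drive's service life. It is a rough illustration under stated assumptions, not vendor data: it assumes a constant AFR with independent years, and the 0.5 percent SSD rate is simply one value picked from the "tenths of one percent" range. The function name is hypothetical.

```python
def prob_failure_within(afr, years):
    """Chance a drive fails at least once within `years`,
    assuming a constant, independent annual failure rate."""
    return 1 - (1 - afr) ** years


# Illustrative rates only: the SSD value is an assumed point in the
# "tenths of one percent" range; the HDD values are the 4 and 8 percent bounds.
rates = {
    "SSD (0.5% AFR, assumed)": 0.005,
    "HDD (4% AFR)": 0.04,
    "HDD (8% AFR)": 0.08,
}

for label, afr in rates.items():
    print(f"{label}: {prob_failure_within(afr, 5):.1%} chance of failure in 5 years")
```

Real drives do not fail at a constant rate (early-life and wear-out failures both skew the curve), so treat the output as a sense of scale rather than a prediction.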

SSDs are more reliable than HDDs, but all storage devices will fail, given enough time. All data will be lost eventually, and it does not matter where it is stored. As always, we should store data on several different storage devices, and at multiple locations, to mitigate risk.
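A quick sketch shows why multiple copies help so much. It assumes the copies fail independently of each other, which is the point of using different devices and locations, and it ignores the time needed to replace a failed copy. The function name and the 8 percent rate (the pessimistic end of the HDD range above) are used for illustration only.

```python
import math


def prob_all_copies_fail(annual_failure_rates):
    """Chance that every copy fails in the same year,
    assuming the copies fail independently of each other."""
    return math.prod(annual_failure_rates)


worst_case_hdd = 0.08  # pessimistic end of the 4 to 8 percent HDD range

for copies in (1, 2, 3):
    risk = prob_all_copies_fail([worst_case_hdd] * copies)
    print(f"{copies} independent copies: {risk:.4%} chance of losing all of them in a year")
```

Copies that share a chassis, a power supply, or a building are not truly independent, so the real-world benefit is smaller than this idealized figure suggests.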

Of course, these surveys are tightly controlled affairs; I have yet to see a company promote a "survey" that concludes the company is irrelevant, or that its product or service is only marginally important.

This Week's Storage Tidbit

NVMe is coming soon to a network near you.

NVMe is best known for its use as a storage protocol designed specifically for non-volatile memory (such as flash) that reduces latency and system overhead while boosting performance. The new protocol has gained broad industry acceptance for DAS (Direct Attached Storage) implementations, meaning directly inside of the computer or server.

The obvious performance and efficiency gains from NVMe are wonderful inside the system, but it was only a matter of time before our engineering friends decided to bring NVMe out of the box. The new NVMe over Fabrics 1.0 specification allows the protocol, which typically communicates over the PCIe bus, to communicate over other interconnects (such as Ethernet, Fibre Channel, InfiniBand and others).

PCIe can be used to connect servers, as with the technique employed in the FSA 200 we recently evaluated, but it has length restrictions that limit its applications. Extending NVMe over the various networking options increases the range of shareable storage in datacenter applications and also provides much better performance than competing protocols.

Protocol perfection is not achieved when there is nothing left to add, but rather when there is nothing left to remove. NVMe revolutionized storage performance by peeling back the performance-inhibiting layers of storage abstraction, and extending the speedy protocol over the network promises to revolutionize the performance of shared storage, as well.

Paul Alcorn is a Contributing Editor for Tom's Hardware, covering Storage. Follow him on Twitter and Google+.

Comment from the forums
  • sh4dow83
    That survey does seem questionable to say the least. Personally, I had about six HDDs fail on me over the years but no SSD so far. I own 3.
    And so I plan on migrating my HDD RAID1 to SSD once the prices drop a bit further still.
  • sh4dow83
    By which I'm by the way not claiming that MY experiences are representative of anything.
    By the way... why didn't you guys research the average failure rates of HDDs? Because I bothered doing that before relying so heavily on SSD and found that they indeed fail much more often.
  • John Lauro
    I have lost many hard drives and SSDs. Not sure which I would say is more reliable, but what I have found is that an HDD can tend to be brought back to life a little so you can read most if not all of the data. However, when an SSD decides to fail (and I am talking about failure way prior to its estimated life expectancy), it dies immediately, with no way of setting up a machine to keep retrying and eventually getting the data off. Plus they reuse different bits, so it's also less likely a recovery service will be able to pull the data off the individual chips and put it back together.

    I think manufacturers should be learning how to have drives fail in a more readable state, but I haven't seen that yet with SSD. I currently have about 120 SSD and way more than that HDDs, and haven't lost any data from a SSD failure. However, all the SSDs are in RAID 1 or RAID 5 or 6, etc...

    Based on my experience, HDD give more warning signs they are degrading... SSD just die for no good reason. As long as you have automatic redundancy in place, the failure rates are not that significant even if one was 10x as likely to fail as the other. However, if you only have a single drive in a computer it can be critical...