Backblaze's HDD Reliability Report Questions Enterprise Drive Premium

Cloud backup service provider Backblaze recently published its hard drive reliability statistics for Q3 2017. The company’s analysis of its own data highlights an interesting result: the observed annualized failure rates of its Seagate 8TB consumer and enterprise drives differ by only 0.1 percentage points. This led the company to pose the question of whether enterprise hard drives are worth the cost premium.

Backblaze’s data set currently consists of over 86,000 hard drives ranging from 3TB to 12TB and from manufacturers WDC, HGST, Seagate, and Toshiba. We’ve commented on Backblaze’s misuse of its hard drives before, but we understand that it’s altered its setup since then to address some of those issues. In any case, its data can still be a somewhat valuable additional point of comparison for making purchasing decisions.

The metrics in Backblaze’s data are straightforward. “Drive Count” is the total number of drives of a given model. “Drive Days” is the combined running time, in days, of all the drives counted in “Drive Count.” “Drive Failures” is the number of those drives that failed. “Annualized Failure Rate” is “Drive Failures” divided by “Drive Days” (converted to years), expressed as a percentage, and it’s Backblaze’s chosen metric for drive reliability.
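
As a rough illustration of that last metric, the calculation can be sketched in a few lines of Python. The function name and the sample figures are ours, chosen only to make the arithmetic easy to follow, and the sketch assumes a 365-day year; none of the numbers come from Backblaze's tables.

```python
def annualized_failure_rate(drive_failures: int, drive_days: int) -> float:
    """Return failures per drive-year of operation, as a percentage.

    Assumes a 365-day year (an assumption of this sketch).
    """
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365  # convert pooled running time to years
    return 100 * drive_failures / drive_years


# Hypothetical model: 1,000 drives that each ran for a full year
# (365,000 drive days in total), with 10 failures along the way.
afr = annualized_failure_rate(drive_failures=10, drive_days=365_000)
print(f"{afr:.1f}%")  # 1.0%
```

Because the denominator is pooled running time rather than drive count, a model with few drive days can swing sharply on a single failure.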

There are some things to keep in mind when looking at Backblaze’s data, however. First, a drive model’s annualized failure rate is meaningful only if it also has a high drive count and drive days. Otherwise, there probably isn’t enough data on it yet. Second, there is at least one unknown variable: individual drive life. All the drives are of different ages, so the data for a particular model with high drive days could still be composed of older and newer batches of drives. The implications therein depend on point of view; older drives could be closer to end-of-life and are therefore more likely to fail, but they could also be the ones that did not fail at the outset and are therefore more reliable.
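
One way to take drive age into account, rather than pooling it away, is a survival curve: the estimated fraction of drives still running at a given age. The sketch below uses the Kaplan-Meier estimator, a standard statistical technique and not something Backblaze's report uses. It assumes per-drive records of age and failure status, which the published summary table does not contain; the function name and the sample records are hypothetical.

```python
from collections import Counter


def survival_curve(records):
    """Kaplan-Meier estimate of the fraction of drives still running by age.

    records: iterable of (age_days, failed) pairs, one per drive. age_days is
    the drive's age at failure or, for a drive that has not failed, its age
    when last observed.
    Returns a list of (age_days, surviving_fraction), one entry per age at
    which at least one failure occurred.
    """
    records = list(records)
    failures = Counter(age for age, failed in records if failed)
    exits = Counter(age for age, _ in records)  # drives leaving observation
    at_risk = len(records)
    surviving = 1.0
    curve = []
    for age in sorted(exits):
        failed_here = failures.get(age, 0)
        if failed_here:
            # Only drives still under observation at this age count as at risk.
            surviving *= 1 - failed_here / at_risk
            curve.append((age, surviving))
        at_risk -= exits[age]
    return curve


# Hypothetical records: two failures, two drives still running when observed.
sample = [(100, True), (200, False), (300, True), (400, False)]
print(survival_curve(sample))  # [(100, 0.75), (300, 0.375)]
```

Drives that are still running contribute to the at-risk count only up to their current age, so a young batch does not make a model look more durable than the data supports.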

Backblaze’s own analysis has an additional and particularly interesting point of note. At the beginning of 2017, the company added two models of 8TB Seagate drives: a consumer drive (ST8000DM002) and an enterprise drive (ST8000NM0055). The cumulative annualized failure rates for the two are 1.1% and 1.2%, respectively. The difference of only 0.1 percentage points led Backblaze to question whether the enterprise drive is worth the price difference.

However, this is only a preliminary finding, and it should be taken with a grain of salt. Backblaze does note in its report that the two drives carry different warranty periods: the consumer drive has a one-year warranty, whereas the enterprise drive has a five-year warranty. The 0.1-point difference comes after only one year (actually, slightly less) of use, which is within the consumer drive’s warranty period. The expected advantage of the enterprise drive is a lower annualized failure rate over the long term. Therefore, the similarity between the two drives over the period Backblaze tracked them doesn’t tell us anything about their relative value once the consumer model is past its warranty period.

It’ll certainly be interesting to see what happens as Backblaze continues to track these drives through their lifetime.

Comment from the forums
  • ammaross
    @Leon (Author)
    You also fail to point out that the new Enterprise drives have only been in service for 94 "drive days" (total drive days divided by number of drives) vs the consumer drive's 375 drive days making the consumer drives effectively in service for FOUR TIMES as long. This obviously will skew the "drive failures" value as there's been longer in-service time to have a failure.
  • bit_user
    I had similar doubts about their "AFR".

    What we really want to see is the drive mortality curve (i.e. % of drives still running after x days), for each model. All they'd have to do is keep a database with each individual drive's installation and decommission dates. It's not as much effort as it sounds, since you can read a drive's serial number via SMART.

    As a matter of fact, I'm pretty sure you can even query the number of hours the drive has been spinning.
  • ammaross
    Yep, since they have "drive hours" they obviously have an install date. It seems they would document serial number and drive type for "dead/retired" date since they have the failure rate by type. Would be a simple bit of reporting to give a bell curve of death, average life at death, etc. This kind of simple spreadsheet output is nearly useless. 1.2% drive death for drives with 94 days average service life vs 1.1% drive death for drives with 374 days drive life, while mildly useful, really doesn't give you a good dataset to work with.