Cloud storage company Backblaze has published new data showing that its SSDs fail almost as often as its hard disk drives (HDDs). In a recent blog post, Backblaze explained its drive analysis for both SSDs and HDDs, which is based on real-world use in a live environment. The company uses SMART stats to check drive health. Backblaze says it doesn't yet know why its SSDs have such a high failure rate.
For its analysis, Backblaze counts a drive as failed if it has either failed completely or is about to fail. To predict the latter, Backblaze uses the drives' internal SMART stats, recording read error rate, SSD wear leveling, power-on hours, program fail count total, and more.
To make the analysis more useful, Backblaze has only analyzed boot drives in its storage servers, instead of the main storage drives. Boot drives receive near-constant use from starting up the server to reading, writing, and deleting files, resulting in very little idle time.
Since 2018, Backblaze has used a combination of both SSDs and hard disk drives for boot drives in its servers, making the company the perfect candidate for this kind of testing.
In the first table, Backblaze shows the lifetime SSD and HDD failure rates starting from 2013. You can see that HDDs have a significantly higher failure rate than SSDs, making us think that SSDs are indeed much more durable than HDDs, like we've been told all along.
However, there are a few problems with this, the main one being drive age. Backblaze only began installing SSDs in 2018. But the company has data pertaining to hard drive health going all the way back to 2013, which is skewing the results quite a bit.
After taking drive age into account and equalizing it between SSDs and HDDs, the results change significantly. SSDs aren't that far behind hard drives in failure rate, with a 1.05% annualized failure rate for SSDs compared to 1.38% for HDDs.
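To make the comparison concrete, here is a minimal sketch of how an annualized failure rate can be computed. The article doesn't spell out the formula, so this assumes the common definition of failures per drive-year of service, expressed as a percentage. The function name and the fleet figures below are hypothetical and are not Backblaze's data.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return failures per drive-year of service, as a percentage.

    Assumed definition: failures / (drive_days / 365) * 100.
    """
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100


# Hypothetical figures for two fleets with the same total time in service.
ssd_afr = annualized_failure_rate(failures=10, drive_days=365_000)
hdd_afr = annualized_failure_rate(failures=15, drive_days=365_000)

print(f"SSD AFR: {ssd_afr:.2f}%")
print(f"HDD AFR: {hdd_afr:.2f}%")
```

Because the rate is normalized by time in service rather than by drive count, it only compares fairly when both groups cover drives of similar age, which is why the age-matched numbers differ so much from the lifetime ones.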
Backblaze doesn't know why SSDs are failing so much, but the data definitely shows that SSDs aren't nearly as resilient as we once thought. To clarify, these SSDs are failing completely, not from maxing out the drive's write endurance, but just from general use.
Backblaze also points out that its hard drive failure rates increased a lot between 2018 and 2020 (before stabilizing in 2021) due to drive age (most of these drives were installed around 2014). This means SSDs could suffer the same fate once Backblaze's SSDs become a few years older. But only time will tell if this is actually true.
For now, it's not clear why these SSDs are failing so often, especially when SSDs have no moving parts. It's not clear what make and model SSDs Backblaze is using. Some may have budget controllers and/or NAND flash, or perhaps other factors are at play.
As the company gets more long-term data using SSDs, the picture will likely become clearer as well. But whatever the reason for drive failure, this is a great reminder to always back up your storage drives, or use a RAID array with redundancy, whether your drives use spinning platters or non-moving NAND.