Why You Want RAID
Inside Guide: Making the Most of NAS for Backup
To only say that parity bits in RAIDs add data protection runs the risk of minimizing the importance of that protection. If one backs up a folder from an internal drive onto a NAS volume, doesn’t that constitute protection? Naturally, the answer is both yes and no. Yes, some protection is better than none at all. But as anyone with unplanned children will testify, life happens, and sometimes a little protection isn’t enough. Moreover, as storage becomes centralized and NAS becomes the primary storage target, local PC storage may no longer serve as a second data copy.

The most basic RAID level with parity protection is RAID 5, which requires a minimum of three drives. RAID 5 involves both striping for performance as well as computing parity for protection. A RAID 5 can sustain one drive failure without permanent data loss, and that alone makes it worth implementing. However, keep in mind that no RAID is invulnerable; all data risk boils down to a matter of odds. The more protection one employs, the better the chance of avoiding data loss, but also the higher cost per gigabyte of protected capacity.
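To make the parity idea concrete, here is a minimal sketch of how one stripe in a three-drive RAID 5 could be protected and then recovered with XOR. The helper name `xor_blocks` and the sample byte strings are illustrative assumptions, not part of any particular NAS product; real controllers work on much larger blocks and rotate the parity position across drives.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe across a three-drive array: two data blocks plus one parity block.
data_a = b"backup-1"
data_b = b"backup-2"
parity = xor_blocks([data_a, data_b])

# Simulate losing the drive that held data_b, then rebuild it
# from the surviving data block and the parity block.
rebuilt_b = xor_blocks([data_a, parity])
print(rebuilt_b == data_b)
```

Because XOR is its own inverse, any single missing block in a stripe can be recomputed from the remaining ones, which is why a RAID 5 survives one drive failure but not two.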
To illustrate, consider some numbers pulled from the RAID calculator at www.memset.com/tools/raid-calculator. Imagine you have a RAID of three 3 TB drives. Also keep in mind that an array with a failed drive generally defaults into a lower-performing degraded mode while simultaneously performing XOR recovery of failed drive data, all of which contributes to lengthy rebuild times once the RAID owner gets around to replacing the failed drive and initiating the rebuild. (We’ll assume one week for the owner to do this, but the reality is that it could take much longer if the RAID manager is not configured to alert the owner when failure strikes.) A sustained rebuild rate of 10 MB/s is slightly optimistic but possible. We’ll also assume that the user has employed low-end consumer hard drives with one-year warranties.
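As a rough back-of-the-envelope check on the rebuild figure above, the following sketch estimates how long it would take to rewrite one replacement drive at a sustained rate. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and that the entire drive must be rewritten; the function name `rebuild_hours` is hypothetical.

```python
def rebuild_hours(drive_tb, rate_mb_per_s):
    """Estimate hours needed to rewrite a whole drive at a sustained rate."""
    drive_bytes = drive_tb * 10**12                   # decimal terabytes (assumption)
    seconds = drive_bytes / (rate_mb_per_s * 10**6)   # decimal megabytes per second
    return seconds / 3600

hours = rebuild_hours(3, 10)
print(f"{hours:.1f} hours (~{hours / 24:.1f} days)")  # 83.3 hours (~3.5 days)
```

Under these assumptions, a 3 TB drive rebuilding at 10 MB/s needs roughly three and a half days, on top of however long the owner takes to replace the failed unit.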
The following chart shows the relationship between RAID capacity, array type (RAID 0 is striping only, RAID 1 is mirroring only), and the annual odds of data loss according to the above parameters.
Would anyone accept such odds of losing data every year? Of course not, which is why RAID 0 is rarely used for NAS applications, particularly for backup. By replicating data three times throughout the array, RAID 1 shows vastly superior protection, but effectively getting one-third of the paid-for storage capacity is a bitter pill to swallow for most CFOs and budget managers. RAID 5 offers the best compromise of the three options.
Interestingly, adding one more 3 TB drive to the array bumps up the RAID 5 capacity to 9 TB, which cuts the loss odds to 1:5.2. Here’s the intriguing part: Leaving that fourth drive as a hot spare will keep the array capacity at 6 TB, but by practically eliminating the failed unit replacement time, loss odds swing to 1:12.7. (Again, the $120 one might pay today for something like a 3 TB Seagate NAS HDD seems a small cost for such improved security.) Some NAS solutions will support RAID 6, which uses two drives' worth of capacity for redundancy and can therefore sustain two drive failures without data loss. While a RAID 6 would still only offer 6 TB of capacity in our scenario, the loss odds plummet to 1:124.8.
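The capacity figures in this scenario follow from simple arithmetic, sketched below. The function `usable_tb` is a hypothetical helper; it assumes identical drives, ignores file system overhead, and models RAID 1 the way the scenario above describes it, with every active drive holding a full copy of the same data.

```python
def usable_tb(level, drives, drive_tb, hot_spares=0):
    """Usable capacity in TB for an array of identical drives (simplified model)."""
    active = drives - hot_spares          # hot spares sit idle until a failure
    if level == "raid0":
        return active * drive_tb          # striping only, no redundancy
    if level == "raid1":
        return drive_tb                   # every active drive mirrors the same data
    if level == "raid5":
        return (active - 1) * drive_tb    # one drive's worth of parity
    if level == "raid6":
        return (active - 2) * drive_tb    # two drives' worth of parity
    raise ValueError(f"unsupported RAID level: {level}")

print(usable_tb("raid5", 3, 3))                 # 6  (three 3 TB drives)
print(usable_tb("raid5", 4, 3))                 # 9  (fourth drive added to the array)
print(usable_tb("raid5", 4, 3, hot_spares=1))   # 6  (fourth drive kept as hot spare)
print(usable_tb("raid6", 4, 3))                 # 6  (two drives' worth of parity)
```

The last three lines show the trade-off described above: the same four drives yield either more capacity or more protection, but not both.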
Once a potential NAS buyer agrees that RAID is a must-have, the next question is whether that RAID should be managed through hardware or software, meaning through an enclosure with a dedicated storage processor or a more general-purpose CPU that handles storage computation alongside the operating system and other tasks. This decades-old debate usually boils down to matters of cost, workload, and expected performance. Hardware RAID will cost more, but if this added cost can be offset through application productivity gains due to hardware’s higher performance, then the investment makes sense.
For backup applications, though, hardware RAID may not be necessary. Backup simply isn’t an I/O-intensive operation, and owners are unlikely to need the added IOPS that a hardware controller would provide. Additionally, software RAID tends to be more flexible in its ability to add more drives, expand volumes, mix and match drive models and capacities, and migrate volumes from one RAID type to another. This flexibility makes a software RAID NAS preferable for backup, all the more so because the software approach is likely to cost less.