IDE Training Course, Part 2: Performance and Data Security with RAID

RAID: A Comparison Of Different Modes

RAID 0: Striping

Strictly speaking, mode 0 does not qualify as a RAID, because an essential ingredient, data redundancy, is missing. RAID 0 therefore offers no advantage in terms of data security - quite the contrary. All data is distributed evenly across all drives in the array; such an array is called a stripe set. The process is best described as the "zipper method." The benefit is clear: because the data stream is spread across all drives, the theoretical data transfer rate is multiplied by the number of drives. The upper limits are the maximum transfer rate per channel (100 MB/s for UltraATA/100) and the maximum bandwidth of the controller on the PCI bus (266 MB/s for 32-bit PCI at 66 MHz). This drastic performance boost comes at the price of higher vulnerability to faults: instead of one drive, every drive in the array must now work error-free. If even one of the drives fails, all of the stored data is lost.
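The "zipper method" and the transfer-rate limits can be illustrated with a small sketch. It is only a model of the idea, not how any particular controller works: the function names (stripe, unstripe, theoretical_rate), the tiny block size, and the assumption of one drive per channel are all made up for this example.

```python
def stripe(data: bytes, num_drives: int, block_size: int = 4) -> list:
    """Deal the data out block by block, round-robin, like a zipper."""
    drives = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), block_size):
        block_no = offset // block_size
        drives[block_no % num_drives].append(data[offset:offset + block_size])
    return drives


def unstripe(drives: list) -> bytes:
    """Reassemble the data by reading the blocks back in the same order."""
    out = []
    rows = max(len(drive) for drive in drives)
    for row in range(rows):
        for drive in drives:
            if row < len(drive):  # the last row may be incomplete
                out.append(drive[row])
    return b"".join(out)


def theoretical_rate(per_drive_mb_s: float, num_drives: int,
                     channel_limit_mb_s: float = 100.0,
                     bus_limit_mb_s: float = 266.0) -> float:
    """Upper bound on throughput, assuming one drive per channel."""
    per_drive = min(per_drive_mb_s, channel_limit_mb_s)
    return min(per_drive * num_drives, bus_limit_mb_s)


drives = stripe(b"ABCDEFGHIJ", num_drives=2, block_size=2)
print(drives)                   # blocks alternate between the two drives
print(unstripe(drives))         # the original data again
print(theoretical_rate(40, 2))  # two drives at an assumed 40 MB/s each
```

Note that unstripe needs every drive: remove one list from the set and the original data can no longer be reassembled, which is exactly the weakness of RAID 0.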

RAID 1: Mirroring

Mode 1 is basically the complete opposite of RAID 0. The goal here is not to boost performance, but to ensure data security. All drives of the array are used simultaneously for every read or write access. Data is thus written synchronously to two or more drives, which is equivalent to a perfect backup copy - perfect because the data is always 100% up-to-date.
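A minimal sketch of mirroring, assuming drives can be modeled as simple block tables. The class name Mirror and its methods are invented for this illustration and do not correspond to any real controller interface.

```python
class Mirror:
    """Toy RAID 1 array: every write goes to all drives."""

    def __init__(self, num_drives: int = 2):
        # Each drive is modeled as a dict: block number -> data
        self.drives = [{} for _ in range(num_drives)]

    def write(self, block_no: int, data: bytes) -> None:
        # The same block is written to every drive of the array
        for drive in self.drives:
            drive[block_no] = data

    def read(self, block_no: int, failed=()) -> bytes:
        # Any surviving drive can serve the request
        for index, drive in enumerate(self.drives):
            if index not in failed:
                return drive[block_no]
        raise IOError("all drives of the mirror have failed")


array = Mirror(num_drives=2)
array.write(0, b"important data")
print(array.read(0, failed={0}))  # still readable with drive 0 gone
```

The data stays available as long as at least one drive of the mirror survives.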

RAID 2: Bit-Level Striping with ECC

RAID 2 is based on the same principle as RAID 0: the stripe set distributes the data across all drives, though at the bit level rather than in blocks. This is necessary because an Error Correcting Code (ECC) is computed for all data written. Additional hard drives are needed to store the resulting extra volume. To guarantee complete data security, you would have to deploy at least ten data disks and four ECC disks. The next level would entail 32 data disks and seven ECC disks. This explains why RAID 2 never caught on.

On top of that, performance is only mediocre, as simultaneous accesses are not possible in bit-level stripe sets. The more accesses there are, and the shorter they are, the more sluggish RAID 2 becomes.

RAID 3: Data Striping, Dedicated Parity

Level 3 incorporates sensible error correction. Data is distributed byte by byte across several hard drives, while the parity data is stored on a separate drive. This is precisely the disadvantage of RAID 3: the parity drive is involved in every single access. The main advantage of RAID, pooling disk performance by distributing accesses, is thus partially offset. RAID 3 needs a minimum of three drives.

This mode requires quite a complex controller, which is why RAID 3, similar to levels 4 and 5, never caught on in the mass market.
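How a dedicated parity drive protects the data can be sketched with XOR parity, the usual basis for parity in RAID. This is a simplified model: the function names xor_parity and rebuild and the sample bytes are assumptions made for the illustration.

```python
from functools import reduce


def xor_parity(blocks: list) -> bytes:
    """XOR equally sized blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks: list, parity: bytes) -> bytes:
    """Recover the one missing block from the survivors and the parity."""
    return xor_parity(surviving_blocks + [parity])


# Three data drives plus one parity drive
d0, d1, d2 = b"\x01\x02", b"\x0f\x10", b"\xff\x00"
parity = xor_parity([d0, d1, d2])

# Drive 1 fails: its contents follow from the other drives and the parity
assert rebuild([d0, d2], parity) == d1
```

Every write changes the parity, so the parity drive takes part in each write operation, which is the bottleneck described for RAID 3.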

RAID 4: Data Striping, Dedicated Parity

RAID Level 4 works much like level 3, except that the individual stripes are written in blocks rather than bytes. In theory, this should speed things up, but the parity drive remains the bottleneck.

RAID 5: Distributed Data, Distributed Parity

RAID Level 5 is generally considered the best compromise between data security and performance. Not only the data, but also the parity information, is distributed across all the drives in the array. As a result, RAID 5 is really only a bit slower than RAID 3. However, failure safety is limited, as only one hard drive can fail without data loss. At least three hard drives are required in each case.
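The distribution of parity across all drives can be pictured with a small layout sketch. The rotation rule used here is just one plausible scheme chosen for illustration; real controllers use different rotation patterns, and the function name raid5_layout is an assumption.

```python
def raid5_layout(num_drives: int, num_stripes: int) -> list:
    """Return one row per stripe, marking data blocks (D0, D1, ...) and parity (P)."""
    rows = []
    for stripe in range(num_stripes):
        # Assumed rotation: parity moves one drive to the left per stripe
        parity_drive = (num_drives - 1 - stripe) % num_drives
        block = stripe * (num_drives - 1)  # first data block of this stripe
        row = []
        for drive in range(num_drives):
            if drive == parity_drive:
                row.append("P")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows


for row in raid5_layout(num_drives=3, num_stripes=3):
    print(row)
# ['D0', 'D1', 'P']
# ['D2', 'P', 'D3']
# ['P', 'D4', 'D5']
```

Because the parity block sits on a different drive in each stripe, no single drive has to handle all parity accesses.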

RAID 6: Distributed Data, Distributed Parity

RAID 6 is essentially RAID 5, except that twice the amount of parity information is stored. Though this cuts down on performance a bit, it allows up to two hard drives to fail without data loss. It does require, however, a minimum of four drives.
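The price of redundancy in the modes compared above can be summarized as a quick capacity calculation. This is a rough sketch assuming identical drives; the function name usable_capacity and the 80 GB drive size are made up for the example, and RAID 2 is left out because its ECC overhead depends on the code used.

```python
def usable_capacity(level: int, num_drives: int, drive_gb: float) -> float:
    """Usable capacity of an array of identical drives, by RAID level."""
    if level == 1:
        # Mirroring: every drive holds the same data
        return drive_gb
    # Number of drives' worth of capacity that goes to parity
    parity_drives = {0: 0, 3: 1, 4: 1, 5: 1, 6: 2}
    return (num_drives - parity_drives[level]) * drive_gb


for level, drives in [(0, 2), (1, 2), (5, 4), (6, 5)]:
    print(f"RAID {level} with {drives} x 80 GB:",
          usable_capacity(level, drives, 80), "GB usable")
```

RAID 0 uses every byte for data, while the double parity of RAID 6 costs the capacity of two drives regardless of the size of the array.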