RAID stands for Redundant Array of Independent Drives (or Disks).
RAID has multiple purposes. Not all levels/methods of RAID offer all of these features:
1. Pooling of drive space. Most RAID levels (exception: RAID 1) can combine the space from several physical disks and present it to the computer as one logical disk. For example, using two 320GB drives in a RAID 0 makes the computer believe that one 640GB drive is connected.
2. Redundancy. Most levels of RAID (exception: RAID 0) have some form of redundancy built in. This protects against a single hard drive failure, with the ability to recover the array using a new blank drive when one drive in the array fails. The price that you pay for having redundancy is that not all the physical space from the pooled drives will be available. The amount of physical space lost to redundancy varies with the RAID level. Redundancy is designed only to protect against hard drive failure, not as a general backup mechanism. RAID redundancy cannot protect against accidental file deletion, malware, viruses, or corruption caused by malfunctioning hardware or software.
3. Availability. A feature widely needed in servers, this allows the array to keep operating when a drive has failed. For some RAID levels, there is a loss in performance when the array is operating with one drive failed (called a degraded state).
4. Performance. Most levels of RAID can offer increased performance (in terms of the sequential transfer rate, known as STR) over that of a single drive. In some cases, the STR can be lower than that of a single drive, such as when writing to non-optimally designed RAID 5 arrays, or when operating a degraded RAID 5 array. Higher sequential transfer rates do not necessarily translate into higher overall performance; it is highly application-dependent.
Levels of RAID:
RAID 0: Takes 2 or more drives and stripes the data across them.
Pooling of drive space: All drive space is pooled together. The logical drive size is the sum of all physical drive sizes, resulting in 0% space lost.
Redundancy: RAID 0 is the only level of RAID that is not redundant at all, thus is sometimes called "AID 0". The loss of a single physical drive will lose all data on the array.
Availability: None. Loss of a single drive results in a non-operational array.
Performance: Highest STR of all RAID levels. Can theoretically reach n * the single drive transfer rate, where n is the number of drives in the array. This figure may be limited by other factors, such as interface speed or bus speed.
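To make the striping idea concrete, here is a minimal sketch of how a logical block could map onto the physical drives. It assumes a stripe unit of exactly one block and a simple round-robin layout; real controllers use a configurable stripe size, and the function name locate_block is made up for this illustration.

```python
from typing import Tuple

def locate_block(logical_block: int, num_drives: int) -> Tuple[int, int]:
    """Map a logical block number to (drive index, block offset on that drive).

    Assumes round-robin striping with a stripe unit of one block.
    """
    if num_drives < 2:
        raise ValueError("RAID 0 needs at least 2 drives")
    return logical_block % num_drives, logical_block // num_drives

# Layout of the first six logical blocks on a 3-drive RAID 0.
layout = [locate_block(b, 3) for b in range(6)]
for block, (drive, offset) in enumerate(layout):
    print(f"logical block {block} -> drive {drive}, offset {offset}")
```

Consecutive logical blocks land on different drives, which is why a large sequential transfer can keep all n drives busy at once.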
RAID 1: Takes exactly 2 drives and mirrors data between them, i.e. the same data resides on both drives.
Pooling of drive space: Drive space is not pooled together. The logical drive is the same size as one physical drive, resulting in a 50% space loss.
Redundancy: The array is redundant. Loss of either drive does not lose any data.
Availability: Array continues operating with no performance penalty if a drive fails. RAID 1 is also the only RAID level where a single drive of the array can be taken to another computer and read without using the RAID controller (true of most RAID 1 implementations, but not all).
Performance: Most RAID 1 controllers perform the same as a single drive. Some high-end controllers with a large amount of cache can improve read speed from a RAID 1, but not nearly as much as other RAID levels that incorporate some kind of striping.
RAID 2,3,4: Not covered here, not used anymore.
RAID 5: Takes 3 or more drives and stripes them with rotating block parity.
Pooling of drive space: Available space is equal to the combined space of all drives but one. (For example, in a 3 drive RAID 5, you get space equal to 2 drives. In a 5 drive RAID 5, you get space equal to 4 drives, etc.)
Redundancy: The array is redundant. Loss of any drive does not lose any data. Drive can be replaced and the array rebuilt.
Availability: The array is available even when a drive has failed. RAID 5 (depending on the controller) may have poor performance when operating in a degraded state.
Performance: RAID 5's performance is highly dependent on the implementation. Most controllers show good performance on reads, although not as good as RAID 0. RAID 5 write performance varies widely, with some implementations doing very well, others doing very poorly. If performance is a consideration, you must check/benchmark the implementation you're going to use.
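For the curious, parity in RAID 5 is normally a byte-wise XOR across the data blocks of a stripe. The sketch below shows why one lost block can be rebuilt from the survivors. It works on a single stripe only and ignores parity rotation; the helper name xor_blocks and the sample data are made up for this illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a hypothetical 4-drive RAID 5: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The drive holding data[1] fails. XOR the surviving blocks with the parity
# block to regenerate the missing block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

This is also where the degraded-state penalty comes from: every read of a block on the failed drive requires reading all the other drives in that stripe.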
RAID 6: Takes 4 or more drives and stripes them using rotating dual-block parity.
Pooling of drive space: Available space is equal to the combined space of all drives but two. (For example, in a 4 drive RAID 6, you get space equal to 2 drives. In a 7 drive RAID 6, you get space equal to 5 drives, etc.)
Redundancy: The array is doubly redundant. Loss of any two drives does not lose any data. Drives can be replaced and the array rebuilt.
Availability: The array is available even when up to 2 drives have failed. RAID 6 (depending on the controller) may have poor performance when operating in a degraded state.
Performance: Since RAID 6 is typically only available in high-end controllers, performance tends to be good given those implementations. Read performance is on par with RAID 5, not as good as RAID 0. Write performance tends to be decent given that only high-end controllers support this RAID level.
RAID 0+1 and RAID 10: Though people tend to group these two RAID levels together, they are not exactly the same. Both are "nested" RAID levels, in other words, a RAID of RAIDs. In RAID 0+1, two RAID 0 arrays are created, each with 2 or more drives. Then, one RAID 0 is mirrored to the other RAID 0. In RAID 10, drives are paired into RAID 1 mirrors. Then, data is striped across all the RAID 1 arrays to form a RAID 0. In both schemes, a minimum of 4 drives is required.
Pooling of drive space: Logical drive space is equal to half of the physical drive space.
Redundancy: The array is redundant. Any single operating drive can be lost; it can then be replaced and the array rebuilt without losing data.
Availability: The array is available while degraded and during rebuild.
Performance: RAID 0+1 or 10 can be a very fast array, especially for writes. RAID 10 is superior in the degraded states because under some conditions it can handle multiple drive failures, and during rebuild operations RAID 10 performs much better than RAID 0+1.
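The difference in multiple-failure handling can be counted directly. This sketch enumerates every possible two-drive failure in a 4-drive array. It assumes that a RAID 0+1 controller takes a whole RAID 0 set offline as soon as one of its members fails (typical, but implementation-dependent); the function names are made up for this illustration.

```python
from itertools import combinations

def raid10_survives(failed, mirror_pairs):
    """RAID 10 survives as long as no mirror pair has lost both drives."""
    return all(not set(pair) <= failed for pair in mirror_pairs)

def raid01_survives(failed, stripe_sets):
    """RAID 0+1 survives only if at least one RAID 0 set is fully intact."""
    return any(not (set(s) & failed) for s in stripe_sets)

# Drives 0-3, grouped as (0, 1) and (2, 3) in both schemes.
groups = [(0, 1), (2, 3)]
failures = [set(c) for c in combinations(range(4), 2)]

raid10_ok = sum(raid10_survives(f, groups) for f in failures)
raid01_ok = sum(raid01_survives(f, groups) for f in failures)
print(f"RAID 10 survives {raid10_ok} of {len(failures)} two-drive failures")
print(f"RAID 0+1 survives {raid01_ok} of {len(failures)} two-drive failures")
```

Under these assumptions RAID 10 survives 4 of the 6 possible two-drive failures, while RAID 0+1 survives only 2.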
RAID 0+5 and RAID 50: Similar to RAID 0+1 and 10, but uses 3 or more drives in RAID 5s instead of using RAID 1s. Not widely used, requires minimum 6 drives, offers better space utilization than RAID 0+1 or 10 at the expense of somewhat lower performance.
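The space rules for the levels above can be summed up in a few lines. This is only a sketch: it assumes all drives in the array are the same size, it leaves out RAID 0+5/50, and the function name usable_space_gb is made up for this illustration.

```python
def usable_space_gb(level, num_drives, drive_gb):
    """Return the logical drive size in GB for an array of equal-sized drives."""
    n = num_drives
    if level == "0" and n >= 2:
        usable = n            # nothing lost to redundancy
    elif level == "1" and n == 2:
        usable = 1            # the second drive is a mirror
    elif level == "5" and n >= 3:
        usable = n - 1        # one drive's worth of parity
    elif level == "6" and n >= 4:
        usable = n - 2        # two drives' worth of parity
    elif level in ("0+1", "10") and n >= 4 and n % 2 == 0:
        usable = n // 2       # half the drives hold mirrored copies
    else:
        raise ValueError(f"RAID {level} is not valid with {n} drives")
    return usable * drive_gb

for level, n in [("0", 2), ("1", 2), ("5", 3), ("6", 4), ("10", 4)]:
    print(f"RAID {level} with {n} x 320GB: {usable_space_gb(level, n, 320)}GB")
```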
Intel Matrix RAID: Not a RAID level but a scheme offered by some Intel RAID controllers that allows multiple independent logical RAID drives, with possibly different RAID levels, to be implemented on the same set of physical drives. For example, if you wanted to have a RAID 0 on your machine for the OS/applications, and a RAID 5 for data, this would normally require 5 drives (2 for the RAID 0, 3 more for the RAID 5). With the Intel Matrix RAID, it can be done with only 3 drives. A portion of the 3 drives is striped into a RAID 0, and the remaining portion is striped into a RAID 5. Similarly, if you want a RAID 0 for performance and a separate RAID 1 for redundancy, this would normally require 4 drives. The Intel Matrix controller can set that up using only 2 drives. The Intel Matrix RAID controller can set up a maximum of 2 independent logical drives like this.
Now let's clear up a few other things:
RAID 1 ... Ie: One drive has a 5% chance of failure. Use 2x in RAID the probability of BOTH drives failing halves.
No, the probability of each individual drive failure does not change. Each drive still has a 5% chance of failure. What changes is the probability of losing data. To lose data in a RAID 1, both drives must fail. Taking the 5% figure as a per-year chance and assuming the two drives fail independently, the probability of both drives failing within 1 year is 0.05 * 0.05 = 0.0025 = 0.25% chance of data loss in 1 year's time. It's much less if a failed drive is replaced immediately.
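The arithmetic is easy to check, and it is worth putting RAID 0 next to it for contrast. This sketch assumes drives fail independently with the same probability over the period and that a failed drive is never replaced; the function names are made up for this illustration.

```python
def raid1_loss_probability(p_drive):
    """Data is lost only if both mirrored drives fail."""
    return p_drive ** 2

def raid0_loss_probability(p_drive, num_drives):
    """Data is lost if any one of the striped drives fails."""
    return 1 - (1 - p_drive) ** num_drives

p = 0.05  # the 5% per-drive figure from the question
print(f"single drive:   {p:.2%}")
print(f"2-drive RAID 1: {raid1_loss_probability(p):.2%}")
print(f"2-drive RAID 0: {raid0_loss_probability(p, 2):.2%}")
```

So mirroring cuts the chance of data loss by a factor of 20 at this failure rate rather than halving it, while a 2-drive RAID 0 nearly doubles it.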
Intel's Matrix RAID is another layer of confusion, though it's a pretty impressive thing that it does. You can almost run the mixed RAID modes with fewer drives than necessary. Example: RAID0+1 needs 4 drives (you set up 2 as RAID0, the other 2 as RAID0 and then RAID1 the two RAID0 arrays), but with Matrix you can do it with 2 drives. Setting up a RAID0 across the 1st part of the drive, then across the 2nd part and then RAID1 the two subarrays. I suppose it is faster than just doing RAID1 -- you end up with the same usable space, so there would be no point if it were slower.
No, the Intel Matrix controller cannot do nested RAID levels requiring 4 drives in 2 drives like this. It can do independent RAID arrays requiring 4 drives using only 2.
RAID5 should really be left to an add in card, PCI-X or PCI-E x4/x8 with a XOR chip and a DIMM slot.
I tend to agree, but it's not required for performance. There are Linux software implementations of RAID 5 that achieve very high RAID 5 read and write rates. However, I know of no implementations like that on Windows. For Windows, I agree with the recommendation of a higher-end hardware RAID card for RAID 5 implementations.