I want to increase storage capacity and fault-tolerance on a computer whose most disk-intensive common task is giant compile jobs. It currently has a single 250GB disk and I can afford to spend only about $250 on more space, which rules out an SSD. I'm currently thinking either a RAID5 or RAID10 configuration built out of three or four 500GB disks. The motherboard has six SATA (3.0 Gb/s) ports and four usable drive bays, and the power supply is 400W. Questions are:
1) RAID5 or RAID10? Either way it would have to be Linux's built-in software RAID.
2) Am I asking for trouble if I put four hard disks on a 400W power supply?
3) 500GB disks seem to come with either 16 or 64MB of onboard cache these days. In this configuration is it worth paying extra for the additional cache? Is there any other reason to prefer the WD "Black" series over the "Blue" series (or Seagate equivalent)? Note: I have no HD-manufacturer brand loyalty.
4) Should I spend my limited upgrade budget on something else instead? Note that I really do need more space, so "srsly get an SSD" is not an option unless you know where to find a terabyte of SSD that doesn't cost an arm and a leg (and can still be trusted to give back *all* of your data when asked).
(in case you're curious, the "giant compile jobs" are on the scale of a web browser: currently this takes ~30min from scratch with a single HDD; I don't expect to be able to get that to go much faster with spinning rust, but I need to not make it any *slower*.)
A good estimate of the power draw is approximately 15 W per drive (this is a maximum, so your actual draw is likely to be lower). Depending on your other components, the 400 W supply may be sufficient, but more headroom is always better.
Any RAID level that stripes data to increase speed (RAID 0, 5, or 10) speeds up disk access by breaking data into chunks that are spread across a pair (or 4, or 6, or 8, etc.) of drives, so reads and writes proceed in parallel. This makes the array less dependent on each disk's onboard cache than a single-drive configuration, so onboard cache is less valuable in a RAID array.
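For reference, here is a minimal sketch of how the striped layouts under discussion would be created with Linux's md driver; the device names (`/dev/sdb` through `/dev/sde`) and the ext4 choice are assumptions, so substitute your own:

```shell
# RAID 10 across four 500 GB disks (~1 TB usable, survives one disk failure):
mdadm --create /dev/md0 --level=10 --raid-devices=4 \
    /dev/sdb /dev/sdc /dev/sdd /dev/sde

# Alternatively, RAID 5 across the same four disks (~1.5 TB usable):
# mdadm --create /dev/md0 --level=5 --raid-devices=4 \
#     /dev/sdb /dev/sdc /dev/sdd /dev/sde

# Then put a filesystem on the array and mount it as usual:
mkfs.ext4 /dev/md0
```

Note that RAID 5 rebuilds and parity writes are noticeably slower than RAID 10 on the same hardware, which matters if compile throughput is the priority.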
You should use matching HDDs in your RAID array. Mismatched speeds or cache sizes won't necessarily stop the array from working, but the array is forced down to the pace of its slowest member, and mismatched capacities waste space. Otherwise it is like a conductor trying to conduct at two tempi at once.
RAID 5 and 10 are great for fault tolerance, but their usefulness in a single workstation is limited. RAID 5 and 10 are aimed at mission-critical situations where a server must keep running through a mechanical failure. RAID does not relieve you of the need to back up your files in addition to the redundancy. If I were in your situation, I would look at a RAID 0 of a pair of 500 GB drives with a regularly scheduled backup to a 1 TB drive.
The advantage of the design I suggest is that while it does not give you a live mirror of your data, it does provide regularly refreshed restore points. In the case of a malware infection, a RAID merely replicates the damage across the whole array, while a backup lets you roll everything back to the last clean restore point.
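The design above can be sketched as follows; the device names (`/dev/sdb`, `/dev/sdc`) and mount points (`/data`, `/backup`) are assumptions, not prescriptions:

```shell
# RAID 0 across the two 500 GB drives (~1 TB usable, no redundancy):
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc
mkfs.ext4 /dev/md0
mount /dev/md0 /data

# Scheduled backup to the separate 1 TB drive mounted at /backup,
# e.g. run nightly from /etc/cron.daily/ or a crontab entry:
rsync -a --delete /data/ /backup/data/
```

The `--delete` flag keeps the backup an exact copy of the source; drop it (or use rsync's `--backup` options) if you want deleted files to linger in the backup as extra protection against accidental deletion.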