Server data protection: so many choices

jedi940

Distinguished
Mar 11, 2007
762
0
19,010
I am trying to get my HTPC up and running and I have spent many hours sifting through the various threads here looking at my options for storage. So far, I have been using a Rosewill RC-209 RAID card and 2x250GB drives in RAID 1. Those recently became full, and now I am trying to add movies to the mix. I bought two 1TB drives to add in RAID 1, but my card hangs while trying to detect them. I have already initialized and formatted them in Windows, so I know they are good. Since I need to change the way my server is set up, I thought I would invest in a more permanent and expandable solution.

I liked the hardware option because I had data redundancy which would protect in case of a failure and with the RAID card, I could move the drives around with little hassle. I just needed to install the drivers for the card and all my data was instantly accessible. I have had the card in three different computers so far as I continually upgrade computers.

I will be storing everything on this server: media files (movies, music, TV) as well as family pictures and things for work that I cannot lose. I need whatever I choose to be stable and able to withstand a drive failure. I have been reading about unRAID, FlexRAID and RAID 6.

My biggest concern is the future. When I add drives to unRAID, will I be able to seamlessly move the parity drive to a larger one without hassle, as the drives I add will continually get bigger? FlexRAID seems too easy to bring down since it has no protection against a gradually failing drive. Is RAID 6 a good option? If I choose RAID 6, can I easily add drives to the logical volume? I would like to not create a new drive letter every time I run out of space. Also, what about migration? In the future, if SATA becomes obsolete, or the limits of my motherboard/processor limit the expansion of my system and I have to upgrade, will I be able to move my storage to the new system and retain my data with any of these options?

There are just so many different options, I'm not sure which is the best for my application.
 
First of all, I hope that you're not relying just on RAID to protect your data. You also need offline backups, and for the least risk at least one of them should be stored offsite. There are way too many risks to your data that RAID just can't protect against for you to rely on it as the sole means of defense.

In my experience, it's simpler if you treat each and every RAID volume you create as if it were a physical disk drive. With standard, non-RAID disks, when you want to add capacity you'd never consider opening up the disk and bolting another platter and heads onto it - you'd simply buy a new, larger disk and copy your data to it. If you treat your RAID sets that way then you eliminate a whole lot of complication and risks, IMHO.

Having external backups helps because it makes it possible to use them as a low-risk way to migrate to a larger/different RAID subsystem. First you back up all your data (preferably twice). Next, you reconfigure your RAID array - it doesn't matter whether you're adding drives to the same system or moving to a new firmware/software version or to new hardware altogether. Don't even worry about whether you can do this while preserving your data - you've got that stuff safely ensconced away on your backup disk(s). When you've got the new RAID all set up, just go ahead and restore the data.

Alternatively, and even easier assuming you can afford to have the old and new RAID configurations running at the same time, just set the new one up and then copy from the old to the new. Don't decommission the old one until you're satisfied that the new one is working and you've run at least one backup from it.
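One way to get "satisfied that the new one is working" is to compare the copy against the original before retiring the old array. Below is a minimal Python sketch of that check, not a finished tool: the names `file_digest` and `compare_trees` are made up for illustration, and it assumes both volumes are mounted as ordinary folders you can read.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(old_root: Path, new_root: Path) -> list[str]:
    """List files under old_root that are missing or different under new_root."""
    problems = []
    for old_file in (p for p in old_root.rglob("*") if p.is_file()):
        relative = old_file.relative_to(old_root)
        new_file = new_root / relative
        if not new_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif file_digest(old_file) != file_digest(new_file):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

An empty list from `compare_trees(Path("D:/"), Path("E:/"))` (drive letters here are only placeholders) would mean every file on the old volume has a byte-identical twin on the new one. It says nothing about files that exist only on the new volume.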

There are a lot of things that can go wrong with RAID, and the chances of a screwup increase exponentially when you're doing one-of-a-kind things you have no experience with, like a RAID upgrade. If your data is important then you need a backup anyway, and IMHO you should just use that to make the whole process simpler.
 

jedi940

The biggest thing is cost right now, so I am trying to find the most efficient way to do all of this. Down time is not really that big of an issue, so I guess I could get rid of RAID altogether. I have 2x250GB and 2x1TB drives, so I could just split them up. The problem is that I have no means of offline backup right now. I could put the two drives in separate computers for now, but I would need an automated backup because I WILL forget to do regular backups. Eventually I will probably create an unRAID server for my media only, for my HTPC. Also, down the road, I would like to be able to set up a VPN so I can access my data from anywhere.

The budget right now is about $300, not including the 2x1TB drives I already have.
 
The biggest thing that redundant RAID buys you is zero downtime to recover from a drive failure. But it's useless at protecting your data from all the other risks.

If down time isn't a big issue then you're better off focusing your efforts on implementing a backup scheme than on how to upgrade your RAID system. If you've got two 1TB drives with a copy of the data on both, that data is a lot safer if the 2nd drive is offline (and preferably offsite) than if it's an internal, real-time mirror of the 1st copy. For example: with a RAID setup, one slip of the finger instantly deletes BOTH copies of your data.

The cost is the same, it's just a matter of getting your backup act together. Just my Humble Opinion... ;)
 

taso11

Distinguished
Aug 27, 2008
134
0
18,690
I agree with sminlal. The majority do not understand the purpose of a backup. You need to have multiple copies, the more the better, of your most critical data. The other thing you need to remember is that your backup should be protected from damage, meaning definitely not connected and running 24/7, and preferably able to be moved offsite. The purpose of a server and redundancy is to maximize availability of data and minimize downtime. So what I see you need to figure out is how many pictures, movies and documents you have, and then how much of that you actually need backed up. Then you'll know how many gigabytes you'll need for your stuff and how many gigabytes you'll need for your backups. Remember that movies and music, although time consuming, can be re-ripped. Once you've figured that out, we can recommend how to use what you have and see if you need anything else.
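The sizing exercise above is simple arithmetic, and a rough Python sketch of it might look like the following. Everything in it is an assumption for illustration: the function names `tree_size_bytes` and `plan_capacity` are invented, the category sizes are made-up numbers, and keeping two backup copies is just one possible choice.

```python
from pathlib import Path


def tree_size_bytes(root: Path) -> int:
    """Add up the sizes of all regular files under a folder."""
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def plan_capacity(sizes_gb: dict[str, float], critical: set[str],
                  backup_copies: int = 2) -> tuple[float, float]:
    """Return (GB needed for everything, GB needed for backups of critical data)."""
    primary_gb = sum(sizes_gb.values())
    critical_gb = sum(size for name, size in sizes_gb.items() if name in critical)
    return primary_gb, backup_copies * critical_gb


# Made-up example figures, in gigabytes. Movies and music are left out of the
# backup set here because they can be re-ripped from the original disks.
sizes_gb = {"movies": 600.0, "music": 80.0, "pictures": 40.0, "documents": 5.0}
primary_gb, backup_gb = plan_capacity(sizes_gb, critical={"pictures", "documents"})
print(f"primary storage: {primary_gb} GB, backup storage: {backup_gb} GB")
```

In practice the numbers in `sizes_gb` would come from running `tree_size_bytes` on each real folder and dividing by 1e9, rather than from guesses like the ones shown.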

Honestly, I too just recently realized what a backup really is. So now I'm almost finished reconfiguring my equipment as one media server for music and movies, and another server for important stuff like pictures, purchased software and keys, documents and tax stuff. These servers are 8 hard drives in RAID 6 (created before my backup epiphany!) and as energy efficient as I could make them. They are only on when I need them. A Windows Home Server box will be my backup solution. This box is almost done. It will take all my orphaned hard drives, 16 of them actually. They don't have to be the same model or size. Drive Extender just takes care of everything. I have read it can be set to turn on automatically, perform its backup of my important document server, and then shut down.

I know, I haven't figured out an offsite plan yet. I don't like the idea of backing up over the internet. My data is too far away! lol Hopefully my thinking makes sense and I didn't bore you too much. I'm still learning too! :)
 

jedi940

I guess there is no point in using RAID then. If I put the two hard drives in separate computers and back up regularly, then I don't need it. If a hard drive fails, I can just link to the network folder for the time being and avoid any down time. Then, when I can afford it, I can figure out some sort of offline backup. None of it will be offsite though. Just not practical yet. I think I'll end up using unRAID or RAID 6 for my HTPC back end. It won't be the end of the world if I lose it because I have the disks, it just won't be fun :)
 

Kewlx25

Distinguished
A good RAID 6 controller will let you expand. Also, most decent controllers support a "hot spare", so if one drive is detected to be faulty, the controller will immediately begin rebuilding that drive's contents onto the spare.

You must also be careful about "soft" errors. A bit error could be introduced anywhere, including your CPU, L1/L2/L3 cache, RAM, etc. Also, hard drives can "mix up" data over time.

In my more expensive "home setup", I would have a three server clustered file system running on ZFS and I would then share this out through Windows 2008.

Interesting note about ECC memory. I was looking up info about different clustered file systems and one of the large GNU ones had an interesting note about ECC memory. They said their file system keeps hashes of the data blocks to make sure the data hasn't changed. When using non-ECC memory in their servers, every few terabytes of data, one of the servers would report bad data. They had a hard time tracking down the issue, but in the end, switching to ECC memory completely removed these errors. It seems that every so often, a memory error would occur and cause a single bit to change in the replicated data.

Over time, files being copied back and forth between computers can become corrupted, and over time data degrades on any media.
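To give a feel for how per-block hashes catch a single flipped bit, here is a small Python sketch. It is not how that file system (or any particular one) actually does it: the 4096-byte block size, the use of SHA-256, and the names `block_hashes` and `changed_blocks` are all assumptions picked for the example.

```python
import hashlib
from itertools import zip_longest

BLOCK_SIZE = 4096  # assumed block size, chosen only for this example


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash each fixed-size block of the data separately."""
    return [
        hashlib.sha256(data[start:start + block_size]).hexdigest()
        for start in range(0, len(data), block_size)
    ]


def changed_blocks(stored: list[str], data: bytes,
                   block_size: int = BLOCK_SIZE) -> list[int]:
    """Return indexes of blocks whose hash no longer matches the stored one."""
    current = block_hashes(data, block_size)
    # zip_longest also flags blocks that were added or lost entirely.
    return [
        index
        for index, (old, new) in enumerate(zip_longest(stored, current))
        if old != new
    ]


original = bytes(10000)          # 10,000 zero bytes standing in for a file
stored = block_hashes(original)  # hashes recorded when the data was written

damaged = bytearray(original)
damaged[5000] ^= 0x01            # flip one bit, as a memory error might
print(changed_blocks(stored, bytes(damaged)))  # prints [1]
```

The hashes only tell you which block went bad. Repairing it still needs a good copy from somewhere else, which is where replication or a backup comes in.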
 

taso11

I guess ECC memory will be my next upgrade to research. I need to see if it's usable with the Gigabyte MA78G-DS3HP motherboards in my two file servers. I guess I'm safe for now, since I mainly read data off them and only occasionally write a movie or music rip, so I'm not constantly modifying the data in a way that would increase the chance of corruption.

I started out with Areca as my first hardware RAID controllers, two 1220s to be exact. I've never had any problems with them and they are fast.
 
Interesting story about the memory, thanks for sharing!

I still find it rather scandalous that desktop systems aren't considered "worthy" of ECC memory. Memory prices are dirt cheap these days, and EVERY other component in the computer system has at least error detection circuitry to detect problems. Even the caches use parity to protect against random bit changes. But you have to go to a "workstation" class CPU to get ECC support for main memory.
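For anyone wondering what "parity" buys you, here is a toy Python sketch of a single even-parity bit guarding one byte. It is only an illustration of the idea, not how any specific cache or memory controller is built, and the function names `parity_bit`, `store_with_parity` and `is_consistent` are invented for the example.

```python
def parity_bit(value: int) -> int:
    """Even parity: 1 if the value has an odd number of 1 bits, else 0."""
    return bin(value).count("1") % 2


def store_with_parity(value: int) -> tuple[int, int]:
    """Keep the parity bit alongside the value, as parity-protected storage would."""
    return value, parity_bit(value)


def is_consistent(value: int, stored_parity: int) -> bool:
    """Check the value read back against the parity bit stored with it."""
    return parity_bit(value) == stored_parity


value, parity = store_with_parity(0b10110010)

one_flip = value ^ 0b00001000   # a single bit changes
two_flips = value ^ 0b00011000  # two bits change at once

print(is_consistent(value, parity))      # True: nothing changed
print(is_consistent(one_flip, parity))   # False: the single flip is detected
print(is_consistent(two_flips, parity))  # True: a double flip slips through
```

Plain parity can only detect an odd number of flipped bits and cannot say which bit is wrong. ECC memory stores extra check bits so that a single-bit error can be corrected rather than just noticed.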

Computers are useful because they're fast and accurate. A solid computer works, or tells you when there's a problem. ECC memory is a must-have to meet those basic requirements, IMHO.
 

Kewlx25

Every part of the system has a chance of introducing errors, but main memory seems to be the largest offender. The only other problem with ECC is that every block of memory has to be verified when it is copied, which adds latency, and it takes more parts, which increases cost.

But I can also see that with modern processors, with all the heavy threading, multiple cores and prefetching, added latency doesn't hurt the way it used to. I would also like to see ECC become standard.