
Linux Server Storage Setup

October 26, 2012 6:04:16 PM

Hey Everyone,

Basically, I'm going to be building a Linux-based home office server running Debian Squeeze. At the moment I'm in a battle with myself over how to back up my data: either RAID 1, or software like "rsync" and "cron" or similar.

(If you'd like to know more and have comments, check out my thread http://www.tomshardware.com/forum/forum2.php?config=tom...)

First, I thought I'd be doing RAID 1, based on most of the chats I've had. So I went out and bought two 2TB Western Digital Black HDDs for like $320, because they are basically the only hard drives suggested for consumer-based RAID projects. They also have a better warranty than most other drives (5 years).

Now it looks like I'm being told RAID is not a good solution for doing any sort of backups. If this is the case, I'm wondering if I should return the Blacks and get something a bit cheaper. Micro Center has 2TB Seagate Barracudas (I believe) for like $100 a drive. So, buying two of these instead of the Blacks would save me a good $100.

I have also been looking at the new WD Red series hard drives, which run about $120 a drive. I don't know much about these yet though, besides that they are built and tested for consumer RAID setups. When I brought them up to the Micro Center techs, they still said the Blacks will be better for a RAID setup as they are supposedly better quality still.

Should I keep the WD Blacks, or downgrade to a cheaper hard drive if I do NOT end up going the route of RAID 1?


Thanks to all who comment!
October 27, 2012 5:07:18 PM

RAID is simply not for backups, security, or data redundancy of any kind. It's specifically designed to keep a system running - think of it like having a spare drive, so if one dies, your computer doesn't crash. (Alternatively, if you play board games, it's perhaps akin to having an extra 'wound' - you're still alive if you take 1 hit :) )

That said, it has nothing to do with the redundancy of actual data (RAID 0 is a trivial case since it doesn't improve Uptime OR data redundancy).

Data backup involves creating a copy of the data (or an image or clone of the OS, or both, all depending on what you want backed up), and storing it on a DIFFERENT drive/system/site/network/planet to ensure that data loss in one place (an HDD dies) does not destroy the data itself, since you have a 'backup copy' somewhere else.

---

RAID for servers, again, is about keeping the server online no matter if a drive fails - Uptime for servers is extremely important in the enterprise environment, hence the use of RAID, despite its MANY problematic disadvantages.

RAID 1 (not my favorite flavor of RAID) DOES NOT DO what it seems to do on the surface. It DOES NOT simply clone the drive onto a second paired drive. It does something else.

RAID 1 sets up array meta-data on the drives, and every write is sent to each member of the array (the 2nd, 3rd, 4th, etc.) WITH the meta-data given to it by the RAID controller (either hardware or software).

So the RAID controller, if you like, is in charge - it is important to remember that RAID adds the complexity of a RAID controller to the equation of data storage, it's not just the disks that are involved anymore.

RAID 1 allows the volume mapped across the drives to survive 1 drive failure (presuming you have 2), i.e. if a drive fails, the system will switch to the other drive and alert you to replace the failed HDD. It will then rebuild the array onto the replacement.

HOWEVER, let's say your RAID controller dies, rather than the disks themselves. YOU WILL NOT (ceteris paribus) be able to simply grab a RAID 1 member HDD, plug it into a machine, and access the data. That data, as mentioned before, has been written to the drive by the RAID CONTROLLER with meta-data, offsets, etc., which can make it unreadable to anything but a compatible controller. (Linux software RAID via mdadm is the notable exception: its arrays can normally be assembled on another Linux machine.)

(There are sophisticated methods, and software tools, which in theory allow the reconstruction of data like this, but I personally have never seen them work. Then again, I'm not a sysadmin or anything, so perhaps they use them regularly and I just don't know.)

---

As for the prices and the specific drives - I find it unlikely that a home server will see any benefit from different HDD products from the manufacturers - yes some HDDs are 'enterprise class' and are designed to work well/better with RAID setups, but in terms of failure rates, etc., I doubt there will be a pronounced difference. Drives fail, period. Hence the RAID in the first place.

October 27, 2012 10:29:36 PM

(In reply to commissarmo's post above.)

Thanks for the breakdown... You are the first to mention what the controller writes to the disks, and I did not know that. I did know that if I used a controller card and it failed, I'd have to replace it with the exact same card. I'm not keen on that idea, because if it fails in a few years there is a chance the card will have been discontinued. Then what? lol.

After chatting with a few others and seeing your reply, I'll be doing periodic backups using software like rsync and cron or similar. I figured with this software Drive B can just hang out, and every few hours a script can mount Drive B, copy any new or changed files from Drive A, then unmount Drive B.
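
A minimal sketch of that mount, copy, unmount routine in Python, for illustration only. The paths /srv/data/, /dev/sdb1 and /mnt/backup are placeholders, not details from this thread, and the script assumes rsync is installed and that it runs as root (mount needs it). It sticks to subprocess.check_call so it also runs on the older Python that ships with Debian Squeeze.

```python
import subprocess

# Hypothetical paths; adjust them for the real system.
SOURCE_DIR = "/srv/data/"      # data on Drive A (trailing slash: copy the contents)
BACKUP_DEVICE = "/dev/sdb1"    # partition on Drive B
MOUNT_POINT = "/mnt/backup"    # where Drive B gets mounted


def build_commands(source_dir, device, mount_point):
    """Return the mount, rsync and umount commands as argument lists."""
    return [
        ["mount", device, mount_point],
        # -a (archive) preserves permissions, times and symlinks; rsync
        # skips files whose size and modification time are unchanged.
        ["rsync", "-a", source_dir, mount_point + "/"],
        ["umount", mount_point],
    ]


def run_backup(source_dir, device, mount_point, runner=subprocess.check_call):
    """Mount Drive B, copy changes from Drive A, then always unmount."""
    mount_cmd, rsync_cmd, umount_cmd = build_commands(
        source_dir, device, mount_point)
    runner(mount_cmd)
    try:
        runner(rsync_cmd)
    finally:
        # Unmount even if rsync fails, so Drive B is not left mounted.
        runner(umount_cmd)
```

To use it, add a final line calling run_backup(SOURCE_DIR, BACKUP_DEVICE, MOUNT_POINT) and schedule the script from root's crontab, for example `0 */3 * * * python /usr/local/bin/drive_b_backup.py` for a run every three hours (the script path is hypothetical). Without rsync's --delete option, files removed from Drive A stay on Drive B.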

Also, I have decided to stick with the WD Black drives... Figured I'd want to keep them for their quality so they will (hopefully) last me quite a while.
October 28, 2012 1:35:58 PM

It's certainly true that if the controller card dies, you would need to replace it with the identical card (I'm not actually sure whether 'identical' in this case means same firmware, but I can't imagine that would matter, since a firmware update shouldn't change the card's functionality).

I don't think the risk of discontinuation is much of a fear, but certainly you're correct in realizing that there are simply a lot more variables that could 'go wrong' with relying on RAID for anything other than uptime.

I like the low-level backup methods you've described (personally), but of course you can also look into the myriad software solutions out there - I'm sure there's some functionality in them you might find useful.