Trying to get my head around RAID5

December 13, 2006 1:12:06 PM

Okay, I'm looking at buying a basic server for my office, it will probably be a Dell since our company has a business relationship with them.

I'm not too hot on server technology. The Dell advisor has suggested providing a system running RAID5 with three 80Gb disks.

Now from what I understand from his explanation this will give me 160Gb of storage (2 of the disks performing RAID0) but with hardware fault tolerance too (the 3rd disk performing RAID1). Is this correct?

I'm just a bit uncertain of how the 3rd disk comes in to it since it is only 80Gb while the disks it is mirroring amount to 160Gb :?:

Many thanks for your help!

Graham

December 13, 2006 1:22:47 PM

First a link

http://www.acnc.com/04_01_05.html

Now the explanation.

The 3rd drive does not "mirror" the other 2. Parity is calculated from the data stored at the same position on the other drives, and that parity information is written along with the actual data. Because the parity information is distributed across all the drives, this is RAID 5. RAID 3 would keep all the parity info on just one drive.
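
To picture what "distributed across all the drives" means, here is a minimal Python sketch. The rotation rule in `parity_drive` is an assumption chosen for illustration (real controllers use several different layouts), and both function names are made up for this example.

```python
def parity_drive(stripe: int, n_drives: int) -> int:
    """Drive index that holds parity for a stripe (hypothetical rotation rule)."""
    return (n_drives - 1 - stripe) % n_drives


def layout(n_stripes: int, n_drives: int) -> list[list[str]]:
    """One row per stripe: 'P' marks the parity drive, 'D' marks data drives."""
    rows = []
    for stripe in range(n_stripes):
        p = parity_drive(stripe, n_drives)
        rows.append(["P" if drive == p else "D" for drive in range(n_drives)])
    return rows


# Three drives, three stripes: the parity block moves to a different drive each stripe.
for row in layout(3, 3):
    print(" ".join(row))
```

With a dedicated parity drive (the RAID 3 style), the "P" would sit in the same column on every row instead of moving around.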
December 13, 2006 1:45:37 PM

Hmm, okay, I think I need a more fundamental explanation first, such as what is parity? :lol: 

I don't get what's going on there really. It looks like disk striping which is easy enough to understand, but it doesn't seem to give any form of hardware redundancy, ie, in the event of a hard disk failure you're screwed!

What does parity do exactly?
December 13, 2006 2:00:16 PM

Ok, lemme see if I can do this from my instructor days.

Let's use data with only 2 bits. This gives us 4 possible combinations. They are:

00
01
10
11

Parity would perform an Exclusive OR operation which will output a 1 only if the inputs are different. If we add the parity information, we get:

00 0
01 1
10 1
11 0

Now let's assume that each column is a single drive. If the first drive (column) fails it looks like this:

?0 0
?1 1
?0 1
?1 0

Notice that there are still 4 different combinations, although they are different from the first example. If we replace the first drive (column) with a working one, we can reverse the Exclusive OR operation and rebuild the data. It is done like this:

Remember that we only get a 1 for parity if the data bits are different, so

?0 0 We have a 0 for parity so the data bits are the same (00 0)

?1 1 We have a 1 for parity so the data bits are different (01 1)

?0 1 We have a 1 for parity so the data bits are different (10 1)

?1 0 We have a 0 for parity so the data bits are the same (11 0)
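
The same table can be worked through in a few lines of Python. This is only a sketch of the idea on single bits; the function names `parity` and `rebuild` are made up for the example, and a real controller does this on whole blocks of data.

```python
def parity(a: int, b: int) -> int:
    """Exclusive OR of two data bits: 1 only if the bits differ."""
    return a ^ b


def rebuild(surviving_bit: int, parity_bit: int) -> int:
    """Recover the lost data bit from the surviving data bit and the parity bit."""
    return surviving_bit ^ parity_bit


# Walk through all four combinations, lose the first drive, and rebuild it.
for a in (0, 1):
    for b in (0, 1):
        p = parity(a, b)
        recovered = rebuild(b, p)
        print(f"{a}{b} {p}  ->  ?{b} {p}  ->  rebuilt as {recovered}{b} {p}")
```

The rebuild works because XOR is its own inverse: XORing the parity with the surviving bit gives back the missing one.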

Hope this helps
December 13, 2006 4:29:05 PM

That helps a lot thanks! :D  Okay, so RAID5 does offer hardware fault tolerance, so why have I read that one disadvantage to RAID5 is that it is difficult to rebuild the array in the event of a hard drive failure?
December 13, 2006 4:47:48 PM

Not to underestimate DELL's ability to upsell - and now that you have a better feeling for RAID-5, the question you should ask yourself is whether 160 Gig of storage space is sufficient for what you need.

Down the road - if you had to get additional storage in this configuration - you would likely have to add another 80Gig HD, or get a complete set of larger hard drives. So plan accordingly.

If you do need to add storage down the road - you will need to first backup the RAID, then destroy the array definition, then re-create the array definition with the additional or replacement drives. Then restore the data to that array. It is not fun - but it is possible.

Just my 2 cents - take it for what it's worth.

Cheers.
December 13, 2006 5:01:02 PM

RAID 5 also works with 4 drives, which allows you to swap 1 in the event of failure, also allows you to change and rebuild while live.
least thats what i remember
December 13, 2006 5:07:51 PM

Quote:
RAID 5 also works with 4 drives, which allows you to swap 1 in the event of failure, also allows you to change and rebuild while live.
least thats what i remember


RAID 5 works with any number of drives from 3 upward, and works exactly the same with more drives. I believe what you're talking about is just adding a spare drive.
Love your signature BTW. Good thing we have warnings though or you might not be around...
December 13, 2006 5:18:02 PM

One good feature of RAID 5 is that it doesn't matter which drive blows out... you have enough info to re-create it from the other two.

Another way to think of parity is that you add whatever bit is necessary to make the sum of all three an even number.

so if your three drives have the following (where X is the unknown bit on a broken drive)...
0X0 then the missing bit is a zero (0+0+0 = an even number)
10X then the missing bit is a one (1+0+1 = an even number)
X11 then the missing bit is a zero (0+1+1 = an even number)
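
The "make the sum even" rule is easy to try out. A small Python sketch, with the helper names `missing_bit` and `fill` invented for the example:

```python
def missing_bit(known_bits: list[int]) -> int:
    """Bit that must be added so the total of all bits is an even number."""
    return sum(known_bits) % 2


def fill(stripe: str) -> str:
    """Replace the single 'X' (the failed drive) with the bit that restores even parity."""
    known = [int(ch) for ch in stripe if ch != "X"]
    return stripe.replace("X", str(missing_bit(known)))


# It does not matter which of the three positions is the broken drive.
for stripe in ("0X0", "10X", "X11"):
    print(stripe, "->", fill(stripe))
```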

Also, as per the above post, take a look at one of these solutions:
http://www.extremetech.com/article2/0,1697,2018456,00.a...

You can talk to Dell about their PowerVault.

The idea here is to take the RAID functions out of the physical box of the server, and onto an easily accessible rack. So you can swap drives in and out easily enough. Then you can hook up any old cheap computer to it and you have a nice file server... and if the cheap computer dies, you can replace it w/ another cheap computer without having to pull out all the hard drives and whatnot. Just swap in a spare computer, or make your repairs and re-connect it.
December 13, 2006 7:36:04 PM

lol....bitch....
guess i had that coming
December 13, 2006 7:58:20 PM

RAID 5 can be done with software or hardware. A software example would be Windows Server 2003. Once loaded onto a computer, it can take the free space on 3 drives and create a RAID 5 array. If a drive fails, Windows has to use the horsepower of the CPUs to rebuild the array.

A hardware solution would be a controller card (hence the THG article that brought us all together) which has an onboard processor to rebuild the lost data.

Notice with the Windows RAID 5, the OS has to be loaded first before you can create a RAID 5 array using Disk Management, hence you cannot have the OS installed on said RAID 5 array. The hardware solution doesn't have that limitation.

Having an extra drive in case of drive failure is normally called a hot spare. If a drive fails, the RAID 5 array will begin to rebuild itself on the hot spare without any user interaction. If you were to lose a second drive before the array is rebuilt, all data will be lost. In mission critical situations, I recommend RAID 6, which uses two drives' worth of space for parity. You can lose 2 drives, and still have your data.

There have been many references to Dell in this forum also. You can build a much more reliable server for much less $ than Dell can provide, if you know what you are doing, and have the patience to do your homework and work out any driver and software issues. Redundancy in hard drives doesn't do a bit of good if you only have one power supply and it fails.

Just my 2 cents.
December 14, 2006 2:04:33 AM

Quote:
lol....bitch....
guess i had that coming

'I'm not saying that stupid people should be executed but...take the safety lable off and let the problem solve its self'


just a side note... when you start calling people stupid...
make sure you spell it right...
it is label not lable...

Just so that you know I kan't spell well either, for that there is spelling checker.

enough thread hijacking...
getting 3 80 gig drives for raid 5 is good... but why 80 gig...
i run 4 320 gig HDs in raid 5, which gives me 894 GB. I can have any 1 HD fail and lose nothing... and for the cost of them... they are pretty cheap...
if you insist on getting a dell machine... buy the system with minimal ram and just the raid 5 controller... then buy more ram and the hard drives and put them in yourself.. and save yourself some $$$
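
For anyone wondering where 894 comes from: a rough Python check, assuming the drives are sold in decimal gigabytes (10^9 bytes) while the operating system reports binary units (2^30 bytes). That assumption is mine, not something stated in the post, and the variable names are made up for the example.

```python
DRIVE_BYTES = 320 * 10**9   # one "320 gig" drive, assuming decimal gigabytes
N_DRIVES = 4

# RAID 5 gives up one drive's worth of space to parity.
usable_bytes = (N_DRIVES - 1) * DRIVE_BYTES

decimal_gb = usable_bytes / 10**9   # what the drive labels add up to
binary_gib = usable_bytes / 2**30   # what an OS counting in 2**30-byte units shows

print(f"{decimal_gb:.0f} GB on the label, about {int(binary_gib)} GB as reported")
```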

my 2.5¢ worth
December 14, 2006 2:27:20 AM

i bow to you...
guess i really should read...
December 14, 2006 2:38:38 AM

Don't do software RAID. Buy a good controller card. Software RAID 5 is slow and kinda sketchy. Recommend four drives too with RAID 6
December 14, 2006 8:56:19 AM

Hi Greyknight,

We've got loads of Dell servers at my workplace running RAID1 and RAID5 setups so I feel I'm in a good position to make a recommendation to you. :) 

What is the server going to be used for? You don't seem to need that much disk space from what you are saying. A simple RAID1 setup with 2 160GB disks will give you exactly the same fault tolerance and usable disk space as a RAID5 setup with 3 80GB disks, you might as well just go for the cheaper option.

Personally I'd go for RAID1 because it is a simpler setup but then I don't have to worry about the price difference, your circumstances may be different.
December 14, 2006 9:31:57 AM

Hi. The 80Gb drives had me worried anyway, I agree they're not big enough! The server is going to be a file server first and foremost, it will store an 80Gb MS Access contacts database and all our Invoice and customer contact information, most of which is MS Word files. It will also be used eventually to host Sage Line 50 and Act which we plan to start using some time next year.

RAID1 was my first choice before Dell started recommending RAID5, I may just go back to them and say I want a couple of big disks running RAID1...
December 14, 2006 10:32:19 AM

Yeah I would, they're probably just after some extra cash. :) 
December 14, 2006 11:06:19 AM

Quote:
Hi. The 80Gb drives had me worried anyway, I agree they're not big enough! The server is going to be a file server first and foremost, it will store an 80Gb MS Access contacts database and all our Invoice and customer contact information, most of which is MS Word files. It will also be used eventually to host Sage Line 50 and Act which we plan to start using some time next year.

RAID1 was my first choice before Dell started recommending RAID5, I may just go back to them and say I want a couple of big disks running RAID1...


Well, RAID 5 offers faster read speed than RAID 1, and in some controllers, you can even add drives to the array and expand them. Personally, I go with RAID 5 on all of my servers, if at all possible. I just like the ability to be able to expand if need be, without the need for reloading everything.

Additionally, the only extra cost is that of the extra disk. Most likely, the onboard controller will do RAID 1 and 5, so you are only talking an extra couple hundred dollars or so. Much more convenient, if you ask me, to buy that extra disk.
December 14, 2006 11:06:40 AM

you could use 2 160GB drives using raid 1 = 160 GB usable space
if this isn't enough after a while, add another 160 GB and place them in a raid 5; so you get 320 GB and you keep the safety.
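
The capacity numbers that keep coming up in this thread follow a simple rule of thumb, sketched below in Python. The function name `usable_gb` is made up, the RAID 1 case only models a two-drive mirror, and it assumes all drives in an array are the same size.

```python
def usable_gb(level: int, n_drives: int, drive_gb: int) -> int:
    """Usable space for equal-sized drives (a sketch, not a controller's exact figure)."""
    if level == 1:
        if n_drives != 2:
            raise ValueError("this sketch only models a two-drive mirror")
        return drive_gb                      # second drive is a full copy
    if level == 5:
        if n_drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (n_drives - 1) * drive_gb     # one drive's worth of parity
    if level == 6:
        if n_drives < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        return (n_drives - 2) * drive_gb     # two drives' worth of parity
    raise ValueError("level not covered by this sketch")


print("RAID 1, 2 x 160 GB:", usable_gb(1, 2, 160), "GB")
print("RAID 5, 3 x 80 GB: ", usable_gb(5, 3, 80), "GB")
print("RAID 5, 3 x 160 GB:", usable_gb(5, 3, 160), "GB")
```

All three setups survive the loss of any single drive; what changes is how much space you get for the number of disks you buy.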
December 14, 2006 11:58:16 AM

Due to my company, I've been using Dell since the beginning. They are getting better, but their RAID software/hardware leaves a lot to be desired in its setup compared to other manufacturers that I have also used. I recommend a simple mirror setup unless you really understand RAID 5 and have used it several times in a disaster scenario. The last thing you need is to have to figure out how the setup configuration works while you have data that needs to be recovered and that could be lost through a simple incorrect choice during recovery.
December 14, 2006 11:58:45 AM

If you are running large databases on a file server, you want something fast, especially if the information is going to be shared with multiple employees.

While the Dell guy may be trying to upsell you, he is probably just giving you advice that he has learned from dealing with companies in similar situations as yours. Most companies do go with RAID 5 or 6 in the long run. It keeps your data protected and is efficient. RAID 6 keeps you up and running 24/7.
December 14, 2006 12:39:18 PM

The fact that the server will be running a database is a key point.

The question remains as to what load the machine is going to bear. Even a database server doesn't need RAID 5 if it is only going to be used by a handful of simultaneous users. There is also a question of how much performance gain you'll get using RAID 5 with 3 disks. Sure, it's going to be more than 2 disks in RAID 1 but not twice as fast.

However, the fact that this server is going to host a database is a point I glossed over earlier and it does change my stance. With a database server you are better safe than sorry, RAID 5 is the safer option here, but if performance is a key issue you may want more than 3 disks.
December 14, 2006 10:44:02 PM

Quote:
I recommend a simple mirror setup unless if you really understand RAID 5 and have used it several times in a disaster scenario. The last thing you need is to have to figure out how the setup configuration works and you have data that needs to be recovered that could be lost through a simple incorrect choice during recovery.


What's there to understand? Even the older Dells I have (retiring soon, thankfully), if the drive fails, hot swap it and it rebuilds automatically. There's nothing harder about RAID 5 versus RAID 1. It's a selection in the array controller, that's it. I'm not really sure why anyone would recommend RAID 1 due to RAID 5 being harder to recover. It's not harder to recover at all.

Also, with RAID 1, you don't get the advantage of multiple spindles reading data, unless they have changed the way they read recently. RAID 1 offers no read benefit over a single disk, since the data is not spread over the disks as in RAID0/5/etc.
December 14, 2006 10:46:16 PM

Quote:
The fact that the server will be running a database is a key point.

The question remains as to what load the machine is going to bear. Even a database server doesn't need RAID 5 if it is only going to be used by a handful of simultaneous users. There is also a question of how much performance gain you'll get using RAID 5 with 3 disks. Sure, it's going to be more than 2 disks in RAID 1 but not twice as fast.

However, the fact that this server is going to host a database is a point I glossed over earlier and it does change my stance. With a database server you are better safe than sorry, RAID 5 is the safer option here, but if performance is a key issue you may want more than 3 disks.


Bah, it's an Access database, so I wouldn't go overboard. No way to fine tune it for performance. :) 
December 14, 2006 11:15:22 PM

Quote:
Also, with RAID 1, you don't get the advantage of multiple spindles reading data, unless they have changed the way they read recently. RAID 1 offers no read benefit over a single disk, since the data is not spread over the disks as in RAID0/5/etc.


RAID 1 on modern enterprise-level controllers will read faster than a single disk. The controller will intelligently distribute the read commands over both disks, taking advantage of the fact that there are 2 copies of the data. (Typically this results in reads that are faster than single drive, but not as fast as RAID-0 or RAID-5 where there are synchronous blocks on separate drives). Writes on RAID-1 are always the same speed as a single disk, however.

RAID 1 might be chosen over RAID 5 in certain situations. For servers that don't need a massive amount of storage space (i.e. web servers, some database servers, servers that get their storage from a SAN), the boot drive (maybe with a data partition) is typically the only drive in the machine. Typically, the server manufacturer's entry-level RAID option will do RAID 1 just fine, and you may not want to pay for the upper level RAID controller since you don't need RAID 5. In these cases, RAID 1 for the boot drive works very well.

Further, many of these kinds of servers are only 1U rack unit high. Many servers in that form factor won't hold 3 drives (minimum required for RAID 5), but most will hold 2 (perfect for RAID 1).

And, the general rule of thumb is to separate boot drives from data drives, either via partitioning or separate physical drives. All of my servers that I manage have a RAID 1 array for the boot drive, and (if necessary) a separate RAID 5 array for the data (or they get their data storage space from a SAN).

This mechanism allows me maximum redundancy and maximum flexibility for my servers, while saving rack space and money.
December 14, 2006 11:44:36 PM

Quote:
it will store an 80Gb MS Access contacts database

Does anyone else think this might be a bad idea?

MS Access doesn't have the same quality administration tools, backup utilities, access control, transaction control and other features as a full-fledged database - even one as "simple" as SQL Server or as inexpensive as MySQL or PostgreSQL.

I know Access has gotten much better over the years, but I guess I still reel from the early days of database corruption, multi-user lockouts, accidental data erasure and other problems that can be more effectively prevented or protected against with an RDBMS.

FWIW, I would go with 2 x 250 GB drives in RAID 1. You can often get them for ~$60 each online from places such as tigerdirect, and they would provide you with lots of space to spare. Don't forget that you will need to defragment your drives from time to time, so designing too close to your storage needs will make that slow or even impossible. Often, the larger drives will have more platters than their smaller cousins, which means that they will be able to read and write data faster.

RAID 5 does not buy you better redundancy than RAID 1. In either case, if more than one drive fails you lose your data. In fact, with three drives you have a lower MTBF for the collection than you would have for just two. RAID 5 will also incur a write-speed hit. That may not be an issue in your environment, however. Finally, RAID 5 can lose data if you have a failure in the midst of a data write - i.e. if the data hasn't been fully updated on all the disks yet but has on some. This is a problem if you don't have battery backup, and enable write back caching on a per-disk basis. When that happens, your RAID array will rebuild, and your disk access will be slower until that finishes. RAID 1 doesn't exhibit the same rebuild slowdown.
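
To see where the write-speed hit and the mid-write risk come from, here is a rough Python sketch of a small RAID 5 write done as a read-modify-write. This is one common way to describe it, not a claim about how any particular controller works; the function names and the sample bytes are made up for the example.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-by-byte exclusive OR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity_after_small_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """New parity = old parity XOR old data XOR new data."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


# One stripe on a 3-drive array: two data blocks plus their parity block.
d0 = b"\x0f\xf0"
d1 = b"\x33\x55"
p = xor_bytes(d0, d1)

# Changing only d0 means: read old d0, read old parity, write new d0, write new parity.
new_d0 = b"\xaa\x01"
new_p = parity_after_small_write(d0, p, new_d0)

print("parity still consistent:", new_p == xor_bytes(new_d0, d1))
# If power fails after new_d0 is written but before new_p is, the stripe is left
# with stale parity, which is the mid-write problem described above.
print("stale parity would match:", p == xor_bytes(new_d0, d1))
```

A mirror only has to write the same block twice, which is why RAID 1 avoids the extra reads.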

Don't get me wrong: In certain scenarios, RAID 5 is a far better solution than RAID 1. I use it at home for a picture and song library because of the write-once-read-many characteristics which emphasize RAID 5's strengths of efficiency, read-speed and redundancy / recoverability. But I wouldn't choose it lightly, or without a full understanding of what it's good for and what it's not.
December 15, 2006 6:54:02 AM

Quote:
Bah, it's an Access database, so I wouldn't go overboard. No way to fine tune it for performance. :) 


True, but 80GB is rather large for an Access database and simultaneous use and load is still the key factor.

Quote:
And, the general rule of thumb is to separate boot drives from data drives, either via partitioning or separate physical drives. All of my servers that I manage have a RAID 1 array for the boot drive, and (if necessary) a separate RAID 5 array for the data (or they get their data storage space from a SAN).


I agree with having separate partitions for OS and data but disagree with the need to have the OS on a separate disk or RAID array. If you use the RAID 1 disks to add another drive and a hot spare to the RAID 5 array this will give you maximum redundancy for the server as a whole, you'd have to lose 3 disks of the RAID 5 array before you lose the filesystem.
December 15, 2006 2:30:33 PM

Quote:
it will store an 80Gb MS Access contacts database

Does anyone else think this might be a bad idea?

MS Access doesn't have the same quality administration tools, backup utilities, access control, transaction control and other features as a full-fledged database - even one as "simple" as SQL Server or as inexpensive as MySQL or PostgreSQL.

Well that was going to be my response. Having an 80GB Access database is asking for trouble. Anything over a few hundred MB, in my opinion, should be moved out of Access.

No matter how fast the disks are, Access will be the bottleneck.
December 15, 2006 11:42:11 PM

I thought the maximum size of an Access database is 2 GB.

I have a 100 MB Access DB on a RAID 0+1 with a hotspare w/ ~10 users. I use the OS on one partition with the DB and all other data on a 2nd partition. The only time it messes up is when the application makes the error.

Also, I don't have the luxury of choosing what database to run, and I doubt a majority of IT for SMB do either. My job is just to make sure the servers have maximum uptime and a solid network structure for everything.

Next year we are replacing the servers and will be switching to a RAID 6 using 7 drives + 1 Hot Spare. They are 2 years old now and will retire at 3.
December 16, 2006 11:50:39 AM

You're probably right, but some applications use .MDB files for their databases, but the application is not Access. Either that, or someone meant MB instead of GB.

Is your server doing anything else besides the Access database? I surely hope so as RAID 0+1 is overkill for that.

As for RAID 6, I have yet to even try it. No real need for it yet and I'm not planning on staying in the hardware space much longer.