A RAID that just works - no matter what

Status
Not open for further replies.

ratbat

Distinguished
Mar 20, 2009
I have had a couple of years using desktop RAID and have come to the conclusion that this is a habit I have to lose!

When it works it works very well, but all too often I see that old “RAID Degrade” error message on boot-up and then have to do a whole lot of messing around to get it fixed (another machine went this morning after 6 months of good service). It is never a hardware failure; it is always driver issues, and it happens across different RAID levels (1, 5 and 10) and different machines. There has to be a better way, and I am turning to this forum for advice on what system I should implement.

Why do I bother with RAID? In my line of work (computer support) I see other people’s hard drives fail all the time. There is no way I am going to put my data onto a single drive, it has to be at the very least mirrored onto a second drive (and backed up in multiple other ways). The vision is that I want to be able to laugh at a disk failure and slap in a replacement without a care in the world and with no noticeable interruption to my computing experience. The trouble is, it never seems to work out that way.

I love the Drobo concept of the storage telling you about a broken drive: you slip in a replacement and it automatically puts all the data back where it belongs to make sure nothing ever gets lost, and expansion is easy. I have not gone that way for two reasons: I need something I can boot from, and I also worry about recovery. If the data is in a standard RAID configuration I can use products like Runtime Software's RAID Reconstructor (something I use in my line of work and know to be good) to get the data back if it all goes wrong. If it is a proprietary format like Drobo's, I am stuffed. To the best of my knowledge, there is no recovery software for its file format.

It is clear to me that what I need is a dedicated RAID controller doing proper hardware RAID. Something where there are no drivers for the OS to screw up. Something where the OS sees one big single drive and the controller sorts out all of the difficult stuff. But which one?

There are lots of good resources for comparing other PC components, but comparative reviews of RAID controllers seem few and far between. How do you go about choosing?

My principal criterion is that I need it to be bulletproof. I want the failure of any one drive to be a non-issue. I have read about controllers that keep a spare drive as a hot standby, so that in the event of a failure everything is copied over to it automatically; that's the sort of “just keep going” I am looking for.

I want a highly resilient setup for my wife's PC (top tip for you boys out there: always make sure your wife's PC just works no matter what and you will lead a happy life). Currently that is 2 x 400GB drives in a desktop RAID 1 using the onboard Intel controller.

Mine is 4 x 1TB drives in RAID 10, because I need the redundancy and the extra speed (a super fast controller with a ton of cache would be nice for me; I would buy a couple of OCZ Z-Drives if I had that much spare pocket money, but not for now ;-). I don't mind changing any of the hardware, I just want a RAID that works, no matter what.

What do you recommend?
 

4745454b

Titan
Moderator
I laughed a bit because you said you wanted a system you didn't have to tinker with, and that's what I have. I simply use the drives, no form of RAID at all. I haven't had to mess with any drivers, and I haven't lost any data. All of my important stuff is also stored on my wife's machine, and burned to DVD. If I lost one of my drives, I wouldn't be that bad off.

I would NEVER put my OS drive on a RAID array. NEVER. As I'm sure you've noticed, when (and not if) the driver gets corrupted, not only do you get to repair the array, but reload Windows and ALL those programs as well. Also, while RAID 0/10 has wonderful sustained transfer speeds, it sucks for random seeks. Access time is longer, which is not good for an OS. RAID 1/5 can be good for data drives.

My suggestion is to drop RAID. You didn't list anything that would make me think you really need it. Pull one of your TB drives and turn it into an external drive. Put your really important data on it, and remove it from the computer. (Put it in your office desk drawer, gym locker, etc. if you're really paranoid.) Not only do you now have 3TB of storage, but external backups, and faster seek times. This might not be very "cool", but it works very well.
 

ratbat

Distinguished
Mar 20, 2009
Thanks for taking the time to reply. Such input makes the forums go around.

>I laughed a bit because you said you wanted a system you didn't have to tinker with

My aversion is not to tinkering with IT; if it were, I would be out of a career.

No, my aversion is to having a big drama (or any kind of a drama) when a hard drive fails.

Your single-drive solution is OK as far as it goes, but that drive will fail (because all drives fail, as I am reminded every day) and when it does it will create pain. Even if I take an image copy to a server (which I do), I still need to recover it when the drive fails. With modern drive capacities being so high, that takes a lot of time and creates a lot of disruption.

>Also, while RAID 0/10 has wonderful sustained transfer speeds, it sucks for random seeks. Access time is longer, which is not good for an OS. RAID1/5 can be good for data drives.

I used to run RAID 5, but it is not so good for write speeds, which is why I moved to RAID 10. My plan is to put the OS onto an SSD when the prices drop some more. For now, it is fast enough. The PC is very high spec by current standards (overclocked i7, Windows 7 64-bit, 12GB RAM etc). I run a lot of stuff at the same time (multiple virtual machines) and various applications. You would think that poor seek times would slow it down, but it never does; RAID feels a good bit faster than a single drive. There is not much I have to wait for.

Interestingly, the RAID degrade I had this morning sorted itself out in a couple of hours. The machine was unusable whilst it did it, and gave no indication that it was doing it, but it did do it. When I had one like this on XP it took 2 days to sort itself out. Vista just complained about the drivers and never sorted it. Another positive score for Windows 7.

Remember the vision: storage where I can just laugh at a disk failure and carry on working. I am not there yet, but I sure would like to hear from anyone who thinks they have a solution.

 

sub mesa

Distinguished
Maybe this is off-topic, but ZFS would come close to a zero-maintenance, just-works filesystem: it is highly resistant to lost dirty buffers, and you do not need hardware RAID to achieve high performance. But if you want to combine a Windows operating system with RAID 5/6, you should pick hardware RAID instead, with a BBU.
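
If you do go the ZFS route, health checking is about as zero-maintenance as it gets. Here is a minimal sketch, assuming a host where the standard zpool tool is on the PATH (Python is used purely for illustration, and the "email me" step is left as a print):

```python
# Minimal ZFS health check sketch. Assumes the standard `zpool` CLI is
# installed and reachable on the PATH.
import subprocess

def zfs_pools_healthy() -> bool:
    # `zpool status -x` prints "all pools are healthy" when nothing is wrong,
    # otherwise it prints details only for the degraded/faulted pools.
    out = subprocess.run(["zpool", "status", "-x"],
                         capture_output=True, text=True).stdout
    return "all pools are healthy" in out

if __name__ == "__main__":
    if not zfs_pools_healthy():
        print("WARNING: a pool is not healthy - run `zpool status` for details")
```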
 

ralphael

Distinguished
May 25, 2009
well, since you ask...

First things first: don't buy a Drobo!!

With the recent Seagate fiasco (delayed writes), the Drobo takes the whole array down by marking the drives as failed. With Seagate drives, two drives get taken down very quickly: when one drive fails, the automated data migration means the second follows shortly after.

Even though you know the data is still on the drives, it is in a proprietary format.

Now on to the other question, about non-IT-type RAIDs. Your best bet is probably the QNAP NAS systems, as they provide excellent web interfaces for creating and maintaining your RAID.

For the more techie type, I highly recommend the Intel ICH9R/ICH10R chipset found on many mobos.

I bought a Shuttle SP35P2 to replace my failed mobo, thinking I would do without a RAID due to the small form factor; however, much to my surprise, the SP35P2 has 4 internal SATA ports and 2 external eSATA ports. Originally, since there are two hard drive trays, I had planned on mirroring the OS so that if the OS drive fails, at least I could boot up and not lose my data. However, the 3.5" floppy bay beckoned me to insert another drive, and I thought it would be nice to get 1.8TB of data from a RAID 5 setup. As I inserted the 3rd drive, I noticed my DVD-ROM bay was free (I have two external USB DVD writers), so I could theoretically add a fourth drive into my SFF Shuttle; which I did... Carving from the 2.72TB, I now have a 100GB boot RAID 5 and 2.62TB of RAID 5 data.

Now what makes it all interesting is the Intel ICH9R chipset, since it can migrate from a single drive to a mirrored RAID, then migrate to RAID 5, without losing your original data.

Now the story doesn't end here, as prior to RAIDing my Shuttle I had searched for the ultimate RAID box and found it under the guise of the Mediasonic HFR2-S3B for $232 CDN, which can fit 4 x 2TB drives. If a drive fails, you can hot-swap a new drive into it and the RAID will automatically rebuild. It connects via any normal eSATA port (no need for a port-multiplier-capable eSATA port), or USB/FireWire 400/800.

The only fault of this unit is that it won't expand like a Drobo; but hey, this is real RAID and not that fake RAID from the Drobo. Max speed on the Drobo through USB is 30MB/s; max speed on the Mediasonic is 105MB/s through eSATA.
 
If you're using Intel RAID, all you have to do is install the Intel Matrix Storage Manager. I often get those "errors" on my RAID 0, but it takes like 2 seconds to dismiss the error in the storage manager. I've dismissed it like 3 times now in 1 year and have yet to have a serious problem in my RAID 0. As always, any important data in a RAID 0 should be backed up regardless of the RAID status.
 


First thing to do is ditch the onboard RAID and buy yourself a quality dedicated hardware RAID controller card with at least 8 SATA ports, preferably either 3Ware or Areca. Then get yourself 8 x 500GB drives and create a 2TB RAID 10 array.

However, if your vision is to create a storage array where you can laugh off an occasional disk failure, set up a RAID 5 or RAID 6 array with at least 5 drives and a hot spare. A RAID 6 array can lose up to two drives at once and keep running; a RAID 5 array can lose one, and the hot spare lets it rebuild straight away so it can survive a later failure as well.

Onboard mobo RAID is nice/cute for the gamer/enthusiast who wants faster load times for games, but for seriously maintaining large amounts of data, a dedicated hardware RAID controller card is best.

 

rand_79

Distinguished
Apr 10, 2009
I have a motherboard with the ICH9R chipset and have been running it in RAID 0 for over a year with zero problems... just recently upgraded to 3 drives... still zero problems.

Almost 1.5 years and counting. The only problem I had was with some older Hitachi drives: they liked to spin down overnight and not wake up, even with all the power saving settings turned off.
 

ratbat

Distinguished
Mar 20, 2009
>preferably either 3Ware or Areca.

That hot-swap facility sounds interesting. Does anyone have any real-life experience of what happens on one of these when a drive fails?

When the Intel motherboard ICH9R RAID fails it slows the machine to a grinding halt whilst it sorts itself out. I have seen it take up to 2 days with the machine being completely unusable whilst it re-mirrors (it also gives no indication that that is what it is doing). I love the idea of a controller that will sort things out in its own time whilst I carry on working, having barely noticed there has been a problem.
 

ralphael

Distinguished
May 25, 2009


It was problematic with the Intel Matrix console and drivers v7.5: hesitations when one of the eSATA drives was connected but powered off, and BSODs abounded when a drive was hot-swapped.

However, that all changed with v8.8. I can hot-swap to my heart's content without BSODs or system slowdowns. Even during a rebuild, there is no slowdown.
 


In the RAID controller BIOS you get the option of assigning a spare drive to an array as a hot spare. In the event that a drive in the array fails, the controller automatically removes the bad drive from the array and replaces it with the hot spare.

I do not doubt the Intel Matrix Storage and ICH9R/ICH10R chipsets, nor do I doubt that many users have had great success using them. Not for nothing, but I have an nVidia-based mobo and am currently using 2 x 80GB drives in RAID 0 for my dual boot between XP and Windows 7. However, I cannot stress enough that a dedicated hardware controller is a much better solution.
 

ShadowFlash

Distinguished
Feb 28, 2009
Absolutely a "true" hardware RAID controller. That almost always means a dedicated cache and battery back-up unit. I've had plenty of the issues you described with software RAID, and I would agree with never putting your OS on a software RAID array, but with a good hardware controller, by all means do, as long as it isn't any parity-type RAID level.

The simplest no-hassle RAID level there is, and always will be, is RAID 1. Even if things go really bad, you can just pop out a single drive and easily recover your data. No other RAID level can make this claim. The downside is that although RAID 1 can provide some hard-to-quantify performance benefits such as increased I/O and simultaneous read performance (controller card dependent), it does not improve any sequential access. RAID 1 is perfect for an OS drive only, but games and such that would benefit from the reduced load times of improved sequential reads would be better suited to RAID 10. RAID 10 has no real flaws other than the increased number of physical disks required for equal space. RAID 5 and 6 attempt to solve this problem, but make sacrifices to do so.

In the Holy 3 of RAID (Redundancy, Performance, and Capacity), you can only pick 2. RAID 10 sacrifices capacity, RAID 5 sacrifices a little of both performance and redundancy, and RAID 6 sacrifices more performance and capacity to increase redundancy. RAID 5/6 is the jack-of-all-trades, but master of none. If capacity is not an issue, always go with RAID 10 for performance. If iron-clad simplicity and redundancy is your goal, RAID 1 is your answer.
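
To put rough numbers on the capacity side of that trade-off, here is a quick back-of-the-envelope sketch. It is illustrative only: it ignores hot spares and metadata overhead, and it reports the guaranteed failure tolerance (RAID 10 can survive more if the dead drives land in different mirror pairs).

```python
# Rough usable-capacity / guaranteed fault-tolerance numbers for n identical drives.

def raid_summary(n: int, size_tb: float) -> dict:
    return {
        "RAID 0":  (n * size_tb, 0),          # stripe only, no redundancy
        "RAID 10": ((n // 2) * size_tb, 1),   # half the drives are mirrors
        "RAID 5":  ((n - 1) * size_tb, 1),    # one drive's worth of parity
        "RAID 6":  ((n - 2) * size_tb, 2),    # two drives' worth of parity
    }

# e.g. the 8 x 500GB array suggested earlier in the thread
for level, (usable, survives) in raid_summary(8, 0.5).items():
    print(f"{level:8} usable {usable:4.1f} TB   survives {survives} failure(s)")
```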

Edit: I don't know why I keep seeing it mentioned, but in my many, many RAID set-ups, RAID 0, 1, and 10 have never produced higher random seek times. In fact, with the right stripe size for the application, they are usually improved.
 

jrst

Distinguished
May 8, 2009


If a drive fails the controller marks it offline and continues, although performance may suffer; how much depends on your configuration. For RAID-10 the hit shouldn't be significant; for RAID-5/6, the hit can be significant (e.g., see http://www.tomshardware.com/reviews/adaptec-serial-controllers,1806.html ).

Hot-swapping a drive should be a non-event for any decent controller. What it does when that happens is generally configurable. (E.g., wait for you to tell it, automatically start rebuild, etc.)

Rebuild time is highly variable; some controllers have the ability to throttle rebuild and initialization rate (e.g., minimize rebuild time or minimize system impact).

Rebuild time should be much less than with host-based/fakeraid, as the data is moving between the controller and the drives, not drive-bus-CPU/memory. I haven't compared them directly (haven't used Intel RAID in a few years), but, e.g., rebuilds on an 8-drive RAID-5 array (LSI controller) take 4-10 hours (depending on throttling and load), with little discernible impact.

Event notification and status capabilities vary, but most any decent controller should provide enough options that you'll find one suitable (e.g., SNMP, email, host storage manager, etc.). Management software is typically pretty good about telling you what's happening. (As you might imagine, in a typical enterprise environment knowing what it's doing is important, as is answering the inevitable "is it done yet? how much longer?" questions every 15 minutes. :)
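
And if the bundled management software falls short, rolling your own nag is easy enough. A minimal sketch follows: the status command and the "healthy" marker string are placeholders (3ware, Areca and LSI each ship their own CLI), and the mail addresses and local SMTP relay are assumptions too.

```python
# Minimal "tell me when the array degrades" script (placeholder CLI and mail settings).
import smtplib
import subprocess
from email.message import EmailMessage

STATUS_CMD = ["vendor_raid_cli", "--array-status"]   # placeholder, not a real tool
HEALTHY_MARKER = "OPTIMAL"                           # placeholder string

def array_is_healthy() -> bool:
    # Ask the vendor CLI for array state and look for the "all good" marker.
    out = subprocess.run(STATUS_CMD, capture_output=True, text=True).stdout
    return HEALTHY_MARKER in out

def send_alert(body: str) -> None:
    msg = EmailMessage()
    msg["Subject"] = "RAID array degraded"
    msg["From"] = "raid-monitor@example.com"   # assumption
    msg["To"] = "you@example.com"              # assumption
    msg.set_content(body)
    with smtplib.SMTP("localhost") as smtp:    # assumes a local mail relay
        smtp.send_message(msg)

if __name__ == "__main__":
    if not array_is_healthy():
        send_alert("Controller reports a non-optimal array; check the management console.")
```

Drop something like that into Task Scheduler or cron and you get the "is it done yet?" answer without sitting in front of the console.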

A. Things to keep in mind:

1. If the RAID controller fails, you'll need another similar (or possibly identical) model. The vendor can tell you what mix is supported. (I keep a spare controller sitting on the shelf. I hope to eliminate that need in the near future.) That's one benefit of the Intel ICH solution... you can typically move the array to a different mobo with similar ICH if yours fails.

2. You either want a controller with on-board BBU, or the entire system on a UPS that will provide a graceful shutdown. Some controllers won't allow write-back (write caching) unless the on-board BBU is present, which can seriously degrade write performance. On-board BBU is preferable as it is more resilient and goof-proof.

3. You want to use qualified drives, not typical consumer drives, otherwise you're likely to see spurious drive dropouts (and then the inevitable rebuild). One of the differences between, e.g., the WD "enterprise" and "desktop" drives. Limits your choices a bit, and adds a bit of a premium, but if you're going to spend the money on a decent controller...

4. And what chunkymoster and ShadowFlash said. And what sub mesa said (ZFS) if you're building a SAN (assuming you're not going to jettison Windows as your primary host).
 

ShadowFlash

Distinguished
Feb 28, 2009
Just a quick clarification on terminology....
Hot Swap = ability to physically replace a failed drive with a replacement ( cold spare ) while the system is fully functioning.
Hot Spare = an unused drive already installed in the system which enables automated rebuilding without user interaction.
I'm pretty sure everyone knew what everyone else was referring to, but just in case.....
 

ratbat

Distinguished
Mar 20, 2009
Firstly, thank you all for your carefully considered and very full replies, an excellent resource!

I was particularly taken by the case put forward by ShadowFlash for RAID 1, and also by jrst's point about needing to keep a spare controller on the shelf if I use anything fancier.

It is clear from this that I need to have a dedicated controller running RAID 1, so that if the controller fails I can take one of the drives out, plug it directly into the motherboard and have a working system in no time. I have been playing with RAID 5 and 10 for performance increases, but I can see now that I need to come back to RAID 1, as I need the reliability more.

Thanks again.
 

wuzy

Distinguished
Jun 1, 2009
If it's purely on the scale of reliability, out of all software RAID 5 solutions I would pick Windows Server, after having experimented with it. I would also recommend experimenting with the *nix implementations, as they're even more mature in the development of software RAID than Windows.
I lost data to cheap onboard/add-on software RAID in the early days, so when it came to moving my three ST31000340AS drives out from the clutches of Intel Matrix RAID (ICH9R, which btw I never had a problem with in RAID 5) to a dedicated NAS/HTPC, I had the choice between going pure software or pure hardware RAID 5 this time. Since I was already running Windows Server 2008 x64 I decided to try it out for a month or so, and found it as reliable as expected. Then I caught an extremely cheap deal on a server-class Dell PERC 5/i with 512MB cache + BBU, to which I added another ST31000340AS (online RAID expansion rocks!); it has now run for almost 4 months straight with minimal reboots in between.

I still back up my more important data onto a 1.5TB drive, of course, through incremental backups. It's my double-layered defense against drive failures.

[EDITED] Write performance isn't of great importance to me as the array is being used for archival. But recently it's been under multi-user usage via GbE (acting as a fileserver), so the hardware RAID 5 with cache + BBU definitely came in handy for performance.

Reading through the thread a bit more: I haven't put an OS drive under any form of RAID for a very long time, mainly because my typical usage can't take full advantage of RAID 0 or 10 or 0+1. I did it back when I started out in computing because it was 'cool', but once I started to analyse the actual benefits I realised how limited the usage has to be to extract its full potential.

An SSD (non-JMF602 based, that is) on the other hand would benefit me the most. As soon as they drop below $2.5/GB I'm diving in.
 

techdad

Distinguished
Jan 14, 2010
To be able to truly go "ha - I laugh in the face of danger", I have found the KISS principle to be the most important (for ordinary personal home use).

I get many more (and more frequent) kudos for recovering 'lost' files (from wife and kids) than I have occasions to deal with broken HDDs :D

I use a spare device - another HD, an external HD, etc. - on which I run an automated backup scheme: the elegant Time Machine (Mac) or something like Second Copy (Windows). Then I back up the archive onto different hardware, and I keep an HD offsite for the fire/water damage scenario.
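
On the Windows side you don't even need Second Copy for the basics; a dozen lines of script will do the "copy anything newer to the spare drive" part. A rough sketch (the paths below are just examples, and it deliberately never deletes anything on the backup side):

```python
# Dead-simple incremental backup: copy files that are new, or newer than the
# copy already on the backup drive. It never deletes on the destination, so
# files survive there even after they are removed from the source.
import shutil
from pathlib import Path

SOURCE = Path(r"C:\Users\Family\Documents")   # example path
DEST   = Path(r"E:\Backups\Documents")        # example path (external drive)

def backup(src: Path, dst: Path) -> None:
    for item in src.rglob("*"):
        target = dst / item.relative_to(src)
        if item.is_dir():
            target.mkdir(parents=True, exist_ok=True)
        elif not target.exists() or item.stat().st_mtime > target.stat().st_mtime:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(item, target)        # copy2 keeps timestamps

if __name__ == "__main__":
    backup(SOURCE, DEST)
```

Schedule it nightly and rotate the destination drive offsite and you have the same idea as the fancy tools, minus the polish.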

Not the answer for a mission-critical, enterprise-level backup scheme, but adequate for regular idiot-proof home use.

To upgrade this to enterprise level, you have to go with one of these fancy-schmancy RAID thingies... ;)
 

Malcolmk

Distinguished
May 31, 2007
Well, I have four Areca RAID controllers and they're the best things I have ever spent money on. Incredibly fast, and the most reliable component in any system I have ever owned. Yes, get rid of the $5 Intel crap controller and get a dedicated RAID card. My largest card at the moment is the Areca 1231ML and it's worth its weight in gold. I have 8 Samsung F1 1TB drives in a RAID 5 and 2 SSDs in a RAID 0. That will soon be 4 SSDs. I just wish I had a 16 or 24 channel card. OK, to give you an idea of what the RAID 5 can do: it's got a read speed peaking at 1019MB/s and an internal file copy speed of 380MB/s. I can transfer a 4GB folder of RAW files in 10 seconds.
 

4745454b

Titan
Moderator
Which is impressive, but how often do you think most home users/gamers need to transfer that much data? How big do you think a level is? For users that need that kind of speed, that's a great way to go. For gamers, the money is better spent on PhysX cards, Eyefinity, fast-clocked CPUs with low latency RAM, etc. Personally I'd rather wait twice as long as everyone else if I get to game on three 24" monitors with the details maxed @ 60FPS.
 

Malcolmk

Distinguished
May 31, 2007
The point is I started with an Areca four-channel RAID card, the 1210, and I am still using it four years later. They're $350 in the States, or you could find one on eBay for about $250. $250 for 5-10 years of use is the cheapest component anyone will ever buy. How many CPUs, mainboards, memory kits and hard drives do people go through in that time?
 

Malcolmk

Distinguished
May 31, 2007
Actually the ARC-1210 is $300 at Newegg and I picked up a 1220 from eBay for $285. Also, games like Modern Warfare 2 are very drive intensive and need a drive speed of 150-200MB/s to run smoothly. It's cheaper than one 128GB SSD and a lot better value. The other thing you must remember is that the standard magnetic hard drive is the slowest and most unreliable component in any system. People spend *** loads on CPUs, mainboards, graphics cards, a second graphics card, SSDs and monitors, but they never think to spend a few dollars on a RAID card that will drastically improve drive speed and reliability.
 

4745454b

Titan
Moderator
I think you're missing my point. Some of us can't afford to spend $300 on a drive controller. That $300 would be better spent on a faster CPU, more RAM, a larger LCD, a faster GPU, etc.

>Also games like Modern Warfare 2 are very drive intensive and need a drive speed of 150-200MB/s to run smoothly.

Flat out BS. It's like any other game: running it on a normal drive will be fine. It will take a bit longer to load the game and levels, but ONCE IT'S IN RAM IT WON'T TOUCH THE DRIVE AGAIN (until the next level, of course). If you're doing something that requires that kind of speed then go for it. But I'd rather move up to Eyefinity than some crazy 6-disk RAID array.

It's not a popular thing to say on these forums, but I stand by my first post. As a gamer, you don't really need RAID. Besides, we are quickly reaching the point where SSDs will take over and no one will want the "old" drives except for data storage.
 

Malcolmk

Distinguished
May 31, 2007
Oh, of course, and why was the 320GB drive in my games PC being maxed out in CoD6 and causing the game to glitch? Anything with lots of buildings and bots has to retrieve data from the hard drive as you're playing. Run the performance monitor the next time you're playing and see what your hard drive is running at.
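
If you don't fancy Perfmon, something like this will log the drive throughput while you play. It's a rough sketch: it assumes the third-party psutil package is installed, the one-second sampling interval is arbitrary, and it just prints until you hit Ctrl+C.

```python
# Log total disk read/write throughput once a second (requires: pip install psutil).
import time
import psutil

INTERVAL = 1.0  # seconds between samples

prev = psutil.disk_io_counters()
while True:
    time.sleep(INTERVAL)
    cur = psutil.disk_io_counters()
    read_mb  = (cur.read_bytes  - prev.read_bytes)  / (1024 * 1024) / INTERVAL
    write_mb = (cur.write_bytes - prev.write_bytes) / (1024 * 1024) / INTERVAL
    print(f"read {read_mb:6.1f} MB/s   write {write_mb:6.1f} MB/s")
    prev = cur
```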

The other thing is, we are talking about reliable storage. Yes, in 10 years' time we might all be using SSDs for storage, but at the moment I haven't seen any 1TB SSDs for $100. Let me know when you find some. Also, we are talking about a $300 component that will get 5-10 years of use. I know it's a hard concept to understand, but that's like $1 a week for fast, reliable storage. I've already had my four-channel card for four years, so it's a very affordable component.
 

4745454b

Titan
Moderator
What was your RAM amount? It should load everything into RAM and be done until you need to load the next level. You should only see what you're talking about if you have 1-2GB of RAM. I have 4GB on my machine, and don't have any slowdowns at all. (I admit I haven't played COD:MW2, so it's possible this is an issue.) I'd still like a link you can show me that says the game needs 200MB/s drives.
 