Besides a drive failure, what can kill a RAID 0 array?

pkquat

Distinguished
Apr 26, 2005
92
0
18,630
What other types of failures and problems can cause a RAID 0 array to be unrecoverable? If it's a long list, what can you still recover from? If it's explained somewhere, a link would be great.

I understand that improper shutdowns, "lost clusters" and such are most often recoverable. However, "bad sectors" are 50/50. What else?

TIA
 

pkquat

Distinguished
Apr 26, 2005
92
0
18,630
What I am looking for is a better definition of zero fault tolerance. Some things I have read make it sound as if any glitch means everything is lost. However, I have heard that it's not that bad, but nothing is substantiated.
 

SomeJoe7777

Distinguished
Apr 14, 2006
1,081
0
19,280
Well, look at it like this. What can kill a single drive?

- Power supply surge can kill electronics or cause an inadvertent write to the platters.
- Extreme temperatures can affect bearings and associated spin speed, ruining data that is written at the time.
- Cabling malfunctions can cause all sorts of havoc.
- Shock can knock heads out of alignment or cause physical damage inside the drive.
- Bad sectors (from internal contamination, degradation of the magnetic material, head crash, etc.) will lose data.

Now, the purpose of a RAID array is fault tolerance. RAID levels 1 and up can handle any of those things happening to a single drive. There is redundancy in the system that allows a single drive to be replaced for any reason and the data on that drive to be rebuilt.

RAID 0 has no fault tolerance. Anything that takes out a single drive takes out data on the entire array.

If you look at it that way, then there's no situation that can affect a RAID 0 any more than it would affect a single drive. The difference is that a single drive has some small chance of any of the problems occurring, while the RAID 0 array has roughly X times that chance, where X is the number of drives in the array. In other words, if a single drive could be expected to fail by one of those mechanisms once every 4 years, and you build a 4-drive RAID 0 array out of those drives, then you can expect the array to fail and lose all data about once a year. The fact that you're using 4 drives instead of one opens you up to roughly 4 times the chance of a problem.
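To put rough numbers on that, here is a small Python sketch. It assumes "fails once every 4 years" means a 25% chance of failure per drive per year and that drives fail independently; both are simplifications for illustration, not measured figures. The function name is made up for this example.

```python
def array_failure_probability(p_drive: float, n_drives: int) -> float:
    """Chance that at least one of n independent drives fails in the period.

    For RAID 0, one failed drive means the whole array is lost, so this is
    also the chance of losing the array.
    """
    return 1.0 - (1.0 - p_drive) ** n_drives


# Assumed figure: 25% per drive per year, read off "once every 4 years".
p_single = 0.25

for n in (1, 2, 4):
    p_array = array_failure_probability(p_single, n)
    print(f"{n} drive(s): {p_array:.1%} chance of losing the array in a year")
```

Under those assumptions the 4-drive array is not certain to die within a year, but it is far more likely to than a single drive. The "X times the chance" rule of thumb is closest when the per-drive failure chance is small.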

Note that this is a different situation than if I have 4 drives in my computer that are individual drives, not in a RAID 0. Using the same drive as above, which can be expected to fail once every 4 years, I can also expect one of my drives to fail every year. But without them being in a RAID 0, I don't lose all data; I only lose the data on the drive that fails. If that's a drive that has unimportant data on it or data that's backed up somewhere else, then I have no problem.

The purpose of most RAID arrays is to protect from hard drive failure (RAID levels 1 and up), and even more so (and this is frequently not mentioned) to keep the system online and operational when a drive fails. Staying online is the main reason RAID is used in servers; protecting against hard drive failure is secondary. Servers are generally backed up through some kind of backup system (usually tape). Single drives would then be fine if all you were worried about was hard drive failure, but you don't want the server down while you recover. Thus RAID.

The purpose of a RAID 0 array is different - it is purely for speed considerations. There is no fault tolerance, so if important data needs to be stored on it, it obviously needs to be backed up somewhere else.
 

pkquat

Distinguished
Apr 26, 2005
92
0
18,630
That sounds like most of what I thought. If I follow correctly, if it won't kill a single drive, there is maybe a 90% (?) chance that it won't kill the RAID 0 array.

I am familiar with the fact that a dead drive will kill the array and that reliability goes down for RAID 0, etc. I plan to have backups of the OS, programs, etc. and have some schedule for it. It's more that I'd rather not have the headache because XP crashed or an install went bad. And partly that, if it were an issue other than bad sectors, I could have easily fixed it if I were running a single drive, i.e. a non-critical issue.
 

Harvey_Scorp

Distinguished
Feb 6, 2007
3
0
18,510
Nice to see you guys are having this conversation. I've been running my PC as a RAID 0 and I just lost a hard drive. The computer is only about one and a half years old. When I bought it, it was just a gaming computer. But as my old computer died, I’ve become more dependent on it as my workhorse for everyday use. When I bought it I really didn’t understand how RAID worked, but everyone said RAID 0 was the way to go for a gaming PC. (A little short-sighted on my part.) I’m a programmer, not a hardware guy, what can I say.

So now I’ve lost a hard drive and I’ve read the postings. It seems that I’ve lost everything, from what you say? Is there any way at all to save this data, at least from the good drive?

One of the tech guys at work said to change the BIOS and disable the RAID altogether, then plug in each drive separately, and the one that isn’t bad should boot the system. That seemed a little too simple. However, when I go to disable the RAID I get a warning that the change may keep my system from rebooting and require a reinstall. Do you know anything about making this change to save the data?

Any advice?

Keith
 

SomeJoe7777

Distinguished
Apr 14, 2006
1,081
0
19,280
So now I’ve lost a hard drive and I’ve read the postings. It seems that I’ve lost everything, from what you say? Is there any way at all to save this data, at least from the good drive?

One of the tech guys at work said to change the BIOS and disable the RAID altogether, then plug in each drive separately, and the one that isn’t bad should boot the system. That seemed a little too simple. However, when I go to disable the RAID I get a warning that the change may keep my system from rebooting and require a reinstall. Do you know anything about making this change to save the data?

Unfortunately, RAID 0 is not redundant or fault tolerant at all. If one drive is no longer readable, then no files are readable from the array as a whole. By definition, a RAID 0 array takes a file and distributes parts of it to each drive in the array to increase read and write speeds. Thus, if one drive of a 2-drive array is not readable, then half the sectors of every file you stored are not readable. You cannot recover any good files off the remaining drive; all that is stored there is half of each file.
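A toy Python sketch of the striping idea may make this clearer. The 4-byte stripe size and the `stripe` helper are assumptions for illustration only; real controllers work on much larger blocks at the disk level, not on individual files.

```python
STRIPE_SIZE = 4  # bytes per stripe unit; tiny on purpose, real arrays use far larger units


def stripe(data: bytes, n_drives: int, stripe_size: int = STRIPE_SIZE) -> list[bytes]:
    """Split data into stripe units and deal them round-robin across the drives."""
    drives = [bytearray() for _ in range(n_drives)]
    for start in range(0, len(data), stripe_size):
        unit_index = start // stripe_size
        drives[unit_index % n_drives] += data[start:start + stripe_size]
    return [bytes(d) for d in drives]


file_data = b"The quick brown fox jumps over the lazy dog."
drive0, drive1 = stripe(file_data, 2)

# Each drive holds only every other piece of the file.
print(drive0)
print(drive1)
```

Neither drive on its own contains a readable copy of the file, which is why pulling the good drive out of the array and reading it by itself does not give you your files back.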

Turning off RAID in the BIOS will not allow the remaining drive to boot the system, because the files on it are not intact.

The only way to recover the data is to get the data off the bad drive by some method, then piece it together with the remaining data on the good drive.

I have managed to do this before with Runtime's GetDataBack and RAID Reconstructor. However, the process is prone to errors, and the chance of recovery directly depends on what's wrong with the bad drive. If it's only got a few bad sectors, you might be able to get most of the data off the array. If it's completely hosed, then you're better off sending the drive(s) off for data recovery.
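The "piece it together" step can be sketched in a few lines of Python. This is only an illustration of the interleaving idea, not how GetDataBack or RAID Reconstructor work internally. The `destripe` helper is hypothetical, and it assumes you already have a full image of every drive and know the stripe size and drive order, which in practice you often have to work out first.

```python
def destripe(drive_images: list[bytes], stripe_size: int) -> bytes:
    """Rebuild the original byte stream by interleaving stripe units.

    drive_images must be listed in array order; the wrong order or the
    wrong stripe size produces garbage rather than an error.
    """
    out = bytearray()
    longest = max(len(image) for image in drive_images)
    offset = 0
    while offset < longest:
        for image in drive_images:
            out += image[offset:offset + stripe_size]
        offset += stripe_size
    return bytes(out)


# Toy two-drive images, striped in 4-byte units.
image0 = b"The k brfox s ovhe ldog."
image1 = b"quicown jumper tazy "

print(destripe([image0, image1], 4))
```

Any stripe unit that could not be read off the bad drive leaves a hole in every file that used it, which is why the odds depend so heavily on how damaged that drive is.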
 

jt001

Distinguished
Dec 31, 2006
449
0
18,780
A drive failure isn't the only thing that will corrupt an array. I've been running RAID for many years now, and 90% of the time data loss results from something other than mechanical failure. SATA cables aren't secure at all. Say one gets bumped and disconnects the drive: it falls out of the array and data loss is likely, although some controllers handle this differently. Drives can also fall out for no apparent reason. If data loss is a concern, RAID 0 is a bad idea. If you wanna go that route, get another drive and do a daily backup to it.
 

jonkc

Distinguished
Apr 14, 2006
99
0
18,630
What I am looking for is a better definition of zero fault tolerance.

That statement is a contradiction in terms. RAID 0 offers NO fault tolerance whatsoever. It is for pure speed. I still don't know why they label it as RAID when it has, by definition, no redundancy. If you want speed and fault tolerance, then RAID 5 or better is your option.
 

pkquat

Distinguished
Apr 26, 2005
92
0
18,630
OK, to rephrase: what are the faults, aside from the ones listed above, that can kill it, i.e. back to my original question. If any computer hang causing an unconventional shutdown were a "fault", then I doubt many people would be running RAID 0.

In this case data security is not important. I will have backups. Speed is great, but..........

I don't want to have to rebuild etc. too often because of annoyances, i.e. both drives are fine, and there are bad sectors. In the past WITH A SINGLE DRIVE I have had to do some fixing via Norton on Win98 (from some type of crash or a bad install/update/uninstall, etc.), and at least once had to do a major recovery in XP, i.e. work from the command prompt and manually copy some files etc. The normal "recovery" option did not work. Thank goodness I had another computer to copy all the process steps off the MS site and others. I have no idea what the cause was; just one day it didn't boot. This was still quicker than trying to rebuild everything and maybe faster than restoring a backup. A few other times I needed to boot from a previous date. If these types of errors or "faults" would kill the array, then I don't want the headache.
 

dropadrop

Distinguished
Oct 21, 2003
14
0
18,510
Apart from the listed errors, it could also be killed by data synchronization problems. These could be caused by a SATA cable becoming disconnected (as mentioned above), or just interference in the cable. It could also be caused by an error in the storage controller, its driver or the operating system (as you will most probably be running a software RAID setup).

The more I work with servers that have RAID disks, the less I feel I would use anything apart from RAID 1 at home (with cheap equipment).