
TEFLON Storage System Organization?

Tags:
  • Hardware
  • NAS / RAID
  • Controller
  • Storage
January 23, 2011 9:58:19 AM

I have now survived two OS RAID 0 fail attempts on my life, the first a bad HDD (so I switched to SSDs) and the second a bad Intel controller...

I lost no discrete data, but lost app configurations, OS, etc., not to mention time.

My (former) system was OS Drive 2x SSDs in RAID 0 (failed), 4x SSDs in RAID 10 on an Adaptec Hardware RAID, and 2x HDDs in RAID 1 for storage.

It was my belief originally that the RAID 10 on a Hardware RAID controller card was fairly robust. Given that I have now had a Mobo controller die though, I've reconsidered life - what would happen if my Hardware RAID controller also died? (and this presumes that the drives themselves don't go).

In addition to all this, I'm fairly paranoid by this point so even a disc failure on a RAID 10 that would in principle rebuild frankly frightens me.

What I'm wondering is the following:

1. Is there a way to avoid RAID altogether and simply keep CLONED drives? I would like to have an OS drive fail (no controller issues since I'm not in RAID) and simply not even know it happened - have the Clone drive kick in seamlessly, and just replace the dead one at my leisure.

I'm pretty sure with a combination of Hot Swappable and/or Hot Spare drives, this should be possible? Can you create a constantly CLONED drive without RAID 1?

RAID 1 is problematic in my opinion because it's not really 2 INDEPENDENT IDEs if the controller writes metadata and/or messes with the MBR on it, so you're at the data recovery stage already at that point.

2. I know that Disk Images, Ghosting, System Restore, etc. are all viable options. Correct me someone if I'm wrong, but an Imaged Drive could be treated as a CLONE drive (just at a single point in time)? If your drive fails, and you have an image of the drive, can you immediately swap and boot that imaged drive?

3. Even if you can boot from an image, these options do not allow uninterrupted service, which is what I want.

4. RAID 10 :: I didn't like RAID 1's performance comparison with RAID 0, so I used RAID 10 for my data on a hardware controller - RAID 10 I believe DOES allow seamless service with dropped drives (like RAID 1 but with performance boost of some kind) BUT still suffers from controller failure.

What exactly does happen if the hardware RAID controller dies? I'm new to this, so I have no idea - I'm guessing that buying another identical controller would be too easy... Also, does the hardware RAID controller suffer from battery death, or other power-related issues (like unplugging the drives - will that cause a failure?)

I would imagine not since it should have on-board non-volatile storage of some kind involving the RAID meta-data?

5. Can anyone suggest a storage organization that might even come close to what I'm looking to build (again)?


HUGE Thanks to anyone who even read through all that.


January 23, 2011 1:47:08 PM

What you are basically trying to accomplish is multipath I/O redundancy, or MPIO for short.

This is usually done on servers and requires expensive hardware. If you want to survive a controller failure, you need a second controller of the same kind AND the two need to be able to work together and fail over transparently.

This normally requires high-end SCSI or Fibre Channel cards.

So keeping a cloned drive without RAID is possible, but it requires either a software solution (which won't kick in automatically) or an external block-level solution as used in enterprise SAN environments.

Protecting yourself against data corruption is almost impossible. You want a clone that seamlessly takes over, but at the same time you want it to survive such corruption? You would need software to monitor and detect the corruption. It does exist and is used in lagged-replication scenarios.

Anyway, if this is for personal use, stick with RAID 1 or 10. And yes, if your hardware controller dies, it is as simple as swapping the controller out, provided you know the block size and disk order of your configuration.
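
To make the RAID 10 suggestion a bit more concrete, here is a minimal sketch that counts which drive-failure combinations a four-drive RAID 10 can ride out. The layout (two mirrored pairs with made-up drive names A1/A2 and B1/B2) is an assumption for illustration, not the configuration of any particular controller, and it says nothing about controller failure itself.

```python
from itertools import combinations

# Assumed layout (hypothetical names): data is striped across two mirrored pairs.
MIRROR_PAIRS = [("A1", "A2"), ("B1", "B2")]


def array_survives(failed):
    """A RAID 10 array stays up while every mirror pair has at least one working drive."""
    return all(any(drive not in failed for drive in pair) for pair in MIRROR_PAIRS)


def survival_counts(n_failures):
    """Return (surviving combinations, total combinations) for n simultaneous drive failures."""
    drives = [drive for pair in MIRROR_PAIRS for drive in pair]
    combos = list(combinations(drives, n_failures))
    survived = sum(array_survives(set(combo)) for combo in combos)
    return survived, len(combos)


if __name__ == "__main__":
    for n in range(1, 4):
        ok, total = survival_counts(n)
        print(f"{n} failed drive(s): array survives {ok} of {total} combinations")
```

With this layout any single failure is survivable, and a second failure is fatal only when it hits the partner of the drive that already died (2 of the 6 possible two-drive combinations).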

January 23, 2011 7:21:29 PM

Wow - thanks for the comprehensive reply - makes sense that server tech is mostly uninterruptible.

A few followup questions, for anyone...

1) I do know of clone-keeping software (ReBit, for example: www.rebit.com) that would effectively enable even NAS-style clones. As lafontma mentioned, these obviously disrupt service, but should, I believe, allow near-perfect OS/system restoration in the form of a bootable clone drive indistinguishable from the original. Furthermore, they would keep the RAID performance benefits while adding only the minimal latency of the background software that copies the RAID to a non-RAID drive.

IMPORTANTLY though - I'm not sure that such a software cloned drive would allow the OS to boot? Would the OS boot record be 'installed' on this clone drive? (or would the files merely be copied to it, akin to me copying the C drive to an external HDD?)

i.e. Does this kind of software, or does any, create "bootable clone drives?" I emphasize the "Bootable" part since this would be pretty close to going from 0% to 100% service minus the time it takes to swap the clone drive.

2) I'm not familiar beyond basic awareness of how SANs work, or if I could/should/would want to build one - I will investigate that.

3) Data corruption, without seeming to trivialize a very-real problem, I could probably chalk up to the vicissitudes of the world. I suppose ZFS and maybe other types of systems are good for that. In general though, as lafontma says, it's not something that I feel I could worry about and actually make extensive progress on, so I'm not going to.

4) ***Most importantly, in terms of RAID Controllers - (when) my Adaptec card fails for whatever reason, am I correct in understanding that I can simply purchase a new one (i.e. open the spare one I had bought already) replace it, and upon boot everything will be back to normal?

I just have this (founded only in paranoia and not evidence, observation or experience) inkling that the card writes metadata into the RAID 10 drives (or some kind of controller-drive interaction occurs) and thus might present issues when a new one is merely substituted for it.

Presumably someone out there can answer this last question with a simple, "Yes I've swapped controller cards before, it always works" response... or not. :) 
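
On point 1 above, the difference between "copying the files" and "cloning the drive" comes down to whether every block is copied, including the sectors that hold no files (boot record, partition table). A rough sketch of a block-level copy with a checksum comparison follows. It works on ordinary image files; the function names are made up for illustration, and it assumes the source is not being written to while the copy runs. Pointing it at a real device would need administrator rights and is not something this sketch tries to handle.

```python
import hashlib

CHUNK = 1024 * 1024  # read and write 1 MiB at a time


def clone_blocks(src_path, dst_path):
    """Copy every byte of src to dst and return the SHA-256 of the data copied."""
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(CHUNK)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


def sha256_of(path):
    """Hash a file in chunks so large images do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK), b""):
            digest.update(chunk)
    return digest.hexdigest()


def clone_and_verify(src_path, dst_path):
    """Clone src to dst, then re-read dst and confirm it matches what was copied."""
    return clone_blocks(src_path, dst_path) == sha256_of(dst_path)
```

A file-level backup would not pass this kind of byte-for-byte comparison against the original drive, which is one way to tell the two kinds of tool apart.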

January 24, 2011 12:03:27 PM

1) It all depends on the software. If you are not sure whether the software copies the MBR, you can always do it manually the first time you install the spare drive.

4) The metadata on the disk is actually there for the purpose of storing the array configuration. So yes, I have swapped out some RAID cards without any issues. If they are the same model, it is a very smooth process: when you swap in the new card, it scans the disks, finds the metadata, and configures itself.

January 24, 2011 1:57:56 PM

1) I spoke to REBIT support (a bit brusque if you ask me) and they said Rebit does NOT create a bootable clone drive.

When I asked them if copying the MBR by hand would enable a bootable clone, they said "No," and then essentially told me that they can only answer questions relating to Rebit software...

i.e. Rebit just copies the files and folders automatically (incrementally) to an external, it doesn't clone the drive, with all its associated MBR, partition info, etc.

So I need to find software which DOES create bootable clones.

You said that I can manually copy the MBR (and that this would give me a bootable clone) - they said "No." I'm definitely confused about this, so maybe I need to research more fully how the MBR works, and what exactly makes a drive bootable and identical to the drive it was cloned from, since I'm not sure at this point.

Now I'm wondering if the popular Acronis TrueImage software creates bootable clones... Investigating...

4) That's great news - thanks for the reassurance.
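
On the MBR question in point 1: for a classic BIOS/MBR disk, the first 512-byte sector holds the boot code (the first 446 bytes), four 16-byte partition entries, and a two-byte 0x55 0xAA signature at the end. A partition entry marked 0x80 is the "active" one the boot code hands off to. Below is a minimal sketch that parses such a sector; the function name and the returned dictionary are illustrative only, and passing this check does not by itself prove a drive will boot, since the active partition still needs its own boot sector and the OS boot files.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446  # the boot code occupies bytes 0..445
ENTRY_SIZE = 16


def parse_mbr(sector):
    """Parse the first sector of an MBR-partitioned disk into a small summary dict."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need at least 512 bytes")
    info = {
        "has_boot_signature": sector[510:512] == b"\x55\xaa",
        "partitions": [],
    }
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + ENTRY_SIZE * index
        entry = sector[start:start + ENTRY_SIZE]
        # Entry layout: status byte, 3 CHS bytes, type byte, 3 CHS bytes,
        # then little-endian 32-bit first LBA and sector count.
        status, ptype, first_lba, sector_count = struct.unpack("<B3xB3xII", entry)
        if ptype == 0:
            continue  # type 0 means the slot is unused
        info["partitions"].append({
            "index": index,
            "active": status == 0x80,
            "type": ptype,
            "first_lba": first_lba,
            "sector_count": sector_count,
        })
    return info


def read_mbr(path):
    """Read the first sector from a disk image file (or a device, given sufficient rights)."""
    with open(path, "rb") as f:
        return parse_mbr(f.read(SECTOR_SIZE))
```

Comparing the summaries of the original and the clone shows quickly whether the partition table and active flag were carried over, which a plain files-and-folders backup would not do.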