ServeRAID array problem

Rich

Mar 31, 2004
Archived from groups: comp.sys.ibm.pc.hardware.storage

Hi,
I have a Netfinity 5500 with a built-in ServeRAID II.
It was flashed to BIOS 7.0 a month ago and has been working fine.
The system has four 18GB drives divided into five logical drives.
I have the configuration logs and details.
Yesterday, we brought down the system (Win2KS) normally.
During power-up, we noticed that one of the drives did not appear to be
seated properly. Since the system was still in POST and Windows had
not started booting yet, the drive was pulled and reseated. To make a
long story short, we ended up with three of the drives marked as
defunct. At no time did Windows even begin to start up, so I
believe the data on the logical drives of the array should be
clean. If I reset the controller and try to read the configuration
from the drives, I end up with the same level of deFUNCTness. We did
not try to delete, redefine, or re-mark any of the drives or the array.
It seems strange that a RAID-5 array can be so fragile. I would think
that there should be some way to easily unDEFUNCTify all the drives,
just to try some form of recovery. Since parity is involved,
revalidating the logical drives shouldn't be a big deal (rather than
forcing the whole array to be inaccessible).
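
As a toy illustration of the parity point, here is a minimal sketch of single-parity (XOR) reconstruction. It is not how the ServeRAID firmware actually lays out or validates data; the stripe contents and the function names `xor_blocks` and `rebuild_missing` are made up for the example. It only shows that one missing member per stripe can be recomputed from the others, while two or more cannot, which is presumably why a controller that sees three defunct drives takes the whole array offline.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe):
    """Recompute the single missing block (marked None) in a stripe.

    The stripe holds the data blocks plus their parity block. Single
    parity can recover exactly one missing block, so anything else
    raises ValueError.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        raise ValueError(
            f"expected exactly one missing block, found {len(missing)}"
        )
    survivors = [block for block in stripe if block is not None]
    return missing[0], xor_blocks(survivors)


# Hypothetical stripe across four drives: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = data + [xor_blocks(data)]
```

Setting one entry of `stripe` to `None` and calling `rebuild_missing` returns that block's index and its original contents; setting two or more entries to `None` raises `ValueError`.
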
At this point, a backup system is running fine, so I have time to
attempt/practice a disaster recovery scenario on this ole-dawg.
I've meandered through a lot of the online ServeRAID docs, with no
clear solution. I feel like I'm missing something easy. Does anyone have
any experience or ideas that could help bring this array back from
the dead?

I also have a new ServeRAID-4Mx card to play with, if it can help.
Thanks!
rich
 
Guest

Never reseat ServeRAID drives while the system is powered on, even if no
OS is running. The RAID controller detects that the drives have been removed
and fails them. To recover from this, use the ServeRAID CD; in some
cases the only thing that works is the DOS config v3.50c.
Boot the system and start the config utility, then set the drives to
"Online" one at a time. Remember that the last disk has to be rebuilt, AND
THIS DISK HAS TO BE THE ONE THAT FAILED FIRST!

I have done this lots of times, and it works in most cases, unless you don't do
the first failed disk last or you have hardware failures on multiple disks.
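
The ordering rule above can be sketched as a small planning helper. This is only an illustration of the sequence, not a ServeRAID tool or API: the function name `recovery_plan`, the step labels, and the bay names are all hypothetical, and the order in which the later-failed drives are set online is an assumption, since only the position of the first-failed drive is specified above.

```python
def recovery_plan(failure_order):
    """Turn the order in which drives went defunct into recovery steps.

    failure_order lists drive identifiers, first-failed drive first.
    Every later-failed drive is set back to Online one at a time, and
    the first-failed drive is handled last, with a rebuild.
    """
    if not failure_order:
        return []
    first_failed, *later_failed = failure_order
    # Assumption: later-failed drives are brought online in failure order.
    steps = [("set_online", drive) for drive in later_failed]
    # The drive that failed first always goes last and gets rebuilt.
    steps.append(("rebuild", first_failed))
    return steps


# Hypothetical example: the reseated drive in bay 2 went defunct first.
plan = recovery_plan(["bay2", "bay0", "bay3"])
```

One common explanation for doing the first-failed drive last is that it dropped out of the array earliest, so its contents are the most likely to be stale and are the safest to regenerate from the other members.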

PS. Remember to upgrade everything: disk firmware, ServeRAID driver, Server
Administration program, BIOS, system management processor firmware, diags,
Director... and so on. Do not upgrade just the ServeRAID firmware.
If you choose to upgrade, do it all or leave it alone.


Best Regards


Johnny
 

Rich


Thanks for the info, Johnny.
The ServeRAID CD doesn't give me the option to bring the drives back
online. I suspect that is because there is more than one DDD (defunct)
drive in the array: there are three DDD drives and one Online drive. I
think I know which drive needs to be rebuilt (the one that was reseated
first). So I assume that I need to make an HSP (hot-spare) drive in the
array and then rebuild that first-failed DDD drive onto it? Will (should?)
the other two DDD drives then go online at that point, or will they need
to be rebuilt as well?
Where can I get a copy of the DOS Config v3.50c? The earliest copy of
the ServeRAID software I have is around 4.6.


> PS. Remember to upgrade everything: disk firmware, ServeRAID driver, Server
> Administration program, BIOS, system management processor firmware, diags,
> Director... and so on. Do not upgrade just the ServeRAID firmware.
> If you choose to upgrade, do it all or leave it alone.
When I did it, I was confused about upgrading everything at once. I
understand the requirement, but is there a proper sequence of steps
(since in reality, it can't all be done at the same instant)? Do I
need to uninstall and reinstall Director? I seem to remember that I had
some problem with something called twintail (which we don't use) that
was backleveled and wasn't part of the upgrade download. We finally got
it working, but lost Director functionality. Since this server was
going to be upgraded to SBS2k3 soon, we decided to leave well enough
alone.
Thanks for all your advice. This all proves that understanding the theory
doesn't help much when testing a real disaster recovery plan.

Rich