
Help: Failed Drive in ICH9R RAID 1 System Disk

November 13, 2008 1:11:41 PM

I have had a drive failure in a RAID 1 array and need some step-by-step help to replace it. The motherboard is an ASUS P5E WS Pro with the onboard Intel ICH9R controller. I have two questions: (1) How do I tell which drive is dead so that I can replace it? I am concerned that I might blow the good drive away if I do something wrong while diagnosing them. (2) What is the process in the controller BIOS after Ctrl-I? When I go into the controller BIOS during boot there are only four choices: Create, Delete, Reset and Exit. I expect I need to Reset the dead drive, but I have found no clear, step-by-step instructions and I am quite paranoid about blowing my system away totally (I've been having all sorts of problems for months, which is why I just finished building this box, sigh). I have some experience with LSI and 3Ware RAID 5 and hot-swapping drives, but they have friendly interfaces.

And as long as I have your attention, what is the recommended way to get a restorable boot drive (system) image ... you know, in case all else fails?
November 13, 2008 10:53:44 PM

Well, after a good night's sleep I am thinking more clearly! I discovered that Intel provides a Windows GUI monitor for their ICHx(R) drivers. I installed this package and it detected the failed drive even before installation completed. I have used it to mark the failed drive OK and attempt a rebuild (which is ongoing as we speak). If it fails I should be able to add a new drive, rebuild onto the new drive, and then remove the bad drive afterwards ... I have two spares on the shelf. OTOH, it may well have been a glitch and everything may be fine!

I also located Acronis and expect I will be using that for an additional level of recovery.

I hope the next guy with this issue finds this thread and saves themselves some problems.
November 14, 2008 3:29:32 PM

Does your Windows utility list the serial number for each drive? Normally I like to note the serial number of the defective drive, power down the machine, remove the drive and make sure the serial number on its label matches. This cannot always be done, but it's a good method when it can.
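
If it helps, that check can be written down as a tiny Python sketch. This is only an illustration, not part of any Intel tool: the function names and the serial numbers are made up, and the two strings are assumed to be typed in by hand (one from the utility's display, one from the label on the drive you pulled).

```python
def normalize(serial: str) -> str:
    """Drop whitespace and ignore case so hand-typed serials compare cleanly."""
    return "".join(serial.split()).upper()


def is_flagged_drive(reported_failed_serial: str, label_serial: str) -> bool:
    """Return True only if the drive in hand is the one the utility flagged."""
    return normalize(reported_failed_serial) == normalize(label_serial)


if __name__ == "__main__":
    # Hypothetical serials, for illustration only.
    reported = "wd-wcap0001"      # as shown by the monitoring utility
    on_label = " WD-WCAP0001 "    # as read off the removed drive
    if is_flagged_drive(reported, on_label):
        print("Match: this is the drive the utility marked as failed.")
    else:
        print("No match: put this drive back and check the other one.")
```

The point is simply to compare the two serials exactly (apart from spacing and case) before replacing anything, so the good drive never gets swapped out by mistake.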
November 14, 2008 3:53:53 PM

Rozar, yes, this Intel utility for the Matrix controller chipset exposes information about the individual drives, including serial number, port number, model number and size. It does not give a clue about how many reassignable sectors are left or the type/number of errors detected (that I can see), so it could go further. OTOH, it does show the percentage of the rebuild completed. It also provides an option, via the context menu on an individual drive, to mark the "failed" drive as OK and have it rebuild automagically, which is what I did successfully.

I would also note that the ReadMe that comes with the utility lists several significant known issues in the current release, which seem to include problems with the driver itself, but they are just cryptic enough that it is hard to tell. I would hope that they will continue development of the chipsets, drivers and monitor utilities until these issues are corrected ... after all, these precise hardware/software tools are supposedly their area of expertise, not just CPUs!
November 14, 2008 6:39:01 PM

You have to keep in mind that these onboard controllers are not as good as third-party processor-based controllers. This also holds true for their ability to detect bad sectors and mark a drive offline. If you suspect that you have bad sectors on a drive, I would power off, go into the BIOS and set the controller to IDE, set the optical drive to boot first if it's not already, reboot and run SeaTools to scan the drives. As long as you don't let the machine try to boot from a drive, it won't alter its timestamp and won't harm the array. When this is complete, go back into the BIOS and return all settings to what they were. Reboot as normal.
November 14, 2008 6:51:26 PM

I'm not overly worried about the health of the drives, just pointing out the level of the monitor. This box also has an LSI SATA 300-8X controller running RAID0 and RAID5 arrays but they are not boot devices. I'm a photographer and have almost a TB of images that I work with continuously so I like to have an online backup as well as backups on another machine and offsite backups ... to lose my images would be a huge disaster. I'm about to launch a project to burn them all to DVDs, too.

I expect to build yet another new machine next summer, in which I plan to go quad core, 8GB RAM (finally dare to try Vista 64, sigh), and get the Adaptec 12-port SAS controller. At the moment I'm thinking of a SuperMicro motherboard. But that's another project altogether. Right now I just want this machine to settle down!