Restoring my second RAID1 array after reinstalling XP

m610

Distinguished
Aug 30, 2009
23
0
18,520
Well, I had a RAID1 array for my system drive. Had!

When I got back in town on Wednesday and turned on my PC, one drive had failed. This array contained my OS and other files. I copied all of my personal files, I hope, to another drive. I had another drive ready to take its place, but I had to leave town again in a few hours. So I turned the PC off and left, came back and turned it on, and the second drive failed. The whole array failed within 2 days.

I bought two new 1TB SATA drives today and the OS (XP, SP3) is up and running again, and the new 1TB RAID1 array is partitioned, formatted, and working. But... the second array, also RAID1, the one that did not fail, doesn't come up in Windows. The BIOS sees it. Device Manager sees it and says it is working. Disk Management sees it, says it is healthy and active, gets the disk space used stats right, and even gets its name right, but the only option it gives me is to delete the partition. It is formatted and has files on it that I need to keep. Seems to me it just needs a drive letter.

Any ideas?
 
m610

Fixed it. Took some time, and searching.

Turns out that Norton GoBack was the culprit. I had it installed before the failure, and it seems that it modifies the MBR (master boot record), if I have the name right. Symantec has a program to take care of this. You download the ISO image file, burn it to a CD, reboot from the CD, select option 1, wait for about an hour and a half, and finally reboot. Now everything is good. Well, except for the two drives that failed, and for these I am beginning to suspect a driver problem that causes random SMART errors.

So, I solved a noob's problem. What do I win? ;)
 

Paperdoc

Polypheme
Ambassador
Very nicely done! You win a lot of peace of mind - hard to fix that with a $ number, but anyone who's suffered through losing all their files will know it has real value! Like you, though, I'm still struck by the coincidence of two HDD failures in the same RAID1 array within a couple of days. Seems VERY unlikely to be a HDD hardware issue.
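To put a rough number on that coincidence, here is a minimal back-of-the-envelope sketch. It assumes the two drives fail independently at a constant rate, and the 3% annual failure rate is a made-up illustrative figure, not something from this thread or from the drive maker.

```python
# Hypothetical annual failure rate (AFR) for a single drive.
# 3% is an assumed, illustrative value only.
ASSUMED_AFR = 0.03


def prob_failure_within(days: float, afr: float = ASSUMED_AFR) -> float:
    """Chance that one drive fails within `days`, assuming a constant,
    independent failure rate consistent with the given annual rate."""
    return 1.0 - (1.0 - afr) ** (days / 365.0)


# Given that the first drive has already failed, how likely is it that
# the second one fails on its own within the next 2 days?
p_second = prob_failure_within(2)
print(f"Chance the surviving drive fails within 2 days: {p_second:.6f}")
print(f"Roughly 1 in {round(1 / p_second)}")
```

Under those assumptions the odds come out on the order of one in several thousand, which is why a shared cause (controller, driver, software, cabling, heat) looks more plausible than two unrelated hardware failures.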

I'd start by re-installing the two failed drives, one at a time, and using disk diagnostic utilities downloaded for free from their manufacturer's website, test them fully, saving the info. If the drives show real problems, contact the manufacturer about replacement if they are still warrantied.

If there are no hardware problems, start looking for OS or driver issues. Also remember that apparent hard drive errors sometimes can be caused by poor connections at sockets on each end of a cable. These can happen just by age and oxidation of the metal surfaces, or by having a cable slowly vibrate loose.
 

m610

Thanks.

I haven't given up on those drives. They are Seagate Barracudas and I might be able to use SeaTools to find out what is going on. Seagate says that 40% of drives returned to them are fine and that SeaTools can diagnose the problem, maybe clear it up.

I hope to be able to get the data off these drives. I'm not sure I had everything backed up: recent emails, web preferences and bookmarks, etc. I think I found tools to convert safely to/from RAID.
 

m610

Update and more details.

The Windows version of SeaTools that came with the new drives I bought yesterday failed to run, with a "Counter less than zero" error, or something like that.

The Windows version I downloaded today started, and scanned the drives, but would not run any test on any drive.

The DOS version I downloaded today is working.

The two drives that failed on me last week, within a day of each other, were the two drives in my system RAID1 array. Odd that they failed one after the other.

My motherboard (Intel D955XBK) had reported that both drives had failed. These were running on the ICH7R controller.

DOS SeaTools says one is good and no SMART events had occurred. It also passed the short test.

DOS SeaTools says the other drive has had SMART tripped. When running the basic test it warned me that the drive had experienced temperatures over 70C. ???? It was mounted right next to the other one. Anyway, the short test failed and the long test is now running.

Both drives are ST3500641AS, and both have firmware version 3.AAB.

I don't know why my MB reported SMART errors on both drives. I'm still hoping I can recover data from them. I had backed up most, but not all, of my files.
 

Paperdoc

What I see is that, of two drives in a RAID1 array, one has failed with a SMART error, and the other is OK. Your Windows installation may be getting confusing information because to it, that RAID1 array is just one drive.

Before doing anything that follows, find and read the instructions for your RAID system. They should be in a file on your hard drive or on a CD that came with your PC. If not, check the manufacturer's website for such a document, or go to the website of the maker of your mobo's chipset for it.

I suggest you re-connect the two drives in your machine, preferably to the same mobo ports as before, if possible. When you boot, don't go all the way into Windows. Early in the process, as the BIOS does its checks, I'm pretty sure it will put up a screen prompt (and wait for your input before continuing) asking whether you want to enter the RAID setup utilities package. This is part of the on-board RAID management system. Do that and find the area that shows you the status of your RAID array. It should tell you what SeaTools did: that one of the disks is faulty. It may already have taken the step of breaking the array back down to individual disks so that you can work only with the good one. Or it may still be keeping the array under RAID and handling the problem internally.

From there you have two options. One is to let the RAID system continue to manage the situation. What it expects you to do is install a replacement for the failed drive. Then you go back into the RAID management system and tell it that there's a new drive in there and it should rebuild the RAID1 array so it is fully functional again, and it should do that.

The other option is to abandon this RAID array. To do that, if it's not done already, you tell the system to break the RAID1 array. It will separate the two drives so that Windows now recognizes and uses each separately as normal drives. Of course, one won't be usable. But the good drive will now appear to be just a normal drive, and you can use any regular means to back up the data on it. After that you can decide what you want to do.
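The two options can be pictured with a toy model of a two-disk mirror. This is only a sketch: the class and method names are hypothetical, and it is not how the ICH7R or any real controller is implemented. It just shows why one healthy member of a RAID1 pair holds a complete copy of the data.

```python
class ToyRaid1:
    """Toy two-disk mirror. Hypothetical illustration only."""

    def __init__(self):
        # Each member maps block number -> data; None means failed/missing.
        self.members = [{}, {}]

    def write(self, block, data):
        # Mirroring: every write goes to every healthy member.
        for disk in self.members:
            if disk is not None:
                disk[block] = data

    def read(self, block):
        # Any healthy member can satisfy a read.
        for disk in self.members:
            if disk is not None:
                return disk[block]
        raise IOError("no healthy member left")

    def fail(self, index):
        self.members[index] = None

    def rebuild(self, index):
        # Option 1: fit a replacement disk and copy from the survivor.
        survivor = next(d for d in self.members if d is not None)
        self.members[index] = dict(survivor)

    def break_array(self):
        # Option 2: hand back the healthy members as plain, separate disks.
        return [dict(d) for d in self.members if d is not None]


array = ToyRaid1()
array.write(0, "boot sector")
array.write(1, "my files")
array.fail(0)                     # one drive trips SMART
degraded_read = array.read(1)     # still readable from the survivor
standalone = array.break_array()  # option 2: the good disk on its own
array.rebuild(0)                  # option 1: mirror restored onto a new disk
```

Real controllers also keep their own array metadata on the disks, which this sketch ignores, so read the RAID documentation before breaking an array that holds data you need.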
 

m610

Thanks. The last paragraph seems to be on target.

Here's what I did, and to spoil the ending, I got my files and everything is running.

After the SeaTools tests, which I ran with the drives in the external SATA bay, I put them back into the system to verify that the system could not use them. I tried one, then both, then switched them (swapped cables). The BIOS reported one had failed and the other was offline, or not working, depending on which I hooked up with which cable. Anyway, it was clear I wouldn't be getting the system to boot this way. I thought about going into the RAID BIOS utility and switching them back to solo drives but was concerned about data loss.

To back up a bit, I've now replaced my 2 x 500 GB RAID1 array that held my OS and other files with 2 x 1TB drives, installed the OS and a few apps, and that's working.

I found a DIY file recovery tool at Seagate and was all ready to try out the demo version, just to see if I could get my files back. I installed the still-good drive in an external SATA bay and booted the system, and it booted to THAT drive! That was probably the best luck I could have had, because now I could just copy my files to another drive. I guess the boot order in my BIOS had reverted to using the Promise SATA card. I backed up my files ASAP, then downloaded a program to back up my Thunderbird email and Firefox settings and backed those up too. I shut down, removed the former RAID1 C: drive, rebooted, and the new system came up. I then installed Firefox and restored my old settings, and repeated the process for Thunderbird. Done. Now I can finally put the top back on the case, finish installing apps, and get back to work.

I was surprised that the system would boot on this drive and even more surprised that everything was intact. RAID1 is mirroring, so there should be duplicates of the files on each drive, but I still expected issues with the file names, partitions, etc.