ICH9R; RAID 5; cannot re-add drive to volume

Hi,

I cannot re-add the 3rd drive to my RAID 5 volume after a crash. I'm very sad about it because I need data redundancy (important data -.- can't afford to lose it).

Mainboard: GA-P35C-DS3R (ICH9R-chipset)
Storage: 3x SAMSUNG HD501LJ (500GB) @ RAID 5

The Intel Matrix Storage Manager gives me the following error when I click "rebuild on selected drive":

Rebuild RAID Volume Wizard
The volume cannot be rebuilt to the selected hard drive due to one of the following reasons:
- The hard drive contains system files or is the system hard drive
- The hard drive is not large enough to be used for the rebuild action
- The hard drive has reported a SMART event
- The hard drive has reported a failure



To solve the problem...

- I did a low-level format on the "no longer accepted" drive -> no system data etc. on that drive -> NOT MY PROBLEM
- I checked the size: the "no longer accepted" drive was in that array before the crash -> if it has not changed its size (and I really believe that it did not), it can't be too small -> NOT MY PROBLEM
- I checked the SMART status with SpeedFan -> all OK, no errors, no warnings -> NOT MY PROBLEM
- I checked the RAID BIOS -> drive is in "Non-RAID" mode and no errors or warnings at all -> NOT MY PROBLEM

So, what is the problem? I don't know what to do... plz help... *cry* :/

Thanks in advance,
ic3


Please post the system report from the Intel Matrix Storage Manager (go to File | Save System Report). Make sure to do this with all hard drives attached.

Reply to rockchalk

OMG, I've just encountered a very similar problem and I'm at my wits' end. It seems like Intel RAID marks the drive somewhere so it knows that the drive has been used in the array before but for some reason got broken. Like ic3f1re, I've done all the low-level diagnostics, even clearing the MBR/boot code sectors of the drive. I had to get another hard drive, and the Intel RAID console happily rebuilt the array with the new drive.

I know the rebuild works in some cases as I've a friend who was pulling in and out one of his hd's to use as a temp storage. For his system, the intel bios did detect when the drive was put back and automatically started the rebuild, and on another occasion he manually used the console to begin the rebuilding process, all performed on the same array / harddrives.

I've got a feeling we're missing something about the "broken" drive...

System Information

Kit Installed: 7.8.0.1012
Kit Install History: 7.8.0.1012
Shell Version: 7.8.0.1013

OS Name: Microsoft Windows XP Professional
OS Version: 5.1.2600 Service Pack 2 Build 2600
System Name: MAGICAL3
System Manufacturer: Gigabyte Technology Co., Ltd.
System Model: P35-DS3R
Processor: Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz
BIOS Version/Date: Award Software International, Inc. F11, 01/04/2008

Language: ENU



Intel(R) RAID Technology

Intel RAID Controller: Intel(R) ICH8R/ICH9R SATA RAID Controller
Number of Serial ATA ports: 6

RAID Option ROM Version: 7.5.0.1017
Driver Version: 7.8.0.1012
RAID Plug-In Version: 7.8.0.1013
Language Resource Version of the RAID Plug-In: 7.8.0.1013
Create Volume Wizard Version: 7.8.0.1013
Language Resource Version of the Create Volume Wizard: 7.8.0.1013
Create Volume from Existing Hard Drive Wizard Version: 7.8.0.1013
Language Resource Version of the Create Volume from Existing Hard Drive Wizard: 7.8.0.1013
Modify Volume Wizard Version: 7.8.0.1013
Language Resource Version of the Modify Volume Wizard: 7.8.0.1013
Delete Volume Wizard Version: 7.8.0.1013
Language Resource Version of the Delete Volume Wizard: 7.8.0.1013
ISDI Library Version: 7.8.0.1013
Event Monitor User Notification Tool Version: 7.8.0.1013
Language Resource Version of the Event Monitor User Notification Tool: 7.8.0.1013
Event Monitor Version: 7.8.0.1013

Array_0000
Status: No active migration(s)
Hard Drive Write Cache Enabled: Yes
Size: 1192.3 GB
Free Space: 0 GB
Number of Hard Drives: 4
Hard Drive Member 1: Hitachi HDT725032VLA360
Hard Drive Member 2: Hitachi HDT725032VLA360
Hard Drive Member 3: Hitachi HDT725032VLA360
Hard Drive Member 4: Missing hard drive
Number of Volumes: 1
Volume Member 1: Data

Data
Status: Degraded
System Volume: No
Volume Write-Back Cache Enabled: Yes
RAID Level: RAID 5 (striping with parity)
Strip Size: 64 KB
Size: 894.1 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes
Number of Hard Drives: 4
Hard Drive Member 1: Hitachi HDT725032VLA360
Hard Drive Member 2: Hitachi HDT725032VLA360
Hard Drive Member 3: Hitachi HDT725032VLA360
Hard Drive Member 4: Missing hard drive
Parent Array: Array_0000

Hard Drive 0
Usage: Non-RAID hard drive
Status: Normal
Device Port: 0
Device Port Location: Internal
Current Serial ATA Transfer Mode: Generation 2
Model: Hitachi HDT725032VLA360
Serial Number: VFM201R81PAJVN
Firmware: V54OA7EA
Native Command Queuing Support: Yes
System Hard Drive: No
Size: 298 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes

Hard Drive 1
Usage: Array member
Status: Normal
Device Port: 1
Device Port Location: Internal
Current Serial ATA Transfer Mode: Generation 2
Model: Hitachi HDT725032VLA360
Serial Number: VFM201R81JXTWN
Firmware: V54OA7EA
Native Command Queuing Support: Yes
Hard Drive Write Cache Enabled: Yes
Size: 298 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes
Number of Volumes: 1
Volume Member 1: Data
Parent Array: Array_0000

Hard Drive 2
Usage: Array member
Status: Normal
Device Port: 2
Device Port Location: Internal
Current Serial ATA Transfer Mode: Generation 2
Model: Hitachi HDT725032VLA360
Serial Number: VFM201R818297N
Firmware: V54OA7EA
Native Command Queuing Support: Yes
Hard Drive Write Cache Enabled: Yes
Size: 298 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes
Number of Volumes: 1
Volume Member 1: Data
Parent Array: Array_0000

Hard Drive 3
Usage: Array member
Status: Normal
Device Port: 3
Device Port Location: Internal
Current Serial ATA Transfer Mode: Generation 2
Model: Hitachi HDT725032VLA360
Serial Number: VFM201R81K8HVN
Firmware: V54OA7EA
Native Command Queuing Support: Yes
Hard Drive Write Cache Enabled: Yes
Size: 298 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes
Number of Volumes: 1
Volume Member 1: Data
Parent Array: Array_0000

Hard Drive 4
Status: Missing

Unused Port 0
Device Port: 4
Device Port Location: Internal

Unused Port 1
Device Port: 5
Device Port Location: Internal
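
The sizes in this report are consistent with the usual RAID 5 capacity rule: one drive's worth of space goes to parity, so usable space is roughly (n - 1) times the smallest member. A minimal Python sketch of that arithmetic, using the rounded 298 GB per-drive figure from the report (the function name is only for illustration):

```python
def raid5_usable_gb(member_sizes_gb):
    """Approximate usable RAID 5 capacity: (n - 1) * smallest member."""
    if len(member_sizes_gb) < 3:
        raise ValueError("RAID 5 needs at least three members")
    return (len(member_sizes_gb) - 1) * min(member_sizes_gb)

# Four 298 GB members, as listed in the report above.
members = [298, 298, 298, 298]
print(raid5_usable_gb(members))  # 894
```

That lands close to the 894.1 GB shown for the Data volume; the small difference is presumably down to rounding and space the controller reserves for itself.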


Message edited by lkwjeremy on 01-21-2008 at 05:15:33 PM
Reply to lkwjeremy

Good luck with this, maybe someone with more experience with RAID 5 can help.
It appears that you set up a RAID 5 array and considered it good enough to be a backup.
"Redundancy" is a situation where a server or PC can stay up and keep working in the event of a drive failure. It is not in any way a form of "backup", which is what you should have.
If you have data that you can't afford to lose, you need to back it up onto another type of media, like CDs or DVDs. That way, in the event of a failure like the one you are experiencing now, you do not lose your important data.


Reply to jitpublisher

lkwjeremy wrote :

OMG I've just encountered a very similar problem and I'm at my wits end. Seems like intel RAID marks the drive somewhere so it knows that the drive has been used in the array before but for some reason got broken...

 

I know the rebuild works in some cases as I've a friend who was pulling in and out one of his hd's to use as a temp storage. For his system, the intel bios did detect when the drive was put back and automatically started the rebuild, and on another occasion he manually used the console to begin the rebuilding process, all performed on the same array / harddrives.

 

I don't run RAID5, but from all my daily reading on various websites, I've come to understand that once a drive is reported "bad", it's marked somehow so that you cannot use the drive in your RAID5 again, to help prevent data loss. Maybe your S.M.A.R.T. shows everything OK now, but maybe when it first dropped out it reported an error, and that's when it was "marked" "bad". What HDDs do you have (brand/model)? Maybe you'll have to buy another one; it should work fine then. Anything as large as or larger than the other 2 (or more) drives working in the RAID5 will work, but hopefully you'll get the same model.

 

Maybe search around about this "marking" of the drives to see if you can undo it somehow. That is, if you really trust that the drive is ok, then go ahead and try and use it again. Good luck, keep me posted, I'm curious as how to fix this. :)


Message edited by gwolfman on 01-21-2008 at 04:05:56 PM
Reply to gwolfman

lkwjeremy -- when you took this snapshot, were you unable to rebuild to the drive on port 0? Also, you said it was a very similar problem-- did you get the same error message?

lkwjeremy or ic3f1re
Can you please try the following: Add one additional SATA disk to your system that is larger than your existing disks on one of the unused ports. This should let you get into the rebuild wizard, but once you're in the rebuild wizard, select the drive that you really want to rebuild to. I expect this will get around this issue. Please let me know if it works.

Reply to rockchalk

rockchalk wrote :

Can you please try the following: Add one additional SATA disk to your system that is larger than your existing disks on one of the unused ports. This should let you get into the rebuild wizard, but once you're in the rebuild wizard, select the drive that you really want to rebuild to. I expect this will get around this issue. Please let me know if it works.


Interesting...

Were you able to try this yet?

Reply to gwolfman

This may come as a major shock to all of us who have built RAID configurations using standard desktop drives, because standard desktop drives have a tendency to drop out of the RAID array. I talked to engineers at Western Digital, Seagate and Samsung, who explained that when a desktop drive performs internal error correction, the time required can exceed the time that RAID controllers allow before they drop the drive out of the array. When this occurs, the RAID controller writes something (I can't remember what or where) to the drive so that it can't be used again in the array. The Seagate engineer explained that some type of low-level reformatting needs to be done before it can be used again.

This is the reason that both Seagate and WD offer RAID drives, which cost $10 to $30 more than their standard drives. Samsung does NOT have a similar drive; I got this from their engineer. These RAID drives limit the time they spend on internal error correction before reporting back to the controller, so that they will not be dropped from the array. Two examples of these RAID drives are the Western Digital Caviar RE2 WD5001ABYS and the Seagate Barracuda ES.2 ST3500320NS.
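
The drop-out mechanism described above boils down to a comparison between two timers. A toy Python sketch, where the 8-second controller limit and both recovery times are illustrative assumptions rather than figures from any vendor:

```python
def drive_is_dropped(recovery_seconds, controller_timeout_seconds):
    """Return True if the controller gives up before the drive answers."""
    return recovery_seconds > controller_timeout_seconds

CONTROLLER_TIMEOUT = 8.0  # assumed value, for illustration only

# A desktop drive that keeps retrying a bad sector for a long time.
print(drive_is_dropped(30.0, CONTROLLER_TIMEOUT))  # True

# A RAID-class drive that caps its own recovery time and reports back.
print(drive_is_dropped(7.0, CONTROLLER_TIMEOUT))   # False
```

The point of the sketch is that the drive in the first case may be perfectly healthy; it is only slower to answer than the controller is willing to wait.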

Note the "YS" in the Western Digital suffix and the "NS" in the Seagate suffix. Both Seagate and WD also indicate that these drives are built to a higher spec.

Therefore, the person in this thread that indicated that there was no need for special RAID drives was incorrect, probably because non-RAID drives in an array will work.... for a while.

On a personal note, I learned all this AFTER I purchased a pair of Samsung 500GB Desktop drives for my RAID 1 configuration. They've been in my array for the last three months without any problems. That said, I'm going to replace them with RAID-certified drives in the next 60 days.

I just wish that the vendors, like NewEgg, would warn us and educate us with the facts so that we could make informed decisions.

Reply to techjunkiewest

Good point. When the Intel Matrix Storage Manager determines that a disk is failed, it will mark the state as "Failed." You can tell it to unmark it as failed by right-clicking on a failed disk and selecting "Mark as Normal." This will tell the driver to unset that failed bit and treat the disk as healthy again.

While this may have caused the initial RAID degradation above, it does not appear to be the reason you can't re-build. Please note in the attached system report that Hard Drive 0 is not marked as failed. It is possible that it was failed, and the user already marked the disk as normal.

Reply to rockchalk

Hard drives in RAID arrays have "metadata" on the drive that tells the controller about the drive. It sounds like your controller has somehow "marked" your drive as "defective". If you need to make that metadata go away so the controller will see that drive as a "new drive", you will need to do a low-level format of that drive. You will need to have that drive on a standard controller (not RAID) for this to work. Here is one tool that works well.

http://hddguru.com/content/en/soft [...] rmat-Tool/

It would also be a good idea to check that drive with a drive scanning tool while you are at it. I use Seatools a lot, it works for most any drive.

http://www.seagate.com/www/en-us/s [...] /seatools/
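
For anyone who wants to check whether the RAID metadata mentioned above is still present on a drive, here is a rough read-only Python sketch. It rests on an assumption taken from the Linux mdadm source rather than from Intel documentation: that Intel Matrix metadata is anchored in the second-to-last 512-byte sector and starts with the text "Intel Raid ISM Cfg Sig. ". The device path in the usage note is hypothetical.

```python
import os

SECTOR_SIZE = 512
# Assumption: signature text and location as used by Linux mdadm's
# Intel Matrix (IMSM) support, not confirmed by Intel documentation.
IMSM_SIGNATURE = b"Intel Raid ISM Cfg Sig. "

def has_imsm_signature(path):
    """Check the second-to-last 512-byte sector for the IMSM signature."""
    with open(path, "rb") as dev:
        size = dev.seek(0, os.SEEK_END)
        if size < 2 * SECTOR_SIZE:
            return False
        dev.seek(size - 2 * SECTOR_SIZE)
        sector = dev.read(SECTOR_SIZE)
    return sector.startswith(IMSM_SIGNATURE)
```

On a Linux live CD with the drive on a non-RAID port, something like `has_imsm_signature("/dev/sdb")` run as root would be the intended use. It only reads, so it cannot make things worse, but a negative result proves little if the assumption about the location is wrong for your firmware version.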

Reply to rozar

If you do use one of those tools, if it will report the actual drive size (not partition size) with a precision of bytes or kilobytes (greater precision than MB), could you please report that number here as well?
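
If none of the tools shows the size with byte precision, a few lines of Python can report it. This is a minimal sketch that assumes a system where the raw disk can be opened as a file (for example a Linux live CD, run as root); the path in the usage note is hypothetical.

```python
import os

def device_size_bytes(path):
    """Return the size in bytes of a block device or regular file."""
    with open(path, "rb") as dev:
        # Seeking to the end returns the absolute offset, i.e. the size.
        return dev.seek(0, os.SEEK_END)
```

Calling `device_size_bytes("/dev/sdb")` for each member and for the rejected drive would show whether the rejected drive is even a few sectors smaller than the others, which is one of the reasons listed in the error dialog.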

Reply to rockchalk

techjunkiewest wrote :

This may come as a major shock to all of us who have built their various RAID configurations using standard desktop drives, because standard desktop drives have a tendency to drop out of the RAID array.


I've been running my 3-disk RAID-0 for over a year with "desktop" drives and have never had a problem. Maybe you just had a faulty drive, which happens.

Reply to gwolfman

gwolfman,

Re-read my post. If any internal error correction goes on within a healthy single drive - which does occur to prolong the life of a drive - it will take longer than the controller will allow before the controller forces it out of the array. RAID-0 configurations would therefore be especially vulnerable because, as you know, if a drive drops out, for whatever reason, you've lost your stripe.

So, the fact that you've been running over a year without any problem doesn't diminish the fact that you are unnecessarily at risk. At risk because you could lose it all when a healthy drive, undergoing a normal internal process, drops out.
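
The risk argument for striping can be made concrete. If each drive independently has probability p of dropping out over some period, a RAID 0 stripe of n drives survives only if every one of them stays in. A short Python sketch, with a made-up p purely for illustration:

```python
def raid0_loss_probability(p_drive, n_drives):
    """Probability that at least one of n independent drives drops out."""
    return 1.0 - (1.0 - p_drive) ** n_drives

# p = 0.02 is an arbitrary example value, not a measured rate.
for n in (1, 2, 3):
    print(n, round(raid0_loss_probability(0.02, n), 4))
```

With that example value, three striped drives carry a bit under three times the risk of a single drive (about 5.9% against 2%), which is the sense in which a longer stripe is more exposed.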


Message edited by techjunkiewest on 01-25-2008 at 08:03:53 PM
Reply to techjunkiewest

Okay guys, I've reproduced the issue on one of my systems, and have two viable workarounds.

Based on the system report posted by lkwjeremy, it looks like he has already reset the problem drive to non-RAID either through the UI or OROM by selecting "Reset to Non-RAID." (or by zeroing the entire disk using a non-RAID system, which should do about the same thing, just take longer) My workarounds both start from this configuration, as anybody with this issue should be able to perform that task simply.

The first workaround is fairly easy. When you go into the OROM (the Ctrl-I utility), it should pop up a dialog asking you if you want to rebuild the volume to a drive. Rebuild to the disk that was originally part of the volume (Be very careful not to accidentally select a drive with data on it, as you would lose that data). When you boot into Windows, the volume will start rebuilding.

The second workaround requires that you boot the system with the problem disk still inserted, and no other disks from the volume. (Note that if you haven't marked the disk as non-RAID, you will see a failed volume show up here, and you won't be able to continue this process.) You will now be able to right-click on the drive and "mark as spare." (This right-click menu only shows up when there are no degraded volumes in the system.) Now, shut down the system and re-add all the volume disks you took out before. When Windows boots, the driver will automatically pick up the spare drive and start rebuilding the volume to it.

As a note, I have successfully tested both of these workarounds.

Reply to rockchalk

I'm having the same issue. I actually had to disable RAID to get memtest86+ to run, and when it was re-enabled, one of the disks was a 'non-raid member', and the RAID has a missing member and is marked as degraded. I do not get a rebuild RAID option.

The disk is already marked as a non-RAID member, so resetting it in the ROM interface is not an option.
I don't quite understand workaround 2 - the RAID is my system drive....???

rockchalk wrote :

Okay guys, I've reproduced the issue on one of my systems, and have two viable workarounds.

Based on the system report posted by lkwjeremy, it looks like he has already reset the problem drive to non-RAID either through the UI or OROM by selecting "Reset to Non-RAID." (or by zeroing the entire disk using a non-RAID system, which should do about the same thing, just take longer) My workarounds both start from this configuration, as anybody with this issue should be able to perform that task simply.

The first workaround is fairly easy. When you go into the OROM (the Ctrl-I utility), it should pop up a dialog asking you if you want to rebuild the volume to a drive. Rebuild to the disk that was originally part of the volume (Be very careful not to accidentally select a drive with data on it, as you would lose that data). When you boot into Windows, the volume will start rebuilding.

The second workaround requires that you boot the system with the problem disk still inserted, and no other disks from the volume. (Note that if you haven't marked the disk as non-RAID, you will see a failed volume show up here, and you won't be able to continue this process.) You will now be able to right-click on the drive and "mark as spare." (This right-click menu only shows up when there are no degraded volumes in the system.) Now, shut down the system and re-add all the volume disks you took out before. When Windows boots, the driver will automatically pick up the spare drive and start rebuilding the volume to it.

As a note, I have successfully tested both of these workarounds.


Reply to speeder2000

Hi,

I was getting this same error and was googling all over the place trying to find a solution.

I think I finally found one and thought I would share this just in case someone is ever in my shoes.


I have 5x 500 GB hard drives in a RAID 5 configuration. After flashing my motherboard BIOS (Gigabyte GA-P35-DS4), my RAID got disabled. (I guess the flash reset the BIOS to defaults, consequently shutting down the RAID.) After I re-enabled it, one of my disks fell out of the array. I tried to recover the disk in the Intel console but got the error message listed in the first post. I tried several times to low-level format it, but this didn't work.

Finally, I powered off the raid and only powered on the disk that fell out of the array to do another low level format. When I booted the system up the intel console read the drive and I was able to mark it as a spare. I shut down the PC, powered on the RAID and booted the PC. Immediately the intel console started rebuilding the RAID.

My suggestion would be the above. That is, if you haven't given up yet.

Reply to Skallah

Quote :

Finally, I powered off the raid and only powered on the disk that fell out of the array to do another low level format. When I booted the system up the intel console read the drive and I was able to mark it as a spare. I shut down the PC, powered on the RAID and booted the PC. Immediately the intel console started rebuilding the RAID.



By console, do you mean the RAID BIOS, or the Intel console in Windows? If you power down the RAID, how do you get into the Intel console, assuming that your RAID drives are your system drives as well?

Reply to speeder2000

I ran into a problem yesterday with my Intel RAID 5. I hope someone has a clue how I can rescue this..

Setup:

ASUS P5LD2 Deluxe (with the latest BIOS)
Intel Matrix Storage Manager ICH7R
3x 500GB Samsung SATA drives

When I boot up, the console says "Status Failed" and below lists all 3 drive members but shows no errors, all show "member Disk(0)" under "Type/Status(Vol ID)".

Is there any way I can get these back up? My OS was on the raid so the machine won't boot.. The boot up screen shows all 3 drives with no problems.

Dennis

Reply to GhostLobster

Quote :

By console, do you mean the RAID BIOS, or the Intel console in Windows? If you power down the RAID, how do you get into the Intel console, assuming that your RAID drives are your system drives as well?



I meant the Intel console in Windows. I didn't mess with any settings in the RAID BIOS. The way my system is set up, the RAID acts as a big storage drive. No system files (other than RAID-specific ones) are stored on it. The system drive I use is a standard 150 GB hard drive which is not associated with the RAID at all.

The raid carriage I have lets me individually power on each drive (if needed). Not too sure how common this is in most carriages but this allowed me to fix the problem I was encountering in my first post.

Reply to Skallah

this is a totally annoying situation

sent intel support an email....


Message edited by speeder2000 on 11-15-2008 at 04:14:54 AM
Reply to speeder2000

GhostLobster wrote :

I ran into a problem yesterday with my Intel RAID 5. I hope someone has a clue how I can rescue this..

Setup:

ASUS P5LD2 Deluxe (with the latest BIOS)
Intel Matrix Storage Manager ICH7R
3x 500GB Samsung SATA drives

When I boot up, the console says "Status Failed" and below lists all 3 drive members but shows no errors, all show "member Disk(0)" under "Type/Status(Vol ID)".

Is there any way I can get these back up? My OS was on the raid so the machine won't boot.. The boot up screen shows all 3 drives with no problems.

Dennis




I'm having the same problem. My 4x Raptors are in RAID 5 with the OS and everything (games, photos). I have an old mobo with only 4 ports. I can't rebuild the system in the BIOS. Anyone, please, correct me if I'm wrong:

1) RAID can only be repaired in Windows, never in the BIOS.
2) It doesn't matter which RAID level you use: make sure to have a spare HD with an OS for you to boot from in case things go badly. Even HDs built for RAID, like the NS and YS models, may fail for some unexpected reason.

My next system will be just 1 Raptor with the OS and games + 2x 1TB RAID-certified drives in RAID 1 for important stuff. This will give me all the storage I need, and safety. It won't be as fast, but I'll be able to relax and enjoy my PC.

RAID needs to be improved. In fact, there is an article saying that RAID will be dangerous in the future: as HD capacity increases, so does the probability of an error during reconstruction of a failed array, making RAIDs impossible to repair if you have a 999999999TB HD, which will happen someday in the future.
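
That argument is usually stated in terms of unrecoverable read errors (UREs). Rebuilding a degraded RAID 5 means reading every surviving drive in full, so the chance of hitting at least one URE grows with capacity. A rough Python sketch, assuming independent errors and a rate of 1 error per 10^14 bits read (a figure often quoted on consumer drive datasheets; it is an assumption here, not something from this thread):

```python
import math

def rebuild_ure_probability(bytes_to_read, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error during a rebuild."""
    bits = bytes_to_read * 8
    # 1 - (1 - r)^bits, computed in a numerically stable way for tiny r.
    return -math.expm1(bits * math.log1p(-ure_per_bit))

# Degraded 4-drive RAID 5: three surviving drives must be read in full.
for drive_tb in (0.5, 1, 4):
    total_bytes = 3 * drive_tb * 1e12
    print(drive_tb, "TB drives:", round(rebuild_ure_probability(total_bytes), 3))
```

Under those assumptions the risk is roughly 11% with 500 GB drives, 21% with 1 TB drives and 62% with 4 TB drives. Real drives often do better than the datasheet figure, so treat these as a worst-case illustration of the trend rather than a prediction.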

Reply to leandrodafontoura

I was able to get my RAID 5 back by putting in a new drive and installing the OS onto it, then the Intel RAID software. Once booted on the new OS drive, I started the Intel manager and did a recover to the RAID drives, which worked! I then shut down, removed the new OS drive and booted back on my RAID 5 without any problems.

First thing I did was a BACKUP!

Reply to GhostLobster

I like SATA RAID5. After changing motherboards, the RAID5 was seen and functional without needing additional work. The Intel Matrix Storage Manager is very simplified, but seems to work well. When a drive failed, I just replaced it with a new drive and the array was automatically rebuilt. When another drive failed, I just formatted it, marked it as normal, and then the array was rebuilt.
Yes, RAID 5 has problems, but it's a whole lot better than nothing for rebuilding data.
just my 2 cents worth.

Reply to enewmen

I am really glad I found this post, I had a very similar experience with my 4 drive RAID10 system.

The solution as proposed to boot with one drive and mark it as a spare worked, but getting an OS to boot was a big problem.

Here is what I did:
1. Vista Ultimate x64 SP1, 8GB RAM, 4x500GB HDD, 120GB RAID10 for OS, 800GB RAID10 for data, Gigabyte EP45-DS3R, ICH10R RAID chipset, Q9650 3GHz.
2. Update BIOS from F9 to F10, load optimized defaults, reboot.
3. The loading of the optimized defaults changed the drive config from RAID back to AHCI, I think this somehow confused the RAID chipset.
4. Changed BIOS config back to RAID, rebooted.
5. Now the Intel RAID BIOS reported that drive-0 is non-RAID, that drive-1 to drive-3 is RAID, and that the two RAID10 volumes are degraded.
6. I entered the RAID config by pressing Ctrl-I, but there were no options to let me move the drive back to RAID.
7. I found an Intel KB that says I am supposed to see a dialog that lets me re-assign the drive, but I had no options to do anything with the non-raid drive, see: http://support.intel.com/support/c [...] 021017.htm
8. So I figured I can still boot since the RAID1 portion of the RAID10 will still work. Two things went wrong.
9. Booting Vista gave me a boot volume can't be mounted bluescreen error. I think this was because the non-raid disk was listed before the raid boot volume in the BIOS.
10. I changed the BIOS to put the single non-raid drive after the two raid volumes.
11. I booted again and this time Vista told me it can't find winload.exe.
12. I also tried booting with the non-raid drive unplugged, same problem.
13. I booted from the Vista DVD and selected repair, but the repair could not find any Vista installs.
14. I ordered a replacement drive and paid for next day delivery, turns out I don't need it, but I will set it up as a spare.
15. I wanted to get the Intel Storage Manager software running, but I need a working OS to do this.
16. I plugged in an external eSATA drive, unplugged all internal drives, and installed Vista and the Intel version 8.6 storage manager on the eSATA drive.
17. I plugged all the internal drives back in, and booted from the eSATA drive.
18. On logging in the Intel storage manager tray icon told me that the drive is degraded, and that I must right click on the new drive and to select rebuild.
19. I right clicked on the non-raid drive and selected it for the rebuild, but got the stupid error that lists the various possible causes of the problem, but not the actual problem.
20. Intel, this is the dumbest error message ever, I want to know exactly what failed, not some list of possible causes.
21. One of the possible causes mentioned was that there is data on the drive. I opened disk manager and saw that the non-raid disk had an assigned drive letter, and had a 120GB partition on the 500GB disk.
22. I deleted the partition, rebooted, tried again, and the same winload.exe error.
23. Now that I had the (stupid) error dialog, google pointed me to this thread.
24. This is what really helped me; I removed all but the non-raid drive, and selected that disk for use as a spare.
25. I shutdown, reconnected all drives, rebooted, still booting on the esata drive, this time the rebuild automatically started.
26. After a few hours the rebuild completed, I ran chkdsk on the raid volumes, the first volume had some errors, repaired them, the second volume was fine.
27. Now I had to get Vista booting from the original boot volume again.
28. I changed the BIOS settings to place the raid volumes first, and I rebooted, but I still got the winload.exe error.
29. I booted from the Vista DVD, and selected repair, this time the repair tool found two installs, the one on the boot volume and the one on the eSATA drive, and performed some repair actions.
30. On booting again I had two “Vista Ultimate (repaired)” OS boot options, the first entry booted from the eSATA drive, the second entry booted from the RAID boot volume.
31. I powered down the eSATA drive and rebooted. I ran bcdedit.exe and deleted the first entry.
32. After rebooting the correct OS automatically loaded.
33. I think if I had run the boot repair with the eSATA drive disconnected I could probably have skipped the bcdedit step.

This still leaves me some questions:
1. Why did the drive turn into a non-RAID drive? This is of greatest concern to me, for I am bound to update the BIOS again, and the switch from RAID to AHCI is bound to happen again.
2. Why did the Intel RAID BIOS not allow me to fix the problem from the BIOS, but force me to do it from Windows?
3. Why did the right-click rebuild on the drive fail, and why did the drive have to be individually marked as a spare?

Thanks again for the advice in this thread.

P.

Reply to ptr727



Oh my god, thank you ptr727 for posting this. I recently set up my X58 i7 with 4 500GB Seagates in RAID 10.
Unknowingly, I updated my BIOS for better stability and whammo! The first disk of the RAID dropped out.

Now, I've gone about things a little differently than you. I'm back in the operating system with disks 2+3+4, but disk one is out of action and I'm not allowed to use it for the rebuild, even after using diskpart and 'cleaning' the disk information.

What I did to get back into the OS was unplug the dropped disk, insert my Vista CD and allow it to write the "Vista Ultimate... (repaired)" BCD. That got me around the winload.exe problem, and the BSOD I got before the winload problem :cry:
I was baffled why the OROM or the Intel software wouldn't allow me to rebuild the array.

So all I need to do is unplug all but the dropped one, and it will allow me to mark it as spare in the OROM?
Won't all the remaining 3 get marked as Missing? Will that fix itself when I plug them back in after?

I'm concerned I'm going to ruin it further. However, I will follow your steps bit by bit to ensure I can boot again from it, as long as it doesn't all go downhill on me. Hopefully I can reuse the disk, since it's not bad; it's a nasty flaw in the software, which I think Intel should be working with motherboard manufacturers to correct, as it is integrated RAID.

Thanks again ptr727 for posting your raid 10 solution, it will prove a data saver for me :)
Thanks TH & all the Posters who have contributed to this thread, this issue is a real doozie.

Reply to qwertylesh

For those that got this to work... I'm confused by one thing...

Quote :

I was able to get my RAID 5 back by putting in a new drive and installing the OS onto it, then the Intel RAID software. Once booted on the new OS drive, I started the Intel manager and did a recover to the RAID drives, which worked! I then shut down, removed the new OS drive and booted back on my RAID 5 without any problems.

First thing I did was a BACKUP!



Are you saying you configure your drives as SATA (not RAID), then get into windows that way.....
I'm OK up until this part.

But now when I want to install Intel Matrix Storage Manager, it won't let me (says I don't meet the requirements). My guess is that it doesn't see any RAID setup?

Correct me where I go wrong:

I have a RAID 10 w/ 4 drives. One shows as "non raid" and I cant boot....

Solution:
Unplug all but non-raid disk
Config BIOS from RAID to SATA
Install Vista on non-raid disk
Install Matrix Storage Manager on non-raid disk (this is where I'm stuck)
Repair drives w/ Intel Matrix Storage Manager
Shutdown, setup bios back for RAID again.
Cross fingers.

I'm going wrong somewhere , just not sure?

Reply to bluearchtop

The method I used was like this:
I detached all 4 of my RAID 10 disks.

Plugged in two standard non-RAID disks from an old computer.
Kept the motherboard BIOS in RAID mode, and then installed Windows onto one of the two non-RAID disks from the old computer (the data on it didn't matter to me).
Installed the Matrix Storage Manager on the new Windows.
Shut down, added the one failed RAID disk to the other two new disks.
Started up and loaded the Matrix Storage Manager.
Right-clicked the failed disk and got to choose "mark as spare."

Don't set up Windows on the failed disk; set it up on a disk that is not part of your RAID array. Then you 'spare' the failed disk with that install, and then you're done with that install. Once the disk is a spare, you go back to rebuilding your array with its remaining disks.

Perhaps someone could explain it better. But the method does work. It might be a PITA and you'll prolly never update the BIOS again on a RAID system because of it, but at least it's repairable.

Reply to qwertylesh

Guys,
I've had a different problem with Intel Matrix Storage, but I thought it would be nice to share the solution here so people searching the net can find it easily.
My problem was that I had two disks in a mixed RAID0-RAID1 configuration for both speed and safety. While I was updating the BIOS of my motherboard (Gigabyte GA-P35-DS4, ICH9R), the Matrix Storage decided to drop one of the disks and label it as "Non-Raid", which led to a Degraded RAID1 and a FAILED :( RAID0. As my OS (Windows XP + Windows 7 dual boot) was on the RAID0, I felt "toasted". I did a lot of reading the other day and found a solution that helped me not lose ANY data at all :)
This post here "showed me the light":
http://forums.extremeoverclocking. [...] ostcount=6

 

As the OP wrote, deleting RAID arrays in the BIOS (O-ROM) doesn't actually delete your data - only the metadata that tells the Matrix Storage what arrays are configured. It also clears the MBR and partition tables, but with a tool like TestDisk you can easily fix this.

This absolutely saved my day (and a couple of nights to be precise).

I hope this helps someone else too.


Message edited by pankov on 04-19-2009 at 04:43:12 PM
Reply to pankov

I have a somewhat similar problem at the moment. I also have a RAID 10 array with 4 150GB Raptors, and one of them was reported as failed. I have not had any problem booting into XP, except when I relocated the "Failed" disk to a JMicron port, where booting would stall at the XP splash screen. I am using an ASUS P5B Deluxe with ICH8R.

To save anyone else from a possible heart attack or crapping your pants, don't do what I tried. Within Windows, using the Intel Matrix Storage program, I selected "Mark drive as normal" to see if this would fix anything. When I did this, the entire PC froze. I mean the entire system - nothing worked, not even toggling Caps Lock or Num Lock. After a few minutes, it came back alive for a few seconds and then froze again. During this, I was freaking out. After another few minutes, it came back alive and the "Failed" drive was marked as missing. Now the Intel RAID BIOS shows the bad drive as having a SMART event.

I have tried everything to get the WD Diagnostics program to check the drive, but nothing works. When booting to the DOS disk of the WD tool, it says no drives exist. What friggin crap.

I now figure the drive is actually dead. Let's hope that WD sends me a Velociraptor as my replacement, as they did for another member here.

PS The default TLER has not been changed.

Reply to specialk90

If the drive is faulty, don't try to resilver your array with it. You should replace it. :P

What may have happened is that the array incorporated the faulty disk as soon as you clicked "Mark drive as normal", and requests to it then timed out, probably because the damaged disk did not respond in time.

As for the WD diagnostic tool not recognising your disk, can you set the Serial ATA transfer mode to "IDE" or "LEGACY" or something similar? That would enable DOS apps to see your disk.

------------------------------ ...man will occasionally stumble over the truth, but usually manages to pick himself up, walk over or around it, and carry on.
Reply to sub mesa

Sub mesa, I actually did try and set the BIOS to IDE mode and then booted to the DOS disk and it still said there were no drives. Great minds think alike :)

Reply to specialk90

Just for info,

Gigabyte X38-DQ6 motherboard, everything working fine with BIOS F8, but as I was updating all drivers, I decided to update the BIOS to F9I !!!! What a mistake!

On rebooting I lost one of my RAID5 hard drives and I could not do anything to get it back in. Hours of work ensued backing up my work from the two functioning drives... Fortunately my configuration is:

2x WD Raptors in RAID0 = 148GB boot drive with all software (no problems to date, approx. 18 months old)
3x Samsung 500GB SATA II = 938GB data drive with all work
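
For anyone checking figures like these, here is a minimal Python sketch of the usual capacity arithmetic for the RAID levels mentioned in this thread. It assumes identical disks and ignores controller metadata and file system overhead; the function names are made up for illustration. Keep in mind that Windows shows sizes in binary units, so the number you see there is lower than the decimal figure on the drive label.

```python
def usable_capacity_gb(level: str, disks: int, disk_gb: float) -> float:
    """Usable capacity of an array of identical disks, in the unit of disk_gb.

    Simplified textbook arithmetic: metadata and file system overhead
    are ignored.
    """
    if level == "0":                      # striping, no redundancy
        return disks * disk_gb
    if level == "1":                      # mirror
        return disk_gb
    if level == "5":                      # one disk's worth of parity
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_gb
    if level == "10":                     # striped mirrors
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, 4 or more")
        return disks / 2 * disk_gb
    raise ValueError(f"unsupported RAID level: {level}")


def decimal_gb_to_gib(gb: float) -> float:
    """Convert decimal gigabytes (10^9 bytes) to binary gibibytes (2^30 bytes)."""
    return gb * 10**9 / 2**30


if __name__ == "__main__":
    for level, disks, size in (("0", 2, 74), ("5", 3, 500), ("10", 4, 500)):
        gb = usable_capacity_gb(level, disks, size)
        print(f"RAID {level}, {disks} x {size} GB: {gb:.0f} GB "
              f"(about {decimal_gb_to_gib(gb):.1f} GiB)")
```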

The steps I took are as follows

1. Tried to use the Intel RAID manager program to put the disk back - failed
2. Stupidly formatted the hard drive that had dropped out of the array
3. Tried to re-add the drive, which now appeared in Windows as a 500GB HDD - failed
4. Removed the partition from the drive to make it raw, tried to add it to the array - failed
5. Backed up all data to external drive(s)
6. Used the Samsung HDD utility to low-level format the problem drive (all others disconnected)
7. Rebooted, pressed CTRL+I and removed the RAID array
8. Recreated the array with all drives - success
9. Restarted Windows Vista
10. Vista found the hard drives - blank
11. Tried to recover the drives using TestDisk - it requested a reboot - on reboot Windows wanted to run chkdsk on the RAID5 drive; skipping it caused Windows to fail - Vista would not load
12. Forced a reboot, removed and re-created the array
13. Created a new 934GB partition in Windows Vista, now formatting
14. Copy data back and hope I got all of it

Hindsight

I should have backed up, then removed the array and re-created it - I think this would have worked.

DON'T UPDATE BIOS !!!! It doesn't always make it go faster, as I had hoped.

SPEC

Gigabyte X38-DQ6 BIOS F9I
Intel 6850 3.0GHz Retail - Stock H/S & Fan
2x 2GB Corsair 6400 CL5 DDR2
2x 1GB Corsair 6300 CL5 DDR2
Gainward GeForce GT8800 512MB
2x WD Raptor 74GB RAID 0 - Windows Vista 32bit & other OS related software
3x Samsung 500GB RAID 5 - DATA and backups
2x Pioneer 112 IDE DVDRW
1x Sony 1.44MB
Antec Sonata III Case
Antec 500W Earthwatt PSU

Reply to rrm

I had a similar problem, and today I managed to resolve it as follows:

Intel ICH9R, RAID 5 with 3 drives.

The drive on port 0 failed, so it was kicked out of the RAID volume.

When I tried to put the disk back with Intel Matrix Storage Manager, it refused to restore the volume.
When I tried to restore the volume, the disk only changed from non-RAID member to 'spare'.
The Intel BIOS utility did not help in any way.

Well, the problem was that Intel Matrix, or maybe the ICH9R BIOS, had set a small HPA (Host Protected Area) on the kicked disk.
So the disk had one megabyte less than before.

To resolve this you need to disable the HPA with a program like HDAT2. After that, the disk is detected as an offline RAID member,
and you can start a successful restore of your RAID volume.
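
To make the size argument concrete, here is a small Python sketch of the arithmetic. It only does the maths and does not talk to any drive. The sector counts are hypothetical example values for a 500 GB disk with 512-byte sectors, and the function names are made up for illustration. It shows why a disk that lost one megabyte to an HPA can fail the "not large enough" check from the wizard error quoted at the top of the thread, even though nothing is physically wrong with it.

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def hpa_sectors(native_max_sectors: int, current_max_sectors: int) -> int:
    """Number of sectors hidden by a Host Protected Area (0 means no HPA)."""
    if current_max_sectors > native_max_sectors:
        raise ValueError("visible size cannot exceed the native size")
    return native_max_sectors - current_max_sectors


def is_large_enough(candidate_sectors: int, required_sectors: int) -> bool:
    """A rebuild target must offer at least as many sectors as the array needs."""
    return candidate_sectors >= required_sectors


if __name__ == "__main__":
    native = 976_773_168                     # hypothetical 500 GB drive
    hidden = (1024 * 1024) // SECTOR_BYTES   # one megabyte expressed in sectors
    visible = native - hidden                # what the controller sees with the HPA set

    print("sectors hidden by HPA:", hpa_sectors(native, visible))
    print("accepted with HPA set:", is_large_enough(visible, native))
    print("accepted after HPA removed:", is_large_enough(native, native))
```

The real sector counts come from whatever tool you use to inspect the drive (HDAT2 in the post above); the values here are only placeholders.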

I hope this helps.

Reply to Anonymous

I had the same issue: RAID 1+0, 4 disks, 500 GB, Vista 64-bit.
1. Low-level format (tool from hddguru.com) the drive that the RAID does not like. I had to do this on a separate computer.
2. Quick format the same drive.
3. Unplug all the RAID drives.
4. Plug in a separate drive and install Vista 64.
I had an issue with the PC being very, very slow. I found that I had a USB flash drive plugged in at the back that I had forgotten about. I was using it because it is supposed to speed up loading of programs. It was bringing my PC to a standstill. After removing it, the install went very fast.
5. Plug in the drive that did not work.
6. Install and run Intel Matrix Storage Manager.
7. Right-click the drive that did not work and mark it as spare.
8. Shut down.
9. Attach the other RAID HDD(s).
10. When the PC loads again, the rebuild will start.

I cannot thank this group enough!
I will look into getting better drives when I get the money.

Reply to Anonymous

Really, isn't it just the metadata sector that you need to wipe from the "broken" disk?

For example, you could also boot an Ubuntu live CD (doesn't require any installation) and overwrite the last sector of that drive with a dd command in a terminal. That would wipe the (stale) RAID config data, which is probably the culprit. Since it's only 512 bytes, it's also much faster than a complete zero-write. :)

A Windows "format" (even a full format, at least on XP) doesn't actually write zeroes to all sectors, just the beginning (first 20MB or so). The rest is just reading the sectors, not zero-writing them. Vista and later do zero-write during a full format, but only inside the partition being formatted. Thus a Windows format is unlikely to work in this case.

After the drive loses its last sector with the RAID config data, the RAID engine won't recognise this disk and will mark it as free/unused.
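
Getting the seek offset wrong with dd is an easy way to destroy data, so here is a minimal Python sketch that only works out the offset and prints the command; it does not run anything. The device name /dev/sdX and the disk size are placeholders, and the helper name is made up. The post above names the last sector; where exactly the controller keeps its metadata is not confirmed in this thread, so the number of sectors to wipe is left as a parameter.

```python
def last_sectors_dd_command(device: str, size_bytes: int,
                            sectors: int = 1, sector_bytes: int = 512) -> str:
    """Build (but do not run) a dd command that zeroes the last N sectors.

    size_bytes is the whole-disk size, e.g. as reported on Linux by
    `blockdev --getsize64 /dev/sdX`.
    """
    if size_bytes % sector_bytes:
        raise ValueError("disk size is not a whole number of sectors")
    total_sectors = size_bytes // sector_bytes
    if not 0 < sectors <= total_sectors:
        raise ValueError("sector count out of range")
    first_sector_to_wipe = total_sectors - sectors
    return (f"dd if=/dev/zero of={device} bs={sector_bytes} "
            f"seek={first_sector_to_wipe} count={sectors}")


if __name__ == "__main__":
    # Placeholder values: replace the device and size with those of the
    # dropped disk, and double-check the device name before running anything.
    print(last_sectors_dd_command("/dev/sdX", 500_107_862_016))
```

If one sector turns out not to be enough, raising `sectors` widens the wiped area at the end of the disk. Only do this on the disk that was kicked out, never on a disk that is still a member of the array.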

Reply to sub mesa

This seems the closest I've found to the problem I am having. I recently set up RAID5 using 4x320GB Seagate HDDs on a Gigabyte GA-965P-DQ6 motherboard (with Intel ICH8R controller). Everything ran fine for the first day (numerous reboots for a fresh install). The next day when I booted, the fourth drive in the array was misdetected by the RAID BIOS and it stalled. I removed the drive and everything ran fine in degraded mode, so I took the drive back and exchanged it for a new one. I installed the new drive and let it rebuild the array; on the next reboot it was misdetected by the BIOS and the system stalled. So I booted it as a non-RAID member, tested it (no errors), reformatted it, plugged it back in and it was detected fine; I rebuilt the array, and on the next reboot it was misdetected again.

To clarify what I mean by 'misdetected': originally I had "SERIAL ATA AHCI BIOS Version iSRC 1.07", but after having troubles I tried updating, so I now have "SERIAL ATA AHCI BIOS Version iSRC 1.20E". When it starts, the first line states there are 6 ports and 4 devices. It then lists the first three drives/ports; when it reaches the fourth, it halts without writing anything. So it detects that the drive exists but can't seem to determine what kind of drive it is, and only after the drive has been incorporated into the array; otherwise the drive has no problems.

The problem has me stumped. If anyone has any ideas of what might be causing it, I'd be happy to hear them :/

Perhaps the oddest part of it all is that I suspect one of the older drives is occasionally invoking ERC, which is meant to be a surefire way to get a drive booted from an array, yet the controller has had no problems with that drive.

Reply to Anonymous

After pulling whats left of my hair out trying to find a solution to my problem, I found this thread. Wow! Great information - sure glad I found it:

I am running 4 Seagate 500 gig drives in a RAID 10. Over the past few months or so it has started to give me problems with one drive. I would get a BSOD pegging the iaStor.sys file as the culprit. On reboot, my RAID setup would enter the rebuild state and rebuild the array.

After having this happen about 10 times in 3 months, the drive was reported as an offline member. Try as I might, I could not get it back online. That is, until I read this thread.

I have spent countless hours playing around with my system to try and figure it out. Updating the drivers to the latest version did nothing. At one point, I tried to determine which one of my 6 SATA connectors was Port 0, the offending one. Well, I figured wrong and unplugged the wrong drive! Now I couldn't even boot my system. It then reported 3 of my 4 drives as offline. Oh crap! My data!!!! After much effort and experimenting, I found an easy way to tell which port is hooked to which drive: the info screen in the RAID setup gives the serial numbers of the drives on the 4 ports. So then it was a matter of pulling the drives out, reading the serial numbers and going from there.

I got my system back up and running on 3 drives again by unplugging the one originally reported as Offline. Windows reports this drive as "Missing".

After I found this thread, I right-clicked on the offending drive and was able to set it to a non-RAID disk. After doing that, I was able to right-click again and select rebuild array to this disk. That is what it is doing right now. With over 600 gigs of stuff, it takes a while.

Now my only concern is that this drive may be going south and perhaps that is the reason it was "marked" in the first place.

Bottom line here is, thank you to all who have posted in this thread for the last 21 months or so. This information has been very valuable to me.

BTW, I am running the RAID 10 on an Asus Maximus Extreme board, liquid cooled, dual video cards, and all the other good stuff.

Reply to BobsYourUncle

techjunkiewest wrote :

This may come as a major shock to all of us who have built our various RAID configurations using standard desktop drives, because standard desktop drives have a tendency to drop out of the RAID array. I talked to engineers at Western Digital, Seagate and Samsung, who explained that when a desktop drive performs internal error correction, the time required exceeds the time that RAID controllers allow before they drop the drive from the array. When this occurs, the RAID controller writes something (I can't remember what or where) to the drive so that it can't be used again in the array. The Seagate engineer explained that some type of low-level reformatting needs to be done before it can be used again.

This is the reason that both Seagate and WD offer RAID drives - they're $10 to $30 more than their standard drives...

I just wish that vendors like NewEgg would warn us and educate us with the facts so that we could make informed decisions.



Awesome info, I never knew this either. Same thing here: I built a RAID 5 from off-the-shelf Seagate drives and already had a drive drop out of the RAID, but the Intel driver fortunately let me "mark as normal" to get it to rebuild without a low-level format. I will replace these with RAID-spec HDDs at my next upgrade. Thanks again for sharing!

Reply to mrm125

Disks with TLER drop out of the array even sooner; they just give up and report a bad sector. The RAID driver hears about the bad sector and kicks the drive out of the array.

TLER is useful to keep server systems with RAID from 'stalling' or 'freezing' whenever an HDD is not working properly. It's a "do or die" setting: either work 100% or you'll be kicked out of the array really soon. No mercy! TLER is not useful for consumers.

Also, consumer disks like WD Green do support TLER; it's just disabled by default because consumers have little need for this feature. So you don't need any "RAID edition" disks; I've used countless desktop-class HDDs in RAID, although I use more advanced software RAID instead of the onboard RAID with its Windows-only drivers.

Reply to sub mesa

sub mesa wrote :

So you don't need any "RAID edition" disks; I've used countless desktop-class HDDs in RAID, although I use more advanced software RAID instead of the onboard RAID with its Windows-only drivers.



Can you elaborate a bit on the advanced RAID software?
And what is TLER?

Reply to BobsYourUncle

Advanced RAID software: I would consider the BSD GEOM RAID drivers, md-raid and/or ZFS advanced RAID. Unfortunately, Windows does not offer any advanced RAID.

TLER = Time-Limited Error Recovery, basically the feature many people point to when they say you should get RAID edition disks and not 'normal' disks. This statement can be untrue for two reasons:
1) Normal disks sometimes have TLER too, but disabled by default. For example, WD Green disks.
2) Consumers should not need TLER, and it might even harm them. TLER is for servers, not for desktop systems.

TLER also won't prevent disks from dropping out of the array. TLER will even make this happen more quickly than with disks without TLER. So this is a misconception about what TLER is useful for. It's useful for VERY IMPORTANT servers that cannot afford to be stalled or frozen for even half a minute. If one disk so much as scratches its bum, they want to kick that disk out, throw it in the waste bin, and drop in one of the 200 replacement disks lying on their shelves. This is what TLER is useful for.
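
The two views in this thread (desktop drives get dropped because they take too long, versus TLER drives get dropped because they report errors quickly) can be put side by side in a toy Python model. This is only a sketch of the timing logic as described in the posts above, not how any particular controller or drive firmware behaves. The 8 second controller timeout and 7 second TLER limit are illustrative assumptions, and the function name is made up.

```python
from typing import Optional


def controller_outcome(recovery_s: float,
                       controller_timeout_s: float,
                       tler_limit_s: Optional[float] = None) -> str:
    """Describe what the RAID layer sees for one hard-to-read sector.

    recovery_s           -- how long the drive would keep retrying on its own
    controller_timeout_s -- how long the controller waits for an answer
    tler_limit_s         -- TLER cap on retry time, or None if TLER is off
    """
    gives_up_early = tler_limit_s is not None and recovery_s > tler_limit_s
    busy_s = tler_limit_s if gives_up_early else recovery_s

    if busy_s > controller_timeout_s:
        return "no answer before the timeout: drive dropped from the array"
    if gives_up_early:
        return "drive reports a read error quickly: controller decides its fate"
    return "sector recovered in time: nothing happens"


if __name__ == "__main__":
    # Illustrative numbers only (assumptions, not vendor specifications).
    CONTROLLER_TIMEOUT_S = 8.0
    TLER_LIMIT_S = 7.0

    for recovery_s in (2.0, 30.0):
        for tler in (None, TLER_LIMIT_S):
            label = "TLER on " if tler is not None else "TLER off"
            outcome = controller_outcome(recovery_s, CONTROLLER_TIMEOUT_S, tler)
            print(f"{recovery_s:>4.0f}s recovery, {label}: {outcome}")
```

In this model, what happens after a quick error report is left to the controller, which is exactly the point the posters disagree on.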

Reply to sub mesa