ICH9R; RAID 5; cannot re-add drive to volume

ic3f1re

Jan 20, 2008
Hi,

I cannot re-add the third drive to my RAID 5 volume after a crash. I'm very sad about it because I need data redundancy (important data -.- can't afford to lose it).

Mainboard: GA-P35C-DS3R (ICH9R-chipset)
Storage: 3x SAMSUNG HD501LJ (500GB) @ RAID 5

The Intel Matrix Storage Manager gives me the following error when I click "rebuild on selected drive":

Rebuild RAID Volume Wizard
The volume cannot be rebuilt to the selected hard drive due to one of the following reasons:
- The hard drive contains system files or is the system hard drive
- The hard drive is not large enough to be used for the rebuild action
- The hard drive has reported a SMART event
- The hard drive has reported a failure



To solve the problem...

- I did a low-level format on the "no longer accepted" drive -> no system data etc. on that drive -> NOT MY PROBLEM
- I checked the size: the "no longer accepted" drive was in that array before the crash -> if it has not changed its size (and I really believe that it did not), it can't be too small -> NOT MY PROBLEM
- I checked the SMART status with SpeedFan -> all OK, no errors, no warnings -> NOT MY PROBLEM
- I checked the RAID BIOS -> drive is in "Non-RAID" mode and there are no errors or warnings at all -> NOT MY PROBLEM

So, what is the problem? I don't know what to do... please help... *cry* :/

Thanks in advance,
ic3


 

rockchalk

Oct 9, 2007
Please post the system report from the Intel Matrix Storage Manager (go to File | Save System Report). Make sure to do this with all hard drives attached.
 

lkwjeremy

Jan 21, 2008
OMG, I've just encountered a very similar problem and I'm at my wits' end. It seems like Intel RAID marks the drive somewhere, so it knows that the drive has been used in the array before but dropped out for some reason. Like ic3f1re, I've done all the low-level diagnostics, even clearing the MBR/boot code sectors of the drive. I had to get another hard drive, and the Intel RAID console happily rebuilt the array with the new drive.

I know the rebuild works in some cases, as I have a friend who kept pulling one of his HDs out and putting it back in to use it as temporary storage. On his system, the Intel BIOS detected when the drive was put back and automatically started the rebuild, and on another occasion he manually used the console to begin the rebuild, all on the same array and hard drives.

I've got a feeling we're missing something about the "broken" drive...

System Information

Kit Installed: 7.8.0.1012
Kit Install History: 7.8.0.1012
Shell Version: 7.8.0.1013

OS Name: Microsoft Windows XP Professional
OS Version: 5.1.2600 Service Pack 2 Build 2600
System Name: MAGICAL3
System Manufacturer: Gigabyte Technology Co., Ltd.
System Model: P35-DS3R
Processor: Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz
BIOS Version/Date: Award Software International, Inc. F11, 01/04/2008

Language: ENU



Intel(R) RAID Technology

Intel RAID Controller: Intel(R) ICH8R/ICH9R SATA RAID Controller
Number of Serial ATA ports: 6

RAID Option ROM Version: 7.5.0.1017
Driver Version: 7.8.0.1012
RAID Plug-In Version: 7.8.0.1013
Language Resource Version of the RAID Plug-In: 7.8.0.1013
Create Volume Wizard Version: 7.8.0.1013
Language Resource Version of the Create Volume Wizard: 7.8.0.1013
Create Volume from Existing Hard Drive Wizard Version: 7.8.0.1013
Language Resource Version of the Create Volume from Existing Hard Drive Wizard: 7.8.0.1013
Modify Volume Wizard Version: 7.8.0.1013
Language Resource Version of the Modify Volume Wizard: 7.8.0.1013
Delete Volume Wizard Version: 7.8.0.1013
Language Resource Version of the Delete Volume Wizard: 7.8.0.1013
ISDI Library Version: 7.8.0.1013
Event Monitor User Notification Tool Version: 7.8.0.1013
Language Resource Version of the Event Monitor User Notification Tool: 7.8.0.1013
Event Monitor Version: 7.8.0.1013

Array_0000
Status: No active migration(s)
Hard Drive Write Cache Enabled: Yes
Size: 1192.3 GB
Free Space: 0 GB
Number of Hard Drives: 4
Hard Drive Member 1: Hitachi HDT725032VLA360
Hard Drive Member 2: Hitachi HDT725032VLA360
Hard Drive Member 3: Hitachi HDT725032VLA360
Hard Drive Member 4: Missing hard drive
Number of Volumes: 1
Volume Member 1: Data

Data
Status: Degraded
System Volume: No
Volume Write-Back Cache Enabled: Yes
RAID Level: RAID 5 (striping with parity)
Strip Size: 64 KB
Size: 894.1 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes
Number of Hard Drives: 4
Hard Drive Member 1: Hitachi HDT725032VLA360
Hard Drive Member 2: Hitachi HDT725032VLA360
Hard Drive Member 3: Hitachi HDT725032VLA360
Hard Drive Member 4: Missing hard drive
Parent Array: Array_0000

Hard Drive 0
Usage: Non-RAID hard drive
Status: Normal
Device Port: 0
Device Port Location: Internal
Current Serial ATA Transfer Mode: Generation 2
Model: Hitachi HDT725032VLA360
Serial Number: VFM201R81PAJVN
Firmware: V54OA7EA
Native Command Queuing Support: Yes
System Hard Drive: No
Size: 298 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes

Hard Drive 1
Usage: Array member
Status: Normal
Device Port: 1
Device Port Location: Internal
Current Serial ATA Transfer Mode: Generation 2
Model: Hitachi HDT725032VLA360
Serial Number: VFM201R81JXTWN
Firmware: V54OA7EA
Native Command Queuing Support: Yes
Hard Drive Write Cache Enabled: Yes
Size: 298 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes
Number of Volumes: 1
Volume Member 1: Data
Parent Array: Array_0000

Hard Drive 2
Usage: Array member
Status: Normal
Device Port: 2
Device Port Location: Internal
Current Serial ATA Transfer Mode: Generation 2
Model: Hitachi HDT725032VLA360
Serial Number: VFM201R818297N
Firmware: V54OA7EA
Native Command Queuing Support: Yes
Hard Drive Write Cache Enabled: Yes
Size: 298 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes
Number of Volumes: 1
Volume Member 1: Data
Parent Array: Array_0000

Hard Drive 3
Usage: Array member
Status: Normal
Device Port: 3
Device Port Location: Internal
Current Serial ATA Transfer Mode: Generation 2
Model: Hitachi HDT725032VLA360
Serial Number: VFM201R81K8HVN
Firmware: V54OA7EA
Native Command Queuing Support: Yes
Hard Drive Write Cache Enabled: Yes
Size: 298 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes
Number of Volumes: 1
Volume Member 1: Data
Parent Array: Array_0000

Hard Drive 4
Status: Missing

Unused Port 0
Device Port: 4
Device Port Location: Internal

Unused Port 1
Device Port: 5
Device Port Location: Internal
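
As a rough cross-check of the sizes in the report above, here is a minimal sketch of the usual RAID 5 capacity arithmetic: one member's worth of space holds parity, so the usable size is (n - 1) times the member size. The function name is made up for illustration, and 298 GB is the rounded per-drive size from the report, so the results only approximately match the 1192.3 GB array and 894.1 GB volume shown there.

```python
def raid5_usable_gb(member_count: int, member_size_gb: float) -> float:
    """Usable capacity of a RAID 5 volume: one member's worth of space holds parity."""
    if member_count < 3:
        raise ValueError("RAID 5 needs at least three members")
    return (member_count - 1) * member_size_gb


if __name__ == "__main__":
    members, size_gb = 4, 298.0  # rounded values taken from the report above
    print("raw array size:", members * size_gb, "GB")
    print("usable RAID 5 size:", raid5_usable_gb(members, size_gb), "GB")
```

The same arithmetic applies to the 3x 500 GB array in the first post: roughly 1000 GB usable out of 1500 GB raw.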
 
Good luck with this; maybe someone with more experience with RAID 5 can help.
It appears that you set up a RAID 5 array and considered it good enough to be a backup.
"Redundancy" is a situation where a server or PC can stay up and keep working in the event of a drive failure. It is not in any way a form of "backup", which is what you should have.
If you have data that you can't afford to lose, you need to back it up onto another type of media, like CDs or DVDs. That way, in the event of a failure like the one you are experiencing now, you do not lose your important data.
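
To make the "redundancy" point concrete, here is a toy sketch of the XOR parity idea behind RAID 5. It is only an illustration under simplifying assumptions: two data blocks and one parity block, with made-up names. A real controller works on fixed-size strips and rotates parity across the members.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("need at least one block, and all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two "drives" holding data and a third holding their parity.
data_a = b"important"
data_b = b"documents"
parity = xor_blocks([data_a, data_b])

# If the drive holding data_b is lost, its contents can be recomputed
# from the surviving data block and the parity block.
recovered_b = xor_blocks([data_a, parity])
```

This only survives the loss of one block at a time, and it does nothing against accidental deletion or corruption, which is why it is not a substitute for a backup.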


 

gwolfman

Jan 31, 2007


I don't run RAID5, but from all my daily reading on various websites, I've come to understand that once a drive is reported "bad", it's marked somehow so that you cannot use the drive in your RAID5 again, to help prevent data loss. Maybe your S.M.A.R.T. shows everything OK now, but maybe when it first dropped out it reported an error and that's when it was "marked" "bad". What HDDs do you have (brand/model)? Maybe you'll have to buy another one; it should work fine then. Anything as large as or larger than the other 2 (or more) drives working in the RAID5 will work, but hopefully you'll get the same model.

Maybe search around about this "marking" of the drives to see if you can undo it somehow. That is, if you really trust that the drive is OK, then go ahead and try to use it again. Good luck and keep me posted, I'm curious how to fix this. :)
 

rockchalk

Oct 9, 2007
lkwjeremy -- when you took this snapshot, were you unable to rebuild to the drive on port 0? Also, you said it was a very similar problem-- did you get the same error message?

lkwjeremy or ic3f1re
Can you please try the following: add one additional SATA disk, larger than your existing disks, on one of the unused ports. This should let you get into the rebuild wizard. Once you're in the rebuild wizard, select the drive that you really want to rebuild to. I expect this will get around the issue. Please let me know if it works.
 

gwolfman

Jan 31, 2007

Interesting...

Were you able to try this yet?
 

techjunkiewest

Jan 24, 2008
This may come as a major shock to all of us who have built various RAID configurations using standard desktop drives, because standard desktop drives have a tendency to drop out of the RAID array. I talked to engineers at Western Digital, Seagate and Samsung, who explained that when a desktop drive performs internal error correction, the time required can exceed the time that RAID controllers allow before they drop the drive out of the array. When this occurs, the RAID controller writes something (I can't remember what or where) to the drive so that it can't be used again in the array. The Seagate engineer explained that some type of low-level reformatting needs to be done before it can be used again.

This is the reason that both Seagate and WD offer RAID drives - they're $10 to $30 more than their standard drives. Samsung does NOT have a similar drive - I got this from their engineer. These RAID drives limit the amount of time they spend on internal error recovery before reporting to the controller, so that they will not be dropped from the array. Two examples of these RAID drives are: Western Digital Caviar RE2 WD5001ABYS, and Seagate Barracuda ES.2 ST3500320NS

Note the "YS" in the Western Digital suffix and the "NS" in the Seagate suffix. Both Seagate and WD also indicate that these drives are built to a higher spec.

Therefore, the person in this thread that indicated that there was no need for special RAID drives was incorrect, probably because non-RAID drives in an array will work.... for a while.

On a personal note, I learned all this AFTER I purchased a pair of Samsung 500GB Desktop drives for my RAID 1 configuration. They've been in my array for the last three months without any problems. That said, I'm going to replace them with RAID-certified drives in the next 60 days.

I just wish that the vendors, like NewEgg, would warn us and educate us with the facts so that we could make informed decisions.
 

rockchalk

Oct 9, 2007
Good point. When the Intel Matrix Storage Manager determines that a disk has failed, it will mark the state as "Failed." You can unmark it by right-clicking on the failed disk and selecting "Mark as Normal." This tells the driver to unset the failed bit and treat the disk as healthy again.

While this may have caused the initial RAID degradation above, it does not appear to be the reason you can't re-build. Please note in the attached system report that Hard Drive 0 is not marked as failed. It is possible that it was failed, and the user already marked the disk as normal.

 

rozar

Jun 7, 2007
Hard drives in RAID arrays have "meta data" on the drive that tells the controller about the drive. It sounds like your controller has somehow "marked" your drive as "defective". If you need to make that "meta data" go away so the controller will see that drive as a "new drive", you will need to do a low lvl format of that drive. You will need to have that drive on a standard controller (not RAID) for this to work. Here is one tool that works well.

http://hddguru.com/content/en/software/2006.04.12-HDD-Low-Level-Format-Tool/

It would also be a good idea to check that drive with a drive scanning tool while you are at it. I use Seatools a lot, it works for most any drive.

http://www.seagate.com/www/en-us/support/downloads/seatools/
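
For anyone who wants to check whether such metadata is still on a drive before or after wiping it, here is a hedged, read-only sketch. It rests on an assumption taken from how the Linux mdadm tool handles Intel Matrix ("imsm") metadata, not from anything confirmed in this thread: that the metadata anchor sits in the second-to-last sector of the disk and starts with the text "Intel Raid ISM Cfg Sig. ". The function name is made up, and the path you pass in (a raw disk image, or a device node on a Linux live system) is up to you.

```python
import os

# Assumption: signature text and location as used by Linux mdadm for "imsm" metadata.
IMSM_SIGNATURE = b"Intel Raid ISM Cfg Sig. "
SECTOR_SIZE = 512  # matches the 512-byte sectors shown in the system report


def has_imsm_signature(path: str, sector_size: int = SECTOR_SIZE) -> bool:
    """Return True if the second-to-last sector starts with the IMSM signature.

    Opens the file or device read-only and never writes to it.
    """
    with open(path, "rb") as dev:
        size = dev.seek(0, os.SEEK_END)
        if size < 2 * sector_size:
            return False
        dev.seek(size - 2 * sector_size)
        return dev.read(len(IMSM_SIGNATURE)) == IMSM_SIGNATURE
```

If this returns False on a drive that the controller still refuses, the "marking" is probably not what is blocking the rebuild, which would fit rockchalk's observation about the system report.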
 

rockchalk

Oct 9, 2007
If you do use one of those tools, and it reports the actual drive size (not the partition size) with a precision of bytes or kilobytes (finer than MB), could you please post that number here as well?
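
If none of the tools shows the size that precisely, one possible way to get it is to seek to the end of the device and read back the offset. This is a minimal sketch that assumes a Linux live environment with the drive visible as a block device (a path such as /dev/sdX is a placeholder, and reading it usually needs root); it is not expected to work against raw physical drives from within Windows. The function names are made up for illustration.

```python
import os


def size_in_bytes(path: str) -> int:
    """Exact size of a file or block device, found by seeking to its end."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, 0, os.SEEK_END)
    finally:
        os.close(fd)


def describe(size: int, sector_size: int = 512) -> str:
    """Format a byte count as bytes, whole sectors, and GiB."""
    sectors = size // sector_size
    return f"{size} bytes = {sectors} sectors of {sector_size} bytes = {size / 2**30:.2f} GiB"
```

Comparing the byte counts of the rejected drive and a working member would show directly whether the "not large enough" reason in the error dialog could apply.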
 

gwolfman

Jan 31, 2007

I've been running my 3-disk RAID-0 for over a year with "desktop" drives and have never had a problem. Maybe you just had a faulty drive, which happens.
 

techjunkiewest

Jan 24, 2008
gwolfman,

Re-read my post. If any internal error correction goes on within a healthy single drive - which does occur to prolong the life of a drive - it will take longer than the controller will allow before the controller forces it out of the array. RAID-0 configurations would therefore be especially vulnerable because, as you know, if a drive drops out, for whatever reason, you've lost your stripe.

So, the fact that you've been running over a year without any problem doesn't diminish the fact that you are unnecessarily at risk. At risk because you could lose it all when a healthy drive, undergoing a normal internal process, drops out.
 

rockchalk

Oct 9, 2007
Okay guys, I've reproduced the issue on one of my systems, and have two viable workarounds.

Based on the system report posted by lkwjeremy, it looks like he has already reset the problem drive to non-RAID, either through the UI or the OROM by selecting "Reset to Non-RAID" (or by zeroing the entire disk on a non-RAID system, which should do about the same thing but take longer). My workarounds both start from this configuration, as anybody with this issue should be able to get there easily.

The first workaround is fairly easy. When you go into the OROM (the Ctrl-I utility), it should pop up a dialog asking if you want to rebuild the volume to a drive. Rebuild to the disk that was originally part of the volume (be very careful not to accidentally select a drive with data on it, as you would lose that data). When you boot into Windows, the volume will start rebuilding.

The second workaround requires that you boot the system with the problem disk still attached and no other disks from the volume. (Note that if you haven't marked the disk as non-RAID, you will see a failed volume show up here, and you won't be able to continue this process.) You will now be able to right-click on the drive and select "mark as spare." (This right-click menu only shows up when there are no degraded volumes in the system.) Now shut down the system and re-add all the volume disks you took out before. When Windows boots, the driver will automatically pick up the spare drive and start rebuilding the volume to it.

As a note, I have successfully tested both of these workarounds.
 

speeder2000

Nov 13, 2008
I'm having the same issue. I actually had to disable RAID to get memtest86+ to run, and when it was re-enabled, one of the disks was a "non-RAID member", and the RAID has a missing member and is marked as degraded. I do not get a rebuild option.

The disk is already marked as a non-RAID member, so resetting it in the ROM interface is not an option.
I don't quite understand workaround 2 - the RAID is my system drive....???


 

Skallah

Nov 14, 2008
Hi,

I was getting this same error and was googling all over the place trying to find a solution.

I think I finally found one and thought I would share this just in case someone is ever in my shoes.


I have 5x 500 GB hard drives in a RAID 5 configuration. After flashing my motherboard BIOS (Gigabyte GA-P35-DS4), my RAID got disabled. (I guess the flash took it back to defaults, consequently shutting down the RAID.) After I re-enabled it, one of my disks fell out of the array. I tried to recover the disk in the Intel console but got the error message listed in the first post. I tried several times to low-level format it, but this didn't work.

Finally, I powered off the raid and only powered on the disk that fell out of the array to do another low level format. When I booted the system up the intel console read the drive and I was able to mark it as a spare. I shut down the PC, powered on the RAID and booted the PC. Immediately the intel console started rebuilding the RAID.

My suggestion would be the above. That is, if you haven't given up yet.
 

speeder2000

Nov 13, 2008
Skallah wrote: "Finally, I powered off the raid and only powered on the disk that fell out of the array to do another low level format. When I booted the system up the intel console read the drive and I was able to mark it as a spare. I shut down the PC, powered on the RAID and booted the PC. Immediately the intel console started rebuilding the RAID."

By console, do you mean the RAID BIOS or the Intel console in Windows? If you power down the RAID, how do you get into the Intel console, assuming that your RAID drives are your system drives as well?
 

GhostLobster

Nov 14, 2008
I ran into a problem yesterday with my Intel RAID 5. I hope someone has a clue how I can rescue this..

Setup:

ASUS P5LD2 Deluxe (with the latest BIOS)
Intel Matrix Storage Manager ICH7R
3x 500GB Samsung SATA drives

When I boot up, the console says "Status Failed" and below that lists all 3 drive members but shows no errors; all show "Member Disk(0)" under "Type/Status(Vol ID)".

Is there any way I can get these back up? My OS was on the raid so the machine won't boot.. The boot up screen shows all 3 drives with no problems.

Dennis
 

Skallah

Nov 14, 2008
speeder2000 wrote: "by console, do you mean the raid bios, or intel console in windows? if you power down the raid, how do you get into the intel console assuming that your raid drives are your system drives as well"

I meant the Intel console in Windows. I didn't mess with any settings in the RAID BIOS. The way my system is set up, the RAID acts as a big storage drive. No system files (other than RAID-specific ones) are stored on it. The system drive I use is a standard 150 GB hard drive which is not associated with the RAID at all.

The RAID carriage I have lets me power on each drive individually (if needed). I'm not sure how common this is in most carriages, but it allowed me to fix the problem I described in my first post.
 

leandrodafontoura

Sep 26, 2006



I'm having the same problem. My 4x Raptors are in RAID 5 with the OS and everything (games, photos). I have an old mobo with only 4 ports. I can't rebuild the array in the BIOS. Anyone, please correct me if I'm wrong:

1) RAID can only be repaired in Windows, never in the BIOS.
2) It doesn't matter which RAID level you use, make sure to have a spare HD with an OS for you to boot from in case things go badly. Even HDs built for RAID, like the NS and YS models, may fail for some unexpected reason.

My next system will be just 1 Raptor with the OS and games + 2x 1TB RAID-certified drives in RAID 1 for important stuff. This will give me all the storage I need, and safety. It won't be as fast, but I'll be able to relax and enjoy my PC.

RAID needs to be improved. In fact, there is an article saying that RAID will become dangerous in the future: as HD capacity increases, so does the probability of an error during reconstruction of a failed array, making RAIDs impossible to repair if you have a 999999999TB HD, which will happen someday.
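
The capacity argument above can be put into rough numbers. This is a back-of-the-envelope sketch, not a prediction: it assumes an unrecoverable read error rate of 1 per 10^14 bits read (a figure often quoted for desktop drives, used here purely as an assumption) and that errors are independent of each other. The function name is made up.

```python
import math


def p_read_error(bytes_read: float, ure_per_bit: float = 1e-14) -> float:
    """Probability of at least one unrecoverable read error while reading bytes_read bytes.

    Computes 1 - (1 - p)^bits using log1p/expm1 so that tiny per-bit
    probabilities do not get lost to floating-point rounding.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-ure_per_bit))


if __name__ == "__main__":
    # A rebuild has to read every surviving member in full.
    for surviving_bytes in (2 * 500e9, 3 * 1e12, 12.5e12):
        print(f"{surviving_bytes / 1e12:5.1f} TB read -> {p_read_error(surviving_bytes):.1%}")
```

Under these assumptions, rebuilding a 3x 500 GB RAID 5 (two surviving drives, 1 TB to read) comes out at a little under 8%, while reading 12.5 TB comes out at about 63%, which is the trend that article is pointing at.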
 

GhostLobster

Nov 14, 2008
I was able to get my RAID 5 back by putting in a new drive, installing the OS onto it, and then installing the Intel RAID software. Once booted from the new OS drive, I started the Intel manager and did a recover on the RAID drives, which worked! I then shut down, removed the new OS drive and booted back into my RAID 5 without any problems.

First thing I did was a BACKUP!
 

enewmen

Mar 6, 2005
I like SATA RAID5. After changing motherboards, the RAID5 was seen and functional without needing additional work. The Intel Matrix Storage Manager is very simplified, but seems to work well. When a drive failed, I just replaced it with a new drive and the array was automatically rebuilt. When another drive failed, I just formatted it, marked it as normal, and the array was rebuilt.
Yes, RAID 5 has problems, but it's a whole lot better than nothing for rebuilding data.
Just my 2 cents' worth.
 

ptr727

Mar 28, 2007
I am really glad I found this post, I had a very similar experience with my 4 drive RAID10 system.

The solution as proposed to boot with one drive and mark it as a spare worked, but getting an OS to boot was a big problem.

Here is what I did:
1. Vista Ultimate x64 SP1, 8GB RAM, 4x500GB HDD, 120GB RAID10 for OS, 800GB RAID10 for data, Gigabyte EP45-DS3R, ICH10R RAID chipset, Q9650 3GHz.
2. Update BIOS from F9 to F10, load optimized defaults, reboot.
3. The loading of the optimized defaults changed the drive config from RAID back to AHCI, I think this somehow confused the RAID chipset.
4. Changed BIOS config back to RAID, rebooted.
5. Now the Intel RAID BIOS reported that drive-0 is non-RAID, that drive-1 to drive-3 is RAID, and that the two RAID10 volumes are degraded.
6. I entered the RAID config by pressing Ctrl-I, but there were no options to let me move the drive back to RAID.
7. I found an Intel KB that says I am supposed to see a dialog that lets me re-assign the drive, but I had no options to do anything with the non-raid drive, see: http://support.intel.com/support/chipsets/imsm/sb/CS-021017.htm
8. So I figured I can still boot since the RAID1 portion of the RAID10 will still work. Two things went wrong.
9. Booting Vista gave me a boot volume can't be mounted bluescreen error. I think this was because the non-raid disk was listed before the raid boot volume in the BIOS.
10. I changed the BIOS to put the single non-raid drive after the two raid volumes.
11. I booted again and this time Vista told me it can't find winload.exe.
12. I also tried booting with the non-raid drive unplugged, same problem.
13. I booted from the Vista DVD and selected repair, but the repair could not find any Vista installs.
14. I ordered a replacement drive and paid for next day delivery, turns out I don't need it, but I will set it up as a spare.
15. I wanted to get the Intel Storage Manager software running, but I need a working OS to do this.
16. I plugged in an external eSATA drive, unplugged all internal drives, and installed Vista and the Intel version 8.6 storage manager on the eSATA drive.
17. I plugged all the internal drives back in, and booted from the eSATA drive.
18. On logging in, the Intel storage manager tray icon told me that the volume is degraded, and that I must right-click on the new drive and select rebuild.
19. I right clicked on the non-raid drive and selected it for the rebuild, but got the stupid error that lists the various possible causes of the problem, but not the actual problem.
20. Intel, this is the dumbest error message ever, I want to know exactly what failed, not some list of possible causes.
21. One of the possible causes mentioned was that there is data on the drive. I opened disk manager and saw that the non-raid disk had an assigned drive letter, and had a 120GB partition on the 500GB disk.
22. I deleted the partition, rebooted, tried again, and the same winload.exe error.
23. Now that I had the (stupid) error dialog, google pointed me to this thread.
24. This is what really helped me; I removed all but the non-raid drive, and selected that disk for use as a spare.
25. I shutdown, reconnected all drives, rebooted, still booting on the esata drive, this time the rebuild automatically started.
26. After a few hours the rebuild completed, I ran chkdsk on the raid volumes, the first volume had some errors, repaired them, the second volume was fine.
27. Now I had to get Vista booting from the original boot volume again.
28. I changed the BIOS settings to place the raid volumes first, and I rebooted, but I still got the winload.exe error.
29. I booted from the Vista DVD and selected repair. This time the repair tool found two installs, the one on the boot volume and the one on the eSATA drive, and performed some repair actions.
30. On booting again I had two “Vista Ultimate (repaired)” OS boot options, the first entry booted from the eSATA drive, the second entry booted from the RAID boot volume.
31. I powered down the eSATA drive and rebooted. I ran bcdedit.exe and deleted the first entry.
32. After rebooting the correct OS automatically loaded.
33. I think if I had run the boot repair with the eSATA drive disconnected, I could probably have skipped the bcdedit step.

This still leaves me some questions:
1. Why did the drive turn into a non-RAID drive? This is of greatest concern to me, for I am bound to update the BIOS again, and the switch from RAID to AHCI is bound to happen again.
2. Why did the Intel RAID BIOS not allow me to fix the problem from the BIOS, but force me to do it from Windows?
3. Why did the right-click rebuild on the drive fail, and why did the drive have to be marked as a spare on its own first?

Thanks again for the advice in this thread.

P.
 

qwertylesh

Jan 27, 2009


Oh my god, thank you ptr727 for posting this. I recently set up my X58 i7 with 4 500 GB Seagates in RAID10.
I unknowingly updated my BIOS for better stability and whammo! The first disk of the RAID dropped out.

Now I've gone about things a little differently than you. I'm back in the operating system with disks 2+3+4, but disk one is out of action and I'm not allowed to use it for the rebuild, even after using diskpart and 'cleaning' the disk information.

What I did to get back into the OS was unplug the dropped disk, insert my Vista CD and allow it to write the "Vista Ultimate... (repaired)" BCD. That got me around the winload.exe problem and the BSOD I got before the winload problem :cry:
I was baffled why the OROM or the Intel software wouldn't allow me to rebuild the array.

So all I need to do is unplug all but the dropped one, and it will allow me to mark it as a spare in the OROM?
Won't the remaining 3 all get marked as Missing? Will that fix itself when I plug them back in afterwards?

I'm concerned I'm going to ruin it further. However, I will follow your steps bit by bit to ensure I can boot from it again, as long as it doesn't all go downhill on me. Hopefully I can reuse the disk, since it's not bad; it's a nasty flaw in the software, which I think Intel should be working with motherboard manufacturers to correct, as it is integrated RAID.

Thanks again ptr727 for posting your raid 10 solution, it will prove a data saver for me :)
Thanks TH & all the Posters who have contributed to this thread, this issue is a real doozie.