ICH9R; RAID 5; cannot re-add drive to volume

Profile: stranger
More Information

Hi,

I cannot re-add the 3rd drive to my RAID 5 volume after a crash. I'm very sad about it because I need data redundancy (important data -.- can't afford to lose it).

Mainboard: GA-P35C-DS3R (ICH9R-chipset)
Storage: 3x SAMSUNG HD501LJ (500GB) @ RAID 5

The Intel Matrix Storage Manager gives me the following error when I click "rebuild on selected drive":

Rebuild RAID Volume Wizard
The volume cannot be rebuilt to the selected hard drive due to one of the following reasons:
- The hard drive contains system files or is the system hard drive
- The hard drive is not large enough to be used for the rebuild action
- The hard drive has reported a SMART event
- The hard drive has reported a failure



to solve the problem...

- I did a low-level format on the "not anymore accepted drive" -> no system data etc. on that drive -> NOT MY PROBLEM
- I checked the size: the "not anymore accepted drive" was in that array before the crash -> if it has not changed its size (and I really believe that it did not), it can't be too small -> NOT MY PROBLEM
- I checked the SMART status with SpeedFan -> all OK, no errors, no warnings -> NOT MY PROBLEM
- I checked the RAID BIOS -> drive is in "Non-Raid-Mode" and no errors or warnings at all -> NOT MY PROBLEM

So, what is the problem? I don't know what to do... please help... *cry* :/

Thanks in advance,
ic3



Profile: journeyman
More Information

Please post the system report from the Intel Matrix Storage Manager (go to File | Save System Report). Make sure to do this with all hard drives attached.

Profile: stranger
More Information

OMG I've just encountered a very similar problem and I'm at my wits end. Seems like intel RAID marks the drive somewhere so it knows that the drive has been used in the array before but for some reason got broken. Like ic3f1re I've done all the low lvl diagnostics even clearing the mbr/bootcode sectors of the drive. I've had to get another harddrive and the intel raid console happily rebuilt the array with the new drive.

I know the rebuild works in some cases as I've a friend who was pulling in and out one of his hd's to use as a temp storage. For his system, the intel bios did detect when the drive was put back and automatically started the rebuild, and on another occasion he manually used the console to begin the rebuilding process, all performed on the same array / harddrives.

I've got a feeling we're missing something about the "broken" drive...

System Information

Kit Installed: 7.8.0.1012
Kit Install History: 7.8.0.1012
Shell Version: 7.8.0.1013

OS Name: Microsoft Windows XP Professional
OS Version: 5.1.2600 Service Pack 2 Build 2600
System Name: MAGICAL3
System Manufacturer: Gigabyte Technology Co., Ltd.
System Model: P35-DS3R
Processor: Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz
BIOS Version/Date: Award Software International, Inc. F11, 01/04/2008

Language: ENU



Intel(R) RAID Technology

Intel RAID Controller: Intel(R) ICH8R/ICH9R SATA RAID Controller
Number of Serial ATA ports: 6

RAID Option ROM Version: 7.5.0.1017
Driver Version: 7.8.0.1012
RAID Plug-In Version: 7.8.0.1013
Language Resource Version of the RAID Plug-In: 7.8.0.1013
Create Volume Wizard Version: 7.8.0.1013
Language Resource Version of the Create Volume Wizard: 7.8.0.1013
Create Volume from Existing Hard Drive Wizard Version: 7.8.0.1013
Language Resource Version of the Create Volume from Existing Hard Drive Wizard: 7.8.0.1013
Modify Volume Wizard Version: 7.8.0.1013
Language Resource Version of the Modify Volume Wizard: 7.8.0.1013
Delete Volume Wizard Version: 7.8.0.1013
Language Resource Version of the Delete Volume Wizard: 7.8.0.1013
ISDI Library Version: 7.8.0.1013
Event Monitor User Notification Tool Version: 7.8.0.1013
Language Resource Version of the Event Monitor User Notification Tool: 7.8.0.1013
Event Monitor Version: 7.8.0.1013

Array_0000
Status: No active migration(s)
Hard Drive Write Cache Enabled: Yes
Size: 1192.3 GB
Free Space: 0 GB
Number of Hard Drives: 4
Hard Drive Member 1: Hitachi HDT725032VLA360
Hard Drive Member 2: Hitachi HDT725032VLA360
Hard Drive Member 3: Hitachi HDT725032VLA360
Hard Drive Member 4: Missing hard drive
Number of Volumes: 1
Volume Member 1: Data

Data
Status: Degraded
System Volume: No
Volume Write-Back Cache Enabled: Yes
RAID Level: RAID 5 (striping with parity)
Strip Size: 64 KB
Size: 894.1 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes
Number of Hard Drives: 4
Hard Drive Member 1: Hitachi HDT725032VLA360
Hard Drive Member 2: Hitachi HDT725032VLA360
Hard Drive Member 3: Hitachi HDT725032VLA360
Hard Drive Member 4: Missing hard drive
Parent Array: Array_0000

Hard Drive 0
Usage: Non-RAID hard drive
Status: Normal
Device Port: 0
Device Port Location: Internal
Current Serial ATA Transfer Mode: Generation 2
Model: Hitachi HDT725032VLA360
Serial Number: VFM201R81PAJVN
Firmware: V54OA7EA
Native Command Queuing Support: Yes
System Hard Drive: No
Size: 298 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes

Hard Drive 1
Usage: Array member
Status: Normal
Device Port: 1
Device Port Location: Internal
Current Serial ATA Transfer Mode: Generation 2
Model: Hitachi HDT725032VLA360
Serial Number: VFM201R81JXTWN
Firmware: V54OA7EA
Native Command Queuing Support: Yes
Hard Drive Write Cache Enabled: Yes
Size: 298 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes
Number of Volumes: 1
Volume Member 1: Data
Parent Array: Array_0000

Hard Drive 2
Usage: Array member
Status: Normal
Device Port: 2
Device Port Location: Internal
Current Serial ATA Transfer Mode: Generation 2
Model: Hitachi HDT725032VLA360
Serial Number: VFM201R818297N
Firmware: V54OA7EA
Native Command Queuing Support: Yes
Hard Drive Write Cache Enabled: Yes
Size: 298 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes
Number of Volumes: 1
Volume Member 1: Data
Parent Array: Array_0000

Hard Drive 3
Usage: Array member
Status: Normal
Device Port: 3
Device Port Location: Internal
Current Serial ATA Transfer Mode: Generation 2
Model: Hitachi HDT725032VLA360
Serial Number: VFM201R81K8HVN
Firmware: V54OA7EA
Native Command Queuing Support: Yes
Hard Drive Write Cache Enabled: Yes
Size: 298 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes
Number of Volumes: 1
Volume Member 1: Data
Parent Array: Array_0000

Hard Drive 4
Status: Missing

Unused Port 0
Device Port: 4
Device Port Location: Internal

Unused Port 1
Device Port: 5
Device Port Location: Internal
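
As a sanity check on the sizes in the report above: a RAID 5 volume with n equal members has the usable capacity of n - 1 drives, because one drive's worth of space goes to parity. A minimal Python sketch using the rounded 298 GB figure from the report (the function name is only for illustration):

```python
def raid5_usable_gb(member_size_gb: float, members: int) -> float:
    """Usable capacity of a RAID 5 volume: one member's worth of space holds parity."""
    if members < 3:
        raise ValueError("RAID 5 needs at least three members")
    return member_size_gb * (members - 1)

# Sizes as shown in the report, rounded to whole GB by the tool.
usable = raid5_usable_gb(298, 4)
raw = 298 * 4
print(usable, raw)
```

Both results land within rounding of the 894.1 GB volume size and the 1192.3 GB array size listed in the report, so the volume does look like a 4-drive RAID 5 with one member missing.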


Message edited by lkwjeremy on 01-21-2008 at 05:15:33 PM
Profile: Honorary Poster
More Information

Good luck with this, maybe someone with more experience with RAID 5 can help.
It appears that you set up a RAID 5 array and considered it good enough to be a backup.
"Redundancy" is a situation where a server or PC can stay up and keep working in the event of a drive failure. It is not in any way a form of "backup", which is what you should have.
If you have data that you can't afford to lose, you need to back it up onto another type of media, like CDs or DVDs. That way, in the event of a failure like the one you are experiencing now, you do not lose your important data.
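
To make the redundancy-versus-backup distinction concrete, here is a small Python sketch of how RAID 5 parity works in principle: the parity strip is the XOR of the data strips, so any one missing strip can be recomputed from the others. It is a toy model with made-up byte strings, not the on-disk layout the Intel driver uses.

```python
from functools import reduce

def xor_strips(strips):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))

# Three data strips plus one parity strip, as in one stripe of a 4-drive RAID 5.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_strips(data)

# The drive holding data[1] goes missing: rebuild its strip from the survivors.
rebuilt = xor_strips([data[0], data[2], parity])
assert rebuilt == data[1]
```

Parity only covers a drive going missing. If a file is deleted or overwritten, the parity is updated to match, which is why it is no substitute for a backup.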


Profile: addict
More Information

lkwjeremy wrote :

OMG I've just encountered a very similar problem and I'm at my wits end. Seems like intel RAID marks the drive somewhere so it knows that the drive has been used in the array before but for some reason got broken...

 

I know the rebuild works in some cases as I've a friend who was pulling in and out one of his hd's to use as a temp storage. For his system, the intel bios did detect when the drive was put back and automatically started the rebuild, and on another occasion he manually used the console to begin the rebuilding process, all performed on the same array / harddrives.

 

I don't run RAID5, but from all my reading on various websites every day, I've come to understand that once a drive is reported "bad", it's marked somehow so that you cannot use the drive in your RAID5 again, to help prevent data loss. Maybe your S.M.A.R.T. shows everything OK now, but maybe when it first dropped out it reported an error, and that's when it was "marked" "bad". What HDDs do you have (brand/model)? Maybe you'll have to buy another one; it should work fine then. Anything as large as or larger than the other 2 (or more) drives working in the RAID5 will work, but hopefully you'll get the same model.

 

Maybe search around about this "marking" of the drives to see if you can undo it somehow. That is, if you really trust that the drive is ok, then go ahead and try and use it again. Good luck, keep me posted, I'm curious as how to fix this. :)


Message edited by gwolfman on 01-21-2008 at 04:05:56 PM
Profile: journeyman
More Information

lkwjeremy -- when you took this snapshot, were you unable to rebuild to the drive on port 0? Also, you said it was a very similar problem-- did you get the same error message?

lkwjeremy or ic3f1re
Can you please try the following: Add one additional SATA disk to your system that is larger than your existing disks on one of the unused ports. This should let you get into the rebuild wizard, but once you're in the rebuild wizard, select the drive that you really want to rebuild to. I expect this will get around this issue. Please let me know if it works.

Profile: addict
More Information

rockchalk wrote :

Can you please try the following: Add one additional SATA disk to your system that is larger than your existing disks on one of the unused ports. This should let you get into the rebuild wizard, but once you're in the rebuild wizard, select the drive that you really want to rebuild to. I expect this will get around this issue. Please let me know if it works.


Interesting...

Were you able to try this yet?

Profile: stranger
More Information

This may come as a major shock to all of us who have built their various RAID configurations using standard desktop drives, because standard desktop drives have a tendency to drop out of the RAID array. I talked to engineers at Western Digital, Seagate and Samsung, who explained that when a desktop drive performs internal error correction, the amount of time required exceeds the time that RAID controllers allow before they drop the drive out of the array. When this occurs, the RAID controller writes something (I can't remember what or where) to the drive so that it can't be used again in the array. The Seagate engineer explained that some type of low level reformatting needs to be done before it can be used again.

This is the reason that both Seagate and WD offer RAID drives - they cost $10 to $30 more than their standard drives. Samsung does NOT have a similar drive - I got this from their engineer. These RAID drives limit the amount of time they spend on error correction before reporting to the controller, so that they will not be dropped from the array. Two examples of these RAID drives are: Western Digital Caviar RE2 WD5001ABYS, and Seagate Barracuda ES.2 ST3500320NS

Note the "YS" in the Western Digital suffix and the "NS" in the Seagate suffix. Both Seagate and WD also indicate that these drives are built to a higher spec.
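
The timing argument can be sketched in a few lines of Python. The numbers below are placeholders picked for illustration, not figures from any vendor or from the Intel driver; the only point is the comparison between how long a drive stays silent during error correction and how long the controller is willing to wait.

```python
def stays_in_array(recovery_seconds: float, controller_timeout_seconds: float) -> bool:
    """A drive is kept only if it answers before the controller gives up on it."""
    return recovery_seconds <= controller_timeout_seconds

# Hypothetical values, for illustration only.
CONTROLLER_TIMEOUT = 8.0    # how long the controller waits for a reply
DESKTOP_RECOVERY = 30.0     # desktop drive retrying a bad sector on its own
RAID_DRIVE_RECOVERY = 7.0   # RAID-class drive gives up early and reports the error

print(stays_in_array(DESKTOP_RECOVERY, CONTROLLER_TIMEOUT))     # False
print(stays_in_array(RAID_DRIVE_RECOVERY, CONTROLLER_TIMEOUT))  # True
```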

Therefore, the person in this thread that indicated that there was no need for special RAID drives was incorrect, probably because non-RAID drives in an array will work.... for a while.

On a personal note, I learned all this AFTER I purchased a pair of Samsung 500GB Desktop drives for my RAID 1 configuration. They've been in my array for the last three months without any problems. That said, I'm going to replace them with RAID-certified drives in the next 60 days.

I just wish that the vendors, like NewEgg, would warn us and educate us with the facts so that we could make informed decisions.

Profile: journeyman
More Information

Good point. When the Intel Matrix Storage Manager determines that a disk has failed, it will mark the state as "Failed." You can unmark it by right-clicking on the failed disk and selecting "Mark as Normal." This tells the driver to unset the failed bit and treat the disk as healthy again.

While this may have caused the initial RAID degradation above, it does not appear to be the reason you can't re-build. Please note in the attached system report that Hard Drive 0 is not marked as failed. It is possible that it was failed, and the user already marked the disk as normal.
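
For anyone who wants to check their own saved report for the same pattern (a degraded volume next to a healthy drive listed as non-RAID), here is a rough Python sketch. It assumes the report is plain text made of blank-line-separated blocks with a name on the first line and "Field: value" lines below it, as in the excerpt pasted in this thread; the real saved file may be laid out differently, and `parse_report` is a made-up helper, not part of any Intel tool.

```python
def parse_report(text):
    """Split a saved system report into {section name: {field: value}}."""
    sections = {}
    for block in text.strip().split("\n\n"):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        name, fields = lines[0], {}
        for ln in lines[1:]:
            key, sep, value = ln.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        sections[name] = fields
    return sections

# Excerpt copied from the report posted earlier in this thread.
REPORT = """\
Data
Status: Degraded
RAID Level: RAID 5 (striping with parity)

Hard Drive 0
Usage: Non-RAID hard drive
Status: Normal
Size: 298 GB

Hard Drive 1
Usage: Array member
Status: Normal
Size: 298 GB
"""

report = parse_report(REPORT)
degraded = [n for n, f in report.items() if f.get("Status") == "Degraded"]
candidates = [n for n, f in report.items()
              if f.get("Usage") == "Non-RAID hard drive" and f.get("Status") == "Normal"]
print(degraded, candidates)
```

On the excerpt this lists the Data volume as degraded and Hard Drive 0 as the only non-RAID drive with a normal status, which matches the reading of the report given above.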

Profile: enthusiast
More Information

Hard drives in RAID arrays have "meta data" on the drive that tells the controller about the drive. It sounds like your controller has somehow "marked" your drive as "defective". If you need to make that "meta data" go away so the controller will see that drive as a "new drive", you will need to do a low lvl format of that drive. You will need to have that drive on a standard controller (not RAID) for this to work. Here is one tool that works well.

http://hddguru.com/content/en/soft [...] rmat-Tool/

It would also be a good idea to check that drive with a drive scanning tool while you are at it. I use Seatools a lot, it works for most any drive.

http://www.seagate.com/www/en-us/s [...] /seatools/

Profile: journeyman
More Information

If you do use one of those tools, if it will report the actual drive size (not partition size) with a precision of bytes or kilobytes (greater precision than MB), could you please report that number here as well?
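
The reason for asking: the rebuild wizard rejects a drive that is "not large enough", and a size rounded to whole GB can hide a small shortfall. Here is a Python sketch with illustrative byte counts (not measurements from anyone's drives in this thread) showing two drives that both display as 298 GB while one is actually smaller. It assumes the tool divides by 1024**3 and rounds; whether the Intel software compares sizes exactly this way is also an assumption.

```python
GIB = 1024 ** 3

def shown_gb(size_bytes: int) -> int:
    """Size as a whole-number 'GB' figure, assuming division by 1024**3 and rounding."""
    return round(size_bytes / GIB)

def big_enough(candidate_bytes: int, member_bytes: int) -> bool:
    """A rebuild target must hold at least as many bytes as an existing member."""
    return candidate_bytes >= member_bytes

member = 320_072_933_376      # illustrative byte count
candidate = 320_071_851_520   # illustrative, about 1 MB smaller

print(shown_gb(member), shown_gb(candidate))
print(big_enough(candidate, member))
```

Both drives show up as 298, yet the comparison in bytes fails, which is why a byte-precise figure is more useful here than the rounded one.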

Profile: addict
More Information

techjunkiewest wrote :

This may come as a major shock to all of us who have built their various RAID configurations using standard desktop drives, because standard desktop drives have a tendency to drop out of the RAID array.


I've been running my 3-disk RAID-0 for over a year with "desktop" drives and have never had a problem. Maybe you just had a faulty drive, which happens.

Profile: stranger
More Information

gwolfman,

Re-read my post. If any internal error correction goes on within a healthy single drive - which does occur to prolong the life of a drive - it will take longer than the controller will allow before the controller forces it out of the array. RAID-0 configurations would therefore be especially vulnerable because, as you know, if a drive drops out, for whatever reason, you've lost your stripe.

So, the fact that you've been running over a year without any problem doesn't diminish the fact that you are unnecessarily at risk. At risk because you could lose it all when a healthy drive, undergoing a normal internal process, drops out.


Message edited by techjunkie west on 01-25-2008 at 08:03:53 PM
Profile: journeyman
More Information

Okay guys, I've reproduced the issue on one of my systems, and have two viable workarounds.

Based on the system report posted by lkwjeremy, it looks like he has already reset the problem drive to non-RAID, either through the UI or the OROM by selecting "Reset to Non-RAID" (or by zeroing the entire disk using a non-RAID system, which should do about the same thing, just take longer). My workarounds both start from this configuration, as anybody with this issue should be able to perform that task simply.

The first workaround is fairly easy. When you go into the OROM (the Ctrl-I utility), it should pop up a dialog asking you if you want to rebuild the volume to a drive. Rebuild to the disk that was originally part of the volume (Be very careful not to accidentally select a drive with data on it, as you would lose that data). When you boot into Windows, the volume will start rebuilding.

The second workaround requires that you boot the system with the problem disk still inserted, and no other disks from the volume. (Note that if you haven't marked the disk as non-RAID, you will see a failed volume show up here, and you won't be able to continue this process.) You will now be able to right-click on the drive and "mark as spare." (This right-click menu only shows up when there are no degraded volumes in the system.) Now, shut down the system and re-add all the volume disks you took out before. When Windows boots, the driver will automatically pick up the spare drive and start rebuilding the volume to it.

As a note, I have successfully tested both of these workarounds.

