Need help recovering from Nvidia Raid 5 Drive Failure

ddog

Reputable
Oct 11, 2014
44
0
4,540
Hi folks, I have a home/media server that has been running for some years now on the nForce 630a chipset, with a RAID 5 array of three 2TB drives.

SPECS:

AMD Athlon LE-1640
Biostar N68SA-M2S
4GB DDR2
1TB HDD (OS drive)
2TB x3 (RAID 5)

I recently heard a loud screeching sound coming from my server. Upon inspection I realized the noise was coming from one of the HDDs, and I noticed the machine was stuck on the Nvidia "Detecting Array" screen. At that point it stops responding entirely; I can't even get into the BIOS. So here is what I did:

I disconnected all the drives in the RAID 5 set, which allowed the machine to boot into Windows. After confirming that the problem did not lie with the OS or the OS drive, I reconnected the RAID drives one at a time; the loud screeching returned with one particular drive, which pinpointed the failure.

From there I reconnected all the drives except the failed one, and the machine was able to boot into Windows. The Nvidia storage console shows an error on the RAID 5 set and lists the two connected drives.

Unfortunately, at this point I don't have access to the logical RAID drive in Windows to perform a backup.

Replacing and rebuilding the failed drive should be a simple process, as shown here: http://nvidia.custhelp.com/app/answers/detail/a_id/2290/~/rebuilding-a-nvidia-raid-array
However, that procedure assumes all drives are connected, including the failed one. In my particular scenario the machine won't boot while the failed drive is connected, and I am not sure why.

Any suggestions for moving forward? I was thinking of getting a replacement 2TB drive, sticking it in place of the failed one, booting the machine, and seeing what happens. Is this a lost cause? Of course I haven't backed up since last year, so I will suffer badly if my data is lost... sigh...
 
Rebuilding a RAID 5 should NOT require the failed drive; that is exactly what RAID 5 is designed for. I would just replace the 2TB drive, assign the new one to the RAID 5 set, and it should rebuild. But after reading the NVidia procedure I'm not sure. I'm used to servers running high-end RAID cards, where you remove the old drive, replace it with a new one, and you're done; nothing else to do.

And I don't see any harm in tossing in a blank drive, then following the RAID BIOS rebuild instructions at the bottom of that page instead of using the NVidia Control Panel.
 

ddog

Reputable
Oct 11, 2014
44
0
4,540
I assumed that it should run with one failed drive, but I am not sure why the system refuses to boot with the failed drive connected... very strange. I also thought disconnecting the failed drive would flag a failure in the RAID BIOS and still let me access my files from the RAID set through Windows, but strangely it doesn't.

I will try the BIOS method with a new drive. Hopefully it will be able to rebuild.
 
Well, a failing drive can cause exactly what is happening. The motherboard knows a drive is there and tries to find it over and over, but the drive just isn't answering, so it keeps retrying; hence it won't boot all the way.

But I would try just adding a new drive to the RAID set and see if it accepts it and rebuilds. Personally, I never use onboard RAID or software RAID. I always buy a RAID card (even if it's a cheap one) and do true hardware RAID. There is less CPU overhead, and if something happens to the motherboard and you have to move PCs, you just move the RAID card along with the drives rather than rebuilding on a new machine, since all the settings live on the card. And if the RAID card itself fails, usually all you have to do is get the same model and connect the drives; if it doesn't pick up the RAID automatically, you enter the RAID BIOS, which will typically ask whether you want to import the existing RAID settings. Say yes and you're back up and running!
 

ddog

Reputable
Oct 11, 2014
44
0
4,540
Well, to make matters worse, the machine now refuses to POST..... great! And I just borrowed a 2TB drive from work today to test with. I will have to troubleshoot this some more once I get a chance.

At this point I am just trying to recover the data, if that is possible at all. I have run this setup since 2009, I believe, changing only the original 500GB drives and the motherboard once during that time. Performance was never an issue, as it was just a file/media server that started off as a cheap project, and it worked great for that purpose. However, since I swapped in the 2TB drives and my data grew, I never really planned ahead for backups or a failure, and now I am suffering for it.....

In the interim I want to build a new machine re-using the 2TB HDDs, but this time with a proper RAID card. I was contemplating software RAID with either Oracle Linux or Red Hat, but I think a hardware RAID card might be the best option. I do have a cheap $30 US RAID card I used way back when I first built the machine, but I believe that one was OS-dependent.

Do you have any suggestions for a relatively cheap hardware RAID 5 card in the $200 US range? I am hoping it is possible to get one at that price with at least four SATA ports and a hot-spare option... might be asking too much.
 
Well, me, I'm a big Dell PERC fan (it's pretty much just a branded LSI card). I have a Dell SAS 5 HBA (SATA II) that I run a RAID 0 off of with two 2TB drives, but I keep a 4TB drive as a backup and just use Unstoppable Copier to copy over new files (it's movies, anime, and TV shows; nothing ever gets deleted off the drive, only added), and it works just fine for me.

But if you want to run a RAID 5, I HIGHLY suggest something like this:

http://www.ebay.com/itm/DELL-PERC-H700-6GBPS-RAID-CONTROLLER-BATTERY-SAS-CABLE-for-R410-R510-R610-R710-/111413830912?pt=US_Server_Disk_Controllers_RAID_Cards&hash=item19f0c82500

That one is used, but it's an example. 1) It's 6Gbps, so you can use SSDs with it later if you want. 2) It has a write-back cache and a BBU, so if you have a power failure you won't lose data being written to the RAID. 3) The entire RAID runs off the card, with little CPU overhead. You can download and install the LSI MegaRAID utility to monitor, configure, build, delete, and rebuild RAID arrays in Windows (I install it on EVERY Dell server we have because it gives you detailed info).

There are others out there. You can get a PERC H310 for around 80 bucks brand new. It only does 4 drives, though, while the H700 does 8, so there is room to upgrade. Plus, they both support up to 4TB drives (and maybe bigger with a firmware upgrade).

Again, that's just an example. You can find new cards for under $100 that support 4 drives and do RAID 5, like the H310. Also check around Newegg as well.
 

Paperdoc

Polypheme
Ambassador
OP, you said, "I assumed that it should run with one failed drive...." You have misunderstood RAID5. What you describe is a RAID1 feature.

In RAID5, if one HDD unit fails, the RAID management system is supposed to warn you, stop using the array, and help you identify the failed unit so you can replace it. You already know which unit to replace. You MUST replace the failed unit before you can do anything else. Once you have done that, the RAID management software is supposed to rebuild the array by re-creating the data missing from the replacement unit using the data on the good units you did NOT replace. This can take a long time. AFTER all that reconstruction is complete, THEN you can access your data again.
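
For what it's worth, the reason a rebuild never needs the dead disk is that RAID5 stores XOR parity: any one missing member can be recomputed from the surviving members. Here is a minimal Python sketch of the idea (illustrative only; this is NOT how the nVidia firmware is implemented, and real RAID5 rotates parity across all member drives):

```python
# Toy model of one RAID5 stripe: two data blocks plus one parity block.
# Parity is the byte-wise XOR of the data blocks, so any single missing
# block can be recomputed from the others.

def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

drive1 = b"8 bytes!"                 # data block on drive 1
drive2 = b"8 more.."                 # data block on drive 2
parity = xor_blocks(drive1, drive2)  # parity block on drive 3

# Suppose drive1 dies: its contents come back from the survivors.
rebuilt = xor_blocks(parity, drive2)
assert rebuilt == drive1
print(rebuilt)  # b'8 bytes!'
```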
 
^^^ Exactly. 2TB hard drives aren't that expensive. If you get a good RAID card, get another drive! Set it up as a hot spare! That way, if a drive fails in the future, the array will rebuild automatically using the hot spare. Just pray another drive doesn't fail.

Also, as far as RAID 5 goes, anything bigger than a 2TB drive is usually not recommended, because the rebuild can take forever and you risk a second drive failing while the remaining drives are working overtime to rebuild the array. You may want to look into RAID 10 instead.
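
To put rough numbers on that (assuming ~100 MB/s sustained throughput, a typical figure for consumer drives of that era): rebuilding a 2TB member means writing all 2,000,000 MB of it, so 2,000,000 MB / 100 MB/s = 20,000 s, or roughly 5.5 hours at best. A 4TB member doubles that to around 11 hours, during which every surviving drive is being read end to end.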
 

ddog

Reputable
Oct 11, 2014
44
0
4,540
Hey, thanks for all the advice, guys, but it seems my motherboard is officially dead, so I have no access to my data right now, and that is hoping the array isn't corrupted in the first place. I may have to purchase a board with the same chipset off eBay to attempt to recover my data. I am not putting much hope into this anymore.

In the meantime I am researching parts to build a new machine, with brand-new drives as well. I think changing all the hardware may be the best option, as it is around 5 years old. I have decided to attempt software RAID initially, with either Red Hat or Ubuntu Server. I believe this setup will suit my needs, and it will keep the cost down a bit by skipping a hardware RAID card. I will do proper testing this time around before putting the server into full use, and properly organize backups too. If it doesn't work out, or I am not satisfied, I will consider a hardware RAID card.

Here are the specs I have come up with so far; please let me know your thoughts:

CPU: AMD A6 APU
I purchased this for a HTPC build earlier this year that didn't turn out how I expected so I decided to use it for the server instead.

Motherboard: BIOSTAR Hi-Fi A88W 3D
This may be a bit overkill, but it got the best ratings on Newegg for FM2 boards, and at $80 it is a lot cheaper than most of the competition. Based on past experience, I find Biostar boards to be very reliable.

Memory: HyperX Fury Series 8GB (2 x 4GB) DDR3
Any reputable brand will do....

HDD: 3TB Seagate or Western Digital NAS drives
I haven't decided on this yet. The Western Digital drives seem perfect, or at least the marketing does; however, from my research they seem more prone to failures and DOA units than the Seagates. I have had pretty decent experience with my last two sets of drives, both Seagate, so I may lean that way.

I can only afford 3 drives at the moment, but I will purchase another one in the following month to act as a hot spare, if that is possible under Linux. Because of this I am considering a RAID 5 setup again. I am still researching all of this, so of course nothing is finalized.
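
From my research so far, Linux software RAID is normally managed with mdadm, which does support hot spares: something like mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd to build the array, then mdadm --add /dev/md0 /dev/sde once the fourth drive arrives to attach it as a spare that kicks in automatically when a member fails (device names are just examples; rebuild progress shows up in /proc/mdstat). I still need to test all of this.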

I also have a kingston 60GB SSD I used for said HTPC build that I would like to re-use as the OS drive.

Power Supply & PC Case:.....
I may re-use my power supply (500W Cooler Master) just to keep the cost down a bit. As for the PC case, anything with a decent number of HDD bays, I guess.
 

Paperdoc

Polypheme
Ambassador
Since your old mobo contains nVidia north- and south-bridge chips, I checked there for info on how changing a mobo will go. I know that, in the past, I took advantage of an nVidia assurance that any future chips they deployed would be able to deal with HDDs from a RAID array originally written by their chips. So choosing a mobo with the same chipset seems like a good idea.

I noted on their website's support knowledge base this item on nVidia support for Linux OSes:

https://nvidia.custhelp.com/app/answers/detail/a_id/740/kw/RAID


It appears that, if you want to read your old RAID5 HDDs on a new mobo under a Linux OS, you may have to be careful to choose the right Linux distribution. Or maybe you can get the dmraid software they specify and add it to the Linux of your choice. Check this carefully.
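
(From what I know of dmraid generally, not from nVidia's page: dmraid -r lists the RAID member disks it recognizes, and dmraid -ay activates the set so it appears as a block device under /dev/mapper, which you can then mount, ideally read-only, for the copy. I have NOT tried this against nVidia metadata myself, so verify before relying on it.)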

HOWEVER, I note that the mobo you plan to use next does NOT use an nVidia chipset, so reading the data from the old RAID5 set of HDDs on it may not work. But maybe I misunderstood your plan. MAYBE you meant you would build an intermediate machine using a mobo with an nVidia chipset and a Windows OS, read the old disks that way, and move all the data to some backup medium. Then you would build the final new server and load all that data onto its RAID 5 array of new HDDs. If you do that, note this nVidia item on how to rebuild a RAID array:

https://nvidia.custhelp.com/app/answers/detail/a_id/2290/kw/rebuild%20RAID
 
Solution

ddog

Reputable
Oct 11, 2014
44
0
4,540
That's exactly the plan moving forward, Paperdoc. Thanks for the info.

As for the Linux route with the nVidia RAID, it seems I will still need to get a board with the nVidia chipset before trying the Fedora/dmraid setup.
 

Paperdoc

Polypheme
Ambassador
Well, maybe not. My last paragraph noted that your list of proposed components uses a mobo that does NOT have an nVidia chipset. If that's the direction you take, make sure to check how that board's chipset maker supports Linux. As long as you have good support for the chipset of the board you choose for your final server, and for the particular version of Linux you choose, you're just fine. Once you have your RAID5 array rebuilt on the intermediate mobo and backed up to some NON-RAID storage device, you should have no problem loading that backup into the new array in the Linux-based server. Of course, you would want to be very sure that the software you use to make the backup generates files that CAN be read by some software running under Linux on the server.
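
(On that last point, the simple route is usually a plain file copy onto an NTFS-formatted external drive: mainstream Linux distributions can read NTFS through the ntfs-3g driver, so files copied under Windows will be readable on the new server. Avoid proprietary backup-image formats unless you have confirmed a Linux reader exists for them.)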