Solved

Catastrophic Drive Failure of 3 drives on RAID 10

August 5, 2013 1:49:26 PM

Specs: We have a DELL PowerEdge 905 with a mirrored OS volume and a RAID 10 data volume (twelve 15k HDDs plus 2 hot spares).

Issue: We powered down the server to add more memory, and that's it! When we powered it back up, 3 drives were in a failed state (drives 12, 13, and 15), and drive 14 was unable to join the array because of its failed partners. We ended up losing all of the data on the D drive and restoring from backups. Just wondering if anyone can shed some light on why 3 drives could fail one after another like that?
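
To illustrate why losing three drives can take out a RAID 10 array even with hot spares available, here is a minimal sketch. It assumes, purely for illustration, that the twelve data drives are numbered 4 through 15 and mirrored in adjacent pairs (12/13, 14/15, and so on); the actual pairing on this controller is not known. The function name `raid10_survives` is made up for this example.

```python
def raid10_survives(mirror_pairs, failed):
    """Return True while every mirror pair still has at least one working drive.

    A RAID 10 array stripes across mirror pairs, so it is lost as soon as
    both members of any single pair have failed.
    """
    failed = set(failed)
    return all(not set(pair) <= failed for pair in mirror_pairs)


# Hypothetical layout (an assumption, not taken from the server in question):
# data drives 4..15, mirrored in adjacent pairs.
pairs = [(4, 5), (6, 7), (8, 9), (10, 11), (12, 13), (14, 15)]

# One failed drive in each of two different pairs: the array keeps running.
print(raid10_survives(pairs, {12, 15}))      # True

# Drives 12, 13, and 15 failed: pair (12, 13) is gone, so the stripe is lost.
print(raid10_survives(pairs, {12, 13, 15}))  # False
```

Under that assumed layout, hot spares cannot help once both halves of a mirror are gone at the same moment, because there is no surviving copy to rebuild from.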
August 5, 2013 1:58:18 PM

It could be the controller. Were they using a dedicated RAID controller or motherboard software RAID?
August 5, 2013 1:58:28 PM

I hate to say it, but it almost sounds like ESD fried your SATA or SAS controller.

However, if all the drives were purchased at the same time it could be they aren't failed yet but just out of spec and it only showed up on the reboot.

August 6, 2013 7:47:39 AM

mauller07 said:
It could be the controller. Were they using a dedicated RAID controller or motherboard software RAID?


It has a dedicated controller, and we originally thought it might be something like that, but when the new drives were installed, they fired right up.
August 6, 2013 7:55:47 AM

maddogfargo said:
I hate to say it, but it almost sounds like ESD fried your SATA or SAS controller.

However, if all the drives were purchased at the same time it could be they aren't failed yet but just out of spec and it only showed up on the reboot.



The DELL tech's theory was that the drives seized up: when drives run for a long period and are then shut down for an extended time, the plastic mechanism (it is no longer made from ball bearings) cools down and shrinks, and when the drive is fired back up again, that causes it to fail.

August 6, 2013 8:15:07 AM

It sounds ridiculous that heat would have caused the problem. What sort of temperatures were the drives getting to? You might want to add some cooling across the HDDs if they are getting that hot.

Best solution

August 6, 2013 4:34:58 PM

djboi said:
maddogfargo said:
I hate to say it, but it almost sounds like ESD fried your SATA or SAS controller.

However, if all the drives were purchased at the same time it could be they aren't failed yet but just out of spec and it only showed up on the reboot.


The DELL tech's theory was that the drives seized up: when drives run for a long period and are then shut down for an extended time, the plastic mechanism (it is no longer made from ball bearings) cools down and shrinks, and when the drive is fired back up again, that causes it to fail.


I too have found that drives that have been running for a long time (3 years?) don't like to be shut down. Odds are they just took too long to spin back up to speed and timed out, rather than suffering an outright drive failure. If you can, check the drives' SMART data and see what's there.
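
For anyone who wants to script that SMART check, here is a minimal sketch. It assumes smartmontools (`smartctl`) is installed and, as a guess only, that the drives sit behind a MegaRAID-based controller reachable with `-d megaraid,N`; the thread does not say which controller is in the server. The helper names (`smartctl_command`, `parse_health`, `check_drive`) are made up for this example.

```python
import re
import subprocess


def smartctl_command(device, megaraid_id=None):
    """Build a `smartctl -H` command line for one drive.

    megaraid_id is the drive's ID behind a MegaRAID-type controller
    (an assumption about the hardware); leave it None for a plain drive.
    """
    cmd = ["smartctl", "-H"]
    if megaraid_id is not None:
        cmd += ["-d", f"megaraid,{megaraid_id}"]
    cmd.append(device)
    return cmd


# smartctl words the health line differently for ATA and SCSI/SAS drives.
_HEALTH_PATTERNS = (
    re.compile(r"SMART overall-health self-assessment test result:\s*(\S+)"),
    re.compile(r"SMART Health Status:\s*(\S+)"),
)


def parse_health(output):
    """Return the reported health word (e.g. PASSED or OK), or None if absent."""
    for pattern in _HEALTH_PATTERNS:
        match = pattern.search(output)
        if match:
            return match.group(1)
    return None


def check_drive(device, megaraid_id=None):
    """Run smartctl (usually needs root) and return the parsed health word."""
    result = subprocess.run(
        smartctl_command(device, megaraid_id),
        capture_output=True,
        text=True,
    )
    return parse_health(result.stdout)
```

Something like `check_drive("/dev/sda", megaraid_id=12)` would then query one physical drive; the device path and ID here are placeholders. A drive that reports healthy but was kicked from the array would fit the "timed out on spin-up" theory better than a genuine failure.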