File server - intermittent failure to detect all RAID drives on boot

Keidos

Mar 12, 2014
I have an old file server, built myself, which has had the following problem for the entire 4.5 years I've owned it:

Sometimes the array is not detected on boot, because one of the drives is not detected.

Problem Details
The first thing that happens on boot is that the RAID card initializes, and each drive is accessed - I see the HDD light flicker for that drive - in sequence, counting up through the 8 drives.

=>On a successful boot, this only takes a few moments, the RAID reports a successful boot, and then Windows starts and everything is great until I'm forced to reboot - days or weeks later.
=>On an unsuccessful boot, one of the drives doesn't flicker, and instead there is a delay, after which the light for that HDD turns on and stays lit. There is then a long delay, before the RAID finally reports that it cannot find my volume. Windows will still boot, but my array is obviously missing.

This problem happens, I would say, on 60-70% of boots. Once it does boot successfully, the server is incredibly reliable, sometimes running for months before a critical update or the like forces a reboot, at which point I can end up needing 5-10 attempts before I get a successful boot.
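
As a rough sanity check on those numbers, here is a minimal Python sketch. It assumes every boot fails independently with the same probability, which is an assumption and not something established in this thread. The function names are made up for illustration.

```python
def expected_boot_attempts(failure_rate):
    """Average number of boots needed to get one success.

    Assumes each boot fails independently with probability failure_rate
    (a geometric model; this independence is an assumption).
    """
    if not 0.0 <= failure_rate < 1.0:
        raise ValueError("failure_rate must be in [0, 1)")
    return 1.0 / (1.0 - failure_rate)


def chance_of_needing_more_than(attempts, failure_rate):
    """Probability that the first `attempts` boots all fail."""
    if not 0.0 <= failure_rate < 1.0:
        raise ValueError("failure_rate must be in [0, 1)")
    return failure_rate ** attempts


# The 60-70% range is the estimate given in the post.
for rate in (0.60, 0.70):
    avg = expected_boot_attempts(rate)
    tail = chance_of_needing_more_than(5, rate)
    print(f"failure rate {rate:.0%}: avg {avg:.1f} attempts, "
          f"P(more than 5 attempts) = {tail:.1%}")
```

Under that model, a 60-70% failure rate works out to roughly 2.5 to 3.3 attempts on average, so a streak of 5-10 failed reboots would be an unlucky run rather than the typical case, unless failures cluster (for example, a drive that stays unresponsive across warm reboots).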

Here's my build...

Hardware

Troubleshooting
Case
The Norco case is designed for file servers, and came with two hot-swap bays, each intended to hold 5x SATA drives. Each of these bays takes 2x molex power, which is used for all five drives and an 80mm fan. I had always assumed that these were just cheaply made, and were distributing unclean power or something, so this week in an effort to finally resolve the issue, I replaced both bays with one of these, thus removing any hardware interface between the drives and the RAID card. It didn't solve the problem.
Power adapters
With the above step, I introduced another potential issue: to get power to my HDDs, I have a 1-to-4 SATA power splitter, plugged into a molex-to-SATA adapter, plugged into the modular power supply. Each set of 4 drives has its own line to the PSU, but I don't know whether these two molex power ports on the modular PSU are on different rails. Each 12v rail should be able to support 38W, so even if all 8 RAID drives, plus the system SATA drive and the optical are on a single 12v rail, I think it would be fine. Still, all these adapters make me less than confident, and I no longer have the direct modular power cables that would let me get rid of them.
Power
Tangential to the above - I could just have a crappy PSU. I was considering sending back the latest order and ordering a replacement PSU, but some research made me doubt this conclusion (SeaSonic has never failed me, and the wattage seems like it should be plenty). So I decided I'd try to troubleshoot without being a part replacer. Hence, this post.
RAID card
Obviously, the RAID card could just be bad. I've upgraded its firmware to the final release, v1.49 (the card is no longer a supported product). Areca is, or at least was, supposed to be top-of-the-line hardware, so this seems very unlikely. It was a $440 card, too, so it would also be extremely expensive to swap for testing.
Drives
It's possible that I made a shitty choice of HDDs. I know that some "green" drives play poorly with RAID hardware. I did do some research before the purchase, and made sure that others had used these drives successfully in RAID systems. For what it's worth, when the array is working, there's an Areca utility that will let me query the SMART reports for each drive, and all seem to be running just fine.
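
On the single-rail question from the power adapters step, the check reduces to a sum. Here is a rough Python sketch that takes the 38W rail figure exactly as quoted above. The per-device draw is a made-up placeholder, not a measurement or datasheet value, and `rail_headroom_watts` is a hypothetical helper name.

```python
def rail_headroom_watts(rail_limit_watts, device_peak_watts):
    """Return the watts left on a rail after summing each device's peak draw.

    A negative result means the listed devices could exceed the rail
    limit if they all hit their peak draw at the same moment.
    """
    return rail_limit_watts - sum(device_peak_watts)


# Rail limit as quoted in the post; the per-device figure is a placeholder.
RAIL_LIMIT_WATTS = 38.0
PLACEHOLDER_PEAK_WATTS = 3.0  # NOT a measured or datasheet value
DEVICE_COUNT = 8 + 1 + 1      # 8 RAID drives + system drive + optical drive

headroom = rail_headroom_watts(
    RAIL_LIMIT_WATTS, [PLACEHOLDER_PEAK_WATTS] * DEVICE_COUNT
)
print(f"Headroom with placeholder draws: {headroom:.1f} W")
```

To make this meaningful, the placeholder would need to be replaced with each device's peak 12V draw (typically the spin-up figure on the drive's datasheet), since drives usually pull the most power while spinning up at boot.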

Conclusion?

I'm at a loss. I know of a few things I can try, but I'm running out of options that don't cost an unreasonable amount. I could buy a new PSU, but my gut says I'd be wasting $. I could try to find replacements for my lost modular power cables, but ... same deal. I could just call this whole thing a write-off, save up for a bit and build a new server - this one is pretty old - but it's still running just fine aside from this intermittent issue, and I would just feel so lame giving up on it.

Help me, tomshardware forums. You're my only hope.
 

TyrOd

Aug 16, 2013


Yeah, green drives...

Don't ever take anecdotal advice from people about the reliability of a RAID setup. You can always find people who use consumer drives in large RAID volumes who report good results. It's a game of numbers and you can always find anecdotal evidence skewed toward "cheap" solutions.
 

Keidos


I am pretty sure you're right, TyrOd. Typing it out helped me focus on the real probability of each of these possibilities, and your reply affirms my own assessment. When I built this system, I was inexperienced, and I took a shortcut, and I should have known it would bite me. But you're right, this is the thing that makes the most sense.

I'd love to mark your answer as "best answer" but apparently I am too dumb to figure out how to do that.