System Unstable, possibly RAID driver?

LoganYoung

Distinguished
Jun 7, 2010
4
0
18,510
Hello,

I have a Windows SBS 2003 server set up with two Adaptec RAID controllers, each of which can connect two SATA hard drives. Most of the time we only have three drives connected (the fourth is for off-site backup).

At this point, I'd like to stress that the drives are not set up in any RAID configuration. I don't get it either, but then I didn't set up the machine...

The system wouldn't boot, so we sent it out and were told that the RAID configuration had been corrupted and that we'd need to have the arrays rebuilt. The technician replaced the RAID controllers and updated the firmware, but forgot to update the driver (I only found this out a week later).

Because of this slip by the technician, the system presented severe stability issues last week in that, after about 30 minutes or less, it would become non-responsive to user input of any kind (except in safe mode). I did a little digging and found a dump file that indicated that the RAID driver was causing the issue.

After updating the driver to:
- Adaptec Embedded Serial ATA HostRAID
- Driver Provider: Adaptec
- Driver Date: 2004/11/11
- Driver Version: 7.0.0.45A
- File Name: aarsi3x.sys

These issues were corrected, but they come back when we connect the 4th drive for backup.

Does anyone have any idea why this is happening? I'm thinking it could be because the drive wasn't connected when I updated the RAID driver. Can anyone verify?
 

sub mesa

Distinguished
Probably a bad sector on one of the drives.

Also, why do you say you are sure you are not running RAID? All the evidence you gave me suggests otherwise, and it seems you simply do not know how your hard drives are organized.

It also appears the RAID array is your main system drive, which creates additional problems and makes you rely totally on the RAID drivers, which are virtually always of extremely low quality.

You may want to install Windows on a normal non-RAID disk attached to an AHCI controller instead of "FakeRAID" (Promise/Silicon Image/JMicron). By not using these RAID drivers, you keep your system free of the poor-quality storage drivers that can cripple it. Instead, use the software RAID offered by Windows, though this only works on disks you do not boot from, so not on your system disk.

I highly recommend using the motherboard's onboard SATA ports.

Also, you should know that Windows XP (and 2003, which is based on XP) has a stripe misalignment problem when used with RAID; this means virtually all of the random I/O performance gain from using RAID0 on Windows would be gone, wasted.

So the advantages are slight, and the risk is very great. Normally that means: stop using it.
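To make the misalignment concrete, here is a rough sketch in Python. It assumes 512-byte sectors, a 64 KiB stripe size (your controller's setting may differ), and the XP/2003 default of starting the first partition at sector 63; the function names are made up for illustration. It counts how many stripes a single I/O touches: a request that crosses a stripe boundary lands on two disks instead of one.

```python
SECTOR_BYTES = 512  # assumption: classic 512-byte sectors


def partition_offset_bytes(start_sector, sector_bytes=SECTOR_BYTES):
    """Byte offset of the partition start from the beginning of the array."""
    return start_sector * sector_bytes


def is_stripe_aligned(start_sector, stripe_bytes, sector_bytes=SECTOR_BYTES):
    """True if the partition starts exactly on a stripe boundary."""
    return partition_offset_bytes(start_sector, sector_bytes) % stripe_bytes == 0


def stripes_touched(offset_in_partition, length, start_sector, stripe_bytes,
                    sector_bytes=SECTOR_BYTES):
    """Number of stripes covered by one I/O of `length` bytes."""
    first = partition_offset_bytes(start_sector, sector_bytes) + offset_in_partition
    last = first + length - 1
    return last // stripe_bytes - first // stripe_bytes + 1


if __name__ == "__main__":
    stripe = 64 * 1024  # assumed stripe size
    # The same 4 KiB request, 32 KiB into the partition, for two partition starts.
    for start in (63, 2048):
        print(start, is_stripe_aligned(start, stripe),
              stripes_touched(32 * 1024, 4096, start, stripe))
```

With the partition starting at sector 63, that 4 KiB request straddles two stripes; with a start at sector 2048 (1 MiB) it stays within one.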
 

LoganYoung

Hi Sub,

Thanks for the reply.
It's not that I'm "not sure if we're running RAID or not", I know for a fact that we aren't. The drives may be connected to the system board via RAID controllers, but they're all set up in a non-RAID configuration.

Thanks for the recommendation to use the onboard SATA; I'll run that by my supervisor. Of course, it'll likely mean we'd have to reboot the server when we do backups (we plug in the drive, copy the backup files, and remove the drive for off-site storage on a weekly basis). That's not a problem, as we already have to reboot the server anyway.

Is there a tutorial on setting up a RAID config? I figure if we can set it up on RAID 5, we could avoid the whole reboot story altogether...
 

sub mesa

RAID controllers often require a RAID array to be created even if the disk is just a single disk, not part of a RAID configuration.

So, for example, a Silicon Image PCI-express x1 add-on card would have two Serial ATA ports and a RAID BIOS. Even if you do not want to use RAID0 or RAID1 and just have two hard drives you would like to use as two separate drives, you still have to create a "RAID array" consisting of one disk, and another RAID array for the second disk. Then in Windows, the RAID driver from Silicon Image is used to access the data.

Any error might also 'degrade' the array, causing the disk to disconnect from the array and meaning you have to re-create the array.

All this FakeRAID stuff is just tiresome to the average user; I hope you don't get bitten by it. By using real Serial ATA without any FakeRAID stuff, you get to use your system as intended, without all the problems.

RAID5 is a very complex RAID level, much more complex than RAID0 and RAID1. Many people have lost their data to Windows' low-quality RAID5 options. Keep this in mind, and regard the reliability of a RAID5 array under Windows as lower than that of a single disk without RAID. The chance of the RAID array failing while all disks are fine is (much) higher than the chance of failure caused by a drive failure.

In other words, perhaps I'm just telling you that you can avoid all these problems by not using RAID at all, or at least by not using the RAID drivers or levels that are expected to cause the most problems. Never rely on RAID alone; always rely on backup! RAID on Windows is quite unreliable in my opinion.
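To give a feel for the extra machinery RAID5 needs, here is a minimal sketch of its parity idea in Python. It is a simplification for illustration only: one stripe, parity always stored last (real RAID5 rotates parity across the disks), and made-up function names. It is not how any particular driver implements it. The parity block is the XOR of the data blocks, so any single missing block can be recomputed from the survivors.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by their parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild_missing(stripe, missing_index):
    """Recompute the block at missing_index from all the other blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != missing_index]
    return xor_blocks(survivors)


if __name__ == "__main__":
    stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])
    # Pretend the disk holding block 1 died and rebuild its contents.
    print(rebuild_missing(stripe, 1))
```

Every write has to keep data and parity consistent. If the system hangs or the driver misbehaves between the two updates, the stripe is silently inconsistent, which is the kind of failure that happens with all disks healthy.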
 

LoganYoung

Hi Sub,

I've only just been able to have a look at the motherboard in the server, and there's a problem. There are no SATA ports on it, which would explain why the RAID controllers were used in the first place.

I was all for moving the drives off the RAID controllers, but since that isn't an option, I'm at a loss for another possible solution.
Got any ideas?
 

LoganYoung

The problem has been fixed.

What happened is that the server had problems and was sent in to be fixed.
The guys who fixed it replaced the RAID controllers and rebuilt the whole configuration. They updated the firmware, but not the drivers, and that's where the problem was.

After updating the drivers, we then experienced problems with newly added drives... Adding another drive to either controller caused the same stability and performance issues detailed earlier in the post.

Updating the Adaptec Storage Manager software on the computer fixed this problem...