Intermittent 'RAID 0' failure

WaltsWorker

Honorable
Oct 9, 2013
11
0
10,520
Hello Everyone,

I've been reading here and elsewhere about what I'm experiencing, but I haven't found much. These systems have been running fine, without any RAID failures, for at least 2 years. Recently 2 of the 8 systems have exhibited the same issue:
Upon bootup the system (WinXP) does not see the array and says 'the shortcut is bad.'

On checking, Intel Matrix Storage Manager 8.9.0.1023 says that the array has failed but doesn't give any other information. The individual drives each report their own details, and none of them is marked as failed. Rebooting generally gets the system back up and running fine; once, a few days ago, I had to reboot one PC a third time to get it running.

The systems are set up to play an uncompressed AVI video that is stored on the disk array.

Kit Installed: 8.9.0.1023
Kit Install History: 8.9.0.1023, Uninstall
Shell Version: 8.9.0.1023

OS Name: Microsoft Windows XP Professional
OS Version: 5.1.2600 Service Pack 3 Build 2600
System Manufacturer: ASUSTeK Computer INC.
System Model: P5E-VM DO
Processor: Intel Pentium Core™2 Duo Processor E8500
BIOS Version/Date: American Megatrends Inc. 0902 , 02/17/2009

Chipset: Intel® Q35 / ICH9DO support Intel® Active Management Technology
Language: ENU

ATI HD4350 - 512M

Intel(R) Matrix Storage Manager
Intel RAID Controller: Intel(R) ICH8R/ICH9R/ICH10R/DO/PCH SATA RAID Controller
Number of Serial ATA ports: 6

RAID Option ROM Version: 7.5.0.1017
Driver Version: 8.9.0.1023
RAID Plug-In Version: 8.9.0.1023
Language Resource Version of the RAID Plug-In: 8.9.0.1023
Create Volume Wizard Version: 8.9.0.1023
Language Resource Version of the Create Volume Wizard: 8.9.0.1023
Create Volume from Existing Hard Drive Wizard Version: 8.9.0.1023
Language Resource Version of the Create Volume from Existing Hard Drive Wizard: 8.9.0.1023
Modify Volume Wizard Version: 8.9.0.1023
Language Resource Version of the Modify Volume Wizard: 8.9.0.1023
Delete Volume Wizard Version: 8.9.0.1023
Language Resource Version of the Delete Volume Wizard: 8.9.0.1023
ISDI Library Version: 8.9.0.1023
Event Monitor User Notification Tool Version: 8.9.0.1023
Language Resource Version of the Event Monitor User Notification Tool: 8.9.0.1023
Event Monitor Version: 8.9.0.1023

Array_0000
Status: No active migrations
Hard Drive Data Cache Enabled: Yes
Size: 596.2 GB
Free Space: 0 GB
Number of Hard Drives: 4
Hard Drive Member 1: ST3160813AS Firmware: CC2H
Hard Drive Member 2: ST3160813AS Firmware: CC2J
Hard Drive Member 3: ST3160813AS Firmware: CC2J
Hard Drive Member 4: ST3160813AS Firmware: CC2J
Number of Volumes: 1
Volume Member 1: Volume0

Volume0
Status: Normal
System Volume: No
Volume Write-Back Cache Enabled: No
RAID Level: RAID 0 (striping)
Strip Size: 128 KB
Size: 596.1 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes
Number of Hard Drives: 4
Hard Drive Member 1: ST3160813AS
Hard Drive Member 2: ST3160813AS
Hard Drive Member 3: ST3160813AS
Hard Drive Member 4: ST3160813AS
Parent Array: Array_0000

Hard Drive 0
Usage: Non-RAID hard drive
Status: Normal
Device Port: 0
Device Port Location: Internal
Current Serial ATA Transfer Mode: Generation 2
Model: ST3160813AS
Serial Number: 9SY251S6
Firmware: CC2H
Native Command Queuing Support: Yes
System Hard Drive: Yes
Size: 149 GB
Physical Sector Size: 512 Bytes
Logical Sector Size: 512 Bytes

Unused Port 0
Device Port: 5
Device Port Location: Internal

All other drives have the same specifications as Hard Drive 0, except for the firmware: drive number 1 has firmware CC2H whereas the rest have CC2J. But with the systems having run like this for 2+ years, I can't believe that it has much to do with what I'm experiencing now.
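For anyone reading the report above: the Volume0 figures (RAID 0, 4 drives, 128 KB strip) explain why the whole array shows as failed if even one member is not detected at boot. Four members of about 149 GB each add up to the roughly 596 GB volume, with no redundancy. Below is a minimal sketch of textbook RAID 0 striping using those numbers. It is an illustration only; the function names are made up for this example, and the actual on-disk layout used by Intel Matrix Storage Manager (metadata, ordering) is an assumption I have not verified.

```python
# Textbook RAID 0 address mapping, using the values from the report:
# "Strip Size: 128 KB" and "Number of Hard Drives: 4".
# The helper names are hypothetical; Intel's real layout may differ.

STRIP_SIZE = 128 * 1024  # bytes per strip
NUM_DRIVES = 4           # members in the array


def locate(offset, strip_size=STRIP_SIZE, num_drives=NUM_DRIVES):
    """Map a logical byte offset on the volume to (member index, byte offset on that member)."""
    strip_index = offset // strip_size       # which strip of the volume
    member = strip_index % num_drives        # strips rotate across the members
    stripe_row = strip_index // num_drives   # how many full stripes come before it
    member_offset = stripe_row * strip_size + offset % strip_size
    return member, member_offset


def members_touched(start, length, strip_size=STRIP_SIZE, num_drives=NUM_DRIVES):
    """Return the sorted member indexes that a read of `length` bytes at `start` hits."""
    first = start // strip_size
    last = (start + length - 1) // strip_size
    return sorted({s % num_drives for s in range(first, last + 1)})


if __name__ == "__main__":
    # A 1 MB sequential read (e.g. part of a large AVI) spans every member.
    print(members_touched(0, 1024 * 1024))
```

Any sequential read larger than 512 KB (4 x 128 KB) touches all four drives in this model, so a large video file cannot be played if a single member drops out, even when every drive is individually healthy.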

I have read that people are having problems with the Intel(R) Matrix Storage Manager version we've been running, but we don't get the BSoD or any of the other problems seemingly associated with version 8.9.0.1023. And again, since it's been running fine all these years, it's hard to believe that this intermittent problem is showing up not on just one system but on two of them so far. Any suggestions are always welcome. I'm about to replace all of the drives...
 

festerovic

Distinguished
I'm no RAID genius, but I want to see the answer if/when someone else chimes in.

That said, I have had similar problems with a 4-disk array using the ICH for RAID 0. My problem was the chipset getting too hot. The part that does the RAID'ing (southbridge?) typically didn't have the best cooling on most socket 775 boards (which I think yours is as well). My RAID would bite it if and when the chip overheated, causing me to lose data (it's RAID 0, so I expected it). On a completely different system, I had the same issue you've described, and it could be resolved each time by restarting the system. Pretty much every time I booted, the array was missing. Someone suggested changing the HDD delay that the mobo waits at boot from 0 to something over 3 seconds. This worked for me on that system.

 

WaltsWorker

Thanks everyone! I'll have to check the power supply and everything else. But since it's been working fine as it is, I doubt a few of the 'normal' things that may cause this. The original power supply had to be swapped since it could barely run, lol. Thanks again! I'll let y'all know what, if anything, I find.
 

WaltsWorker

Having to do so much with so little time - I found the answer the next day... I found that the cables connecting to the motherboard were going bad, and they were 'loose' in that you could blow on them and they would move. After re-securing them, it's worked out fine.
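For anyone who lands here with the same symptom: marginal SATA cabling often shows up as a rising interface CRC error counter in the drive's SMART data (attribute 199, usually named UDMA_CRC_Error_Count). I did not check this on these machines, so treat it as a suggestion only. The sketch below assumes you can get the attribute table text from smartmontools (`smartctl -A`) for each drive; whether smartctl can reach drives behind the Intel RAID driver on XP is an assumption I can't confirm. The function name is made up for this example.

```python
# Pull the raw value of SMART attribute 199 (interface CRC errors) out of
# the attribute table printed by `smartctl -A`.
# Assumes the usual 10-column layout:
# ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE


def crc_error_count(smartctl_output):
    """Return the raw value of attribute 199 as an int, or None if absent or unparsable."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "199":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None
```

A count that keeps climbing between checks points at the cable or connector rather than the drive itself; a stable value, even a nonzero one, only says errors happened at some point in the past.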