I put together a computer using a Gigabyte 965p-DQ6 mb. I am using the ICH8R Raid controller which is ports 0-5 on this mb. Everything has worked well as far as software and configuration. However, I am now about to replace my 5th hard drive in a year and a half in this configuration. Why do the drives keep failing?
I had a cheap bare bones box I bought in 2001 with a 40GB drive. It cost $300 or less. That drive never failed, and it is now faithfully serving as part of my Windows 2000 server. That drive has lasted 7 years and is still going. The drives I'm buying now last only about 6 months. RAID has been great, but I wonder if it is a self-fulfilling prophecy: I'm prepared for a drive to fail, and so now they are failing. I could not have recovered if my old bare bones drive had failed, and it never did. Can anyone tell me what is going on? Any suggestions for cooling, perhaps?
I have the drives in 4 removable bays now. I didn't when I started and that was a real pain. They're at the top of the case with lots of space around them. I don't know what else to do.
What hard drives are you using? Also, are they actually failing (physically), or are they simply reporting imminent failure? That is a feature of questionable accuracy: I have a RAID 0 array that reported problems every time I started it for about a month, then stopped complaining. That was 2 years ago, and it's still running flawlessly.
Are you sure that the drives are really failing? The reason I ask is that I have the exact same motherboard and am using it with a RAID 1+0 configuration, and I've seen the Intel Raid utility mark one of my WD 500s as "failed" quite a few times. This seems to happen sometimes when the system crashes for some non-drive-related reason. Anyway, if you go into the Intel Raid utility once a drive is marked "failed", select that drive, and right click it, you may see the option to "Mark as Normal". I've done this maybe 8 times and have ALWAYS been able to recover the drive. It seems that the Intel software will not re-examine a drive on its own once it has been marked as "failed". Unfortunately, I only discovered this after sending a few perfectly good drives back to NewEgg for replacement. Give it a try - it might save you a big hassle!
Thanks for the tip. The odd thing is, I put in a brand new drive and it is also failing. I've tried it twice today. The computer kind of locks up for 10 - 20 minutes and then finally comes back and says the drive has failed. This is after it looks like it is rebuilding the drive. I guess I could put the most recently failed drive back in and see what happens, but I'm not holding out much hope.
That sounds like a problem with the failure detection, rather than with the drives. I doubt that many drives, especially known good ones like the Caviars and 7200.10s, would have failed in that amount of time.
If you are overclocking the PCI-E frequency, then yes, you will get drops and stress the hard drives. You should leave the PCI-E frequency at 100 MHz. I learned that the hard way myself: I went through something like 6 hard drives before I finally figured it out. No one ever talks about that issue for some reason. There is also a bug in the new RAID controllers and new chipsets where messing with the PCI-E frequency makes the RAID drop, and after that the controller will no longer like the hard drives. That was actually the issue I kept running into, and I had to figure it out myself. I have left my PCI-E frequency at 100 ever since and have had no issues. I have three drives in RAID 0 right now.
I would agree that you should keep PCI and/or PCIe spread spectrum turned off when overclocking, and make sure the speed is set to 100 if you think it's causing issues. There are some who will keep the PCIe frequency as high as 110 to get that little extra out of the video card.
From what I've read, it seems that the SATA corruption happens at about 120-125 MHz. I've personally had mine as high as 112 with no ill effects, but couldn't muster the courage to go any higher.
If the failed hard drives (even the brand new ones) are from WD, this is a known issue with WD non-RAID hard drives used in a RAID configuration. http://en.wikipedia.org/wiki/Time-Limited_Error_Recover...
You need to enable TLER (Time-Limited Error Recovery) on your WD SE hard drives when they are used in RAID.
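To show why TLER matters in a RAID setup, here is a toy Python sketch. It is only a rough model, not anything from WD or Intel: the function names and the timing numbers (an 8 second controller timeout, a 7 second capped recovery, a 60 second uncapped recovery) are assumptions picked for illustration. The idea is simply that a drive which spends longer on its own error recovery than the controller is willing to wait gets marked as failed, even though the drive itself may be fine.

```python
def drops_from_array(drive_recovery_s: float, controller_timeout_s: float) -> bool:
    """Return True if the controller stops waiting before the drive answers."""
    return drive_recovery_s > controller_timeout_s


def describe(scenarios: dict, controller_timeout_s: float) -> dict:
    """Map each scenario name to the outcome the toy model predicts."""
    return {
        name: "marked failed" if drops_from_array(recovery_s, controller_timeout_s)
        else "stays in array"
        for name, recovery_s in scenarios.items()
    }


# Hypothetical numbers, for illustration only.
CONTROLLER_TIMEOUT_S = 8.0
SCENARIOS = {
    "TLER enabled (recovery capped)": 7.0,
    "TLER disabled (long internal recovery)": 60.0,
}

if __name__ == "__main__":
    for name, outcome in describe(SCENARIOS, CONTROLLER_TIMEOUT_S).items():
        print(f"{name}: {outcome}")
```

With these made-up numbers, the drive without TLER is the one that gets dropped, which would match the symptom of good drives being reported as failed.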