64-Bit Raid System, 4 Hard drives have crashed twice

obxbarb

Distinguished
Jan 14, 2010
4
0
18,510
I've either had a really unlucky experience with hard drives or there is something else terribly wrong with my system.

I have had 2 RAID failures in the past 2 months. Both happened when I restarted the computer in the morning after it had been off overnight; I shut my computer down every night before I leave the office. Both times I got a message from the BIOS at startup saying the RAID had failed, and both times we were unable to recover anything from the hard drives. After the first failure, my IT guy replaced all 4 hard drives with new ones. Two months later it has happened again.

My Specs Are:
64-Bit System, Windows XP Pro OS
RAM: 8 GB, fully buffered ECC
4 Hard Drives, 80 GB Western Digital
SATA II
(2) 3 GHz dual-core Xeon processors
2 RAID 0 arrays (1 for the OS and 1 for scratch)
Dell Precision 690 Motherboard

What could be causing this to happen? Is there anything in this setup that would alarm you? The computer is fast, but I have experienced some sluggish behavior and application hangs. I will admit, I run the Adobe CS4 creative suite and do run Photoshop and InDesign at the same time. But isn't that why you invest in a fast computer and lots of RAM?

Is there anything else I should look into for this set-up? Any cables or connectors? Would anything here cause the hard drives to keep failing? I am afraid to set my computer up again the same way. Thanks in advance for any help you can lend.
 

theholylancer

Distinguished
Jun 10, 2005
1,953
0
19,810
What exact models are they? I think you would need a WD drive with Time-Limited Error Recovery (TLER).

check here: http://en.wikipedia.org/wiki/Time-Limited_Error_Recovery

I know TLER could be enabled on some WD Black 1 TB drives at no extra cost, but apparently WD caught people doing that and locked it away, so you have to buy their enterprise RAID drives to get TLER.

Also, Windows XP 64 is kind of buggy with RAID from what I've heard. What is your RAID controller, and does it list XP 64 as a supported platform? Try Windows 7 64, or better yet a 64-bit Linux live CD, and see how that goes.
 

obxbarb

Distinguished
Jan 14, 2010
4
0
18,510
Thanks Holy Lancer,

The RAID controller is integrated into the chipset on the motherboard. (I don't know which one it is; the IT guy has the computer.)

I was thinking the 64-bit Adobe CS4 platform may have been why I was having the hang-ups all this time, but maybe it was the OS (Windows XP Pro 64-bit).

The hard drives were Western Digital; I'm not sure of the exact model. Would TLER allow me to at least boot when it says the RAID failed, and recover the data from the hard drives?

Thanks for your reply and thanks if anyone else has any insight!
 

He means that using RAID0 is not safe at all (it's worse than using a single drive, because losing any one drive loses the whole array) and I totally agree with him. If you care about your data, use larger drives in RAID1 or RAID5.
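To put a rough number on "worse than a single drive", here is a minimal Python sketch. It assumes drives fail independently and uses a made-up 3% yearly failure chance per drive purely for illustration; neither figure comes from this thread or from any vendor data.

```python
def raid0_failure_probability(p_single: float, n_drives: int) -> float:
    """Chance that a RAID0 array is lost: it dies if ANY one drive fails.

    Assumes independent drive failures, which is a simplification.
    """
    return 1.0 - (1.0 - p_single) ** n_drives


def raid1_failure_probability(p_single: float, n_drives: int) -> float:
    """Chance that a RAID1 mirror is lost: ALL drives must fail."""
    return p_single ** n_drives


if __name__ == "__main__":
    p = 0.03  # hypothetical per-drive failure chance over some period
    for n in (1, 2, 4):
        print(f"RAID0, {n} drive(s): {raid0_failure_probability(p, n):.4f}")
    print(f"RAID1, 2 drives:   {raid1_failure_probability(p, 2):.4f}")
```

Under those assumptions a 2-drive RAID0 is roughly twice as likely to be lost as a single drive, while a 2-drive RAID1 is far less likely to be lost than either.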
 

will_chellam

Distinguished
Jun 5, 2007
450
0
18,810



There are ways to answer a post. The above is terse at best, rude at worst. It would have been vastly more helpful if you had qualified your comments so the OP could further their understanding of the subject.

In addition to what has already been posted: I'm not sure about your particular motherboard, but my experience with on-board RAID solutions (and even some cheap add-in ones) has been variable at best. I currently use the Intel Matrix RAID on my 975XBX motherboard, which seems pretty reliable so far, but I've used cards by avansys that were a pile of crap, and cards by Adaptec that were expensive but very, very good.
 

will_chellam

Distinguished
Jun 5, 2007
450
0
18,810


Recovering data from a RAID0 is a total pain in the ass. Basically the data is split into alternating stripes across the disks, so if one disk gives up, or if the array becomes 'unbuilt', the remaining disk or disks only contain part of the data: with 2 disks each holds half, and with 4 disks each holds only a quarter.

Reassembling this data can be immensely difficult, but some companies will attempt it for a price. It depends on how badly you want it back.
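To illustrate the striping described above, here is a small Python sketch. It is only a toy model: the function names and the 2-byte stripe size are made up for the example, and a real controller works on much larger stripes and keeps its own metadata on the disks.

```python
def stripe(data: bytes, n_disks: int, stripe_size: int) -> list:
    """Split data round-robin into stripes across n_disks (toy RAID0)."""
    disks = [bytearray() for _ in range(n_disks)]
    for i in range(0, len(data), stripe_size):
        disks[(i // stripe_size) % n_disks] += data[i:i + stripe_size]
    return [bytes(d) for d in disks]


def unstripe(disks: list, stripe_size: int, total_len: int) -> bytes:
    """Reassemble the original data; needs EVERY disk and the right order."""
    out = bytearray()
    offsets = [0] * len(disks)
    d = 0
    while len(out) < total_len:
        chunk = disks[d][offsets[d]:offsets[d] + stripe_size]
        if not chunk:
            raise ValueError("a stripe is missing; data cannot be rebuilt")
        out += chunk
        offsets[d] += stripe_size
        d = (d + 1) % len(disks)
    return bytes(out)


if __name__ == "__main__":
    original = b"ABCDEFGHIJKLMNOP"
    disks = stripe(original, n_disks=4, stripe_size=2)
    print(disks)  # each disk holds a quarter of the data, in scattered pieces
    print(unstripe(disks, 2, len(original)) == original)
```

Each toy disk ends up with non-adjacent fragments of the file, which is why a single surviving member of a RAID0 is of little use on its own, and why recovery needs the stripe size and disk order as well as all the disks.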

Personally I have a RAID0 for my OS and programs and a standalone disk which contains my Outlook data files and all my documents. I run a Windows Home Server with a backup of my RAID array for an easy restore if anything should go wrong.
 

will_chellam

Distinguished
Jun 5, 2007
450
0
18,810



If it were me: 1 TB+ disks are cheap these days, so why not add a big disk for your OS and run all the 80 GB disks as a 4-disk RAID0 for your scratch? I'm not sure about the scalability of the onboard RAID controller, but it should be quicker than what you run now. What size files are you using, by the way? With 8 GB of RAM I would have thought the scratch disk requirement would be pretty low.
 

leon2006

Distinguished
Check your PSU; make sure it has enough capacity for your load.

Dell RAID controllers are not that reliable. I had a bad experience with Dell RAID setups before.

Update your BIOS and drivers (including the RAID drivers).

Use Acronis True Image to back up your drives, both the OS drive and the data drive. The image file can be saved to the network, an external drive, or an optical disc (DVD or Blu-ray). You will be able to recover your entire OS drive and data drive from the image file.

 

theholylancer

Distinguished
Jun 10, 2005
1,953
0
19,810
Yeah, if you want data reliability, use RAID 5 and NOT an onboard controller. Use one of these controllers; it may be expensive, but you'll want it for RAID 5 or better (it has an onboard processor that does the XOR operation in hardware, plus other recovery features):

http://www.newegg.com/Product/Product.aspx?Item=N82E16816118105&cm_re=9260-8i-_-16-118-105-_-Product

SAS 6 Gb/s, 800 MHz PowerPC onboard processor for RAID 5
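For anyone wondering what the XOR operation mentioned above actually does, here is a minimal Python sketch of RAID 5 style parity. It is an illustration only: the block contents and the `xor_blocks` helper are invented for the example, and a real controller rotates parity across the drives and does this in hardware.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    # Three hypothetical data blocks, one per data drive (same length).
    data = [b"disk0dat", b"disk1dat", b"disk2dat"]

    # The parity block is the XOR of all data blocks.
    parity = xor_blocks(data)

    # Pretend drive 1 died: XOR the survivors with parity to rebuild it.
    rebuilt = xor_blocks([data[0], data[2], parity])
    print(rebuilt == data[1])
```

Because XOR-ing a value in twice cancels it out, any single missing block can be recomputed from the remaining blocks plus parity, which is what lets RAID 5 survive one drive failure.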

And then you want to use some enterprise-level storage, such as the Seagate Cheetah 15K.7 line (the .7 denotes that it is 6 Gb/s and top performance). Or, if you want performance over risk, some SLC SSDs like the Intel SSDs that are marked enterprise, or MLC SSDs if you want something (relatively) cheaper.

600 GB 15k rpm drives:
http://www.megabuy.com.au/seagate-cheetah-15k7-600gb-p149159.html

I'll try to find more of these
 

The Dell Precision 690 has an integrated LSI 1068 SAS/SATA 3.0Gb/s controller that supports RAID 0, 1. I don't know why you said that it's a bad controller, but it should be fine for a workstation. The problem might be that inexpensive SATA 80GB hard disks were installed instead of SAS hard disks that are specifically designed for RAID.
 

theholylancer

Distinguished
Jun 10, 2005
1,953
0
19,810
I guess RAID 1 is okay for a workstation. I didn't realize it had an integrated LSI 1068; in that case, yeah, you would only need enterprise-level drives.

I would still personally prefer something with an integrated processor (rather than using the CPU to do it) if you want to do any real work (RAID 5 all the way), and better yet with a BBU (battery backup unit).
 

obxbarb

Distinguished
Jan 14, 2010
4
0
18,510
Thanks to EVERYONE for all of your great responses and feedback.

We are looking into the possibility that the quick failure had something to do with the drives being WD Caviar LEs with no TLER. The hard drives themselves were fine, so maybe the RAID just got corrupted somehow, and maybe that wouldn't have happened if we had had TLER enabled, or had used drives designed for RAID. Adding a hardware RAID controller is another possibility.

Yes, we did understand that RAID0 was risky, BUT we were hoping to capitalize on speed, mainly because we run so many graphics programs.

I really do appreciate your suggestions and I really have learned a lot about the subject of RAID.

RAID is not bug spray!
 
You already have a decent hardware RAID controller. If I were you, I'd install SAS or enterprise SATA hard disks in a RAID1 configuration. Newer hard disks are faster than old ones. Even if the performance might be lower if using SATA (15K SAS hard disks are very fast), you wouldn't lose your data unless a hard disk goes bad and you ignore the warning.
 

theholylancer

Distinguished
Jun 10, 2005
1,953
0
19,810



If you want speed for graphics programs (which I assume use random access more, since they can split frames up across different processors), then maybe some SLC SSDs are an option. As said before, some nice 15k SAS 6 Gb/s drives are a surefire way to go, and they would be fast enough to run in RAID 1 rather than RAID 0 and still give you a performance increase.

If not, grab another controller that does RAID 0+1, where it stripes the data across two drives but also keeps a backup set of drives running in case of failure.
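As a rough illustration of what that RAID 0+1 layout buys you, here is a small Python sketch that counts which drive failures a 4-drive array survives. It assumes the simple model described above (two striped pairs that mirror each other, with the array alive as long as one striped pair is fully intact); the drive numbering and function names are made up for the example.

```python
from itertools import combinations

# Hypothetical layout: drives 0+1 form one stripe set, drives 2+3 mirror it.
STRIPE_SETS = [{0, 1}, {2, 3}]


def survives(failed: set) -> bool:
    """The array survives if at least one stripe set has no failed drive."""
    return any(not (stripe_set & failed) for stripe_set in STRIPE_SETS)


def count_survivable(n_failures: int):
    """Return (survivable combinations, total combinations) of failures."""
    combos = list(combinations(range(4), n_failures))
    ok = sum(survives(set(c)) for c in combos)
    return ok, len(combos)


if __name__ == "__main__":
    for n in (1, 2):
        ok, total = count_survivable(n)
        print(f"{n} failed drive(s): survives {ok} of {total} cases")
```

In this model any single drive failure is survivable, but a second failure is only survivable when it hits the same striped pair as the first, so a failed drive still needs replacing promptly.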


Also, I think the LE drives are low-energy editions; they are NOT performance-oriented products or RAID products. You may want to look at WD Blacks at least, or the WD RE2 enterprise RAID edition drives, if you guys insist on using consumer-priced items, because a good industrial 300 GB 15k drive is 300-400 dollars, and that can be a lot for someone trying to save money.
 

jtk_35

Distinguished
May 19, 2011
1
0
18,510


One: go to Microsoft Windows Server 2008 R2. Two: never shut down a server unless you have to.

Three: never leave automatic updates on their default settings; install updates over the weekend.

Most servers have a red light that comes on if a hard drive goes out.

I've had servers since 1980.

Good Luck..