
Hard drives keep dying on me, HELP!!

Tags:
  • Hard Drives
  • Western Digital
  • Power Surge
  • Storage
March 17, 2004 8:12:22 PM

OK, here's the thing. About a month ago there was a big power surge in my house. Everything seemed fine, but my burner was toast. No biggie. A week later, my hard drive started making clicking noises while I was playing Diablo II. It got out of hand and the clicking became constant, so I got rid of it and bought a brand-new Western Digital. I had it in there for about a week, and now IT has started making clicking noises. I called up WDC and got a replacement, stuck it in my computer, and now IT is starting to make clicking noises too.

Something tells me it's not the hard drives' fault. Could it be a damaged PSU? Motherboard? IDE controller?

All the parts in my system are less than a year old. The BIOS is updated, and the CPU never gets hotter than 48°C. The hard drives never get hotter than 49°C, even during games. Both are Western Digital 80GB, 7200 RPM.

Any suggestions would be helpful; I'm willing to try anything at this point.

March 18, 2004 12:40:22 AM

Could you describe the clicking further? I ask because I had an HDD that clicked all the time, and it turned out the Molex wires weren't quite touching the connection points. Easy fix: I just adjusted the wires and everything was good.

I would tend to think it may just be the power supply, but then your system isn't unstable, is it?

Xeon

Scratch Here To Reveal Prize
March 18, 2004 3:50:43 AM

Wait...your drive made clicking noises, so you replaced it. You replaced it with the drive WD sent you, correct?

Funny thing: WD repairs drives like yours and sends them to the next customer as his replacement. The problem can return, and usually does, sometimes quickly. WD hopes that the drive will outlive the warranty of the original drive. None of my replacement drives from WD have.

Only a place as big as the internet could be home to a hero as big as Crashman!
Only a place as big as the internet could be home to an ego as large as Crashman's!
March 20, 2004 6:42:36 PM

Quote:
WD repairs drives like yours and sends them to the next customer as his replacement.

That's actually kinda smart, as I know many customers have no clue what component is malfunctioning ("my computer got borke!").
How do you know they do this?

Abit IS7 @ 275 FSB, VAGP @ 1.65, OCZ PC4200 RAM @ 550Mhz, P4 2.4 @ 3.3Ghz Vcore @ 1.625, GeXCube Radeon 9600P @ 450/626.
Thermalright SLK-947U, P3 HS @ NB.
March 20, 2004 8:50:55 PM

Easy, they put a number on the drive. When you look up the number, you find it's a refurbished drive. All the replacement drives I've gotten from WD were refurbs. All later failed.

March 21, 2004 11:04:25 PM

Is that common practice among all hard drive manufacturers, or just Western Digital?
March 22, 2004 1:41:15 AM

All my Maxtor replacement drives have been NEW. It's possible they refurbish drives and use them for replacements, but if that's the case they must be FAILING drives more frequently than WD during the refurbishing process.

March 22, 2004 7:19:45 AM

I've also had a large number of drive failures recently. My RAID-5 array keeps going into critical mode, and it's almost getting hard to keep the replacements coming fast enough. I can't seem to narrow down what could be causing these failures.

My power supply should be more than adequate for my system, I have 460W and it's just an athlon XP 2400, NIC, Quadro4 750xgl, and 3 drives (plus other standard PC components). The whole system is also connected to a UPS, so it should not be getting any surges at all.

I'm at a loss to explain it... I was restoring backups three months ago, and in the middle of it a perfectly healthy WD 100GB drive died. Got the info back off it thankfully, but a month and a half later a Maxtor 200GB drive died - it was only a month and a half old. I figured what the heck, sometimes it's defective to start with, I exchanged it in town with no hassles. Now, a month later (last week) another 200GB Maxtor drive failed, and it was a different one too, not the one I'd just replaced (though that replacement was new, exchanged off a retail shelf.)

I haven't taken a voltmeter to the case yet, but it seems that if I were over-voltage I'd have stability problems in the computer, right? The PC is running fine apart from drive crashes. When I feel the drives, they are warm but not too warm. They have enough airflow, and don't even get as hot as drives in previous systems of mine. I do, however, feel an odd sort of vibration in the drive cage. Is it normal for drives to have some vibration while they operate? It's obviously vibrating at about the rate the platters are turning, as if something in there isn't absolutely perfectly balanced. That vibration is the only abnormal thing I've noticed so far.
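
For anyone who does put a voltmeter on a drive power connector, the check itself is simple. Below is a rough sketch in Python, not a diagnostic tool: it assumes the usual 4-pin Molex layout (yellow wire +12 V, red wire +5 V) and a ±5% tolerance, which is the commonly quoted ATX figure for the positive rails. The function names and the sample readings are made up for illustration.

```python
# Nominal voltages on a 4-pin Molex drive connector
# (assumed layout: yellow = +12 V, red = +5 V).
NOMINAL_RAILS = {"+12V": 12.0, "+5V": 5.0}

# Assumed tolerance: +/-5 % of nominal, the commonly quoted ATX figure
# for the positive rails. Check your PSU's own documentation.
TOLERANCE = 0.05


def rail_in_spec(rail, measured, tolerance=TOLERANCE):
    """Return True if a measured voltage is within tolerance of nominal."""
    nominal = NOMINAL_RAILS[rail]
    return abs(measured - nominal) <= nominal * tolerance


def check_readings(readings):
    """Map each rail name to True (in spec) or False (out of spec)."""
    return {rail: rail_in_spec(rail, volts) for rail, volts in readings.items()}


if __name__ == "__main__":
    # Hypothetical meter readings, not taken from any system in this thread.
    example = {"+12V": 12.31, "+5V": 4.70}
    for rail, ok in check_readings(example).items():
        print(rail, "in spec" if ok else "OUT OF SPEC")
```

A reading outside the band while the drives are busy would point at the PSU rather than the drives. A reading inside it does not rule out ripple or brief dropouts, which a plain voltmeter will not show.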

The case is an Antec Sonata. It uses screwless drive caddy trays with rubber grommets. The drives screw into the rubber grommets from underneath the drive tray, so that they are held in place pretty much only by the stiffness of the rubber. Does this system lead to vibration and failures?
March 22, 2004 1:34:58 PM

Thanks for the info. I have a WD 250GB SATA drive whose data connector came off when I removed the cable. I was able to get it back in and it works fine. I was considering RMAing it, but it sounds like I could get back a drive that had (and could have again) an even bigger problem than the one I have now.
March 22, 2004 4:29:32 PM

Possible. I returned a drive with a noisy bearing, only to get a drive that completely quit responding around 3 months later.

March 22, 2004 4:36:03 PM

I don't think the rubber bushings would set up a catastrophic resonance situation, simply because the drive cage is heavy but the oscillations would be fast.

March 22, 2004 5:17:21 PM

Hmm, sounds a lot like the problems I have, or had, if you will.

A little over 1½ years ago I bought a RAID controller from Adaptec and two IBM GXP 120 80 GB HDDs and mounted them in my computer in a RAID 0 config. That ran fine for a whole year, until I (I like to mess around with Windows) [-peep-] up Windows and had to reinstall. I did, but the new version of Windows I had by then didn't support my drivers (and I was running without an FDD at the time; I didn't like it). So I had to get the old version again and reinstall.
But I don't quite know what happened next. My system didn't quite run like it was supposed to, as if something was nagging the system in some way.

And then suddenly my HDD started making weird sounds, like spinning down and up again, and twitching as if the head was jumping around the disk. I thought, o'[-peep-], because at the time it sounded like the disk was about to go.

Well, it did. One day, probably a week or so after the reinstall, my system suddenly locked up, and when I rebooted it said critical error, no RAID found. And there went all my data.

I then started to figure out a way to get a new HDD, but I had to work out which of the two HDDs it was. So I started looking for test software, and I found a low-level test program on IBM's site that you boot from to test the S.M.A.R.T. system on the HDD. It found no errors whatsoever.

I then bought a new RAID controller, because it had to be that if it wasn't the drives (I bought a Promise FastTrak 4000), low-level formatted my HDDs, put them back in a new RAID 0 config and installed Windows.

This ran fine for half a year, until yesterday, when I accidentally knocked my computer and the weird sound was back. My computer locked up instantly (and me: oh god, no, not again). Well, I rebooted, and it looked like my new RAID controller could handle it, and my system started again.

I then thought, if it wasn't the RAID controller and the HDDs are fine (there is a SMART monitor on this controller), then it could only be the cables. So I changed them, and I haven't had a lockup since. BUT the weird sound is still there, and my HDD sometimes spins down and up. Why, I don't know.

So what I'm trying to say, in short, is:
Check your cables too.
Don't blame your HDDs just yet, even if they make funny sounds :) . Keep them until they break down totally and then send them back (and keep a running backup of the data you can't afford to lose).

PS: my English isn't so great, so sorry :)

Edited by Optimist on 03/22/04 02:30 PM.
March 22, 2004 6:30:34 PM

I did low-level tests on the drives that failed, using different cables, plugged into an onboard motherboard IDE connector in a different computer, and the low-level test utility found disk problems. So I'm pretty sure the disks are going bad, though I did not think to test the drives using that utility before putting them onto the RAID card... so who knows.

My main problem with the SX4000 is that it has three drives in a RAID-5, and windows boots slower and all programs load slower and every kind of disk access is slower than just having a single drive. Well, that's the minor problem - I can deal with it for the purpose of having data security. The major problem is that flashing the card's BIOS made it incompatible with my motherboard (an nForce2) and their system is set up such that the only way to recover was to put the card in a different system (where it POSTed fine) and re-flash the BIOS. I don't consider a card where that kind of thing can happen as a good way of keeping my data secure - I almost lost everything to that. (thank god, I know my stuff and made a BIOS backup to floppy just in case.)
March 23, 2004 1:00:18 PM

and windows boots slower and all programs load slower and every kind of disk access is slower than just having a single drive. Well, that's the minor problem
----------------------------------------------------------------------------------------------------------------------------------

If you're using the SX4000, REMEMBER to use the beta driver on their site; every other driver lags big time. After the install I mentioned, I had the same problem: my system ran slowly and I couldn't play back movies correctly. I reinstalled a couple of times and the problem was still there. Then I remembered that before I lost my data I had been using a beta driver, and presto, after I installed that everything was fine. Well, better than fine: it was all running smoothly and fast.
March 23, 2004 4:33:55 PM

The beta driver requires the latest BIOS on the card, and the latest BIOS is completely and utterly incompatible with my motherboard. Hence a good portion of my deep hatred for Promise products (this is the second RAID card from them that has turned out to be completely useless.) I almost lost everything in the process of finding that out. Luckily I have two computers here, and the new bios would POST on the spare PC, so I was able to move the card to one of its slots and flash the bios back, but if I hadn't had a spare PC, or if I hadn't backed the factory bios up beforehand, I would have been completely and utterly f****d. (and I mean I wouldn't have been able to apply for jobs for a year or so - demo reel and all source files gone! It still pisses me off after all these weeks.)
Edited by grafixmonkey on 03/23/04 12:34 PM.
March 24, 2004 5:01:10 PM

Hmm, well, I bought my controller last year and it already had the required BIOS version, so I'd just assumed that you had that BIOS version too.
March 24, 2004 5:50:04 PM

First off, so there is no confusion as to my "potentially perceived bias", I admit to having been a great fan of Promise RAID controllers for years.

I find it surprising that you have run into compatibility issues with a motherboard. Compatibility has always been one of my chief reasons to stay with Promise.

My experience is that they have rarely been the fastest, but have been extremely compatible, and durable.

One thing I DEFINITELY have run into, though, is that the SX6000 and SX4000 series are VERY particular about the memory that is installed on the board. According to their documentation, just any old memory meeting their specs will do, but I have found that specific memory modules just don't seem to work well. So I would watch out for that.

I have used Promise cards, of varying models, on A7M, A7V, and A7N systems without any problems. Additionally, I have several running on SK8N motherboards as well.

I have never had the misfortune of flashing a Promise card with a new BIOS and not having it work in the resulting system.

I have had situations where flashing the BIOS on the Promise card REQUIRES Windows to redetect the controller card. (In fact, updating the BIOS on pretty much ANY card, or even moving it from one PCI slot to another, will have the same effect.) That is not a problem where the controller card is being used on a data array, but on a boot drive, yes, that can be a problem. (I.e., you can't boot until the "new" controller is redetected, and you can't redetect the "new" controller until you boot. Or at least what Windows perceives to be a "new" controller.) I believe this situation to be a Windows problem, though. Perhaps that is the situation that you ran into?

Personally, I avoid using RAID 5.

All of my systems are either setup as Mirrored pairs for redundancy, or as RAID 0 for speed. In ALL cases the boot drive is mirrored.

The advantage of having the boot drive mirrored is that if the Promise RAID controller fails, the drives can easily be attached directly to the motherboard and will function just as well. (NOTE: I always configure what Microsoft calls the Boot and System drives as being one and the same.)

This affords a greater number of recovery options.

I HAVE NOT used Promise controllers in a RAID 5 configuration, so I cannot talk specifically about that. It adds complexity that I don't need, and that I can get around with mirroring and/or striping configurations.
March 24, 2004 6:16:37 PM

Here is a question: does anyone think that defragging a large drive can cause it to fail? I have had two 250 gig drives fail DURING the defrag process using Norton Speed Disk. Now, I only defrag my drives when I have to. Also, the 250 gig drives were partitioned in half, so the partitions were about 125 gigs each.
March 24, 2004 7:54:48 PM

I wouldn't think that "defragging" would necessarily cause a drive to fail.

That being said, "defragging" is a very disk-intensive event. (This is "extra true" for Norton Speed Disk. I think it likely does a better job, but it's really disk intensive and time consuming.) As such, I think it may be possible for a marginal drive to exhibit problems after such activity, or, if the drives are not cooled properly, for enough heat to build up from use to become a problem.

I defragment my server drives several times a month. (NOTE: I ONLY use the Windows 2000 Server Defragment routine.) I have had no problems with drives failing during a defrag or shortly after.

I have had desktop drives fail, but I have not been able to isolate it to a specific action or factor. They just seem to break. (Well, this is one thing that will help kill a drive: running a computer on bad power without a UPS. Just lost one last week to that!)

I have not had very many server drives fail. In almost ALL cases the servers I deal with have much better cooling than the desktops. Additionally, I tend to change out server drives fairly regularly while quite honestly I use desktop drives often until they do fail!
March 24, 2004 9:12:36 PM

I once had a WD drive that I'd been using for backups fail the instant I deleted a file that hadn't been touched for a very, very long time, one that was at the very beginning of the disk. I'd guess that there was a spot on your drive that was waiting to interfere with a drive head in some way, and it just never passed under the read head during normal use until you defragged it (because defragging would likely cover the entire disk surface, or at least all that had been used).

jim552, about Promise...
The SX4000 was using a Raid-5 config. I needed it for data security, and thought it would give me more disk space total and require fewer drives (400GB using three 200GB drives, instead of needing four to get that capacity in a Raid-10.) I thought it would give me better speed than a two-200GB raid-1, but at least in this case I was wrong (maybe in all cases w/ raid-5; many documents state that raid-5 is supposed to have good performance. Maybe that is only for large numbers of drives.)
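
To make that capacity comparison concrete, here is a small sketch of the standard usable-capacity arithmetic for the RAID levels mentioned in this thread. It assumes equal-sized drives and ignores any controller overhead; `usable_capacity_gb` is a made-up helper name, not anything from Promise's tools.

```python
def usable_capacity_gb(level, drives, size_gb):
    """Usable capacity for an array of equal-sized drives.

    Textbook formulas only; real controllers may reserve some space.
    """
    if level == "raid0":   # striping, no redundancy
        if drives < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        return drives * size_gb
    if level == "raid1":   # every drive holds a full copy
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return size_gb
    if level == "raid5":   # one drive's worth of space goes to parity
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * size_gb
    if level == "raid10":  # striped mirrored pairs
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        return (drives // 2) * size_gb
    raise ValueError("unknown RAID level: " + level)


if __name__ == "__main__":
    for level, drives in (("raid5", 3), ("raid10", 4), ("raid1", 2)):
        print(level, drives, "x 200GB ->", usable_capacity_gb(level, drives, 200), "GB")
```

With 200GB drives, three in RAID 5 and four in RAID 10 both come out at 400GB usable, while a two-drive RAID 1 gives 200GB.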

After flashing the BIOS on the card, the computer would fail during the SX4000 Bios POST process. It would just hang there forever. Same with the drives connected and disconnected. Same with all cards except the graphics card and the SX4000 unplugged, and all integrated peripherals including IDE and USB turned off on the motherboard. Also the same, with the CPU underclocked to a slower FSB and multiplier (it had not been overclocked before - was running at rated.) The motherboard had previously been flashed to the most recent version (it is an Epox 8RDA+), on suggestion from Promise, as a potential way to fix the performance problem. Promise suggested I remove my Quadro4 750xgl and try a "small" graphics card like a Geforce 2, but after six hours of trying to fix this, I drew the line there. I plugged it in to the other computer and flashed it back, and just endured the slow performance.

Now, about my other Promise card... It was a TX2000 with Raid-0. Whenever I transferred anything to that card over the network, the computer would hard-lock in the middle of the transfer process. No errors, blue screen, or rebooting, just an instant system freeze, mouse, network and all. I could transfer up to 3 megabytes before this happened, as long as the files weren't very small and numerous. After rebooting, whichever file the computer had crashed on before was now a "funny file", a ticking time bomb on my disk, which would instantly blue-screen and crash windows XP the instant the OS opened it for reading. This included pressing 'Delete' with the file selected. The files had to be removed using the command prompt. I lost a lot of old work to this problem, and thankfully had a CD backup of it to restore some of it from. I should note that the Promise card did this with the following different network cards: onboard 10/100 network, Linksys EG1032 V1.0 at both 100 and 1000 speed, Linksys EG1032 V2 at both 100 and 1000 speed, and Realtek 8139 10/100, using many combinations of hubs/switches/routers/cables. I initially thought the network cards were overheating, but found they were not, because it also did it with an 80mm high-output fan blowing full blast directly onto the network card, with a heatsink attached to its chip.

The kicker: Another college friend of mine also reports that his TX2000 did the same exact thing. So does another person on this forum, I can't remember who said it. I have since run into a couple other people who say their Promise cards are being used as cat litter or lining for hamster cages. You are the first person, ever, that I have met who likes Promise cards and would recommend them to anyone else, so I think I just have to assume you got lucky with them. I for one will not touch another one, ever.
March 24, 2004 10:55:25 PM

Wow!
It does indeed sound as though you had a H@#$ of a time!

I have NEVER had any problems close to that!

The only real problem I have had was last night, when I decided I was going to remove my Compaq Smart 2SL SCSI controller from my Proliant 1200 and replace it with a Promise FastTrak S150 TX4.

It had what appears to be a similar situation. It would POST, but then prior to booting up it would just lock. The first time I waited for about 1/2 an hour, but it just stayed there.

Then I popped out the Smart 2SL card, and booted with the Promise card by itself, with no drives attached. It booted up. (Well, I booted from the floppy.) Powered off, put the Smart 2SL card back in, booted up. (Again from floppy, no drives attached yet! Made sure to tell the Smart 2SL card to preserve my configuration, and not delete the failed drives!) Powered up again, with the drives all attached, and loaded the Promise drivers. Success! Rebooted to set the drivers, rebooted again, configured the Promise RAID for single drive operation, booted to floppy, ghosted the data over, popped out the Smart 2SL card, booted again with only the Promise card. Success! Rebooted again, killed the single drive array, created an array that was mirrored, turned off the second drive, rebooted into Windows via the Promise card, loaded up the Promise Array Monitoring program, turned on the second drive, watched for the rebuild messages. Done!

Yeah, the first part was pretty much, "What the F&#$!", but once I got it to not conflict it was like clockwork, as fast as I could go! (Mainly, just waiting around trying to stay awake for the next stage!)

I don't think that I would categorize me as being "lucky" though! I would categorize you as possibly being "unlucky"! (Note: Not in your actions, but rather in getting possibly a couple of bad boards!)

From your description you hung in there a whole lot longer than I would have! If my first couple of experiences were to have to pull cards, swap cards, change clock frequencies, I would give up on them too!

For me though, it wasn't that way.

Actually, now that I am writing this there was another problem that comes to mind.

The PAM program, Promise Array Monitoring program, had a problem when the SATA controllers first came out.

It would sit there and chew up about 7 handles every 2-3 seconds. Eventually, it would get somewhere around 700,000 handles and then kill the system! It took them, I think, three revisions of the PAM to get that fixed.
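
A quick back-of-the-envelope for that leak, taking the figures above at face value (about 7 handles every 2 to 3 seconds, trouble at roughly 700,000 handles). The numbers are approximate recollections, so treat the result as an order of magnitude only; the function name is made up for illustration.

```python
def seconds_until_exhaustion(handles_per_tick, tick_seconds, limit):
    """Time for a steady leak to reach `limit` handles, starting from zero."""
    ticks_needed = limit / handles_per_tick
    return ticks_needed * tick_seconds


if __name__ == "__main__":
    # Figures quoted in the post: ~7 handles per 2-3 s, ~700,000 handles.
    fastest = seconds_until_exhaustion(7, 2, 700_000)
    slowest = seconds_until_exhaustion(7, 3, 700_000)
    print(f"{fastest / 3600:.1f} to {slowest / 3600:.1f} hours")
```

That works out to roughly 56 to 83 hours, so a freshly booted machine would run for somewhere between two and three and a half days before the leak caught up with it.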

That was a HUGE annoyance to me because I had 5 systems running and wanted to receive Email status of what was happening on them. Without the PAM working properly that wouldn't happen. (It was the background service that allowed PAM to connect that was specifically the problem, but I am just considering them the same. Without one the other doesn't work!)

Now, I am happy.

I think there are 12 systems running on my network at this moment. ALL using the SATA controllers, but the original 5 used various IDE controllers before.

There is one more thing, that will start becoming a problem for me in the near future. (One that if Promise doesn't fix it I may HAVE TO leave them.)

Currently the SATA controllers are only designed to have one card per machine! This is incredibly unfortunate, because I made the original assumption that since the SX6000 could handle up to four cards per machine, the SATA controllers would handle multiple cards as well!

NOT THE CASE!
(Well, if the cards use different drivers, and the drivers don't cross recognize the other cards, you can have multiple ones. Also if you hand code the resources Promise tech support says it will likely work, but it is not supported. I only put that here in case someone says, "Yes I've done it!" So have I too, but I don't do things that aren't supported by the manufacturer in production, nor do I like things that involve anything close to the concept of "Zen" in getting it to work!)

I have talked to Promise, but they aren't committing to fixing this.

So, if anyone is actually reading this and is still awake, I would REALLY be interested in any experience anyone has with other controllers that can, at this moment, handle multiple cards per machine.

This isn't an issue at this moment for me, but will be around June!
March 24, 2004 11:24:46 PM

Yes, that's my other problem with the Promise card I got. It wouldn't work in the machine alongside the TX2000: one card or the other refused to POST. That meant that I could have a raid-5, or a raid-0, but not both. I had to have my data on a secure system, so there was no choice but to put up with raid-5 performance in my video editing. Thankfully the new workstation I'm building will have both in the same box. (with a Highpoint and a 3Ware card. :wink: )