The internal hard disks of my Linux box keep crashing beyond recovery.
Earlier, I had Fedora installed on two Seagate hard disks configured as a RAID-1 (mirror) array.
One day, the entire running system suddenly locked up: the OS reported on screen that the entire filesystem had become read-only. I must mention that I had a faulty, noisy telephone, not connected to this computer, and at the moment the computer did this, a sharp, sudden noise came from the telephone receiver and from my computer almost simultaneously. I don't know if these were linked.
When I restarted the machine, the OS was unable to recover the filesystem, so I reinstalled. After that I had 2-3 such crashes, about 2-3 months apart, but the OS recovered from most of them. I then removed the RAID-1 setup to make it a single-disk system, leaving the other hard disk idle.
Finally the day came when that single hard disk crashed and could not be recovered by the OS or by the repair tool provided by the hard disk manufacturer. I sent it to them and they replaced it completely.
But I carried on with the other hard disk, not the replacement. To my agony, the same thing is happening with it. Today the filesystem suddenly became read-only and the OS could not repair it. I have just repaired it with the manufacturer's repair tool. I am now counting the days until it becomes completely unrepairable again.
I have no idea why this random crash/corruption is happening. My box is almost 2 years old now.
Is this a bad PSU issue, which is causing some electromagnetic induction on the hard disks?
I have read that when faced with too many simultaneous writing attempts, the OS may make the filesystem read-only to prevent any unauthorized attack.
Some information on how to check PSU health would be helpful, along with some insight into this issue.
It could just be that both drives went bad, especially if you bought them at the same time and they are the same make/model, manufactured around the same time. Hard drives do go bad, sometimes without warning.
Personally, I would start by replacing the drives with a brand and model known to be reliable. I've never really had good luck with Maxtor or Seagate hard drives; I've come to prefer Western Digital, although the new 1TB Seagate Barracuda 7200.12 drives are getting favorable reviews. The newer Samsung Spinpoints are also getting some really good reviews.
Electromagnetic induction from the PSU, while possible, is pretty unlikely unless the PSU is literally sitting on top of the drives. As long as you've got a quality brand-name PSU with enough watts/amps to run your components, I doubt it's the PSU.
For kicks and giggles, once your OS is back up and running stably, I'd connect those drives and run some diagnostics for bad sectors and platter integrity.
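One way to do that bad-sector check on Linux is to look at the drive's SMART attribute table, for example the output of `smartctl -A /dev/sdX` from smartmontools. Below is a rough Python sketch that parses that table and picks out a few attributes usually associated with failing sectors. The names `WATCHED`, `parse_smart_attributes` and `suspicious`, the choice of attributes, and the sample table are my own illustration, not output from the drives in this thread.

```python
import re

# Attributes commonly associated with failing sectors (an assumed selection).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")

def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from `smartctl -A` style table text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the raw value is the 10th column.
        if len(fields) >= 10 and fields[0].isdigit():
            match = re.match(r"\d+", fields[9])
            if match:
                attrs[fields[1]] = int(match.group())
    return attrs

def suspicious(attrs):
    """Return the watched attributes whose raw value is greater than zero."""
    return {name: attrs[name] for name in WATCHED if attrs.get(name, 0) > 0}

# Made-up sample in the usual smartctl layout, for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       13456
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

if __name__ == "__main__":
    print(suspicious(parse_smart_attributes(SAMPLE)))
```

Non-zero reallocated or pending sector counts do not prove the drive is dying, but if they keep growing between runs I would stop trusting the disk.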
My hard disks are Seagate Barracuda 7200.10 drives. I did fix the corrupted disk quite a few times using the software provided by Seagate, except the last time, when the software could not repair the disk.
I don't know the actual cause of the repeated crashes, but other incidents led me to a conclusion.
One fortunate day, I found that the motherboard was not getting any power. After some routine checks, I found that the power cable connecting the external UPS to the computer case was faulty. This reminded me that the computer sometimes did not start right after I pushed the "Start" button. I am not sure, but I guess it is possible that, because of the faulty power cable, the motherboard sometimes did not get the correct power level, and that this produced electrical noise which caused spurious writes to the hard disk, resulting in disk corruption.
But, again, this is entirely speculation. It is quite possible that the cause of the repeated crashes is entirely different.
List your system specs, including the make and model of your PSU. It could be a bad pair of drives, insufficient capacity on your PSU, or something else. It's hard to analyze PSU-related problems.
It happened to me in the past: two RAID-0 drives purchased at the same time, from the same lot, with very close serial numbers. They kept crashing for no apparent reason. I sent both back for warranty replacement, and after that everything worked fine.
I doubt the power supply is the cause of your hard drive problems. Generally, if there is a problem with a high-quality power supply, it will just shut off.
It could be the power supply, or possibly the motherboard.
Does the HDD corruption happen during heavy system loads, or only during heavy hard drive use (like a partition/system state backup verification)?
The "power good" signal is sometimes craptacular on cheap power supplies: it is simply factory-set to "yes", so the supply will continue to feed your system "bad" power (over- or under-voltage, over- or under-current, spikes, brownouts, etc.).
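If you can get rail readings from a multimeter (or, less reliably, from the motherboard sensors), a quick sanity check is to compare them against the nominal ATX voltages. Here is a minimal Python sketch, assuming the usual +/-5% tolerance on the +3.3V, +5V and +12V rails. The names `RAILS`, `TOLERANCE` and `out_of_spec` and the example readings are made up for illustration.

```python
# Nominal ATX rail voltages; +/-5% tolerance on the positive rails is assumed here.
RAILS = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}
TOLERANCE = 0.05

def out_of_spec(readings):
    """Return the rails whose measured voltage deviates by more than TOLERANCE."""
    bad = {}
    for rail, measured in readings.items():
        nominal = RAILS[rail]
        if abs(measured - nominal) > nominal * TOLERANCE:
            bad[rail] = measured
    return bad

if __name__ == "__main__":
    # Hypothetical readings, ideally taken while the system is under load.
    print(out_of_spec({"+3.3V": 3.31, "+5V": 5.02, "+12V": 11.2}))
```

A single snapshot will not catch spikes or brief brownouts, so treat a clean result as "nothing obviously wrong" rather than proof that the PSU is healthy.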
some electromagnetic induction on the hard disks?
... WTF? If you're serious, you can try shielding the hard drive (unmounted) behind a large ferrous object.
Unrelated: I usually check for heat first (with SpeedFan or a Linux equivalent), as I have found that a broken or loose heatsink bracket is hard to see with the motherboard mounted in the case. Afterwards I would check the memory with Memtest86+, then put the processor and RAM under load with Prime95.
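For the heat check on Linux, one option that needs no extra tools is to read the kernel's thermal zones under `/sys/class/thermal`, which report temperatures in millidegrees Celsius. The sketch below is only an illustration: the function names and the 80 C limit are my own assumptions, and which zones exist (if any) depends on the hardware and drivers.

```python
from pathlib import Path

def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every thermal zone found under base."""
    temps = {}
    for temp_file in sorted(Path(base).glob("thermal_zone*/temp")):
        try:
            millidegrees = int(temp_file.read_text().strip())
        except (OSError, ValueError):
            continue  # zone not readable or not reporting a number
        temps[temp_file.parent.name] = millidegrees / 1000.0
    return temps

def too_hot(temps, limit_c=80.0):
    """Return the zones at or above limit_c (an assumed threshold)."""
    return {zone: t for zone, t in temps.items() if t >= limit_c}

if __name__ == "__main__":
    readings = read_zone_temps()
    print(readings)
    print("over limit:", too_hot(readings))
```

Run it while Prime95 or a similar load is going; idle temperatures will not tell you much about a loose heatsink.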