From "1 bad sector" to none: what does SMART data really mean?

jhsachs

Distinguished
Apr 10, 2009
224
6
18,685
A recent experience led me to wonder just what SMART data represents and how reliable it is. If you know how it's collected and reported, I'd appreciate your comments.

I was testing two used HDDs of unknown background, using Ubuntu 16.04. I first ran Disks, which told me that each drive was OK but had one bad sector.

I then deleted each drive's active partitions and executed shred, which ran uneventfully. When I was done, I noticed that Disks now said one drive was OK with no bad sectors. I looked at the other drive and found the same thing.

I don't know whether SMART data is supposed to be cumulative -- once a sector is reported bad, it's reported bad forever -- or the data is refreshed periodically. Either way, it seems strange for bad sectors to experience a spontaneous recovery. If they can do that, I wonder how much "no bad sectors" actually means. If bad sectors can go away for no reason, it stands to reason that they can come back for no reason. These disks could just as well have reported "no bad sectors" when I first used them, and one or more bad sectors when I looked again -- or when I didn't.

Should I be concerned about this? Are there additional precautions I can take?
 
Hard drives have spare sectors that can replace defective sectors, so when you partitioned the drive, a spare sector was swapped in. SMART reports the bad sectors as the uncorrectable sector count, which is cumulative and cannot go down.
 
Can you post screenshots with SMART data? Something like the HD Tune Health tab (I don't know the Linux equivalent).

A bad block is a file system/OS concept. Essentially, when the OS can't read a data block, it marks the block as "bad/used" and doesn't try to read or write it anymore.

SMART works differently, at a lower (hardware) level.
SMART reports Reallocated Sector Count and Current Pending Sector Count.
The process is as follows: when a sector cannot be read, it is marked as pending (Current Pending Sector Count increases).
To resolve the pending status, the sector has to be overwritten.
Then, depending on whether the rewrite succeeds or fails, the sector either stays in service (rewrite succeeded) or gets reallocated to the spare area (rewrite failed, so Reallocated Sector Count increases). In both cases the pending status is cleared (Current Pending Sector Count decreases).
But the spare area is not limitless. It can be exhausted (the current value of Reallocated Sector Count reaches its threshold value). When this happens, SMART reports drive failure.
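
The pending/reallocated bookkeeping described here can be sketched as a toy model. This is only an illustration of the counters, not how any drive firmware is written; the class `ToyDrive`, its fields, and the spare-area size of 4 are all made up for the example.

```python
class ToyDrive:
    """Toy model of two SMART counters (hypothetical, not real firmware)."""

    def __init__(self, spare_sectors=4):
        self.spare_sectors = spare_sectors  # assumed size of the spare area
        self.unreadable = set()   # sectors whose current contents cannot be read
        self.damaged = set()      # sectors whose media cannot hold a rewrite
        self.pending = set()      # feeds "Current Pending Sector Count"
        self.reallocated = set()  # feeds "Reallocated Sector Count"

    def read(self, lba):
        """A failed read marks the sector as pending."""
        if lba in self.unreadable and lba not in self.reallocated:
            self.pending.add(lba)
            return False
        return True

    def write(self, lba):
        """An overwrite resolves a pending sector one way or the other."""
        if lba in self.damaged and lba not in self.reallocated:
            if len(self.reallocated) >= self.spare_sectors:
                raise RuntimeError("spare area exhausted")
            self.reallocated.add(lba)  # rewrite failed: remap to a spare
        self.unreadable.discard(lba)   # fresh data is readable again
        self.pending.discard(lba)      # pending status is cleared either way

    def counters(self):
        return {"pending": len(self.pending),
                "reallocated": len(self.reallocated)}


drive = ToyDrive()
drive.unreadable = {100, 200}  # two sectors fail to read
drive.damaged = {200}          # only one of them is physically bad
drive.read(100)
drive.read(200)
print(drive.counters())        # {'pending': 2, 'reallocated': 0}

for lba in (100, 200):         # a full overwrite, like shred does
    drive.write(lba)
print(drive.counters())        # {'pending': 0, 'reallocated': 1}
```

In this model a full overwrite always brings the pending count back to zero, which would be consistent with a "1 bad sector" report disappearing after shred, assuming the tool was showing the pending count.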
 

jhsachs


Well, not really. I could post a snapshot of the current "Disk is OK" report, but it wouldn't be very interesting. I can't show the transition from "Disk is OK (1 bad sector)" to "Disk is OK" unless I can find a time machine -- and if I could, it still wouldn't be very interesting.

From what you and others have said, I gather that SMART is a great technology for protecting disks from progressive failure -- as long as spare sectors are available -- but as a tool for monitoring a disk's health it doesn't even try to be useful.

Suppose my disk is going bad at an average rate of 32 sectors a day, and over the past 16 days it has replaced 511 bad sectors with spares, leaving one spare unused. I look at the SMART data, and it says I have no bad sectors. But guess what's gonna happen over the next day or so. It doesn't even give me a warning until I start losing disk integrity and data.
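
The arithmetic in that scenario can be written out as a small sketch. The figures are the hypothetical ones from the paragraph above (511 spares used, one left, so 512 in total, failing at 32 sectors a day); the function name is invented for illustration. It assumes the drive exposes the number of reallocated sectors as a raw count and that the size of the spare area is known, which drives do not generally publish.

```python
import math


def days_until_spares_exhausted(total_spares, reallocated, rate_per_day):
    """Estimate the days left before the spare area runs out.

    All three inputs are assumptions supplied by the caller.
    """
    if rate_per_day <= 0:
        return math.inf  # no ongoing failures, so no projected exhaustion
    remaining = max(total_spares - reallocated, 0)
    return remaining / rate_per_day


# Scenario from the post: 512 spares, 511 already used, 32 failures a day.
print(days_until_spares_exhausted(512, 511, 32))  # 0.03125
```

The point of the sketch is that the trend in the reallocated count, not a single "no bad sectors" reading, is what would give advance warning in such a case.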

Perhaps there's some other tool in/for either Linux or Windows that would give me more useful information.
 

jhsachs

smartmontools sounds like an ideal solution. I won't have a chance to try it for a few days, but I'll report my impressions when I do.
 

jhsachs

I tried installing smartmontools this evening, but did not succeed. I was trying to install it with apt-get from Ubuntu's default repository.

Unfortunately I'm not writing from the machine where I had the problem, so I can't reproduce the complete console output. I think everything went well up to the point where it tried to install systemd; then it gave the message "/var/lib/dpkg/info/systemd.postinst: 1: /var/lib/dpkg/info/systemd.postinst: Syntax error: Unterminated quoted string."

I see that packages.ubuntu.com lets me download smartmontools as well, and I can try doing that when I next have time to work on this. I don't know what will be in the download or what I'll have to do to install it, though. I much prefer the simplicity and predictability of apt-get... when it works. :)
 

jhsachs

I tried the apt-get install again after a reboot and it worked. The program works too.

I'm going to have to study the documentation before I can use it. The examples I found showed me how to make it say "Disk is [not] OK" and how to get an exhaustive account of every burp and hiccough the disk has ever had, but so far, nothing in between!

I'm pretty sure it will do exactly what I want, once I figure out the right incantations.
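
For the middle ground between the one-line verdict (`smartctl -H`) and the full dump (`smartctl -a`), one option is `smartctl -A`, which prints only the attribute table. Below is a minimal sketch that picks the sector-related rows out of that table. The attribute names are the ones smartmontools commonly uses for ATA drives and may differ by vendor; the `SAMPLE` text is an invented illustration of the table layout, not output from the drives discussed here; and `parse_attributes` and `read_attributes` are hypothetical helper names.

```python
import subprocess

# Attribute names as commonly printed by smartctl for ATA drives (may vary).
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector",
           "Offline_Uncorrectable"}


def parse_attributes(text):
    """Extract value, threshold and raw count for the watched attributes."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        if fields[1] in WATCHED:
            rows[fields[1]] = {"value": int(fields[3]),
                               "thresh": int(fields[5]),
                               "raw": int(fields[9])}
    return rows


def read_attributes(device):
    """Run smartctl -A on a device such as /dev/sda (usually needs root)."""
    # smartctl's exit status is a bitmask, so non-zero is not treated as fatal.
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return parse_attributes(result.stdout)


# Invented sample in the layout of the attribute table, for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       9123
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

for name, row in parse_attributes(SAMPLE).items():
    print(name, row["raw"])
```

Running `read_attributes` on a schedule and logging the raw counts would show whether they are stable or climbing, which is the kind of in-between information the verdict alone does not give.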