Has SMART ever successfully predicted failure for anybody?

Anonymous
April 14, 2004 1:55:26 AM

Archived from groups: comp.sys.ibm.pc.hardware.storage

SMART monitoring sounds like a great idea to me. Who better to predict
a drive's failure than the drive itself, right? I mean, the drive knows
what it's doing internally, such as silently remapping bad sectors,
keeping track of how long it takes to spin up, error counts and all that
fun stuff.

SMART is there, comforting me.
Calling out "I'm good. I'll never let you down"

And then another drive begins to show symptoms of impending failure. In
my particular case, it's a Seagate Barracuda V. Just a couple months
past the expiration of the 1-year warranty.

And I start to realize that I can't recall a single case where SMART has
predicted a failure before I actually see the symptoms of the failing
drive. Not in any of the machines I own, or in machines of other people
I know. I've used HDD Health, DTemp, Promise Fastcheck, smartmontools
and ide-smart (on linux), and various manufacturer's utilities for
checking smart.

I've been able to monitor temperatures (it's at 37C right now, which
seems to be the norm for it), and I can look at statistics on the drive.
But not once has SMART actually predicted problems with a drive
before I actually see the symptoms (such as a repeated "thunk" sound, a
frozen mouse in Win2k with the HDD light on solid, or event log entries
indicating a drive error). Doing a full surface scan in all such cases
has indicated bad sectors on the drive.

Is SMART not predictive enough? Is this a fault of the various SMART
monitoring utilities? If SMART doesn't "kick in" until later on in the
dying process, then what good is it? (If I can see the symptoms well in
advance of it giving me a warning) What are your experiences with SMART
and its (in)ability to predict failure?

Thanks.
-WD
Anonymous
April 14, 2004 1:55:27 AM

> Is SMART not predictive enough? Is this a fault of the various SMART
> monitoring utilities? If SMART doesn't "kick in" until later on in the
> dying process, then what good is it? (If I can see the symptoms well in
> advance of it giving me a warning) What are your experiences with SMART
> and its (in)ability to predict failure?

I've been monitoring about 1100 ATA disk drives (IBM and Maxtor) on
two large linux data analysis clusters for the past two years. I've
seen more than forty failures -- SMART has predicted between half and
two-thirds of those. I've also had advanced warning of *other* types
of 'non-failure' problems, particularly unreadable (uncorrectable)
disk sectors which *would* have caused unpredictable and unrepeatable
errors with the OS and data analysis.

The bottom line is that SMART can not and will not predict all
failures. But in many cases (especially if you or the monitoring
software know what to watch for) it will predict a substantial
fraction of them.

Bruce Allen
April 14, 2004 4:58:25 AM

"Will Dormann" <wdormann@yahoo.com.invalid> wrote in message
news:ilZec.10106$ci4.1038@fe1.columbus.rr.com...
> SMART monitoring sounds like a great idea to me. Who better to predict
> a drive's failure than the drive itself, right? I mean, the drive knows
> what it's doing internally, such as silently remapping bad sectors,
> keeping track of how long it takes to spin up, error counts and all that
> fun stuff.
>
> SMART is there, comforting me.
> Calling out "I'm good. I'll never let you down"

Nope, that's where you're wrong. Not all failures are predictable. The absence
of SMART alarms by no means indicates that the drive will not let you down.
Using SMART you *may* be able to predict failures that are predictable; that's
the best SMART can do. A drive can die without SMART *ever* predicting it.

--
Joep
Anonymous
April 14, 2004 4:58:25 AM

Have you any favourites among the SMART monitoring software?

I started trying some the other night. Most are pretty limited, and will
only show the motherboard attached drives. I've got 3 more on a Promise
Ultra133 that I'd like to examine now and then but looks like I'd have to
buy a tool to do that.

I have not checked out the most recent tools from the vendors yet. The
drives are IBM and Maxtor. I wasn't sure if they offered any windows based
SMART monitors for free.
Anonymous
April 14, 2004 5:48:14 AM

Mr. Grinch wrote:
> Have you any favourites among the SMART monitoring software?
>
> I started trying some the other night. Most are pretty limited, and will
> only show the motherboard attached drives. I've got 3 more on a Promise
> Ultra133 that I'd like to examine now and then but looks like I'd have to
> buy a tool to do that.


Smartmon is junk. HDD Health is nice once you turn off the non-critical
alerts (like a change in temp of 1 degree... no need to keep popping up
alerts for that). I'm currently using DTemp, since it puts the temp
of the drive in the Windows system tray.

I have a drive on a promise controller (MBFastTrack133) that isn't
supported by those programs, so I need to run Promise's FastCheck
utility. It just gives me a "Functional" report on the drive and
nothing else, but I guess that's better than nothing.


-WD
Anonymous
April 14, 2004 12:39:18 PM

Will Dormann <wdormann@yahoo.com.invalid> wrote in
message news:ilZec.10106$ci4.1038@fe1.columbus.rr.com...

> SMART monitoring sounds like a great idea to me. Who
> better to predict a drive's failure than the drive itself, right?
> I mean, the drive knows what it's doing internally, such as
> silently remapping bad sectors, keeping track of how long
> it takes to spin up, error counts and all that fun stuff.

> SMART is there, comforting me.
> Calling out "I'm good. I'll never let you down"

It doesn't say that, just that it's better than nothing.

> And then another drive begins to show symptoms of impending
> failure. In my particular case, it's a Seagate Barracuda V.
> Just a couple months past the expiration of the 1-year warranty.

> And I start to realize that I can't recall a single case where
> SMART has predicted a failure before I actually see the
> symptoms of the failing drive. Not in any of the machines
> I own, or in machines of other people I know.

There have been some examples reported in here.

It should also have helped with the most common
75GXP failures if it had been implemented properly.

And it's obviously never going to help with
sudden semiconductor failure, for example,
or with lots of other causes of hard drive failure,
like the power supply dying and killing the drive.

> I've used HDD Health, DTemp, Promise Fastcheck,
> smartmontools and ide-smart (on linux), and various
> manufacturer's utilities for checking smart.

> I've been able to monitor temperatures

And that alone is quite handy, being a very convenient
way to keep track of what the drive temp is doing in the
worst conditions of high room temp and drive activity.

That does show that Dell laptops generally run their
hard drives at higher temps than I would voluntarily,
even though they are technically within the allowed range.

It also allows you to run something like MBM to
bring up alarms on temperature excursions that
are due to fan failure or just fur clogging etc.

> (it's at 37C right now, which seems to be the norm
> for it), and I can look at statistics on the drive.

> But not once has SMART actually predicted problems
> with a drive before I actually see the symptoms (such
> as a repeated "thunk" sound, a frozen mouse in Win2k
> with the HDD light on solid, or event log entries
> indicating a drive error).

Sure, but with drive failure so rare, that's hardly
surprising, given that SMART was never
claimed to be able to anticipate all hard drive failures.

I haven't actually had a drive failure in any system
I own or look after since SMART showed up, with
the exception of a single Fujitsu MPA drive that
has some dry joint or cracked trace problem that
can stop it spinning up unless you deliberately apply
pressure one way to the Molex power connector.

I don't need SMART to tell me that it's not spinning up.

> Doing a full surface scan in all such cases
> has indicated bad sectors on the drive.

Then that particular SMART implementation is
inadequate, because it would be better if you
were warned that new bad sectors have appeared.
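
A warning of that kind amounts to comparing successive readings of the sector-related counters. The sketch below is only an illustration of the idea: the snapshot format (attribute ID mapped to raw value) and the choice of watched attributes are assumptions, not the behaviour of any utility named in this thread.

```python
# Hypothetical snapshot format: SMART attribute ID (hex string) -> raw value.
# The watched set is an assumption chosen for illustration.
WATCHED = {
    "05": "Reallocated Sector Count",
    "C5": "Current Pending Sector",
    "C6": "Offline Uncorrectable",
}

def new_bad_sector_warnings(previous, current):
    """Return one message per watched counter whose raw value has grown."""
    warnings = []
    for attr_id, name in WATCHED.items():
        before = previous.get(attr_id, 0)
        after = current.get(attr_id, 0)
        if after > before:
            warnings.append(f"{name} rose from {before} to {after}")
    return warnings

# Made-up readings taken a day apart.
yesterday = {"05": 0, "C5": 0, "C6": 0}
today = {"05": 3, "C5": 1, "C6": 0}

for message in new_bad_sector_warnings(yesterday, today):
    print("WARNING:", message)
```

Comparing against the previous reading, rather than against a fixed threshold, is what lets a monitor speak up as soon as the first new bad sector appears.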

> Is SMART not predictive enough?

Clearly in that particular example it's not implemented
well enough, if you did actually have the BIOS SMART
check enabled and SMART was just incapable
of reporting the problem it had observed.

> Is this a fault of the various SMART monitoring utilities?

It certainly can be.

> If SMART doesn't "kick in" until later on
> in the dying process, then what good is it?

Sure, but that's not cast in stone.

> (If I can see the symptoms well in advance of it giving me a warning)
> What are your experiences with SMART and its (in)ability to predict failure?

I basically haven't had any failures since SMART showed up where
it would have helped, even if it was perfectly implemented.

That's not to say that others haven't had hard drive failures
that should have produced a SMART warning, though.
Anonymous
April 14, 2004 11:53:16 PM

"Will Dormann" <wdormann@yahoo.com.invalid> wrote in message news:yL0fc.8908$B%4.5111@fe2.columbus.rr.com...
> Mr. Grinch wrote:
> > Have you any favourites among the SMART monitoring software?
> >
> > I started trying some the other night. Most are pretty limited, and will
> > only show the motherboard attached drives. I've got 3 more on a Promise
> > Ultra133 that I'd like to examine now and then but looks like I'd have to
> > buy a tool to do that.
>
>
> Smartmon is junk,

David Lethe will be so pleased to know.

> HDD Health is nice once you turn off the non-critical
> alerts (like a change in temp of 1 degree... no need to keep popping up
> alerts for that). I'm currently using DTemp, since it puts the temp
> of the drive in the windows sytem tray.
>
> I have a drive on a promise controller (MBFastTrack133) that isn't
> supported by those programs, so I need to run Promise's FastCheck
> utility. It just gives me a "Functional" report on the drive and
> nothing else, but I guess that's better than nothing.
>
>
> -WD
Anonymous
April 14, 2004 11:55:55 PM

In article <ilZec.10106$ci4.1038@fe1.columbus.rr.com>, Will Dormann
<wdormann@yahoo.com.invalid> writes

>What are your experiences with SMART
>and its (in)ability to predict failure?

I worked as a bench tech fixing Compaq machines under warranty. Saw
many dozens of machines which would produce a "SMART Predicts Imminent
Disk Failure" message on bootup when the drive was failing.

Drive replacement was covered by Compaq's 3 year warranty even if it had
not actually yet failed, allowing data to be transferred from the old
drive to the warranty replacement while it was still functional.

--
A. Top posters.
Q. What's the most annoying thing on Usenet?
Anonymous
April 15, 2004 12:01:22 AM

"Mr. Grinch" <grinch@hatespam.yucky> wrote in message news:Xns94CAC104071B4grinchhatespamyucksh@24.71.223.159
> Have you any favourites among the SMART monitoring software?
>
> I started trying some the other night.

> Most are pretty limited, and will only show the motherboard attached drives.

Most?

> I've got 3 more on a Promise Ultra133 that I'd like to examine now and then
> but looks like I'd have to buy a tool to do that.

The Promise drivers may not support S.M.A.R.T., and therefore no software
will show you S.M.A.R.T. data.

>
> I have not checked out the most recent tools from the vendors yet. The
> drives are IBM and Maxtor. I wasn't sure if they offered any windows
> based SMART monitors for free.
Anonymous
April 15, 2004 12:41:04 AM

Folkert Rienstra wrote:

> "Will Dormann" <wdormann@yahoo.com.invalid> wrote in message news:yL0fc.8908$B%4.5111@fe2.columbus.rr.com...
>
>>Mr. Grinch wrote:
>>
>>>Have you any favourites among the SMART monitoring software?
>>>
>>>I started trying some the other night. Most are pretty limited, and will
>>>only show the motherboard attached drives. I've got 3 more on a Promise
>>>Ultra133 that I'd like to examine now and then but looks like I'd have to
>>>buy a tool to do that.
>>
>>
>>Smartmon is junk,
>
>
> David Lethe will be so pleased to know.


I'm sure he will be. To be fair, the software may have some technical
merits. But it's definitely a candidate for the User Interface Hall Of
Shame in my book. :)


-WD
Anonymous
April 15, 2004 3:50:27 AM

"Folkert Rienstra" <see_reply-to@myweb.nl> wrote in
news:c5juq0$2ibp9$3@ID-79662.news.uni-berlin.de:

> "Mr. Grinch" <grinch@hatespam.yucky> wrote in message
> news:Xns94CAC104071B4grinchhatespamyucksh@24.71.223.159
>> Have you any favourites among the SMART monitoring software?
>>
>> I started trying some the other night.
>
>> Most are pretty limited, and will only show the motherboard attached
>> drives.
>
> Most?
>

All the ones I've tried so far. One claims to show the devices on other
controllers, but you have to buy it first. I was hoping to find something
I'd not have to buy that would show SMART status on the installed
controller.

>> I've got 3 more on a Promise Ultra133 that I'd like to examine now
>> and then but looks like I'd have to buy a tool to do that.
>
> The Promise drivers may not support S.M.A.R.T. and therefor no
> software will show you S.M.A.R.T. data.

That's correct. I know the controller supports SMART, but the driver is
anyone's guess.

It's hard to say for sure without trying out SMART software that is capable
of showing info for installed controllers. So far I've only found one that
claims to do this, but you have to buy it first.
Anonymous
April 15, 2004 6:03:30 AM

Will Dormann <wdormann@yahoo.com.invalid> wrote:

>> Have you any favourites among the SMART monitoring software?
>>
>> I started trying some the other night. Most are pretty
>> limited, and will only show the motherboard attached drives.
>> I've got 3 more on a Promise Ultra133 that I'd like to
>> examine now and then but looks like I'd have to buy a tool to
>> do that.
>
>
> Smartmon is junk, HDD Health is nice once you turn off the
> non-critical alerts (like a change in temp of 1 degree... no
> need to keep popping up alerts for that). I'm currently
> using DTemp, since it puts the temp of the drive in the
> windows sytem tray.

I like DTEMP too. It is simple and fulfills a straightforward
need.

The trouble is that I installed an IDE controller card and put a
couple of drives on it. Now DTEMP (and Motherboard Monitor) can't
report on the temp of those drives on the controller.


> I have a drive on a promise controller (MBFastTrack133) that
> isn't supported by those programs, so I need to run Promise's
> FastCheck utility. It just gives me a "Functional" report on
> the drive and nothing else, but I guess that's better than
> nothing.

My card uses the Silicon Image chip SiL0680.
http://www.siimage.com/products/overview_sii0680.asp
(Thanks Eric - it works nicely.)

I haven't run the Medley software which comes with my card but the
documentation suggests it is like your FastCheck utility.

Is there any way around not getting temp and other SMART data from
drives attached to such cards?
Anonymous
April 15, 2004 10:57:59 AM

"Will Dormann" <wdormann@yahoo.com.invalid> wrote in message news:Alhfc.14791$B%4.4329@fe2.columbus.rr.com...
> Folkert Rienstra wrote:
>
> > "Will Dormann" <wdormann@yahoo.com.invalid> wrote in message news:yL0fc.8908$B%4.5111@fe2.columbus.rr.com...
> >
> >>Mr. Grinch wrote:
> >>
> >>>Have you any favourites among the SMART monitoring software?
> >>>
> >>>I started trying some the other night. Most are pretty limited, and will
> >>>only show the motherboard attached drives. I've got 3 more on a Promise
> >>>Ultra133 that I'd like to examine now and then but looks like I'd have to
> >>>buy a tool to do that.
> >>
> >>
> >>Smartmon is junk,
> >
> >
> > David Lethe will be so pleased to know.
>
>
> I'm sure he will be. To be fair, the software may have some technical
> merits. But it's definately a candidate for the User Interface Hall Of
> Shame in my book. :) 

Mine too.

The Author's Attitude Hall of Shame, too |-)
Anonymous
April 15, 2004 4:32:24 PM

Both WD (DLGDIAG for Windows) and IBM give away free SMART HD monitoring
tools, and there are many other freeware programs that'll read SMART
info off your HDs and display it (e.g. http://www.hdtune.com/).

(funny! Active Smart's screen shot of the SMART info looks almost like
the WD one!)

They both work and have saved my data before. Also, most PCs with SMART
turned on in the BIOS will report HD problems on the next bootup,
which has also saved me in the past.

Keep in mind that all these tools do is query the HD and report the
current SMART parameters that have been saved in the HD's memory. None
of these programs can do more than that and alert you!

I seriously would not bother paying for a tool that's free.

As an example of some of the stored parameters they all report:
(Not all parameters are reported by all drives!)

HD Tune: WDC WD200BB-32CJA0 SMART information

ID   Attribute                 Current  Worst  Threshold  Data
(01) Raw Read Error Rate           200    198         51     0
(03) Spin Up Time                   96     93         21  2575
(04) Start/Stop Count              100    100         40   364
(05) Reallocated Sector Count      200    200        140     0
(07) Seek Error Rate               200    200         51     0
(09) Power On Hours Count           88     88          0  9131
(0A) Spin Retry Count              100    100         51     0
(0B) Calibration Retry Count       100    100         51     0
(0C) Power Cycle Count             100    100          0   307
(C4) Reallocated Event Count       200    200          0     0
(C5) Current Pending Sector        200    173          0     0
(C6) Offline Uncorrectable         200    200          0     0
(C7) Ultra DMA CRC Error Count     200    253          0  7569
(C8) Write Error Rate              200    200         51     0

Temperature : unknown
Power On Time : 9131
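
For readers wondering how a monitoring tool turns a listing like this into a pass/fail verdict, here is a minimal sketch. It assumes the usual convention that an attribute counts as failing when its normalized current value is at or below its threshold, and that a threshold of 0 marks an attribute that can never trip. The set of "error counter" IDs checked by raw value is a choice made for this illustration, not an official list.

```python
# Rows copied from the HD Tune listing:
# (id, name, current, worst, threshold, raw data).
ATTRIBUTES = [
    ("01", "Raw Read Error Rate", 200, 198, 51, 0),
    ("03", "Spin Up Time", 96, 93, 21, 2575),
    ("04", "Start/Stop Count", 100, 100, 40, 364),
    ("05", "Reallocated Sector Count", 200, 200, 140, 0),
    ("07", "Seek Error Rate", 200, 200, 51, 0),
    ("09", "Power On Hours Count", 88, 88, 0, 9131),
    ("0A", "Spin Retry Count", 100, 100, 51, 0),
    ("0B", "Calibration Retry Count", 100, 100, 51, 0),
    ("0C", "Power Cycle Count", 100, 100, 0, 307),
    ("C4", "Reallocated Event Count", 200, 200, 0, 0),
    ("C5", "Current Pending Sector", 200, 173, 0, 0),
    ("C6", "Offline Uncorrectable", 200, 200, 0, 0),
    ("C7", "Ultra DMA CRC Error Count", 200, 253, 0, 7569),
    ("C8", "Write Error Rate", 200, 200, 51, 0),
]

# Assumption: counters worth checking by raw value even though they cannot trip.
ERROR_COUNTER_IDS = {"05", "C4", "C5", "C6", "C7"}

def classify(current, threshold):
    """Apply the threshold rule to one normalized value."""
    if threshold == 0:
        return "informational"  # a zero threshold can never be reached
    return "FAILING" if current <= threshold else "ok"

def failing_attributes(attributes):
    """Names of attributes whose current value is at or below the threshold."""
    return [name for _id, name, current, _worst, threshold, _raw in attributes
            if classify(current, threshold) == "FAILING"]

def nonzero_error_counters(attributes):
    """Watched counters that have recorded at least one event."""
    return [(name, raw) for attr_id, name, _cur, _worst, _thr, raw in attributes
            if attr_id in ERROR_COUNTER_IDS and raw > 0]

print("Failing:", failing_attributes(ATTRIBUTES))
print("Worth a look:", nonzero_error_counters(ATTRIBUTES))
```

On this drive nothing is at or below its threshold, so a threshold-only monitor would stay silent. The only watched counter with a non-zero raw value is the Ultra DMA CRC error count, which is commonly attributed to cabling or interface trouble rather than to the platters.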
Anonymous
April 15, 2004 7:28:27 PM

On Wed, 14 Apr 2004 20:01:22 +0200, "Folkert Rienstra" <see_reply-to@myweb.nl>
wrote:

>"Mr. Grinch" <grinch@hatespam.yucky> wrote in message news:Xns94CAC104071B4grinchhatespamyucksh@24.71.223.159
>> Have you any favourites among the SMART monitoring software?
>>
>> I started trying some the other night.
>
>> Most are pretty limited, and will only show the motherboard attached drives.
>
>Most?
>
>> I've got 3 more on a Promise Ultra133 that I'd like to examine now and then
>> but looks like I'd have to buy a tool to do that.
>
>The Promise drivers may not support S.M.A.R.T. and therefor no software
>will show you S.M.A.R.T. data.



They do, as I have used it, so get the correct SMART program.


>>
>> I have not checked out the most recent tools from the vendors yet. The
>> drives are IBM and Maxtor. I wasn't sure if they offered any windows
>> based SMART monitors for free.

----------------------------------------------------------------------------------------------------
Life is not measured by the number of breaths we take, but by the moments that take our breath away. (George Carlin)
Anonymous
April 15, 2004 7:32:15 PM

On Wed, 14 Apr 2004 20:01:22 +0200, "Folkert Rienstra" <see_reply-to@myweb.nl>
wrote:

>"Mr. Grinch" <grinch@hatespam.yucky> wrote in message news:Xns94CAC104071B4grinchhatespamyucksh@24.71.223.159
>> Have you any favourites among the SMART monitoring software?
>>
>> I started trying some the other night.
>
>> Most are pretty limited, and will only show the motherboard attached drives.
>
>Most?
>
>> I've got 3 more on a Promise Ultra133 that I'd like to examine now and then
>> but looks like I'd have to buy a tool to do that.
>
>The Promise drivers may not support S.M.A.R.T. and therefor no software
>will show you S.M.A.R.T. data.



Active SMART works..

>>
>> I have not checked out the most recent tools from the vendors yet. The
>> drives are IBM and Maxtor. I wasn't sure if they offered any windows
>> based SMART monitors for free.

Anonymous
April 15, 2004 7:33:04 PM

On Wed, 14 Apr 2004 23:50:27 GMT, "Mr. Grinch" <grinch@hatespam.yucky> wrote:

>"Folkert Rienstra" <see_reply-to@myweb.nl> wrote in
>news:c5juq0$2ibp9$3@ID-79662.news.uni-berlin.de:
>
>> "Mr. Grinch" <grinch@hatespam.yucky> wrote in message
>> news:Xns94CAC104071B4grinchhatespamyucksh@24.71.223.159
>>> Have you any favourites among the SMART monitoring software?
>>>
>>> I started trying some the other night.
>>
>>> Most are pretty limited, and will only show the motherboard attached
>>> drives.
>>
>> Most?
>>
>
>All the one's I've tried so far. One claims to show the devices on other
>controllers but you have to buy it first. I was hoping to find something
>I'd not have to buy that would show SMART status on the installed
>controller.



Try Active SMART as it works for me.

>>> I've got 3 more on a Promise Ultra133 that I'd like to examine now
>>> and then but looks like I'd have to buy a tool to do that.
>>
>> The Promise drivers may not support S.M.A.R.T. and therefor no
>> software will show you S.M.A.R.T. data.
>
>That's correct. I know the controller supports SMART, but the driver is
>anyone's guess.
>
>It's hard to say for sure without trying out some SMART software is capable
>of showing info for installed controllers. So far I've only found one that
>claims to do this but you have to buy it first.

Anonymous
April 15, 2004 7:33:05 PM

puss@purrpurr.com wrote in
news:ti0s7093gloudjdadc88dccd6l6bfu7l6a@4ax.com:

> On Wed, 14 Apr 2004 23:50:27 GMT, "Mr. Grinch" <grinch@hatespam.yucky>
> wrote:
>
>>
>>All the one's I've tried so far. One claims to show the devices on
>>other controllers but you have to buy it first. I was hoping to find
>>something I'd not have to buy that would show SMART status on the
>>installed controller.
>
>
>
> Try Active SMART as it works for me.

Tried it, but no luck.

According to the FAQ in the helpfile for Active Smart 2.41 (and Active
Smart SCSI 2.41):

Quote:

Does Active SMART supports Promise Ultra 100 controller (or controllers on
HighPoint chipsets)?

Current version does not support drives on external IDE controllers, that
uses HighPoint or Promise chipsets, but we work on this and support for
HighPoint and possible Promise IDE RAID cards will be added to our
software.
Anonymous
April 15, 2004 11:53:13 PM

David Chien <chiendh@uci.edu> wrote in news:c5mo05$aql$1
@news.service.uci.edu:

> Both WD (DLGDIAG for Windows) and IBM give away free SMART HD monitoring
> tools, and there are many other freeware programs that'll read SMART
> info off your HDs and display them (eg. http://www.hdtune.com/
>

Thanks! I'll look up those tools and try them out!
Anonymous
April 16, 2004 1:25:38 AM

On Thu, 15 Apr 2004 07:16:19 GMT, "Mr. Grinch" <grinch@hatespam.yucky>
wrote:

>puss@purrpurr.com wrote in
>news:ti0s7093gloudjdadc88dccd6l6bfu7l6a@4ax.com:
>
>> On Wed, 14 Apr 2004 23:50:27 GMT, "Mr. Grinch" <grinch@hatespam.yucky>
>> wrote:
>>
>>>
>>>All the one's I've tried so far. One claims to show the devices on
>>>other controllers but you have to buy it first. I was hoping to find
>>>something I'd not have to buy that would show SMART status on the
>>>installed controller.
>>
>>
>>
>> Try Active SMART as it works for me.
>
>Tried it, but no luck.
>
>According to the FAQ in the helpfile for Active Smart 2.41 (and Active
>Smart SCSI 2.41):
>
>Quote:
>
> Does Active SMART supports Promise Ultra 100 controller (or controllers on
>HighPoint chipsets)?
>
>Current version does not support drives on external IDE controllers, that
>uses HighPoint or Promise chipsets, but we work on this and support for
>HighPoint and possible Promise IDE RAID cards will be added to our
>software.

Active Smart works for my WD drives on both Promise Ultra 133TX (Win
XP) and Ultra 100 (Win98) controller cards. I have two computers with
different controllers and operating systems.
It does not work on an IDE RAID controller with disks configured as a
RAID array: I tried the SCSI version at work on an IDE RAID
array and the data it gave was garbage.

Hence I think the author is trying to say his utility will not
currently work with Promise RAID controller cards where the disks
are installed as a RAID array.

regards
Terry Bartlett
Anonymous
April 16, 2004 1:25:39 AM

Terence H. Bartlett <terence@bartlett.co.nz> wrote in
news:g6ls709v4g6t3h1vaf2ld7pg2uabnbo6b3@4ax.com:

> Hence I think the author is trying to say his utility will not
> currently work with Promise Raid controller cards where the disks
> installed as a Raid Array.

Ahh, got ya. For some reason, the product does not see my attached
Ultra133 non-raid controller. Strange. I'm running the latest firmware,
and the .42 driver. Have not upgraded to the .43 driver as I didn't want
to "break" anything but perhaps that's what I need to do.

Thanks again.
Anonymous
May 3, 2004 7:38:38 PM

ballen@uwm.edu (Bruce Allen) wrote:
>I've been monitoring about 1100 ATA disk drives (IBM and Maxtor) on
>two large linux data analysis clusters for the past two years. I've
>seen more than forty failures -- SMART has predicted between half and
>two-thirds of those. I've also had advanced warning of *other* types
>of 'non-failure' problems, particularly unreadable (uncorrectable)
>disk sectors which *would* have caused unpredictable and unrepeatable
>errors with the OS and data analysis.
>
>The bottom line is that SMART can not and will not predict all
>failures. But in many cases (especially if you or the monitoring
>software know what to watch for) it will predict a substantial
>fraction of them.

Thank you very much for the real-world info.
If you have any breakdown of failure rates by manufacturer and/or model
that would be even more valuable.
Useful info on reliability is hard to come by.

If your disks were run 24/7 this comes out to an MTBF of under 240khr,
if run 8hr/day the MTBF is under 57khr.
As far back as 1995 IBM was claiming 1Mhr [New Media June 1995], and I think
it's safe to say it's been quite a while since any maker has claimed less than
500k, so we've got a sizable gap between claims and reality.
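
The post does not show how the 240khr and 57khr figures were derived. The sketch below is one reading that reproduces them, and it is an assumption rather than the poster's stated method: an average of one year in service per drive, and "8hr/day" taken as a 40-hour week. If all 1100 drives had instead run for the full two years, the same arithmetic gives roughly twice the MTBF.

```python
HOURS_PER_YEAR = 24 * 365  # 8760

def mtbf_upper_bound(drives, years_in_service, failures, duty_cycle=1.0):
    """Powered-on drive-hours divided by failures.

    The thread reports "more than forty" failures, so using 40 gives an
    upper bound on the observed MTBF.
    """
    powered_hours = drives * years_in_service * HOURS_PER_YEAR * duty_cycle
    return powered_hours / failures

# Assumption: every drive ran 24/7 for the whole two years.
full_two_years = mtbf_upper_bound(1100, 2, 40)

# Assumption: drives averaged one year in service (this matches "under 240khr").
one_year_average = mtbf_upper_bound(1100, 1, 40)

# Assumption: 8 h/day on weekdays only, i.e. 40 of 168 hours per week
# (this matches "under 57khr").
weekday_8h = mtbf_upper_bound(1100, 1, 40, duty_cycle=40 / 168)

print(f"24/7, two full years : {full_two_years:,.0f} h")
print(f"24/7, one-year average: {one_year_average:,.0f} h")
print(f"8 h weekdays          : {weekday_8h:,.0f} h")
```

Under every one of these assumptions the observed figure stays below the 500k to 1M hour range quoted by manufacturers, which is the point the post is making.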

Subsequently IBM stated it does not quote MTBF numbers for its products
saying the numbers are just confusing, citing legitimate problems with the
varying methods but also making some lame excuses for arguing against the
idea of even trying to measure reliability
[http://www-1.ibm.com/support/techdocs/atsmastr.nsf/WebI...]

Likewise Maxtor doesn't bother to provide an MTBF figure for their Diamondmax+9
(www.maxtor.com/en/documentation/data_sheets/diamondmax_...),
which has the advantage of not having anything to explain and thus nothing
that could be critiqued, but can only leave me wondering if this tells us
something about how serious they are about reliability.

Maxtor says "Historically the field MTBF, which includes
all returns regardless of cause, is typically 50-60% of projected MTBF"
[http://maxtor.custhelp.com/cgi-bin/maxtor.cfg/php/endus...]
but this applies to their Quantum SCSI disks. If this also applies to the other
lines, and if we take their Diamondmax+9 Annualized Return Rate (ARR) <1% spec
[www.maxtor.com/en/documentation/data_sheets/diamondmax_...]
to be representative of their projected rather than actual rate, then this
suggests your failure rate of 2%/yr matches their average return rate, so who
knows, maybe the phantom numbers they produce with their undisclosed methods
are pretty accurate when the field vs projected ratio is taken into account.
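
The comparison in that paragraph can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: that MTBF is inversely proportional to the annual failure rate, that the <1% ARR is treated as exactly 1%, and that the cluster figures are exactly 40 failures among 1100 drives over two years.

```python
def field_rate(projected_annual_rate, field_to_projected_mtbf_ratio):
    """Annual failure rate implied in the field.

    If field MTBF is only a fraction of projected MTBF, the failure rate
    rises by the inverse of that fraction.
    """
    return projected_annual_rate / field_to_projected_mtbf_ratio

# Observed on the clusters: 40 failures, 1100 drives, 2 years.
observed_rate = 40 / (1100 * 2)

# Maxtor's spec read as a projected 1% per year; field MTBF 50-60% of projected.
low = field_rate(0.01, 0.6)
high = field_rate(0.01, 0.5)

print(f"observed: {observed_rate:.2%} per year")
print(f"implied field rate: {low:.2%} to {high:.2%} per year")
```

With those inputs the observed rate falls inside the implied range, which is consistent with the post's reading. Since the spec is an upper limit and the failure count a lower limit, the match is suggestive at best.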

Seagate earns brownie points for actually explaining how they came up with
their MTBF numbers (www.seagate.com/newsinfo/docs/disc/drive_reliability.pd...).
But it strikes me as suspicious to test for only one month, guaranteeing
they will not see any effects due to aging or gradual wear and tear.
For desktop users, it's also questionable to test continual operation at
a constant temperature, guaranteeing they will not see effects
from thermal cycling and power-on surges.
While my own sample size is small (3 hard and 3 soft failures),
all were about a year or more after being placed in service, so based
on my experience, testing for less than 1.5 years is worthless.
This does however allow them to replace real data with impressive-sounding
mathematical gymnastics involving a bunch of fudge factors to come up
with an extrapolated MTBF that is suitably high.

The Samsung whitepaper on MTBF also provides some detail but is pathetic.
"SAMSUNG's MTBF for HDDs is 500,000 hours. That means that if you use
your PC for 9 hours every day, your HDD should operate for 152 years."
[www.samsung.com/Products/HardDiskDrive/whitepapers/Whit...]
They say they test 480 units for 4 hours or 120 units for 72 hours.
It is impossible to measure an MTBF in the 500k hours range from only
480x4 = 1920 hours or 120x72 = 8640 hours of data; these are insufficient
by 2 orders of magnitude.
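
The numbers in this paragraph can be reproduced directly. The sketch assumes a constant failure rate, so that the expected number of failures in a test is simply the accumulated unit-hours divided by the MTBF.

```python
CLAIMED_MTBF_HOURS = 500_000

def years_of_use(mtbf_hours, hours_per_day):
    """The whitepaper's conversion of MTBF into calendar years of daily use."""
    return mtbf_hours / (hours_per_day * 365)

def expected_failures(unit_hours, mtbf_hours):
    """Failures expected during a test, assuming a constant failure rate."""
    return unit_hours / mtbf_hours

tests = {
    "480 units x 4 h": 480 * 4,
    "120 units x 72 h": 120 * 72,
}

print(f"9 h/day at 500k h MTBF: {years_of_use(CLAIMED_MTBF_HOURS, 9):.0f} years")
for label, unit_hours in tests.items():
    shortfall = CLAIMED_MTBF_HOURS / unit_hours
    print(f"{label}: {unit_hours} unit-hours, "
          f"{expected_failures(unit_hours, CLAIMED_MTBF_HOURS):.4f} expected failures, "
          f"{shortfall:.0f}x too short to expect even one")
```

Even the longer test accumulates far less than one expected failure at the claimed MTBF, so observing zero failures says almost nothing about whether the real figure is 500k hours or a tenth of that.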

I wasn't able to find anything from Western Digital on MTBF other than
some broken links.

The problem with company statements is we can't be sure there's any
correlation between the competence of the design and production
engineers and the competence of the tech writers.
The quality of the company's documentation on reliability may indicate
how seriously they take the issue, but then again
it's possible a company spends all its resources on producing the highest
possible reliability product and doesn't care how shoddy the
documentation is, or conversely a company could hire a high-powered tech
writer wizard to make up for lack of attention in design and production.
But one thing is clear: providing an MTBF number without providing a
detailed explanation of how it was arrived at is a meaningless exercise.

Most usenet posts are about one disk failure, making it difficult to
evaluate them as indicators of reliability.
Likewise a visit to my local store revealed that they know they've had
lots more failures in 40g drives than 200g but they sell lots
of the small ones and few of the large ones and do not track percentages
so they have no idea what the relative failure _rates_ are.
The same phenomenon is going to apply to comparing different manufacturers.

So thanks in advance for any other real-world data you're able to share.

--
delete NOSPAM to reply by email
Anonymous
May 4, 2004 4:37:02 AM

Walter Epp wrote:

> ballen@uwm.edu (Bruce Allen) wrote:
>
>>The bottom line is that SMART can not and will not predict all
>>failures. But in many cases (especially if you or the monitoring
>>software know what to watch for) it will predict a substantial
>>fraction of them.
>
>
> Thank you very much for the real-world info.
> If you have any breakdown of failure rates by manufacturer and/or model
> that would be even more valuable.
> Useful info on reliability is hard to come by.

Just a follow up to this...
I've found a SMART package that seems to work quite nicely:
http://smartmontools.sourceforge.net/

While there's no fancy GUI (or any, for that matter), it provides more
SMART information for a drive than any other software I've tried.
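
Since smartmontools is command-line only, it is easy to wrap in a script. The sketch below calls `smartctl -H` and extracts the overall verdict. The wording of the result line is assumed from smartmontools output for ATA drives (SCSI drives report differently), the device path is a placeholder, and running it typically needs root privileges.

```python
import subprocess

def parse_health(smartctl_output):
    """Pull the verdict out of `smartctl -H` text, or None if it is absent."""
    for line in smartctl_output.splitlines():
        if "overall-health self-assessment test result" in line:
            return line.split(":", 1)[1].strip()
    return None

def drive_health(device):
    """Run smartctl on one device (requires smartmontools to be installed)."""
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
    )
    return parse_health(result.stdout)

# Parsing a sample line, so this part runs without a drive or smartctl present.
sample = "SMART overall-health self-assessment test result: PASSED\n"
print(parse_health(sample))

# On a real system, something like the following (device name is a placeholder):
# print(drive_health("/dev/hda"))
```

A PASSED verdict only means no attribute has crossed its threshold, which, as the rest of this thread points out, is no guarantee that the drive is healthy.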


-WD