Has SMART ever successfully predicted failure for anybody?

Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

SMART monitoring sounds like a great idea to me. Who better to predict
a drive's failure than the drive itself, right? I mean, the drive knows
what it's doing internally, such as silently remapping bad sectors,
keeping track of how long it takes to spin up, error counts and all that
fun stuff.

SMART is there, comforting me.
Calling out "I'm good. I'll never let you down"

And then another drive begins to show symptoms of impending failure. In
my particular case, it's a Seagate Barracuda V. Just a couple months
past the expiration of the 1-year warranty.

And I start to realize that I can't recall a single case where SMART has
predicted a failure before I actually see the symptoms of the failing
drive. Not in any of the machines I own, or in machines of other people
I know. I've used HDD Health, DTemp, Promise FastCheck, smartmontools
and ide-smart (on Linux), and various manufacturers' utilities for
checking SMART.

I've been able to monitor temperatures (it's at 37C right now, which
seems to be the norm for it), and I can look at statistics on the drive.
But not once has SMART actually predicted problems with a drive
before I actually see the symptoms (such as a repeated "thunk" sound, a
frozen mouse in Win2k with the HDD light on solid, or event log entries
indicating a drive error). Doing a full surface scan in all such cases
has indicated bad sectors on the drive.

Is SMART not predictive enough? Is this a fault of the various SMART
monitoring utilities? If SMART doesn't "kick in" until later on in the
dying process, then what good is it? (If I can see the symptoms well in
advance of it giving me a warning) What are your experiences with SMART
and its (in)ability to predict failure?

Thanks.
-WD
  1. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    > Is SMART not predictive enough? Is this a fault of the various SMART
    > monitoring utilities? If SMART doesn't "kick in" until later on in the
    > dying process, then what good is it? (If I can see the symptoms well in
    > advance of it giving me a warning) What are your experiences with SMART
    > and its (in)ability to predict failure?

    I've been monitoring about 1100 ATA disk drives (IBM and Maxtor) on
    two large Linux data analysis clusters for the past two years. I've
    seen more than forty failures -- SMART has predicted between half and
    two-thirds of those. I've also had advance warning of *other* types
    of 'non-failure' problems, particularly unreadable (uncorrectable)
    disk sectors which *would* have caused unpredictable and unrepeatable
    errors with the OS and data analysis.

    The bottom line is that SMART can not and will not predict all
    failures. But in many cases (especially if you or the monitoring
    software know what to watch for) it will predict a substantial
    fraction of them.

    Bruce Allen
  2. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    "Will Dormann" <wdormann@yahoo.com.invalid> wrote in message
    news:ilZec.10106$ci4.1038@fe1.columbus.rr.com...
    > SMART monitoring sounds like a great idea to me. Who better to predict
    > a drive's failure than the drive itself, right? I mean, the drive knows
    > what it's doing internally, such as silently remapping bad sectors,
    > keeping track of how long it takes to spin up, error counts and all that
    > fun stuff.
    >
    > SMART is there, comforting me.
    > Calling out "I'm good. I'll never let you down"

    Nope, that's where you're wrong. Not all failures are predictable. The
    absence of SMART alarms by no means indicates that the drive will not let
    you down. Using SMART you *may* be able to predict failures that are
    predictable; that's the best SMART can do. A drive can die without SMART
    *ever* predicting it.

    --
    Joep
  3. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    Have you any favourites among the SMART monitoring software?

    I started trying some the other night. Most are pretty limited, and will
    only show the motherboard attached drives. I've got 3 more on a Promise
    Ultra133 that I'd like to examine now and then but looks like I'd have to
    buy a tool to do that.

    I have not checked out the most recent tools from the vendors yet. The
    drives are IBM and Maxtor. I wasn't sure if they offered any windows based
    SMART monitors for free.
  4. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    Mr. Grinch wrote:
    > Have you any favourites among the SMART monitoring software?
    >
    > I started trying some the other night. Most are pretty limited, and will
    > only show the motherboard attached drives. I've got 3 more on a Promise
    > Ultra133 that I'd like to examine now and then but looks like I'd have to
    > buy a tool to do that.


    Smartmon is junk, HDD Health is nice once you turn off the non-critical
    alerts (like a change in temp of 1 degree... no need to keep popping up
    alerts for that). I'm currently using DTemp, since it puts the temp
    of the drive in the Windows system tray.

    I have a drive on a promise controller (MBFastTrack133) that isn't
    supported by those programs, so I need to run Promise's FastCheck
    utility. It just gives me a "Functional" report on the drive and
    nothing else, but I guess that's better than nothing.


    -WD
  5. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    Will Dormann <wdormann@yahoo.com.invalid> wrote in
    message news:ilZec.10106$ci4.1038@fe1.columbus.rr.com...

    > SMART monitoring sounds like a great idea to me. Who
    > better to predict a drive's failure than the drive itself, right?
    > I mean, the drive knows what it's doing internally, such as
    > silently remapping bad sectors, keeping track of how long
    > it takes to spin up, error counts and all that fun stuff.

    > SMART is there, comforting me.
    > Calling out "I'm good. I'll never let you down"

    It doesn't say that, just that it's better than nothing.

    > And then another drive begins to show symptoms of impending
    > failure. In my particular case, it's a Seagate Barracuda V.
    > Just a couple months past the expiration of the 1-year warranty.

    > And I start to realize that I can't recall a single case where
    > SMART has predicted a failure before I actually see the
    > symptoms of the failing drive. Not in any of the machines
    > I own, or in machines of other people I know.

    There have been some examples reported in here.

    And it should have helped with the most common
    75GXP failures if it had been implemented properly.

    And it's obviously never going to help with
    sudden semiconductor failure, for example,
    or lots of other causes of hard drive failure
    like the power supply dying and killing the drive.

    > I've used HDD Health, DTemp, Promise Fastcheck,
    > smartmontools and ide-smart (on linux), and various
    > manufacturer's utilities for checking smart.

    > I've been able to monitor temperatures

    And that alone is quite handy, being a very convenient
    way to keep track of what the drive temp is doing in the
    worst conditions of high room temp and drive activity.

    That does show that Dell laptops generally run their
    hard drives at higher temps than I would do voluntarily,
    even though they are technically within the allowed temp.

    It also allows you to run something like MBM to
    bring up alarms on temperature excursions that
    are due to fan failure or just fur clogging etc.

    > (it's at 37C right now, which seems to be the norm
    > for it), and I can look at statistics on the drive.

    > But not once has SMART actually predicted problems
    > with a drive before I actually see the symptoms (such
    > as a repeated "thunk" sound, a frozen mouse in Win2k
    > with the HDD light on solid, or event log entries
    > indicating a drive error).

    Sure, but with drive failure so rare, that's hardly
    surprising given that SMART was never
    claimed to be able to anticipate all hard drive failures.

    I haven't actually had a drive failure in any system
    I own or look after since SMART showed up, with
    the exception of a single Fujitsu MPA drive that
    has some dry joint or cracked trace problem that
    can stop it spinning up unless you deliberately apply
    pressure one way to the Molex power connector.

    I don't need SMART to tell me that it's not spinning up.

    > Doing a full surface scan in all such cases
    > has indicated bad sectors on the drive.

    Then that particular SMART implementation is
    inadequate, because it would be better if you
    were warned that new bad sectors had appeared.

    > Is SMART not predictive enough?

    Clearly in that particular example it's not implemented
    well enough, if you did actually have the BIOS SMART
    check enabled and SMART was just incapable
    of reporting the problem it had observed.

    > Is this a fault of the various SMART monitoring utilities?

    It certainly can be.

    > If SMART doesn't "kick in" until later on
    > in the dying process, then what good is it?

    Sure, but that's not cast in stone.

    > (If I can see the symptoms well in advance of it giving me a warning)
    > What are your experiences with SMART and its (in)ability to predict failure?

    I basically haven't had any failures since SMART showed up where
    it would have helped, even if it was perfectly implemented.

    That's not to say that others haven't had hard drive failures
    that should have produced a SMART warning, though.
  6. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    "Will Dormann" <wdormann@yahoo.com.invalid> wrote in message news:yL0fc.8908$B%4.5111@fe2.columbus.rr.com...
    > Mr. Grinch wrote:
    > > Have you any favourites among the SMART monitoring software?
    > >
    > > I started trying some the other night. Most are pretty limited, and will
    > > only show the motherboard attached drives. I've got 3 more on a Promise
    > > Ultra133 that I'd like to examine now and then but looks like I'd have to
    > > buy a tool to do that.
    >
    >
    > Smartmon is junk,

    David Lethe will be so pleased to know.

    > HDD Health is nice once you turn off the non-critical
    > alerts (like a change in temp of 1 degree... no need to keep popping up
    > alerts for that). I'm currently using DTemp, since it puts the temp
    > of the drive in the windows sytem tray.
    >
    > I have a drive on a promise controller (MBFastTrack133) that isn't
    > supported by those programs, so I need to run Promise's FastCheck
    > utility. It just gives me a "Functional" report on the drive and
    > nothing else, but I guess that's better than nothing.
    >
    >
    > -WD
  7. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    In article <ilZec.10106$ci4.1038@fe1.columbus.rr.com>, Will Dormann
    <wdormann@yahoo.com.invalid> writes

    >What are your experiences with SMART
    >and its (in)ability to predict failure?

    I worked as a bench tech fixing Compaq machines under warranty. Saw
    many dozens of machines which would produce a "SMART Predicts Imminent
    Disk Failure" message on bootup when the drive was failing.

    Drive replacement was covered by Compaq's 3 year warranty even if it had
    not actually yet failed, allowing data to be transferred from the old
    drive to the warranty replacement while it was still functional.

    --
    A. Top posters.
    Q. What's the most annoying thing on Usenet?
  8. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    "Mr. Grinch" <grinch@hatespam.yucky> wrote in message news:Xns94CAC104071B4grinchhatespamyucksh@24.71.223.159
    > Have you any favourites among the SMART monitoring software?
    >
    > I started trying some the other night.

    > Most are pretty limited, and will only show the motherboard attached drives.

    Most?

    > I've got 3 more on a Promise Ultra133 that I'd like to examine now and then
    > but looks like I'd have to buy a tool to do that.

    The Promise drivers may not support S.M.A.R.T., and therefore no software
    will show you S.M.A.R.T. data.

    >
    > I have not checked out the most recent tools from the vendors yet. The
    > drives are IBM and Maxtor. I wasn't sure if they offered any windows
    > based SMART monitors for free.
  9. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    Folkert Rienstra wrote:

    > "Will Dormann" <wdormann@yahoo.com.invalid> wrote in message news:yL0fc.8908$B%4.5111@fe2.columbus.rr.com...
    >
    >>Mr. Grinch wrote:
    >>
    >>>Have you any favourites among the SMART monitoring software?
    >>>
    >>>I started trying some the other night. Most are pretty limited, and will
    >>>only show the motherboard attached drives. I've got 3 more on a Promise
    >>>Ultra133 that I'd like to examine now and then but looks like I'd have to
    >>>buy a tool to do that.
    >>
    >>
    >>Smartmon is junk,
    >
    >
    > David Lethe will be so pleased to know.


    I'm sure he will be. To be fair, the software may have some technical
    merits. But it's definitely a candidate for the User Interface Hall Of
    Shame in my book. :)


    -WD
  10. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    "Folkert Rienstra" <see_reply-to@myweb.nl> wrote in
    news:c5juq0$2ibp9$3@ID-79662.news.uni-berlin.de:

    > "Mr. Grinch" <grinch@hatespam.yucky> wrote in message
    > news:Xns94CAC104071B4grinchhatespamyucksh@24.71.223.159
    >> Have you any favourites among the SMART monitoring software?
    >>
    >> I started trying some the other night.
    >
    >> Most are pretty limited, and will only show the motherboard attached
    >> drives.
    >
    > Most?
    >

    All the ones I've tried so far. One claims to show the devices on other
    controllers but you have to buy it first. I was hoping to find something
    I'd not have to buy that would show SMART status on the installed
    controller.

    >> I've got 3 more on a Promise Ultra133 that I'd like to examine now
    >> and then but looks like I'd have to buy a tool to do that.
    >
    > The Promise drivers may not support S.M.A.R.T. and therefor no
    > software will show you S.M.A.R.T. data.

    That's correct. I know the controller supports SMART, but the driver is
    anyone's guess.

    It's hard to say for sure without trying SMART software that is capable
    of showing info for installed controllers. So far I've only found one that
    claims to do this, but you have to buy it first.
  11. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    Will Dormann <wdormann@yahoo.com.invalid> wrote:

    >> Have you any favourites among the SMART monitoring software?
    >>
    >> I started trying some the other night. Most are pretty
    >> limited, and will only show the motherboard attached drives.
    >> I've got 3 more on a Promise Ultra133 that I'd like to
    >> examine now and then but looks like I'd have to buy a tool to
    >> do that.
    >
    >
    > Smartmon is junk, HDD Health is nice once you turn off the
    > non-critical alerts (like a change in temp of 1 degree... no
    > need to keep popping up alerts for that). I'm currently
    > using DTemp, since it puts the temp of the drive in the
    > windows sytem tray.

    I like DTEMP too. It is simple and fulfills a straightforward
    need.

    The trouble is that I installed an IDE controller card and put a
    couple of drives on it. Now DTEMP (and Motherboard Monitor) can't
    report on the temp of those drives on the controller.


    > I have a drive on a promise controller (MBFastTrack133) that
    > isn't supported by those programs, so I need to run Promise's
    > FastCheck utility. It just gives me a "Functional" report on
    > the drive and nothing else, but I guess that's better than
    > nothing.

    My card uses the Silicon Image chip SiL0680.
    http://www.siimage.com/products/overview_sii0680.asp
    (Thanks Eric - it works nicely.)

    I haven't run the Medley software which comes with my card but the
    documentation suggests it is like your FastCheck utility.

    Is there any way around not getting temp and other SMART data from
    drives attached to such cards?
  12. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    "Will Dormann" <wdormann@yahoo.com.invalid> wrote in message news:Alhfc.14791$B%4.4329@fe2.columbus.rr.com...
    > Folkert Rienstra wrote:
    >
    > > "Will Dormann" <wdormann@yahoo.com.invalid> wrote in message news:yL0fc.8908$B%4.5111@fe2.columbus.rr.com...
    > >
    > >>Mr. Grinch wrote:
    > >>
    > >>>Have you any favourites among the SMART monitoring software?
    > >>>
    > >>>I started trying some the other night. Most are pretty limited, and will
    > >>>only show the motherboard attached drives. I've got 3 more on a Promise
    > >>>Ultra133 that I'd like to examine now and then but looks like I'd have to
    > >>>buy a tool to do that.
    > >>
    > >>
    > >>Smartmon is junk,
    > >
    > >
    > > David Lethe will be so pleased to know.
    >
    >
    > I'm sure he will be. To be fair, the software may have some technical
    > merits. But it's definately a candidate for the User Interface Hall Of
    > Shame in my book. :)

    Mine too.

    The Author's Attitude Hall of Shame, too |-)
  13. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    Both WD (DLGDIAG for Windows) and IBM give away free SMART HD monitoring
    tools, and there are many other freeware programs that'll read SMART
    info off your HDs and display it (e.g. http://www.hdtune.com/).

    (funny! Active Smart's screen shot of the SMART info looks almost like
    the WD one!)

    They both work and have saved my data before. Also, most PCs with SMART
    turned on in the BIOS will report HD problems on the next bootup,
    which has also saved me in the past.

    Keep in mind that all these tools do is query the HD and report the
    current SMART parameters that the drive itself has stored -- none
    of these programs can do more than that and alert you!

    I seriously would not bother paying for a tool that's free.

    As an example of some of the stored parameters they all report:
    (Not all parameters are reported by all drives!)

    HD Tune: WDC WD200BB-32CJA0 SMART information

    ID    Attribute                  Current  Worst  Threshold  Data
    (01)  Raw Read Error Rate            200    198         51     0
    (03)  Spin Up Time                    96     93         21  2575
    (04)  Start/Stop Count               100    100         40   364
    (05)  Reallocated Sector Count       200    200        140     0
    (07)  Seek Error Rate                200    200         51     0
    (09)  Power On Hours Count            88     88          0  9131
    (0A)  Spin Retry Count               100    100         51     0
    (0B)  Calibration Retry Count        100    100         51     0
    (0C)  Power Cycle Count              100    100          0   307
    (C4)  Reallocated Event Count        200    200          0     0
    (C5)  Current Pending Sector         200    173          0     0
    (C6)  Offline Uncorrectable          200    200          0     0
    (C7)  Ultra DMA CRC Error Count      200    253          0  7569
    (C8)  Write Error Rate               200    200         51     0

    Temperature   : unknown
    Power On Time : 9131
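
    As a rough illustration of what a monitoring tool does with a report like the one above, here is a minimal Python sketch. It assumes the usual SMART convention that an attribute is failing when its normalized current value drops to or below a non-zero threshold, and that a threshold of 0 marks an informational attribute. The `WATCH_RAW` list and the `check` function are hypothetical names chosen for this example, not part of HD Tune or any other tool mentioned in this thread.

```python
# A few rows copied from the HD Tune report above:
# (id, attribute, current, worst, threshold, raw data)
ROWS = [
    ("01", "Raw Read Error Rate", 200, 198, 51, 0),
    ("03", "Spin Up Time", 96, 93, 21, 2575),
    ("05", "Reallocated Sector Count", 200, 200, 140, 0),
    ("C5", "Current Pending Sector", 200, 173, 0, 0),
    ("C6", "Offline Uncorrectable", 200, 200, 0, 0),
    ("C7", "Ultra DMA CRC Error Count", 200, 253, 0, 7569),
]

# Hypothetical watch list: attributes whose raw count we would like to
# stay at zero, even though their threshold may never trip.
WATCH_RAW = {"05", "C5", "C6"}


def check(rows):
    """Return (id, attribute, status) for every row that deserves attention."""
    findings = []
    for attr_id, name, current, worst, threshold, raw in rows:
        if threshold > 0 and current <= threshold:
            # Normalized value has reached the vendor threshold.
            findings.append((attr_id, name, "FAILING"))
        elif attr_id in WATCH_RAW and raw > 0:
            # Threshold not reached, but bad-sector counters are non-zero.
            findings.append((attr_id, name, "WARNING"))
    return findings


if __name__ == "__main__":
    for finding in check(ROWS):
        print(finding)
```

    On the rows above this reports nothing, since every thresholded value is well clear of its threshold and the sector counters are all zero. The second rule is the "know what to watch for" part: a drive can show new pending or reallocated sectors long before any normalized value reaches its threshold.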
  14. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    On Wed, 14 Apr 2004 20:01:22 +0200, "Folkert Rienstra" <see_reply-to@myweb.nl>
    wrote:

    >"Mr. Grinch" <grinch@hatespam.yucky> wrote in message news:Xns94CAC104071B4grinchhatespamyucksh@24.71.223.159
    >> Have you any favourites among the SMART monitoring software?
    >>
    >> I started trying some the other night.
    >
    >> Most are pretty limited, and will only show the motherboard attached drives.
    >
    >Most?
    >
    >> I've got 3 more on a Promise Ultra133 that I'd like to examine now and then
    >> but looks like I'd have to buy a tool to do that.
    >
    >The Promise drivers may not support S.M.A.R.T. and therefor no software
    >will show you S.M.A.R.T. data.


    They do, as I have used it, so get the correct SMART program.


    >>
    >> I have not checked out the most recent tools from the vendors yet. The
    >> drives are IBM and Maxtor. I wasn't sure if they offered any windows
    >> based SMART monitors for free.

    ----------------------------------------------------------------------------------------------------
    Life is not measured by the number of breaths we take, but by the moments that take our breath away. (George Carlin)
  15. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    On Wed, 14 Apr 2004 20:01:22 +0200, "Folkert Rienstra" <see_reply-to@myweb.nl>
    wrote:

    >"Mr. Grinch" <grinch@hatespam.yucky> wrote in message news:Xns94CAC104071B4grinchhatespamyucksh@24.71.223.159
    >> Have you any favourites among the SMART monitoring software?
    >>
    >> I started trying some the other night.
    >
    >> Most are pretty limited, and will only show the motherboard attached drives.
    >
    >Most?
    >
    >> I've got 3 more on a Promise Ultra133 that I'd like to examine now and then
    >> but looks like I'd have to buy a tool to do that.
    >
    >The Promise drivers may not support S.M.A.R.T. and therefor no software
    >will show you S.M.A.R.T. data.


    Active SMART works..

    >>
    >> I have not checked out the most recent tools from the vendors yet. The
    >> drives are IBM and Maxtor. I wasn't sure if they offered any windows
    >> based SMART monitors for free.

    ----------------------------------------------------------------------------------------------------
    Life is not measured by the number of breaths we take, but by the moments that take our breath away. (George Carlin)
  16. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    On Wed, 14 Apr 2004 23:50:27 GMT, "Mr. Grinch" <grinch@hatespam.yucky> wrote:

    >"Folkert Rienstra" <see_reply-to@myweb.nl> wrote in
    >news:c5juq0$2ibp9$3@ID-79662.news.uni-berlin.de:
    >
    >> "Mr. Grinch" <grinch@hatespam.yucky> wrote in message
    >> news:Xns94CAC104071B4grinchhatespamyucksh@24.71.223.159
    >>> Have you any favourites among the SMART monitoring software?
    >>>
    >>> I started trying some the other night.
    >>
    >>> Most are pretty limited, and will only show the motherboard attached
    >>> drives.
    >>
    >> Most?
    >>
    >
    >All the one's I've tried so far. One claims to show the devices on other
    >controllers but you have to buy it first. I was hoping to find something
    >I'd not have to buy that would show SMART status on the installed
    >controller.


    Try Active SMART as it works for me.

    >>> I've got 3 more on a Promise Ultra133 that I'd like to examine now
    >>> and then but looks like I'd have to buy a tool to do that.
    >>
    >> The Promise drivers may not support S.M.A.R.T. and therefor no
    >> software will show you S.M.A.R.T. data.
    >
    >That's correct. I know the controller supports SMART, but the driver is
    >anyone's guess.
    >
    >It's hard to say for sure without trying out some SMART software is capable
    >of showing info for installed controllers. So far I've only found one that
    >claims to do this but you have to buy it first.

    ----------------------------------------------------------------------------------------------------
    Life is not measured by the number of breaths we take, but by the moments that take our breath away. (George Carlin)
  17. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    puss@purrpurr.com wrote in
    news:ti0s7093gloudjdadc88dccd6l6bfu7l6a@4ax.com:

    > On Wed, 14 Apr 2004 23:50:27 GMT, "Mr. Grinch" <grinch@hatespam.yucky>
    > wrote:
    >
    >>
    >>All the one's I've tried so far. One claims to show the devices on
    >>other controllers but you have to buy it first. I was hoping to find
    >>something I'd not have to buy that would show SMART status on the
    >>installed controller.
    >
    >
    >
    > Try Active SMART as it works for me.

    Tried it, but no luck.

    According to the FAQ in the helpfile for Active Smart 2.41 (and Active
    Smart SCSI 2.41):

    Quote:

    Does Active SMART supports Promise Ultra 100 controller (or controllers on
    HighPoint chipsets)?

    Current version does not support drives on external IDE controllers, that
    uses HighPoint or Promise chipsets, but we work on this and support for
    HighPoint and possible Promise IDE RAID cards will be added to our
    software.
  18. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    David Chien <chiendh@uci.edu> wrote in news:c5mo05$aql$1
    @news.service.uci.edu:

    > Both WD (DLGDIAG for Windows) and IBM give away free SMART HD monitoring
    > tools, and there are many other freeware programs that'll read SMART
    > info off your HDs and display them (eg. http://www.hdtune.com/
    >

    Thanks! I'll look up those tools and try them out!
  19. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    On Thu, 15 Apr 2004 07:16:19 GMT, "Mr. Grinch" <grinch@hatespam.yucky>
    wrote:

    >puss@purrpurr.com wrote in
    >news:ti0s7093gloudjdadc88dccd6l6bfu7l6a@4ax.com:
    >
    >> On Wed, 14 Apr 2004 23:50:27 GMT, "Mr. Grinch" <grinch@hatespam.yucky>
    >> wrote:
    >>
    >>>
    >>>All the one's I've tried so far. One claims to show the devices on
    >>>other controllers but you have to buy it first. I was hoping to find
    >>>something I'd not have to buy that would show SMART status on the
    >>>installed controller.
    >>
    >>
    >>
    >> Try Active SMART as it works for me.
    >
    >Tried it, but no luck.
    >
    >According to the FAQ in the helpfile for Active Smart 2.41 (and Active
    >Smart SCSI 2.41):
    >
    >Quote:
    >
    > Does Active SMART supports Promise Ultra 100 controller (or controllers on
    >HighPoint chipsets)?
    >
    >Current version does not support drives on external IDE controllers, that
    >uses HighPoint or Promise chipsets, but we work on this and support for
    >HighPoint and possible Promise IDE RAID cards will be added to our
    >software.

    Active Smart works for my WD drives on both Promise Ultra 133TX (Win
    XP) and Ultra 100 (Win98) controller cards. I have two computers with
    different controllers and operating systems.
    It does not work on an IDE RAID controller with disks configured as a
    RAID array: I tried the SCSI version at work on an IDE RAID
    array and the data it gave was garbage.

    Hence I think the author is trying to say his utility will not
    currently work with Promise RAID controller cards where the disks
    are installed as a RAID array.

    regards
    Terry Bartlett
  20. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    Terence H. Bartlett <terence@bartlett.co.nz> wrote in
    news:g6ls709v4g6t3h1vaf2ld7pg2uabnbo6b3@4ax.com:

    > Hence I think the author is trying to say his utility will not
    > currently work with Promise Raid controller cards where the disks
    > installed as a Raid Array.

    Ahh, got ya. For some reason, the product does not see my attached
    Ultra133 non-raid controller. Strange. I'm running the latest firmware,
    and the .42 driver. Have not upgraded to the .43 driver as I didn't want
    to "break" anything but perhaps that's what I need to do.

    Thanks again.
  21. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    ballen@uwm.edu (Bruce Allen) wrote:
    >I've been monitoring about 1100 ATA disk drives (IBM and Maxtor) on
    >two large linux data analysis clusters for the past two years. I've
    >seen more than forty failures -- SMART has predicted between half and
    >two-thirds of those. I've also had advanced warning of *other* types
    >of 'non-failure' problems, particularly unreadable (uncorrectable)
    >disk sectors which *would* have caused unpredictable and unrepeatable
    >errors with the OS and data analysis.
    >
    >The bottom line is that SMART can not and will not predict all
    >failures. But in many cases (especially if you or the monitoring
    >software know what to watch for) it will predict a substantial
    >fraction of them.

    Thank you very much for the real-world info.
    If you have any breakdown of failure rates by manufacturer and/or model
    that would be even more valuable.
    Useful info on reliability is hard to come by.

    If your disks were run 24/7 this comes out to an MTBF of under 240khr,
    if run 8hr/day the MTBF is under 57khr.
    As far back as 1995 IBM was claiming 1Mhr [New Media June 1995], and I think
    it's safe to say it's been quite a while since any maker has claimed less than
    500k, so we've got a sizable gap between claims and reality.
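
    The arithmetic behind these figures can be sketched as follows. The post does not state its exposure assumptions, so the ones below are guesses: the 240khr and 57khr figures are reproduced by assuming about one year of service per drive and, for the 8hr/day case, a five-day week. The function name `mtbf_hours` is made up for this example.

```python
HOURS_PER_YEAR = 24 * 365  # 8760


def mtbf_hours(drives, hours_per_drive, failures):
    """Observed MTBF = total accumulated drive-hours / number of failures."""
    return drives * hours_per_drive / failures


# From the quoted post: about 1100 drives, "more than forty" failures.
# Using exactly 40 makes each result an upper bound on the observed MTBF.
drives, failures = 1100, 40

# Assumption: each drive ran 24/7 for about one year.
always_on_1yr = mtbf_hours(drives, HOURS_PER_YEAR, failures)

# Assumption: 8 hours a day, 5 days a week, for about one year.
office_1yr = mtbf_hours(drives, 8 * 5 * 52, failures)

# Alternative assumption: every drive ran 24/7 for the full two years.
always_on_2yr = mtbf_hours(drives, 2 * HOURS_PER_YEAR, failures)

if __name__ == "__main__":
    print(f"24/7, one year:       {always_on_1yr:,.0f} h")
    print(f"8h x 5d/wk, one year: {office_1yr:,.0f} h")
    print(f"24/7, two years:      {always_on_2yr:,.0f} h")
```

    If every drive had in fact been in service for the full two years, the 24/7 bound would double, so the result is sensitive to how much exposure each drive really had.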

    Subsequently IBM stated it does not quote MTBF numbers for its products
    saying the numbers are just confusing, citing legitimate problems with the
    varying methods but also making some lame excuses for arguing against the
    idea of even trying to measure reliability
    [http://www-1.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/FQ101856]

    Likewise Maxtor doesn't bother to provide an MTBF figure for their Diamondmax+9
    (www.maxtor.com/en/documentation/data_sheets/diamondmax_plus_9_data_sheet.pdf),
    which has the advantage of not having anything to explain and thus nothing
    that could be critiqued, but can only leave me wondering if this tells us
    something about how serious they are about reliability.

    Maxtor says "Historically the field MTBF, which includes
    all returns regardless of cause, is typically 50-60% of projected MTBF"
    [http://maxtor.custhelp.com/cgi-bin/maxtor.cfg/php/enduser/olh_adp.php?p_faqid=545]
    but this applies to their Quantum SCSI disks. If this also applies to the other
    lines, and if we take their Diamondmax+9 Annualized Return Rate (ARR) <1% spec
    [www.maxtor.com/en/documentation/data_sheets/diamondmax_plus_9_data_sheet.pdf]
    to be representative of their projected rather than actual rate, then this
    suggests your failure rate of 2%/yr matches their average return rate, so who
    knows, maybe the phantom numbers they produce with their undisclosed methods
    are pretty accurate when the field vs projected ratio is taken into account.
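
    A small sketch of that comparison, under stated assumptions: the failure rate is taken as inversely proportional to MTBF (constant failure rate), the "<1%" spec is treated as exactly 1%, and "more than forty" failures is counted as exactly 40, so the observed rate is a lower bound. The function name `field_return_rate` is hypothetical.

```python
def field_return_rate(projected_rate, field_to_projected_mtbf):
    """Scale a projected annual rate by the field/projected MTBF ratio.

    With a constant failure rate, rate is proportional to 1/MTBF, so a
    field MTBF that is 50-60% of projected means a rate 1/0.6 to 1/0.5
    times the projected one.
    """
    return projected_rate / field_to_projected_mtbf


projected_arr = 0.01  # "<1%" ARR spec, taken as 1% for the estimate

low = field_return_rate(projected_arr, 0.60)   # field MTBF at 60% of projected
high = field_return_rate(projected_arr, 0.50)  # field MTBF at 50% of projected

# Observed in the cluster: at least 40 failures among ~1100 drives in 2 years.
observed = 40 / 1100 / 2

if __name__ == "__main__":
    print(f"implied field rate: {low:.2%} to {high:.2%} per year")
    print(f"observed rate:      {observed:.2%} per year (lower bound)")
```

    Under these assumptions the observed rate of roughly 1.8% per year falls inside the implied 1.7% to 2% range.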

    Seagate earns brownie points for actually explaining how they came up with
    their MTBF numbers (www.seagate.com/newsinfo/docs/disc/drive_reliability.pdf).
    But it strikes me as suspicious to test for only one month, guaranteeing
    they will not see any effects due to aging or gradual wear and tear.
    For desktop users, it's also questionable to test continual operation at
    a constant temperature, guaranteeing they will not see effects
    from thermal cycling and power-on surges.
    While my own sample size is small (3 hard and 3 soft failures),
    all were about a year or more after being placed in service, so based
    on my experience, testing for less than 1.5 years is worthless.
    This does however allow them to replace real data with impressive-sounding
    mathematical gymnastics involving a bunch of fudge factors to come up
    with an extrapolated MTBF that is suitably high.

    The Samsung whitepaper on MTBF also provides some detail but is pathetic.
    "SAMSUNG's MTBF for HDDs is 500,000 hours. That means that if you use
    your PC for 9 hours every day, your HDD should operate for 152 years."
    [www.samsung.com/Products/HardDiskDrive/whitepapers/WhitePaper_05.htm]
    They say they test 480 units for 4 hours or 120 units for 72 hours.
    It is impossible to measure an MTBF in the 500k hours range from only
    480x4 = 1920 hours or 120x72 = 8640 hours of data; these are insufficient
    by 2 orders of magnitude.
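
    The numbers in this paragraph can be checked with a few lines of Python. This is only a sketch of the arithmetic quoted above; the "expected failures" line additionally assumes a constant failure rate, which is not something the whitepaper quote states.

```python
import math

CLAIMED_MTBF_HOURS = 500_000

# "9 hours every day": years of use implied by the claimed MTBF.
years_at_9h_per_day = CLAIMED_MTBF_HOURS / (9 * 365)

# Total unit-hours accumulated by each quoted test plan.
test_plans = {
    "480 units x 4 h": 480 * 4,
    "120 units x 72 h": 120 * 72,
}

if __name__ == "__main__":
    print(f"implied lifetime: {years_at_9h_per_day:.0f} years")
    for label, unit_hours in test_plans.items():
        shortfall = CLAIMED_MTBF_HOURS / unit_hours
        # Assumption: constant failure rate, so expected failures during
        # the test are unit-hours divided by MTBF.
        expected_failures = unit_hours / CLAIMED_MTBF_HOURS
        print(
            f"{label}: {unit_hours} unit-hours, "
            f"{shortfall:.0f}x short of a single MTBF "
            f"(about {math.log10(shortfall):.1f} orders of magnitude), "
            f"expected failures if the claim holds: {expected_failures:.4f}"
        )
```

    Either test plan would be expected to see far less than one failure even if the claimed MTBF were exactly right, which is why such a test cannot confirm or refute the figure on its own.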

    I wasn't able to find anything from Western Digital on MTBF other than
    some broken links.

    The problem with company statements is we can't be sure there's any
    correlation between the competence of the design and production
    engineers and the competence of the tech writers.
    The quality of the company's documentation on reliability may indicate
    how seriously they take the issue, but then again
    it's possible a company spends all its resources on producing the highest
    possible reliability product and doesn't care how shoddy the
    documentation is, or conversely a company could hire a high-powered tech
    writer wizard to make up for lack of attention in design and production.
    But one thing is clear: providing an MTBF number without providing a
    detailed explanation of how it was arrived at is a meaningless exercise.

    Most usenet posts are about one disk failure, making it difficult to
    evaluate them as indicators of reliability.
    Likewise a visit to my local store revealed that they know they've had
    lots more failures in 40 GB drives than in 200 GB ones, but they sell lots
    of the small ones and few of the large ones and do not track percentages,
    so they have no idea what the relative failure _rates_ are.
    The same phenomenon is going to apply to comparing different manufacturers.

    So thanks in advance for any other real-world data you're able to share.

    --
    delete NOSPAM to reply by email
  22. Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

    Walter Epp wrote:

    > ballen@uwm.edu (Bruce Allen) wrote:
    >
    >>The bottom line is that SMART can not and will not predict all
    >>failures. But in many cases (especially if you or the monitoring
    >>software know what to watch for) it will predict a substantial
    >>fraction of them.
    >
    >
    > Thank you very much for the real-world info.
    > If you have any breakdown of failure rates by manufacturer and/or model
    > that would be even more valuable.
    > Useful info on reliability is hard to come by.

    Just a follow up to this...
    I've found a SMART package that seems to work quite nicely:
    http://smartmontools.sourceforge.net/

    While there's no fancy GUI (or any, for that matter), it provides more
    SMART information for a drive than any other software I've tried.
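
    For anyone who wants to script it, here is a minimal Python sketch that wraps the `smartctl` command from smartmontools. It assumes `smartctl` is on the PATH, that it is run with enough privileges to query the drive, and that the drive reports the ATA-style "overall-health self-assessment test result" line. The device path in the example is a placeholder and the function names are made up for this sketch.

```python
import subprocess


def parse_health(output):
    """Extract the overall health verdict from `smartctl -H` output.

    Returns the text after the colon (for example "PASSED"), or None if
    the expected ATA-style line is not present.
    """
    for line in output.splitlines():
        if "overall-health self-assessment test result" in line:
            return line.split(":", 1)[1].strip()
    return None


def overall_health(device):
    """Run `smartctl -H` on a device and return the parsed verdict."""
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes to flag problems
    )
    return parse_health(result.stdout)


if __name__ == "__main__":
    # Placeholder device path; adjust for your system.
    print(overall_health("/dev/sda"))
```

    The overall verdict is only the drive's own pass/fail summary; `smartctl -A` prints the full attribute table, which is where the earlier warning signs discussed in this thread would show up.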


    -WD