How Seagate Tests Its Hard Drives

Analysis

The Longmont facility contains a full manufacturing lab capable of producing prototype volumes of any of Seagate’s drives in a fully automated fashion. The lab itself could fill an entire article with its robotic systems, visual feedback loops for calibration, and tracking systems able to identify not only the drive batch a given disk came from but also its media tray and even the slot position within that tray. It’s mind-boggling.

Assembly

However, this lab is only a microcosm of the dramatically larger manufacturing facilities located abroad. Contamination is a constant focus for both the robustness of Seagate’s drive designs and its manufacturing processes. Despite assembly happening in strict cleanroom conditions, there’s no such thing as perfect cleanliness. Contamination can come from factors outside the drive as well as inside: a type of lubricant, a chemical emission from a new PCB component, and so on. Some factors lie within Seagate’s factories; others can come from outside suppliers. And remember when we talked about drives getting pulled from atmospheric test chambers for analysis? Contamination can result from atmospheric influences. If contaminants get onto the media or heads, the result may prove disastrous for drive reliability.

The forensic quest for contaminants starts here, with a $1.6 million secondary ion mass spectrometer. In the simplest terms, this machine allows scientists to analyze materials from the very topmost layers of heads or media, sometimes down to only a few molecules.

When we arrived in the spectrometer lab, workers were busy examining a chemical contaminant fingerprint. Most likely, it came from one of the lubricants used within the drive. Interestingly, though, different components within the drive can use different lubricants, and each gives off a unique spectral pattern under analysis. In this way, scientists can better pinpoint the root cause of potential contamination issues.
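
To make the fingerprint idea concrete, here is a minimal sketch of how a measured spectrum might be compared against a library of known lubricant signatures. Everything in it is an assumption for illustration: the lubricant names, the five intensity bins, the numbers, and the choice of cosine similarity as the matching score. It is not a description of Seagate's actual tooling or analysis method.

```python
import math

# Hypothetical reference library: relative peak intensities in five
# mass-to-charge bins for three made-up lubricants (illustrative values only).
REFERENCE_SPECTRA = {
    "spindle_lubricant": [0.9, 0.1, 0.0, 0.4, 0.0],
    "actuator_grease": [0.1, 0.8, 0.3, 0.0, 0.1],
    "media_lubricant": [0.0, 0.2, 0.9, 0.1, 0.5],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero spectrum matches nothing
    return dot / (norm_a * norm_b)


def best_match(sample, library):
    """Return (name, score) of the library spectrum most similar to the sample."""
    scores = {name: cosine_similarity(sample, ref) for name, ref in library.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Hypothetical contaminant spectrum measured on a head or media surface.
sample = [0.05, 0.75, 0.35, 0.0, 0.15]
name, score = best_match(sample, REFERENCE_SPECTRA)
print(f"closest reference: {name} (similarity {score:.3f})")
```

With these made-up numbers, the sample lands closest to the hypothetical actuator grease. A real workflow would involve far richer spectra and expert interpretation, but the principle is the same: a distinct signature per lubricant is what lets analysts trace a contaminant back to its source.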

The secondary ion mass spectrometer room stands adjacent to the particle metrology lab. Here, every particulate that can be extracted from a component gets extracted and analyzed. Workers measure quantities, but they also intensively characterize the types of particulate that are either intrinsic to the material or present as a contaminant.

Of course, no analysis lab would be complete without a scanning electron microscope (SEM) or two. During our visit, one machine was examining extractions that had been filtered out in the particle metrology lab and then dried, with the SEM used to identify them. On a different SEM, shown below, white dots were observed and identified as bits of corrosion measuring only a few dozen molecules across. These were found on heads exposed to the three-week high temperature and humidity conditions mentioned earlier.

Not all analysis is chemical. Our last leg in the analysis wing took us to the metallography lab, where the work essentially involves lots and lots of cross-sectioning.

“We cross section any part, any sub-assembly,” explained one senior technician. “We even cross-section whole drives here. There are many, many reasons to do that. Oftentimes, it’s for the mechanical design team to look at tolerances off of a production part. More fun is when we’re looking at parts that have been subjected to environmental conditions that are intolerable for human beings, then figuring out the robustness of the component or the sub-assembly. Increasingly, drives are expected to perform in extreme environments. We send off drives for corrosion and pollution and heat, then we evaluate what the consequences are, to the PCBA especially. Part of our concern is about the need for a longer warranty on enterprise drives, but we also need to address markets in these inhospitable environments.”

The variety of things that can be learned through cross-sectioning is fascinating. Workers may intentionally fracture a drive to see what happens to its materials, particularly on surfaces. The PCB occupies a lot of attention, especially under chips, as do solder joints around the drive. Not surprisingly, such matters become doubly important when designing and refining the integration of new drive technologies.

Grind & Shine

In these images, you see cross-sections of a new design’s base plate. When imaged through a microscope, a hairline crack becomes visible splitting through a corner region. This is exactly the sort of thing engineers want remedied before completing the Design phase.


87 comments
  • tom10167
    Awesome photos. I don't know what the last picture is but I know I need one of those in my house.
  • Rookie_MIB
    Quote:
    Awesome photos. I don't know what the last picture is but I know I need one of those in my house.


    That is an enterprise storage rack full of 2u hotswap chassis. 18 chassis, 12 drives per chassis = 216 drives @ 6tb (?) per drive = 1,296 terabytes or 1.3 Petabytes.

    You could store a lot of TV shows or movies on that thing. Imagine how many of those are used for YouTube? Yikes. They get 300 hours of footage uploaded every minute.
  • Mike-TH
    So if their testing is so good, why are their drives among the worst for reliability - to the point where most IT people I know actually refuse to use them, or if forced to use them will keep (and use) more spares than for other makers.
  • Tom20160027
    The article explains the different types of drive/MTBF and why the backblaze test is useless information. Marketing plot to have folks talking about it and re-posting its link. It seems to work as we keep seeing the link over and over... They are not getting my data. They put drives designed for desktop into servers and run them to the ground and call it a "reliability test". Let's test my kids bicycle with training wheels at the Tour de France and complain about its quality....

    I know IT folks that refuse to use other brands of drives as well. I know IT folks that refuse to use servers from this brand or that brand. We can find anecdotal information about anything. It does not make it true.
  • Glock24
    Seagate tests their drives? I thought they didn't!

    I've had more Seagate drives die without warning than any other brand. The only ones that have survived are some old 250GB Barracuda ES. All other models I've owned had lots of bad sectors or just stopped working before the first year, but SMART almost always says the drive is fine!
  • zodiacfml
    Yawn. All I think of right now is that HDDs will become the tape drives of the past.
  • Garrek99
    The only drives I've ever had go bad on me were Seagate drives.
    Every other drive I've ever purchased simply became obsolete due to size and thus replaced.
    They should be reading about how the other drive makers do their testing and learn from that. Hahaha
  • rosen380
    Maybe things changed... but all of my old SGI machines always had Seagate drives in them and the 20+ year old drives all still work. Hell look at what these drives *sell* for on eBay:
    http://www.ebay.com/sch/i.html?_sc=1&_udlo=0&_fln=1&_udhi=200&LH_Complete=1&_ssov=1&_mPrRngCbx=1&LH_Sold=1&_from=R40&_sacat=0&_nkw=%28st31200N%2C+st32171N%2C+st32272N%2C+ST34371N%2C+st34520N%2C+st34573n%2C+st39173N%2C+st318417N%2C+st52160N%29&_sop=16


    4.5 GB drives *selling* for $150+. I see a 2 GB for $120.

    They must have been pretty decent at some point if SGI was putting them in their $5000-20000 workstations and people are spending $40+ per GB to get these now...
  • rosen380
    Link was too long... http://tinyurl.com/gntz4p2
  • Bossyfins
    How does Seagate test their hard drives?


    They don't LOL


    It is nice to see this, but failure rates are too damn high.
  • Colin_10
    This article is reminiscent of a "How It's Made" episode, with the exception that it actually covers the details that make a manufacturing process interesting. Well written article.
  • Bossyfins
    ^ I agree, but if they test their HDDs, I should be able to see results, i.e., lower failure rates.
  • kittle
    I never understood everyone griping about HDD failures for a specific brand. I have several Seagate drives that are 10yrs old and STILL WORK FINE. I have several WD drives that still work and they are also 10yrs old. They are slow compared to today's standards, but they work.

    Of all the drives I have used, only 3 have failed in 20+ years of using PCs. 1st one was already well abused and it fried a chip on the PCB. 2nd one cooked itself because I had no clue 10k rpm drives needed active cooling, and the 3rd one failed because I repeatedly dropped it on the floor.

    Take care of your drives and they will return the favor.
  • none12345
    "Yawn. All I think of right now is that HDDs will become the tape drives of the past. "

    Perhaps one day... however, you do realize that tape drives are still used, right?

    Tapes are still cheaper than hard drives, and still cheaper than SSDs, on a cost/TB basis. They are still the cheapest way to archive lots of data.
  • jimmysmitty
    2185273 said:


    How did I know that this was going to come up.

    Mike-TH said:
    So if their testing is so good, why are their drives among the worst for reliability - to the point where most IT people I know actually refuse to use them, or if forced to use them will keep (and use) more spares than for other makers.


    Most IT people buy their servers, SANs etc prebuilt and typically have no choice in what brand of drives are used as that is normally decided by the OEM such as Dell, Nimble etc. We have servers with all brands. In fact so far in the year I have worked here we have had more WDs fail than Seagates or Toshibas but that does not mean the WDs are worse.

    Tom20160027 said:
    The article explains the different types of drive/MTBF and why the backblaze test is useless information. Marketing plot to have folks talking about it and re-posting its link. It seems to work as we keep seeing the link over and over... They are not getting my data. They put drives designed for desktop into servers and run them to the ground and call it a "reliability test". Let's test my kids bicycle with training wheels at the Tour de France and complain about its quality.... I know IT folks that refuse to use other brands of drives as well. I know IT folks that refuse to use servers from this brand or that brand. We can find anecdotal information about anything. It does not make it true.


    The problem is getting people to grasp the concept of using a product in an unintended environment AND, to top it off, not even mounting them properly in some cases, thus making this data unusable for consumers, as no consumer OEM PC or self built PC will have an improperly mounted drive in a torture environment. It is much like using Prime95 these days. Sure it is great if you want absolute maximum temps, but if you run the Asus real world stress test you will have a better picture of how your system will perform.

    The best thing about that is that in their Q3 report, Seagates dropped to under 6% failure rates yet WD jumped to 8%.

    I also found another interesting fact, the 3TB Seagate everyone fears is a HDD that is rated for 2400 power on hours per year, it is not meant to be on 24x7. Move to the Enterprise class 3TB and guess what? It is designed and rated for 24x7 use.

    It is just another flawed study of a component for which it is hard to put out real world failure rate numbers, since there are a ton of different reasons for HDD failures due to so many different configurations and environments.

    Other than that the article was very interesting.
  • teahsr
    "How Seagate works with Toms to recover some credibility"

    is what this article should have been entitled. Clear attempt to co-opt Tom's to recover a reputation tarnished by awful reliability. I wonder how much advertising Tom's sold Seagate in conjunction with this article?

    Quote:
    The article explains the different types of drive/MTBF and why the backblaze test is useless information. Marketing plot to have folks talking about it and re-posting its link. It seems to work as we keep seeing the link over and over... They are not getting my data. They put drives designed for desktop into servers and run them to the ground and call it a "reliability test". Let's test my kids bicycle with training wheels at the Tour de France and complain about its quality....


    Backblaze's use of hard drives is neither pure server nor consumer. Typical usage patterns are write the data, leave it there, rarely if ever reading it. That is the nature of cloud backup. Notwithstanding Backblaze's data, Seagate has form in unreliability - anyone remember the 7200.11 1tb drives - they died like flies and Seagate covered it up for ages.

    The reason I'll never buy Seagate again is how they deal with dodgy drives. I had 7200.11 drives die just outside the warranty period - Seagate refused to replace. Seagate has continued to sell drives with known failure rates of 40% - not ethical and not acceptable.
  • jimmysmitty
    teahsr said:
    "How Seagate works with Toms to recover some credibility" is what this article should have been entitled. Clear attempt to co-opt Tom's to recover a reputation tarnished by awful reliability. I wonder how much advertising Tom's sold Seagate in conjunction with this article?
    Quote:
    The article explains the different types of drive/MTBF and why the backblaze test is useless information. Marketing plot to have folks talking about it and re-posting its link. It seems to work as we keep seeing the link over and over... They are not getting my data. They put drives designed for desktop into servers and run them to the ground and call it a "reliability test". Let's test my kids bicycle with training wheels at the Tour de France and complain about its quality....
    Backblaze's use of hard drives is neither pure server nor consumer. Typical usage patterns are write the data, leave it there, rarely if ever reading it. That is the nature of cloud backup. Notwithstanding Backblaze's data, Seagate has form in unreliability - anyone remember the 7200.11 1tb drives - they died like flies and Seagate covered it up for ages. The reason I'll never buy Seagate again is how they deal with dodgy drives. I had 7200.11 drives die just outside the warranty period - Seagate refused to replace. Seagate has continued to sell drives with known failure rates of 40% - not ethical and not acceptable.


    Known failure rate of 40%? Where do you get these numbers? How are these numbers obtained? What scenarios are these numbers obtained in?

    BTW, did you know consumer HDDs from most brands are not rated for 24x7 operation hence why they should not be used in any server environment? It is a server environment in that it is a server with multiple HDDs running 24x7. They may not be read from all the time but they are spinning all the time and most 7200RPM HDDs will spin at 7200RPM unless they have a power saving feature.

    Running a consumer drive rated for normal use, i.e. that it will be powered on and off, will result in much different failure rates.

    BTW, the majority of companies will not replace a product outside of the warranty unless it is an issue affecting a massive amount of people such as a recall.

    This is not the first time Toms has gone into a company to see how they do things. They did not get paid anything nor did they sell advertising to Seagate.
  • firefoxx04
    Here we go again. Seagate drives died when other brands didn't. I don't care if they were misused. I want drives that can be abused and still work. Not fragile drives.
  • littleleo
    Quote:
    Quote:
    Awesome photos. I don't know what the last picture is but I know I need one of those in my house.
    That is an enterprise storage rack full of 2u hotswap chassis. 18 chassis, 12 drives per chassis = 216 drives @ 6tb (?) per drive = 1,296 terabytes or 1.3 Petabytes. You could store a lot of TV shows or movies on that thing. Imagine how many of those are used for YouTube? Yikes. They get 300 hours of footage uploaded every minute.


    That's a lot of porn, lol.
  • mrjhh
    Some other interesting information would be when Seagate started their testing process in the form it is today. There were some families of drives which were almost guaranteed to fail. I had a couple of the 1.5TB drives, both of which failed in less than one year. The warranty replacements have lasted 5+ years. Talking about some of the less reliable drives, the problems they had, and the processes they put in place to ensure those failures would not happen again would do wonders to improve their reputation. Making drives that run 24x7 for 5 years for the consumer market would also help their reputation. Most consumers buying a 4+ TB hard drive aren't turning their machines off when they are done, although they may sleep. Maybe they have to spend an extra dollar to make the drive better, and won't have enough differentiation with their more expensive drives. But, that's what they need to get both the commercial and home users who specify brand to have more consumer confidence.
  • littleleo
    You know the old saying "the only thing for certain is death and taxes", well we need to add one more and that is "your hard drive will fail whatever brand it is". I had hard drives fail from every brand. Over the years I've noticed sometimes a brand has a bad cycle and everyone notices it and switches for a while. I remember the IBM Deskstar, which we called the Deathstar since it died the last day of the warranty on cue every time. At one point no one wanted WDs, and lately Seagate has had some issues, especially with the 3 TB models. Isn't there a class action lawsuit on the external backup model? Point is don't buy any brand and expect it to last forever. Drives that last longer than a few years are an anomaly, not the rule. Expect them to fail and back up all your important stuff.
  • jimmysmitty
    mrjhh said:
    Some other interesting information would be when Seagate started their testing process in the form it is today. There were some families of drives which were almost guaranteed to fail. I had a couple of the 1.5TB drives, both of which failed in less than one year. The warranty replacements have lasted 5+ years. Talking about some of the less reliable drives, the problems they had, and the processes they put in place to ensure those failures would not happen again would do wonders to improve their reputation. Making drives that run 24x7 for 5 years for the consumer market would also help their reputation. Most consumers buying a 4+ TB hard drive aren't turning their machines off when they are done, although they may sleep. Maybe they have to spend an extra dollar to make the drive better, and won't have enough differentiation with their more expensive drives. But, that's what they need to get both the commercial and home users who specify brand to have more consumer confidence.


    So Seagate needs to make their consumer drives rated for 24x7 operation but no other company does? No other company's consumer HDDs are rated for 24x7. Just because you might keep yours on 24x7 does not mean that that drive is rated for it.

    To top it off even if a consumer keeps their system on 24x7 the HDD has sleep states and is not being accessed or written to constantly like in an enterprise solution.

    No HDD line is guaranteed to fail. It is just luck of the draw and the situation you put it in. If you let it get dusty and run warm it sure as hell will die a lot faster than if you kept it cool and dust free.
  • teahsr
    Quote:
    Known failure rate of 40%? Where do you get these numbers? How are these numbers obtained? What scenarios are these numbers obtained in? BTW, did you know consumer HDDs from most brands are not rated for 24x7 operation hence why they should not be used in any server environment? It is a server environment in that it is a server with multiple HDDs running 24x7. They may not be read from all the time but they are spinning all the time and most 7200RPM HDDs will spin at 7200RPM unless they have a power saving feature. Running a consumer drive rated for normal use, i.e. that it will be powered on and off, will result in much different failure rates. BTW, the majority of companies will not replace a product outside of the warranty unless it is an issue affecting a massive amount of people such as a recall. This is not the first time Toms has gone into a company to see how they do things. They did not get paid anything nor did they sell advertising to Seagate.



    Failure rate is from Backblaze - 3tb Seagate drive failures in 2014 - 43.1%.

    https://www.backblaze.com/blog/wp-content/uploads/2015/07/blog-fail-drives-manufacture-2015-june.jpg

    Toms reported in 2009 that the 7200.11 drives had reported failure rates of 30-40%.

    http://www.tomshardware.com/news/seagate-7200.11-failing,6844.html

    The 7200.11 issue affected 500gb, 1tb and 1.5tb drives, with Seagate deleting posts on their forums about the issue to cover it up. Eventually Seagate offered free data recovery for affected persons, but would not replace the drives outside of warranty.

    I'd say with 40% defect rate, on a hard drive where important data is held, a recall should have been done and all faulty drives replaced upon request. Nope - Seagate decided to deny the issue and refuse replacements.

    BTW - did you know that Backblaze uses all the major brands' consumer drives in a server environment? And Seagate 3tb are the only drives that have shown catastrophic failure rates.

    Also - sure Toms would not have been paid for the article, as for advertising...well that information would be commercial in confidence. But Seagate has given VIP access for Toms to have an article. Anyone asking why would be thinking it's to improve Seagate's reputation, and has given an article to Toms on a platter. It's how the PR/Media industries work and IMO pretty dodgy.