During the Design phase, engineers are challenged to optimize functionality, schedule, and cost without compromising on reliability. Seagate iterates design points until there is confidence that customers will have, as Seagate puts it, “a positive field reliability experience,” meaning that the drive holds up as expected. Does that mean perfect reliability? No. Would anyone have called the Wright Flyer reliable? Of course not. That thing was totaled by a wind gust while parked on the ground following its first day of testing. Yet ongoing waves of design refinement increased reliability, with the third-generation Flyer staying aloft for up to 39 minutes and proving far less prone to crashing.
Seagate’s Design phase flies a similar route, although, unlike the Wright brothers, Seagate engineers know exactly what quality levels they need to reach. Enterprise Capacity, for example, needed to achieve a mean time between failures (MTBF) of 2,000,000 hours as part of its design objectives. A 2 million-hour MTBF is common and expected among a certain class of enterprise hard drives, and the design would not be accepted by the market if it failed to reach this level of reliability.
Two points should be made here. First, engineers could design a drive with much higher reliability. As one of them told us, it’s possible to design a drive that would run for 100 years, but no one would ever buy it. So when you see MTBF specs of 800,000 hours or 2 million hours, understand that those are reliability levels commonly expected by certain market segments. A home PC that might run for three hours per day over five years (about 5,500 hours) has no need for a 2 million-hour MTBF drive and thus no need to bear the cost of the pricier parts and additional engineering necessary to meet that metric. A relatively low MTBF does not imply that a drive is inferior. It is merely price-performance optimized for a given market segment.
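To put those numbers side by side, here is a minimal sketch of the arithmetic. It assumes a constant failure rate (the textbook exponential model), which is a simplification and not something Seagate described to us; the function names are ours, chosen for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap days


def powered_on_hours(hours_per_day: float, years: float) -> float:
    """Total hours a drive is powered on over its service life."""
    return hours_per_day * 365 * years


def failure_probability(hours: float, mtbf_hours: float) -> float:
    """Chance of at least one failure within `hours` of operation.

    ASSUMPTION: constant failure rate (exponential lifetime model).
    Real drives do not follow this exactly; it is only a rough guide.
    """
    return 1.0 - math.exp(-hours / mtbf_hours)


# The home PC from the text: three hours per day over five years.
home_pc_hours = powered_on_hours(hours_per_day=3, years=5)
print(f"Home PC usage: {home_pc_hours:,.0f} hours")

# Compare the two MTBF classes mentioned in the text.
for mtbf in (800_000, 2_000_000):
    p_home = failure_probability(home_pc_hours, mtbf)
    p_always_on = failure_probability(HOURS_PER_YEAR, mtbf)
    print(
        f"MTBF {mtbf:>9,} h: "
        f"home PC over 5 years {p_home:.2%}, "
        f"always-on for 1 year {p_always_on:.2%}"
    )
```

Under this simplified model, the home PC's chance of a failure over five years comes out well under 1 percent for either MTBF class, which is why the extra cost of the higher rating buys that user very little.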
Second, drives do not begin the Design stage at anything remotely close to their final MTBFs. Just as the Wright brothers were happy simply to fly ten feet over the ground on their first flights, Seagate engineers might only want their first wave of drives to reach a 300-hour MTBF, or even less. Knowing this, they will begin with only a relative handful of drive samples, perhaps 100 or 200, and run them through a battery of basic tests to examine both their performance characteristics and the nature of any failure points. Once that analysis completes, engineers make whatever design tweaks they deem necessary, build a new batch of drives, and try again. Once that first set of reliability and performance metrics is satisfied, Seagate creates a larger population of drives and raises the metric targets, and the cycle repeats.
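Why do the drive populations have to grow as the targets rise? One way to see it is a standard zero-failure demonstration test. This is our own illustrative sketch, not Seagate's test plan: it assumes a constant failure rate, and the 60 percent confidence level and the 1,000-drive population are hypothetical values we picked for the example.

```python
import math


def zero_failure_test_hours(target_mtbf: float, confidence: float) -> float:
    """Total drive-hours that must pass with zero failures to demonstrate
    `target_mtbf` at the given confidence level.

    ASSUMPTION: exponential lifetimes, so the chance of seeing no failures
    in T total hours is exp(-T / MTBF). Solving exp(-T / MTBF) <= 1 - C
    for T gives T >= -MTBF * ln(1 - C).
    """
    return -target_mtbf * math.log(1.0 - confidence)


def hours_per_drive(target_mtbf: float, confidence: float, n_drives: int) -> float:
    """Test time each drive must run if the load is shared equally."""
    return zero_failure_test_hours(target_mtbf, confidence) / n_drives


CONFIDENCE = 0.60  # hypothetical confidence level, chosen for illustration

# Early wave from the text: 300-hour MTBF target, 100 sample drives.
early = hours_per_drive(300, CONFIDENCE, n_drives=100)
print(f"Early wave: {early:.2f} hours per drive")

# Final target from the text: 2,000,000 hours. The 1,000-drive
# population is a hypothetical figure, not one Seagate gave us.
final = hours_per_drive(2_000_000, CONFIDENCE, n_drives=1_000)
print(f"Final target: {final:,.0f} hours per drive ({final / 24:.0f} days)")
```

With these assumed inputs, the early target can be checked in a few hours on a small batch, while the final target needs months of run time even when spread across a much larger population.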
By the time Design wraps up, a drive will have undergone well over 500 tests, many of which take weeks to run.
“Ultimately, we're trying to determine how long will the product live out in the real world when it’s used in normal usage,” explained one engineer on our journey. “The second half to that is design maturity testing, where we push the envelope. We try to see how hot will they go, how cold, wet, dry, voltage margining, scripts. You throw all these parameters at the hard drive in various combinations, and you see where the limits are.”