Update, 10/7/16, 2:30 PT: Intel changed the endurance rating for three of the four 600p SSDs, and we list the new endurance ratings below. We have further detail in the "Intel Quietly Increases The 600p Series Endurance Ratings" article.
| | 600p 128GB | 600p 256GB | 600p 512GB | 600p 1TB |
|---|---|---|---|---|
| Old Endurance (TBW) | 72 TB | 72 TB | 72 TB | 72 TB |
| New Endurance (TBW) | 72 TB | 144 TB | 288 TB | 576 TB |
Update, 9/23/2016, 2:20pm PT: After further discussion with Intel, we discovered that the SSD can continue to write data after the MWI limit if there is still enough spare area to replace failed cells. We have amended the original article content accordingly.
We recently shone a light on the somewhat concerning Intel SSD endurance limitations in the M.2 600p SSD review. The 600p switches into a locked read-only mode when the SSD exhausts its spare area. Intel clarified the process for recovering data after its SSDs enter into the locked state, but we also want to clear up some common misconceptions about endurance ratings, which SSD vendors spec with the somewhat misleading "TBW" (Terabytes Written) measurement.
Intel designed its IMFT 3D TLC NAND-powered 600p SSDs specifically for the low end of the market, and they have only 72TB of endurance, which pales in comparison to value offerings from other SSD vendors. For instance, the recently announced 960 EVO (which also uses 3D TLC NAND) offers 400TB of endurance with its 1TB model. Most users will not reach the Intel-imposed 72TB limit, and as a general rule, flash outlives its endurance specification. There have been many reports of SSDs outlasting the endurance specification by hundreds of TBs (or several PBs) before the user actually experiences data loss.
Intel's locking procedure has been a staple in both its client and enterprise products for several years. We explained in the review that although we knew the read-only locking procedure applied to past products, we had not received confirmation from Intel that the new low-endurance models also employ the same technique. The locking feature really hasn't been an issue in the past, but due to the 600p's low endurance rating, a casual user certainly has a much higher chance of encountering the odd situation.
We received an official response from Intel confirming that the feature is active on the 600p series, but the MWI SMART value, which indicates how much warrantied endurance is available, does not trigger the read-only state. Every SSD has a spare area of flash that it uses to replace failed cells, and a portion of that spare area can still be available after the MWI counter reaches one (signaling that the flash is expired). If cells are still available for replacement, the SSD can continue to function past the MWI counter expiration, and the user can continue to write data beyond the warrantied endurance rating (which Intel doesn’t recommend). However, the SSD will enter the read-only state once the spare area is exhausted. The company also outlined the data recovery process after the endurance expires:
Under typical client usage, a user will not wear out the endurance of the drive before reaching end of warranty period. For NVMe SSDs, the Percentage Used SMART info is the end user indicator for when drive is reaching its write endurance EOL. If SSD reaches Percentage Used value of 100, then the drive has reached the planned life of the media, and the user should replace drive. Another quality metric of the drive is available spare. If available spare area drops below threshold, which is very untypical during warranty period and write endurance of drive, then the user will also be warned via the SMART information that drive is in critical state. If user continues to use the drive, it will reach a point that it will be forced into read only mode. The user can then place drive in a system that only requires reading from the drive and recover data before replacement. (emphasis added)
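The stages in Intel's statement can be summarized in a short Python sketch. The field names mirror values reported in the NVMe SMART log (Percentage Used, Available Spare, Available Spare Threshold), but the `SmartSnapshot` class, the `classify` function, the `read_only` flag and the sample numbers are hypothetical illustrations, not Intel's firmware logic or the API of any real tool.

```python
from dataclasses import dataclass


@dataclass
class SmartSnapshot:
    """Hypothetical container for a few NVMe SMART health values."""
    percentage_used: int            # 0 when new; 100 = rated endurance consumed
    available_spare: int            # remaining spare capacity, in percent
    available_spare_threshold: int  # warning threshold, in percent
    read_only: bool = False         # True once the drive has locked out writes


def classify(s: SmartSnapshot) -> str:
    """Map a snapshot to the stages described in Intel's statement."""
    if s.read_only:
        return "read-only: recover data from a secondary system"
    if s.available_spare < s.available_spare_threshold:
        return "critical: spare area below threshold"
    if s.percentage_used >= 100:
        return "rated endurance consumed: replace drive"
    return "healthy"


# Illustrative values only (assumed, not measured on a 600p).
print(classify(SmartSnapshot(12, 100, 10)))
print(classify(SmartSnapshot(100, 60, 10)))
print(classify(SmartSnapshot(130, 5, 10)))
```

The second sample reflects the amended point above: a drive can pass 100 percent used and keep accepting writes as long as spare area remains.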
Most users will know that the SSD has entered into the read-only state because the operating system will begin generating error messages. The OS generates the error messages because it has to be able to write data to the drive in order to function (there are always myriad processes, such as logging, that occur in the background).
Intel's process for copying the data from a read-only SSD involves simply installing the drive as a secondary volume (non-OS) in a computer. The operating system will not lock up if the secondary drive does not accept incoming write data, so the user is free to copy the data to another drive.
The process to copy the data is simple, but Intel designed the 600p series for casual users, and most non-technical users will never know that the drive has entered into a read-only state. Successive reboot attempts will not resolve the issue, and many users may decide that the drive has died, taking the data with it.
We feel that Intel should be more forthcoming about the end-of-life process and educate users in its standard documentation. As it stands, we cannot find any direct Intel support or reference materials that outline the end-of-life process.
Why The TBW Rating Is Misleading
Another interesting facet of the endurance conversation revolves around the widely used, but somewhat misleading, TBW measurement. Simply put, SSD vendors provide a TBW rating to indicate how many terabytes of data a user can write to the SSD before it expires.
SSD endurance is a tricky subject. Unfortunately, most users rely upon the "host writes" measurement (which calculates how much data the host has written to the SSD) as an indicator of how much endurance they have used, and how much remains. Most of our readers comment that they have "only" written XX amount of data to their SSD in X amount of years, but that is not an accurate indicator of the used, or remaining, endurance.
Writing 1GB of data does not always mean that the SSD actually writes only 1GB of data. In fact, unless the SSD uses compression technology (which is very rare after the slow and silent death of SandForce controllers), the SSD will normally write more data than the host computer sends to the storage device. This “write amplification” is due to internal SSD processes. Write amplification is widely documented but often misunderstood. The amount of write amplification varies between SSD vendors, controllers, firmware implementations and the type of data you write, but it usually falls into the 2X to 3X range. This means that a 1GB data transfer can result in as much as 3GB of data written to the NAND (and possibly even more).
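As a rough illustration of that arithmetic, here is a minimal Python sketch. The function names are hypothetical, and the 2X and 3X factors are simply the typical range quoted above, not measurements from any particular drive.

```python
def nand_writes_gb(host_writes_gb: float, waf: float) -> float:
    """Estimate data written to the NAND for a given write amplification factor."""
    return host_writes_gb * waf


def measured_waf(nand_writes_gb_total: float, host_writes_gb_total: float) -> float:
    """Derive the write amplification factor from two write counters."""
    if host_writes_gb_total <= 0:
        raise ValueError("host writes must be greater than zero")
    return nand_writes_gb_total / host_writes_gb_total


# A 1GB host transfer at the low and high end of the typical range.
for factor in (2.0, 3.0):
    print(f"{factor}X amplification: {nand_writes_gb(1.0, factor)} GB written to NAND")
```

The second function runs the calculation in reverse: if a drive exposes both a host-write and a NAND-write counter (not all do), dividing one by the other gives the amplification the drive has actually experienced.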
The SSD also constantly juggles data internally due to static data rotation and garbage collection routines, so there is a constant stream of wear inside the SSD, even if the user is not actively writing data to the drive. This wear is beyond the user’s control. Some SSDs have aggressive garbage collection routines, which increase the amount of internal wear compared to other SSDs with more conservative algorithms.
Intel, like all other SSD vendors, uses the MWI (Media Wearout Indicator) SMART value to determine how much life the SSD has left. The wearout indicator is not based on the amount of data that the user writes to the drive (host writes). Rather, the MWI measures what percentage of the finite program/erase cycles the SSD has consumed. The MWI takes into account all of the "unseen" writes that constantly sap endurance in the background, including the non-"host write" variety. Intel's SSDs rely upon the MWI as the measure of endurance, which Intel referred to as the "Percentage Used" value in its official statement.
The media wearout indicator (MWI) is not a one-to-one measurement of the endurance in relation to the amount of raw data the user writes to the drive. This uncomfortable fact means that you cannot judge how much endurance you need by the amount of data that you write to the drive, or even by the TBW (Terabytes Written) value that many common utilities provide. Intel, and other SSD vendors, also determine whether you qualify for a warranty replacement by using the MWI counter, not the amount of data the user writes to the drive. The MWI counter begins at 100 for a new SSD, but your warranty is void once it reaches one.
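To make the countdown concrete, here is a minimal Python sketch of how a normalized wear value could be derived from consumed program/erase cycles. The `mwi_value` function and the 1,000-cycle rating are assumptions for illustration only; real firmware computes the value internally, and this is not Intel's formula or a 600p specification.

```python
def mwi_value(avg_pe_cycles_used: int, rated_pe_cycles: int) -> int:
    """Return a wear value that starts at 100 and bottoms out at 1.

    Assumes wear is tracked as the share of rated program/erase
    cycles consumed, averaged across the flash.
    """
    if rated_pe_cycles <= 0:
        raise ValueError("rated_pe_cycles must be greater than zero")
    percent_consumed = (100 * avg_pe_cycles_used) // rated_pe_cycles
    return max(1, 100 - percent_consumed)


RATED_CYCLES = 1000  # hypothetical rating, chosen only for round numbers

for used in (0, 250, 1000):
    print(f"{used} cycles used -> MWI {mwi_value(used, RATED_CYCLES)}")
```

Because the input is consumed cycles rather than host writes, two drives with identical host-write totals can report different values if one has experienced more background wear.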
SSD vendors spec SSD endurance by the TBW metric, but it is merely a guideline. For most purposes, TBW serves as a good general estimate, but older SSDs tend to accumulate much more of the "unseen" wear that is measured only by the MWI counter.
The MWI counter is the only true indicator of remaining endurance with all SSDs.
We advise readers to refer to the MWI counter to accurately gauge their current data usage patterns so they can make an informed decision before they purchase their next SSD.