Some Consumer NVMe SSDs Reportedly More Prone To Data Loss During Power Outage
DRAM plus a power outage could make for a bad situation
SSD enthusiast Russ Bishop (@xenadu02) has reportedly tested how well four NVMe SSDs protect data against power loss. According to his testing, two of the four SSDs lost data that had already been reported as flushed from DRAM when power was artificially cut.
Most SSDs on the market today use a DRAM cache to improve latency and bandwidth. However, due to the nature of DRAM chips, DRAM cannot store data when power is lost, which is a legitimate reliability concern for SSDs when unexpected power outages occur. Most consumer SSDs aren't equipped with the power loss capacitors that we see on enterprise SSDs, which makes them more vulnerable to losing data during unexpected power loss events.
The DRAM chip holds a lot of important data, not just temporary data that needs to be transferred to NAND storage. DRAM also holds the drive's FTL, or Flash Translation Layer, which is used as a map to see where data is stored on the drive. If the FTL is corrupted, the entire SSD could also become corrupted.
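To make the mapping idea concrete, here is a minimal sketch of a logical-to-physical map, assuming a toy drive where every write lands on a fresh physical page. The `ToyFTL` class and its fields are hypothetical and do not reflect any vendor's firmware.

```python
class ToyFTL:
    """Toy flash translation layer: maps logical block addresses to physical pages."""

    def __init__(self):
        self.l2p = {}        # logical block address -> physical page (held in DRAM)
        self.nand = {}       # physical page -> data (survives power loss)
        self.next_page = 0   # pages are written out of place, never overwritten

    def write(self, lba, data):
        page = self.next_page
        self.next_page += 1
        self.nand[page] = data
        self.l2p[lba] = page  # the previous page for this LBA, if any, is now stale

    def read(self, lba):
        return self.nand[self.l2p[lba]]

    def lose_power(self):
        # Without a persisted copy of the map, the data is still in NAND,
        # but the drive no longer knows which page belongs to which LBA.
        self.l2p = {}


ftl = ToyFTL()
ftl.write(7, b"old")
ftl.write(7, b"new")
print(ftl.read(7))   # b'new'
ftl.lose_power()
print(7 in ftl.l2p)  # False
```

In this toy model the pages are still physically present after `lose_power()`, yet nothing can be read back by logical address, which is why a damaged map is so much worse than a few lost writes.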
Thankfully, some SSD manufacturers have countermeasures in place for such an occasion. One example is a technique used by Samsung that employs journaling to keep as much data intact as possible during a power outage. Journaling lets the SSD record which changes the OS's file system has requested before those changes are made. When a power outage occurs and data in the DRAM cache is lost, the SSD can tell from the journal which data was already transferred to the NAND and which was lost.
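The general idea of journaling can be sketched as follows. This is a generic intent journal under simplifying assumptions (the journal itself is treated as persistent, and writes are whole blocks), not Samsung's implementation; the `JournaledDrive` class and all of its names are hypothetical.

```python
class JournaledDrive:
    """Toy drive that logs each intended write before it reaches NAND."""

    def __init__(self):
        self.journal = []  # assumed to be persisted before any change is applied
        self.cache = {}    # volatile DRAM cache, lost on power failure
        self.nand = {}     # non-volatile storage

    def write(self, lba, data):
        self.journal.append({"lba": lba, "done": False})  # record intent first
        self.cache[lba] = data

    def flush(self):
        # Move every pending write from DRAM to NAND and mark it as done.
        for entry in self.journal:
            if not entry["done"]:
                self.nand[entry["lba"]] = self.cache[entry["lba"]]
                entry["done"] = True
        self.cache.clear()

    def recover_after_power_loss(self):
        # DRAM contents are gone; the journal says what made it and what did not.
        self.cache.clear()
        committed = [e["lba"] for e in self.journal if e["done"]]
        lost = [e["lba"] for e in self.journal if not e["done"]]
        self.journal = []  # journal is cleared once its contents are reported
        return committed, lost


drive = JournaledDrive()
drive.write(1, b"saved")
drive.flush()
drive.write(2, b"pending")
print(drive.recover_after_power_loss())  # ([1], [2])
```

The journal does not bring back the lost write to block 2; it only lets the drive know exactly what is missing, so the rest of the data stays consistent.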
Other approaches involve sensitive circuitry that detects a power outage before all power is lost, triggering a DRAM flush prior to a complete loss of power. These techniques are often good enough for consumer SSDs, to the point where actual data loss is rare during a power loss event.
Fun story: I tested a random selection of four NVMe SSDs from four vendors. Half lose FLUSH’d data on power loss. That is the flush went to the drive, confirmed, success reported all the way back to userspace. Then I manually yanked the cable. Boom, data gone.
February 21, 2022
Update 2: models that lost writes:
SK Hynix Gold P31 2TB SHGP31-2000GM-2, FW 31060C20
Sabrent Rocket 512 (Phison PH-SBT-RKT-303 controller, no version or date codes listed)
February 23, 2022
Bishop tested four NVMe SSDs -- the SK Hynix Gold P31 2TB, Sabrent Rocket 512GB, Samsung 970 Evo Plus 2TB, and the Western Digital Red SN700 1TB -- in an effort to see how these drives behave during an unexpected power outage.
The SK Hynix Gold and Sabrent Rocket lost data from the power outage after the DRAM data was "flushed," meaning the data didn't complete its final trip to the NAND. That isn't entirely unexpected given that none of these consumer-class drives have power capacitors for full power-loss protection functionality, but it does indicate that some drives may have better emergency data flushing systems even without a full-fledged power loss protection feature.
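From the host's side, a "flushed" write looks roughly like the sketch below: the program writes data, asks the operating system to push it all the way to stable storage, and only then treats it as safe. This is not Bishop's test code. The `durable_write` and `verify` helpers are hypothetical, and the sketch assumes a Unix-like system. On macOS, `os.fsync` alone does not ask the drive to flush its cache, so the sketch uses `F_FULLFSYNC` where Python's `fcntl` module exposes it.

```python
import fcntl
import os


def durable_write(path, payload):
    """Write payload, then ask the OS (and the drive) to make it durable."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(payload)
        while view:  # os.write may write fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
        full_sync = getattr(fcntl, "F_FULLFSYNC", None)  # defined on macOS only
        if full_sync is not None:
            fcntl.fcntl(fd, full_sync)
        else:
            os.fsync(fd)
    finally:
        os.close(fd)


def verify(path, payload):
    """Return True if the file content matches what was written."""
    with open(path, "rb") as f:
        return f.read() == payload
```

In a test like the one described, `durable_write` would return successfully before the cable is pulled, and `verify` would run after the drive is powered back on. A drive that honors the flush should never fail that check.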
For now, Bishop says he is going to test eight more drives, including the Intel 670p, Samsung 980 (a DRAMless drive), Crucial P5 Plus, and more to see how various drives handle power loss.
Tomorrow I'll have results for:
Intel 670p
Samsung 980
WD Black SN750
WD Green SN350
Kingston NV1
Seagate Firecuda 530
Crucial P2
Crucial P5 Plus
February 23, 2022
Aaron Klotz is a contributing writer for Tom’s Hardware, covering news related to computer hardware such as CPUs and graphics cards.
Alvar "Miles" Udell
A line interactive pure sine wave UPS is as important as your power supply.
ingtar33
i'm going to be honest, even if the ssd has power loss protection, you're inviting disaster by using a ssd without a UPS. it's peak stupidity. and frankly the fact people need to be told this is a little troubling to say the least.
SSDs have no mechanical storage. period. their storage is all electronic. any power loss, brownout, power supply blowing up, cap going on the motherboard, power fluctuation of any type will endanger your data. not running with a surge protector/battery backup is like inviting the only significant danger to your storage into the house.
Sleepy_Hollowed
I'd love to know which drives were tested that failed from certifiers.
I used to buy Crucial end of the line drives with advertised protection, which I think was tested.
I've got UPSes and laptops have batteries.... but still, it's quite a gamble to rely on OSes now a days to properly shut down when certain things are running, or if there's users logged in.
watzupken
Power lost is a problem not just for SSD with DRAM, but also mechanical drives. So I am not sure what is the point of this finding. For most people, this is a non-issue. If one is constantly writing/ reading critical data, then there must always be a backup power, or at least they should be getting SSDs with power loss protection where the capacitors can store power to allow the write to be written to the NAND.
Soul_keeper
This is good, we need to know this. I hope he can afford to test many of the current drives.
King_V
Welp, I haven't received it yet, but I guess I'll be returning the SK Hynix Gold P31 and getting the Samsung 970 Evo Plus instead.
Given that WD and SK are also putting terms of service stickers on their SSDs, according to Russ. That's a strike against those brands, in addition to the particular models that fail on power loss.
cryoburner
ingtar33 said:
i'm going to be honest, even if the ssd has power loss protection, you're inviting disaster by using a ssd without a UPS. it's peak stupidity. and frankly the fact people need to be told this is a little troubling to say the least.
SSDs have no mechanical storage. period. their storage is all electronic. any power loss, brownout, power supply blowing up, cap going on the motherboard, power fluctuation of any type will endanger your data. not running with a surge protector/battery backup is like inviting the only significant danger to your storage into the house.
A UPS isn't going to help if the power supply or motherboard randomly fails though. Really, if the data is important enough that you wouldn't want to risk losing it, then it should be backed up on another drive as well to minimize potential loses.
In any case, testing how a drive handles loss of power seems rather useful, and makes me wonder why drive reviews don't test for that, seeing as different drives may handle the situation differently.
plonk420
that makes me a bit nervous. i have a few SK Hynix S31s (SATA version of the P31) O_o
mgerdts
"That isn't entirely unexpected given that none of these consumer-class drives have power capacitors for full power-loss protection functionality, but it does indicate that some drives may have better emergency data flushing systems even without a full-fledged power loss protection feature."
Actually, it is unexpected. The tests that were performed involved issuing a write command followed by a flush command. Per the NVMe 1.4 specification:
6.8 Flush command
The Flush command is used to request that the contents of volatile write cache be made non-volatile. If a volatile write cache is enabled (refer to section 5.21.1.6), then the Flush command shall commit data and metadata associated with the specified namespace(s) to non-volatile media. The flush applies to all commands for the specified namespace(s) completed by the controller prior to the submission of the Flush command. The controller may also flush additional data and/or metadata from any namespace.
Data that the drive said was written to stable storage via a successful completion of a flush command was not present on the drive when reading it after returning power to the drive. That means that these drives are not conforming to the specification.
The power loss protection feature is addressed in section 5.21.1.6, mentioned in the earlier quote.
5.21.1.6 Volatile Write Cache (Feature Identifier 06h), (Optional)
This Feature controls the volatile write cache, if present, on the controller. If a volatile write cache is present (refer to the VWC field in Figure 251), then this feature shall be supported. The attributes are specified in Command Dword 11.
Note: If the controller is able to guarantee that data present in a write cache is written to non-volatile media on loss of power, then that write cache is considered non-volatile and this feature does not apply to that write cache.
That is, drives that support power loss protection will use DRAM + a battery/capacitor to make it so that the write cache is not considered a volatile cache. As such, all acknowledged writes are considered to be on stable storage, even if not followed by a flush. In fact, section 6.8 says that in such a case, a flush command is a no-op if a sanitize operation is not in progress.
If a volatile write cache is not present or not enabled, then Flush commands:
• shall complete successfully and have no effect if a sanitize operation is not in progress; and
• may complete successfully and have no effect if a sanitize operation is in progress.
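The rules quoted above from sections 6.8 and 5.21.1.6 can be read as a small decision function. This is only a sketch of the quoted text, not code from the specification or from any driver; the function name, its parameters, and the returned strings are hypothetical.

```python
def flush_requirement(vwc_present, vwc_enabled, sanitize_in_progress):
    """Summarize what the quoted NVMe 1.4 text requires of a Flush command."""
    if vwc_present and vwc_enabled:
        # A volatile write cache is in use, so Flush has real work to do.
        return "shall commit data and metadata to non-volatile media"
    if sanitize_in_progress:
        return "may complete successfully with no effect"
    return "shall complete successfully with no effect"


# A consumer drive with an enabled volatile DRAM cache:
print(flush_requirement(True, True, False))
# A drive whose cache is protected against power loss (not a volatile cache):
print(flush_requirement(False, False, False))
```

Under this reading, a drive with an enabled volatile write cache has no way to report a successful Flush without the data being on non-volatile media, which is the basis for the claim that losing flushed data means the drive is not conforming to the specification.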