SSD Recovery: How Pros Bring Flash Memory Back To Life

When Bad Things Happen To Good Flash

In November of last year, I wrote Disaster Strikes: How Is Data Recovered From A Dead Hard Drive?, chronicling the process as some of my personal storage was brought back from the dead by Seagate Recovery Solutions. Of course, these days we have to worry about more than the loss of important files from mechanical disks. Whether it’s the USB drive in your pocket, the eMMC in your phone, or the SSD in your notebook, flash storage is now just as critical to personal and professional interests and, like hard drives, just as prone to failure.

As a follow-up to our coverage last year, we connected with Flashback Data, an Austin-based rescue lab that handles all types of storage devices but carries special expertise in flash media. We even got a special kick when calling the company’s main office and finding Queen’s “Flash” as the on-hold music. Flashback agreed to take us on a tour behind the scenes and show what it takes for a top-level recovery lab to salvage your precious NAND-based bits from the jaws of ruin.

A Range Of Reading

When Flashback first got into the recovery business, most of its activities focused on swapping out faulty chips. Over time, this became increasingly difficult as vendors started sourcing different components for different manufacturing runs of the same drive model. Encryption also began to appear on some drives, making matters even more complex. This required Flashback to be able to read the drive’s memory directly, which in turn meant that the lab needed a dizzying host of ways to read chips from across the breadth of the flash industry.

Note that when Flashback refers to “encryption,” this state is generally unknown to the user. Since about 2006, for example, SanDisk has been encrypting data on all of its flash drives, according to co-founder and vice president Russell Chozick. As with self-encrypting hard drives, the controller encrypts all data as it proceeds to flash memory. However, since no password is employed to lock the encryption, data is decrypted as it gets fetched from the media. So in the case of a broken PCB, Flashback attempts to move the controller and memory chips to a new board. “If the controller chip fries, though, it’s going to be almost impossible to get that data back. The controller keeps all of the information about how to decrypt the data. Lose that and you’re pretty much...well, you’ve got issues, big time.”
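A toy model makes the dependency on the controller concrete. The sketch below is not SanDisk's scheme or any real controller's cipher. `ToyController` and its SHA-256-based keystream are hypothetical stand-ins, chosen only to show why data read through the original controller comes back intact while the same memory behind a different controller comes back as noise.

```python
import hashlib
import os
from typing import Optional


class ToyController:
    """Hypothetical flash controller that encrypts transparently with a built-in key."""

    def __init__(self, key: Optional[bytes] = None):
        # The key lives only inside the controller; the user never sees it.
        self._key = key if key is not None else os.urandom(32)
        # What is physically stored on the memory chips (ciphertext).
        self.flash = bytearray()

    def _keystream(self, length: int) -> bytes:
        # Toy keystream: SHA-256 of key + counter. Illustration only, not secure design advice.
        stream = bytearray()
        counter = 0
        while len(stream) < length:
            stream += hashlib.sha256(self._key + counter.to_bytes(8, "big")).digest()
            counter += 1
        return bytes(stream[:length])

    def write(self, data: bytes) -> None:
        # Data is encrypted on its way to flash.
        keystream = self._keystream(len(data))
        self.flash = bytearray(a ^ b for a, b in zip(data, keystream))

    def read(self) -> bytes:
        # Data is decrypted as it is fetched; no password is involved.
        keystream = self._keystream(len(self.flash))
        return bytes(a ^ b for a, b in zip(self.flash, keystream))
```

In this model, moving `flash` to a board that still carries the original controller preserves the key, so `read()` returns the user's data. Pairing the same `flash` contents with a fresh `ToyController` (a different key) returns bytes that no longer match what was written, which mirrors the fried-controller case Chozick describes.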

Flash Types

These dark gray chips are of the TSOP48 (48-pin) variety. They were fairly standard on USB flash drives, SSDs, SD cards, and CF cards for years, although they have started to give way to other formats recently. The bottom specimen is the underside of a TLGA chip. Notice how, instead of pins on the sides, the TLGA has pads on its underside. TLGAs are common in all types of flash and are in newer iPhones, as well.

During recovery, Flashback used to plant TSOP48 chips into reader sockets, but TLGAs had to be soldered onto a board. Obviously, this made analysis and data retrieval much harder. Life hasn’t gotten any easier as smartphones have pushed flash memory into newer, smaller package types that make these older “monolithic” formats look simple in comparison.

Flash Types, Continued

These SD cards and the LaCie USB thumb drive are also monolith chips. Whereas most flash cards have a separate controller chip and memory chip, a monolith has both components contained in one tiny package. Obviously, breaks in such devices can happen at any of many different points. If the controller itself fails, technicians can still access the data through means other than the regular access pins where the device would connect to a card reader or camera/phone. To illustrate, this photo shows the LaCie drive with its traces partially exposed. Recovery technicians have to remove the black solder mask over the traces to find the points they need to connect to a logic analyzer. Once all of the points are identified, the card can be wired up similarly to images later in this article.

To remove the mask, Flashback uses a surprisingly mundane approach: rubbing compound and a buffing wheel. It is possible to use chemicals for the same purpose, but Chozick says that Flashback has better luck with slow, careful buffing. With sanding, it’s too easy to damage the flash product’s very fine traces. We asked Chozick if Flashback could wire up the LaCie drive to illustrate, but we changed our minds upon learning that such work can take a technician an entire day.

Typical Flash Drive Failures

We’ve all seen pictures of hard drive damage, most of which tends to involve head crashes with circular grooves plowed into the magnetic media. With SSDs and flash products, nearly all of the damage that Flashback sees is invisible. In rare cases, there might be a burn mark on a PCB, but by and large, broken controllers or burned fuses leave no visible evidence. As a result, technicians have to go through the drive, testing each resistor, in a long, laborious process of trial and discovery. In comparison, a clean connector break like the one shown here is a cakewalk for repair techs.

What About Wear-Out?

We’ve written previously about the tug-o’-war endurance battle between improving leveling algorithms and higher capacities versus shrinking fab processes. In particular, we’ve worried that flash drives and smaller SSDs that have been in service for several years now might start to exhibit wear-out.

Fortunately, Chozick says that most of the SSDs Flashback receives are less than a year old and haven’t had time to show NAND wear-out. In fact, actual wear-out cases are extremely rare. With USB flash drives, though, especially older ones with lower-grade leveling algorithms, wear-out is a bit more common. Technicians can read the chips just fine, but when they check the data, there are so many ECC bit errors that no data can be extracted. The four red dots in a later image show ECC problems. Major wear problems would look like the opposite: mostly red, with maybe four green dots.

Chozick says he has seen cases in which techs would do one analysis, take the chip out to, say, clean the solder pads, bring it back, and the data would be even worse because of the additional reading. So yes, wear-out is a real danger, but it’s not the ever-looming crisis some might fear.
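One way to put a number on "even worse" is to compare two raw dumps of the same chip bit by bit. The sketch below is an illustration only, not Flashback's tooling and not an ECC implementation. The function name `count_bit_differences` is hypothetical, and it assumes both dumps were taken from the same chip and have the same length.

```python
def count_bit_differences(read_a: bytes, read_b: bytes) -> int:
    """Count how many bits differ between two equal-length dumps of the same chip."""
    if len(read_a) != len(read_b):
        raise ValueError("dumps must be the same length to compare")
    # XOR leaves a 1 wherever the two reads disagree; count those 1 bits per byte.
    return sum(bin(a ^ b).count("1") for a, b in zip(read_a, read_b))
```

A count of zero means the two reads agree everywhere. A count that keeps climbing between successive reads would be consistent with the degradation described above, although it says nothing about which read is closer to the original data.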

Get It Hot

Many times, chips will need to be removed from PCBs with the help of a solder rework station. One of the first tools in this process is hot air. In this image, technicians are removing a TLGA chip from a USB drive. Technicians control the temperature and air pressure, heating the device just enough to melt the solder points so that the chip can be removed. These reworking stations also contain soldering irons, flux, ohmmeters, and other diagnostic equipment. Several of these stations occupy Flashback’s main lab, which spans roughly 5000 square feet.

Removing Memory

This SSD’s controller is toast, so Flashback technicians begin the gentle tearing down of the drive. Each memory chip is hand-numbered for tracking and easier reassembly of the data.

“Sometimes we won’t know exactly which components are bad,” says Chozick. “We just know that this is the type of drive where we see this or that firmware failure. Or this type of drive often has this kind of failure, so we need to remove the chips to start working on it. Obviously, our customers are often in a hurry, so many times you don’t get to know the exact reason why something fried or what is fried. But you do know that you’re not going to get it to read through this controller, and it’s not encrypted, so we can just start pulling chips, get them read, and do a rebuild.”

Pulling Chips

Flash drives and SSDs aren’t the only devices to get the heat treatment. Flashback sees a steady stream of cell phones come through its labs, such as this HTC Evo Android-based phone that drew its last breath in a swimming pool. Flash data recovery services run from the hundreds into the thousands of dollars, so it’s a safe bet that this phone’s contents weren’t your typical kid and kitty videos. Chozick says that it’s not uncommon to see phones come in containing the last known images of a departed friend or loved one. They also receive phones regularly as part of criminal investigations. A perp might try to crush his smoking gun underfoot, so to speak, but if the flash memory remains intact, the data can usually be retrieved for judge and jury.

The Evo is a couple of years old now. Newer phones, such as the Samsung Galaxy series and several others from HTC, often contain eMMC technology, which features the controller built into the memory module, as on an SD card. This can make retrieval considerably easier.

Hard Drive Vs. Flash Memory

The service area of a hard disk contains information that lets the drive communicate with itself. For the heads to be able to translate data into the read/write channels, the device must have information about where bad sectors are, how many heads there are, which are turned on or off, and so on. This information resides on the platters in a special service area separate from the user-addressable space.

With flash, manufacturers also leave room for a service area. This contains all the information about error correction codes, whether there’s a bit error in any given sector, where those sectors are located, etc.

Whereas a hard drive would be composed mostly of 512-byte sectors, flash memory typically uses 528-byte sectors, where 512 bytes would be data and 16 bytes would be the service area. SSDs end up translating down to that user-accessible 512-byte size. But when Flashback reads the raw data, technicians get both pieces. The data area gets mixed in with the service area, so the resulting dump looks like data, service area, data, service area, alternating over and over. When technicians reassemble the image into workable information, all of the service area parts have to be removed.
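The stripping step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Flashback's actual process: it assumes every raw sector is exactly 528 bytes, with 512 data bytes followed by 16 service bytes, as in the figures above. Real chips may lay out their service area differently, and the function name `strip_service_area` is hypothetical.

```python
SECTOR_SIZE = 528  # raw sector size from the text: data plus service area
DATA_SIZE = 512    # user data portion of each raw sector


def strip_service_area(raw: bytes,
                       sector_size: int = SECTOR_SIZE,
                       data_size: int = DATA_SIZE) -> bytes:
    """Keep the data bytes of each raw sector and drop the trailing service bytes."""
    if len(raw) % sector_size != 0:
        raise ValueError("dump length is not a whole number of raw sectors")
    image = bytearray()
    for offset in range(0, len(raw), sector_size):
        # Assumed layout: data first, then the service area.
        image += raw[offset:offset + data_size]
    return bytes(image)
```

Given a dump of two raw sectors (1056 bytes), the function returns 1024 bytes of user data. The service bytes are discarded here for simplicity; in practice they carry the error correction information a lab would want to consult before throwing them away.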

Image: http://commons.wikimedia.org/wiki/File%3ADisassembled_HDD_and_SSD.JPG. By Rochellesinger at en.wikipedia [CC-BY-SA-2.5], from Wikimedia Commons.

Getting Up Close

Sometimes, technicians need to conduct a close visual examination of flash chips and their fragile innards. The best tool for this job is the Mantis microscope from Vision Engineering. Each unit costs $2000 or so, but they allow recovery workers to go hands-free and examine circuitry in 3D (via twin light paths projecting through a single viewing lens) at up to 20x magnification. The more natural experience and comfort of the Mantis helps technicians discover problems they might otherwise miss with conventional eyepiece microscopes. It also greatly helps with solder work, both in disassembly and repair.

  • Eggz
    What did I just look at? Maybe it's displaying wrong on my screen, but there were no words and the pictures were pretty bad.
  • Snipergod87
    Needs more JPEG.
  • kamhagh
    I would push on the SSD's chest and yell BREATHE, BREATHE!!!!!
  • artk2219
    Thanks for the story! Very cool to get a general overview of the flash media based file recovery process.
  • TechnoD
    Intriguing article. This is the kind of stuff I enjoy reading on Tom's.
  • mouse24
    These image articles are annoying to read.
  • TyrOd
    "In an earlier article we did on data recovery, at least one commenter noted that essentially anyone could get into the recovery business and that Flashback was a small fry operation on a completely different level than more recognized names. Of course, the proof is in the recovery results and the client roster, which includes a broad spectrum of commercial and government accounts."

    No. Nobody was conflating the physical (or financial) size of the business or name recognition with the level of their technical capabilities.

    In fact, the biggest and most recognized data recovery labs are the most notorious for abandoning attempts at complex recoveries due to the sheer volume of cases they receive.

    The point that you are still missing is that everything you saw at Flashback was commercially available to start-ups.

    In fact there are literally dozens of labs out there that have sprung up in the past 10 years with the same array of tools.

    One laughably obvious example of the basic marketing spin is slide 12.

    Where exactly do you think "Flashback's Special Imagers" came from?

    How exactly do you call a commercially available tool, sold by the thousands to data recovery start-ups for years, "Special"?

    Also, I literally laughed out loud when I read the bit about ASCLD cert.

    Do you really believe that ASCLD Lab Cert actually says anything about the technical capabilities of a forensic lab?

    ASCLD lab is an operational certification and it's completely irrelevant to Data Recovery Labs on the technical level.
    How do I know this? Because I called them and asked.
    I verbatim said "what criteria are used to test for the technical capabilities of a proprietary piece of hardware in Recovering forensically sound images?"
    Their response was "...as long as it does what you say it does"
    Funny, right?

    Though Flashback seems to be making an honest effort, at least from a marketing standpoint, of doing complex recoveries (like monolithic/SSD Flash rebuilds), it says very little about their actual ability to minimize corruption and maximize recovered data.

    I was actually impressed by the thoroughness of this article, but unfortunately it could have been done at dozens of other labs with similar technology.
  • TyrOd
    I also want to mention that ASCLD lab is definitely a robust qualification for Forensic Labs. But it only extends its relevance to the Computer Forensics aspects that only sparsely overlap with physical data recovery.

    You actually see a lot of this cert padding among bigger labs, because people don't understand how different a data recovery Lab is from the rest of the IT world.

    For example: If you use a piece of encryption software or Virtual machine platform of some kind and something goes wrong, how useful is having an engineer certified by the software maker?

    For the IT professional, the answer is probably "useful," generally speaking.

    What about when there are physically damaged sectors and it's not possible to recover a few hundred sectors?

    Then what?

    Well you call the encryption software maker...you get elevated to their highest level of support...and they tell you it's not possible...in fact they don't even understand the difference between bits and sectors, claiming literally every single bit has to be read to decrypt with their software.

    That's the normal, everyday reality of how far a certification, even from manufacturers, will get you in the Data recovery field.

    Beyond that you're on your own to develop solutions.
  • Eggz
    13365168 said:
    Texts are on the right side of the images. Perhaps you need to update your browser to make images along with text visible.

    Thanks. For some reason, the computer I was using didn't update the text corresponding to pictures. So when I scrolled photos, the text for only the first photo was displayed. Got it now. Great article! Thanks for keeping a flow of stuff like this.
  • smeezekitty
    This is why on-the-fly encryption is a bad idea. And not only that, it's pointless.
    Because if someone has access to the drive it does exactly zero. It only stops reads directly from the flash chips.