Microsoft Advances Privacy-Enabling 'Homomorphic Encryption' For Analyzing Biomedical Data

Microsoft has released a "manual for homomorphic encryption for bioinformatics," building on advances made over the past few years in this type of encryption, which allows companies to do computation on encrypted data without having to decrypt it first. This gives almost perfect privacy to data stored in someone else's "untrusted" cloud, because the cloud service provider never has to see the data in unencrypted form when analyzing it.

Whether it's the OPM, Target, or even some medical institutions, storing the data of millions of people on centralized servers has become an increasing risk as more attackers target that data, especially when it's kept on less secure servers.

Companies or medical institutions could outsource the security of that data to a well-recognized cloud service provider such as Microsoft, but even if that provider has much better experience with security, it can't guarantee that its servers will never be breached. Plus, many companies, or countries, don't trust their data to be seen and possibly analyzed by other third-party or foreign companies.

To address that issue, Microsoft has begun allowing its customers to encrypt their data with their own keys, and it has recently made deals with "data trustees" in other countries, which follow local laws more strictly before giving the data to Microsoft.

However, there's another scenario in which customers might want to take advantage of a third party's analytics capabilities but are still worried about the privacy implications. For instance, Alphabet, Google's parent company, has started analyzing genetic information to find cures for diseases and extend the human lifespan. However, that means the owners of those genes need to trust Google -- or law enforcement -- not to abuse that data, which may be a lot to ask of many people.

This is why homomorphic encryption has been seen as a kind of "holy grail" for this type of situation, because the cloud service provider never has to see an individual's data, but it can still obtain results from that encrypted data.
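To make the idea of "obtaining results from encrypted data" concrete, here is a minimal sketch using a toy version of the Paillier cryptosystem, which supports addition on ciphertexts. This is not Microsoft's leveled homomorphic encryption scheme and is not taken from its manual; the function names, the tiny primes, and the sample values are assumptions chosen purely for illustration, and the parameters are far too small to be secure.

```python
import math
import secrets

def keygen(p=1009, q=1013):
    """Toy Paillier key generation. The small primes are illustrative only."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                  # common simplifying choice of generator
    mu = pow(lam, -1, n)       # modular inverse of lam mod n (Python 3.8+)
    return (n, g), (lam, mu)   # (public key, private key)

def encrypt(pub, m):
    """Encrypt an integer m with 0 <= m < n under the public key."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    """Recover the plaintext with the private key."""
    n, _ = pub
    lam, mu = priv
    l = (pow(c, lam, n * n) - 1) // n
    return (l * mu) % n

def add_encrypted(pub, c1, c2):
    """Run by the untrusted server: multiplying two ciphertexts
    yields an encryption of the sum of their plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)

if __name__ == "__main__":
    pub, priv = keygen()                 # the data owner keeps priv
    c1 = encrypt(pub, 120)               # hypothetical measurement
    c2 = encrypt(pub, 135)               # hypothetical measurement
    c_sum = add_encrypted(pub, c1, c2)   # server never sees 120 or 135
    print(decrypt(pub, priv, c_sum))     # owner decrypts the result
```

In this sketch the server only ever handles `c1`, `c2`, and `c_sum`, none of which reveal the underlying values without the private key. Schemes that support only one operation, like this one, are called partially homomorphic; leveled and fully homomorphic schemes also allow multiplication on ciphertexts, which is what makes more general analysis possible and also what makes them much slower.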

Right now, companies say they already analyze data in aggregate to maintain user privacy, but only because that's their policy, and they decide which data to remove from their analysis. They could always secretly decide not to follow that rule and identify a certain person's data, because they already have the data collected in individual form. Homomorphic encryption actually enforces individuals' privacy simply because of the way it works.

The problem with homomorphic encryption so far is that it has been many orders of magnitude slower than conventional encryption systems, which made it impractical to use. Microsoft said that its "leveled homomorphic encryption" is now at a performance level that makes it practical, at least for certain cases such as analyzing biomedical data. It's still too slow and expensive to use with email data at present, for instance.

Bioinformatics is a new and emerging field, and it has much stricter privacy requirements, especially if the companies working in the industry want people to volunteer their genetic data. This makes homomorphic encryption more appealing, even if it's still slower than regular encryption. Eventually, with more research and faster computers, we could see it used for other types of information as well.

Another advantage of homomorphic encryption is that even if the data is stolen, it remains in encrypted form, so the attackers have no way of retrieving private data about individuals.

There are already some projects, such as Mylar and CryptDB from MIT, that offer similar guarantees while encrypting data in a more traditional way. They are also potentially better suited to something like email services than to biomedical data, because individual users hold the encryption keys to their own data, which is stored on a third-party server and can be decrypted only in their browsers. Google has been experimenting with a CryptDB extension (Encrypted BigQuery) for its own databases, and Microsoft will also apply some of the lessons from CryptDB to its SQL Server 2016.

______________________________________________________________________

Lucian Armasu joined Tom’s Hardware in early 2014. He writes news stories on mobile, chipsets, security, privacy, and anything else that might be of interest to him from the technology world. Outside of Tom’s Hardware, he dreams of becoming an entrepreneur.

You can follow him at @lucian_armasu.

Lucian Armasu
Lucian Armasu is a Contributing Writer for Tom's Hardware US. He covers software news and the issues surrounding privacy and security.
  • kbannan
    Interesting. But pointless unless the organization takes a strategy that covers all the security bases -- not just one or two.

    --KB

    Karen Bannan, commenting on behalf of IDG and Dell
  • tsnor
    @lucian_armasu I could not find additional information on the "manual for homomorphic encryption for bioinformatics". The other stories I read were based on yours. Any pointers you can provide toward additional information? Thanks

    If the encryption is now fast enough to handle in software that is interesting news indeed.
  • tsnor
    16958951 said:
    Interesting. But pointless unless the organization takes a strategy that covers all the security bases -- not just one or two.

    --KB

    Karen Bannan, commenting on behalf of IDG and Dell

    Can you elaborate on why a very specialized niche encryption like homomorphic encryption is pointless unless it covers all security bases? At one point HTTPS was considered an oddball niche also. Or perhaps I did not correctly understand your post.

  • Lucian Armasu
    @tsnor There's a source link at the top. I'm not sure if you saw that. On that page there's a PDF link in the top right corner. That's Microsoft's paper on it. There's also a link in it for the code, but it's not live yet.