Last year it was revealed that contractors around the world listened to recordings from Google Assistant, Alexa, Siri and Cortana to help train the voice assistants' speech recognition capabilities. Microsoft was sharing recordings from Skype, too, and now The Guardian has reported that the company sent those recordings to contractors in China with very few safeguards in place to keep them secure.
The Guardian's report was based on information provided by "a former contractor who says he reviewed thousands of potentially sensitive recordings on his personal laptop from his home in Beijing over the two years he worked for the company." That anonymous source revealed that Microsoft emailed a URL, username and password to contractors so they could access the Cortana and Skype recordings.
There are, of course, worrying aspects of that arrangement. Microsoft allowed English recordings (the contractor specialized in British English) to be accessed from China, which could have exposed the recordings to the Chinese government. And Microsoft shared the credentials used to access these recordings via email, which could have been intercepted or compromised.
But it gets worse. The Guardian reported that Microsoft generated the usernames and passwords used to access this system. The usernames were said to follow "a simple schema," which suggests they would have been fairly easy to guess, and the password was "the same for every employee who joined in any given year." Contractors were allowed to work from home, too, without direct supervision.
That kind of setup would have been relatively easy to abuse. Someone might have guessed a contractor's username-password combination. Contractors might have shared credentials with people outside the company, or recordings might have been played within earshot of people unaffiliated with Microsoft. It's almost surprising that there haven't been reports of unauthorized access to the recordings.
Microsoft offered The Guardian the following statement in response to its report:
"We review short snippets of de-identified voice data from a small percentage of customers to help improve voice-enabled features, and we sometimes engage partner companies in this work. Review snippets are typically fewer than ten seconds long and no one reviewing these snippets would have access to longer conversations. We’ve always disclosed this to customers and operate to the highest privacy standards set out in laws like Europe’s GDPR.
"This past summer we carefully reviewed both the process we use and the communications with customers. As a result we updated our privacy statement to be even more clear about this work, and since then we’ve moved these reviews to secure facilities in a small number of countries. We will continue to take steps to give customers greater transparency and control over how we manage their data."
Moving to "secure facilities in a small number of countries" is a start. But Microsoft and other companies could clearly do a better job of securing private conversations that many people don't realize will be shared with anyone but their voice assistant of choice. This Guardian report almost certainly won't be the last to reveal problems with the management of these kinds of recordings.