New Microsoft Tech Translates Speech in Near Real Time

On October 25, during Microsoft Research Asia’s 21st Century Computing event in Tianjin, China, the company's Chief Research Officer Rick Rashid demonstrated new speech-to-speech translation technology that not only converts spoken English into spoken Mandarin Chinese in real time but also keeps the speaker's voice intact.

In a blog post published on Thursday, Rashid said Microsoft's new software translator is based on a technique called Deep Neural Networks, or DNN. It ditches the currently standard "hidden Markov modeling" technique (which is based on training data from several speakers) in favor of an approach patterned after human brain behavior in order to better recognize and mimic proper speech patterns.

Rashid said that by taking the gray matter route, his team has seen a 30-percent reduction in word errors when compared to the older Markov method. That means only one out of seven or eight words is incorrect, compared to the old method's error rate of one in every four or five words.
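As a rough sanity check on those figures, the short sketch below computes the relative reduction implied by going from "one in four or five" to "one in seven or eight". The function name and the choice to try all four pairings are ours, not Rashid's; the only inputs taken from the article are the four rates.

```python
from fractions import Fraction
from itertools import product

def relative_reduction(old_rate, new_rate):
    """Share of the old error rate that the new method eliminates."""
    return (old_rate - new_rate) / old_rate

# Error rates quoted in the article, as exact fractions.
old_rates = [Fraction(1, 4), Fraction(1, 5)]  # one in four or five words
new_rates = [Fraction(1, 7), Fraction(1, 8)]  # one in seven or eight words

# Try every old/new pairing, since the article gives ranges, not exact values.
reductions = [relative_reduction(o, n) for o, n in product(old_rates, new_rates)]
lowest, highest = min(reductions), max(reductions)

print(f"relative reduction: {float(lowest):.1%} to {float(highest):.1%}")
# prints "relative reduction: 28.6% to 50.0%"
```

Depending on which ends of the two ranges are paired, the implied improvement falls between roughly 29 and 50 percent, which is in line with the 30-percent figure quoted above.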

"While still far from perfect, this is the most dramatic change in accuracy since the introduction of hidden Markov modeling in 1979, and as we add more data to the training we believe that we will get even better results," he said in the blog.

The demonstration consisted of two steps. As he spoke to the audience, the system converted his speech into text. It then located the Chinese equivalent of each word (the easy part, he said) and reordered the words to be appropriate for Chinese, a step he called extremely important for correct translation between languages.
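To make the "look up each word, then reorder" idea concrete, here is a toy sketch. It is in no way Microsoft's system: the five-word lexicon, the TIME_WORDS set and the single hand-written reordering rule are all invented for illustration, whereas a real translator learns both the vocabulary and the ordering from data.

```python
# Toy illustration only. LEXICON, TIME_WORDS and the reordering rule are
# hypothetical stand-ins, not anything described in Rashid's blog post.
LEXICON = {
    "i": "我",
    "ate": "吃了",
    "an": "一个",
    "apple": "苹果",
    "yesterday": "昨天",
}
TIME_WORDS = {"昨天"}

def lookup(english_words):
    """Step 1: replace each English word with its Chinese equivalent."""
    return [LEXICON[word.lower()] for word in english_words]

def reorder(chinese_words):
    """Step 2: move time expressions to just after the subject (first word)."""
    times = [w for w in chinese_words if w in TIME_WORDS]
    rest = [w for w in chinese_words if w not in TIME_WORDS]
    return rest[:1] + times + rest[1:]

def translate(sentence):
    return "".join(reorder(lookup(sentence.split())))

print(translate("I ate an apple yesterday"))  # prints 我昨天吃了一个苹果
```

Without the reordering step, the word-for-word output would keep the English order and leave "yesterday" at the end of the sentence, which is why Rashid singles out reordering as the important part.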

"Of course, there are still likely to be errors in both the English text and the translation into Chinese, and the results can sometimes be humorous. Still, the technology has developed to be quite useful," he said.

In the next step, the text was quickly converted into spoken Chinese while retaining the properties of his own voice. "It required a text to speech system that Microsoft researchers built using a few hours speech of a native Chinese speaker and properties of my own voice taken from about one hour of pre-recorded (English) data, in this case recordings of previous speeches I’d made," he added.

Despite the team's achievements thus far, Rashid acknowledged that the results still aren't perfect – there's much work that still needs to be done in order to reach a Star Trek level of quality. "The technology is very promising, and we hope that in a few years we will have systems that can completely break down language barriers," he said.

To see and hear how this new translation system works, check out his presentation below.


  • nieur
    now this is some real innovation
  • GoldenI
    Very fascinating. It will be interesting to see the progress of this technology.
  • viper666
    Oh Star Trek how you helped innovation...
  • hoofhearted
    Will be like in that movie "The Last Starfighter"
  • fuzzion
    viper666 wrote: Oh Star Trek how you helped innovation...
    They gave us the tablet, PC, universal translator, stun gun, etc.
  • guru_urug
    This is really good innovation. Please don't ruin it fighting over patents. Let the whole world reap the benefits. Speech translation has global applications.
  • jkflipflop98
    fuzzion wrote: They gave us the tablet, PC, universal translator, stun gun, etc.
    Let's not get carried away here. There's a huge gulf between painting a wooden block to look like a stun gun and actually making one. Yeah, ST had ideas ahead of its time, but that's all they were... ideas and grown men playing pretend.

    It took an actual smart guy to make those things real.
  • noblerabbit
    Knowing Microsoft, when someone speaks to translate 'I fed my dog this morning', the fifth language at the end of the line will say: 'You kicked my cat tomorrow, you assclown'
  • swimomatic
    Cannot wait! Now put it in my Windows Phone and in my car and I'm sold!
  • friskiest
    It's good to see innovation being treated the right way. It's getting appreciated by both sides with no bashing or pointless remarks!

    Good job!!!