Nuance Communications, the developer behind Dragon for PC and Swype, said that Dragon Assistant Beta is now available for Intel-powered Ultrabooks. The Dell XPS 13 Ultrabook will be among the first PCs to ship with Dragon Assistant Beta in Q4 2012.
The result of a "strategic" collaboration between Intel and Nuance first announced back at CES 2012, Dragon Assistant Beta allows users to speak to their Ultrabook to search the web, find content, play music, check and reply to email, update social statuses and more. Ultrabook owners simply say "Hello Dragon" to awaken Dragon Assistant, and then say "go to sleep" when they want the software to stop taking commands.
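The wake/sleep interaction described above is essentially a small state machine: utterances are ignored until the wake phrase arrives, treated as commands while awake, and ignored again after the sleep phrase. Dragon Assistant's actual API is not public, so the phrases and the loop below are purely an illustrative sketch:

```python
# Minimal sketch of a wake-word command loop like the one Dragon
# Assistant uses. The recognizer itself is out of scope here; we
# assume each utterance has already been transcribed to text.

WAKE_PHRASE = "hello dragon"
SLEEP_PHRASE = "go to sleep"

def command_loop(utterances):
    """Yield only the utterances spoken while the assistant is awake."""
    awake = False
    for text in utterances:
        phrase = text.strip().lower()
        if not awake:
            if phrase == WAKE_PHRASE:
                awake = True          # start taking commands
        elif phrase == SLEEP_PHRASE:
            awake = False             # stop taking commands
        else:
            yield text                # treat as a command

commands = list(command_loop([
    "open the door",          # ignored: assistant is asleep
    "Hello Dragon",           # wake phrase
    "play some music",        # command
    "go to sleep",            # sleep phrase
    "check my email",         # ignored again
]))
# commands == ["play some music"]
```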
Intel also announced today the availability of the Intel Perceptual Computing SDK 2013 Beta, which includes the voice SDK components from Nuance. According to Nuance, this gives developers the ability to leverage the power of Dragon to create applications and experiences with natural, intuitive voice interactions.
"Dragon Assistant is a direct result of Nuance and Intel’s vision for a more human, natural interaction between people and their technology. You speak and the Ultrabook responds. Working closely with Intel, we’ve created a voice assistant experience optimized for the Ultrabook – incredibly fast, reliable, and with the performance you expect from a combined Nuance-Intel innovation. Dragon Assistant drives productivity, creativity and simply a PC experience that fits today’s busy lifestyle," said Peter Mahoney, Chief Marketing Officer, Nuance Communications.
Dadi Perlmutter, Intel chief product officer, demonstrated Dragon Assistant Beta today at the Intel Developer Forum in San Francisco. Perlmutter claims that the software "is running native on the platform, this is not a cloud service, this requires the high-performing CPU and the capabilities inside." He also said that Intel worked with Nuance to fine-tune the voice recognition software for its processors to maximize performance.
Nuance's flagship product is Dragon NaturallySpeaking. The company's technology also powers Apple's Siri voice search application which, according to numerous misleading commercials, can seemingly carry on a conversation with the user similar to the way the ship-wide computer used on Picard's U.S.S. Enterprise would respond to the crew.
I was able to significantly improve my accuracy by upgrading from the crappy headset the program comes with to a Blue Yeti microphone, but even then, it makes more mistakes than I would like.
Since its errors are with word choice and not spelling, it is difficult to catch them when you use it to type up a 20-page paper.
Some voice software like this can adapt to your own personal voice, so although it might take a while, it might eventually be able to adjust to someone's accent.
God yes, same here. The Blue Yeti makes that program almost perfect, but I still get problems, like the program not being able to correct for some reason and such. It's a shame, because I would use it more often if it would correct, and stop using the word "ship" when I say a... similar word.
How well does this technology differentiate between a statement directed to it (e.g. "open 'c:\The Door.docx'") and a statement made in the ambient environment (e.g. "Open the door.")? I get the "hello dragon" and "go to sleep" part, but is there something like Siri where you address the device by name, or does it simply assume that you're talking to it?
Here's where I think MSFT should take this:
First, use a pair of microphones so that you can determine the direction of the speaker.
Second, when you address the device, either with an individual statement such as, "Siri, find me a Chinese restaurant nearby" or with a batch statement such as, "Siri, record the following dictation," the following should occur:
- the device should identify and authenticate you using voiceprint recognition on its identifier - in other words, how you say "Siri" (or whatever name you choose to give it). If you have set up casual authentication, it automatically changes context to your associated Windows login - and incorporates all of your favorites, characteristics, etc. that you have defined in that login. If you have set up strict authentication, the device should require some form of challenge-response authentication before it will accept any command on your behalf.
- the device should give a visual cue that it is in command mode, and is listening to you. Maybe a pair of semi-transparent eyes focused in your direction, positioned on the screen based on config preferences. The visual characteristics of the cue could be used to indicate who you have been authenticated as (to help avoid mistakes in a crowded room).
- audible and visual representations of the understood command should be (configurably) echoed so that you can verify it understood you accurately.
- identifications and commands from other individuals should be ignored for the duration of the command or session, which ends with a dismissal (e.g. "thank you Siri" or "good bye, Siri"), at which point the visual cues should clear from the screen.
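The session flow proposed above can be sketched roughly. Everything here is hypothetical - the class name, the cue strings, and the simplistic speaker check standing in for real voiceprint recognition - it only illustrates the authenticate, cue, echo, and dismiss sequence:

```python
# Hypothetical sketch of the proposed voice-session flow: address the
# device by name to open a session, ignore other speakers mid-session,
# echo understood commands, and clear cues on a dismissal phrase.

class VoiceSession:
    def __init__(self, name="Siri", strict=False):
        self.name = name.lower()
        self.strict = strict      # strict mode would demand challenge-response
        self.user = None          # authenticated speaker, if any
        self.events = []          # visual/audible cues, recorded for illustration

    def hear(self, speaker, text):
        phrase = text.strip().lower()
        if self.user is None:
            # Only a statement addressed to the device by name opens a session.
            if phrase.startswith(self.name):
                self._authenticate(speaker)
            return
        if speaker != self.user:
            return                # ignore other individuals mid-session
        if phrase in (f"thank you {self.name}", f"good bye, {self.name}"):
            self.events.append("cue cleared")
            self.user = None     # session over
            return
        # Echo the understood command so the user can verify it.
        self.events.append(f"echo: {text}")

    def _authenticate(self, speaker):
        # A real implementation would run voiceprint recognition on how
        # the speaker said the device's name; we just trust the label.
        if self.strict:
            self.events.append("challenge-response required")
        self.user = speaker
        self.events.append(f"eyes cue: authenticated as {speaker}")

session = VoiceSession()
session.hear("alice", "Siri, record the following dictation")
session.hear("bob", "open the door")   # ignored: not the session owner
session.hear("alice", "find me a Chinese restaurant nearby")
session.hear("alice", "thank you Siri")
```

In this sketch the on-screen "eyes" cue is reduced to an event string, and authentication is keyed off the session opener, so a second speaker's commands are dropped until the owner dismisses the session.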
Anyway, it's nice that they're trying to make it easier to use a computer. I suppose we'll have to see how well they designed the voice interface.