'Starship Commander' Leverages Microsoft Cognitive Services For Voice Command Recognition

Human Interact revealed Starship Commander, a virtual reality choose-your-own-adventure science fiction game that places you at the helm of an interstellar starship. But unlike any other game you’ve played, Starship Commander accepts only voice commands.

Virtual reality opens new doors to creative ideas. In the early days of this new medium, there are no norms and no rules. Guidelines for what makes a compelling virtual reality experience don’t yet exist; for now, imagination and technology are the only limitations. To that end, Alexander Mejia, Owner and Creative Director at Human Interact, sought to build something truly groundbreaking for his first VR project, Starship Commander.

Starship Commander is a first-person VR narrative set in deep space. You play as the commander of an XR71 spaceship sent on a classified cargo transport mission to the “Delta system.” But more than the story, it's the input method that will raise your eyebrows. Starship Commander doesn't accept physical input commands. You must initiate all action by speaking to the computer. After all, you never see the captain of the Starship Enterprise manning the controls; commanders command.

Speech recognition is something you don't see often in games, and when you do, the implementation usually isn't all that great. Human Interact said that in its quest to bring voice commands to Starship Commander, it tried several “off-the-shelf” voice recognition technologies with little success. The team needed to be able to insert a custom dictionary to account for the made-up words in the game’s storyline, such as the names of alien races. Human Interact was also looking for a solution that could interpret natural speech so that players wouldn’t be limited to specific scripted phrases.

Human Interact turned to Microsoft’s Cognitive Services and used the Custom Speech Service to insert the custom dialect from the game’s storyline into the AI’s dictionary. During Microsoft Build 2016, Microsoft introduced 22 Cognitive Services APIs, which allow developers to integrate technologies derived from Cortana into their applications. The company demonstrated how its technology could be used to interpret the speech of a young child or to automatically identify objects in a photo and create captions to describe them. It was only a matter of time before someone found a reason to use this technology in a game.

Mejia noted that the Custom Speech Service understands how people talk and automatically generates additional recognized phrases after it receives a handful of options. He also said that Custom Speech Service cut the word recognition errors in half compared to other speech recognition services that he and his team tried.
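The article does not say how Human Interact measured recognition errors. A common way to quantify a claim like "cut the word recognition errors in half" is word error rate (WER): the word-level edit distance between what the player said and what the recognizer heard, divided by the number of words spoken. The following is a minimal sketch under that assumption; the function name and the sample phrases are hypothetical and are not taken from the game or from Microsoft's service.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate of `hypothesis` measured against `reference`.

    WER = (substitutions + deletions + insertions) / words in reference,
    computed here with a word-level Levenshtein distance.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    spoken = "set course for the delta system"
    generic_recognizer = "set curse for the delta sister"
    custom_recognizer = "set course for the delta sister"

    print("generic:", word_error_rate(spoken, generic_recognizer))
    print("custom: ", word_error_rate(spoken, custom_recognizer))
```

Comparing the two scores over a larger set of recorded lines would be one way to check whether a custom-trained model really halves the error count relative to an off-the-shelf recognizer.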

“We were able to train the Custom Speech Service on keywords and phrases in our game, which greatly contributed to speech recognition accuracy,” said Adam Nydahl, Principal Artist at Human Interact. “The worst thing that can happen in the game is when a character responds with a line that has nothing to do with what the player just said. That’s the moment when the magic breaks down.”

Human Interact said that Microsoft’s speech recognition lets you feel like you are part of the story. Instead of following a set script, you get to add your own personality to the dialog of the game. Virtual reality sells the promise of immersion, and how better to feel immersed in an experience than to feel like you’re having a real dialog with characters in the game?

Starship Commander is coming to Oculus Rift on the Oculus platform and HTC Vive on the SteamVR platform. Human Interact has not yet announced a release date for the game.

8 comments
  • Achoo22
    As far as I can tell, every single one of the Cognitive APIs is a black-box, online implementation. The last thing I want in an entertainment product is an open mic piped into Microsoft's servers. No thanks.
  • uglyduckling81
    It's an interesting idea. It will be terribly implemented though with only a few actual key words I'm guessing. Also if Cortana is anything to go by with an Australian accent I will say
    Me: "Take them out"
    Cortana: "Sorry you can't buy 5 puppies in this game today"
    Me: "Ah for f**k sake, shoot something"
    Cortana: "It would be 35 and sunny at the lake today"
    Me: "Sh** game, give me a refund"
    Cortana: "Your credit card has been charged for 75 copies of Windows 10, enjoy your purchase, and thanks for buying with Microsoft"
    Me: "God f*****g damn it, how the f***............"
  • AndrewJacksonZA
    *chuckle* Thanks for the smile uglyduckling81! :-)