Audio, Audio, Audio: The Key To Virtual Reality Immersion Is The Audio

I recently attended Immersed 2015, and after going through my notes, an obvious recurring theme emerged: Many, if not most, of the speakers and panelists at the event were of the same opinion that the auditory experience in VR is just as important as the visual experience -- if not more important.

"Sound is 50 Percent Of The Movie Experience"

There were many presentations at Immersed 2015, and if I learned one thing from them, it is that the people working diligently to bring the best, most compelling virtual reality experiences to fruition have all realized that audio can't be ignored, though it largely has been thus far.

Many of the early VR experiences that are available for the public to download and play have been unconcerned with the auditory experience, but a bad one can completely break the immersion. George Lucas once said, "Sound is 50 percent of the movie experience," and for virtual reality content, this is probably even more true. 

One of the early presentations at the conference was "Working With Immersive Audio For Virtual Reality" by Jan Nordmann, Senior Director Business Development, New Media at Fraunhofer USA (inventors of the MP3 codec and co-developers of AAC). His company is working on new audio standards for immersive 3D audio in virtual reality and has already partnered with Samsung on the Gear VR.

Nordmann said that people perceive where a sound is coming from based on the slight difference in when it reaches each ear. In virtual reality, this becomes incredibly important, especially in things like virtual cinema. Imagine looking away from a TV screen while the audio still sounds like it's right in front of you.
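To put a rough number on that delay, here is a minimal sketch that estimates the arrival-time difference between the two ears. It is not something Nordmann presented: it assumes Woodworth's spherical-head approximation, an average head radius of 8.75 cm, and a speed of sound of 343 m/s, and the function name is made up for illustration.

```python
import math


def interaural_time_difference(azimuth_deg, head_radius_m=0.0875,
                               speed_of_sound_mps=343.0):
    """Estimate the interaural time difference (ITD) in seconds.

    Assumption: Woodworth's spherical-head model,
    ITD = (r / c) * (theta + sin(theta)), valid for sources in the
    frontal half-plane. 0 degrees is straight ahead and positive
    angles are to the listener's right.
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must be between -90 and 90 degrees")
    theta = math.radians(abs(azimuth_deg))
    itd = (head_radius_m / speed_of_sound_mps) * (theta + math.sin(theta))
    # A positive result means the sound reaches the right ear first.
    return math.copysign(itd, azimuth_deg)


# A source directly to the right arrives about 0.66 ms earlier at the right ear.
print(f"{interaural_time_difference(90) * 1000:.2f} ms")
```

Even at its largest, the difference is well under a millisecond, which gives a sense of how precise the audio timing has to be.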

The way that noises sound is important, as well. If you leave an office building and head outside into a busy street, it shouldn't sound like you are still inside the building. This type of thing can break immersion quickly.

Fraunhofer's approach is built on head-related transfer functions (HRTFs), a long-established technique that uses binaural sound recordings to simulate the direction sound is coming from. The process to record sound this way requires two microphones placed where a person's ears would be, with speakers around the room delivering audio from all directions. Despite the recordings being made with multiple speakers, Nordmann believes that the typical VR experience will require headphones for total immersion.
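For readers curious how such recordings are typically put to use, the sketch below shows the general idea rather than Fraunhofer's implementation: a mono sound is convolved with a left-ear and a right-ear impulse response to produce a two-channel headphone signal. The impulse responses here are toy placeholder values, not measured data, and all of the names are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, sample in enumerate(signal):
        for j, tap in enumerate(impulse_response):
            out[i + j] += sample * tap
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono signal through a pair of per-ear impulse responses."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses (NOT measured data) for a source on the listener's
# right: the right ear hears it immediately, the left ear hears it two
# samples later and quieter.
TOY_HRIR_RIGHT = [1.0, 0.0, 0.0]
TOY_HRIR_LEFT = [0.0, 0.0, 0.6]

left, right = render_binaural([1.0, 0.5], TOY_HRIR_LEFT, TOY_HRIR_RIGHT)
print("left: ", left)
print("right:", right)
```

A real system would swap in impulse responses measured for many directions and pick, or interpolate between, the pair that matches the sound's position.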

Movies Need Surround Audio, Why Wouldn't VR?

On day two, there was a discussion panel called "The Audio Of VR" that was moderated by Jason Jerald, Principal Consultant at Nextgen Interactions. Jan Nordmann was one of the two panelists. He was joined by Mary Spio, President of Next Galaxy.

To start off the discussion, Jerald brought up a great point: Most people would consider watching a film without sound to be a ludicrous idea these days, so why do we see so many VR demos without audio? He said this makes the user deaf.

The answer to that question seems to be that audio has been treated as an afterthought in the quest for pushing virtual reality to fruition. Companies developing both hardware and software for VR have been spending their time and resources on solving the visual problems first. Based on my experience at Immersed, though, it appears that industry insiders are starting to make the shift to audio immersion.

The first generation of immersive audio technology is focused on getting the head tracking working in the audio environment. To keep things simple, Fraunhofer is working with stereo sound for the first generation. Second-generation immersive audio will bring object-based sound into VR, making the environments all-the-more realistic and believable.
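As a rough illustration of what head tracking means for audio, the sketch below keeps a sound anchored in the virtual world by subtracting the listener's head rotation from the sound's direction. It assumes yaw-only tracking with angles in degrees, positive to the right; the function name is hypothetical and nothing here reflects a specific vendor's method.

```python
def relative_azimuth(source_azimuth_deg, head_yaw_deg):
    """Direction of a world-anchored source relative to where the head points.

    Both angles are measured in degrees in the same world frame, with
    positive values to the right. The result is wrapped to [-180, 180).
    """
    return (source_azimuth_deg - head_yaw_deg + 180.0) % 360.0 - 180.0


# A virtual screen sits straight ahead (0 degrees). After the viewer turns
# 90 degrees to the right, the screen's audio should come from the left.
print(relative_azimuth(0.0, 90.0))
```

The renderer would then use this relative angle, rather than the fixed world angle, when it positions the sound for each ear.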

Mary Spio and her company, Next Galaxy, are using a very different approach from Fraunhofer's. Next Galaxy uses sound tracing to imitate sound profiles for different environments. Spio said this is a physics-based approach of estimating the environment and replicating the sound to match it.

The company uses seven different variables to determine whether to lower or raise sound levels to mimic a sound's 3D origin. Spio said that by using this technology, even some compressed audio files can be processed this way, and that some existing audio content can benefit from sound tracing enhancements, as well.

The Trouble With Immersive Audio

Immersive, object-based audio will bring VR immersion to a whole new level, but that kind of audio profile comes with drawbacks. Developers and audio engineers are going to have to learn how to use the new set of tools required to create these immersive audio profiles. Spio believes that there are too many tools on the market right now. People might be waiting for a clear standard.

Jan Nordmann mentioned that immersive audio is going to cost more in processing resources. Stereo sound is the easiest to process, and surround sound is more demanding. Immersive audio will be a similar step up again.

It's Not Just The Audio Experts Talking About Audio

Later in the day we heard from three more speakers who reiterated several of the points that Nordmann and Spio brought up earlier. They all believe that sound is a key element in the VR experience.

Ricardo Laganaro, Director of O2 Films in Rio de Janeiro, has been working on a project to build "The Museum Of The Future," which focuses on current and future technology. He and his team are not VR developers -- they are filmmakers -- so they have been learning a lot as they go, but one thing they learned quickly is that audio can make or break an experience.

Laganaro and his team were trying to build an experience that would intentionally throw you off balance. They experimented with and without sound, and they found that having audio that tracked with the environment made it a much more powerful experience.

Resh Sidhu is the Creative Director at Framestore VR Studios. She and her team have been responsible for some incredibly immersive 4D experiences, such as the Game of Thrones: Ascend the Wall VR experience, which debuted at the second annual HBO Game of Thrones exhibition in 2014, and the Merrell virtual hike experience that had people walking a rope bridge in VR and in, er, literal reality at the same time.

One of the main points in Sidhu's speech, "The Power of Virtual Immersion," is that sound is an integral part of an immersive experience. She said that building the soundscape for a virtual experience should be considered from day one. Even the simplest sound should not be underestimated, because it can have a powerful effect on the experience itself. "VR is about fooling all the senses," Sidhu said.

The last presenter to mention immersive audio was Josh Farkas, CEO of Cubicle Ninjas. His take on developing experiences for VR is that audio and input are both just as important as, if not more important than, the visual experience. Without believable audio, it doesn't matter how good the visuals are; the experience won't be as great as it should be.

Farkas mentioned an experiment that the creators of Radial-G performed on pain management therapy with virtual reality. He didn't have the data on hand, but he said that their findings showed somewhere around a 20 percent improvement in effectiveness when audio was added.

Farkas's company has built a number of virtual reality experiences, and his team has learned that 3D positional audio makes a huge impact on the user experience. He recommends using multilayer audio, with different sounds layered on top of one another and multiple points of audio. An example he used was his company's Guided Meditation VR application, where you are sitting on a tropical beach. The sound coming from the ocean is constant, but there are other environmental sounds, such as seagulls flying overhead, that move about throughout the scene. 3D positional audio sources are key to making that work.
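To make the idea of a positional audio source concrete, here is a simplified sketch that turns a sound's position into left and right channel gains. It assumes a flat 2D scene, inverse-distance attenuation and constant-power stereo panning, which is far cruder than what a real VR audio engine does (it cannot tell front from back, for one). The function and the seagull path are hypothetical and are not taken from Guided Meditation VR.

```python
import math


def positional_gains(source_xy, listener_xy=(0.0, 0.0), min_distance=1.0):
    """Return (left_gain, right_gain) for a source in a 2D scene.

    The listener faces the +y direction; +x is to the listener's right.
    """
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    # Inverse-distance rolloff, clamped so very close sources don't blow up.
    distance = max(math.hypot(dx, dy), min_distance)
    attenuation = min_distance / distance
    # Azimuth is 0 straight ahead and positive to the right.
    azimuth = math.atan2(dx, dy)
    # Map azimuth to a pan value in [-1, 1]; sources behind are clamped.
    pan = max(-1.0, min(1.0, azimuth / (math.pi / 2)))
    # Constant-power panning keeps perceived loudness steady across the arc.
    angle = (pan + 1.0) * math.pi / 4
    return attenuation * math.cos(angle), attenuation * math.sin(angle)


# A seagull drifting from the listener's left to the right, four units ahead.
for x in (-4.0, 0.0, 4.0):
    left, right = positional_gains((x, 4.0))
    print(f"x={x:+.0f}: left={left:.2f} right={right:.2f}")
```

A constant layer such as the ocean would simply be mixed in at fixed gains, while each moving source gets its gains recalculated as it travels through the scene.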

What About The Reverb?

Farkas did mention one thing that didn't get much attention from anyone else -- in fact, he specifically called out that no one talked about it. He believes that reverb is incredibly important when creating a sound environment. Applying the correct reverb can mean the difference between believing you are actually in the virtual environment and thinking that something is slightly off, which can break the immersive feeling.
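For a sense of what "applying reverb" involves at the lowest level, the sketch below implements a single feedback comb filter, one of the classic building blocks of artificial reverb. Farkas did not describe any particular algorithm, so treat this as an illustrative assumption; production reverbs combine many such filters or use recorded room responses. The function and parameter names are hypothetical.

```python
def comb_reverb(signal, delay_samples, feedback, tail_samples=0):
    """Apply a feedback comb filter: y[n] = x[n] + feedback * y[n - delay].

    A longer delay suggests a larger space, and feedback closer to 1.0
    makes the echoes take longer to die away.
    """
    if delay_samples < 1:
        raise ValueError("delay_samples must be at least 1")
    if not 0.0 <= feedback < 1.0:
        raise ValueError("feedback must be in [0, 1) so the echoes decay")
    # Pad with silence so the decaying tail is not cut off.
    padded = list(signal) + [0.0] * tail_samples
    out = []
    for n, dry in enumerate(padded):
        echo = out[n - delay_samples] if n >= delay_samples else 0.0
        out.append(dry + feedback * echo)
    return out


# A single click followed by silence: each echo is half as loud as the last.
wet = comb_reverb([1.0], delay_samples=2, feedback=0.5, tail_samples=6)
print(wet)
```

Changing only the delay and feedback values is enough to make the same click feel like it happened in a closet or a cathedral, which is why a mismatched setting can feel slightly off.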

It's clear that audio is the next frontier for true immersion in virtual reality. Developers and audio professionals are starting to consider its impact on the experiences being created for VR, which is great, but it's also clear there is much more work to be done. While everyone agreed that audio is of massive importance, each seems to have a different approach to how this will be done. Eventually a standard will emerge, but I think we'll be seeing a fair bit more experimentation before that happens.

Update, 10/21/15, 2:17pm PT: Fixed typos.


 Kevin Carbotte is a contributing writer for Tom's Hardware who primarily covers VR and AR hardware. He has been writing for us for more than four years. 

  • McWhiskey
    I see this as a great direction for the industry. I remember quite a few years ago there was a lot of talk about sound quality. 3D game engine developers were hyping that hallways sounded like hallways while outside sounded like outside. SoundBlaster was hyping how many individual sound creating items in a 3D field it could process with its newest sound card. Maybe development on this front continued but the hype disappeared. At the time, I thought I was going to have two very powerful cards running side by side in my PC; one for video and one for audio. On board audio became "good enough" and sound was never mentioned again. New game engines hype all of the visual this and that but sound is barely even mentioned.

    Hopefully that all changes soon.

    Imagine 3D environments where everything has a visual texture for appearance, a physics texture for behavior and a sound texture for both creating and echoing audio. Engines run with virtual cameras. For VR, two cameras are used and their spacing is changed based on the user's eye location. Why not use two mics and change their spacing based on the user's ear location?

    Maybe this will happen a few generations from now.
  • thezooloomaster
    Audio huh?

    I'm not saying it's not important, but am I the only one who thinks immersion implies touch, smell, physical movement in the real world translating in the virtual one?

    Sure we need good audio, but let's stop fellating ourselves over half measures.
  • McWhiskey
    "Sure we need good audio, but let's stop fellating ourselves over half measures."

    That seems a bit harsh. I think everyone has a similar end goal to what you describe. But what you said is like saying "If it's not a holodeck, it's a waste of time." If you don't enjoy reading about the evolution of technology and are only interested in the accomplishment of the end goal, why are you visiting this site?
  • hannibal
    We need a new DirectSound for Windows and something similar for other platforms...
    3D sound has been sidetracked for far too long!

  • gadgety
    Four comments on this topic, that's all? Hopefully audio will get the attention and resources it deserves, because it is so important for the immersive experience.

    "The way that noises sound is important, as well. If you leave an office building and head outside into a busy street, it shouldn't sound like you are still inside the building. This type of thing can break immersion quickly."

    And vice versa. In Skyfall, the Bond movie, one of the visually most compelling movies in the series, sound editing and mixing wasn't as convincing as the visuals. The movie starts in a silent building, a little later Bond steps out into an incredibly noisy and busy street. The silence in the building seemed slightly unbelievable, once the noise in the street hit. Now, for VR, audio seamlessness will be even more important.
  • Mary_11
    Her name is MARY SPIO, can you please fix this?
  • ravewulf
    Fraunhofer may have developed a specific implementation of a HRTF, but I'm not so sure they created the technique. HRTFs and binaural audio are certainly not new and have existed for decades.
  • scolaner
    "Her name is MARY SPIO, can you please fix this?"
    Yes, fixed. Apologies!!
  • Thomzey
    AMD TrueAudio could be used for this, as it supports surround? I think it could be a great idea to put the processing power needed onto a dedicated chip.
  • zodiacfml
    Nah, the company is overrating it. Audio technology doesn't need to have the complexity of graphics, as the information from human hearing is too small in comparison to visual information. A good example in this case is MP3 technology, which is good enough for many people.
    The audio technology available from games and movies is a good starting point, where the usual limit is the speakers that we use. I can't forget the audio quality coming from Diablo 1 and Warcraft when I first heard them.