NVIDIA Shows How Physically-based Audio Can Greatly Enhance VR Immersion

Company launches VRWorks Audio SDK to bring physically-based audio to VR


Positional audio for VR experiences—where sounds appear to come from the correct direction—has long been understood as an important part of making VR immersive. But knowing which direction sounds are coming from is only one part of the immersive audio equation. Getting that directional audio to interact in realistic ways with the virtual environment itself is the next challenge, and getting it right can make VR spaces feel far more real.

Positional audio in some form or another is integrated into most VR applications today (some with better integrations and mixes than others). Positional audio tells you about the direction of various sound sources, but it tells you nothing about the environment those sounds exist in. That’s something we are unconsciously tuned to understand: our ears and brain interpret direct sounds mixed with the reverberations, reflections, diffractions, and more complex audio interactions that change based on the shape of the environment around us and the materials it’s made of. Sound alone can give us a tremendous sense of space, even without a corresponding visual component. Needless to say, getting this right is important to making VR maximally immersive, and that’s where physically-based audio comes in.

Photo courtesy NVIDIA

Physically-based audio is a simulation of virtual sounds in a virtual environment, one which includes both directional audio and audio interactions with scene geometry and materials. Traditionally these simulations have been too resource-intensive to run quickly and accurately enough for real-time gaming. NVIDIA has dreamt up a solution which takes those calculations and runs them on its powerful GPUs, fast enough, the company says, for real-time use even in high-performance VR applications. In the video heading this article, you can hear how much information about the physical shape of the scene can be derived from the audio alone. Definitely use headphones to get the proper effect; it’s an impressive demonstration, especially toward the end of the video when occlusion is demonstrated as the viewpoint moves around the corner from the sound source.

That’s the idea behind the company’s VRWorks Audio SDK, which was released today during the GTC 2017 conference; it’s part of the company’s VRWorks suite of tools for enhancing VR applications on Nvidia GPUs. In addition to the SDK, which can be used to build physically-based audio into any application, Nvidia is also making VRWorks Audio available as a plugin for Unreal Engine 4 (and we’ll likely see the same for Unity soon), making it easy for developers to begin working with physically-based audio in VR.


The company says that VRWorks Audio is the “only hardware-accelerated and path-traced audio solution that creates a complete acoustic image of the environment in real time without requiring any ‘pre-baked’ knowledge of the scene. As the scene is loaded by the application, the acoustic model is built and updated on the fly. And audio effect filters are generated and applied on the sound source waveforms.”
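
To make that last point concrete, here’s a rough sketch of the idea (this is not the VRWorks API, just an illustration with hypothetical names): the generated filter for a source can be thought of as an impulse response, and applying it to the source waveform is a convolution.

```python
# Illustrative sketch only, not NVIDIA's API: a generated acoustic filter can be
# represented as an impulse response, and "applying it to the sound source
# waveform" amounts to convolving the dry signal with that impulse response.
import numpy as np
from scipy.signal import fftconvolve

def apply_acoustic_filter(dry_signal: np.ndarray, impulse_response: np.ndarray) -> np.ndarray:
    """Convolve a mono source signal with an impulse response for one source/listener pair."""
    wet = fftconvolve(dry_signal, impulse_response, mode="full")
    peak = np.max(np.abs(wet))
    return wet / peak if peak > 1.0 else wet  # avoid clipping from the added reverberant energy

# Example: a 1 kHz tone placed in a hypothetical impulse response consisting of a
# direct path plus two delayed, attenuated reflections (48 kHz sample rate).
fs = 48_000
t = np.arange(fs) / fs
tone = 0.5 * np.sin(2 * np.pi * 1000 * t)
ir = np.zeros(fs // 4)
ir[0], ir[480], ir[2400] = 1.0, 0.4, 0.15   # direct sound, ~10 ms and ~50 ms reflections
wet_tone = apply_acoustic_filter(tone, ir)
```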

VRWorks Audio repurposes the company’s OptiX ray-tracing engine, which is typically used to render high-fidelity physically-based graphics. For VRWorks Audio, the system generates invisible rays representing sound wave propagation, tracing each sound’s path from its origin, to the various surfaces it interacts with, and eventually to its arrival at the listener.
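
For intuition about what tracing sound paths actually yields, here’s a deliberately simplified CPU sketch that has nothing to do with OptiX or NVIDIA’s implementation: a first-order image-source model of a rectangular room, in which each reflection path produces a delay and attenuation of the kind that would become a tap in an impulse response like the one above.

```python
# Toy illustration (not OptiX / VRWorks): the classic first-order image-source
# method for a rectangular room. Each wall reflection is modeled by mirroring
# the source across that wall; every resulting path contributes a delayed,
# attenuated tap to the impulse response applied to the source waveform.
import numpy as np

SPEED_OF_SOUND = 343.0  # meters per second

def first_order_paths(room_size, source, listener, reflectivity=0.7):
    """Return (delay_seconds, gain) for the direct path and the six first-order reflections."""
    room_size, source, listener = map(np.asarray, (room_size, source, listener))
    images = [source.astype(float)]                    # direct path
    for axis in range(3):
        for wall in (0.0, room_size[axis]):            # mirror the source across each wall
            image = source.astype(float)
            image[axis] = 2.0 * wall - source[axis]
            images.append(image)
    paths = []
    for i, image in enumerate(images):
        distance = np.linalg.norm(image - listener)
        gain = (1.0 if i == 0 else reflectivity) / max(distance, 1e-3)  # 1/r spreading loss
        paths.append((distance / SPEED_OF_SOUND, gain))
    return paths

# A 6 m x 4 m x 3 m room with the source near a corner and the listener mid-room.
for delay, gain in first_order_paths((6, 4, 3), (1, 1, 1.5), (3, 2, 1.5)):
    print(f"delay {delay * 1000:5.1f} ms   gain {gain:.3f}")
```

A real acoustic model traces far more interactions (higher-order reflections, diffraction, frequency-dependent material absorption) and rebuilds the result continuously as the scene changes, which is exactly the workload NVIDIA is offloading to the GPU.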


Road to VR is a proud media sponsor of GTC 2017.



  • Raphael

    A revolution in gaming audio.

  • ima420r

    I always thought that when they make 3d objects and give them shape and skins, they should give them sounds as well. You’d still need the formulas for where the object is in the room, and to change the sound based on the user. Multiple players could hear things differently based on their locations, and you could even have some sort of interference variable that would distort sound if needed; like a knock on a wall would change if there was a stud behind it or if it was hollow.

    Sound can be great, but what we really need is some sort of touch feature. I feel like Al the hologram when I play VR.

  • Sponge Bob

    this is overblown

    we humans do not know a priori what kind of acoustical environment we are in
    basic 3d audio is mostly sufficient – HRTF filtering and other tricks like that

    • Raphael

      We don’t know what kind of environment we’re in? So we can’t tell if we’re in a cathedral or cave or open space? We can’t hear the acoustic difference in different environments?

      Basic 3d audio is archaic: manually constructing sounds for different objects and environments. While game engines have visual physics for motion, audio has been completely overlooked. You think staying with archaic manual design is fine and that people can’t tell when acoustics should alter according to environment changes. If you’re involved in game design, let me know which games so I can avoid them…

      • VirtualBro

        Preach it, buddy! Audio is in a really cruddy state right now in the industry. Even decent 3D positional audio is super rare in general.

        • Raphael

          I envisage a future where rendering software like 3ds Max and Cinema 4D can render the sounds to go alongside the objects.

  • Dan

    It’s a great idea in theory, but the implementation in this case sounds to me like “open space: normal recorded audio; enclosed space: blend in the same audio with a massively overcranked reverb filter”. If the underlying system is doing something more complex than that, it really doesn’t come across. Standing in the middle of all the cloth banners sounded almost indistinguishable from the tight stone corridor.

    Certainly interested to hear what a good developer might do with such a system though.

    • Raphael

      I think there are two problems here:

      1: You think the cloth on the lower floor should transform the acoustics massively compared to the upper floor.

      2: The SDK’s key features include real-time modeling of the following effects:

      Sound propagation, direct and indirect paths
      Occlusion for direct and indirect paths
      Directionality/HRTF (Head Related Transfer Function)
      Attenuation
      Diffraction
      Material reflection, absorption, & transmission.

      That’s a complex list of calculations.

      • beestee

        The list actually seems really similar to how ray tracing for light interacts with a shader for the visual appearance of a rendered surface. Since a lot of the terminology overlaps, I would imagine that these would eventually combine into one neat and clean ‘physically based’ wrapper that an artist could simply assign to model surfaces to serve both purposes.

        If that happened, couldn’t the calculations play nice with each other and share what they know so that only the non-redundant data is calculated?

      • Dan

        The issue I have with it I would summarise more as:
        – I think being in an enclosed stone corridor shouldn’t sound like my head is encased in a steel drum.
        – I think being in an open-air courtyard surrounded by heavy cloth banners shouldn’t sound like my head is encased in a steel drum.
        – I think floating three feet from the edge of the roof with mostly open sky above shouldn’t sound like my head is encased in that exact same steel drum.

        The precise acoustics aren’t really my point; it’s that the fundamental sound seems to be way off. Considering the supposed complexity of the system, it seems to make very little perceivable difference to the sound in almost any case except “surrounded by walls”/“not surrounded by walls”… and well, that doesn’t require complex raytracing to implement.

        I’ve no doubt the underlying system itself is very clever, but this demo simply doesn’t serve to showcase it effectively.

        • Graham J ⭐️

          I agree, it doesn’t sound right at all. It does vary a bit between the floors and the variances might be about right, but the base effect has too much reverb and doesn’t match the environment at all.

      • ChipsAhoyMcCoy
        • Raphael

          I like the retro graphics. Nice spatial effect, but as with all the other 3d sound demos the sound is never able to project in front of you.

          • ChipsAhoyMcCoy

            Radiosity and ray tracing are very hardware-intensive, but they are techniques already in use in the industry to simulate light. Sound, like light, is also a wave. Modern hardware is more than capable of simulating realistic 3D sound, especially given that humans are visual creatures and sound does not have to be simulated as accurately as light for it to seem realistic.
            A modern smartphone would probably have enough processing power to simulate 3D sound realistically. The challenge is creating the engine and getting the industry to take sound realism seriously. Thankfully Valve and Facebook are bigger than Creative, so hopefully the latter won’t sue.

          • Raphael

            Creative haven’t done anything for sound-tracing. All they did was create another generic spatial 3d effect and inaccurately label it positional audio.

          • ChipsAhoyMcCoy

            They killed Aureal’s A3D, then started messing with HRTF until Windows fucked everything up with DirectSound.

  • Lucidfeuer

    Ah…Nvidia Vaporworks. You’ll never actually see it implemented anywhere.

    • Raphael

      You mean like Creative Labs and their many failed audio realism systems?

      The downside of Vrworks is that it relies on individual developers. Nvidia shouldn’t have to hold a gun to their heads. There are vr games making use of vrworks but you can be sure that many developers will never bother to make use of it.

      The good news is that it will be part of unreal and unity so at least indie games will make use of it.

      How long has elite dangerous been around? Any hint from frontier that they’re gonna add vrworks to boost performance? No. They have no interest. I could get a significant performance boost on my 1000 series gpu if the game supported Vrworks.

      • Lucidfeuer

        If you give developers better tools than they already use, then they’ll use these. If you give them crap, unsupported, proprietary, locked, unoptimised shit tools, then yes indeed they won’t.

        All nVidia Works are vaporware crap. Ever seen Flex in a game? Ever seen it in a regular 3D studio app? Ever seen Waveworks or TurfEffect implemented? Nope, because these are shit, unusable tools. That’s a shame of course, but it’s nVidia’s fault, not the developers’.

        • Raphael

          You mean like Raw Data and Serious Sam? Proprietary…deal with it. If I buy an Nvidia GPU I expect features exclusive to Nvidia. 1000 series specific features.

          What about crossfire, Havok?

          I do hope you ain’t using Nvidia GPUs.

          • Dan

            Havok has always been GPU-agnostic… I’m not even sure it runs on GPUs. Not sure what your point is there.

            I stick with nVidia cards myself, but I would never expect vendor-specific proprietary features, they’re always a bad idea for developers and for consumers.

          • Raphael

            I want the vr specific features of the gtx 1000 series, which means vrworks support in games. I think it would be much better to have systems that aren’t exclusive to one brand, but realistically there will always be exclusive features. ATI have LiquidVR, yes? I don’t know much about it but I guess it’s exclusive to ATI?

      • burzum

        Frontier totally sucks when it comes to ED’s development. There are a ton of indie developers doing better work. I stopped playing it long ago; the game had so much potential… :( Guess they have trouble getting these vendor libs implemented in their Cobra engine. From what I’ve read in the forums they have high turnover among developers, and according to company reviews from people who worked there, a messy code base. I wasn’t surprised when I read this, because it just reflects the bug-ridden and painfully slow development of this perpetually prototype-like game.

      • Graham J ⭐️

        The downside of everything nVidia is that it requires nVidia hardware. Many devs aren’t falling for their bait.

        • Raphael

          Nvidia are the enemy, are they? Do you even know why you’re bitching?

          Nvidia purchased PhysX a long time ago. Many games supported it over the years. Deal with it.

          Vrworks is integrated with unity and unreal engine. Deal with it.

          Industrial Light & Magic was one of the first to make use of vrworks. Deal with it.

          Vrworks doesn’t run on ATI? Deal with it.

          • Graham J ⭐️

            You know, there’s a coding practice called DRY; you should apply it to yourself.

            Calling them an enemy is too simplistic. They are a hardware company who are not shy about enticing devs to release software that requires their hardware.

            You can tell they bought PhysX because it’s one of the few APIs that actually doesn’t require an nVidia GPU (it can run software-only on the CPU).

            It’s true that Unity and UE abstract devs from these APIs but they both had to deal with the fragmentation and that has increased timelines and platform complexity to no one’s benefit.

            Hardware-locked APIs are fine if you’re only developing internal tools as ILM (mostly) does. Everyone else has to decide whether to shut out AMD users (which includes consoles) or develop their own abstraction frameworks, again affecting timelines, complexity and cost.

            Cross-platform tools (like Steam Audio) are better for everyone.

          • Raphael

            Stating the obvious because? Of course cross-platform is better.

            Let’s deal with reality… Nvidia don’t really do cross platform very well just like octopusvr only support their own hardware and bribe devs for exclusives.

            You can bleat on all day about the need for cross-platform, but thus far your bleating hasn’t changed Nvidia one bit.

            Meanwhile a percentage of games will support vrworks. Ultimately it may be another nvidia system that lingers without great success. I don’t care either way. I have zero brand loyalty.

          • Graham J ⭐️

            If you don’t care why TF did you post? Go away kiddie.

          • Raphael

            Probability… What are the chances of your command working?

  • Sponge Bob

    actually, since HRTF is individual to each human, using a generic HRTF already kills those finer effects
    If one wants to maximize audio realism they need to individually measure the HRTF for every user – hardly a justifiable task

    • Dan

      Well, I think there’s a world of audio innovation left to cover before anyone hits up against the issue of catering for individual ear geometry. Also, I don’t think that would be a case of needing to individually measure each user’s response so much as a case of needing to invent an entirely different audio playback mechanism besides speakers/headphones – one that played back precisely calibrated audio in a positionally physically correct fashion, and allowed the user’s hearing to respond as it would to real-world sound.
      A pipedream, and certainly not the only thing that could be done to advance the technique beyond what currently exists.

    • Timotheus

      They don’t need to measure, they only need to apply the HRTF. The measurement or configuration can be done by the user and saved in a profile. Several standard HRTFs to choose from could be provided.

      The current problem is that all implementations take some single standard HRTF, which may give good results for 20% of people.

      For example, I couldn’t locate the sound in the front at all.

      I ordered the OSSIC X headphones, which should be able to measure and apply a custom HRTF personalized to you.
      I hope sound departments in games realize that custom HRTFs are the way to go.

      • Sponge Bob

        well, someone needs to measure it, right?

        measuring an individual HRTF involves placing tiny microphones INSIDE the ear and putting a user inside a sphere or something with multiple speakers emitting different signals from all directions – a BIG hassle, and expensive too.
        If someone tells you that some “headphones” can measure HRTF they are probably lying…

        • Timotheus

          There is a difference between a “professional” perfect HRTF, where you need an anechoic chamber and an hour-long session to calibrate from which exact position a sound comes, and a “consumer” grade HRTF.

          There was a site out there where you could choose one HRTF out of a lot of samples, in order to take the “best” HRTF for your biometric data. For 99% of people those samples were already enough to provide a realistic sound experience. The best HRTF for you allowed you to exactly locate sounds above and below and, the hardest part, in the front. (Most of the time, a standard HRTF sounds like it’s behind you when it’s actually in front.)

          In essence, two major factors play a role in a “realistic” hearing experience: the ear-to-ear distance, which defines how long a sound takes to propagate from one ear to the other, and the ear shape, which defines how the sound bounces off and in the ear.
          The OSSIC X is designed to measure (at least) the ear-to-ear distance for generating a customized HRTF, which SURELY is, at least for a lot of people, a dozen times better than the standard HRTF used by all those games and videos and whatnot.
          A lot of people have already tested them (you can at some expos where they are present) and they said they really can locate the positions exactly.
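
          As a rough back-of-the-envelope illustration of the ear-to-ear distance factor, here is a toy calculation using the simplest straight-line model with no head shadowing; the numbers and function names are illustrative only, not how any HRTF pipeline works internally.

          ```python
          # Toy illustration only: the interaural time difference (ITD) implied by a given
          # ear-to-ear distance, using a straight-line path-difference model (no head shadowing).
          import math

          SPEED_OF_SOUND = 343.0  # meters per second

          def itd_seconds(ear_distance_m: float, azimuth_deg: float) -> float:
              """Path-length difference between the two ears for a distant source at the given azimuth."""
              return ear_distance_m * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND

          # An 18 cm ear-to-ear distance with a source 90 degrees to the side gives roughly 0.5 ms;
          # a couple of centimeters difference in head width shifts that by tens of microseconds,
          # which is within the range the auditory system can resolve.
          print(f"{itd_seconds(0.18, 90) * 1e6:.0f} microseconds")  # ~525
          ```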

  • rabs

    I wonder how it compares to the Steam Audio SDK (hardware-agnostic and free). The features seem quite similar, even if their demo looks less pretty than Nvidia’s.

    Ref: http://www.roadtovr.com/valve-launches-free-steam-audio-sdk-beta-give-vr-apps-immersive-3d-sound/

  • Foreign Devil

    This is way overdue. I remember Creative was trying to implement this stuff like a decade ago. Developers didn’t care enough about audio. You couldn’t show it on the game box cover.

    • Marcus Scottus

      True. Fortunately the days when audio was an afterthought are gone.

    • Bjørn Konestabo

      No they weren’t. Creative killed the company making hardware for these sorts of things (Aureal Semiconductor) through frivolous lawsuits; Creative lost the suits, but Aureal still went bankrupt. To add insult to injury, Creative bought all their patents to prevent anyone else from doing this, and shelved them.

  • Bjørn Konestabo

    At least nVidia is too large for Creative to ruin them with legal battles, like they did to Aureal. Maybe we’ll finally get traced audio like we had in the 90s.