Literally Putting Their Hearts into the Game – Interview With Moss Audio Director Stephen Hodde
Moss is an incredible PlayStation VR game that transcends how the medium is traditionally used to provide players with a very real bond with a digital mouse. If there is one, single must-own game that communicates the experience and possibilities of virtual reality, I think that Moss holds that very title. VR is about immersion, but Moss takes it one step further to make it about connection.
A couple of weeks ago, we had a chance to talk to Composer Jason Graves about his work on the Moss soundtrack and how his songs helped to evoke the kind of emotion that Polyarc was trying to communicate. We did this interview in collaboration with Polyarc and Graves releasing three of the songs off of the Moss soundtrack, all of which deserve a listen.
Following our look into the music of Moss, we wanted to get a sense of how the overall sound design worked, particularly when working with virtual reality and 3D audio. Polyarc Audio Director Stephen Hodde was kind enough to sit down with us for 30 minutes and give us his insight into working on bringing the world of Moss to life through audio.
Hodde has an extensive background of audio work in the video game industry, with credits ranging from Destiny to the Saints Row franchise. His work with Polyarc and Moss is the first time he’s done virtual reality audio, and he had a lot to share with us about how he crafted that experience, including comparisons to Lord of the Rings, and the story of how some members of the development team literally have their hearts in the game.
PSLS: I took a look at your portfolio. I’m a huge Destiny fan, so you’re off to a great start. You’ve done a lot of audio work that I’ve really loved.
Stephen Hodde: Wow. Thanks man. That’s great.
And then also Saints Row and all those really goofy sound effects going on with a lot of the guns and everything there. Your portfolio is very impressive. How did you go from that and from games like Saints Row and Destiny and then get into doing VR games? Was that kind of a happy accident or were you looking to move into the VR space?
It was a really happy accident. The story is that I wasn’t really excited about VR at all and I had a job and it was fine. I’d worked a little bit with some of the guys here [at Polyarc]. A lot of the guys and gals here [at Polyarc] are from Bungie. And so the people that I didn’t get to work with I knew by reputation. I was having a dinner with an audio programmer friend and he said, “Hey, these guys are working on some really cool stuff. You should go and just see what they’re doing.” This was around the time that a Moss prototype was in development. So I went over to the studio and at that time it was like a 10 by 12 room and everybody was smashed, just smashed in there.
I put the headset on and they started just giving me the stock spiel of “Okay, here’s what the control schemes are and this is our main character.” I looked at Quill and there was something in me that started to kind of, like some strange feelings started happening toward this video game character. And then Tam, our CEO, he was like, “Well, you know, if you reach out and grab Quill, you can feel her heartbeat.” So I went and I did that and with the controller there was rumble feedback. And at that moment I kind of resolved “hey, this thing is really extraordinary. Now I understand what could be cool about this medium.” It took Quill kind of showing me that, and then I started a shorter journey of convincing them to hire me from there.
You talk about that connection to Quill, how did you bring that across through audio? You already had that going into it without doing any sound design on it. How did you bring that then into your sound design for Moss?
I think that it works across a couple of different axes. From a subtle standpoint, exaggerating what Quill feels is what you want the player to feel. So that can come from, like, the scale of sounds. So if you notice, there’s a lot of really kind of big sized sounds in the game, like when you’re moving the brick with your hand or when you’re rotating the spindles around or there’s a giant fiery door. If you were to transpose that thing into real life, the scale of that sound would be pretty small and I think if you’ve ever just kind of dragged a brick across the ground, it’s not terribly impressive, but in that world, that feels really heavy and that there’s a certain amount of power that’s required to move, it feels kind of difficult.
When those things sound bigger, you’re seeing and you’re hearing things almost from her perspective. There’s this kind of melding between the two of you, which is the magic of the game as well. This bond that you’re creating. You’re hearing things from her perspective. There’s also how ambient sounds are constructed. So like when you are in that forest scene at the very beginning, the vibe that I was going for was kind of like when I was a kid growing up. I spent a lot of time exploring in the woods. And I think that if I were to go back in time and put up a microphone in the world, the sound that I would’ve captured at that point in time would probably not capture the emotion or the thing that I have that’s in my head from that memory.
The nostalgia of it.
Yeah. Yeah. The feeling of that time is, you know, you don’t have any worries. You’re kind of out there, the days are really weirdly long. Everything feels really–in the memory–feels really bright and just full of life. And so I tried to take that same kind of feeling from that memory and put it in that scene because that’s where Quill is in her life and her journey at that point in time. Before she encounters the glass, she’s just been out in the woods trying to find her own adventures. Exploring stuff, turning over rocks, looking for weird bugs or something. Kind of like how kids explore.
I got a Lord of the Rings vibe early on with it, like Frodo out, just playing around, happy until Gandalf shows up and…
Yeah, yeah. The Shire is a good point of reference. We would draw and map out the entire game and [ask] “what are the emotional beats for all the stories? What are the emotional beats for the story moments? Where are the emotional beats for the environments?” and at the beginning, yes, it’s The Shire. Bright and clean and green and idyllic and perfect. If you can capture that feeling through sound, like it’s birds and things scurrying through the woods, things kind of falling off of logs. If you can capture that emotion, then you kind of feel what Quill must feel like at that moment when she pops out of the bushes and you meet her for the first time.
I know audio is important. I’ve communicated that. I worked at a large music retailer for a decade of my life. Audio is a big part of my life and I think a lot of people actually realize how much audio gives to any given scene or moment, the subtle impacts on player perception and things like that. Can you talk a little bit about kind of how audio just overall impacts a player’s perception of a game?
Yeah. If you were to, at any given point in your normal, everyday life, just take a moment and see if you could inventory all the sounds that you were hearing, it would be probably dozens of things, from really innocuous things like cars going by or the air conditioner that’s humming in your window or your chair creaking back and forth. Just these really subtle things. [Stephen motions around the room] Can you tell where I am getting these things from? Those things arrive in your brain and they deliver your sense of space. And this is not only just the things that are emitting from those sounds, but how those sounds play in the environment. So right now I’m talking to you and my voice is bouncing all across the room and as it’s bouncing across the room it has a certain kind of character.
There’s this glass that’s right in front of me. There’s carpet that’s on the ground, there’s tiling that’s on the ceiling and each one of those surfaces has its own properties. So it reflects the sound in a certain kind of way. And so in addition to just saturating the world with all that detail, there is “how does the sound react to the environment?” So understanding that like when Quill is running around in a really tiny space that is made of wood and has some hangings, that the sound that propagates throughout that space responds as if it were a real room. And in some instances the reality of what that room might sound like and what it ends up sounding like in the game are different in a couple of different ways.
As Quill is descending down into the very lowest part of the ruins, there’s this kind of feeling that the rooms get smaller and smaller and smaller and smaller, and the ambiance and the wind and everything gets quieter and lower and deeper and deeper. That just provides the player with a sense that they’re moving from one location to another, but also as things drop that they themselves are physically dropping down into these smaller places until there’s almost nothing there. All you’re hearing are really low–like when you get the key in the ruins to go unlock the door that gets you out of there–there’s this low kind of thrumming “Mines of Moria” type of boom. And just the sound of her breath and her feet, and just a very subtle, kind of howling wind tone.
Almost so subtle that players aren’t going to consciously notice these things, but it helps with that immersive aspect.
Oh yeah, you’re totally right. I think that if it’s doing its job, it’s totally transparent and it feels like a world where they’re feeling some sense of unease or something. Maybe they can’t exactly–players can’t put their finger on it, but it works the way that sound works, which is that unless there’s something dangerous, like a siren or something that’s coming by you, most of the time you’re just kind of living your life, doing the things that you do.
So that kind of brings into play, now, VR where you’ve got this 3D head space. You’re able to lean in and around the sounds and you talked about how the sounds interact with the room. How does that change how you do sound design for when the player actually has the ability to be in the middle of the sounds as opposed to staring at a flat screen?
Yeah, I think that a lot of the projects that I’ve worked on have borrowed from film in their aesthetics and that there was a lot of attention paid on Moss to getting things sounding like the way hearing works, so that when you’re turning your head in certain directions, the sounds that are emitting from those locations are staying where they are and in the vast majority of those situations. Again, it goes back to detail. I’ve never put more sound into the environment than I have on this project. And getting each environment to sound unique too. I think that with games where you’re separated from the sound–it’s over on a screen–moving between locations isn’t always as necessary to have really distinct sounding spaces.
But in VR part of the pleasure of it is just being in the world. We did pay a lot of attention to the game mechanics and the things that make the game fun, but if the player wants to, they can just sit and be in our environments for long periods of time. It should feel natural enough that it doesn’t sound repetitive. It should always be this evolving texture of sound the same way it might work in the world.
Are there any audio or sound design tricks that you might normally use in 2D media–somebody staring at a screen, working on Destiny, working on Saints Row–that you couldn’t use for Moss or vice versa? Are there any tricks that you could use for VR production that you couldn’t use otherwise?
The great thing about VR is that we’re pretty certain all of our players are using headphones. There are certain things that you can do when players are using headphones that you can’t if they’re using TVs. So in my house, there is no separation between the place where we watch TV and the place where we cook food. There’s also a washing machine close by and other kinds of sound-creating things. And so that means that the general atmosphere of the room already has a lot of stuff going on in it. And so you have to create a soundtrack that has a narrow band of loudness. You can just increase the total volume of the sound and everything still feels pretty good, like it’s in the same place. But if your players are using headphones, that means that they’re isolated from all that sound. And you can use truly high dynamic range. The distance between the very quietest sound and the very loudest sound is a lot greater.
When you’re in the ruins, for instance, and you’re at the very bottom, like I just mentioned, it’s extremely quiet. And when you’re fighting Sarfog, it’s actually really loud, and you can’t do that kind of thing in any other situation. If you aren’t using headphones, that stuff might just get lost in the ambient atmosphere of wherever it is that you’re listening. And that also happens to help with immersion because that’s the way the world is built. The world has a ton of dynamic range in nature just built into it off the bat.
When you go to a quiet room and you can hear the ringing in your ears or you can hear the thrumming of your blood in your head, these are incredibly delicate, quiet sounds. Then you can go to a rock show, right? Like there’s a huge, huge amount of range there. It also helps with VR fatigue in general. Having a really quiet game hopefully makes people feel really comfortable all the time and that was a big thing that we focused on with the design of the game, wanting to make people feel comfortable in the headset.
And then, also, you can kind of change that range where you can create tonally appropriate moments throughout the story, where the start can be kind of quieter, comfortable, and more Shire-like, and then as you move on to more dangerous areas, you get these more, almost aggravating sounds to your emotional sense.
Yeah. Yeah, totally.
Would you say then that it does a disservice to the audio to not use headphones when playing it? I mean, most people are going to be using headphones, but there are a lot of people who don’t necessarily like to [use them with VR] because of the total isolation that they get.
I think it’s the best experience in headphones. I understand if you want to play with your family, it’s not going to be. It was designed so that if you are playing on your TV [speakers], you’re still gonna have a good experience, but it was designed with a sort of subtle compromise built into it.
So it does still work over speakers, surround sound, etc.
Yeah, it absolutely does. But there’s something to turning your head and having the world stay put, you know, that really just makes it–it’s a multiplier for the sense of being in the world. You can only get that with headphones.
Moving from the interactive game elements to the more linear media–I know you’ve got experience with both sides of things–how do you make sure those two flow together effectively? Because with linear media, you can have an audio track that goes point A to point B, straight through along with what the player is seeing and you’re done. But with interactive elements, you have to account for anything the player could be doing.
It’s a challenge for sure. You are creating dynamic systems which take into account as many of those factors as possible, so that when Quill is running around the world, her movement sounds feel really natural and kind of blend into the environment. And then when she goes into combat and the music kicks up, her movement sounds kind of pick up and become more clear and you’re able to hear her against the backdrop of a new thing.
So you’re even altering–not just interactive and linear portions of the game–but the different interactive portions of the game.
Yeah, absolutely. It’s creating as [many] dynamic layers and qualities of sound as we can. And in understanding how all those–what the worst case scenario is for however many sounds are playing at once, when a certain sound might be getting lost in the mix or the overall soundtrack and trying to compensate for that with scripting or programming.
What’s the biggest difference working with VR audio specifically over 2D-presented audio, in terms of mixing and your process for what you have to account for?
Good question. Apart from, you know, a few of the things that I just said, I think that really just the dynamic range that I talked about before.
When you’re using the positional audio are you actually setting emitters within the world then? So that it’s positional so that this is over here and the sound is coming from this location and this object or point, and if you move your head around that…
That’s precisely it. Yeah. There are little points that we drop down in the world or that we embed in the creatures. And so as that creature is moving around, the sound follows it. In the case of ambiances, there are dozens of dots up in the trees or down by the water or embedded in the dragonflies that fly around. It’s safe to say that anytime you move your head and the sound stays where it was when you were last looking at it that it has been hand placed there. Then it’s either updated just based off of the fact that the object is moving or is kind of baked into the map itself.
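[Editor’s illustration: a minimal Python sketch of the idea of world-anchored emitters, not Polyarc’s actual code. The `Emitter` class, the function names, the flat 2D layout, and the yaw convention (positive yaw means the head has turned to the right) are all assumptions made for this example. It shows why a hand-placed sound seems to stay put: the emitter’s world position never changes, and only its angle relative to the head is recomputed as the listener turns.]

```python
import math
from dataclasses import dataclass


@dataclass
class Emitter:
    """A hypothetical point sound source fixed in world space (top-down 2D)."""
    name: str
    x: float  # metres to the right of the world origin
    z: float  # metres in front of the world origin


def relative_azimuth_deg(emitter, head_x, head_z, head_yaw_deg):
    """Angle of the emitter relative to where the head is facing.

    0 means straight ahead, positive means to the listener's right.
    The result is wrapped into the range [-180, 180).
    """
    dx = emitter.x - head_x
    dz = emitter.z - head_z
    world_angle = math.degrees(math.atan2(dx, dz))  # 0 = +z, 90 = +x
    return (world_angle - head_yaw_deg + 180.0) % 360.0 - 180.0


def stereo_gains(azimuth_deg):
    """Constant-power left/right gains for an azimuth clamped to [-90, 90].

    This is a simple pan law, far cruder than real spatial audio, and is
    used here only to make the effect of head rotation visible.
    """
    clamped = max(-90.0, min(90.0, azimuth_deg))
    pan = (clamped + 90.0) / 180.0 * (math.pi / 2.0)
    return math.cos(pan), math.sin(pan)  # (left, right)


if __name__ == "__main__":
    bird = Emitter("bird", x=0.0, z=1.0)  # one metre straight ahead
    for yaw in (0.0, 45.0, 90.0):
        az = relative_azimuth_deg(bird, 0.0, 0.0, yaw)
        left, right = stereo_gains(az)
        print(f"yaw={yaw:5.1f}  azimuth={az:6.1f}  L={left:.2f}  R={right:.2f}")
```

[As the head turns right, the bird drifts toward the left ear even though its position in the world was never touched. A moving creature would simply update its emitter’s `x` and `z` each frame.]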
Is that similar to how a traditional sound design within games works, or do flat screen games use less of that than VR games do?
I think the biggest change is in the overall far-field ambiance. A lot of that is kind of flat and static–the things that are happening kind of outside the view of what you can see. I would also say that there’s a granularity of points of location, so making sure that you’re really precise about where the object is in space. If you’re standing away from your TV–that’s six feet away–and the point isn’t precisely on when the character is saying something and it’s not on the character’s mouth, that’s kind of okay. But in VR you have to be really diligent about placing things exactly where you might expect it to emit from.
With TV screen media, you only have to account for at most five or seven speakers, and that’s if they have an actual surround sound setup.
Yeah. Which is pretty rare.
Going back to your portfolio again, you did have some hand in the Moss story design. How did you come into that, not just with audio, but bringing some story into it as well? And then how did the story and audio go hand in hand? What do you feel like you contributed to Quill’s story?
The great thing about Polyarc is that we are a highly collaborative bunch and if you feel really passionately about something and it’s not what you were explicitly hired to do, you are encouraged to talk about it. And one of the things that’s really important to me is story, and having a good understanding of what the spine of the story is and what the meaning of the story is. I find it inspires me to do good sound design. And so I spent a lot of time just talking with our writers about what the story means and who the characters are and contributing some small details to the world and making suggestions about how to improve them. I think that it’s hard to, in a collaborative environment like this, it’s hard to really nail down actually what your specific contribution is. I think that a lot of the ideas go into a big bowl and are kind of a stew, and kind of shake out in some new form that wasn’t maybe the thing that you said or maybe the thing that you said and inform some other point in the chain to something else entirely.
And the same goes for the audio actually. So like we have a lot of really great audio minded people here. The sound of the page turning, for instance, it’s something that people talk about a lot. That was something that was an idea that Tam or maybe Danny came up with, that every time you move between these rooms, you hear the sound of the page turning, which was a really elegant way of reminding players what the container of the world is. You start out reading a book and you’re reading the story and you’re transported into the book and you’re kind of living inside a book while you’re participating in the story of Moss. So you’re just getting these little reminders every time you’re going from room to room that you’re still there. You’re still there. There are a lot of examples of people on this team walking into my room and saying something like “Hey, I have an idea. We should get sound playing out of the book, coming out of the book at the player.” That was a really great idea that one of the engineers came up with.
A lot of things that are really only perceptible in VR, that if this were a standard screen game, these same kind of ideas may not have had the same impact or effect.
Yeah, absolutely. Absolutely.
What do you see for the future of virtual reality audio? Where would you like to see developers explore further? What things are you kind of developing and exploring, maybe things you didn’t get to do with Moss, but ideas that you have for future projects?
So the thing that’s really exciting about what’s happening in VR audio right now is technology that is out there and available right now, which models the way that the head processes sound. When I’m listening to you right now, there’s a sound coming out of the speaker and it is arriving at different times. There’s a time delay between my left ear and my right ear and that lets me know when I move my head, and that timing is the same, that you’re directly in front of me. It’s also passing over my head, over my nose and certain frequencies are getting filtered out, and it’s also traveling around this outer part of the ear and down into my ear canal. And that’s another thing that video games are trying to do right now. As the technology progresses, it’s going to become more transparent, and feel more like hearing, and that’s really cool.
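[Editor’s illustration: the left-ear/right-ear time delay described above is commonly estimated with Woodworth’s spherical-head approximation, ITD = (r / c) × (θ + sin θ), where θ is the source angle away from straight ahead. The Python sketch below assumes a head radius of 0.0875 m and a speed of sound of 343 m/s, both typical textbook values rather than anything taken from Moss, and the function name is made up for this example. It covers only the timing cue, not the frequency filtering by the head and outer ear.]

```python
import math

HEAD_RADIUS_M = 0.0875      # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air at about 20 C


def interaural_time_difference(azimuth_deg):
    """Estimate the interaural time difference in seconds.

    Uses Woodworth's spherical-head approximation. The azimuth is clamped
    to [-90, 90] degrees; 0 is straight ahead, positive is to the right.
    A positive result means the sound reaches the right ear first.
    """
    clamped = max(-90.0, min(90.0, azimuth_deg))
    theta = math.radians(abs(clamped))
    itd = (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))
    return math.copysign(itd, clamped)


if __name__ == "__main__":
    for angle in (0, 30, 60, 90):
        micros = interaural_time_difference(angle) * 1e6
        print(f"{angle:3d} degrees -> {micros:6.1f} microseconds")
```

[A source directly in front produces no delay at all, which matches the point made above: if the timing at both ears is the same, the sound is straight ahead. A source fully to one side comes out at roughly two thirds of a millisecond under these assumptions.]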
Kind of like those real audio, uh, the name of them escapes me now, but those ones like the hair cut at the barber shop…
Yeah! Binaural recordings. Is there anything like that? Do you think that is the future of VR audio? And did you do any of that for Moss?
I think that that is a tool in the tool set for sure. I think for games that require a lot of pinpoint accuracy of sound that can be really useful. I think that the thing that I like to focus on, and the thing that is great for Moss is again, creating really rich environment sounds and finding a lot of unique sounds and beautiful sounds that people have never heard before. Gradually the technology is just going to become available and like I said, become more transparent and over time we’ll make use of it more.
You mentioned beautiful sounds, things people have never heard before and I’m fascinated by foley audio. Were there any interesting kind of “fun facts” sort of things that you may have used for audio that people wouldn’t expect or that were kind of odd or off the wall?
[To PR] Should I tell the heartbeat story? [To PSLS] Yeah, there’s so much to choose from. We literally put our hearts into the sound of the game. One of the things I did was I assembled a microphone that is essentially the tube that comes off of a stethoscope. I kind of just ripped that off and then took a small microphone that’s generally used for dialogue recordings on movie sets and just put those two things together. And then I went into a really quiet space and put the stethoscope on my heart and recorded the sound of my own heartbeat. And so when you reach down to heal Quill, you hear this “thu-thunk,” and that’s my own heart.
One of the other things I did, which is part of [Quill’s] healing process, was I got a fetal doppler monitor, which is something that is a home device that you can get to hear the sound of a baby’s heartbeat whenever you want. There’s a guy that works here that has a really kind of naturally fast bio rhythm. He’s a really thin guy and really quick heartbeat. And we took this device and got his heartbeat. And the great thing about the sonogram sound is that you hear the valves pumping and you hear the blood rushing around in the veins. I don’t have any children, but my sister has some. And I went with her and heard her son’s heartbeat for the first time at the doctor’s office. It was a really powerful experience. It provokes this sense of the need to protect and feelings of love. It’s something that I think you can’t help but feel when you hear that sound. Whenever Quill is distressed, you hear this sound, this sound of a heartbeat through that mechanism. And we actually hooked up a method of monitoring what the peaks are of the heartbeat, so that when it goes “ku-thunk,” the backpack flashes kind of at that same moment.
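[Editor’s illustration: the interview does not say how the heartbeat peaks are monitored, so the following Python sketch is only one plausible approach, with every name and number invented for the example. It scans a list of amplitude values, treats any local maximum above a threshold as a beat, and ignores anything that follows too soon after the previous beat so that a single “ku-thunk” triggers only one flash.]

```python
def find_beat_peaks(samples, threshold, min_gap):
    """Return the indices of heartbeat-like peaks in an amplitude envelope.

    samples   -- sequence of non-negative amplitude values
    threshold -- minimum amplitude for a local maximum to count as a beat
    min_gap   -- minimum number of samples between two accepted beats
    """
    peaks = []
    last_peak = -min_gap  # lets the very first qualifying peak through
    for i in range(1, len(samples) - 1):
        is_local_max = samples[i] > samples[i - 1] and samples[i] >= samples[i + 1]
        if is_local_max and samples[i] >= threshold and i - last_peak >= min_gap:
            peaks.append(i)
            last_peak = i
    return peaks


if __name__ == "__main__":
    # A made-up envelope with three clear beats.
    envelope = [0.0, 0.1, 0.9, 0.2, 0.0, 0.1, 0.8, 0.1, 0.0, 0.05, 0.95, 0.3]
    for index in find_beat_peaks(envelope, threshold=0.5, min_gap=3):
        print(f"flash backpack at sample {index}")
```

[In a game, each returned index would be converted to a time and used to drive a visual pulse such as the backpack glow.]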
So like you said, quite literally put your hearts into the game.
Very cool! Is there anything specific with audio that you’re kind of–I mean maybe obviously you can’t get too specific about on unannounced things or whatever you’re working on next–but is there anything that you’re particularly excited about that you’re working on that is coming?
Yeah. I don’t know what I can–[To PR] what can I say about that?–[To PSLS] I can say that I am really excited for the future here and I can also say that watching our fans play the game has been really awesome and has even yielded more ideas about things that we can do in the future. I’m just watching how they react to Quill, having them laugh and cry and jump and be frightened, and that audio played a part in that is really quite an extraordinary experience.
Any last things you want to say about VR audio? Anything you wanted to talk about or get out there?
Just out of curiosity–that was really the last big thing–but I just wanted to see what your experience was like with the sound and if there are things that jumped out at you.
Everything we talked about. I guess one thing we kind of touched on, but didn’t talk about directly, is communicating the tiny nature of the world. You have this world that is on a microscopic level through a macro lens, but you were able to make it feel big but also small all at the same time. And I thought that was just a really interesting feeling because you could tell it’s this huge grand adventure, but also that you’re down in the roots and the rocks and at the base of trees and you’re hanging out with mice and bugs. From my perspective, that was really cool. One of the things I love about VR the most is those two alternate perspectives. Either the God-view over everything, you’re massive and everything’s tiny beneath you and you can see over a world map or something, or on the opposite end of the spectrum, getting down really small and really macro and in close to something that is normally worlds away from us.
We’d like to thank Polyarc for lending their Audio Director to us for 45 minutes, and Stephen Hodde for indulging our questions about Moss, VR, and sound design.
Moss is available now exclusively for PSVR on the PS4.