Hello all,
I am doing a small piece of audio localization work for an upcoming Russian game that is being translated for the English-speaking audience. You can find the original Russian narration here: http://www.acsu.buffalo.edu/~breslin/intro_speech.ogg
I have not yet recorded the actor. I am planning to use Audacity, going through an M-Audio mixer with a Delta 66 sound card, and a Shure SM58 mic. This is my first professional recording job, and it’s an audition for me as well, so I want it to go perfectly. Any advice would be appreciated.
Listening to the original sound file, it sounds like the Russian engineer did a light ‘duck’ on the music track. (So the music is a little quieter while the narrator is speaking.) I don’t know what other effects are being used on the Russian voice. (Not that I necessarily have to match the exact effect profile, so long as the final result sounds good.) I guess I’ll mess around with light reverb, compression, and… well, probably try a lot of random things. I’m getting the Russian studio to send me a clean music track, so I can experiment with the music+voice mixdown.
Anyhow, please let me know if there’s anything in particular I should look into for voice actor recording, particularly for this little project.
You’re writing that like the voice capture step is a snap. Describe your studio. Do you have a nice quiet, echo free room? Most people don’t. This came up before in a similar post.
http://audacityteam.org/forum/viewtopic.php?f=26&t=10293&start=0&hilit=raspberry+sorbet
Koz
Don’t fall in love with that music ducking thing. That can get really annoying in a hurry. The sound people call that “pumping” and it’s the sure sign of somebody trying to record a track in their basement.
The Russian track could easily have been someone applying a compressor to the final in order to make it tight and dense. You might try Chris’s Compressor for that. He wrote a very graceful compressor that can be made to sound like a radio station. The show has limited dynamic range and you can’t quite put your finger on why.
http://pdf23ds.net/software/dynamic-compressor/
Koz
Yah, to be honest, the music in the background sounds heavily compressed and, as mentioned above, gets tiring after a while. Really, it is what keeps the audio from having that crisp, professional sound. What a pity, too, because it is quite a stunning soundtrack, and matches the Russian quite well.
To build on all the above, apply Chris’s Compressor (or any of the other compressors) to just the voice to make it more forceful and dense, then gently mix in the background music. This only works if the voice track is clean – no echo or trash underneath.
You still need to start the day with a good voice and no Metrobus in the background.
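For anyone who wants the order of operations spelled out, here is a rough Python sketch of "compress the voice by itself, then mix the music in at a fixed lower level". This is not Chris's Compressor and not what Audacity does internally. The names `compress` and `mix` and all the default numbers (threshold, ratio, music gain) are made up for illustration, and the compressor is a bare static gain curve with no attack or release smoothing. Samples are assumed to be floats between -1.0 and 1.0.

```python
import math

def compress(samples, threshold_db=-20.0, ratio=3.0, makeup_db=0.0):
    """Static compression curve (illustrative only, no attack/release).

    Any sample whose level is above threshold_db has the excess
    divided by `ratio`; quieter samples only get the makeup gain.
    """
    out = []
    for x in samples:
        level = abs(x)
        level_db = 20.0 * math.log10(level) if level > 0.0 else -120.0
        over_db = level_db - threshold_db
        gain_db = makeup_db
        if over_db > 0.0:
            # Keep only 1/ratio of the overshoot above the threshold.
            gain_db -= over_db * (1.0 - 1.0 / ratio)
        out.append(x * 10.0 ** (gain_db / 20.0))
    return out

def mix(voice, music, music_gain_db=-12.0):
    """Sum voice and music, with the music held at one fixed gain."""
    g = 10.0 ** (music_gain_db / 20.0)
    n = max(len(voice), len(music))
    v = list(voice) + [0.0] * (n - len(voice))   # pad the shorter track
    m = list(music) + [0.0] * (n - len(music))
    return [a + g * b for a, b in zip(v, m)]

# Hypothetical usage: only the voice goes through the compressor.
voice = [0.9, 0.05, -0.7, 0.02]
music = [0.3, -0.3, 0.3, -0.3]
final = mix(compress(voice), music)
```

The only point of the sketch is that the music never passes through the compressor, so the voice cannot drag the music level around.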
Koz
Thanks!
Yes I do have a good recording space, so no worries about room ambiance problems or external sounds (metrobus, etc.).
About the music. Probably the original sound engineer was dealing with a compressed file. Actually I think he mixed two compressed files to make the music track. Then, he mixed the result with the voice track and compressed that. Maybe I can get the original uncompressed music files. I will certainly look into this.
For reference, here is my recreation of the background music: http://www.acsu.buffalo.edu/~breslin/intro_speech_bgmusic.ogg
@Koz – Understood about applying a compressor to the voice alone. But then you say “gently mix in the background music”… Obviously you mean gently in the sense of “not too loud background” but do you also mean compressing the music+voice tracks together (or other operation)?
<<<gently in the sense of “not too loud background”>>>
I mean without aggressively trying to follow the vocals. Rent major movies and listen to what they do with the sound. I don’t know that I would listen to other games. They’re all making the same mistakes.
<<<do you also mean compressing the music+voice tracks together (or other operation)?>>>
Do Not Do That.
The interaction between the vocal and the background during joint compression is what gives you the odd pumping and strange volume changes. You can gently manually duck maybe once at the beginning of the narrative and just leave it down there for the duration of the dialog.
If there is a stinger point or place where the interest or plot changes, that’s when you change things around. I wish I had a sample of a popular movie before they added the sound. It’s stunning how flat it seems until all the track sounds are added – manually.
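A single manual duck can be pictured as a gain curve for the music track: full level, one gentle ramp down as the narration starts, then flat for the rest of the dialog. A rough Python sketch follows. The function name `duck_gains`, the -6 dB depth, and the one second fade are assumptions for illustration, not values anyone in this thread measured. In Audacity you would draw this by hand with the Envelope Tool rather than compute it.

```python
def duck_gains(n_samples, sample_rate, voice_start_s, duck_db=-6.0, fade_s=1.0):
    """Per-sample linear gain for the music: one ramp down, then hold.

    The gain is 0 dB until the voice starts, ramps linearly (in dB)
    to duck_db over fade_s seconds, and stays there to the end.
    """
    start = int(voice_start_s * sample_rate)
    fade = max(1, int(fade_s * sample_rate))
    gains = []
    for i in range(n_samples):
        if i <= start:
            db = 0.0                              # before the narration
        elif i >= start + fade:
            db = duck_db                          # held for the whole dialog
        else:
            db = duck_db * (i - start) / fade     # the one gentle ramp
        gains.append(10.0 ** (db / 20.0))
    return gains

# Hypothetical usage: multiply the music by the curve, sample by sample.
music = [0.5] * 40
curve = duck_gains(len(music), sample_rate=10, voice_start_s=1.0, fade_s=2.0)
ducked_music = [m * g for m, g in zip(music, curve)]
```

Because the curve depends only on time and never on the voice level, there is nothing for the music to "pump" against.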
Koz
Thank you. Let me share a couple sound files with you. First, I have a file which contains a number of sound tests. I hope you can help evaluate it.
http://www.acsu.buffalo.edu/~breslin/voice_test.ogg
I think I have already learned that I should do noise reduction before amplification.
Otherwise, I think this test reflects a reasonably good setup. Any suggestions for improvement?
Now I want your thoughts on the project. First, here is the original Russian audio file, for your reference:
http://www.acsu.buffalo.edu/~breslin/intro_speech.ogg
And now, I have mocked-up an English version. This is just my voice, not the actor’s:
http://www.acsu.buffalo.edu/~breslin/intro_speech_breslin_mockup.ogg
Thanks to Koz, I think the English music track sounds much better than the original Russian, because we’re avoiding the pumping. 
Still, I can certainly tell that the voice doesn’t sound correct yet. (And not only because I don’t have a good acting voice.) But I don’t know what precisely is lacking. It feels like I need more aggressive compression on the voice track, and maybe a bass-heavy equalizer effect. But honestly I’m shooting in the dark. All suggestions would be greatly appreciated!
Edit: I’m recording the actor tomorrow. Then I will edit the recording over the weekend.
You need to do that again and this time put the round nylon pop filter between you and the microphone. Then back up another foot. The track is full of the ultra presence and odd dynamics that only happen when an announcer is trying to talk to you in real life three inches away from your ear. There are actually a number of crack sounds in there from plosive overload and peak distortion. Back off, dude.
And take the waveforms with you. The track I got has voice waveforms that live at 0dB. That’s very dangerous because sound systems run out of steam right there and it invites overly crisp peak distortion. Tracks rarely recover from that. If you’re going to Normalize or Amplify, do it to -1dB.
A surgically clean straight capture track is really important. I would probably not be able to fix that mixed track even if I had the voice by itself. Capture with lots of headroom – don’t get anywhere near zero during the live performance and do final production in production, not during the capture.
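If it helps to see the arithmetic behind "Normalize to -1dB", here is a small Python sketch. It assumes samples are floats where full scale (0 dB) is 1.0. The function names `peak_dbfs` and `normalize_peak` are invented for this example and are not Audacity functions.

```python
import math

def peak_dbfs(samples):
    """Peak level in dB relative to full scale (1.0 == 0 dB)."""
    peak = max((abs(x) for x in samples), default=0.0)
    return 20.0 * math.log10(peak) if peak > 0.0 else float("-inf")

def normalize_peak(samples, target_db=-1.0):
    """Scale the whole track so its loudest sample lands at target_db."""
    peak = max((abs(x) for x in samples), default=0.0)
    if peak == 0.0:
        return list(samples)          # pure silence: nothing to scale
    gain = 10.0 ** (target_db / 20.0) / peak
    return [x * gain for x in samples]

# Hypothetical capture that peaks at half scale (about -6 dB).
take = [0.25, -0.5, 0.1]
print(round(peak_dbfs(take), 2))      # level of the raw take
finished = normalize_peak(take)       # loudest sample now sits at -1 dB
```

A take that peaks well below zero like this one has headroom to spare during the live performance. The scaling up happens afterwards, in production.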
I did a live capture last week for one of our productions. I didn’t have a working script – the director didn’t either – so nobody told the sound guy, me with the headphones, that the next shot was the actor yelling at the top of his voice. We did that one over and my ears are still ringing.
After you straighten that out, duck the orchestra very gently just as the voice starts. In this sample, the orchestra is too hot behind the vocal, but it probably starts out OK.
Viva Don LaFontaine.
http://abclocal.go.com/ktrk/story?section=news/entertainment&id=6365552
http://www.youtube.com/watch?v=ZJMGS7l0wT8
Koz
Have you noticed that this entire thread is in the wrong forum? We should be in Recording Techniques.
Koz
[Moderators note] Topic Moved [/Moderators note]
Thanks again. I recorded the actor, and I got a good baseline track. (Actually ended up cutting a number of tracks together, but I think it’s seamless enough, especially with the music background.)
Post took a lot of messing with, because the actor’s volumes were not at all consistent. Some words I had to select and amplify by hand. I ended up doing normalization, amplification, and a number of light “bass boost” effects at various frequencies between 500 and 1000 Hz. (This worked better than the EQ for some reason…)
Anyway, I think it is in good enough shape to submit, but please tell me what you think:
http://www.acsu.buffalo.edu/~breslin/intro_speech.ogg
Any suggestions about making the voice sound better? Did I duck the music alright?
Oh, that’s just vastly better! No crunchy distortion or peak cracking. Clear and crisp with good expression.
I know you went nuts patching it together. That’s OK. I can’t tell what you did. That’s how Editorial is supposed to work.
I’m going to listen four or five more times, but just the fact that I’m having to work at it is very good. I’m sure I could do much better finding problems with the raw voice track, but the mix may be good to go right now. I like the actor’s presentation even if it was a pain in the butt to assemble it.
Koz
Thanks, Koz. Just for fun, here it is:
http://www.buffalo.edu/~breslin/just_voice.ogg
Mind that there’s a second of silence at the beginning.
EDIT: Also, I used different rules for cleanup. I normally try to shy away from breath removal, because it creates an unnatural sound. (Breath is part of the acting.) But with the music I did this a little more strategically.
Are you beginning to appreciate the ten to one rule? A three minute piece takes thirty minutes of production.
Another editorial oddity is that everything you did in that thirty minutes is devoted to making it seem like you didn’t do anything. You know you succeeded when the final sounds like the actor walked in cold, spread out the script, harumph’d once, ran through it once word-for-word perfectly, packed up and went home. Which is very nearly what this one sounds like.
Is there a little “room” in the raw voice track? You can just tell the size of the room that the track was recorded in – and it wasn’t very large? Did you pile the furniture moving quilts all over?
What was the separation between the actor and the mic?
Koz
Dueling postings. I haven’t heard your voice track yet.
Koz
Hah, I wish! It’s a 1.45 minute piece, and I worked on it for about 4 hours. Then again, I am still learning… Anyway, it’s a pleasure, you know.
<<<Another editorial oddity is that everything you did in that thirty minutes is devoted to making it seem like you didn’t do anything.>>>
Oh yes. I also work in the field of translation. As you can easily imagine, the translator has done the best job when their efforts are entirely invisible.
<<<Is there a little “room” in the raw voice track? You can just tell the size of the room that the track was recorded in – and it wasn’t very large? Did you pile the furniture moving quilts all over?>>>
You have an amazing ear for this. The room is 12x12 foot, dense-weave carpet floor, various quilts and canvas dropcloths on the walls, bare painted foam tile ceiling (bare=not covered).
<<<What was the separation between the actor and the mic?>>>
Yeah, that was just like you said: pop guard about 4 inches from the ball of the SM58; actor’s face about 18 inches from the pop-guard.
EDIT: Also, the mic was pointed directly at the actor’s mouth.
EDIT EDIT: The original was a little uneven. I need to give better “mic control” instructions to the actor. My fault, that. The couple of places in the audio that sound the most uneven are actually not patches; that is just how the actor delivered the line. Anyway, I hope the breaks are covered up enough by the music track.
Was the voice actor acting? Sometimes you have to tie their arms to their sides so they don’t emote too much. Bobbing and weaving is Not Good in voice capture.
This is rough now. You’re right at the edge of I can’t help any more.
Random Silly Notes:
– The grownups would be mixing and composing the two tracks to each other so that the orchestra emotion and presentation matches the voice actor. Sometimes they’re recorded together. More often, the orchestra goes first and then the voice actor – with carefully matching script – listens intently three or four times and then matches his/her presentation to that. You can make a lot of mistakes vanish if you do that.
– Deader Room (I still haven’t heard the voice track).
– Slightly further away from the mic, although it is getting harder to find any improvement here. Naah. Leave it alone. Additional room suppression should do.
– And last, days and pages of postings later, what happens if you add very gentle echo to the voice? Remember the original post? NOW you can do that. I hope you’re in Audacity 1.3. The effects there are much better than earlier versions. The echo effect is very difficult to use because of the ease of overuse. Particularly difficult if you have some room back there to begin with. You can put room in but you can’t take it out. You might easily decide the effect is a bad idea, but I’d be curious to see what it sounds like.
– Since I have actual experience with an SM-58, the pattern is mostly in front, so you can pile or hang the quilts right behind the actor, or create a quilt tunnel with the actor in the middle. You don’t have to deaden the whole world, just the space between the actor and the mic.
I’m sure this list will get more comprehensive when I listen to the voice track.
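On the echo idea: in numbers, "very gentle" means a short delay and a small decay factor. Here is a rough Python sketch of a plain feedback echo. It is a generic textbook echo, not necessarily what the Audacity Echo effect does internally, and `add_echo` and its defaults (0.12 s delay, 0.2 decay, 0.5 s tail) are assumptions for illustration.

```python
def add_echo(samples, sample_rate, delay_s=0.12, decay=0.2, tail_s=0.5):
    """Feedback echo: each repeat is `decay` times the previous one.

    The output is tail_s seconds longer than the input so the last
    repeats have room to die away.
    """
    delay = max(1, int(delay_s * sample_rate))
    tail = int(tail_s * sample_rate)
    out = list(samples) + [0.0] * tail
    for i in range(delay, len(out)):
        # Feed the already-echoed signal back in, one delay earlier.
        out[i] += decay * out[i - delay]
    return out

# Hypothetical check with a single click, to see the repeats fall away.
click = [1.0, 0.0, 0.0, 0.0]
print(add_echo(click, sample_rate=4, delay_s=0.5, decay=0.5, tail_s=1.0))
# -> [1.0, 0.0, 0.5, 0.0, 0.25, 0.0, 0.125, 0.0]
```

With a decay of 0.2 each repeat is about 14 dB below the one before it, so the effect stays well underneath the dry voice. Any room already in the track gets echoed along with everything else, which is why overuse shows up so quickly.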
Koz
I know what you mean. No, the actor was not gesticulating. He is more experienced as a stage actor and director (very solid, impressive resume, etc.), and was a little nervous in front of the mic for the first two takes. But very sharp guy. I could have directed him better – especially, don’t go into the really low dynamic range. It’s hard to find the right balance between directing and distracting. If I had given him lessons on mic control, that would have improved the next session, but it probably would have diminished the current one. That’s simplifying, but I guess you know what I mean.
Anyhow, it’s still more-or-less ok. I amped those low moments. If I had told him more clearly the objective, he could have optimized his performance further, and the performance could have been a little sharper. Well, lessons learned. Vague lessons, but that’s education.
<<<You’re right at the edge of I can’t help any more.>>>
I hope you mean the inside edge. 
Honestly, the way I look at it, you have years of knowledge. But yeah, perhaps it’s already the tail part of the “long tail”: Long tail - Wikipedia
I cranked through the clean voice track. I think you’re at the limit of what you can do with what you have. I can’t find anything wrong with it. There’s a good deal less room slap in the track than I thought I was going to find. The furniture quilts worked.
This may be where you step gingerly into a condenser microphone. The SM58 is nice, but it’s an indestructible live performance microphone with internal blast (and spit) filter and “presence” peaks up around 5 kHz. Not at all graceful or smooth.
I’m going to mess with Chris’s Compressor. Chris had a very simple job to do. He wanted to listen to classical music in the car without constantly turning the volume up and down. He kept adjusting code modules until he couldn’t hear it working. When he got done, he had a full radio broadcast compression chain in a convenient plugin. Or so he found out when I told him.
I posted on his board, “Dude do you realize what you have here? This processor mops the floor with any other compression tools available.”
You may be able to apply this one tool to your raw tracks and let it adjust the volume levels for you. I’ll see what it does to your voice track.
Koz
Oh, and we quote the ten to one production numbers to producers who naturally assume that a three minute piece takes three minutes. They’re horrified. Then we tell them that’s an average.
Koz