I so appreciated seeing a post with a voice test protocol included. Thank you for devising an easy way to begin talking about voice recording issues. I am new to the engineering sides of things, and it can feel a little daunting.
I was hoping to get some feedback on the voice test I made with regard to the following:
My immediate need is to produce clean, super solid recordings that I can send to a studio for post-production
My secondary need is to see if I can set myself up to pass ACX guidelines with the set-up I currently have
For this test, I was aiming to get my Foreground Contrast at around -22.0 dB. Setting levels has been a 3-day battle. I imagine it will continue.
For this test, I had the USB mic gain at 3, Recording Volume in Audacity at 84%.
My mouth was about 4-5 inches from the mic (w/windscreen), about 30-degrees off-axis.
There’s a fuzzy, unofficial rule that if you can pass ACX, you can submit anywhere else successfully. ACX has very strict guidelines that compare favorably to broadcast standards.
This is the short, graphic version.
Please note this is valid only in Audacity 2.3.3. If you’re using 2.4.1, the rules change a bit. Which Audacity are you using?
After Audiobook Mastering, I get everything passing easily with a background noise floor at -75dB. The standard is -60dB and the real-life recommendation is at least -65dB.
Check with the client, but voice work is many times submitted mono, with one blue wave instead of stereo. It cuts the transmission and storage time down by half at no loss of quality. If you read too loud and your waves go all the way up at any time, you could have permanent sound damage.
Tracks > Mix > Mix Stereo down to Mono.
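As a sketch of what that menu item does (float samples assumed; `mix_to_mono` and its averaging behavior are an illustration, not Audacity's actual code):

```python
import numpy as np

def mix_to_mono(stereo):
    """Average the left and right channels into one mono track.

    stereo: float array of shape (n_samples, 2), samples in [-1.0, 1.0].
    Averaging (rather than summing) keeps identical channels at the
    same level, so a centered voice track cannot clip in the mixdown.
    """
    return stereo.mean(axis=1)

# A centered voice (identical L/R) mixes down unchanged:
voice = np.array([[0.5, 0.5], [-0.25, -0.25]])
mono = mix_to_mono(voice)
```

This is why a mono mixdown of a single-voice recording loses nothing: both channels carried the same wave to begin with.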
Export an Edit Master as WAV or Save as Audacity Lossless Project. That’s your Archive.
ACX requires submission as lower quality, compressed MP3. You can’t edit those without sound damage once you make one, so any further work (edits, changes) should be to a copy of your archive, then make a new MP3. That’s not obvious.
ACX will allow you to submit a short voice test for evaluation. It’s a New User mistake to submit a whole book as a first effort. It’s not that unusual to burn permanent damage into the work and have to read it again. (I don’t think this will be a problem for you, but I’m not an inspector).
I was aiming to get my Foreground Contrast at around -22.0 dB. Setting levels has been a 3-day battle.
You don’t need the battle. Nobody can read directly into ACX. The actual goal is to read so your wave peaks occasionally reach half-way or -6dB (depending on where you’re looking). Audiobook Mastering tools will take it from there. If you read too far off, you may not pass noise. That’s the evil lurking behind bad readings.
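For reference, "half-way" on the waveform and "-6 dB" on the meters are the same target seen two ways: dBFS is just 20·log10 of the linear peak. A minimal sketch:

```python
import math

def peak_dbfs(peak_linear):
    """Convert a linear peak value (1.0 = full scale) to dBFS."""
    return 20 * math.log10(peak_linear)

print(round(peak_dbfs(0.5), 1))  # -6.0  (wave reaching half-way)
print(round(peak_dbfs(1.0), 1))  # 0.0   (full scale, the clipping point)
```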
Your client may need other specifications. Your mileage may vary, consult your local listings.
Thanks for your illuminating posts. I’m learning a lot.
To answer Koz’s question, I am using 2.3.2, but your step-by-step list does not correspond to my choice of Effects. For example, I have no Filter Curve option on the pull down list. Normalize is on my list, but not RMS Normalize, if there’s a difference.
Also, I have tried adding the ACX plug-in several times, and it doesn’t show up in my Add/Remove plug-in list. I have about 6 copies of the .ny file in my Downloads folder, though. What step am I missing?
Will all my problems be solved if I just install 2.4.1 instead?
Back to the audio question, if I aim to peak at -6.0 dB instead of -3.0 dB in the raw recording, is there anything lost acoustically by relying on mastering to compensate for the narrower dynamic range? I sometimes find that the mastering in spoken word can sound so over-produced that the voice no longer appears to be a genuine human sound. And of course, this is a matter of taste and the conditions called for in a final product.
But am I mistaken/misguided in wanting to make the most usable, honest, raw data possible, with the intent to only use mastering for smoothing out the edges rather than bumping up/covering up/beefing up every partial in my voice because that’s the only way it will meet specs?
And Koz, I’ll get you that evil automated message as soon as I figure out how to work out the kinks in my audio set-up.
For example, I have no Filter Curve option on the pull down list.
In older versions of Audacity I think you’ll find that as part of the “Graphic Equalizer” or “Equalizer” or “EQ”.
Normalize is on my list, but not RMS Normalize, if there’s a difference.
There is a BIG difference. Regular Normalize targets the peak level; for ACX you need to set the RMS (loudness) level. You can download it here: [u]RMS Normalize[/u].
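The difference can be sketched in a few lines (a simplified illustration with float samples; -20 dB RMS sits in the middle of ACX's -23 to -18 dB window):

```python
import numpy as np

def peak_normalize(x, target_db=-1.0):
    """Scale so the single highest peak lands at target_db dBFS."""
    return x * (10 ** (target_db / 20) / np.max(np.abs(x)))

def rms_normalize(x, target_db=-20.0):
    """Scale so the overall RMS (average loudness) lands at target_db dBFS."""
    return x * (10 ** (target_db / 20) / np.sqrt(np.mean(x ** 2)))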
Back to the audio question, if I aim to peak at -6.0 dB instead of -3.0 dB in the raw recording, is there anything lost acoustically by relying on mastering to compensate for the narrower dynamic range?
There is no significant difference. Your acoustic & analog levels could make a difference. i.e. A “stronger” voice gives you a higher signal-to-noise ratio relative to the room noise. But, turning-down the digital levels is not a problem. (Pros often record around -12 to -18dB.)
I sometimes find that the mastering in spoken word can sound so over-produced that the voice no longer appears to be a genuine human sound.
Normalization is simply a volume adjustment. It doesn’t affect the quality or character of the sound (assuming no [u]clipping[/u]/distortion).
Since RMS Normalization typically ends-up amplifying, it will also amplify the background noise. That doesn’t actually affect the sound quality because it’s no different from the listener turning-up the volume control, but it does make your ACX measurements worse…
The low frequency rolloff only removes low frequency noise. It doesn’t affect the normal voice frequencies. You may not hear any difference but your ACX measurements should be better.
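As an illustration of what a rolloff does, here is a first-order high-pass with an assumed 100 Hz corner (the preset's actual EQ curve is steeper; treat this as a sketch of the idea only):

```python
import math

def one_pole_highpass(samples, sample_rate, cutoff_hz=100.0):
    """First-order high-pass: steadily attenuates rumble below
    cutoff_hz while passing voice frequencies almost untouched."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out

# A constant (0 Hz) offset, the extreme case of low-frequency noise,
# decays toward zero:
dc = one_pole_highpass([1.0] * 2000, sample_rate=44100)
```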
Limiting does slightly change the character of the sound, but it prevents the clipping that would otherwise be caused by RMS normalizing and it brings your peaks into spec. With most audiobook recordings, limiting (slightly) improves the sound by making it “stronger” and more consistent.
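In its crudest form a limiter is just a ceiling. This hard-clip sketch (with an assumed -3.5 dB ceiling, safely under the -3 dB peak spec) shows why only the loudest peaks are touched; a real soft limiter rounds the tops off more gently:

```python
import numpy as np

def hard_limit(x, ceiling_db=-3.5):
    """Clamp any sample above the ceiling; everything quieter than
    the ceiling passes through unchanged."""
    ceiling = 10 ** (ceiling_db / 20)
    return np.clip(x, -ceiling, ceiling)
```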
Noise reduction is where you might hear processing artifacts. (Or, if you over-equalize in an attempt to “improve the sound”.)
LF_rolloff_for_speech.xml (299 Bytes)
Some of this is going to seem stunningly painful, but you only have to go through it all once. Not only does the process give you the tools, but the tool settings stick after the first time you use them.
I can master a short chapter in about twenty seconds. It’s not rocket surgery, but you do have to build the toolkit and that takes a little doing.
I feel like I’ve had success! Thank you all so much for your help!
I got the new plugins set up for ACX check and RMS Normalization.
My Equalization effect does have a low roll-off for speech option, so I’m using that. I assume I don’t have to do that coding thing that Koz sent. Or now do I have to learn a programming language too? If so, I might have to quit this whole endeavor and take up knitting or something. It took me at least a 1/2 hour just to find my hidden plugin file. But like you said, painful at first but now it’s in there and I don’t have to deal with it.
So, I did the mastering protocol Koz sent. And the first ACX check came back with the noise floor at -59.6, so then I ran the Noise Reduction effect and got it down to -71.4. Was that the right order to do that in, or should I have added that Noise Redux bit somewhere else in the protocol?
I feel like the order of operations for the effects is extremely important.
Which brings me to my final question: When/how to use Noise Gate? I have applied Noise Gate using the default settings to the sample here after running the ACX Process chain and Noise Reduction. There are several levels to set in Noise Gate and I’m not sure how to arrive at the proper ones for my particular breathing dB, which varies, of course.
I have a particular breath at 6.5 sec that’s as loud as some of my spoken words. I think it’s natural to hear a breath between “I said,” and “My dear bird…”, but I’m wondering what my options are for reducing those audible breaths even more. Obviously improving my mic technique is one option, but I’m sure I’ll never completely avoid it. Do I need to adjust Noise Gate, or do I just work magical effects on each intentional audible breath one by one?
Also, would Punch copy/paste be used after Noise Gate? Or before?
Thank you as always for your help! I feel like I’m getting close to mastery at the remedial level!
Never. Noise Gate makes it too easy to create sound damage by accident: clipped words, partial breaths, or pumping background noise.
You are right at the limit of regular Noise Reduction, too. What was your noise? There may be other, better ways to deal with this. Burn a small sound test like this and post it. Don’t correct anything.
It’s possible to fake out ACX Check. If you never leave a 3/4-second quiet gap anywhere in your presentation, ACX Check is going to measure some of your actual performance as a desperation move. That’s going to appear unnaturally loud and doesn’t reflect the actual background sound.
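For illustration, the fake-out can be seen in a toy noise-floor measurement. This assumes the floor is taken as the RMS of the quietest half-second window, which may not be exactly what ACX Check does:

```python
import numpy as np

def noise_floor_db(x, sample_rate, window_s=0.5):
    """Report the RMS of the quietest window as the 'noise floor'.
    With no genuinely silent gap in the file, the quietest window
    still contains performance, so the reported floor is inflated."""
    win = int(window_s * sample_rate)
    quietest = min(
        float(np.sqrt(np.mean(x[i:i + win] ** 2)))
        for i in range(0, len(x) - win + 1, win)
    )
    return 20 * np.log10(max(quietest, 1e-10))
```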
The goal is to announce something, master it to get the loudnesses in the right places and that’s it. If you have to start messing with multiple noise reduction strategies, you’re probably doing something wrong.
Trebor, thank you for the magic enveloping tool. I knew about it for mixing tracks, but never considered it for editing loud sounds. I look forward to casting shrinking spells where appropriate.
I’ve included the raw sound clip here.
When I run EQ, RMS Normalization at -20, and Limiter, my noise floor consistently comes in at -58.9 dB. If I change the RMS Normalization level to -21.2, I can get it to pass ACX. But that white noise at the beginning gets so loud after running the mastering chain, IMO. I tried to record this clip at a higher gain level to get my peaks in the -6 to -3 range in hopes of making more distance between my voice and noise floor, but I don’t know if it really helped. I feel like so much of what I read has that creepy, slightly whispered tone, so it takes something really special to get into the yellow-orange area in general.
Glad to know about the perils of noise gate. I’ll take your advice and not use it.
Am I understanding it right that Noise Reduction is more for spots rather than an effect to apply to the whole file?
There is a piece missing. What is the microphone and where is the computer?
Most of the background noise is plain, soft, rain-in-the-trees shshshshshshsh. Piece of cake, but there are two tones in there which could be computer fan noise.
What kind of lights do you have? Do you use a dimmer on older tungsten-incandescent lights? Ceiling lights with dimmers?
This is also the kind of thing you can get from noises coming up through the floor, your air conditioner or downstairs neighbor’s air conditioner. That snaps us back to the microphone. Most short desk stands have no sound isolation.
There’s a bunch of other oddities and distortions in there I can’t account for. I can force this to work by carefully hand-correcting each little error, but the errors should not be there.
Please tell me you don’t have a BM-800 Professional Broadcasting Studio Recording Microphone Kit including swing arm, vibration mount, and blast filter for $14.99. Get yours today!!
You’re right. That’s much better than Hudson Valley Cows.
I thought I had sent this reply days ago, but seeing as I haven’t heard from anyone, I must have goofed something up when I submitted it.
So, my mic is a Blue Yeti X. I read from paper not a device, my computer is quiet, my light bulb is not a dimmer. I occasionally hear noise from the building, but try not to record when I hear a vacuum, a/c unit, lawn mower.
My biggest struggle is managing speaking quieter and still reaching decent peaks given my equipment and recording environment. If I keep the gain around 3 and speak at a medium-loud volume all the time, I have no problems with the RMS Normalization in post (see first example of cows) because it only has to amplify by about 3 dB and my noise floor is in good shape. If I whisper-speak, then to get peaks at -6 to -3, I raise the gain level and get a way louder noise floor in the raw data (see second example). Or I try whisper-speaking at a lower gain, and then get the loud noise floor in post because RMS Normalization amplifies the sound clip by 8 dB or more.
I’ve tried changing my proximity to the mic, and it’s only slightly helpful, because then plosives become an issue or it just sounds too much like a weird effect or pillow talk.
So, the example I’ve attached here is Mic gain at 4, double-thick nylon pop-filter, 45-degrees to mic. I made sure to include a passage with lots of Ps, to see if my methods work. If I put the gain up to 4 and 5, I see slow undulating hums in the noise floor, and at 6 and above, the noise floor stops undulating and just looks like a thick rectangular blue strip that just gets taller with every incremental increase in gain.
How do I handle mic gain levels for whisper-speaking when the noise floor becomes an issue? This pertains to both raw and post. I thought I had found a solution in post with noise reduction, but apparently that’s not good form. Your advice would be much appreciated.
Is the RMS Normalization effect intended for longer passages of audio that include more dynamic range perhaps? Said another way, do some of my issues have to do with recording softly for a short clip without enough loud bits for the effect to be balanced?
If RMS Normalization creates red clipping marks, should I go back into the raw data and do something about that perhaps with the envelope tool? Obviously the limiter takes care of those peaks in the end, but I can’t help but wonder if that artificial clipping and chopping off the peaks create odd sounding bits.
Is it possible to tell if my problems are with my recording environment or with my gear? I don’t know mics enough to know whether a high quality XLR mic and interface will solve my problems with levels, if indeed the issue is with my closet sound booth. The last thing I want to do is plunk down a bunch of money on gear that only serves to capture my background noise with more clarity and beauty…oh joy!
Ok, I think that’s it for now. Thank you all for your help!
It has adjustable sensitivity patterns. One of them may have lower noise than the others.
Cardioid is the optimal pattern shape for one vocalist, but maybe try the others to see if they produce better results.
RMS Normalization does not affect the dynamic range: it adjusts the gain of all of the selected audio by a constant amount. You may be thinking of compression, which reduces the dynamic range. Compression may be necessary if you’re doing the voices of loud- and soft-spoken characters.
Audacity’s built-in compressor is OK but slow to use; there are free compressor plug-ins which are much better, e.g. GMulti, a real-time multiband (3-band) compressor whose high-frequency band can be used to de-ess.
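For reference, the core of a (single-band) compressor can be sketched sample-by-sample; a real compressor like GMulti follows a smoothed envelope with attack and release times rather than acting on raw samples:

```python
import numpy as np

def compress(x, threshold_db=-20.0, ratio=3.0):
    """Above the threshold, every ratio dB of input becomes 1 dB of
    output; below it, samples pass unchanged. This narrows the gap
    between loud and soft characters."""
    thr = 10 ** (threshold_db / 20)
    mag = np.abs(x)
    over = mag > thr
    out = x.copy()
    out[over] = np.sign(x[over]) * thr * (mag[over] / thr) ** (1.0 / ratio)
    return out
```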
I’d use an expander (like the free version of Couture) to bring down the noise floor when you’re not speaking, before resorting to noise reduction. An expander only kicks in when the volume is very low, whereas noise reduction is constantly on. If you must use noise reduction, use the least amount necessary to pass. (If an expander alone is not enough, you could use an expander plus a hint of noise reduction.)
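The expander idea, sketched the same simplified way (real expanders such as Couture smooth the gain so it doesn't chatter on and off):

```python
import numpy as np

def expand(x, threshold_db=-50.0, ratio=2.0):
    """Downward expansion: samples *below* the threshold are pushed
    further down (each dB under the threshold becomes ratio dB),
    lowering the floor between phrases while leaving speech alone."""
    thr = 10 ** (threshold_db / 20)
    mag = np.abs(x)
    under = (mag < thr) & (mag > 0)
    out = x.copy()
    out[under] = np.sign(x[under]) * thr * (mag[under] / thr) ** ratio
    return out
```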