Audiobook Advice?

I’m currently narrating an audiobook, and I was taught some basics by another author. I used Noise Reduction, created a sound profile based on 10 seconds of silence, then applied it. I’m now editing it chapter by chapter, trying to get the pacing right, shortening "S"s, hard Ks and Ts, and applying Clip Fix to sections that are too loud to bring them more in line with the rest.

The only other thing I’ve been applying is a Graphic EQ filter to deal with popping the mic (sometimes my Ps blow it out a bit, but the EQ filter I was shown fixes it most of the time).

My question is: once I get all that done, is there any kind of pass I can do to help everything sound more consistent, like it was all done in one sitting, and to reduce how easy it is to tell that two sentences were done in two different takes, if that makes sense? I try to be tonally consistent, but there just seem to be little tells nonetheless.

Thanks in advance!


Most people can get rid of popping by placing the microphone off to one side (B), known as Oblique Positioning. Most plosive blasts travel straight out in front of the mouth.

Your recording engineer should be taking care of that for you, but failing that you can do very well by wearing headphones for live monitoring. After you get used to it, you can make most wild volume swings go away just by listening. It’s instantly obvious when you’re doing something wrong.

Note that the headphones have to plug into your microphone or preamp. You can’t plug them into the computer; that will give you echoes and delay.

There are production tricks for that. If you know you made a mistake, leave the recorder going, stop reading, look back to the previous clean sentence or idea break, and read the whole thing again, matching the tone, volume, and cadence. The correction will just be deleting the bad parts later. The before and after should match because you recorded them back to back and didn’t try to do them on different days.

We have two handy tools for audiobook production: ACX-Check and Audiobook Mastering. Audiobook Mastering is a collection of tools (a Macro) that does leveling and tailoring in one go to meet ACX technical standards.
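For anyone curious what a level check of this kind looks at, here is a rough Python sketch. It is not the ACX-Check plugin's actual code. It assumes the commonly cited ACX targets (peak no higher than -3 dB, RMS between -23 dB and -18 dB, noise floor no higher than -60 dB) and assumes you hand it the performance and a silent stretch as separate lists of samples scaled to ±1.0. The function names and test tones are made up for illustration.

```python
import math

# Commonly cited ACX targets (assumed here; confirm against ACX's current requirements).
PEAK_MAX_DB = -3.0
RMS_MIN_DB = -23.0
RMS_MAX_DB = -18.0
NOISE_FLOOR_MAX_DB = -60.0


def to_db(amplitude):
    """Convert a linear amplitude (1.0 = full scale) to dB; silence maps to -inf."""
    return 20.0 * math.log10(amplitude) if amplitude > 0 else float("-inf")


def peak_db(samples):
    """Level of the single loudest sample."""
    return to_db(max(abs(s) for s in samples))


def rms_db(samples):
    """Average (root-mean-square) level over the whole list."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return to_db(math.sqrt(mean_square))


def check_levels(voice, silence):
    """Measure a performance and a silent stretch, and flag each target as met or not."""
    peak = peak_db(voice)
    rms = rms_db(voice)
    noise = rms_db(silence)
    return {
        "peak_db": peak,
        "rms_db": rms,
        "noise_db": noise,
        "peak_ok": peak <= PEAK_MAX_DB,
        "rms_ok": RMS_MIN_DB <= rms <= RMS_MAX_DB,
        "noise_ok": noise <= NOISE_FLOOR_MAX_DB,
    }


if __name__ == "__main__":
    rate = 44100
    # Hypothetical stand-ins: a 220 Hz tone for the voice, a very quiet 60 Hz hum for the room.
    voice = [0.15 * math.sin(2 * math.pi * 220 * n / rate) for n in range(rate)]
    silence = [0.0005 * math.sin(2 * math.pi * 60 * n / rate) for n in range(rate)]
    for name, value in check_levels(voice, silence).items():
        print(name, value)
```

The real tool finds the quiet part of the clip for you; this sketch skips that step, which is why it needs the silent stretch passed in separately.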

If you perform in a quiet, echo-free room, it doesn’t matter what kind of microphone you use.

That’s my iPhone in Lossless Voice Memo.

I read it, transferred it to my Mac, applied Mastering and a very gentle Noise Reduction. It passes ACX-Check.

ACX will no longer do a short performance evaluation like they used to, but if you produce a short sound test and post it on the forum, we can do a quick test.


There are plugins specifically for that, called de-essers.
There are free ones which work in realtime in Audacity …

Thanks for this. I’m beyond a noob at this, but will try to muddle through with whichever version seems to be the most intuitive to figure out.

So with Desibilator, by saying it doesn’t operate in real time, you’re talking about selecting the file after you’re done and running it, right? The other stuff is meant to kill the sounds while you record?

Edit: So, after trying it, it seems to quiet the S’s, but the problem I am told is that they are too long, so I’ve been manually shortening them. Still, the de-esser seems like a good thing to apply after I’m done.

(as I said, very much a noob here)

Post-production processing should not be a career move.

I have a very nice AKG professional/recording/broadcast microphone. I got a reasonable deal on it. I did wonder why the company stressed the “professional” quality of the work and didn’t do that for their regular top quality microphone line. Then I found out. I did exactly one show with it and it’s been shut in its carefully fitted aluminum case ever since.

It’s Bright. Aggressively so. Very sparkly, sharp, and gritty. You might be able to deal with that with a pro announcer and a heavy windscreen with mixing desk in a studio, but for home use it points up and emphasizes every little mouth noise, saliva motion, and irritating twitch.

Sound familiar? Where did all those mouth noises come from? You have my iPhone voice sample. Did we ever get a test from you?

Record A Voice Sample.

It’s quick. 11 seconds.

Without that, we are all guessing what the problem is. Do Not apply any effects or processing, including volume adjustments. Just record it, stop, File > Export a WAV and post it on the forum. The posting symbol is the heavy bar and up arrow.

The cows are waiting for you.


Oh, sorry… I didn’t really realize how important that was to this process.

Hope this helps.

We’ll probably need to deal with studio protocol before we go much further. ACX has a performance they call Room Tone. That’s the sounds the room makes when you’re holding still and if possible not breathing at all.

This is my explainer in the Catskill Farm instructions.


It’s one of the two blue links.

This is what my Room Tone segment sounds like boosted and looped to make it longer. Turn your volume down a bit.

This is what yours sounds like.

You need to stop that. Noise Reduction will not get rid of that. Those thumps, slaps, and ticks are now a permanent part of your sample.

I also found evidence of being too close to the microphone. Is the mic off to one side as in Oblique Positioning? How far away are you? One Shaka?

Are you using a pop and blast filter? Either a sock over the microphone or a tennis racket?

What is the microphone?


Here’s the best room tone I can manage. When I made my sample I had a different chair from what I usually use (a stool with no moving parts), so that might have affected it.

To answer your questions as best I can…

Yes to the one Shaka.

I have a tennis racket.

This is the microphone I have:

Much better. Now do the two second quiet lead-in and instead of doing the cows, do something typical of your target presentation.

“And because of an easily excitable Broadway producer eager to get on with it in New York, Elizabeth, her maid and pug dog missed sailing on the Titanic by one boat.”

That’s one of my stories. Pick your own.

The forum will accept sound files up to 22 seconds in stereo. Same rules. Don’t patch or fix anything.


Hopefully this will do.

That’s more better. I opened it in Audacity, applied the Audiobook Mastering Macro and a very gentle noise reduction.

It passes ACX technical standards and is submittable just like that.


I can’t do vocal quality from here. That will need to wait for tomorrow morning.


Here’s the 192-Constant MP3 that you would be submitting.

It passes ACX-Check and it probably sounds exactly like you.

I can listen to a story in that voice.

As an experiment, I applied Effect > Bass and Treble with the Treble reduced 3 dB (slightly muffled). It didn’t appear to make that much difference. So, on the idea that the less processing, the better, I left it out.

I need to drop for a while.


There is one strong recommendation from ACX. Do everything in mono. One blue wave, not two.


I got there by the little drop-down menu to the left of the track > Split Stereo to Mono. Assuming they’re the same, delete one track.

Everything is better in Mono. It takes up less data space, it’s easier to edit, it’s faster to transmit. You would think actually recording in mono would be The Golden Way. That’s where you run into troubles. If you force Audacity to record in mono, it might give you a mono show at half-volume (small blue waves). You can boost it up later, but that also boosts the background noise.

You can go ahead and record in stereo, but cut it down to mono after the performance. That one gives you the least sound damage, it’s just more work.
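One way a half-volume mono recording could come about, offered as an assumption rather than a description of Audacity's internals: if the microphone only feeds one channel of a stereo input and the mono version is made by averaging both channels, the silent channel drags the level down by about 6 dB. Keeping one channel after the fact avoids that. The Python sketch below uses made-up function names and a test tone to show the difference.

```python
import math


def peak_db(samples):
    """Peak level in dB relative to full scale (1.0); silence maps to -inf."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak) if peak > 0 else float("-inf")


def average_to_mono(left, right):
    """Mix down by averaging the two channels sample by sample."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]


def keep_left(left, right):
    """Split the stereo pair and keep only the left channel."""
    return list(left)


rate = 44100
# Hypothetical recording: the mic feeds the left channel only, the right is silent.
left = [0.5 * math.sin(2 * math.pi * 200 * n / rate) for n in range(rate)]
right = [0.0] * rate

print("keep one channel:", round(peak_db(keep_left(left, right)), 2), "dB")
print("average channels:", round(peak_db(average_to_mono(left, right)), 2), "dB")
```

When both channels carry the same signal, averaging and keeping one channel give the same level, so the difference only shows up when one side is silent or much quieter.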

I believe recording directly in mono is a Feature Request for a future Audacity.

Now all you have to do is pay attention to all the little audiobook rules and regulations. When to announce the title, when to include Room Tone, how to produce the Retail Clip, etc.


I thought there was a posting with both ACX-Check and Audiobook Mastering Macro in one place.

If you do everything else exactly correct, that and Noise Reduction may be the only tools you need.

I don’t know if I mentioned it, but your submission passes ACX-Check without the Noise Reduction. It only does so by half the thickness of a razor blade, though, so I included a 6, 6, 6 reduction.


Audacity effects are all post-production, i.e. effects are applied after recording.
The Desibilator is not realtime: you preview a few seconds of the effect, adjust the settings, then preview again (‘wash, rinse, repeat’),
whereas realtime effects can be adjusted in real time during playback, as if they were a hardware device.