Trying To Get Things Right For ACX Submission

assume that I’m a total novice, and won’t understand any jargon

You can, over weeks, become an Entertainment Producer, Recording Engineer, and Theatrical Performer, or you can show up at a professional recording studio and let them handle most of it for you at great expense.

They make it look easy, so it’s not unusual for someone to studio-publish audiobooks and decide to do it from home. How hard could it be?

Fair warning: the longest message on the forum is Ian recording from his apartment in Hollywood (the real place, not the metaphor). Over a year of prompting and guidance, we got him going, and he is now a successful reader.


Thanks very much for that, Koz. I’ve attached a sample file, as requested.

I’ve used the Audacity tools you outlined, on one of my recorded chapter files. It’s sounding good to me (as a total novice, and probably not listening for all of the things that the Audible assessors will listen for).

Also, the recording volume now seems to sit largely up in the -12 to -6 dB range. Is that correct (given their -23 to -12 range requirements)?

I’ve got a couple of clicks I’m finding it really hard to get rid of, because they’re concurrent with speech (not in the silences between). They haven’t come from the mouse, because I don’t use it during recording - I have a PDF file of my book, and I use the automatic “scroll” function (Control/Shift/H), so movement of the text is entirely silent. They must come from my mouth somehow.

Happy to have further feedback from here. Thanks a lot.

Re the ACX Check plugins (from your link):

  • Are you suggesting that I should download and add them all to my version of Audacity?
  • With some (e.g. Peak Finder, Silence finder, etc.), they add labels. What actions are needed, where labels are added? Does that mean “start again”, or can files be edited to overcome the issue?
  • Some of my chapter files (even those recorded on the same day, at different times of day) appear to have a different level of background noise in them. Can I do anything about that, to “even them out”, so they’re all the same level?
  • Are there any fully-comprehensive tutorials online, helping newbies to use each plug-in successfully?



I think that’s going to work out just fine. After mastering, it passes ACX technical standards with no further work and it sounds good. As you noted, there may still be theatrical patching and corrections here and there, but I think we’re over half-way home.

I need to go play Real Life for a while. Every so often it catches up.


Many thanks, Koz. And yes, I get the need to do “Real Life” from time to time! (It’s 7:06 pm here in NZ, so I should be sitting with my feet up, glass of wine in hand, and talking to my wife - not staring at the screen of my laptop!)

It’s not easy to know whether each of my files will pass the tests. Over the course of a 25 minute read, and chopping out errors, the modulation of my voice varies a lot more than it did in that small 10-second clip sample I sent to you.

Is there anyone (either at your end, or at ACX) who can/will check them for me, once I’ve mastered all of my chapters, using the four effect editors:

  • Noise Reduction
  • Normalization
  • EQ
  • Limiter?
    Or do I simply assume that using all of those will be enough to get me through?

Should I download and use the other ACX check plugins from your hyperlink?

If so, how do I use each, and what do I need to do when it highlights an issue (see my post above, asking for additional help on this)?

Many thanks.

There is only one current working ACX-Check. It’s by Steve Daulton based on work by Will McCown.

You don’t have to use it, but checking your work for Audiobook Conformance without it, while possible, is just not fun. You can’t do it just by looking at the blue waves. This is the process.

Since audiobook reading was getting more and more popular, it was determined that designing a one-pass, dedicated test was desirable.
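For the curious, here is a rough sketch of the kind of measurements a one-pass conformance test makes. It is not the ACX-Check plugin's code. The three limits (peak no higher than -3 dB, RMS between -23 and -18 dB, noise floor no higher than -60 dB) are ACX's published figures as commonly quoted, so check the current ACX page before relying on them. The function names and the half-second "quietest window" noise measurement are my own assumptions for illustration.

```python
import math

# Assumed limits, as commonly quoted from ACX's published requirements.
# Verify against the current ACX page before relying on them.
PEAK_MAX_DB = -3.0
RMS_MIN_DB, RMS_MAX_DB = -23.0, -18.0
NOISE_MAX_DB = -60.0


def to_db(value):
    """Linear fraction of full scale -> dB. Digital silence maps to -infinity."""
    return 20 * math.log10(value) if value > 0 else float("-inf")


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def check_chapter(samples, sample_rate, window_seconds=0.5):
    """Measure a mono chapter (floats in -1.0..1.0) against the assumed limits."""
    peak_db = to_db(max(abs(s) for s in samples))
    rms_db = to_db(rms(samples))

    # Noise floor: RMS of the quietest window (a simplification for illustration).
    win = max(1, int(sample_rate * window_seconds))
    windows = [samples[i:i + win] for i in range(0, len(samples) - win + 1, win)]
    noise_db = min(to_db(rms(w)) for w in windows) if windows else rms_db

    return {
        "peak_db": peak_db,
        "rms_db": rms_db,
        "noise_db": noise_db,
        "passes": {
            "peak": peak_db <= PEAK_MAX_DB,
            "rms": RMS_MIN_DB <= rms_db <= RMS_MAX_DB,
            "noise": noise_db <= NOISE_MAX_DB,
        },
    }
```

All three numbers come from the samples themselves, which is why you can't judge conformance just by looking at the blue waves.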

the modulation of my voice varies a lot more than it did in that small 10-second clip sample I sent to you.

Right. That’s a hardware error. The first thing the recording engineers do in their studio is clap broad, fuzzy headphones on your head so you can hear yourself.

That greatly reduces theatrical voice volume wandering. It’s nice to think you can solve all of these problems by pushing a button or applying a correction or effect later, but the headphone trick is a lot faster and more efficient.

This must be done by plugging your wired headphones into the microphone, interface, or sound mixer.

[Image: Screen Shot 2022-01-07 at 8.41.34 AM.png]
No, we don’t recommend Apple earbuds for theatrical recording. They were handy for the picture. You can’t listen to the computer without reverb and echoes, so that lets wireless out.

The other trick to this is watching your recording levels, which is what the recording engineer would be doing behind the glass wall.

This, in general, is what good recording volume should look like. We call for occasional sound peaks at about -6dB on the bouncing sound meter or about 50% height on the blue waves.
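If you wonder why -6dB on the meter lines up with about 50% height on the blue waves, it's the standard decibel formula: amplitude = 10^(dB/20). A quick sketch (the function names are just for illustration):

```python
import math


def db_to_amplitude(db):
    """Level in dB relative to full scale -> fraction of full waveform height."""
    return 10 ** (db / 20)


def amplitude_to_db(fraction):
    """Fraction of full waveform height (0 < fraction <= 1) -> level in dB."""
    return 20 * math.log10(fraction)


for level in (-3, -6, -12, -23):
    print(f"{level:>4} dB -> {db_to_amplitude(level) * 100:.1f}% of full height")
```

-6dB works out to roughly 50% of full height, and every further 6dB down halves the height again.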

Yes, it’s good to keep one eye on the volume meters while you read. That’s what the recording engineer would be doing while the performance artist performs.

Another way to ensure even, graceful reading is to not stop. If you make a fluff, pause really briefly, read it again, correctly, and just keep going. Some performers clap loudly once so they can find the error later by looking at the lumps in the blue waves. If you find yourself constantly doing that, then you may be reading too fast or the wrong person is reading.

If you find ticks or pops in your performance for No Good Reason, that can be a computer hardware problem that needs to be investigated.

One not very popular universal solution to a lot of recording problems is to stop recording on the computer. That’s my Zoom H1n audio recorder.


Is there anyone (either at your end, or at ACX) who can/will check them for me, once I’ve mastered all of my chapters, using the four effect editors:

  • Noise Reduction
  • Normalization
  • EQ
  • Limiter?
    Or do I simply assume that using all of those will be enough to get me through?

You probably shouldn’t arrive at the party with a laundry list of effects and corrections. ACX has a failure called “overprocessing.”

The goal is to sound like you’re telling someone a fascinating story over cups of tea, not Zoom Voice.

After you get your basic chapter voice recording down, Master it, apply Noise Reduction if needed and maybe nothing else. Mastering will take care of volume and basic wave conformance. EQ, Normalization, and Limiter are all built-in. Mastering was designed to, as much as possible, not affect your voice other than volume. Remember Mastering has to be done in the right order with the right tools. There is no ad-libbing and fine tuning.
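To show why the order matters, here is a toy sketch of "set the loudness first, then tame the peaks." It is not what Audacity's tools actually do inside: a crude hard clip stands in for the real Limiter, the EQ step is left out, and the -20 dB RMS and -3.5 dB ceiling figures are assumptions for illustration. All function names are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_rms(samples, target_db=-20.0):
    """Scale the whole chapter so its RMS lands on target_db (assumed value)."""
    level = rms(samples)
    if level == 0:
        return list(samples)  # digital silence: nothing to scale
    gain = 10 ** (target_db / 20) / level
    return [s * gain for s in samples]


def hard_limit(samples, ceiling_db=-3.5):
    """Crude stand-in for a limiter: clip anything above the ceiling (assumed value)."""
    ceiling = 10 ** (ceiling_db / 20)
    return [max(-ceiling, min(ceiling, s)) for s in samples]


def master_sketch(samples):
    """Loudness first, peaks second. Reversing the order can push peaks back over."""
    return hard_limit(normalize_rms(samples))
```

If you limited first and normalized afterward, the volume change could lift the peaks back above the ceiling, which is one reason the steps aren't interchangeable.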

Unfortunately, ACX no longer offers voice quality analysis before submission. Jury’s out for Findaway. I haven’t read that far.

Submission requirements are published both in ACX and Findaway.

That has to do with length of chapters, amount of Room Tone before and after each chapter, etc. etc.

A note that Silence is different from Room Tone. Room Tone is the natural, quiet room noise behind you when you stop talking. That’s the sound they want at the beginning and ending of each chapter.

There are also requirements for promotional and publicity segments, what to do if your chapters are too long, etc. etc. That’s wearing your Producer hat.

It’s not the worst idea to buy and listen to an audiobook. I can talk up the process and rhythm of a book all day long, but there’s nothing like actually listening to one to bring it home.

(…Silence…) “My Mighty Great Book, Chapter One” (…Silence…) “Lucy had no good idea how she managed to be tied up on the floor of an opium den…”

I have most if not all of the Sarah Vowell audiobooks. She has an unconventional announcing style that appeals to me.


One other important consideration. When you get done reading one chapter, errors and all, File > Export a WAV sound file as a backup. Save it somewhere safe.

At the other end of the process, Export another WAV file as your Chapter Edit Master. That’s the one you save in case you have to change anything. You can Save an Audacity Project for this, but oddly that’s not recommended as the only method. Projects can be unstable and brittle.

Only then Export the MP3 for submission. You can’t change the MP3 once you submit. Repeatedly editing MP3 files degrades the voice quality. Edit or correct the Edit Master WAV and make a new MP3. (Being obsessive, I would save the original).


I found an original reading I messed up because I lost my place in the script. Rather than stop right then and go through all the patching routines, I paused, backed up my reading to the next even phrase break and kept going. I went back later and joined the two good phrases.

The first clip is the error and the second is the patch. You can’t tell by listening to the second clip where the error is. There’s no volume, rhythm, or emphasis change.

This was recorded in my quiet bedroom with that Zoom sound recorder and roll of paper towels for spacing.



Note in the error correction, I did not stop the recorder. I left it running and just looked back in the script for a good break or pause point and started reading again. Because I didn’t take a long time for the correction, I didn’t forget what the vocal tone, rhythm, and volume were in the sentence.

This particular example was not a long chapter, so I didn’t need the hand clap to find the mistake later during editing.

Also, I did not try to exactly match the precise word where I made the mistake. Reread the whole sentence. That’s enormously easier to smoothly edit.


Yup, I do that :smiley:

The portion of this post that bothers me most is the odd or ticking sounds in the middle of words. That’s really rough to get rid of in post production, can signify something wrong with the computer, and is usually impossible to predict.

We know it’s not overload or clipping because the test post was low volume.

Home microphones or microphone systems almost always arrive low volume to avoid clipping distortion.