Audiobook distribution

This really does have something to do with creating files, I just didn’t know what to name the topic.

A lot of folks here talk about ACX, and I remember a while back I was helped with producing a book and getting stuff together.

I have a distributor for my own stuff that’s NOT ACX and I was looking at the specs for it today as I uploaded. I realized hey, there may be folks here who can use this information.

They’re called Author’s Republic - and I’ve always gotten more from them than Amazon. Anyway, here are their specs.

  1. Consistent in overall sound and formatting.
  2. Audiobook must be comprised of all mono, or all stereo files.
  3. Audiobook must include opening and closing credits, in separate tracks.
  4. Audiobook must be narrated by a human. Text-to-speech recordings are not allowed.
  5. Each uploaded file must contain only one chapter or section.
  6. Each uploaded file must be no longer than 120 minutes. Files longer than 120 minutes are not supported.
  7. Each uploaded file must contain a section or chapter announcement at the beginning of the audio file.
  8. Each uploaded file must have between 0.5 and 1 second of room tone at the head, and between 1 and 5 seconds of room tone at the tail of each track.
  9. Each uploaded file must be free of extraneous sounds such as plosives, mic pops, mouse clicks, excessive mouth noise, and outtakes.
  10. Each uploaded file must measure between -23dB and -18dB RMS.
  11. Each uploaded file must have peak values no higher than -3dB.
  12. Each uploaded file must have a noise floor no higher than -60dB RMS.
  13. Each uploaded file must be 192kbps or higher MP3, Constant Bit Rate (CBR).
  14. Each uploaded file must be 44.1 kHz.
  15. Each uploaded file must be no larger than 170MB. Files larger than 170MB are not supported.
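
For anyone who wants to sanity-check specs 10 to 12 outside Audacity, here is a minimal Python sketch. It assumes the audio is already decoded to floating-point samples between -1.0 and 1.0, and that you pass in a stretch of room tone separately for the noise floor. The function names are made up for illustration; this is not Author’s Republic’s own checker. The last helper estimates the size of a CBR file, assuming 1 MB = 1,000,000 bytes and ignoring tags and headers.

```python
import math


def db(value):
    """Convert a linear amplitude (1.0 = full scale) to dB; silence maps to -inf."""
    return 20 * math.log10(value) if value > 0 else float("-inf")


def rms_db(samples):
    """RMS level, in dB, of float samples in the range -1.0..1.0."""
    if not samples:
        return float("-inf")
    return db(math.sqrt(sum(s * s for s in samples) / len(samples)))


def peak_db(samples):
    """Highest absolute sample value, in dB."""
    return db(max((abs(s) for s in samples), default=0.0))


def check_levels(samples, room_tone):
    """Pass/fail flags for specs 10, 11 and 12 from the list above."""
    return {
        "rms": -23.0 <= rms_db(samples) <= -18.0,
        "peak": peak_db(samples) <= -3.0,
        "noise_floor": rms_db(room_tone) <= -60.0,
    }


def cbr_size_mb(kbps, minutes):
    """Approximate CBR file size in MB (assumption: 1 MB = 1,000,000 bytes)."""
    return kbps * 1000 * minutes * 60 / 8 / 1_000_000
```

By that arithmetic, `cbr_size_mb(192, 120)` comes to about 172.8, so under those assumptions a 192kbps file would reach the 170MB limit a little before it reached the 120 minute limit.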

Files I use for ACX tend to pass but I also like my noise floor very low.

My process is pretty much the modified one that was shown here, though. Record as clearly and cleanly as possible, then Noise floor, Distortion → leveller, Equalization → LR for speech, RMS Normalize, Limiter, and sometimes, very rarely, the Noise floor again. Noise floor has proven to be amazing. You can tweak the attack time to get it very close to words, but if you put it too close, the ending s sounds on your words may get docked.
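
To show what the RMS Normalize and Limiter steps are doing to the numbers, here is a rough Python sketch. The -20 dB target and -3.5 dB ceiling are assumed values, not settings taken from this post, and the hard clamp is only a crude stand-in: a real limiter, such as the one in Audacity, shapes the peaks far more gently.

```python
import math


def rms(samples):
    """Linear RMS of float samples in the range -1.0..1.0."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rms_normalize(samples, target_db=-20.0):
    """Scale samples so their RMS lands on target_db (assumed target)."""
    current = rms(samples)
    if current == 0:
        return list(samples)  # pure silence: nothing to scale
    gain = 10 ** (target_db / 20) / current
    return [s * gain for s in samples]


def hard_limit(samples, ceiling_db=-3.5):
    """Clamp anything above ceiling_db. Crude stand-in for a real limiter."""
    ceiling = 10 ** (ceiling_db / 20)
    return [max(-ceiling, min(ceiling, s)) for s in samples]
```

Normalizing first and limiting second matters: the gain change can push peaks above the ceiling, and the limiter then catches them. Clamping does pull the RMS down slightly, so it is worth measuring again afterwards.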

There is also LibriVox.
LibriVox are special in that they do free public domain audiobooks (read by volunteers from around the world).

The recording guide for LibriVox is here:
Additional notes on quality here:

The reasons that we tend to focus on the ACX quality requirements are:

  1. Lots of Audacity users ask about ACX quality guidelines.
  2. The ACX guidelines are precisely defined.

A “good” audiobook recording will satisfy any of the audiobook distribution outlets. If your audiobook production passes Amazon/Audible quality requirements, then it is likely to be good enough for all audiobook distributors, but we do also stress that meeting the “technical” requirements is only a part of the job. The recording also has to sound good (the “human listener test”).

You can pass ACX by heavily treating your bad voice recording with filters, effects and corrections and it may pass the ACX first test, the one performed by their “robot.” But it will drop dead at the second test, Human Quality Control. Once a human listens to the performance, it’s usually obvious what you did and they bounce it.

To their credit, they usually post suggestions on how to improve it.

ACX’s goal is a natural reading like you were telling someone a story over cups of tea. Not bad cellphone voice.

ACX-Check in Audacity is a simulation of that first ACX test. The Robot. They didn’t write it, Flynwill did using a collection of existing tools and filters.

It’s a terrific shortcut. You can keep submitting to ACX over days until you get it right, or you can apply ACX-Check and do it all very quickly. Post on the forum if you really get stuck.

You still have to sound human.

And yes. If you can pass ACX-Check and voice quality, chances are terrific you can submit to almost anybody else for anything else.


Well, I’m noticing that ACX doesn’t quite match up for music. I have to learn to get that compressed properly next.

But when it comes to AR, I did have to change the save format to stereo. Thank goodness Audacity lets you duplicate mono and link them as a stereo track.
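
Outside Audacity, the same duplicate-mono-into-stereo trick can be sketched in a few lines of Python. This assumes raw PCM frames (for example what the standard `wave` module hands back from `readframes`) with 16-bit samples by default; the function name is just for illustration.

```python
def mono_to_stereo(frames: bytes, sample_width: int = 2) -> bytes:
    """Duplicate each mono sample into a left/right pair of identical samples."""
    if len(frames) % sample_width:
        raise ValueError("frame data is not a whole number of samples")
    out = bytearray()
    for i in range(0, len(frames), sample_width):
        sample = frames[i:i + sample_width]
        out += sample + sample  # left channel, then right channel
    return bytes(out)
```

Both channels carry the same audio, so it still sounds mono to the listener. The file just reports two channels.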

ACX doesn’t quite match up for music.

ACX doesn’t do music. That’s radio theater and the most exciting thing I ever heard them do was brief interstitials or intro and outro music and the rest of the performance is pure spoken word.

You shouldn’t bulk process a mixed performance. In particular, the first step in Mastering 4 is the Low Rolloff for Speech Equalization which can profoundly affect music (and people with low voices). The conflict can get you in trouble quickly because not using Low Rolloff can cause Home Microphones to misbehave and throw off other tools and effects.

I would probably produce the show as two tracks, voice and music, and treat them as appropriate. The Audacity bouncing sound meter will tell you what the combination volume is while you’re working and you can also Tracks > Mix > Mix and Render to a New Track. The new, single track will be the combination mix.

Audacity will also mix everything—or whatever you select—into a finished show when you export.
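
As a rough picture of what mixing two tracks amounts to, here is a small Python sketch that sums a voice track and a music track sample by sample, with the music turned down by a gain in dB. The -12 dB default and the function names are assumptions for illustration, not Audacity’s actual code.

```python
import math


def mix(voice, music, music_gain_db=-12.0):
    """Sum two tracks of float samples, attenuating the music by music_gain_db."""
    gain = 10 ** (music_gain_db / 20)
    length = max(len(voice), len(music))
    mixed = []
    for i in range(length):
        v = voice[i] if i < len(voice) else 0.0  # pad the shorter track with silence
        m = music[i] if i < len(music) else 0.0
        mixed.append(v + m * gain)
    return mixed


def peak_db(samples):
    """Peak of the combined mix in dB, like watching the meter while you work."""
    peak = max((abs(s) for s in samples), default=0.0)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")
```

The two tracks add together, so the combined peak can be higher than either track alone. That is why it helps to check the level of the mix and not just the individual tracks.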


Well that’s interesting. Although with music, I was thinking straight songs on CD Baby. I haven’t had much luck finding good tutorials for that, the way there are for ACX. So far I haven’t put music with any audiobooks. May stay that way. I am lazy. :stuck_out_tongue:

May stay that way.

Good idea.

You were going to pay royalties on all that music, right?

Music would be desirable from a listener’s point of view, but a nightmare for ACX (and maybe you, too).
“The stinger music is too loud twelve minutes into chapter four.”



There are a lot of good sites with royalty-free music that you can purchase at a reasonable price, or even get for free thanks to people releasing their work into the public domain.

And with Audacity, can’t you just lower the volume on the layer with the music in it?

can’t you just lower the volume on the layer with the music in it?

Totally, but somebody at ACX had to go down through the performance and find the error, and you had to drop everything to correct it. Plus, music makes pre-production storage more complicated and post production archive something of a nightmare. The usual perfect quality WAV go-to file format is reduced to split left/right with voice on the left and music on the right. Can’t do stereo music.

There is a New User “error” of making production as complicated and error prone as possible instead of the business case of minimum necessary to produce a desirable and saleable product. I reference the more than one reader determined to hand-edit every word of an audiobook.

That’s the retirement project since it will take several years to edit each book. Don’t give up the day job.

In my opinion, if you want to produce a mixed-media podcast or radio theater, then produce the podcast, but don’t try to force the work into AudioBook format.


Good lordie.

I had hired out one whole time for a book I did, and the person added a brief bit of audio for me. The book starts with the song, and it fades out before he begins talking. ACX passed it, and I’ve never heard complaints. But when it comes to myself, I just can’t hear the differences to know when things are too loud, etc.