Complete noob, looking to see if I'm close to ACX standards

I pulled down and listened to the processed one.


I’m curious how come the leading silence and the brief silence in the middle of the passage easily pass ACX noise, but the silence at the end fails. I’m pulling the raw version down to see if I can determine what’s wrong.


It’s that way in the raw version, too. I think you were deep breathing in that final second. It’s not obvious from listening where the sound is coming from. So yes, I think we’re good to go.



Ok… almost ready to leave you alone for a few weeks…

Here is the first chapter (just now realized I forgot to say “Chapter 1” at the beginning. Dammit.) edited, but not mastered :

Here is the version mastered with Koz’s EQ/tips… passes ACX, naturally.

One last question for now - I know I won’t be encoding to MP3 until the very end, but I want to make one now for family to preview… I installed the LAME plugin, but it seems to export at 128kbps by default, and I don’t see where to change it. Is there a setting somewhere for higher quality, or will I need a different application to export to 192kbps to meet ACX standards?

192 or higher. It just has to come in under the ACX maximum chapter filesize limit. Higher is good because ACX is going to cut the work down for their various venues and the sound quality is going to get worse every time they do it. The higher quality you can start the process the better all around.

One could ask why they don’t just ask for the work in perfect WAV format. My guess is it would quintuple their server costs at no significant quality advantage. We’re talking about one person speaking, not the Mormon Tabernacle Choir in stereo.
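The arithmetic behind that guess is easy to sketch. This is only a rough illustration: it assumes constant-bitrate MP3 and an uncompressed 44.1 kHz, 16-bit WAV, and the function names are made up for the example.

```python
def mp3_size_mb(bitrate_kbps, minutes):
    """Approximate size of a constant-bitrate MP3 in megabytes (10^6 bytes)."""
    bits = bitrate_kbps * 1000 * minutes * 60
    return bits / 8 / 1_000_000


def wav_bitrate_kbps(sample_rate_hz=44100, bit_depth=16, channels=1):
    """Data rate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


# Size of a 30-minute chapter at a few common MP3 bitrates.
for kbps in (128, 192, 256, 320):
    print(f"{kbps} kbps, 30 min: {mp3_size_mb(kbps, 30):.1f} MB")

# How much bigger the same chapter would be as WAV than as 192 kbps MP3.
print(f"mono WAV / 192k MP3:   {wav_bitrate_kbps(channels=1) / 192:.2f}x")
print(f"stereo WAV / 192k MP3: {wav_bitrate_kbps(channels=2) / 192:.2f}x")
```

Under those assumptions a mono WAV comes out a bit under four times the size of the 192 kbps MP3 and a stereo WAV a bit over seven times, so "roughly five times" is the right ballpark.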

File > Export Audio: Format, Options.

If I didn’t mention it, open the MP3 in Audacity just before you ship it and make sure it still conforms to ACX standards. Making an MP3 causes sound damage (intentionally) and It Is My Opinion that the list of filters is enough to pass, but maybe not. Some of it is informed guesswork. That’s the oddball Normalize to -3.2dB thing. That should account for MP3 messing with the peak sizes.

If any of the MP3s fails, please post here details of what happened and maybe a screen shot of the acx-check panel.



Hello again sir… I’m plugging along with your tools and advice. I’m avoiding any miracle plugins to remove noises and breaths, instead doing it by hand. Tedious, but reassuring it’s the way I want it :)

One question simply out of curiosity - why is the routine to normalize>compress>normalize…instead of just compress>normalize?

Shooting fish in a barrel. It’s a lot handier when the fish are all in one place like that.

When you Normalize ahead of Compression, Compression doesn’t have to go searching for the sound. Normalize forces the sound to a known, stable volume. When I do it that way, many of the Compression sliders and adjustments are irrelevant. Also, if any DC errors happen to make it through processing so far, they die here.

Normalize at the end is a little different. That’s the one that takes the sound where Compression put it after processing and forces it into ACX Peak Compliance (No higher than -3.0dB). If you’re recording in a quiet room, you may be able to pass ACX Check at this point.

I designed the suite of steps so Normalize could be the same tool with the same values, yet do different jobs. The settings either act positively or don’t make any difference, so you don’t need to remember two different tools.
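If it helps to see what Normalize is doing to the numbers, here is a minimal sketch. It is not Audacity's code, just the general idea of "remove DC, then scale so the loudest peak lands on the target," assuming samples are floats where 1.0 is full scale. The function names are made up.

```python
import math


def normalize(samples, target_db=-3.2, remove_dc=True):
    """Peak-normalize float samples (1.0 = full scale) to target_db."""
    if not samples:
        return []
    if remove_dc:
        # A DC error is a constant offset, so subtracting the mean kills it.
        offset = sum(samples) / len(samples)
        samples = [s - offset for s in samples]
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # pure silence, nothing to scale
    gain = 10 ** (target_db / 20) / peak
    return [s * gain for s in samples]


def peak_db(samples):
    """Loudest sample, in dB relative to full scale."""
    return 20 * math.log10(max(abs(s) for s in samples))


# A quiet test tone sitting on a DC offset of 0.05 (illustrative values).
quiet = [0.05 + 0.1 * math.sin(2 * math.pi * i / 50) for i in range(1000)]
fixed = normalize(quiet)
print(f"peak before: {peak_db(quiet):.1f} dB, after: {peak_db(fixed):.1f} dB")
```

Same function, same values, both times. Ahead of Compression it parks the sound at a known level. After Compression it pulls the peaks back under the limit.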


Much appreciated… one other question before I head back to the cutting and clipping…

For a little while, my process was as follows :

  • record narration, save as RAW file
  • edit file, removing all breaths and bad takes manually, essentially getting it to completed state without respect to dB levels. Save as EDIT file.
  • run the chain of processing you provided and suggested, save as MASTER file.

I started to notice, though, that for all the time I spent editing, after mastering some unwanted noises became audible - sounds I didn’t hear before normalizing. So now I :

  • record narration, save as RAW file
  • edit file very quickly, removing only coughs and the like, as well as extended silent periods, anything that would throw off the normalizing and compressing. Save as EDIT file.
  • run the chain of processing you provided and suggested, THEN remove all breaths and bad takes manually. Save as MASTER file.

I’m happy with the results, but it takes a LONG time, essentially every sentence break needs a silent period pasted in.

Does that workflow make sense to you? And without resorting to a plugin to magically do everything at the risk of compromising quality, is there anything I could or should run to speed up the process of manually adjusting virtually every pause?

Oh, we can get a lot more obsessive than that.

every sentence break needs a silent period pasted in.

You’re supposed to be editing/pasting with Room Tone, not silence. Silence will give you unexplained, robotic dead spaces between the words and it really screws up ACX-Check. Remember the goal of sharing a story with somebody over cups of hot tea. The acoustic world doesn’t suddenly drop totally dead between sips.

This is the same problem that happens with people trying to use the noise gate. It takes all the life out of the narration and sometimes pieces of the words, too.

I don’t remember, do you have a way to post longer (over ten second) clips?



No worries, I am using room tone.

I have a finished [?] sample of Chapter 1. Here’s the link to the .WAV file :

Let me know what you think…

EDIT : Also, I’m a little worried about this chapter… the book will be narrated 80% by me, and 20% by this woman [she’s reading my mother’s journals]… her voice is quieter than mine, so I’m worried that this chapter - which is all her - sounds too hot when compared to the others. Welcome your thoughts on this one. Also, because the normalizing had to do more pulling up than on my portions, the noise floor came up a bit, and I’m worried that the portions I pasted between sentences [taken from a different room tone] might be too quiet in contrast.
CHAPTER 13 [in process]

Also - somewhat unrelated; is there a way to change the settings so the position line stays in the center of the screen, and the sound waves actually scroll through it, instead of the waves being stationary and the line moving through them? Seems like I’m always nudging the cursor back and forth…

position line stays in the center of the screen

I think Cool Edit would do that. That puts huge stresses on the computer which is why Audacity flips pages/screens instead.

For Recording, I set Audacity to continually update the screen even though it’s flipping instead of scrolling. But during playback and editing, I switch to fixed screens to keep my edit points from flipping away from me by accident.

Audacity > Edit > Preferences > Tracks > [X] Update Display… (or not).

You can manually scroll the work sooner and later by holding Shift and running the mouse scroll-wheel.

I only use three Zooms.
Zoom into a drag-selected segment: Control-E.
Zoom Out a little bit because I screwed up and zoomed in too far: Control-3.
Zoom Out to the full show: Control-F.

There are pages of zoom options.

I need to listen to your postings.

Chapter 1.

The good news is neither ACX-Check nor I can find anything wrong with it. I like the transition to the woman’s voice and back. Here’s a trick. Drag-select a bunch of your dialog only, say a minute.

Analyze > Contrast > Foreground > Measure Selection (Read the number) > Close

I get about -21.

Now do that to that chunk of woman’s voice. I get about -23. Lower, right? Call out the gendarmes? No, not really. Nobody can hear a 2dB shift. I couldn’t hear it. If you have too much of that in the chapter, her voice will lower the overall average of RMS (loudness) and you risk falling out of ACX compliance, but that chapter works.
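To put rough numbers on "too much of that," here is a small sketch of how segment loudness combines. It assumes each segment's RMS is measured over its whole length and that power simply averages by duration. The function name is made up, and the durations are illustrative, not taken from your chapter.

```python
import math


def combined_rms_db(segments):
    """Overall RMS in dB for a list of (rms_db, seconds) segments."""
    total_seconds = sum(seconds for _, seconds in segments)
    # dB values can't be averaged directly, so convert to power first.
    mean_power = sum(10 ** (db / 10) * seconds for db, seconds in segments) / total_seconds
    return 10 * math.log10(mean_power)


# 29 minutes at -21 dB with a 1-minute journal entry at -23 dB.
mostly_you = combined_rms_db([(-21, 29 * 60), (-23, 60)])
# The same two voices splitting the chapter half and half.
half_and_half = combined_rms_db([(-21, 15 * 60), (-23, 15 * 60)])

print(f"short journal entry: {mostly_you:.2f} dB")
print(f"half and half:       {half_and_half:.2f} dB")
```

A short quieter passage barely moves the chapter average. The more of the chapter the quieter voice fills, the closer the average drifts toward her number.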

You are intended to paste gaps with Room Tone taken at the time of the dialog, not Room Tone you got last week from a different recording session. This is a problem with Noise Reduction, too. The Noise Profile step has to be taken from the same shoot as the dialog.

I’m going for the second chapter.


That’s going to work less well. You can’t use actual Room Tone because it’s too loud. You’ll never pass ACX Noise if you do that.

You are digging yourself into a corner rapidly. Tell me again why you need to do thousands of edits over the course of a chapter? You understand human breathing is an expected part of the narration process? I don’t know that anyone has ever been rejected from ACX publication for excessive or inappropriate oxygen use.

In her case I can very clearly hear you managing the silent gaps with firm hand and stern gaze.


I would be managing her shoot (after it’s collected into one chapter) so that the noise is suppressed before you try compression and other tricks. You may even need a stiffer noise reduction than 6, 6, 6, say 12, 6, 6, or even 18, 6, 6. After you do that, you may decide that you don’t need all the surgical micromanaging after all.

– Select the whole raw but edited show by clicking just above MUTE.
– Effect > Equalization: LF Rolloff for speech, 8191 Length > OK

– Effect > Normalize: [X]Remove DC, [X]Normalize to -3.2 > OK

– Drag-select Room Tone, silence or the flat area between spoken phrases.
– Effect > Noise Reduction: Profile
– Select the whole clip or show by clicking just above MUTE.
– Effect > Noise Reduction: Settings 12, 6, 6 > OK

Do an ACX-Check and publish the panel.


You have just experienced why reading in a quiet room is such a big deal. High Room Tone noise just kills you in post production.
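The reason is plain arithmetic: whatever gain it takes to bring the voice up to level gets applied to the room noise, too. A quick sketch, with made-up readings (peaks and a -65 dB room are assumptions for illustration, not measurements from your files):

```python
def noise_after_normalize(peak_db, noise_db, target_peak_db=-3.2):
    """Return (gain applied, resulting noise floor) in dB after peak normalizing."""
    gain_db = target_peak_db - peak_db
    return gain_db, noise_db + gain_db


# Same hypothetical room, recorded at two different levels.
for peak in (-6.0, -15.0):
    gain, noise = noise_after_normalize(peak_db=peak, noise_db=-65.0)
    verdict = "passes" if noise <= -60.0 else "fails"
    print(f"peaks at {peak} dB: +{gain:.1f} dB gain, noise lands at {noise:.1f} dB ({verdict} -60 dB)")
```

That is before Compression, which generally brings the quiet parts up further still.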

So, a confession.

I didn’t mention it the first time around, since I was trying to keep it brief (I usually fail), but the problem with Chapter 13 is… I lost the raw data.

While she was recording, I was getting nervous about the peaks of her narration seemingly only reaching -15 or so. So when she took a break for water, I figured I would do a quick normalize/compress/normalize to see what the numbers looked like, and listen to the noise floor, knowing it was likely going to pull up 10db or more.

Well, I forgot to save the raw file, so when I hit “save” I captured the boosted version. If it was me I would have shrugged it off and recut the chapter, but this is a family friend doing it as a favor, so I REALLY didn’t want to have to tell her to read for another 30 minutes because my dumb a** didn’t hit save. I tried to salvage it with the forced silence room tone (I normalized a stretch down to -60db, how hackish is that) to insert in pauses. No bueno.

Good news is, she jacked up a couple lines on her other chapter, so I need to bring her over one more time anyway.

As for the noise reduction feature though…

Should I use it on my chapters, or just keep with the pasting of current room tone like Chapter 1? Assuming I try to use this to save the OCD-indulging process of removing all breaths by hand, where would that fit in the chain? Would it be :

  • EQ
  • Normalize
  • Noise Reduction
  • Compress
  • Normalize

I’m screwed on Chapter 13, right? Considering there is no real room tone to start with?

EDIT : to further elaborate on Chapter 13-gate… for recording myself I set the input level to 0.8. I didn’t adjust it upward before she started, hence the lower readings and my clumsy attempt at fixing it. For her OTHER chapter, Chapter 15, I increased the input level to 0.92 and her readings were closer to -9db or so, AND I have the actual raw data, so I’m hoping that one will go smoothly. I will use noise reduction on that one once it’s cleaned up. I understand the room tones are supposed to be native to that track, I just got desperate with 13.

In case it’ll solve any confusion later, each chapter has one, two, or three short journal entries in it [usually no more than 45 seconds each], other than chapters 13 and 15, which are entirely her narrating.

Also, sorry again…but a question regarding her voice vs mine… I have all her segments as 30 or so clean, edited, but unmastered tracks. I have been pasting them into my chapters, THEN mastering our combined track. This might explain the 2db gap?

Should I instead master her snippet and my chapter separately, then stitch it together as the last step? That would eliminate a gap, but I thought it might make her short portion too loud in contrast. Thoughts?

I can’t thank you enough for your support…

The slight difference in level between you and her is very minor. I would just leave it the way it is.

…leave it the way it is.

So would I. You have theater working for you here (count your blessings). The passages are not you two throwing the narration back and forth between you. They are intended to be first person narration followed by ‘removed by time and distance’ reading of a personal journal. They’re supposed to sound different. In a full radio drama, the timbre, volume, background, sound effects, everything would change. In film, this would be the cut between two scenes.

That POV thing will shift when she reads entire chapters. She becomes the main narrator in that case.

Have you noticed you graduated from “faking out the ACX Robot” to theatrical production? That’s part two of the ACX Production activity. It’s lovely that you can produce a minimum standard sound file. Now you need to produce something people will want to listen to.

Oh, right.

I forgot to save the raw file


Who says being obsessive can’t get you into trouble? So this is where you play the Director/Producer faced with telling the actors they need to play the same scene again. Multiple times.

I was getting nervous about the peaks of her narration seemingly only reaching -15 or so.

…on the Audacity recording sound meters. Exactly correct. I was about to beat up on you about “putting masking tape and magic marker labels on all the knobs.” But you know enough now to watch the meters.

This is where you lean back on your years of experience shooting live sound. Do you change the levels or not? Most times not. If you change the levels you have just created two totally independent sound shoots with different post production effects and filters. If you don’t, you may create 30 minutes of unusable trash. It’s a given that you never change settings during dialog. I’m depending on you to do the right thing.

It’s generally considered a terrific idea to do a sound test before rolling for record.

“Just read through the first couple of sentences like you’re going to do it. I want to make sure the equipment is OK” [hunching quickly over headphones]. So that’s the last time you make that mistake.

I think I’m going to take giant steps backwards. You got all the tools and techniques and you know the goals: It has to sound reasonable when you casually listen to the finished product (like Chapter 1) and it has to pass ACX-Check. We obviously can’t do guarantees (some of these tools are still guess-work on our part), but I bet you sail straight through the ACX publication process.



Can you enlighten me a little on the Noise Reduction tool? Is it essentially there to take an unacceptable noise floor and push it down to more palatable levels?

Does it work by essentially telling you “show me your noise floor”, then finding bits in the track acoustically similar to that and pushing the dB level of those sounds down?

If the above is mostly right, then it would be entirely a noise floor-aid, and really wouldn’t affect my editing, pasting, etc, right?

unacceptable noise floor and push it down to more palatable levels?

Right. After you reduce noise in the more conventional manner, locking up the dog, recording between metro-bus passes, turning off the air conditioner, etc.

Does it work by essentially telling you “show me your noise floor”, then finding bits in the track acoustically similar to that and pushing the dB level of those sounds down?

Exactly correct. You let it “sniff” the bad sound (doesn’t have to be noise floor) by itself (the profile step) and then in a second pass it goes howling off down the show looking for similar smells (sounds).
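For the curious, the sniff-then-hunt idea can be shown with a toy spectral gate. To be loud about it: this is NOT Audacity's Noise Reduction algorithm, the `reduction_db` and `sensitivity` parameters here are my own stand-ins and do not map onto the 6, 6, 6 settings, and the tiny frame size and slow DFT are only there to keep the sketch self-contained.

```python
import cmath
import math
import random

FRAME = 32  # samples per analysis frame (tiny, so the naive DFT stays fast)


def dft(frame):
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
            for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [sum(c * cmath.exp(2j * math.pi * k * i / n) for k, c in enumerate(spectrum)).real / n
            for i in range(n)]


def frames(samples):
    # Non-overlapping frames; a trailing partial frame is dropped.
    return [samples[i:i + FRAME] for i in range(0, len(samples) - FRAME + 1, FRAME)]


def noise_profile(noise_only):
    """Pass 1, the sniff: average magnitude of each frequency bin in noise-only audio."""
    spectra = [dft(f) for f in frames(noise_only)]
    return [sum(abs(s[k]) for s in spectra) / len(spectra) for k in range(FRAME)]


def reduce_noise(samples, profile, reduction_db=12.0, sensitivity=3.0):
    """Pass 2, the hunt: turn down any bin no louder than sensitivity x the profile."""
    gain = 10 ** (-reduction_db / 20)
    out = []
    for f in frames(samples):
        cleaned = [c * gain if abs(c) < sensitivity * p else c
                   for c, p in zip(dft(f), profile)]
        out.extend(idft(cleaned))
    return out


def rms_db(samples):
    return 10 * math.log10(sum(s * s for s in samples) / len(samples))


random.seed(1)
noise = [random.gauss(0, 0.01) for _ in range(FRAME * 40)]           # stand-in room tone
voice = [0.5 * math.sin(2 * math.pi * 4 * i / FRAME) + random.gauss(0, 0.01)
         for i in range(FRAME * 40)]                                 # stand-in "voice" plus noise

profile = noise_profile(noise)
print(f"room tone: {rms_db(noise):.1f} dB -> {rms_db(reduce_noise(noise, profile)):.1f} dB")
print(f"voice:     {rms_db(voice):.1f} dB -> {rms_db(reduce_noise(voice, profile)):.1f} dB")
```

The room tone drops and the loud tone is left alone because it towers over the profile. Put some of the tone in the profile and the gate would start chewing on it, which is the voice-in-the-profile problem in miniature.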

Obviously it’s a really bad idea to include the slightest bit of voice in the profile. If you want to destroy a voice, that’s a good way to do it. Not so obviously, this limits Noise Reduction’s ability to rescue Super Bad voice tracks like CSI surveillance and forensics work. If you can’t tell exactly where the voice is, you’re dead.

You can also get into trouble when the bad sound has a lot of the same harmonics, overtones and pitches as human speech. The bloodhound can’t tell. That’s when you get Cellphone Voice and an “Excessive Processing” rejection from ACX. The “Yeti Curse” works like that. The Frying Mosquitoes sound from a USB Yeti Microphone is very difficult to suppress. Like trying to ignore a kid screaming on a jet.

Noise Reduction is a vast improvement over the older Noise Removal. Let’s say ‘Noise Removal’ was optimistically named.

If all you need is a gentle push, the Noise Reduction of the Beast (6, 6, 6) is usually enough. If you’re at all paying attention to the application, nobody can tell what you did. “For Some Reason” the background noise is quieter. Higher removals (increase the first value) are sometimes required when you’re really in doo-doo (technical phrase) with your recording. Other values and settings are required if you need to process music and other sounds, but I’ve been happy with those settings for AudioBook work.

LF Rolloff and Noise Reduction should be done early in the process in order to provide as well-behaved and clean a track as possible for the other loudness, peak and compression processing tools.

I’ve never found a “standard” set of tools that everybody can use. Each time I process somebody’s work I stand back and look for common threads with the last time I did it. Each person screws up their track a different way, so I haven’t found one yet.


Thanks again, a million.

Quick question - do you have any idea of anecdotal feedback of any ‘wiggle room’ on the ACX standards? My last chapter weighed in at -23.8 RMS, and I’m curious if it’ll be considered close enough or if I need to play with it to make it safely on the other side of -23…
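For reference, the published ACX numbers are peak no higher than -3 dB, RMS between -23 dB and -18 dB, and noise floor no higher than -60 dB. A minimal sketch of that check follows. The function name is made up, the peak and noise readings in the example are hypothetical, and it says nothing about how much slack a human reviewer might allow.

```python
def acx_check(peak_db, rms_db, noise_db):
    """Compare three readings against the published ACX limits."""
    return {
        "peak": peak_db <= -3.0,
        "rms": -23.0 <= rms_db <= -18.0,
        "noise": noise_db <= -60.0,
    }


# RMS from the post above; peak and noise values are assumed for illustration.
result = acx_check(peak_db=-3.2, rms_db=-23.8, noise_db=-65.0)
for name, ok in result.items():
    print(f"{name}: {'pass' in [] or ('pass' if ok else 'fail')}")
```

By the letter of the numbers, -23.8 sits just outside the RMS window.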