Newbie to ACX narration

Hi. I’m an author and have embarked on a project to publish all of my backlist onto ACX. As a first step, I’m recording a scifi anthology that I’m currently working on with a group of other authors. I’m not having any problem (yet), just want some reassurance that I’m not already in the ditch and too ignorant to know it. I apologize if this seems vague or I’m posting in the wrong place, but perhaps one or two of you might listen to the attached raw and lightly processed files and offer guidance?

So far things are going well, but I’m about to embark on the part of the project where I believe I’m most likely to go wrong, where I’m least able to have an unbiased ear, and where I could needlessly waste a lot of time.

I’m recording using a King Bee condenser mic into a Tascam DR-40x recorder, with my mouth “1 shaka” away from the mic and using a pop filter. This gives me mobility and independence from computer noises, and after some experimentation, I’ve treated a walk-in closet at the quietest end of the house to improvise a booth. To my ear, background noise is almost undetectable (as long as the dogs don’t bark and no one slams a door). I can’t even clearly detect whether the air conditioner is running in the raw recording even in “Waveform as DB” mode, which seems excellent. I do still, however, have too high a noise floor to pass ACX without processing.

So what I’ve been doing is this:

  1. First, I load the raw .wav file into Audacity 2.3.1 on my Windows 10 machine and edit out all the outtakes.
  2. Then I use the Waves “de breath mono” plug-in, but instead of just trusting it, I duplicate my track and run Waves in “breath” mode on the duplicate. This gives me a visual list of detected “breaths”. I then mute that track and listen to the main track at every point where Waves detected a breath and the waveform looks right, to see whether it really was a breath and whether I think it’s too loud. I then run the plug-in again (in “Voice” mode) to de-amplify those breaths I find objectionable.
  3. Next, taking the track in 5-10 minute chunks, I run the de-clicker plug-in to remove the vast majority of my mouth sounds (and the odd bit of static occasionally introduced during debreathing).
  4. Finally, I select a segment of silence and use the noise reduction plug-in to create the noise profile and reduce the noise by 12 dB with a sensitivity of 12. I’m not sure where I got those numbers, but the result is a near total elimination of noise floor and no distortion that I can detect through my $100 Sony studio monitor headphones. The ACX guidance says files should all begin and end with a short span of “noise floor,” but surely you can’t ever have too little noise, right? Of course I know that normalization is bound to amplify whatever noise floor remains, and I lack the expertise to tell (say by means of the frequency plot) if there are any remaining issues (for example, whether I need a high-pass filter in addition to the noise reduction already in place).
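For anyone who would rather put a number on the noise floor than judge it by ear, here is a minimal Python sketch that computes the RMS level of a stretch of "silence" in dB relative to full scale. It assumes the samples have already been loaded as floats between -1.0 and 1.0. The function name `rms_dbfs` and the synthetic `room_tone` signal are made up for illustration, and the -60 dB figure in the comment is my understanding of the ACX noise floor limit, so check it against the current ACX requirements.

```python
import math


def rms_dbfs(samples):
    """Return the RMS level in dB relative to full scale (1.0)."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square)


# Hypothetical stand-in for one second of room tone: a very quiet
# 100 Hz hum with a peak amplitude of 0.001 (that is, -60 dB peak).
SAMPLE_RATE = 44100
room_tone = [
    0.001 * math.sin(2 * math.pi * 100 * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE)
]

noise_floor_db = rms_dbfs(room_tone)
# Assumed ACX limit: noise floor no higher than -60 dB RMS.
print(f"noise floor: {noise_floor_db:.1f} dB RMS")
```

Measuring the same silent segment before and after noise reduction shows how much headroom is left once normalization raises everything, including the noise.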

Now I’m at the point where the next step (after recording and processing a zillion more chapters) is to perform any EQ needed to “pretty up” my voice and normalize to ACX expectations. The trouble is,
A) like most people, I can’t judge my own recorded voice to know whether it needs anything,
B) I don’t know that I have enough experience to really detect subtle artifacts introduced by my processing so far (and I’m of the opinion that less processing is always better than more), and
C) online advice differs wildly on what processing is needed or appropriate for ACX (e.g., compression, limiting, various EQ, normalization).

Since this is for audiobook narration, a slightly more intimate, subtly breathy sound is appropriate, but I’m not sure I can tell “intimate” from “muddied by the proximity effect,” and I have no idea what other filters, etc. might be appropriate–if any.

The attachments include a few seconds from one of the stories, first direct from the recorder, then processed to minimize breaths, clicks, and noise floor.

Any guidance is appreciated.

For future reference, I’d recommend recording at a sample rate of 44100 Hz. ACX require the sample rate to be 44100 Hz, so if you record in this format you avoid an unnecessary conversion step.
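If you want to check what rate a file from the recorder was actually saved at, a small sketch using Python's standard `wave` module is below. It assumes plain PCM WAV files, and the helper names `wav_sample_rate` and `needs_resampling` are mine, not part of any tool mentioned here.

```python
import wave

TARGET_RATE = 44100  # the rate ACX requires, per the advice above


def wav_sample_rate(path):
    """Return the sample rate (in Hz) stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path, target_rate=TARGET_RATE):
    """True if the file would need converting before submission."""
    return wav_sample_rate(path) != target_rate
```

For example, `needs_resampling("chapter01.wav")` (a hypothetical file name) would return `True` for a 48000 Hz recording.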

Also, your initial recording level is a little low. Aim for around -6 dB. (Currently your recording level is around -12 dB.)
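To give a feel for what that 6 dB difference means, here is the arithmetic as a short Python sketch. The function name `amplitude_ratio` is just for illustration. It uses the standard relation between a gain in dB and a linear amplitude ratio, ratio = 10^(dB/20).

```python
def amplitude_ratio(gain_db):
    """Convert a gain in dB to a linear amplitude ratio."""
    return 10.0 ** (gain_db / 20.0)


current_peak_db = -12.0  # level reported for the sample recording
target_peak_db = -6.0    # level suggested above

gain_db = target_peak_db - current_peak_db
print(f"extra gain needed: {gain_db:.0f} dB")
print(f"amplitude ratio:   {amplitude_ratio(gain_db):.2f}x")
```

In other words, going from -12 dB to -6 dB means roughly doubling the signal amplitude at the recorder.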

After recording, apply the Equalization effect with the “Low roll-off for speech” preset, but also ensure that the “Filter length” is set to maximum. A long filter length will retain the lower frequencies in your voice better, while still effectively reducing low frequency noise.
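As a rough illustration of what a low roll-off does, here is a first-order high-pass filter in plain Python. This is a stand-in only: it is not the algorithm behind Audacity's Equalization effect, it does not model the "Filter length" setting, and the 100 Hz cutoff is an assumed value rather than the preset's actual curve.

```python
import math


def high_pass(samples, cutoff_hz=100.0, sample_rate=44100):
    """Simple first-order (RC-style) high-pass filter.

    y[n] = alpha * (y[n-1] + x[n] - x[n-1])
    """
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out


# A constant offset (0 Hz) decays away, while a 1 kHz tone passes
# through almost unchanged.
dc_offset = [1.0] * 44100
tone_1k = [math.sin(2 * math.pi * 1000 * n / 44100) for n in range(4410)]

filtered_dc = high_pass(dc_offset)
filtered_tone = high_pass(tone_1k)
```

The point is only that content well below the cutoff (rumble, handling noise) is reduced while the bulk of the voice is left alone.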

Not much more required other than tweaking to taste and possibly a tiny bit of noise reduction.

To get the peak and RMS levels in range, Normalize to -1 dB, then Limit with default settings.
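The normalize-then-limit idea can be sketched in Python as below, for anyone curious about what the two steps do to the numbers. Several things here are assumptions: the hard clip in `limit` is a crude stand-in for Audacity's Limiter (which is gentler), the -3 dB ceiling is an assumed value, not necessarily the effect's default in your version, and the ACX targets in `acx_report` (peak no higher than -3 dB, RMS between -23 and -18 dB) are as I understand them, so verify against the current ACX requirements.

```python
import math


def db_to_gain(db):
    """Convert dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def peak_dbfs(samples):
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak) if peak > 0 else float("-inf")


def rms_dbfs(samples):
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def normalize_peak(samples, target_db=-1.0):
    """Scale the whole signal so its loudest sample sits at target_db."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)
    gain = db_to_gain(target_db) / peak
    return [s * gain for s in samples]


def limit(samples, ceiling_db=-3.0):
    """Hard clip at the ceiling (a crude stand-in for a real limiter)."""
    ceiling = db_to_gain(ceiling_db)
    return [max(-ceiling, min(ceiling, s)) for s in samples]


def acx_report(samples, tolerance=1e-9):
    """Compare peak and RMS against the assumed ACX targets."""
    peak, rms = peak_dbfs(samples), rms_dbfs(samples)
    return {
        "peak_db": peak,
        "rms_db": rms,
        "peak_ok": peak <= -3.0 + tolerance,
        "rms_ok": -23.0 <= rms <= -18.0,
    }


# Hypothetical test signal: a 220 Hz tone peaking near -12 dB.
raw = [0.25 * math.sin(2 * math.pi * 220 * n / 44100) for n in range(44100)]
mastered = limit(normalize_peak(raw, -1.0), -3.0)
report = acx_report(mastered)
```

A steady tone is not speech, so its RMS will not land in the speech range; the sketch is only meant to show how the peak moves at each step.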

with my mouth “1 shaka” away from the mic and using a pop filter.

If you’re using a pop filter, the recommendation changes to one Power Fist.

your initial recording level is a little low.

And if you get very slightly closer to the microphone, that volume problem from Steve may go away. No further effort.