Passing the ACX test, BUT does it sound any good?

Like so many newbies on here I’ve been reading and re-reading many extremely useful posts on this Forum, and frankly don’t think my voiceover business would have legs at all without it, so firstly - THANKS :smiley:

So I’m using Audacity 2.1.0 on Windows 8.1, with a Focusrite pre-amp and a Rode NT1A (with 2 pop shields!)

Using the ACX checker plug-in, my audio (with effects chain) is now passing the test, at least in theory! My question is, does it still sound like my vocal-booth is the ‘bathroom’ it once was (all tile and concrete) and do I have a plosive problem? Would the quality control robot send me away?

A snippet of audio:

I have listened to a few of the “good audio samples” on the ACX website and I don’t think mine would be added any time soon.

And here’s my effects chain:
Screenshot 2015-07-13 16.25.09.png
What else can I do to improve the overall quality? I suspect EQ? Limiter? All things I as yet don’t know much about :frowning:

my audio (with effects chain) is now passing the test, at least in theory!

As I have posted multiple times, the standards compliance is a robot. You either pass or you don’t. The jury is out on whether or not there is a “fudge zone” at ACX. There isn’t one with Audacity. If you fail, flynwill’s ‘acx-check’ will give you a list of numbers that didn’t succeed, full stop. It’s up to you to figure out what to do about it.
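For anyone wondering what a pass/fail robot of this kind boils down to, here is a minimal sketch in Python. It is not flynwill’s plug-in and not ACX’s actual checker. The thresholds are the targets ACX publishes as I understand them (RMS between -23 and -18 dB, peak no higher than -3 dB, noise floor no higher than -60 dB), so treat them as assumptions and check the current submission requirements. The function names are made up for illustration, and samples are assumed to be floats where 1.0 is full scale.

```python
import math

# Assumed ACX targets; verify against the current ACX submission requirements.
RMS_MIN_DB, RMS_MAX_DB = -23.0, -18.0   # overall loudness window
PEAK_MAX_DB = -3.0                      # loudest sample must not exceed this
NOISE_MAX_DB = -60.0                    # room tone RMS must not exceed this

def to_db(value):
    """Convert a linear amplitude (1.0 = full scale) to dB relative to full scale."""
    return 20.0 * math.log10(value) if value > 0 else float("-inf")

def rms_db(samples):
    """RMS level of a non-empty list of samples, in dB."""
    return to_db(math.sqrt(sum(s * s for s in samples) / len(samples)))

def peak_db(samples):
    """Level of the largest absolute sample in a non-empty list, in dB."""
    return to_db(max(abs(s) for s in samples))

def acx_failures(voice, room_tone):
    """Return a list of human-readable failures; an empty list means a pass."""
    failures = []
    rms, peak, noise = rms_db(voice), peak_db(voice), rms_db(room_tone)
    if not RMS_MIN_DB <= rms <= RMS_MAX_DB:
        failures.append(f"RMS {rms:.1f} dB is outside {RMS_MIN_DB} to {RMS_MAX_DB} dB")
    if peak > PEAK_MAX_DB:
        failures.append(f"peak {peak:.1f} dB is above {PEAK_MAX_DB} dB")
    if noise > NOISE_MAX_DB:
        failures.append(f"noise floor {noise:.1f} dB is above {NOISE_MAX_DB} dB")
    return failures

# Usage: a synthetic test tone standing in for a voice track, plus faint room tone.
voice = [0.15 * math.sin(2 * math.pi * 440 * t / 44100) for t in range(44100)]
room = [0.0005 * math.sin(2 * math.pi * 100 * t / 44100) for t in range(44100)]
print(acx_failures(voice, room))
```

Note that nothing in a check like this says anything about how the voice sounds, which is why passing it is only the first step.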

In ACX-Land, that’s only the first step. If the robot passes it, I believe a human spot-checks the work to make sure you didn’t destroy your voice in the effort to make the robot happy. “Overprocessing” is a common failure when the human gets it and it’s not unusual for a human to offer suggestions for repairs. I know ACX doesn’t listen to the whole work. They said so.

I’m in and out. As we go.


With your bathroom and hard tiles I was expecting, well, a bathroom sound. But that’s not what happened. You have an insane crispness to your voice which will never work as it is. “Splendid” at 4.3 seconds is enough to drill holes in wood.

I can hear the other elves leaping to using “de-esser” plugins and I don’t know whether that is the required tool. The “S” sounds don’t crash, they’re just too high. As a first pass, I tried a custom equalization tool and turned you into muffled AM radio. OK, that’s not good, either.

Somewhere in the middle…

As we go.


What’s the microphone again? It’s most unusual for a microphone to deliver voice with that much damage. I think there’s something wrong/broken/misadjusted. Also, if you speak like that in real life, I would expect people to run away while holding their ears.


Indeed the whistles associated with your esses are truly piercing.

Could you post a “before” (raw recording) of the same section so we can see if this is possibly the result of post processing?

I was able to tame them a bit with an equalization filter that cut in abruptly at about 9 kHz, but it did significantly change the timbre of your voice. I don’t know if there are de-essing tools that can do better than straight filtering.
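To make “straight filtering” concrete, here is a rough sketch of a steep low-pass at 9 kHz. This is not the Audacity Equalization effect used above. It is a swapped-in textbook technique (a windowed-sinc FIR filter with a Hamming window), and the function names, the 101-tap length and the 44100 Hz sample rate are all assumptions chosen for illustration.

```python
import math

def lowpass_kernel(cutoff_hz, sample_rate, taps=101):
    """Windowed-sinc low-pass FIR kernel (Hamming window), unity gain at DC."""
    fc = cutoff_hz / sample_rate            # cutoff in cycles per sample
    mid = (taps - 1) / 2
    kernel = []
    for n in range(taps):
        x = n - mid
        # Ideal low-pass impulse response, with the x == 0 limit handled separately.
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (taps - 1))
        kernel.append(ideal * window)
    total = sum(kernel)
    return [k / total for k in kernel]

def fir_filter(samples, kernel):
    """Direct convolution; the output has the same length as the input."""
    out = []
    for i in range(len(samples)):
        acc = 0.0
        for j, k in enumerate(kernel):
            if i - j >= 0:
                acc += k * samples[i - j]
        out.append(acc)
    return out

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# Usage: compare how much of a 1 kHz tone and a 12 kHz tone survives the filter.
rate = 44100
kernel = lowpass_kernel(9000, rate)
for freq in (1000, 12000):
    tone = [math.sin(2 * math.pi * freq * t / rate) for t in range(2000)]
    filtered = fir_filter(tone, kernel)
    # Skip the first len(kernel) samples, where the filter is still filling up.
    gain = rms(filtered[len(kernel):]) / rms(tone[len(kernel):])
    print(freq, "Hz gain:", round(gain, 3))
```

A filter like this removes everything above the cutoff all the time, not just during the esses, which is why it changes the overall character of the voice.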

I should add that this may not be of any concern to the ACX folks.

Indeed the whistles associated with your esses are truly piercing.

I don’t know that it’s whistling. I just hear a system that really likes frequencies between 5 kHz and 15 kHz. Attached, that’s around the 4.3 second mark. It’s not crunchy or harsh beyond the S sound being much too loud in comparison with everything else.
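If you want a number to go with that observation, one rough approach is to measure what fraction of a short slice’s energy falls between 5 kHz and 15 kHz. The sketch below is only an illustration: the function name is made up, it uses a plain (slow) discrete Fourier transform so it needs nothing beyond the standard library, and it is only sensible for short slices of a few hundred samples.

```python
import cmath
import math

def band_energy_fraction(samples, sample_rate, low_hz, high_hz):
    """Fraction of spectral energy (DC excluded) lying between low_hz and high_hz."""
    n = len(samples)
    total = 0.0
    in_band = 0.0
    for k in range(1, n // 2):              # bins above DC and below Nyquist
        freq = k * sample_rate / n
        coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(samples))
        power = abs(coeff) ** 2
        total += power
        if low_hz <= freq <= high_hz:
            in_band += power
    return in_band / total if total > 0 else 0.0

# Usage: a synthetic slice with equal parts 1 kHz and 8 kHz, 441 samples at 44100 Hz.
rate = 44100
mixed = [math.sin(2 * math.pi * 1000 * t / rate) + math.sin(2 * math.pi * 8000 * t / rate)
         for t in range(441)]
print(band_energy_fraction(mixed, rate, 5000, 15000))
```

Comparing this figure for an “S” against the figure for a vowel in the same clip gives a crude sense of how lopsided the sibilants are.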

Yes, by all means. Shoot us something with no processing at all.

This is not a terrible formula for posting a clip for evaluation.

In this case include plenty of sibilants. “Sister Suzy shovels sofas.”

Screen Shot 2015-07-13 at 21.16.59.png

That’s your typical condenser 10 kHz peak. People seem to like it. It gives “air”.

I’d seldom use a condenser mic for voice, personally. But with a typical dynamic mic you’d need a very good preamp, because the signal level coming out of an NT1 is 20 times stronger. So most people end up with a mediocre preamp and a condenser…
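To put “20 times stronger” in the units a preamp gain knob uses, here is the arithmetic as a tiny sketch. It assumes the factor of 20 refers to signal voltage rather than power, and the helper names are invented for illustration.

```python
import math

def ratio_to_db(voltage_ratio):
    """Express a voltage (amplitude) ratio in decibels."""
    return 20.0 * math.log10(voltage_ratio)

def db_to_ratio(db):
    """Convert decibels back to a voltage (amplitude) ratio."""
    return 10.0 ** (db / 20.0)

# Extra preamp gain a mic with 1/20th of the output voltage would need.
print(round(ratio_to_db(20), 1), "dB")
```

Under that assumption the dynamic mic needs roughly 26 dB more clean gain, and that extra gain is where a mediocre preamp’s own noise starts to show.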

There are almost no designers left who can create an analog work of art, like most great preamps. We keep copying the same designs over and over. And the implementation of those designs takes a lot of old-fashioned care. Like hand picked transistors, to reduce noise. Can’t be done economically, so we just turn out mediocre copies.

My favorite voice mic shows a 40-18,000 Hz frequency range. That’s a number most people won’t buy. But for voice, it means it picks up a lot less room. And it’s omni-directional, so it doesn’t have the proximity effect. No changes in sound if the VO artist leans back. Just a little less volume, which can easily be compensated for with a little automation.

And it doesn’t have any resonant HF peaks.

And the fulcrum of modern technology is the USB condenser mic. A sloppy preamp with a basic AD that doesn’t have settings for gain, but is expected to work from 60 dB to over 120 dB… Can’t be done. So they all fail in one way or another.

What I hear is an NT1a on a preamp whose impedance isn’t very well matched to the mic. It exaggerates sibilance.

Condenser mics frequently have a slight “presence peak” at around 10 kHz, but the peak shown in Koz’s image looks way bigger than expected.

It’s probably exaggerated by the room treatment. Most lighter textiles, such as duvets, are almost transparent above 12 kHz. So you take a mic with a boost, add reflections from the room and end up in hell…

I haven’t got an NT1a, but I’ve got an NT1. That one bites even harder, sometimes. Don’t get me wrong, it’s a good mic. Just not for some voices in some rooms. Especially, small reflective rooms.

Are you certain that those are the only effects that you applied? It sounds (and looks) like Equalization has also been applied.

I shall post a sibilant-rich raw recording on here tomorrow when I’m back in my booth. Thanks for all the feedback, everyone.

Post it just as it comes out. Don’t help it.

We had one poster who had really odd sound clips on the forum and it turned out he was trying to “help us hear it” by making the clip louder with the volume sliders. True, he wasn’t applying any effects or filters, but still. That threw us off for days.

Don’t do anything to it.


That’s your typical condenser 10 kHz peak. People seem to like it. It gives “air”.

Nope. Sorry. The last time I had a microphone with a 40 dB peak in the response, it was broken. If you play that clip into earbuds it’s unlistenable and if you happen to also be a young woman, we’d be calling an ambulance.

I wonder where the performer is.


Buongiorno a tutti!.. it’s morning for me here in Italy anyway.

For your listening torture, please find below my super-essy raw wav file.

A screenshot of my Audacity, in case that’s helpful…
Screenshot (1).png
And while I’m at it, this is what my Scarlett Solo is up to… gain at about ‘4 o’clock’ (in order to get the green light), to get the recording level peaking around -6 dB.
Just to cover all bases, I still get a slightly flickering screen when working in Audacity. This was worse before I recently updated the driver on the Scarlett Solo. I also intermittently have trouble getting sound through the headphones when recording (but not on playback) and haven’t a notion why.

Is it terminal?

There’s a limit to what can be done with de-essing …

Paul-L’s high-resolution de-esser is the only tool for Audacity I’ve seen which can help your severe sibilance problem.

Oddly “chaise lounge” doesn’t suffer the excessive sibilance of “sulkily gazed” :confused:
Whatever the arrangement / settings were for “chaise lounge” keep them.
Whatever you did differently for “sulkily gazed” don’t do that again.

Thank you Trebor, much appreciated. I can hear some improvement with that treatment. Is it enough? In terms of those phrases you mention, I guess the difference in sibilance is between SH (as in ‘chaise’) and S (as in ‘sulkily’). The former not being so harsh, I think?
It will be interesting to see if taking a step or two back from the mic changes things…

If you are listening to your voice through headphones as you record, then possibly your problem could be due to acoustic feedback, i.e. the sound of your voice in the headphones leaking out and being picked up by the mic as you record. That would cause certain frequencies to be boosted (and the unintentional boost could vary in intensity during a take, depending on the volume of your voice). Maybe try recording without headphones?

Aha! Yet another thing I hadn’t thought of! I’m going to try recording my super-sibilant piece again, with a bit more distance and minus the earmuffs :wink:

Here’s another recording of the same piece. This time I’ve left the headphones outside and I’ve taken an extra step back from the mic. To me, it sounds reasonable: quieter (peaking at -15 dB) but not drilling any holes in wood, methinks?
How’s the room tone this time?

Your thoughts and advice for improvements much appreciated :smiley:

What sort of pop shields are you using?