Passing the ACX test, BUT does it sound any good?

It makes me completely insane when somebody with a good voice and terrific presentation (the hard parts) encounters one stupid problem and the whole thing falls apart.


Koz I think I love you!

Yeah…well I’m glad to hear questions about the room…I had assumed it was the room. I have covered it in not very deep pyramid type foam, and laid a thick carpet, and it not being a total ‘box’ (there are sloping ceilings), I thought it was reasonable, but still with a little bathroom echo - hence the use of the eyeball.

So here is this morning’s bit of raw audio heaven…at 15cm from the mic, with the standard Rode pop filter, nowt else.

The whistley “T” sounds are not due to the room. IMO it’s your teef. No dentistry required though; it’s fixable with Paul-L’s “DeEsser” (it’s not being used to de-ess on these settings: it’s de-whistling the “T”s) …

This processed version has only been de-whistled (then DeClicked); it has not been de-essed: it still has the original sibilance above 6.5 kHz.

And the 10 kHz peak is almost gone…

Seems the Eyeball is made of some denser foam. And that’s almost transparent for high frequencies. And probably there’s also a resonance goin’ on in the cavity of the Eyeball.

The raw recording also has a tremendous amount of low frequency noise. And that’s the room and some hum at 150 Hz. But that should be easy to filter out. It’s a shame the NT1a doesn’t have a high-pass built-in.

Anyhow, thanks for the test files. The Eyeball seems to behave as expected.

OK, so to drag this back to the problem:

Record dialog according to the last known good setup.

Effect > Equalization: LF rolloff for speech > OK
– removes any rumble or thumps.
– I don’t remember if you have this custom filter yet or not.

Effect > Normalize: [X]Remove DC, [X]Normalize to 0 > OK
– Just changes the volume to make De-ESSing easier.

Effect > Paul-L’s De-ESSer according to the posted settings and values > OK
– Where does one get this De-ESSer?

Effect > Normalize: [X]Remove DC, [X]Normalize to -3.2dB > OK

Analyze > ACX Check
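
For anyone wondering what the two Normalize steps in that list are doing to the numbers, here is a rough Python sketch. It is not Audacity's code; it assumes the audio is a list of float samples with full scale at ±1.0, and the function names are made up for illustration.

```python
import math

def normalize(samples, target_db=0.0, remove_dc=True):
    """Return a copy of samples, optionally DC-corrected, with the peak at target_db (dBFS).

    Assumption: samples are floats where +/-1.0 is full scale.
    """
    if not samples:
        return []
    if remove_dc:
        # "Remove DC": subtract the average so the waveform is centred on zero.
        mean = sum(samples) / len(samples)
        samples = [s - mean for s in samples]
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    # Convert the target peak from dB to a linear amplitude, then scale.
    target_peak = 10.0 ** (target_db / 20.0)
    gain = target_peak / peak
    return [s * gain for s in samples]

def peak_db(samples):
    """Peak level of non-silent samples in dB relative to full scale."""
    return 20.0 * math.log10(max(abs(s) for s in samples))

# Hypothetical toy clip with a small DC offset.
raw = [0.05, 0.25, -0.15, 0.05]
for_deesser = normalize(raw, target_db=0.0)       # before the De-ESSer step
final = normalize(for_deesser, target_db=-3.2)    # last step before ACX Check
print(round(peak_db(final), 2))
```

The first pass only puts the peak at a known level so the De-ESSer threshold means the same thing every time; the second pass sets the final peak.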

You could be missing many pieces of this. Just yell where you get stuck. These are all published software elements, but they’re scattered.

It should sound terrific with no harsh dentist sound, but it may not make ACX without a little more gentle nudging.


The settings I posted are for sound normalized to 0dB. The submitted sound was quieter, normalized to about -10dB, so in that case the threshold on the DeEsser should be lowered by 10dB to -40dB (rather than the -30dB shown). In other words, the threshold depends on the volume, so some trial-and-error experimentation (in +/-1dB steps) is required to optimise it. I think the other settings shown have already been optimised, however.
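
The threshold arithmetic is just a shift by the difference in peak level. A small Python sketch of that, with hypothetical function names (this is not part of the DeEsser plugin):

```python
def adjusted_threshold(reference_threshold_db, reference_peak_db, actual_peak_db):
    """Shift a threshold tuned at one peak level to suit audio at another peak level."""
    return reference_threshold_db + (actual_peak_db - reference_peak_db)

def trial_thresholds(center_db, steps_each_side=3, step_db=1):
    """Candidate thresholds around center_db for trial-and-error listening tests."""
    return [center_db + k * step_db
            for k in range(-steps_each_side, steps_each_side + 1)]

# Settings were posted for audio peaking at 0 dB with a -30 dB threshold;
# the submitted clip peaked at about -10 dB.
start = adjusted_threshold(-30, 0, -10)
print(start)                    # -40
print(trial_thresholds(start))  # values to audition in 1 dB steps
```

Normalizing to 0dB first makes the shift zero, which is why the posted -30dB can then be used as-is as a starting point.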

The settings shown are to correct the whistle which occurs on some plosives like “T” & “D” , which is the most conspicuous problem. Another application of the DeEsser with different settings will be necessary to address any excessive sibilance above 7kHz.

One can get a copy of Paul-L’s DeEsser & DeClicker plugins here … “Updated De-Clicker and new De-esser for speech”; they are attached to the bottom of that post by Paul-L.

Here’s a link on how to install those plugins into Audacity … Audacity Manual

Thanks everyone.

What’s the verdict, ditch the eyeball?

I’ll have a go at that process as you suggest, Koz, and report back.

Fortnight’s hols imminent so I may have to do this on my return.

The eyeball is not responsible for the sibilance problem :
excessive-sibilance is present on both the with & without eyeball recordings.

I noticed you mentioned you’re using a Focusrite pre-amp. Some of those Focusrite products have real-time voice-processing, including real-time de-essing & compression. It is possible to configure a de-esser to re-ess, i.e. to emphasise sibilance. So if your Focusrite pre-amp has real-time processing, it could be responsible for your severe sibilance problem by being accidentally set to re-ess: exaggerating your normal sibilance.

[ A real-time compressor, set to expand, could also exaggerate plosives like “T” & “D” ].
[Attached image: ''sitting unnoticed in the window'' , [ ''no eyeball'' ].png]

I agree to a degree, but I think that the eyeball exacerbates the problem. To my ear the “S’s” and “T’s” sound more cutting when the eyeball is used. The Rode NT1-A is certainly a “bright” microphone, but the recordings made with the eyeball are really super-super-bright with that huge peak at the top end. The “without eyeball” recordings seem to require much less correction than the “with eyeball” recording, in fact I find the “without eyeball” recording perfectly listenable with just a little Eq to push up the mid and drop the high a little.

The settings I posted are for sound normalized to 0dB

Right. That’s why I preceded the correction with LF Rolloff and Normalize to 0 in the Processing List. I’m finding it more and more valuable to precede or bracket processing steps with either Amplify or Normalize. It eliminates or greatly reduces Wandering In The Wilderness setting of a bunch of variables.


the recordings made with the eyeball are really super-super-bright

So actually eliminating one hardware step is an advance. Works for me. So we don’t need the De-ESS step in favor of a gentle equalizer?

Do you have such an equalizer?



Ok, well perhaps we’re getting somewhere. I need to learn about equalization…BUT thanks to you guys I am a lot further down the audacity (audacious?) road than I was. Thanks to all of you for the advice, means a hell of a lot when you’re in a foreign land, know no one, and don’t even have your mum on hand to make a consolatory cup of tea.

Do you have such an equalizer?

That was to Steve who was supposed to leap forward with a custom equalizer (as he has done before).

You understand I am anathema to most engineers, for whom more processing and additional software is always better. De-Essing is a massively complicated affair with on-the-fly decisions made to separate harsh “essing” from simple crisp speech, and with errors possible all along the way. Equalization can be as simple as turning the treble down. We like simple.

consolatory cup of tea.

Oooo. Middle English. That’s good [writing that down].


I’m waiting for Steve’s leap…it’s surely only a matter of time.

I see that ACX in their YouTube videos say DO NOT USE DE-ESSERS! - amongst other things. They’re also anti noise-reduction/removal…

They say in the right recording environment only 3 mastering steps are required - EQ, Compression, Normalization. That’s what I’m aiming for, will certainly make life a lot simpler.

Yes, you may need a trip to Blighty (England) to experience the wonder of a consolatory, nice cup of tea, Koz.

or Trebor - he’s good at this sort of thing.

Which audio sample are we talking about? (could someone repost the link)

I don’t think equalization can fix this sibilance problem because it’s intermittent, rather than continuous throughout the recording.

BonnieVO is using kit which could have an in-built real-time de-esser; if that was adjusted wrongly it could re-ess, causing the sibilance problem.

However if the excessive sibilance is natural , rather than a consequence of unintentional real-time processing , then de-essing is the only solution IMO.

I see that ACX in their YouTube videos say DO NOT USE DE-ESSERS! - amongst other things. They’re also anti noise-reduction/removal…

Yes, they do. Their goal is natural, clear speech. My metaphor is someone telling you a story at the kitchen table over cups of hot tea. ACX is specifically trying to avoid someone who starts out with a terrible room, bad microphone and dreadful street noises and then smashes the performance into ACX compliance with industrial-level corrections and processing. But that only gets you past the ACX robot. It will not get you past human quality control. ACX has a rejection called “Overprocessing.” No, you’re not supposed to sound like a bad cellphone.

Noise Reduction in the latest Audacity can be adjusted so that, with moderate corrections, it can’t be heard working. This is in sharp contrast to the earlier Noise Removal, which created audible sound damage everywhere it went.

There is a sister posting to yours that’s instructive. Attached is his last posting. He claims it’s a raw recording. I opened it in Audacity and adjusted his volume down slightly and it passes ACX compliance. No “processing” at all. Of course, it sounds perfect except for the slight honk from his closet walls. So yes, it can be done. I did something similar in my quiet third bedroom a while back just to see if I could do it.

While his posting conforms to ACX specs, I don’t recommend he actually record like that. If he does anything wrong or the wind blows the wrong way, he’s going to violate one or the other of the limits; it’s too tight, so I’m recommending some gentle compression and one rumble filter. If he does that, he can make tiny mistakes and get theatrically expressive without running into problems.

I tend to agree with ACX about De-Essing. That’s just so difficult to get right…


Part of the stumping nature of this is the absence of testing. Try this. Take a sheet of newspaper as close to “real” as you can get*. Start a recording of a broad, flat sheet and slowly crumple it up into a little ball in front of the microphone. Keep it under ten seconds. Make sure the sound channel doesn’t overload. No red lines in the blue if you turn on View > Show Clipping.

It’s mechanical white noise.

*LA Times, New York Times, St Louis Post-Dispatch, etc. They all use about the same paper stock and make about the same sound when they crumple.
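
If you would rather check the test recording for overload by numbers than by eye, something like this Python sketch would do. It assumes float samples with full scale at ±1.0 and treats any sample touching full scale as clipped; that is a simplification for illustration, not a description of how Audacity's Show Clipping decides, and the function names are hypothetical.

```python
def clipped_indices(samples, full_scale=1.0):
    """Positions of samples that touch or exceed full scale (assumed clipping rule)."""
    return [i for i, s in enumerate(samples) if abs(s) >= full_scale]

def has_clipping(samples, full_scale=1.0):
    """True if any sample in the recording reaches full scale."""
    return bool(clipped_indices(samples, full_scale))

# Hypothetical toy data: two samples hit the rails.
crumple = [0.1, -1.0, 0.5, 1.0, 0.99]
print(clipped_indices(crumple))  # [1, 3]
print(has_clipping(crumple))     # True
```

If it reports any clipping, turn the recording level down and crumple a fresh sheet.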


Some of the hot-spots in BonnieVO’s recording are 16dB too loud …
[Attached image: BonnieVO ''Mr Beebe no eyeball 15cm 220715''.png]
To remove those with equalization would require deep narrow notch filters, which would damage the whole recording. Something dynamic like de-essing is required instead : to [strongly] attenuate only when above threshold.
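
To make the static-versus-dynamic point concrete, here is a toy Python sketch. It works on a made-up list of per-frame levels (in dB) for the sibilant band only, and the gain rule is a bare-bones "pull anything above threshold back down to it" limiter. It is an illustration of the idea, not how Paul-L's DeEsser works internally, and all names and numbers are hypothetical.

```python
def dynamic_gain_db(level_db, threshold_db, max_reduction_db):
    """Gain in dB (never positive) for one frame: zero at or below threshold."""
    overshoot = level_db - threshold_db
    if overshoot <= 0:
        return 0.0
    # Attenuate by the overshoot, but never by more than max_reduction_db.
    return -min(overshoot, max_reduction_db)

def apply_dynamic(levels_db, threshold_db, max_reduction_db=16.0):
    """Attenuate only the frames that exceed the threshold."""
    return [lvl + dynamic_gain_db(lvl, threshold_db, max_reduction_db)
            for lvl in levels_db]

def apply_static(levels_db, cut_db):
    """A fixed equalizer cut: every frame loses the same amount."""
    return [lvl - cut_db for lvl in levels_db]

# Hypothetical band levels: mostly ordinary speech, two hot-spots.
band = [-40, -38, -22, -39, -24]
print(apply_dynamic(band, threshold_db=-38))  # only the hot-spots change
print(apply_static(band, cut_db=16))          # everything gets duller
```

With the fixed cut, the ordinary frames lose 16dB along with the hot-spots, which is the damage to the whole recording described above; the threshold version leaves them alone.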

I know it’s attractive to design ever better software packages to solve problems like this, but just how did the microphone make these sounds? They’re not just “bright.” That affects everything (the rock band microphone presence hump) and usually responds to proper equalization. It’s more serious than that. Anything not more or less responding to the performer’s voice is distortion.

It’s scary that someone in the design chain designed a dull-sounding microphone and decided to punch it up a little by adding extra crispness to the SS sounds. As has been posted, that gives you an audiobook with ice picks in your ears.

But that just drags me back to my favorite design technique. If someone paid you to do that to a microphone, how would you do it? Or how would you break a microphone in such a way that it exhibits this effect?

It’s also a time bomb. A recent posting appears to be clean and clear and passes ACX conformance with almost no effort. However, you can’t equalize or change the character of the sound in post production without it falling apart. Instant ice picks.

What’s wrong with this picture?

Can we agree that the problem is probably on the analog side? What can you do to a bitstream that only affects loud SS sounds?

I still think crumpling a newspaper in front of the microphone could be instructive. I have a couple of tests here in archive. Looking.