Updated De-Clicker and new De-esser for speech

jlatt, follow these instructions: http://wiki.audacityteam.org/wiki/Nyquist_Plug-ins

They go in the plugins folder inside the Audacity program folder.

I think those are the installation instructions.



Hi Paul

There seem to be suggestions that the De-Esser is very good. Is the De-Esser also good for music vocals? Does it need more development? Would you want it to be published on the Audacity Wiki, and if so, when?


From whom have you heard that?

I am not sure I’m doing the smartest things mathematically with this de-esser, but it seemed to drop out of the more elaborate de-clicker as a sort of simplified special case. As for the de-clicker, I am also not sure I am doing the smartest possible things there. It all runs at a snail’s pace. This is a free tool I offered to the patient.

Have you tried it yourself and formed an opinion?

Trebor speaks highly of your De-esser - he said it was better than Spitfish.

Not yet - I’ve no use for a de-esser most of the time. But tools don’t have to be mathematically perfect to be on the Wiki. And if they are on Wiki there is more chance of getting feedback from multiple use cases to improve them.


No joke!

Well I don’t really know what “real” de-essers do, but mine could be described as a multi-band brickwall limiter, with many narrow bands in the default settings, which account for the slowness but perhaps also for precision of results if you can wait for it.

I hear it said that “sibilance” is eq’d down at 7 or 8 kHz, but I find that s sounds in my speech recordings sometimes have unpleasant piercing whistles in them that are much lower, between 3 and 4 kHz. You can easily see such things in the spectrogram. I wrote something that does the program equivalent of scanning the spectrogram for such bright bits, and then fixing each one precisely over the right interval with the right gain using the Nyquist eq-band function.
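The scan-and-limit idea can be sketched in a few lines of Python. This is only an illustration of the principle, not the plug-in's actual Nyquist code; the band levels are made up, and the -20 dB limit is the tool's default mentioned later in the thread.

```python
# Toy sketch of multi-band brickwall limiting: scan the level of each
# analysis band and, wherever it exceeds the limit, compute the cut a
# brickwall limiter would apply to just that band.

LIMIT_DB = -20.0   # the tool's default limit (see later in the thread)

def band_gains_db(band_levels_db, limit_db=LIMIT_DB):
    """Per-band gain (in dB) so that no band exceeds limit_db."""
    return [min(0.0, limit_db - level) for level in band_levels_db]

# Made-up band levels around a piercing whistle between 3 and 4 kHz:
levels = [-35.0, -12.0, -8.0, -30.0]
print(band_gains_db(levels))   # → [0.0, -8.0, -12.0, 0.0]
```

Bands already below the limit are left untouched (gain 0 dB); only the bright bits get hammered down, which is why the result can be precise without dulling the rest of the voice.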

Hi Paul-L ,

No joke: your de-esser is the best I’ve encountered; it allows far more precise control of sibilance than any other I’ve seen…

SpitFish can’t do that.

If its use is extended to processing stereo music then it will need the option of linking the stereo channels, rather than processing each channel independently (if they are not linked there is an unwanted stereo flutter effect).

Well thanks Trebor!

Your suggestion for stereo is a good one. Have you observed that defect? I haven’t; I work only with narration, all mono, no mixing. Will you test improvements if I write them? Or send me a test case?

Attached is my own pet example of a bit I captured from a professionally produced audiobook that I enjoyed very much for story, but the sibilance was a knife in my ears sometimes. This especially painful s whistled at about 3400 Hz as you see, not in the 6 kHz and above range. This was my acid test case as I developed. The “after”, with default de-esser settings, is on the right.

Your picture shows “multi-band brickwall limiting with many narrow bands” in action. Find just the white stuff and hammer it down. The danger is changing things too much and making the voices lisp. I have used this on enough hours of voice to believe it does not.

Do you use default settings on the spectrogram, and do you understand the colors? Every frequency band at -20 dB or higher appears white in default display settings. And -20 is exactly what I limit to in the tool’s default settings.

Trebor, I see you use even more bands than I do and a higher top frequency. You must be very patient! My work has everything above 11.5 kHz stripped out by other parties so I don’t worry about that.

You probably know this: if you choose Spectrogram view, then the bands you specify do not correspond to equal divisions in screen height. But choose spectrogram log F instead – then they do correspond to equal heights. And that’s how a graphic equalizer’s sliders work too.

+1 to linked stereo. :slight_smile:

If there is a rule in the code that colours relate to a specific dB level at default display settings, that might be nice to put in the Manual. All we know there is that blue is least energy and red and white are most energy.


It’s there, in the descriptions of Gain and Range preferences for Spectrograms. Perhaps better explanation is needed.


I figured out from playing around that with default settings, 0 to -20 dB is indistinguishably white, -100 and below is indistinguishably grey, and -40 is red, -60 magenta, -80 pale blue, with gradations between.

Try generating sine waves with amplitudes 1, 0.1, 0.01, 0.001, etc., which correspond to those dB values.
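The correspondence is just 20·log10 of the amplitude, which a quick check confirms:

```python
import math

def amplitude_to_db(amp):
    # Peak amplitude to dB relative to full scale: 20 * log10(amplitude).
    return 20.0 * math.log10(amp)

for amp in (1.0, 0.1, 0.01, 0.001):
    print(f"amplitude {amp:g} -> {amplitude_to_db(amp):.0f} dB")
```

So amplitude 1 is 0 dB, and each further factor of 10 drops the level by 20 dB, landing exactly on the white, red, magenta, and pale blue anchor levels described above.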

Oh I see. I was expecting that explanation to be at Audacity Manual .

Spectrograms Preferences says:

Range (dB): Affects the range of signal sizes that will be displayed as colours. The default is 80 dB and means that you won’t see anything for signals 80 dB below the value set for ‘Range’.

So that is another way of saying that by default you won’t see a colour for -160 dB? And set at 20 dB, you won’t see colour for signal at -40 dB?

That seems to hold, but you say that by default -100 dB and below is indistinguishably grey. At default range of 80 dB, I can just about see a tone at -100 dB.

I’m not sure this is completely clear to the uninitiated (count me as one of those).


Hey wait! That should say

[…] below the value set for ‘Gain’.

The one page does point to the other for Preferences, but it isn’t explained that there are red, magenta, and pale blue colors evenly spaced along the dB scale between “Gain” and “Gain - Range”. I figured that out by playing and later confirmed that is how the code also works.

“Gain” is just where white gives way to pink and “Gain − Range” is just where pale blue fades to grey.
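The rule as described can be sketched like this. It is a guess at the behaviour reconstructed from the discussion, not Audacity's actual drawing code, and snapping to the nearest named anchor stands in for the real gradations between colours:

```python
def spectrogram_colour(level_db, gain=20.0, range_db=80.0):
    """Rough colour rule at the given Gain and Range preferences."""
    top = -gain               # default -20 dB: this level and above is white
    bottom = top - range_db   # default -100 dB: this level and below is grey
    if level_db >= top:
        return "white"
    if level_db <= bottom:
        return "grey"
    # Red, magenta and pale blue are evenly spaced between the two ends
    # (-40, -60 and -80 dB at default settings); snap to the nearest one.
    anchors = [(top - range_db * f, name)
               for f, name in ((0.25, "red"),
                               (0.50, "magenta"),
                               (0.75, "pale blue"))]
    return min(anchors, key=lambda a: abs(a[0] - level_db))[1]

for db in (-10, -40, -60, -80, -120):
    print(db, "->", spectrogram_colour(db))
```

Changing Gain slides the whole scale up or down; changing Range stretches or compresses the coloured region between white and grey.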

There is also monochrome view.

You know, I thought it might mean that, but could not believe it was wrong.

The Manual is not yet frozen for 2.0.6, so I changed it.


Or more precisely, the NEGATIVE of that value!

I’ll leave it as is. :wink: The description for “Gain” does say that “20” corresponds to -20 dB. And it is just possible to see a tone of -100 dB at default Spectrograms settings, so I don’t want to start giving examples.

I can’t see a tone at -120 dB.


You might not want to document it, but try this:

Generate a tone of exactly 2 minutes and amplitude 1.0.

Select all and do this in Nyquist Prompt:

(prod s (pwev 1.0 1.0 (db-to-linear -120.0)))

Now you have something decaying exponentially at exactly 1 dB per second.

Then look at the spectrogram.

To make the spectrogram especially sharp, use Rectangular window type from preferences and just the right tone frequency, like 11025 Hz.

You can pinpoint 1:40 (100 seconds) as exactly where the colors end and 0:20 as exactly where the white ends. Selecting part of the track dims the colors, which might improve the contrast.
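The same test signal can be sketched outside Audacity. This plain-Python version (an illustration, not the Nyquist code above) produces a tone whose level falls by exactly 1 dB per second:

```python
import math

RATE = 44100
FREQ = 11025.0     # an exact divisor of the rate, for a sharp spectrogram

def sample(n):
    """Sample n of a tone decaying exponentially at 1 dB per second."""
    t = n / RATE
    envelope = 10.0 ** (-t / 20.0)   # -1 dB per second in linear terms
    return envelope * math.sin(2.0 * math.pi * FREQ * t)

# At 20 s the envelope has fallen to -20 dB (where the white ends), and
# at 100 s to -100 dB (where the colours end, at default settings).
print(10.0 ** (-20.0 / 20.0), 10.0 ** (-100.0 / 20.0))   # 0.1 1e-05
```

The envelope `10 ** (-t / 20)` is just the inverse of the 20·log10 rule: every second subtracts one dB, so the spectrogram becomes a ruler for the colour scale.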

That was an extreme example, pushing it to the limit; for most cases fewer than 10 bands would be sufficient.
[I think SpitFish has only one band.]

For voice-overs only lasting around a minute the processing delay isn’t too bad, and the results are worth the wait.

Here I’ve used the de-esser on the trumpet, which was center-panned, but after applying the de-esser the trumpet moves about in the stereo field because the channels are processed separately rather than linked…

trumpet test settings.gif

Those frequency settings aren’t de-essing any more! It sounds like you are using it as a band limiter instead to change a mix without having separate tracks. Interesting to see novel uses.

For fun, subtract the original from the modified signal, by inverting the original and mixing. In the residue, I hear no bass at all. I do hear a lot of the voice in the right channel. So the voice is not quite unchanged in the treated track, then.
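That null test is easy to reproduce in miniature. A minimal sketch with made-up sample values (not audio from the actual trumpet clip):

```python
def residue(original, processed):
    # Invert the original and mix it in: processed + (-original).
    # Whatever is non-zero is exactly what the effect changed.
    return [p - o for o, p in zip(original, processed)]

original  = [0.0, 0.5, -0.25, 0.8]
processed = [0.0, 0.4, -0.25, 0.7]   # made-up values, peaks pulled down

print([round(x, 6) for x in residue(original, processed)])
# → [0.0, -0.1, 0.0, -0.1]
```

If the effect truly left a region untouched, the residue there is silence; any voice audible in the residue means the treated track altered it, however slightly.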

I am not sure I have the ear training to hear all that you hear, but I think the bad effect is very noticeable between :09.7 and :11.2, yet I don’t notice it at all after that. Am I hearing the right thing?