Updated De-Clicker and new De-esser for speech

Paul_L · September 14, 2014, 5:58pm

It’s there, in the descriptions of the Gain and Range preferences for Spectrograms. Perhaps a better explanation is needed.

http://manual.audacityteam.org/o/man/spectrograms_preferences.html

I figured out from playing around that with default settings, 0 to -20 dB is indistinguishably white, -100 and below is indistinguishably grey, and -40 is red, -60 magenta, -80 pale blue, with gradations between.

Try generating sine waves with amplitudes 1, .1, .01, .001, etc. which correspond to those dB values.
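As a quick check of the amplitude-to-dB correspondence mentioned above, here is a minimal Python sketch. It assumes the usual convention that a linear amplitude of 1.0 is full scale (0 dB); the function name `amplitude_to_db` is made up for illustration and is not part of Audacity.

```python
import math


def amplitude_to_db(amplitude):
    """Convert a linear peak amplitude (1.0 = full scale) to dB."""
    return 20.0 * math.log10(amplitude)


# Each factor of 10 in amplitude is a 20 dB step.
for amp in (1.0, 0.1, 0.01, 0.001, 0.0001, 0.00001):
    print(f"amplitude {amp:g} -> {amplitude_to_db(amp):.0f} dB")
```

So the sine waves at amplitudes 1, .1, .01, .001 sit at 0, -20, -40 and -60 dB respectively.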

Gale_Andrews · September 14, 2014, 6:58pm

Oh I see. I was expecting that explanation to be in the Audacity Manual.

Spectrograms Preferences says:

Range (dB): Affects the range of signal sizes that will be displayed as colours. The default is 80 dB and means that you won’t see anything for signals 80 dB below the value set for ‘Range’.

So that is another way of saying that by default you won’t see a colour for -160 dB? And set at 20 dB, you won’t see colour for signal at -40 dB?

That seems to hold, but you say that by default -100 dB and below is indistinguishably grey. At default range of 80 dB, I can just about see a tone at -100 dB.

I’m not sure this is completely clear to the uninitiated (count me as one of those).

Gale

Paul_L · September 14, 2014, 7:02pm

Hey wait! That should say

[…] below the value set for ‘Gain’.

Paul_L · September 14, 2014, 7:07pm

The one page does point to the other for Preferences, but it isn’t explained that there are red, magenta, and pale blue colors evenly spaced along the dB scale between “Gain” and “Gain - Range”. I figured that out by playing and later confirmed that is how the code also works.

“Gain” is just where white gives way to pink and “Gain - range” is just where pale blue fades to grey.
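The mapping described above can be sketched in Python. This is only an illustration of the behaviour reported in this thread, not Audacity's actual code: the function name and the 0.0 to 1.0 "position" scale are assumptions, and the colour names in the comments are the ones observed earlier in the thread at default settings (Gain 20, Range 80).

```python
def spectrogram_position(level_db, gain=20.0, range_db=80.0):
    """Return where a level falls on the colour scale.

    1.0 is the top of the scale (white, at -gain dB) and 0.0 is the
    bottom (grey, at -(gain + range_db) dB). Levels outside that span
    are clipped, which is why they look indistinguishable.
    """
    bottom_db = -(gain + range_db)
    position = (level_db - bottom_db) / range_db
    return min(1.0, max(0.0, position))


# Default settings: white from -20 dB up, grey from -100 dB down.
for level in (0, -20, -40, -60, -80, -100, -120):
    print(level, spectrogram_position(level))
```

With the defaults, -40, -60 and -80 dB land at 0.75, 0.5 and 0.25 of the scale, which matches the evenly spaced red, magenta and pale blue reported above.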

There is also monochrome view.

Gale_Andrews · September 14, 2014, 7:16pm

You know, I thought it might mean that, but could not believe it was wrong.

The Manual is not yet frozen for 2.0.6, so I changed it.

Gale

Paul_L · September 14, 2014, 7:18pm

Or more precisely, the NEGATIVE of that value!

Gale_Andrews · September 14, 2014, 7:36pm

I’ll leave it as is. The description for “Gain” does say that “20” corresponds to -20 dB. And it is just possible to see a tone of -100 dB at default Spectrograms settings, so I don’t want to start giving examples.

I can’t see a tone at -120 dB.

Gale

Paul_L · September 14, 2014, 8:19pm

You might not want to document it, but try this:

Generate a tone of exactly 2 minutes and amplitude 1.0.

Select all and do this in Nyquist Prompt:

(prod s (pwev 1.0 1.0 (db-to-linear -120.0)))

Now you have something decaying exponentially at exactly 1 dB per second.

Then look at the spectrogram.

To make the spectrogram especially sharp, use Rectangular window type from preferences and just the right tone frequency, like 11025 Hz.

You can pinpoint 1:40 or 100 seconds as exactly where the colors end and 0:20 as exactly where the white ends. Select some to dim the colors and it might improve contrasts.
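The envelope in the Nyquist line above can be checked numerically. The following Python sketch assumes the envelope is a pure exponential from 1.0 down to -120 dB, stretched over the 120-second selection; the helper names are made up for illustration.

```python
import math


def db_to_linear(db):
    """Convert dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def envelope(t, duration=120.0, end_db=-120.0):
    """Exponential fade from 1.0 at t=0 to db_to_linear(end_db) at t=duration."""
    return db_to_linear(end_db) ** (t / duration)


def level_db(t):
    """Level of the faded full-scale tone, in dB, at time t seconds."""
    return 20.0 * math.log10(envelope(t))


for t in (0, 20, 60, 100, 120):
    print(f"t = {t:3d} s -> {level_db(t):7.1f} dB")
```

Because the level drops 1 dB per second, the time in seconds reads directly as the negative dB level: 0:20 is -20 dB and 1:40 is -100 dB.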

Trebor · September 14, 2014, 11:55pm

That was an extreme example, pushing it to the limit; for most cases fewer than 10 bands would be sufficient.
[I think SpitFish only has one band.]

For voice-overs lasting only around a minute the processing delay isn’t too bad, and the results are worth the wait.

Here I’ve used the de-esser on the trumpet, which was center-panned, but after applying the de-esser the trumpet moves about in the stereo field because the channels are processed separately rather than linked …

trumpet test settings.gif

Paul_L · September 15, 2014, 1:33pm

Those frequency settings aren’t de-essing any more! It sounds like you are using it as a band limiter instead to change a mix without having separate tracks. Interesting to see novel uses.

For fun, subtract the original from the modified signal, by inverting the original and mixing. In the residue, I hear no bass at all. I do hear a lot of the voice in the right channel. It’s not quite unchanged then in the treated track.
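The invert-and-mix test above amounts to a sample-by-sample subtraction. A minimal Python sketch, using plain lists of samples with made-up values rather than real audio:

```python
def residue(original, processed):
    """Mix the processed signal with the inverted original, sample by sample."""
    return [p + (-o) for o, p in zip(original, processed)]


# Hypothetical samples: only the first one was changed by the effect.
original = [0.5, 0.5, -0.25]
processed = [0.25, 0.5, -0.25]
print(residue(original, processed))
```

Wherever the effect left the audio untouched the residue is exactly zero, so anything audible in the residue is what the effect removed or altered.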

I am not sure I have the ear training to hear all that you hear, but I think the bad effect is very noticeable between :09.7 and :11.2, yet I don’t notice it at all after that. Am I hearing the right thing?

Dr_Righteous · September 17, 2014, 5:15pm

WOW…
I’m excited about trying these.

I need all the help I can get cleaning up the less than great live recordings I’m given to work with.

planetlizz · September 18, 2014, 11:55pm

AMAZING!!

I’m in awe!

DE-CLICKER Defies the old adages:

(1) “If it sounds too good to be true, it probably is.” Not so!

(2) “There’s no FREE lunch.” Not so!

You ROCK!! You’re a STAR!!

Muahhhhh!!!

Liz

#lifesaver #extratimeonmyhands #byebyefrustration

Trebor · October 2, 2014, 2:53am

Paul-L’s de-esser avoids the need for expensive dentistry …
4kHz whistle through gap in front teeth cured by Paul-L's de-esser.gif

This feat would not have been possible with any other de-esser I know of.

Paul_L · October 2, 2014, 3:40am

Exactly what I designed it for!

Gale_Andrews · October 2, 2014, 6:04am

I’m still +1 for that if linked stereo is added.

But, Paul gets to write a user-level description of each control on the Wiki.

Gale

waxcylinder · October 2, 2014, 8:14am

+1 I second that

And a great piece of work this, Paul

WC

Trebor · October 2, 2014, 4:18pm

The acid test for a de-esser: “Herbert” from the vulgar cartoon “Family Guy” …

To remove a whistle, each band has to be about 100 Hz wide or less within the whistle’s frequency range.
At that high resolution, processing on my five-year-old computer takes several times the duration of the track;
standard de-essing, which only requires bands of say 1 kHz width, is a lot quicker: less than the track’s duration.

Paul_L · October 2, 2014, 5:53pm

Bands have equal width in log f space, not linear, so describing them in octave or step terms is more appropriate than Hz. Higher bands will span more Hz than lower.
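To illustrate equal widths in log-frequency space, here is a small Python sketch that splits a range into bands with a constant frequency ratio. It only shows the general idea of logarithmic spacing and is not taken from the de-esser's code; `log_band_edges` is a made-up name, and the 3800 to 4800 Hz range is just an example input.

```python
def log_band_edges(low_hz, high_hz, bands):
    """Return bands + 1 edge frequencies, equally spaced in log frequency."""
    ratio = (high_hz / low_hz) ** (1.0 / bands)
    return [low_hz * ratio ** i for i in range(bands + 1)]


edges = log_band_edges(3800.0, 4800.0, 10)
for lo, hi in zip(edges, edges[1:]):
    print(f"{lo:8.1f} .. {hi:8.1f} Hz  (width {hi - lo:6.1f} Hz)")
```

Every band spans the same ratio, so the width in Hz grows from the lowest band to the highest even though each band is the same musical interval.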

Thanks for the examples. I have not yet heard the Family Guy. What settings were used for the 4 kHz whistle example?

It is quite possible that my methods for steep sided band passes are ridiculous and I need to research the math better to learn a more efficient computation.

Trebor · October 2, 2014, 6:13pm

The range was 3.8 to 4.8 kHz (i.e. 1000 Hz), with 10 bands, which is where I got the “100 Hz” from (1000/10).
Paul-L's de-esser settings to remove 4kHz whistle.gif

That shows you have taste: “Family Guy” is like the Simpsons cartoon, only much more vulgar.

Paul_L · October 2, 2014, 8:52pm

(ln(4800 / 3800) / 10 ) / ln 2 = 0.0337 octaves per band, or 0 steps and 40.44 cents.

Those are REALLY narrow bands! Isn’t that excessive for good results? My default settings, which I find satisfactory for blanket treatment of tracks, are 10 bands between 2500 and 8000 which works out to 2 steps 1.37 cents, call it two.
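The band-width arithmetic above can be reproduced with a short Python sketch. It assumes 12 steps (semitones) per octave and 100 cents per step; the function name `band_width` is made up for illustration.

```python
import math


def band_width(low_hz, high_hz, bands):
    """Return (octaves, whole steps, leftover cents) for one band."""
    octaves = math.log2(high_hz / low_hz) / bands
    total_cents = octaves * 1200.0      # 12 steps x 100 cents per octave
    steps = int(total_cents // 100)     # whole semitone steps
    cents = total_cents - 100 * steps   # remainder in cents
    return octaves, steps, cents


for low, high in ((3800.0, 4800.0), (2500.0, 8000.0)):
    octaves, steps, cents = band_width(low, high, 10)
    print(f"{low:.0f}-{high:.0f} Hz: {octaves:.4f} octaves, "
          f"{steps} steps {cents:.2f} cents per band")
```

This gives roughly 0 steps 40.44 cents per band for the whistle settings and 2 steps 1.37 cents for the defaults, in line with the figures quoted above.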

This math isn’t very exotic – graphic equalizer sliders are logarithmic too, after all – and Audacity’s graphic equalizer has 4 steps per band, so I only halved that width for my defaults.

Narrower bands are more expensive to calculate than wider ones, and lower frequency bands are more expensive than higher.

Were you tuning settings just to fix this one bad whistle, rather than a blanket treatment of a track? Come 2.0.7, I’d say, don’t do that, do your spot fixes in spectrogram easily with this OTHER new development of mine… http://forum.audacityteam.org/viewtopic.php?f=20&t=81302

[quote=“Paul L”]
I have not yet heard the Family Guy
[/quote]

That shows you have taste: “Family Guy” is like the Simpsons cartoon, only much more vulgar.

Just watched. Oh man, that’s gross. I couldn’t do that whistle on every s if I tried.

But how do results compare with “standard de-essing?”