Please help with a de-esser.

Felipe_Zanabria · March 16, 2018, 6:59pm

Today I am sharing a very useful plug-in for music producers and announcers. It is the De-sser, a de-esser effect that reduces sibilance in the voice, which is mainly a problem after boosting the treble.
De-sser.ny (2.28 KB)
It has the following controls:
Listen: processed sound or removed parts (default processed sound). This switches between hearing the effect applied to the audio and hearing only the frequencies that will be attenuated.
Frequencies above: 5000 to 12000 Hz (default 6000 Hz). Sets the frequency above which the audio is affected.
Threshold: 0 to -24 dB (default -10 dB). Sets the level above which audio is attenuated. The higher the value, the less audio is affected; the lower the value, the more audio is affected.
Mute Level: 0 to -100 dB (default -6 dB). Controls how much the detected audio is attenuated.
Look ahead: 1 to 100 ms (default 16 ms). Controls how quickly the attenuation sets in.
Release time: 1 to 1000 ms (default 20 ms). Controls how quickly the sound returns to normal after the sibilance has passed.
Note:
Use the removed parts option to hear how much audio is attenuated while adjusting the threshold and cutoff frequency. In general, the default look-ahead and release times work well in most cases. Set the Mute Level so that the sibilants sound balanced compared to the rest of the audio.
In this plug-in I have used a high-pass filter to isolate the high frequencies, and the results are fairly good.
How can I use eq-band or eq-highshelf to build my de-esser?
Can I write simpler code than the de-esser already published on the forum?
Greetings.

Trebor · March 17, 2018, 5:21am

When it reduces frequencies above the specified value (4 kHz to 12 kHz),
it also increases the frequencies between ~1 kHz and the specified value …
Felipe Zanabria's De-esser plugin.gif
That increase doesn’t seem right to me.

Felipe_Zanabria · March 17, 2018, 7:19am

Maybe it happens because I put this code in the plug-in as an experiment.

(setq freq (/ freq 1.6))

Another reason may be that I am using the high-pass and low-pass filters, which change the sound a little in addition to cutting at the given frequency.
Below is the file without this line.
De-sser changed.ny (2.25 KB)
Remember, I want to use eq-band or eq-highshelf instead of this:

; Applying the de-sser:
(case apply
  (0 (diff (lowpass2 *track* freq)
           (multichan-expand #'de-sser (highpass2 *track* freq))))
  (1 (sum (mult (highpass2 *track* freq) -1)
          (multichan-expand #'de-sser (highpass2 *track* freq)))))
)))

The function that controls the volume is already defined in the plug-in.
regards

steve · March 18, 2018, 7:00pm

Some comments from going through your code:

You don’t need to wrap the main part of the script in a “progn” block.
You don’t need to check that user values are within the ranges specified by the controls as slider controls are validated by default to disallow out of range values.
You probably don’t want to scale the frequency by 1.6. The filter “corner” frequency of the Nyquist high-pass and low-pass filters is the -3dB cut-off point. For shelf filters, the specified frequency is the half-gain frequency. I’d suggest not scaling the frequency so that the frequency that the user selects is the frequency that the effect uses.
You should check that the filter frequency is not too high so as to avoid bad things happening if the track has a low sample rate.
Your “de-sser” function is actually an “inverted gate” function. Better to give it a name that accurately indicates what it is.
For effective de-essing, you will probably need to set the filter frequency lower than 4000 Hz. I’d suggest a range starting at around 1800 Hz up to around 8000 Hz.
The release time should probably be a bit longer than the default so as to avoid “flutter” when the signal level is close to the threshold level.
You could use integer controls as fractional values are not really required.
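As a rough illustration of the sample-rate check mentioned in the comments above, the corner frequency could be clamped before it reaches the filter. This is a minimal sketch for the Nyquist Prompt, not code from the attached plug-in: the variable names `freq` and `max-hz` and the 0.45 safety factor are assumptions.

```lisp
;; Hypothetical stand-in for the value of the plug-in's frequency slider.
(setf freq 6000)

;; Filters need a corner frequency below half the sample rate (the Nyquist
;; frequency). The 0.45 factor is an assumed safety margin, not a fixed rule.
(setf max-hz (* 0.45 *sound-srate*))

;; Use the user's frequency unless the track's sample rate is too low for it.
(setf freq (min freq max-hz))

(highpass2 *track* freq)
```

On a 44100 Hz track the slider value passes through unchanged; on an 8000 Hz track the filter would run at 3600 Hz instead of a frequency the track cannot represent.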

I don’t think the algorithm at the end of your program is correct.
I think what you intend is for the “processed sound” to be the sum of the non-gated low frequencies plus the gated high frequencies.

The built-in Nyquist filters are pretty close to “ideal” Butterworth filters, but summing a low-passed and high-passed signal does not precisely reconstruct the full frequency band because of phase shifts through the filter. Rather than writing the low-pass filter as (lowpass2 signal Hz), it would be better to write it as (diff signal (highpass2 signal Hz)). This will ensure that below the gate threshold, the two signal paths combine to precisely reconstruct the full frequency range. This technique also has the benefit that if you use a different type of filter (for example a shelf filter or a band filter), the pass-band signal will still be complementary to the gated signal (they will still sum to the original signal below the threshold).
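The complementary split described above can be sketched in a few lines for the Nyquist Prompt. This is only an illustration of the idea, not the code in the attached file: the names `hz`, `gain`, `highs` and `lows` and their values are assumptions, and the gain here is fixed rather than driven by a gate.

```lisp
;; Hypothetical settings for illustration.
(setf hz 4000)                    ; split frequency in Hz
(setf gain (db-to-linear -6.0))   ; fixed gain applied to the high band

;; High band comes from the filter; the low band is "everything else",
;; so the two bands always sum back to the original signal.
(setf highs (highpass2 *track* hz))
(setf lows (diff *track* highs))

;; With gain = 1 this reconstructs *track*. With gain < 1 only the high
;; band is turned down. A de-esser would vary this gain over time.
(sum lows (mult gain highs))
```

Because `lows` is defined by subtraction, `highpass2` could be swapped for another filter without breaking the reconstruction.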

The attached file has the changes listed above:
deesser.ny (1.68 KB)
However, I think this could be improved further. Currently, the effect cuts in and out as an all or nothing effect, which tends to draw attention to the effect as the signal crosses the threshold. This can be mitigated by using a slower attack / release, but possibly a better approach would be to make the effect more “progressive”, so that the high frequencies are attenuated proportionally to how far above the threshold they are. To do that you would need to modify the amplitude follower (the “inverse-gate” function).

Felipe_Zanabria · March 19, 2018, 3:30am

Thanks Steve, you are a genius.
The effect improved a lot.
With my code the audio was amplified a little and now it is reduced, but I know that is because of the sibilance, and I can amplify it afterwards.
Yes, it is a reverse gate. I am not an expert in Nyquist and I could not find a way to do this until I remembered the Audacity wiki, which is what I based my code on.

I think the plug-in I started from was made before Audacity put limits on controls in Nyquist. I was more interested in the effect and I did not make many changes. What I did was remove the help screen and put in new code.
I do not really understand how to do what you suggested here:

To do that you would need to modify the amplitude follower (the “inverse-gate” function).

Felipe_Zanabria · March 23, 2018, 2:41pm

Hello Steve, I have been trying the plug-in and I have some concerns.
The cut filter does not reduce the “s” or “sh” sounds much. To look at this more closely, I tried the simple difference between the sound and the filtered sound to see how accurate it is.

(diff *track* (highpass2 *track* 2500))

This should mean that the high frequencies are removed completely.
I tried increasing the value of q, but then there is a lot of resonance.
One effect that has a well-defined filter without resonance is the Tonman de-esser.
Can I get close to this effect with the Nyquist filters?
Thank you

steve · March 23, 2018, 3:37pm

If you want a wider notch, decrease the Q.

Felipe_Zanabria · March 23, 2018, 5:23pm

I do not want that; rather, I want the frequency cut to be more aggressive. For example, instead of attenuating about 3 dB per octave, I want a steeper slope, as if the filter had more poles, like highpass4, highpass6 or highpass8.
I have tried taking the difference between these filters and the original sound, but it does not work.
Below is an MP3 of what I want to achieve.
The first sound uses highpass2 and the second uses highpass8, which I am going to replace with lowpass8, since I cannot take the difference with the original sound.

Try this:

(diff *track* (highpass8 *track* 2000))

It does not work.
I understand that this is complicated because there are 4 filters chained together.

steve · March 25, 2018, 8:39pm

I don’t think that this is really a good approach for making a de-esser.

Certainly you need some sort of envelope follower to track the amplitude, and you need to use that to modify the gain of the “S” frequency band, but using a “gate” type control will abruptly switch the effect on / off, which I would not expect to work well.

The way I would approach this would be to create an amplitude follower using “snd-avg” (http://www.audacity-forum.de/download/edgar/nyquist/nyquist-doc/manual/part6.html#index494). This is a little bit tricky because you need the envelope to precisely match the timing of peaks in the audio, and “snd-avg” is always looking ahead to the next block of samples. To get synchronization it will be necessary to offset the envelope by the length of “blocksize”.

Then I would clip the envelope so that only those parts above the threshold remain, and scale the envelope to a range equivalent to the amount of attenuation required.

The envelope can then be used as a control signal for “eq-band” (http://www.audacity-forum.de/download/edgar/nyquist/nyquist-doc/manual/part6.html#index351).

This is actually quite an ambitious project.
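The first steps of this outline (follow the envelope, keep only the part above the threshold, scale it to an attenuation range) might be sketched as follows for the Nyquist Prompt. This is a simplified sketch under several assumptions, not the approach exactly as described: instead of driving “eq-band” with the envelope, it applies the varying gain to a complementary high band, it assumes a mono track, it does not compensate the block-length timing offset, and all setting names and values are hypothetical.

```lisp
;; Hypothetical settings (mono track assumed; snd-avg takes a single sound).
(setf band-hz 5000)       ; lower edge of the sibilant band in Hz
(setf threshold 0.05)     ; linear peak level where attenuation begins
(setf max-cut-db -12.0)   ; attenuation at full excess, in dB
(setf block (truncate (* 0.005 *sound-srate*)))  ; 5 ms block, in samples

;; Split into a high band and its exact complement.
(setf highs (highpass2 *track* band-hz))
(setf lows (diff *track* highs))

;; Step 1: peak envelope of the band, one value per block.
(setf env (snd-avg highs block block op-peak))

;; Step 2: keep only what exceeds the threshold (0 everywhere else).
(setf excess (s-max (sum env (- threshold)) 0))

;; Step 3: map the excess to a gain curve between 0 dB and max-cut-db.
(setf cut-db
      (mult max-cut-db
            (s-min (mult (/ 1.0 (- 1.0 threshold)) excess) 1)))

;; Apply the time-varying gain to the high band only and recombine.
(sum lows (mult highs (db-to-linear cut-db)))
```

Below the threshold `cut-db` is 0 dB, so the two bands sum back to the original; above it, the high band is turned down progressively rather than switched.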

Trebor · March 25, 2018, 11:17pm

A simple de-essing strategy is to band-pass the sibilant frequencies, invert their waveform, then mix them back with the original audio.
Assuming the original & band-passed waveforms remain in-phase, then when inverted they will be in anti-phase and will attenuate any excessive sibilance.

No envelope-followers necessary.

steve · March 25, 2018, 11:25pm

but that will attenuate those frequencies throughout - essentially identical to simply filtering out those frequencies. A de-esser should attenuate the high “S” frequencies when those frequencies are excessive, but otherwise leave them alone, and that’s why it needs to be a “dynamic” effect.

Trebor · March 25, 2018, 11:50pm

Don’t mix-in the inverted “s” frequencies at their original volume, just mix-in at a fraction of the original volume, so it will preferentially attenuate the loudest “s” frequencies.

Could gate the inverted signal, so there is no de-essing whatsoever below a volume threshold, (nor when the signal is of short duration).
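The fractional mix described here might look like the following in the Nyquist Prompt. It is a minimal sketch with assumed values (`center-hz`, `q` and `amount` are hypothetical names), it leaves out the optional gate, and it relies on the band-passed signal staying roughly in phase with the original around the centre frequency.

```lisp
;; Hypothetical settings for illustration.
(setf center-hz 6000)   ; centre of the sibilant band in Hz
(setf q 2.0)            ; band-pass Q (higher = narrower band)
(setf amount 0.5)       ; fraction of the band to cancel, 0 to 1

;; Band-pass the sibilant region, invert it, and mix a fraction back in.
(setf band (bandpass2 *track* center-hz q))
(sum *track* (mult (- amount) band))
```

As noted earlier in the thread, without a gate or envelope this acts at every level, so on its own it behaves like a fixed cut around `center-hz`.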

steve · March 26, 2018, 12:34am

and there’s the envelope follower.
Rather than a gate, a downward expander could be used (like a gate with a soft knee), which would avoid sudden and probably noticeable “switching” of the filter. That’s essentially what a de-esser is.

steve · March 26, 2018, 1:23pm

Here’s a plug-in that I’ve put together based on the above description.
I’ve only done basic testing, and don’t have a good selection of sibilant speech recordings to test with, but from limited tests it seems to work reasonably, though a bit fiddly to set up. Previewing the “Residual” (the “S’s”) helps, as does checking the frequency response in Plot Spectrum.

I’ve commented just about everything so that you can see what it’s doing and how, so hopefully it will be a useful starting point for your own plug-in.
de-esser.ny (2.87 KB)

Felipe_Zanabria · March 26, 2018, 3:46pm

I do not understand the snd-avg function very well, but this plug-in sounds very good. The only strange behavior occurs when you choose to listen to the residual, where it clicks, as if there were a DC offset.
The attenuation of frequencies is much better defined as well.
Another problem: if the centre frequency is set low, a lot of audio besides the sibilants is removed.
Below is an MP3 before and after applying the effect.
It is a radio ad announcing the sale of a car.
You may not understand what I say, because I speak Spanish from Argentina.

Trebor · March 26, 2018, 5:01pm

By-George I think you’ve got it …

settings used.png
Nit picking here …
the attack and release should be different: ~10 ms for attack, ~100 ms for release.

Felipe_Zanabria · March 26, 2018, 6:08pm

Your recording sounds very good.
I agree about the attack and release.
With my voice I achieved a good result by setting the cutoff frequency at 4000 Hz, the bandwidth at 1,100, the attenuation at -10 dB and the attack and release time at 35 ms (0.035 seconds).

steve · March 26, 2018, 6:11pm

The click is likely to happen if the selection begins above the threshold. Generally you should try to start and end effects in a silent, or near silent, part of the audio. If I do that, I don’t notice a click.

SND-AVG is described here: Nyquist Functions

Separate attack and release times could be done, but that adds another layer of complication to the effect. In particular, it makes synchronising the envelope with transients in the audio even trickier.
The way you would do it is to use a shorter step size for snd-avg, so that there is a faster moving envelope, then chase the envelope with a “follower” such as SND-FOLLOW. Where attack/release times are quite short, this additional complication rarely gives much in the way of perceptual improvement, but it is definitely the way to go where slow attack/release times are required (such as “transparent” compression for classical music).
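A fragment along these lines shows how the faster envelope and the follower could fit together. It is a sketch only: it defines an envelope named `smooth-env` for use in a larger script and does not process the audio by itself, it assumes a mono track, and the step size, floor, band frequency and the 10 ms / 100 ms times (taken from the suggestion earlier in the thread) are illustrative values.

```lisp
;; Hypothetical settings (mono track assumed).
(setf band-hz 4000)     ; lower edge of the sibilant band in Hz
(setf attack 0.01)      ; rise time in seconds (~10 ms)
(setf release 0.1)      ; fall time in seconds (~100 ms)
(setf floor-level 0.001) ; lowest value the follower will report

;; Short step so the raw envelope moves quickly (about 1 ms per value).
(setf step (truncate (* 0.001 *sound-srate*)))
(setf raw-env (snd-avg (highpass2 *track* band-hz) step step op-peak))

;; snd-follow takes its look-ahead in samples of the envelope signal,
;; so convert the attack time using the envelope's own sample rate.
(setf lookahead (truncate (* attack (snd-srate raw-env))))

;; Chase the fast envelope with separate rise and fall times.
(setf smooth-env (snd-follow raw-env floor-level attack release lookahead))
```

The look-ahead adds its own delay on top of the snd-avg block offset, which is the extra synchronisation work mentioned above.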