Developing automatic click cleanup for speech -- new version

This is a continuation of this thread: https://forum.audacityteam.org/t/automatic-removal-of-mouth-smacks/31481/1

I am trying to automate the cleanup of the clicky mouth sounds that afflict my narrations. Doing it with the basic Audacity tools can be tedious if you are as perfectionistic about it as I tend to be. :frowning:

I think this work so far is very promising and may save me a lot of labor in getting fine results. :slight_smile:

But perhaps my methods are also useful for other applications of click removal such as captures from old tape and vinyl.

I have a major new revision of my experimental code and thought it best to start a new thread with the link to that at the top.

I am very pleased now with the detection part, which you can try out by selecting the labels only option.

I am still experimenting with the most intelligent thing to do for the repairs, which I feel I understand less well. Even so, I tried the default settings on some speech samples and the fixes seemed to help more than they harmed. Certain little crackles of the type I would surely have zoomed in on and fixed by hand before became inaudible. A few times, a plosive consonant was dulled in a bad way.

If you duplicate a track, and “Isolate changes” in one of them, you can easily test before and after by playing the original solo or playing the fixes simultaneously. You can even edit the changes to “silence” fixes you don’t like while keeping the rest.
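The before-and-after test can be illustrated with a minimal Python sketch. It assumes that “Isolate changes” outputs the sample-by-sample difference between the processed and the original audio; that is one reading of the behaviour described above, not something taken from the plug-in code, and the function names are hypothetical.

```python
def isolate_changes(original, processed):
    """Return the per-sample difference (processed - original)."""
    return [p - o for o, p in zip(original, processed)]


def mix(track_a, track_b):
    """Sum two tracks sample by sample, like playing them together."""
    return [a + b for a, b in zip(track_a, track_b)]


# Toy signal: the 0.875 sample stands in for a click.
original = [0.0, 0.25, 0.875, 0.25, 0.0]
processed = [0.0, 0.25, 0.375, 0.25, 0.0]

changes = isolate_changes(original, processed)

# Playing the original together with the changes reproduces the fix.
repaired = mix(original, changes)

# "Silencing" a fix in the changes track restores the original there.
silenced = [0.0] * len(changes)
unrepaired = mix(original, silenced)
```

Under that assumption, zeroing part of the changes track undoes only the fixes in that region and leaves the others in place.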

I have also rearranged things so that the progress indicator bar should advance more evenly.

The controls are many, as this is still an experimental thing with lots of knobs, but I think I have arranged them in order of decreasing importance for other curious testers, and that they are to some degree self-explanatory. But feel free to ask me questions about what they mean and how the tool works.
DeClicker.ny (22.4 KB)

Here’s one example of the fixes I can do. Above, an “s” sound with two hot spots around 2kHz, which sound like little bumps when you listen closely. They are a distraction of the sort I sometimes snip out by hand.

Below, the same after applying fixes with the default setting. In context it is now a smooth sound without the distracting bumps.

This demonstrates that not all of the clicks I want to remove call for lowpass filtering.

I was VERY pleased to get this cleanup done automatically.
esses.jpg

It looks very promising. :stuck_out_tongue:

On synthetic test signals with default settings, the performance is very impressive with excellent detection rate.

Testing on music (guitar + vocal), the default settings produced hundreds of false positives.
After careful tweaking of the settings it managed to detect two out of three actual clicks, with one false positive. Interestingly, one of the clicks that it successfully detected is one that I had missed when I previously checked by ear.

Are you running it separately on vocal and instrumental tracks?

Are clicks natural mouth noises or other things like electrical noise?

Of course lower thresholds increase detection. Perhaps “false positives” are real but subtle things.

“Clicks” can be bumps in low frequencies too, as in the picture I posted above. Another sort of distraction in even lower frequencies (like 250-750 Hz) is what I call the “implosive” smacking of lips together that you can hear sometimes in “sm” or “sp” combinations. My tool can highlight those too, but I have not (yet) found good repair settings for those flaws.

Of course I can treat higher frequency crackles too.

I’m wondering if different passes of this tool with different settings might be good routine treatment for voice. Treat lower frequencies with perhaps longer block size and different filtering options (to be discovered).

I’m abusing the plug-in :smiley:
I do a lot more work on music than pure vocal, so I tried it on a music mix to see what it came up with. It does not handle steel string guitar well, probably because a steel string guitar produces lots of thumps and clicks that are supposed to be there. Even so, with careful tweaking it did a creditable job.

I’ve not looked at the code yet, but it may be worth making the core detection routine into a self contained function that can be reused elsewhere. I suspect there could be lots of applications where your detection method could be useful.

In case you are wondering about the meaning of labels.

Identify blocks in which the ratio of the peak amplitude of the test frequency component to that in each of the neighboring blocks, expressed in dB, exceeds the “relative” threshold, and in which the amplitude also exceeds the “absolute” threshold (default -50 dB).

Find the intersection of overlapping such blocks, which may be smaller. Combine the two peak-amplitude ratios for the edges of that sub-block into one by taking geometric mean.
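The two steps so far can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the plug-in's actual code: the per-block peak amplitudes are taken as already measured for one test frequency, the `step` parameter (index distance to a "neighboring" block) and the 6 dB relative threshold are hypothetical, and only the -50 dB absolute default comes from the description above. Note that the geometric mean of two linear ratios is the same as the arithmetic mean of their dB values.

```python
import math

FLOOR = 1e-10  # guards against log10(0) and division by zero on silent blocks


def to_db(ratio):
    """Convert a linear amplitude ratio to decibels."""
    return 20.0 * math.log10(max(ratio, FLOOR))


def flag_blocks(peaks, step=1, relative_db=6.0, absolute_db=-50.0):
    """Flag blocks whose peak stands out from both neighbours.

    peaks: linear peak amplitude of one test frequency component per block.
    step:  index distance to the neighbouring block (an assumption).
    Returns {block_index: combined_score_db}.
    """
    flagged = {}
    for i in range(step, len(peaks) - step):
        here = max(peaks[i], FLOOR)
        if to_db(here) <= absolute_db:
            continue  # too quiet to count, whatever the neighbours do
        left_db = to_db(here / max(peaks[i - step], FLOOR))
        right_db = to_db(here / max(peaks[i + step], FLOOR))
        if left_db > relative_db and right_db > relative_db:
            # Geometric mean of the linear ratios == mean of the dB values.
            flagged[i] = (left_db + right_db) / 2.0
    return flagged


def intersect(a, b):
    """Intersection of two (start, end) intervals, or None if disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None


loud = flag_blocks([0.01, 0.01, 0.08, 0.01, 0.01, 0.001])
quiet = flag_blocks([0.0001, 0.002, 0.0001])
overlap_ms = intersect((0, 10), (5, 15))
```

In the first call only block 2 is flagged; in the second the middle block stands out from its neighbours but sits below -50 dB, so nothing is flagged.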

Combine information for different frequencies and overlapping intervals into frequency ranges. (How to define “overlapping?” The simple criterion now is common starting time. Take maximum length of interval.)

Label intervals with those frequencies, and the largest of the amplitude differences from background for any of the frequencies. I don’t tell you which frequency, because I thought the labels were cluttered enough already.

(Then combine labels for the channels in case of stereo, indicating which channel with a prefix of L or R or L/R. But my work is all with mono.)
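The grouping and labelling steps above might look like this in Python. It is only a sketch of the stated criterion (common starting time, maximum length, largest amplitude difference); the tuple layout, the function name, and the label text format are all hypothetical and not taken from the plug-in.

```python
def merge_detections(detections):
    """Merge per-frequency detections that share a starting time.

    detections: list of (start_s, end_s, freq_hz, score_db) tuples.
    Returns a list of (start_s, end_s, label_text), sorted by start time.
    """
    groups = {}
    for start, end, freq, score in detections:
        g = groups.setdefault(
            start, {"end": end, "lo": freq, "hi": freq, "score": score}
        )
        g["end"] = max(g["end"], end)        # take the longest interval
        g["lo"] = min(g["lo"], freq)         # widen the frequency range
        g["hi"] = max(g["hi"], freq)
        g["score"] = max(g["score"], score)  # keep the largest difference

    labels = []
    for start in sorted(groups):
        g = groups[start]
        if g["lo"] == g["hi"]:
            band = f"{g['lo']:g} Hz"
        else:
            band = f"{g['lo']:g}-{g['hi']:g} Hz"
        labels.append((start, g["end"], f"{band}, {g['score']:.1f} dB"))
    return labels


labels = merge_detections([
    (0.120, 0.125, 2000, 12.0),
    (0.120, 0.128, 2500, 15.5),
    (0.300, 0.310, 500, 9.0),
])
```

The first two detections start together, so they collapse into one label covering 2000-2500 Hz with the larger of the two scores; the third stays on its own.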

Note then that if the block size is too small, that is, less than the period of the fundamental of the tone, you should expect lots of false positives. The default is 10 ms, which corresponds to 100 Hz. Is your guitar growling at lower frequencies than that? Try a larger block size.
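The arithmetic behind that rule of thumb is just period = 1 / frequency. A small Python sketch, with hypothetical helper names:

```python
def lowest_frequency_hz(block_ms):
    """Lowest fundamental whose full period fits in a block of this length."""
    return 1000.0 / block_ms


def min_block_ms(fundamental_hz):
    """Shortest block that still spans one full period of the fundamental."""
    return 1000.0 / fundamental_hz


default_limit = lowest_frequency_hz(10)        # 100.0 Hz for the 10 ms default
low_e_block = round(min_block_ms(82.41), 2)    # 12.13 ms
```

The 82.41 Hz figure is the low E string of a guitar in standard tuning, so by this rule a block of a little over 12 ms would be the minimum for that note.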

I suppose I posted a scary face picture up there without intending to. Happy Halloween!

steve, should I extract the detection part as a “library function” and put it on the plug ins board, even though it is not a complete effect?

I do think this part of the work is more stable. I have no more ideas I want to try, except perhaps stepping the test frequencies logarithmically as an option. Experimentation is all in the filtering methods now.
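For comparison, here is what linear and logarithmic stepping of the test frequencies could look like in Python. The function names and the 250-4000 Hz range are illustrative assumptions, not settings from the plug-in.

```python
def linear_steps(f_lo, f_hi, n):
    """n frequencies from f_lo to f_hi, equally spaced in Hz (n >= 2)."""
    if n < 2:
        raise ValueError("need at least two steps")
    step = (f_hi - f_lo) / (n - 1)
    return [f_lo + k * step for k in range(n)]


def log_steps(f_lo, f_hi, n):
    """n frequencies from f_lo to f_hi with a constant ratio between them."""
    if n < 2:
        raise ValueError("need at least two steps")
    ratio = (f_hi / f_lo) ** (1.0 / (n - 1))
    return [f_lo * ratio ** k for k in range(n)]


linear = linear_steps(250, 4000, 5)
logarithmic = [round(f) for f in log_steps(250, 4000, 5)]
```

With these inputs the logarithmic version lands on octaves (250, 500, 1000, 2000, 4000 Hz), which puts more test frequencies in the low range than equal spacing in Hz does.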

I think that’s an excellent idea.
It could go on this board (as a separate topic), or on the New Plug-ins page. If you put it on the New Plug-ins page, make it clear in the topic title and in the post that it is a “library function” rather than a complete plug-in, otherwise we will probably get people writing in to say that it doesn’t work :wink:

Perhaps I should post two files, a library function, and a complete Analyzer plug in using it but doing the detection and labelling only? That would encourage people to play with it before adapting it.

Certainly. If you have time to do that, that would be terrific. The plug-in would also serve as an implementation example.

Also, if you have time, it could be useful if you made a version that includes lots of comments explaining what it is doing and how.

I hope you say that as a general rule and not a complaint about this code in particular :slight_smile: I tried to comment a lot as I wrote it.

Not a complaint :wink: