Who is able to create a simple VST or Nyquist plug-in?

Who can help by creating a simple plug-in for Audacity? (Ideally it would be a VST that also works in Sony Vegas.)
Based on previously set parameters/controls (levels, attacks and releases), the plug-in should:

  1. determine the gaps between useful signal - mainly voice
  • my idea is that the plug-in starts selecting from the end of the previous useful signal and its decay,
    and stops according to the set parameters - attack, level and duration of useful signal.
    The duration is there to distinguish between short noises, like a mouse click, and words, which are longer
  2. clear these gaps to -infinity dB (complete silence)
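
To make the two steps above concrete, here is a minimal sketch in Python (not Nyquist, and not an actual plug-in). The names `threshold`, `hold` and `min_len` are my own stand-ins for the "level", "release/decay" and "duration" controls; attack is not modelled at all, and the input is assumed to be a plain list of sample values.

```python
def find_segments(samples, threshold, hold):
    """Return [start, end) index pairs covering samples with |x| >= threshold.

    Loud samples separated by fewer than `hold` quiet samples are merged
    into one segment, which plays the role of a release/decay time.
    """
    segments = []
    start = None      # start index of the segment being built
    last_loud = None  # index of the most recent loud sample
    for i, x in enumerate(samples):
        if abs(x) >= threshold:
            if start is None:
                start = i
            elif i - last_loud - 1 >= hold:
                # The quiet gap was long enough: close the old segment.
                segments.append((start, last_loud + 1))
                start = i
            last_loud = i
    if start is not None:
        segments.append((start, last_loud + 1))
    return segments


def mute_gaps(samples, threshold, hold, min_len):
    """Keep only segments at least `min_len` samples long; zero the rest."""
    out = [0.0] * len(samples)
    for start, end in find_segments(samples, threshold, hold):
        if end - start >= min_len:
            out[start:end] = samples[start:end]
    return out
```

With a one-sample click followed later by a longer burst, only the burst survives; the click and the low-level noise between sounds are set to exact zero. Whether duration alone separates mouth noises from short words is something that would have to be tested on real recordings.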

The plug-in is meant to clean the “silent” pauses between words.
It should get rid of a speaker's unwanted noises, like munching, breaths, and opening and closing of the lips.
I am working on some voluntary translations and this plug-in would help me dramatically,
because these noises are just awful to hear.
I have tried everything possible, but no gate can clean it completely and, most importantly,
aggressive gating drastically degrades the quality of the sound.
I have been doing the above process manually in Audacity, with great success, but it is very time-consuming.
A simple plug-in could do it easily.

I could help with a Nyquist plug-in.
I can’t help with VST.

Thanks Steve,
so can you help me?
What should I do so that you can help me with that Nyquist plug-in for Audacity?

You don’t need to immediately become a Nyquist expert, but I think the first thing is for you to get a little familiarity with Nyquist.

Nyquist is well documented these days.
This page is not yet complete, but has some fairly simple examples that you can try: http://wiki.audacityteam.org/wiki/Nyquist_Audio_Programming

Also try to read through / work through this: http://wiki.audacityteam.org/wiki/Nyquist_Plug-ins_Reference

Other very useful pages that you will probably want to bookmark:

Spend a bit of time just playing with Nyquist and see if you can get it to do something (anything).

The basis of the code that you need will probably be similar to this plug-in, so after you have had a few days playing, see if you can work out how this works. You will probably need to look up all of the functions one by one in the Nyquist or XLISP manuals. (The XLISP manual is particularly useful as it has examples for each of the functions, though most of the functions that operate on sounds are specific to Nyquist and will only be in the Nyquist manual). http://forum.audacityteam.org/download/file.php?id=2122

Don’t be disheartened if you don’t understand everything straight away. Feel free to ask questions. Also, if you tell me about your successes then it will help me to pitch advice and suggestions at a useful level. Do you have any previous programming experience? (You don’t need to have).

I read through the requirements. You’re trying to make a system that recognizes human speech and separates it from not human speech (noise and interference).

I’ll watch.


“- my idea is that the plug-in starts selecting from the end of the previous useful signal and its decay,
and stops according to the set parameters - attack, level and duration of useful signal.
The duration is there to distinguish between short noises, like a mouse click, and words, which are longer”

So it’s all based on the duration of “sounds”, ie the signal level is greater than a given threshold. We’re not attempting the impossible.

“It should get rid of a speaker's unwanted noises, like munching, breaths, and opening and closing of the lips.”

[continuing to watch]

Yes, you're right.
When a speaker talks into a condenser microphone, it is very sensitive to the awful unwanted noises that the human mouth produces between words,
especially before sentences. These are munching, chewing, swallowing saliva, and opening and closing the lips.
These noises are quite loud compared with “normal” noise, hum, etc.
So when applying classic methods like gates to cut them, the useful signal (speech) is distorted, because the gate threshold must be set too high.
By the way, there are numerous ways to get rid of normal noise successfully. I use a Røde mic which, when recording in a “noise-clean” room, produces no noise at all on my system, so I do not need a gate to clean up noise.

Steve, my question was unclear.
I am not a programmer.
I can barely write simple HTML, that's all.
I am looking for either an existing plug-in or someone who could make it for me.
I know it is simple, but I am a user rather than a programmer.
Can anybody create this for me?
Thanks a lot, guys.

Many guys do it this way.
They open the file in Audacity.
Anyone who has been working with sound for about a week can easily tell what is useful signal and what is trash.
They simply select the “silent” passages and silence them (Ctrl+L in Audacity).
Doing this for a song that lasts 3 minutes is a piece of cake.
But imagine speech lasting 1-2 hours… it would just be wasting time, when this is a very simple algorithm that the human brain can learn quickly.
The algorithm that my brain uses in this case is very simple:
it compares levels and lengths (maybe not so much attacks, RMS/peaks, release?).

Even a simple short word is very different from these unwanted noises, which are shorter and quieter.

“Anyone who has been working with sound for about a week can easily tell what is useful signal and what is trash.”

Exactly correct. You or the software has to know what is valuable human speech and what isn’t. If you simply take all the sibilance and ticks out of a human voice, it turns into mud. One of the things you gained by buying a high quality microphone was clarity, liveness and crispness of presentation. This appears to be exactly what you’re trying to remove. You might do better by stretching a heavy woolen sock over the microphone – and I’m not kidding. Tune the sock for good presentation.

One of my favorite radio shows is This American Life. Ira Glass has a vocal thing that he does in that he always “ticks” his lips before he starts talking.

[tick] “Hello everyone, I’m Ira Glass. On this week’s show…”

I’m sure I’m the only person on earth who pays attention to this kind of thing, but it’s part of his signature. Whenever I edit his show for the car, I make it a point to include all of those.

Are you too close to the microphone? Proximity effects can be deadly if you’re too close. That’s one of the reasons people use a blast filter – to keep the performer away from the mic.

How are you listening? If you’re trying to mix on a “computer sound system” then good luck with that. Most of them are terrible. Headphones are better. Consult reviews. Have you tried listening on the client sound system? You could get an interesting surprise that none of this stuff is audible there.

Gates, etc. One of the reasons gates and other filters produce science fiction sound is sound never really stops between words in real life. There’s always a tiny bit of ambiance or room sound back there and when it vanishes, your head rings bells that there’s something wrong. Of course, if you’re mixing with background effects or music, then you may not need to bother with any of this.

A respected sound shooter friend of mine once told me I’m going to hear a lot more in the shoot headphones than is ever going to make it to the show.

Post a bit of your ticky, hissy dialog track.



What about a noise gate which uses an RMS value rather than the absolute value as a threshold?

A click during otherwise silence can be loud (i.e. high ABS value) but its short duration means it has a low RMS value compared with speech.
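
To put rough numbers on that, here is a small Python sketch (Python purely for illustration, not Nyquist). The 50 ms window, the 1 ms full-scale "click" and the 200 Hz sine standing in for voiced speech are all made-up test signals, not measurements from a real recording.

```python
import math


def peak(samples):
    """Largest absolute sample value (what an ABS-based gate reacts to)."""
    return max(abs(x) for x in samples)


def rms(samples):
    """Root-mean-square level over the whole window."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


RATE = 44100
WINDOW = RATE // 20        # 50 ms analysis window (2205 samples)
CLICK_LEN = RATE // 1000   # roughly 1 ms (44 samples)

# Hypothetical click: 1 ms at full scale, then silence for the rest of the window.
click = [1.0] * CLICK_LEN + [0.0] * (WINDOW - CLICK_LEN)

# Hypothetical voiced sound: a 200 Hz sine at amplitude 0.3.
tone = [0.3 * math.sin(2 * math.pi * 200 * n / RATE) for n in range(WINDOW)]
```

Here `peak(click)` is 1.0 against about 0.3 for the tone, yet `rms(click)` comes out near 0.14 against about 0.21 for the tone. So with these made-up signals an RMS threshold between those two values would pass the tone and reject the click, while a peak threshold could not.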

If a noise gate threshold is set too high you would lose the beginning of words, (true of ABS or RMS versions).

[ I think this noise gate uses an absolute value … http://wiki.audacityteam.org/wiki/Nyquist_Effect_Plug-ins#Noise_Gate ]

I see what you are suggesting Koz, but before we proceed further with the discussion I think we need to know whether jankobenko wants tips for improving recording quality, or wants to create a plug-in effect. The original post appears to be asking about creating a plug-in effect. Of course it’s not too late for jankobenko to change his mind, but if building a plug-in effect is still the question then I think we should try to keep to the point.

A noise gate that uses RMS detection can certainly help in the case of very short clicks and crackles (such as minor scratches on vinyl), but probably won’t help much for longer noises like mouse clicks.

Yes it does, though it could be modified quite easily to use RMS detection.

I did miss one item in the laundry list. Some people are aggressively not announcers. Does this odd mouth noise thing happen if somebody else speaks into the microphone? I know people with that microphone and they sound just grand. I would wonder if we’re trying to cure a broken microphone in software. Even if we do succeed in producing a Professional Audio Filter (PAF), requiring it each and every time you perform is not coming from a happy place. If somebody falls in love with you enough to want you to do a lot of voice work, it can be debilitating.

“Hurry up. We need that voiceover right now!”

“I can’t, I have to get rid of my swallow rattle and cheek pops [watching hourglass on screen].”


Thank you Koz,
but what I need is not a discussion but help from someone who can create this plug-in.

How are you getting on with Nyquist jankobenko? Have you had chance to spend any time playing with it?

Could I summarise the algorithm as “a plug-in to mute short sounds”?

Mouse clicks are broadband noise, extending beyond 10 kHz …
[attached image: mouse-clicks are broadband.gif]
… that property may enable them to be differentiated from other sounds.
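
One way such a broadband test might be sketched, in Python for illustration only: measure what fraction of a short window's spectral energy lies at or above a cutoff. The function name and the 10 kHz cutoff used in the note below are assumptions, and a naive DFT is used so that nothing beyond the standard library is needed (it is slow, so keep the window short).

```python
import cmath
import math


def high_band_fraction(samples, rate, cutoff_hz):
    """Fraction of spectral energy at or above cutoff_hz (DC bin skipped)."""
    n = len(samples)
    total = 0.0
    high = 0.0
    for k in range(1, n // 2 + 1):
        freq = k * rate / n
        # Naive DFT coefficient for bin k.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        energy = abs(coeff) ** 2
        total += energy
        if freq >= cutoff_hz:
            high += energy
    return high / total if total > 0 else 0.0
```

A single-sample impulse has a flat spectrum, so with a 256-sample window at 44100 Hz a little over half of its energy lands above 10 kHz, while a low-frequency sine has almost none there. Bear in mind that sibilants and consonants in real speech also carry high-frequency energy, so this could only ever be one clue among several.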

[ BTW for windows users there is the free VST Floorfish , which is a frequency dependent noise gate ]

Thanks Trebor. I agree that taking the frequency content into account is likely to be beneficial for determining which sounds are “useful” sounds and which are “noise”.
Once we have a basic implementation of the analysis stage of the plug-in we can test out various ways of improving its accuracy against real recordings.

silence-cleaner.ny (2.09 KB)
I had hoped that someone would have a sample file with those clicks. I can’t record at the moment, so I experimented with the following “speech simulation”:

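; rr is shorthand for real-random
(setfn rr real-random)
; Each time the slow low-passed noise crosses zero going upwards, trigger
; starts a new burst of band-pass filtered (reson) noise with random
; duration, centre frequency and bandwidth.
(setf s (abs-env (trigger (lowpass8 (noise 30) 2)
                          (reson (noise (expt (rr 0.0001 1.2) 3.0))
                                 (rr 200 6000) (rr 50 700) 2))))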

Based on the produced sounds, I’ve written a plug-in with the following features:

  • Only mono sounds are processed.
  • All noise below an arbitrary level will be removed.
  • The duration of the clicks can be entered in milliseconds.
  • How far the protection of adjacent sounds extends is given as a number of samples.
  • The debug window shows how many blocks of sound were found and how many were silenced.
  • You can listen to the sounds that were removed (Windows).

There is no error checking yet. The algorithm is based only on the duration and isolated position of the sounds; frequencies and RMS are ignored. Everything is calculated at the track's sample rate, so don’t process tracks that are too large (I did 50 min of the sound produced by the code above). Remember that all gaps are nullified, i.e. totally silenced. Therefore, it is only suited for speech transcriptions. Please give me your feedback to help improve this plug-in.
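
For readers who cannot open the attachment, here is a rough Python sketch of the general idea as described in the list above. It is not the code of silence-cleaner.ny. In particular, the rule that a short block survives when it lies within `protect` samples of a long block is only my reading of the "protection" setting, all names are hypothetical, and the input is assumed to be an amplitude envelope (raw audio would be split into tiny blocks at every zero crossing).

```python
def find_blocks(envelope, level):
    """Return [start, end) runs of consecutive values at or above `level`."""
    blocks = []
    start = None
    for i, x in enumerate(envelope):
        loud = abs(x) >= level
        if loud and start is None:
            start = i
        elif not loud and start is not None:
            blocks.append((start, i))
            start = None
    if start is not None:
        blocks.append((start, len(envelope)))
    return blocks


def clean(envelope, rate, level, click_ms, protect):
    """Silence everything below `level` plus short, isolated blocks.

    Returns (output, blocks_found, blocks_silenced).
    """
    max_click = int(rate * click_ms / 1000)  # click length in samples
    blocks = find_blocks(envelope, level)
    long_blocks = [b for b in blocks if b[1] - b[0] > max_click]
    out = [0.0] * len(envelope)
    silenced = 0
    for start, end in blocks:
        is_short = end - start <= max_click
        # Assumed rule: a short block close to a long one is protected.
        near_long = any(0 <= start - long_end <= protect
                        or 0 <= long_start - end <= protect
                        for long_start, long_end in long_blocks)
        if is_short and not near_long:
            silenced += 1
        else:
            out[start:end] = envelope[start:end]
    return out, len(blocks), silenced
```

The two counts returned correspond to the numbers reported in the debug window. With an isolated two-sample burst, another two-sample burst just before a long sound, and the long sound itself, the sketch finds three blocks and silences only the isolated one.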

I don’t think your simulated speech is as complex as true speech.
Samples of speech and mouse clicks are available from Freesound. e.g. …

Shakespeare … Freesound - Henry5.mp3 by acclivity

mouse clicks … Freesound - Mouse Clicks.wav by Swoboman87

Differentiating a mouse click from a “k”, “t” or “d” at the end of a word is going to be difficult.

Love it :smiley:

and differentiating between a mouse click and valid vocal audio within a word is going to be mostly impossible. As I suggested earlier, there’s a lot that is possible without trying to achieve the impossible, so it makes sense to work on the possible first.