Dynamic microphone noise suppression

This idea may well have been floated somewhere else already. If not - remember you saw it here first!

I have been testing some microphones recently for use with online comms packages such as Zoom. My tests included recording into Audacity, then playing back. There is some noise clearly present in some of the recordings. This can quite easily be removed or significantly reduced using the noise removal tools in Audacity.

It occurs to me that modern computers are so fast compared with earlier models that it might be possible to do some noise removal for specific microphones dynamically, by using essentially the same filtering algorithm.

However, for some applications this might impose an unwanted delay, in which case it would be better to filter the noise later rather than during the initial data capture.

Essentially a short test could be made using the desired microphone, and the noise characteristic then calculated. Then the live streaming or recording could be done, either with the appropriate noise filter based on that characteristic inserted before or after the link, or the noise characteristic could be applied later to the recorded version. There might be a choice of possible filters depending on the desired degree of filtering - determined by a standard set up test.
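To make the idea a bit more concrete, here is a rough sketch of "calibrate once, then filter" using plain spectral subtraction. This is only an illustration under my own assumptions, not how Audacity or any particular product does it: the function names, the frame size of 512 samples and the `strength` parameter are all made up for the example, and it needs NumPy.

```python
import numpy as np

FRAME = 512  # samples per analysis frame (assumed value, not from any product)


def _window(frame):
    # Periodic Hann window: copies shifted by half a frame sum to exactly 1.
    return 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame) / frame)


def noise_profile(noise, frame=FRAME):
    """Average magnitude spectrum of a noise-only calibration clip."""
    n_frames = len(noise) // frame
    frames = np.asarray(noise[: n_frames * frame]).reshape(n_frames, frame)
    spectra = np.fft.rfft(frames * _window(frame), axis=1)
    return np.abs(spectra).mean(axis=0)


def denoise(signal, profile, strength=1.0, frame=FRAME):
    """Subtract `strength` times the noise profile from every frame."""
    signal = np.asarray(signal, dtype=float)
    hop = frame // 2
    window = _window(frame)
    out = np.zeros(len(signal))
    for start in range(0, len(signal) - frame + 1, hop):
        spec = np.fft.rfft(signal[start:start + frame] * window)
        # Reduce each frequency bin by the calibrated noise level, never below zero.
        mag = np.maximum(np.abs(spec) - strength * profile, 0.0)
        cleaned = mag * np.exp(1j * np.angle(spec))
        # Overlap-add the filtered frame back into the output.
        out[start:start + frame] += np.fft.irfft(cleaned, n=frame)
    return out  # the first and last half frame are faded by the window
```

The calibration step would be something like `profile = noise_profile(silent_clip)` on a second or so of "room tone", after which `denoise(recording, profile, strength=1.5)` can be run either live, frame by frame, or later on the saved recording. Raising `strength` is one way to get the "choice of possible filters" mentioned above, at the cost of more audible artefacts.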

It might not work - but it could be worth investigating. Has this been done before?

Yes: real-time noise-reduction is available
https://www.youtube.com/results?search_query=real-time+noise+reduction
It’s constantly acquiring & applying a noise-profile.

That’s why fully automated communications and chat systems hate music. Any sound that stays around for longer than a set time is considered “noise” and gets removed. Also see forum posts from people complaining that: “Audacity is changing the volume of my music and I can’t stop it.”
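A toy model shows why sustained sounds get eaten. This is not any real product's algorithm, just a running noise estimate that slowly follows the input; the function name and the `adapt` rate are invented for the illustration.

```python
def adaptive_suppress(levels, adapt=0.2):
    """Return the output level for each per-frame input level.

    `adapt` (hypothetical) sets how quickly the running noise
    estimate follows whatever is coming in.
    """
    floor = levels[0]  # start by assuming the first frame is pure noise
    out = []
    for level in levels:
        # Subtract the current noise estimate from this frame.
        out.append(max(level - floor, 0.0))
        # Then let the estimate drift toward the input level.
        floor += adapt * (level - floor)
    return out


# Five quiet frames, then a note held for thirty frames.
held_note = adaptive_suppress([0.1] * 5 + [1.0] * 30)
```

The held note comes through almost untouched on its first frame, but by the last frame the estimate has caught up with it and next to nothing is left. Speech keeps changing level, so it escapes; a long organ chord does not.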

Audacity doesn’t apply effects, filters or corrections on recording, but it gets the sound from Windows and Windows services can do all kinds of tricks like that.

iPhones have (or had) two different recorders. Voice Memos (built-in) and Music Memos (download). Music Memos doesn’t apply tricks and effects.


The “real problem” with Zoom is that most people don’t have a soundproof studio, nor are they professional performers or recording engineers. :wink:

Noise reduction works best when you have a constant very-low level background noise… When you don’t really need it. If the noise is bad “the cure can be worse than the disease”. I’ve got some older movies on DVD that were apparently noise-gated when they were digitized. The background noise isn’t that bad and I’m not really noticing it… Then during “silence” the noisegate suddenly kicks-in and everything goes dead-silent for a moment and it’s distracting. And that’s a rather mild side effect… Often the “cure” is much worse…
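For anyone wondering what a noise gate actually does, a bare-bones version looks something like the sketch below. It is an assumption-laden simplification (real gates have attack, hold and release times to soften the switching), and the function name and tiny frame size are only there for the illustration.

```python
def noise_gate(samples, threshold, frame=4):
    """Mute every frame whose peak level is below `threshold`."""
    out = []
    for i in range(0, len(samples), frame):
        chunk = samples[i:i + frame]
        peak = max(abs(s) for s in chunk)
        if peak >= threshold:
            out.extend(chunk)                # loud enough: pass unchanged
        else:
            out.extend([0.0] * len(chunk))   # too quiet: dead silence
    return out


speech_pause_speech = [0.5, -0.4, 0.6, -0.5,
                       0.02, -0.01, 0.02, -0.02,
                       0.5, -0.5, 0.4, -0.6]
gated = noise_gate(speech_pause_speech, threshold=0.1)
```

The loud frames keep their background noise underneath the speech, while the pause in the middle drops to exact zeros. That sudden jump from "noise plus signal" to "nothing at all" is what makes the gating noticeable.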

There is some noise clearly present in some of the recordings.

Most microphones don’t generate noise. They pick-up noise from the room. A more sensitive mic will pick-up more noise but it also picks-up more signal so it’s no different from turning-up the volume.

A directional mic can help because the noise comes from all directions whereas the signal/voice comes from one direction.

The preamp (in a soundcard or interface) also generates (electrical) noise. A more sensitive mic (or talking louder or getting closer to the mic) can help to overcome the electrical noise (a better signal-to-noise ratio).
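To put a number on that, signal-to-noise ratio is usually quoted in decibels. The levels below are invented purely to show the arithmetic; the only fixed fact is that doubling the signal voltage against the same noise adds about 6 dB.

```python
import math


def snr_db(signal_rms, noise_rms):
    """Signal-to-noise ratio in dB from two RMS amplitudes."""
    return 20 * math.log10(signal_rms / noise_rms)


# Hypothetical levels: the preamp noise stays put, the voice gets louder.
far = snr_db(0.1, 0.01)    # talking from further away
near = snr_db(0.2, 0.01)   # twice the signal at the mic
gain = near - far          # improvement from getting closer
```

The preamp noise has not changed at all in this example. The improvement comes entirely from putting more signal into the mic.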

Probably the best kind of microphone for people who don’t have a “studio” is a good professional headset mic (the kind Garth Brooks and Lady Gaga use for live performances). It picks-up a strong signal because it’s close to the mouth and the distance to the mic is constant. A gaming headset can be good for minimizing room noise, but it can pick-up “breath noises” and it can just be “poor quality”.

I have two headset microphones. One, I think Logitech, seemed to work OK, but its sound quality was pure gamer. Just enough to let the other players know I was still breathing. It has a place of honor in a box in the garage.

The other one I used for my podcast test and is a terrific microphone and it would be even terrificker if I could find it. I searched all the subjunctive locations. The places where I would have put it had I put it there. But yes, it let me host the podcast in the middle of my noisy living room with little or no trash in the background.



The second of those videos is really amazing - great.

The first isn’t too bad either.
Are those two different software systems costly? I think I saw something about a subscription model in one of the comments - and I’m not doing that either. If there’s low-cost or open source software out there which will behave similarly I might be interested.

I was rather thinking of a system which would do a short one-off noise calibration before starting a session, rather than one which tries to guess which sounds are wanted and which aren’t. If those systems are always redoing the noise profile, how can they possibly know what is wanted and what is not?
Despite that though - in some application areas software like that would be really useful.

There’s “krisp” which is a software only service. It has a free version which is limited to 2 hours per week. Beyond that it’s a monthly subscription (currently starting at $60 per year).

The Nvidia RTX version requires an Nvidia RTX video card, so there’s a one off cost of the video card (they’re not cheap).

Both are amazing and extremely complex products.

They are optimized for voice. They’ve used very clever AI modelling to determine what is a voice and what is not, then remove “not a voice” sounds.
There’s some information about how they work on the krisp website (commercial site: https://krisp.ai/blog/nvidia-rtx-voice-krisp/)

To some extent this is already done in apps such as Skype / Zoom, and some sound cards have similar noise reduction / echo cancellation features.

The significant downside to “dynamic” (as you record) noise reduction is that if it does something bad, there’s no going back - you have to re-record. It’s a lot quicker and easier to undo an effect and re-apply it than to re-record.