Making the voice easier to hear

Hi there,

I’m using Audacity 2.0.0.
I have a video in Sony Vegas (shot with a small camera) where a person’s voice is hard to hear because music is playing too loudly.
Somebody brought a radio to the court, and I cannot hear what the guy is saying while he’s playing volleyball.

I can separate the sound track in Vegas and open it in Audacity, that’s no problem.
But is there any way of reducing the music and boosting the human voice so that it is clearer?
I presume there’s a specific frequency range to preserve, but how do I do it?

Thanks for any help!

You can apply the Equalization effect and use the “Telephone” preset. That will reduce frequencies that are outside of the main voice frequencies and may help a little, but it is unlikely to help very much because much of the music will be in the same frequency band as the voice. That’s as good as it gets.
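To illustrate what that kind of filtering does, here is a rough Python sketch of a voice-band filter. It is not Audacity’s Equalization code and it does not reproduce the actual “Telephone” curve; the 300 Hz and 3400 Hz cutoffs are an assumption (the range usually quoted for telephone audio), and the helper names (`biquad`, `apply_biquad`, `voice_band`, `tone`, `rms`) are made up for this example. It chains a second-order high-pass and a second-order low-pass filter.

```python
import math

def biquad(kind, cutoff_hz, rate, q=0.7071):
    """Return normalized (b0, b1, b2, a1, a2) for a 2nd-order low/high-pass filter."""
    w0 = 2 * math.pi * cutoff_hz / rate
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2 * q)
    if kind == "highpass":
        b0, b1, b2 = (1 + cos_w0) / 2, -(1 + cos_w0), (1 + cos_w0) / 2
    elif kind == "lowpass":
        b0, b1, b2 = (1 - cos_w0) / 2, 1 - cos_w0, (1 - cos_w0) / 2
    else:
        raise ValueError("kind must be 'highpass' or 'lowpass'")
    a0 = 1 + alpha
    a1 = -2 * cos_w0
    a2 = 1 - alpha
    # Divide everything by a0 so the filter loop needs no extra division.
    return b0 / a0, b1 / a0, b2 / a0, a1 / a0, a2 / a0

def apply_biquad(samples, coeffs):
    """Run one biquad section over a list of float samples."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def voice_band(samples, rate, low_hz=300.0, high_hz=3400.0):
    """Keep roughly low_hz..high_hz (assumed cutoffs, not Audacity's preset)."""
    out = apply_biquad(samples, biquad("highpass", low_hz, rate))
    return apply_biquad(out, biquad("lowpass", high_hz, rate))

def tone(freq_hz, rate, seconds=0.5):
    """Generate a unit-amplitude sine wave for testing."""
    n = int(rate * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

rate = 44100
for freq in (60, 1000, 10000):
    filtered = voice_band(tone(freq, rate), rate)
    # Skip the first 0.1 s so the filter's start-up transient is ignored.
    print(freq, "Hz ->", round(rms(filtered[rate // 10:]), 3))
```

The 1000 Hz tone should come through almost untouched, while the 60 Hz and 10000 Hz tones come out much quieter. The same thing happens to music: anything it has inside the voice band passes straight through, which is why this only helps a little.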

Thanks, I’ll give it a try.

The best solution in this situation is usually subtitles. That’s assuming you can hear what’s being said or you know what was said.

If you can’t make out what’s being said, it’s unlikely that filtering will help. Even if filtering makes some improvement, it’s unlikely to give you an acceptable result.

Or, do what the pros do and bring the “talent” into the “studio” for a voice-over.

…The stuff you see on TV and in movies, where they pull a conversation out of crowd noise at a party, etc.? That’s science fiction.

Yep, I know it’s sci-fi.
But this was an incidental situation which cannot be repeated.

If I turn the volume up really high I can hear what he is saying, but the music is very irritating.
Should I tweak the settings (there are lots of them) of this “Telephone” preset or leave them at the defaults?

By the way, one more thing. Is it possible to see on a graph what frequencies his voice is using while he’s speaking?

“Analyze > Plot Spectrum” shows the frequency content (spectrum) of the selected audio.
Generally the “logarithmic” scale (“Axis” setting) is more useful when analyzing voice or music.
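If you are curious what a spectrum plot is computing, here is a minimal Python sketch of the idea. It is not Audacity’s code: it assumes a Hann window and a plain radix-2 FFT, the selection length must be a power of two, and the function names (`fft`, `spectrum_db`, `peak_frequency`) are made up for the example.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def spectrum_db(samples, rate):
    """Return (frequency_hz, level_db) pairs for the positive-frequency bins."""
    n = len(samples)
    if n < 4 or n & (n - 1):
        raise ValueError("number of samples must be a power of two (at least 4)")
    # Hann window to reduce leakage between neighbouring bins.
    windowed = [s * 0.5 * (1 - math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(samples)]
    bins = fft(windowed)
    # Skip bin 0 (DC); the tiny offset avoids log10(0) on silent bins.
    return [(k * rate / n, 20 * math.log10(abs(bins[k]) + 1e-12))
            for k in range(1, n // 2)]

def peak_frequency(samples, rate):
    """Frequency of the strongest bin in the spectrum."""
    return max(spectrum_db(samples, rate), key=lambda pair: pair[1])[0]

rate = 44100
# 220 Hz is an arbitrary test pitch, not a measurement of anyone's voice.
test_tone = [math.sin(2 * math.pi * 220 * i / rate) for i in range(4096)]
print(peak_frequency(test_tone, rate))
```

With 4096 samples at 44100 Hz each bin is about 10.8 Hz wide (rate / n), so the reported peak lands within one bin of 220 Hz. Bins are evenly spaced in Hz, so on a linear axis the low end where most of a voice sits gets squeezed into a small part of the graph, which is one reason the logarithmic axis is easier to read.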