Making an incomprehensible speech more comprehensible

When speakers swallow their words, talk too fast, and so on, what’s the best way to edit their speech so I can actually understand what they’re saying?

I usually try slowing it down (in Audacity it’s Effects=>Change Speed), but that doesn’t make the speech any more understandable.

Here are two example sound files. I think the first one starts with “And now I’m gonna” but then I can’t make out what is said. As for the second one, I’m sure it starts with “when I count to three and snap my fingers, you will” and I can’t understand the rest (but it ends with “…from now on”).

Thanks!

We’re clear, right, that this was shot wrong? I’m guessing the piece was shot with the microphones on top of the camera. You can’t get rid of the echoes, but you might be able to suppress the room noise…a little.

You can’t change the performer, either. She’s competing with the room and your only hope is to try and get rid of that. Pick a segment of performance where she’s not talking (and nobody else is, either) and apply Effects > Noise Removal… That tool takes a segment of “Room Tone” (background noise) and then tries to subtract it from the rest of the performance. It’s a crap shoot. The tool works better in Audacity 1.3 than it does in 1.2, so this is your opportunity to load both on your machine. Just don’t try to run both at once.
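To make the "subtract the Room Tone" idea concrete, here is a rough sketch of plain spectral subtraction. This is not Audacity's actual Noise Removal code, only the general technique as I understand it, and the function names (`noise_profile`, `subtract_noise`) and the fixed non-overlapping frames are my own assumptions for illustration. It uses a slow textbook DFT so it runs with nothing but the Python standard library.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform (slow, but needs no libraries)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def noise_profile(room_tone, frame_len):
    """Average magnitude per frequency bin over whole frames of room tone."""
    starts = range(0, len(room_tone) - frame_len + 1, frame_len)
    frames = [room_tone[i:i + frame_len] for i in starts]
    profile = [0.0] * frame_len
    for frame in frames:
        for k, value in enumerate(dft(frame)):
            profile[k] += abs(value) / len(frames)
    return profile


def subtract_noise(signal, profile, frame_len):
    """Subtract the noise magnitude from each frame, keeping the original phase.

    Samples past the last whole frame are dropped in this sketch.
    """
    cleaned = []
    for i in range(0, len(signal) - frame_len + 1, frame_len):
        spectrum = dft(signal[i:i + frame_len])
        reduced = [
            # Clamp at zero so a bin never ends up with negative magnitude.
            cmath.rect(max(abs(value) - noise, 0.0), cmath.phase(value))
            for value, noise in zip(spectrum, profile)
        ]
        cleaned.extend(idft(reduced))
    return cleaned
```

It works nicely on a steady hum and badly on anything that changes over time, which is why the real thing is a crap shoot on a noisy room.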

If you determine that room echoes (a live or bright room) are your trouble, plan for a reshoot. There is no way to get rid of that.

The performer should have been wearing a lavalier microphone to make her stand out from everything else. But you work with what you got.


I only use 1.3 anyway, but I’m not talking about sounds I tape myself. The 2 attached examples were extracted from a movie file (that has no subtitles) because I’m trying to understand what is being said. Have you tried the examples?

Using Equalisation you can filter out some of the banging in the background.
The second clip actually has some words missing, so it is impossible to ever hear them:
“When I count to three I’m going to snap my fingers and you’ll powerful with women from now on”
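For anyone curious what the Equalisation suggestion above boils down to: assuming the banging sits mostly in the low frequencies (I have not measured that), cutting the bass is roughly a high-pass filter. Below is a minimal sketch of a first-order one in plain Python. The function name `high_pass` and the 200 Hz cutoff in the usage note are only illustrative guesses, not settings taken from Audacity.

```python
import math


def high_pass(samples, sample_rate, cutoff_hz):
    """First-order (RC-style) high-pass filter over a list of float samples."""
    if not samples:
        return []
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        # Pass changes in the input through, let slow drifts decay away.
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out
```

Something like `high_pass(samples, 8000, 200)` would turn a 20 Hz thump down a lot while leaving most of the voice range alone. A single first-order stage is gentle, so real equalisation curves are usually steeper than this.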

I think the first one sounds like: “and now a matter of excellence d rights fra minute - no” but that doesn’t seem to make much sense. Unfortunately Audacity does not have an “anti-mumble” button.

<<<an “anti-mumble” button.>>>

Nor does anyone else. The rule here is by the time you realize you need the special effects tools, it’s too late. I don’t think the big kids could rescue this; it’s completely out of Audacity’s league.

And yes, I did listen to them. I can think of one not-free tool that might help slightly. SoundSoap has a tool where they try to filter natural voice frequencies while at the same time doing the sample/noise reduction trick, with a couple of other tools thrown in like hum reduction. See: rule, above.

Certainly the only way to rescue the second cut is to get rid of the echoes in the room, which you can’t do. Echoes are the performer’s own voice arriving at the microphone more than once, bouncing off the walls. So you are, in effect, asking the software to extract the performer from herself.
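A toy model of why that is so hard, assuming a single reflection (a real room has thousands of them): the recording is the dry voice plus a delayed, quieter copy of that same voice. The function name `add_echo` and its parameters are made up for this sketch.

```python
def add_echo(dry, delay_samples, gain):
    """Mix a delayed, attenuated copy of the signal back into itself."""
    wet = list(dry) + [0.0] * delay_samples
    for i, sample in enumerate(dry):
        # The "noise" being added here is just the voice again, later and quieter.
        wet[i + delay_samples] += gain * sample
    return wet
```

The microphone only ever records the summed result. There is no separate stretch of "echo only" to sample the way you can sample room tone, so there is nothing for a noise-removal tool to learn and subtract.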


Attached is a better quality version of the first sound. Can you understand what it says now?

Thanks!

“and now man excellence do right superman no” ?

“and now, Madam, Exxon’s do right supplementum”

Can anybody pile in?

“Any how, Main…”

I used to know somebody over at the FBI from when I lived in Washington, DC. Maybe we can give them a call.