Fixing "Robot" Voice

Hi all - I’m new here, but I am trying to fix a sort of “robot effect” on a series of lectures I need to watch for class. (This will amount to several dozen hours of audio with this problem, so any suggestions are welcome.)

Does Audacity have some effect that can counter this distortion (the clip is attached)? (I have previewed several things that sound related to me, but I couldn’t get clip fix, normalize, or much else to do any good.) Thank you!

I couldn’t get clip fix, normalize, or much else to do any good

I couldn’t either. Whoever recorded those created massive distortion. It’s rough to recover from that.

If somebody had a gun and forced me to “clean it up,” I’d print out the words and get someone with a nice voice to read it into a good quality microphone.


If your tutor expects you to listen to many hours of such badly recorded material, your entire class should put in a joint complaint to the school / college. On the other hand, I can’t shake off the feeling that you are not giving us the full story.

Clip Fix is what you want; you just have to give it a healthy dose. Set the amplitude reduction as low as it will go, -30 dB, to leave plenty of headroom for the restored peaks, and bring the threshold way down, to around 45%. Then you can throw on a filter to bring the frequencies above 1000 Hz back up a little.
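To get a feel for what those two numbers mean, here is a rough Python sketch. It is not Audacity's Clip Fix code, just an illustration. It assumes the threshold is measured relative to the loudest sample in the selection (my reading of the effect, not something confirmed in this thread), and the names `db_to_gain`, `clipped_runs`, and the `toy` sample list are made up for the example.

```python
def db_to_gain(db):
    """Convert a decibel change to a linear amplitude multiplier."""
    return 10 ** (db / 20)


def clipped_runs(samples, threshold=0.45):
    """Return (start, end) index pairs, end exclusive, for runs of samples
    whose magnitude is at or above threshold * peak.

    Assumption: the threshold is a fraction of the loudest sample in the
    selection. Real Clip Fix then rebuilds the peaks over such runs; this
    sketch only finds them.
    """
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return []
    limit = threshold * peak
    runs, start = [], None
    for i, s in enumerate(samples):
        if abs(s) >= limit:
            if start is None:
                start = i          # a flagged run begins here
        elif start is not None:
            runs.append((start, i))  # the run ended on the previous sample
            start = None
    if start is not None:
        runs.append((start, len(samples)))
    return runs


# -30 dB of amplitude reduction as a linear factor (about 0.032)
gain = db_to_gain(-30)

# Hypothetical toy waveform with two flat-topped regions
toy = [0.0, 0.2, 0.5, 1.0, 1.0, 0.5, 0.1, -0.3, -1.0, -1.0, -0.2]
flagged = clipped_runs(toy, threshold=0.45)
print(gain, flagged)
```

With the threshold down at 45%, a lot more of the waveform counts as "clipped" than at the default, which is the point when the distortion is this heavy. The -30 dB reduction scales everything to roughly 3% of its original level, so the rebuilt peaks have room; you would bring the level back up afterwards.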

Yeah, that’s fair. The professor is working through re-recording the lectures and animating them, but it takes a while, so I’m stuck with this for two hours a week for at least 8 weeks.

Hi Steve - you’re absolutely right, and, as I mentioned below, the prof is working on re-recording and animating the lectures for the final third of the class, but I am stuck with these videos for the next several weeks. This class was just recently moved to online, so they are using a bunch of old materials until he can finish getting new stuff recorded.

I appreciate the steps, Not R - I’ve given them a go and tried to drop out all the unnecessary frequencies. The spectrogram looks cleaner, but it’s still iffy quality. Regardless, I’m stuck with what I’ve got, so I’ll stick with your advice. Thanks!

The professor is working through re-recording the lectures and animating them

5X Editing. Editing and preparing a show takes five times the length of the show—minimum. Usually much longer if you’re new at it. That’s remarkably reliable as a labor projection. This is why many people want to record into the final presentation and will go to very serious effort to get that. The first time you press the “edit” button, you’re toast.

Play the work all the way through and make sure you know where all the mistakes are.

Play the final work to make sure it’s perfect.

That’s double and we haven’t done any work yet—the steps in the middle.


I don’t think the frequencies over 3 kHz are “extra”! I’d boost them a little, not kill them. They’ll help you differentiate the vowels.

I notice four discrete pitches ringing up around high C, D, E-flat, and A (roughly 1050, 1170, 1250, and 1750 Hz). Are they in the source, or were those resonances introduced by your filtering? I don’t notice them in your first sample.
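If you want to check pitch names against frequencies read off the spectrum yourself, something like this Python sketch works. It assumes standard tuning with A4 = 440 Hz and twelve-tone equal temperament; the function name `nearest_note` is just for illustration.

```python
import math

# Pitch classes starting from C, spelled with flats
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]


def nearest_note(freq_hz, a4_hz=440.0):
    """Return (note name, cents offset) for the equal-tempered note
    closest to freq_hz, assuming A4 = a4_hz."""
    # Distance from A4 in semitones (may be fractional)
    semitones_from_a4 = 12 * math.log2(freq_hz / a4_hz)
    exact_midi = 69 + semitones_from_a4   # MIDI note 69 is A4
    midi = round(exact_midi)
    cents_off = 100 * (exact_midi - midi)  # positive means sharp
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, cents_off


notes = [nearest_note(f)[0] for f in (1050, 1170, 1250, 1750)]
print(notes)
```

The four frequencies come out as C, D, E-flat, and A in the sixth octave, each within a small fraction of a semitone of the tempered pitch.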

I see what you mean about the higher frequencies - I’ll keep messing with it. Not sure what I’ve messed up / changed in the editing. Thanks for all your help!