Removing a voice

I have a number of audio files of lectures from a Chinese speaker, with an English translator. I’d like to speed up the listening process by removing the Chinese speaker. Any tips on how to do this quickly without manually going through and editing out the Chinese? If it helps, the Chinese speaker is male and the English speaker female, so his pitch is lower. Any help much appreciated :slight_smile:

Is it stereo, two blue waves, or mono, one blue wave?

Screen Shot 2022-01-02 at 16.52.37.png
If it’s mono, drag-select up to 18 to 20 seconds of representative sample. File > Export > Export Selected Audio > WAV (Microsoft).

Export Selected Audio is important.

Post it on the forum. Scroll down from a text window > Attachments > Add Files.

Before you get all excited, the answer to this question is almost always no. Audacity can’t split a mixed show into individual sounds, voices, or music.

There are some special cases and this test will tell us if you qualify. So don’t call all your friends and tell them you got this licked.


@kozikowski thanks for the offer, here’s a sample. To be clear, although it would be great to find an automatic way to do this, I’m not expecting that. What I hoped was that there might be some way to visualise the pitch of the voices on the timeline for example, so I could visually chop out the male voice. Thanks again for anything you can suggest.

We may be able to help there a little.

This is Spectrogram View from the pull-down menu on the left of the track.

Screen Shot 2022-01-03 at 05.19.15.png
It gives a rainbow representation of the pitch and volume of the tones in the performance. Note between 6 sec and 12 sec, there is more energy in the upper registers. That’s when she speaks.
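If you want something rougher than eyeballing the spectrogram, the same "more energy in the upper registers" idea can be sketched outside Audacity. This is only an illustration, not anything Audacity does itself: it assumes a mono 16-bit WAV export, uses a crude brightness measure (energy of the sample-to-sample difference divided by the energy of the signal, which rises as energy moves to higher frequencies), and the function names, the half-second frame length, and the threshold are all made up for the example. Consonants like "s" are bright for either speaker, so expect false hits.

```python
import struct
import wave


def read_mono_16bit_wav(path):
    """Read a mono 16-bit PCM WAV file; returns (samples, sample_rate)."""
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected a mono 16-bit WAV")
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    return list(struct.unpack("<%dh" % (len(raw) // 2), raw)), rate


def frame_brightness(samples, sample_rate, frame_seconds=0.5):
    """Return a list of (start_time, brightness), one entry per full frame."""
    frame_len = max(2, int(sample_rate * frame_seconds))
    result = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame)
        diff_energy = sum((b - a) ** 2 for a, b in zip(frame, frame[1:]))
        # Silent frames get 0.0 so they are never flagged as bright.
        brightness = diff_energy / energy if energy > 0 else 0.0
        result.append((start / sample_rate, brightness))
    return result


def bright_regions(frames, threshold, frame_seconds=0.5):
    """Merge back-to-back frames above the threshold into (start, end) spans."""
    regions = []
    for start_time, brightness in frames:
        if brightness <= threshold:
            continue
        if regions and abs(regions[-1][1] - start_time) < 1e-9:
            regions[-1] = (regions[-1][0], start_time + frame_seconds)
        else:
            regions.append((start_time, start_time + frame_seconds))
    return regions


def to_audacity_labels(regions, text="bright"):
    """Format spans as tab-separated label lines: start, end, label text."""
    return "\n".join("%.3f\t%.3f\t%s" % (s, e, text) for s, e in regions)
```

To try it, run `read_mono_16bit_wav` on the exported sample, print the brightness values, and pick a threshold that sits between the two voices (there may not be a clean one if the voices are close in pitch). The text from `to_audacity_labels` can be saved to a file and brought in with File > Import > Labels, which puts the flagged spans on the timeline next to the track.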

Screen Shot 2022-01-03 at 05.18.49.png
It’s pretty common to think there is a large difference between voices and it’s a snap to split them apart. There isn’t and it’s not. The only reason you can understand multiple voices is you’re human (we assume).

There are other tools that do this trick, but they don’t do it on the timeline, so they don’t help you a bit.

There’s a way to play the performance at high speed. You don’t have to listen in real time. I don’t remember how to do that, but I can look.


This is your sample at double speed (200%). You can’t always understand what they’re saying, but it’s still pretty clear when she speaks. It’s the “when” that you need, right?

Select a portion of the performance or the whole thing with the Select button on the left-bottom.

Effect > Change Tempo > 200%.

Change that number as needed. Experiment with the other settings.


Many thanks, much appreciated. Looking at the spectrogram option, it seems as though that will be helpful. Changing the parameters to just show the higher registers might make it more obvious, so I’ll play with the settings. Thanks again :slight_smile:

You picked a good one. His voice is slightly high and she’s almost a contralto, so they’re closer than you think. She has a nice announcing voice. If she had a higher pitch, it wouldn’t be as nice to listen to, but it would be a lot easier to separate.


I’ll ask her to squeak more next time :wink: