Removing a voice

tzkennedy1 · January 2, 2022, 8:43pm

I have a number of audio files of lectures from a Chinese speaker, with an English translator. I’d like to speed up the listening process by removing the Chinese speaker. Any tips on how to do this quickly without manually going through and editing out the Chinese? If it helps, the Chinese speaker is male and the English speaker female, so his pitch is lower. Any help much appreciated.

kozikowski · January 3, 2022, 1:02am

Is it stereo, two blue waves, or mono, one blue wave?

Screen Shot 2022-01-02 at 16.52.37.png
If it’s mono, drag-select 18 to 20 seconds of representative audio. File > Export > Export Selected Audio > WAV (Microsoft).

Export Selected Audio is important.

Post it on the forum. Scroll down from a text window > Attachments > Add Files.

Before you get all excited, the answer to this question is almost always no. Audacity can’t split a mixed show into individual sounds, voices, or music.

There are some special cases and this test will tell us if you qualify. So don’t call all your friends and tell them you got this licked.

Koz

tzkennedy1 · January 3, 2022, 10:10am

@kozikowski thanks for the offer, here’s a sample. To be clear, although it would be great to find an automatic way to do this, I’m not expecting that. What I hoped was that there might be some way to visualise the pitch of the voices on the timeline for example, so I could visually chop out the male voice. Thanks again for anything you can suggest.

kozikowski · January 3, 2022, 1:30pm

We may be able to help there a little.

This is Spectrogram View from the pull-down menu on the left of the track.

Screen Shot 2022-01-03 at 05.19.15.png
It gives a rainbow representation of the pitch and volume of the tones in the performance. Note between 6 sec and 12 sec, there is more energy in the upper registers. That’s when she speaks.
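For anyone who wants numbers rather than colours, the same idea can be roughed out in Python. This is only a sketch under assumptions: the 1000 Hz split point, the half-second frame length and the function name `high_band_share` are made up for illustration, and the demo signal is two synthetic tones, not the posted sample. It measures, frame by frame, what share of the energy sits above the split frequency.

```python
import numpy as np

def high_band_share(samples, rate, split_hz=1000.0, frame_s=0.5):
    """Per frame, return the fraction of spectral energy at or above split_hz.

    split_hz and frame_s are illustrative guesses, not tuned values.
    """
    frame = int(rate * frame_s)                   # samples per frame
    n_frames = len(samples) // frame              # drop any partial last frame
    freqs = np.fft.rfftfreq(frame, d=1.0 / rate)  # frequency of each FFT bin
    window = np.hanning(frame)                    # reduces spectral leakage
    times, share = [], []
    for i in range(n_frames):
        chunk = samples[i * frame:(i + 1) * frame] * window
        power = np.abs(np.fft.rfft(chunk)) ** 2
        total = power.sum()
        high = power[freqs >= split_hz].sum()
        share.append(high / total if total > 0 else 0.0)
        times.append(i * frame / rate)            # frame start time in seconds
    return np.array(times), np.array(share)

# Synthetic stand-in for a recording: a 200 Hz tone throughout,
# with a 2000 Hz tone added from 2 seconds onward.
rate = 8000
t = np.arange(4 * rate) / rate
demo = np.sin(2 * np.pi * 200 * t) + np.sin(2 * np.pi * 2000 * t) * (t >= 2.0)

times, share = high_band_share(demo, rate)
for start, value in zip(times, share):
    print(f"{start:4.1f} s  high-band share {value:.2f}")
```

On real speech the two voices overlap far more than two clean tones do, so expect a fuzzy trend rather than a sharp switch.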

Screen Shot 2022-01-03 at 05.18.49.png
It’s pretty common to think there is a large difference between voices and it’s a snap to split them apart. There isn’t and it’s not. The only reason you can understand multiple voices is you’re human (we assume).

There are other tools that do this trick, but they don’t do it on the timeline, so they don’t help you a bit.

There’s a way to play the performance at high speed. You don’t have to listen in real time. I don’t remember how to do that, but I can look.

Koz

kozikowski · January 3, 2022, 1:36pm

This is your sample at double speed (200%). You can’t always understand what they’re saying, but it’s still pretty clear when she speaks. It’s the “when” that you need, right?

Select a portion of the performance or the whole thing with the Select button on the left-bottom.

Effect > Change Tempo > 200%.

Change that number as needed. Experiment with the other settings.
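The arithmetic behind the speed-up is simple enough to sketch. This assumes the percentage means playback speed relative to the original, so 200 is double speed as described above; some versions of the Change Tempo dialog may ask for a percent change instead, so check what the dialog in front of you expects. The function names and the 90-minute lecture length are hypothetical.

```python
def length_after_speedup(original_seconds, speed_percent):
    """New duration when audio plays at speed_percent of the original speed.

    speed_percent = 200 means double speed, 100 means unchanged.
    """
    if speed_percent <= 0:
        raise ValueError("speed_percent must be positive")
    return original_seconds * 100.0 / speed_percent

def as_percent_change(speed_percent):
    """Same speed expressed as a change relative to the original (200 -> +100)."""
    return speed_percent - 100.0

# Hypothetical 90-minute lecture played at double speed.
lecture_seconds = 90 * 60
new_seconds = length_after_speedup(lecture_seconds, 200)
print(f"{new_seconds / 60:.0f} minutes at 200% speed")
print(f"percent change: {as_percent_change(200):+.0f}")
```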

Koz

tzkennedy1 · January 3, 2022, 1:42pm

Many thanks, much appreciated. Looking at the spectrogram option, it seems as though that will be helpful. Changing the parameters to just show the higher registers might make it more obvious, so I’ll play with the settings. Thanks again.
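Once the stretches with more energy in the higher registers have been picked out, by eye or otherwise, they can be written down as time ranges. A minimal sketch, assuming a list of per-frame yes/no flags (hypothetical here, typed in by hand) and a half-second frame length: it merges consecutive flagged frames into ranges, drops very short ones, and formats each as a tab-separated start, end and label line. As far as I know that is the layout Audacity reads when importing labels, but treat that as an assumption and test it on a short file first.

```python
def frames_to_ranges(flags, frame_s, min_len_s=1.0):
    """Merge consecutive True frames into (start, end) ranges in seconds.

    Ranges shorter than min_len_s are discarded as probable blips.
    """
    ranges = []
    start = None
    # A trailing False closes any range still open at the end.
    for i, on in enumerate(list(flags) + [False]):
        if on and start is None:
            start = i
        elif not on and start is not None:
            begin, end = start * frame_s, i * frame_s
            if end - begin >= min_len_s:
                ranges.append((begin, end))
            start = None
    return ranges

def to_label_lines(ranges, text="English"):
    """Format ranges as tab-separated 'start<TAB>end<TAB>label' lines."""
    return [f"{begin:.3f}\t{end:.3f}\t{text}" for begin, end in ranges]

# Hypothetical flags, one per half-second frame: True where she seems to speak.
flags = [False, False, True, True, True, False, True, False, True, True]
ranges = frames_to_ranges(flags, frame_s=0.5)
labels = to_label_lines(ranges)
print("\n".join(labels))
```

The single flagged frame at 3.0 seconds is shorter than the one-second minimum, so it is left out. That minimum is a guess and would need adjusting against real recordings.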

kozikowski · January 3, 2022, 1:48pm

You picked a good one. His voice is slightly high and she’s almost a contralto, so they’re closer than you think. She has a nice announcing voice. If she had a higher pitch, it wouldn’t be as nice to listen to, but it would be a lot easier to separate.

Koz

tzkennedy1 · January 3, 2022, 2:07pm

I’ll ask her to squeak more next time