Filtering a soft conversation from a louder one

I have a video from YouTube with audio that is identical on dual mono and Ogg Vorbis. The audio has two dubbed languages, A and B. I want to hear Language A, but it’s soft and drowned out by the much louder Language B. I have no access to any other media that would help me subtract Language B.

The ideal solution would magnify A while making B very faint. I don’t think Vocal Isolation would help since both A and B are in the same signal, and I basically want to isolate a part of the signal.

I looked at Compressor and moved the threshold very low (~ -60 dB) and made the ratio really high, and I could hear A better, but B was also enhanced. I also tried compressing based on peaks.

I briefly looked at Normalize, but I think I’m out of my depth. Any ideas? I could provide a sample if it helps.

I think I’m out of my depth.

No. You’re not. There are no good tools to split two different mixed conversations into two different sound files.

The trouble is there’s no “Key.”

Audacity (like most software) has no idea what a language is. All it knows is a bunch of musical tones jammed into a sound file at the same time.

Vocal Removal works by “knowing” that the vocal is usually in the center of a stereo field. That’s its Key. Delete everything in the middle.
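As a rough illustration of that Key (a minimal sketch of the general center-cancelling idea, not Audacity's actual code; the function name and sample values are made up), subtracting one channel from the other removes whatever is identical in both. On a dual mono file everything is identical, so nothing survives:

```python
def remove_center(left, right):
    """Keep only what differs between the channels (the 'side' signal)."""
    return [(l - r) / 2 for l, r in zip(left, right)]

# Hypothetical stereo show: a vocal in the center, a guitar panned hard left.
vocal = [0.5, -0.25, 0.75, 0.0]
guitar = [0.25, 0.25, -0.5, 0.5]
left = [v + g for v, g in zip(vocal, guitar)]
right = list(vocal)

# The vocal cancels; a half-level copy of the guitar is left.
print(remove_center(left, right))

# Dual mono: both channels carry the same mix, so the whole show is "center".
mono = [v + g for v, g in zip(vocal, guitar)]
print(remove_center(mono, mono))  # every sample cancels to zero
```

That is why the tool has nothing to grab onto here: both languages sit in the "middle" together.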

A really simple version of a Key is one language on the left of a stereo show and the other on the right. Simply splitting the show into left and right will do the job.
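Here is a small sketch of that easy case, assuming the samples arrive interleaved as left, right, left, right (the usual layout in a stereo WAV). The names and numbers are only for illustration:

```python
def split_stereo(interleaved):
    """Split [L0, R0, L1, R1, ...] into separate left and right lists."""
    return interleaved[0::2], interleaved[1::2]

# Hypothetical show: Language A only on the left, Language B only on the right.
language_a = [1, 2, 3]
language_b = [10, 20, 30]
interleaved = [s for pair in zip(language_a, language_b) for s in pair]

left, right = split_stereo(interleaved)
print(left)   # Language A by itself
print(right)  # Language B by itself
```

In Audacity itself, splitting the stereo track into two mono tracks gets you the same result.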

Two people speaking at the same time in the same file, even if one is at a slightly lower volume than the other, doesn’t count. Picture the confusion if you didn’t speak either of the languages. Neither you nor the computer would have any idea what’s going on.
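This is also why the Compressor attempt brought both voices up together. A compressor is, roughly, a volume knob that moves over time, and it only ever sees the sum of the two voices. A minimal sketch, assuming hand-picked gain values (a real compressor would derive them from the signal level):

```python
def apply_gain(signal, gains):
    """Apply a time-varying gain, one gain value per sample."""
    return [g * s for g, s in zip(gains, signal)]

# Hypothetical samples: a quiet Language A and a loud Language B.
quiet_a = [0.01, -0.02, 0.015, -0.01]
loud_b = [0.5, -0.4, 0.6, -0.3]
mix = [a + b for a, b in zip(quiet_a, loud_b)]

# Assumed gains, standing in for whatever the compressor decides.
gains = [2.0, 4.0, 2.0, 8.0]

boosted_mix = apply_gain(mix, gains)
boosted_a = apply_gain(quiet_a, gains)
boosted_b = apply_gain(loud_b, gains)

# Processing the mix is the same as processing each voice with the same gain,
# so at every instant A is still just as far below B as before.
for m, a, b in zip(boosted_mix, boosted_a, boosted_b):
    print(m, a + b, a / b)
```

Whatever gain the tool picks, it lands on A and B equally, so B never gets pushed down relative to A.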

I think this is the impossible job, at least with Audacity.


Can I predict the past?

You did an interview on your phone and can’t hear the far side.

That’s actually worse than you think. The far voice is probably Echo Cancellation Trash and not actual voice. So even if you did rescue the voice, it won’t be useful.

Did I hit it? There are ways out of doing that again.


Nope. Sorry. It’s a download. That can create its own problems. YouTube productions are highly compressed and don’t lend themselves to post production. There was a forum posting from someone who produced a radio show using his live voice but downloaded music samples. He submitted it to a radio station and they had trouble using it because his voice was OK, but the music turned to bubbly trash from multiple compression passes.


Koz, I appreciate all the thought you put into this. The infeasibility makes sense.