Attempting to Remove Vocals

Ok, so this seems easy enough looking at all the YouTube tutorials (split tracks, invert one track, set both to mono, etc.), but every time I follow these simple instructions the song comes out sounding very distorted and muffled. I’m using USB headphones, and I tested that the left/right channels are working just fine.

Using Ubuntu 14.04 64 bit and Audacity 2.0.5. The instructions are so simple I don’t know why it doesn’t sound right. Help?

There is actually an easier way to do that.
All of those steps are rolled into one simple effect that is included in Audacity: Audacity Manual

That does not alter the fact that the “trick” of inverting and adding left/right channels will often not work!
This method for removing vocals will only fully work in cases where:

  • the vocal is free of stereo effects
  • the vocal is panned dead centre of the stereo mix
  • the other sounds (instruments) are panned off-centre
  • the “phase” relationship between left/right channels has not been damaged by file compression

In old recordings it is not uncommon for these conditions to be met. Unfortunately for people wanting to use this trick, the conditions are often not met in modern recordings. Heavy stereo effects on the vocal are common these days, and bass / kick drum are frequently panned centre. MP3 encoding can often cause phase shifts between the left and right channels. These will all make the vocal removal trick less effective. In particular, phase differences between left and right channels will cause the sound to become “muddy”.

The effectiveness of this technique is entirely dependent on the characteristics of the recording. Sometimes it will work very well, but often not.
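To make the point about phase concrete, here is a rough sketch in plain Python of what "invert one channel and mix to mono" does to the samples. This is not Audacity's code; the sine waves, the pan values (0.9 / 0.1), the 3-sample delay and the names remove_center and rms are all made up for illustration. The "vocal" is identical in both channels in the first mix, and slightly delayed in the right channel in the second mix, standing in for the kind of phase damage described above.

```python
import math

def remove_center(left, right):
    """Invert the right channel and add it to the left (mono = L - R)."""
    return [l - r for l, r in zip(left, right)]

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

RATE = 8000   # samples per second (arbitrary demo value)
N = 8000      # one second of audio
t = [i / RATE for i in range(N)]

# Hypothetical test signals: a 440 Hz "vocal" and a 660 Hz "guitar".
vocal = [math.sin(2 * math.pi * 440 * x) for x in t]
guitar = [math.sin(2 * math.pi * 660 * x) for x in t]

# Ideal mix: vocal dead centre (same in L and R), guitar panned mostly left.
left = [v + 0.9 * g for v, g in zip(vocal, guitar)]
right = [v + 0.1 * g for v, g in zip(vocal, guitar)]
ideal = remove_center(left, right)          # should be 0.8 * guitar

# Damaged mix: the vocal in the right channel is delayed by 3 samples,
# so left and right no longer carry exactly the same vocal.
delay = 3
vocal_shifted = [0.0] * delay + vocal[:-delay]
right_shifted = [v + 0.1 * g for v, g in zip(vocal_shifted, guitar)]
damaged = remove_center(left, right_shifted)

# Whatever is left after subtracting the expected guitar is leftover vocal.
residual_ideal = rms([o - 0.8 * g for o, g in zip(ideal, guitar)])
residual_damaged = rms([o - 0.8 * g for o, g in zip(damaged, guitar)])

print("leftover vocal, ideal mix:  ", residual_ideal)
print("leftover vocal, damaged mix:", residual_damaged)
```

In the ideal mix the leftover vocal is essentially zero; with the small delay a large part of the vocal survives the subtraction. Note also that the guitar only survives at all because it was panned off-centre: anything panned dead centre is cancelled along with the vocal.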

You don’t need to go through all that. Effect > Vocal Removal has all that baked into one tool.

The YouTube people all carefully chose a song that works perfectly even though they may not realize they did that. Most songs don’t work all that well.

For inversion and cancellation to work, the show has to be in good quality stereo and the performer has to be in the exact stereo left-right center. They have to have no stereo effects added to their voice (the problem in that clip), and it helps if the clip is not an MP3, AAC (Apple iTunes), or other compressed format. Most of the time the drums and bass (also in the center) go too, although there are ways to get around that.

YouTube has collected all the music that meets those rules. The bad joke is that Vocal Removal works perfectly if you use the exact same song they did.


Unfortunately if the singer’s voice is not centered in the stereo image, you are doomed.

Luckily there is some software based on A.I. called “UnMixIt”.
It does a very good job of removing vocals (you also get a track of the isolated vocals).

Windows only. $79 to buy.
Demo Version restrictions: limited audio duration, audio “watermark”, saving is disabled.

The only independent reviews that I found for this software are here:

As far as I can tell, this is a proprietary front end for the experimental (free) open source “Spleeter” scripts: GitHub - deezer/spleeter: Deezer source separation library including pretrained models.

My tests with Spleeter produced quite good results, but it was very slow, and a certain amount of tweaking was necessary to get the best out of it (tweaking that is probably not available in the commercial GUI version).

Another free and open source Python project for source separation: GitHub - wslihgt/separateLeadStereo: Separate the lead from the accompaniment, in polyphonic audio music excerpts, in Python/Numpy