YouTube's algorithm for background music removal?

Does anyone know what algorithm YouTube uses to automatically remove Content ID music from an audio clip?

Before removal
After removal

Both videos were recorded and processed using the same procedure, and then one video went through YouTube’s audio removal while the other didn’t. The source was recorded from AV cables, so there is obviously noise from that, plus audio compression artifacts, among other things.

As you can see, even though the foreground effects become somewhat muffled, YouTube was able to remove the background music completely. I think the result is amazingly good because I can still hear everything in the foreground perfectly. What kind of audio processing did YouTube use to achieve that result?

YouTube obviously has the source music track on hand, since they were able to match it through Content ID. How do they use that track to filter out my background music?

This is not an Audacity problem.

Moved to Audio Processing.

If you invert a copy of audio against the original, the original can cancel out, if identical and identically aligned.
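A minimal sketch of that cancellation, assuming NumPy and synthetic sine tones standing in for the music and the foreground sound (the function name `cancel_with_inverted_copy` is made up for illustration). It also shows why alignment matters: a one-sample offset already leaves a clearly measurable residue.

```python
import numpy as np

def cancel_with_inverted_copy(mix, reference, offset=0):
    """Add a polarity-inverted copy of `reference` to `mix`, starting at sample `offset`."""
    out = mix.astype(float).copy()
    end = min(len(out), offset + len(reference))
    out[offset:end] += -reference[: end - offset]
    return out

rate = 8000                                  # sample rate in Hz (arbitrary for this demo)
t = np.arange(rate) / rate                   # one second of samples
music = 0.5 * np.sin(2 * np.pi * 440 * t)    # stand-in for the known music track
voice = 0.3 * np.sin(2 * np.pi * 150 * t)    # stand-in for the foreground sound
mix = music + voice

aligned = cancel_with_inverted_copy(mix, music, offset=0)
shifted = cancel_with_inverted_copy(mix, music, offset=1)

# How much music is left in each result, compared with the clean foreground.
residual_aligned = np.max(np.abs(aligned - voice))
residual_shifted = np.max(np.abs(shifted - voice))
print(residual_aligned, residual_shifted)
```

With a real recording the copy in the mix has also passed through cables, level changes, and lossy compression, so it is no longer identical to the reference and plain inversion will not cancel it cleanly.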


The answer is quite simple.

It’s all digitally generated content. The “music” is stereo, the sounds are perfect mono, perfectly in the middle.

A polarity inversion on the stereo part will cancel it perfectly and leave very little trace on the mono sounds.
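For reference, here is a rough sketch of what the channel-inversion trick computes, assuming NumPy and synthetic tones (the names `mid_side`, `effect`, `music_l`, and `music_r` are illustrative only). Inverting one channel and summing gives the "side" signal, which cancels whatever is identical in both channels. The "mid" signal keeps the centered sound but still contains the average of the two music channels.

```python
import numpy as np

def mid_side(left, right):
    """Split a stereo pair into mid (common) and side (difference) signals."""
    mid = 0.5 * (left + right)
    side = 0.5 * (left - right)   # same as inverting `right` and mixing it with `left`
    return mid, side

rate = 8000
t = np.arange(rate) / rate
effect = 0.4 * np.sin(2 * np.pi * 300 * t)    # centered mono sound: identical in both channels
music_l = 0.5 * np.sin(2 * np.pi * 440 * t)   # stereo music: different content per channel
music_r = 0.5 * np.sin(2 * np.pi * 660 * t)

left = effect + music_l
right = effect + music_r
mid, side = mid_side(left, right)

# The side signal contains no trace of the centered effect...
side_is_music_only = np.allclose(side, 0.5 * (music_l - music_r))
# ...while the mid signal still carries music on top of the effect.
music_left_in_mid = np.max(np.abs(mid - effect))
print(side_is_music_only, music_left_in_mid)
```

So this kind of split can isolate or remove the centered content, but on its own it does not hand back the centered content free of music.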

And since the “music” is very repetitive, there’s only a small sample to treat and then it’s just repetition.

If you would try this with real music and a real singer, it would be very much harder…

The algo is the same as Audacity’s, but probably with some AI help from Google. The AI would tune it automatically; in Audacity you have to tune it manually.

Yeah I remember doing the split to mono inversion technique, but I don’t think that’s the whole story here.

Here’s another example, this time using music with singing in it, and again they were able to remove just the music while keeping the foreground audio.

The track that was removed was this one

I downloaded the music and did the split mono inversion technique in Audacity, but the music was still obviously audible, both the beats and the singing. Only a small part of the singing was removed.

Again, YouTube was somehow able to do it, and I don’t know how.

If you want to circumvent YouTube’s anti-piracy policy, then we are not able to help you. YouTube have various methods of combating piracy because using copyrighted music / video without the copyright holder’s permission is illegal in many of the countries in which YouTube operates.

If you are interested in the technical details of how YouTube operate, it would be more appropriate to ask them rather than us. At best we can only speculate about what techniques they may use.

How did you read my message and come to that conclusion? I hope I didn’t give off that vibe because that’s not what I am interested in.

What I want to learn is how they separate background music from foreground music with such good precision. It’s the holy grail I have always been searching for.

I even made another topic on this a while ago before I learned of Youtube’s ability to automate this process.

Just checking :wink: We do get people asking how to do such things, and often the question is “disguised” to sound “innocent”.

I am thinking about the opposite of circumventing the copyright policy. I’m thinking it would be awesome if YouTube would help someone comply - by filtering some copyrighted music from the background of a video where they didn’t notice it, but the copyright algo catches it. Like if I’m getting video of my friend on the street, but someone is playing a radio loudly in the background. It would be really great if you could select “remove copyrighted audio” - and have it filter for just that.