Hello. I am trying to remove some audio from the background music of a show. The music has dialogue over it. I want to remove the dialogue without losing any of the music's own vocals.
I use Spleeter GUI to split the music into five stems (vocals, drums, piano, bass, other). Then I clear any dialogue out of the vocals stem. The problem is that when I erase the dialogue and play the music back, it sounds weird. Because of the way the original audio was made, the music intensity drops wherever I remove the dialogue. So in those regions I have to use the Envelope Tool on all the other tracks to bring the intensity back up.
Is there an automated way to do this precisely? Right now it takes a lot of trial and error to get a better sound.
Is there a way that, wherever I delete a part from the vocals track, all the other tracks are automatically boosted to maintain the overall music amplitude? Thank you!!!

Software is improving all the time and Spleeter may be the “latest & greatest” but…

“You can’t un-bake a cake or un-fry an egg, and you can’t un-mix audio.”

Of course the left & right are already isolated and a traditional “vocal remover” can completely remove everything in the center (everything that’s identical and in-phase in both channels.)

If audio could be un-mixed it would completely eliminate the need for multitrack (or stereo) recording. And, you wouldn’t need a soundproof studio if you could perfectly isolate & remove noise.

Have you seen this thread? https://forum.audacityteam.org/t/envelope-follower-or-ducker/51607/1
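
For what it's worth, the automatic boost asked about above can be roughed out in a few lines of Python with NumPy. This is only a sketch under assumptions, not an Audacity feature: it assumes you have the original mix and the edited mix as mono float arrays of the same length and sample rate, and the names frame_rms, compensation_gain, and max_gain are made up for illustration. It compares the short-time RMS level of the two mixes and derives a gain curve that you could apply to the other stems.

```python
import numpy as np

def frame_rms(x, frame):
    """RMS level of each non-overlapping block of `frame` samples."""
    n = len(x) // frame
    blocks = x[:n * frame].reshape(n, frame)
    return np.sqrt(np.mean(blocks ** 2, axis=1))

def compensation_gain(original_mix, edited_mix, frame=1000, max_gain=4.0, eps=1e-9):
    """Per-sample gain that lifts the edited mix toward the original level.

    Hypothetical helper: assumes both inputs are mono float arrays of equal
    length, containing at least one full frame.
    """
    rms_orig = frame_rms(original_mix, frame)
    rms_edit = frame_rms(edited_mix, frame)
    # Only boost (never cut), and cap the boost so silence is not blown up.
    gain = np.clip(rms_orig / (rms_edit + eps), 1.0, max_gain)
    # Interpolate from one value per frame to one value per sample,
    # so the gain changes gradually instead of in steps.
    centers = (np.arange(len(gain)) + 0.5) * frame
    return np.interp(np.arange(len(original_mix)), centers, gain)

# Synthetic demo: a backing tone, plus a "vocal" tone only in the second half.
sr = 8000
t = np.arange(sr) / sr
backing = 0.2 * np.sin(2 * np.pi * 200 * t)
vocals = np.where(t >= 0.5, 0.2 * np.sin(2 * np.pi * 300 * t), 0.0)

original_mix = backing + vocals
edited_mix = backing              # vocals stem erased everywhere

gain = compensation_gain(original_mix, edited_mix)
boosted_backing = backing * gain  # louder only where something was removed
```

Note that this matches the level of everything that was removed, dialogue included, so it can over-boost. Treat the curve as a starting point for the Envelope Tool rather than a finished fix.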

The party line is Audacity can’t split apart a mixed performance into individual voices, instruments, or sounds.

Vocal Removal usually works by assuming a single voice or announcement sits in the middle of a stereo show. It’s dead simple to split the stereo show into two, invert one side and add them together. Poof. No more vocal. That’s what the thousands of YouTube videos are doing. Note they never do it to more than one song or performance and they don’t take requests.
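
The invert-and-add trick is short enough to show in Python with NumPy. This is a minimal sketch on synthetic tones, not a real song; remove_center and the three test signals are illustrative names, and the audio is assumed to be a float array of shape (samples, 2).

```python
import numpy as np

def remove_center(stereo):
    """Invert the right channel and add it to the left (i.e. left - right).

    Anything identical and in-phase in both channels cancels.
    The result is mono.
    """
    left = stereo[:, 0]
    right = stereo[:, 1]
    return left + (-right)

# Synthetic "show": a voice in the middle, one instrument on each side.
sr = 8000
t = np.arange(sr) / sr
voice = 0.5 * np.sin(2 * np.pi * 220 * t)   # same in both channels
guitar = 0.3 * np.sin(2 * np.pi * 330 * t)  # left only
keys = 0.3 * np.sin(2 * np.pi * 440 * t)    # right only

stereo = np.stack([voice + guitar, voice + keys], axis=1)
out = remove_center(stereo)  # voice cancels; guitar and inverted keys remain
```

In this demo the voice cancels only because it is exactly the same in both channels. Anything else sitting in the center is removed along with it, and the output is mono.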

There’s a stunning list of things that can go wrong to screw this up: the vocal isn’t in the middle, there’s stereo echo on the voice, etc. Stereo echo on the vocal is a favorite in song production.

If you don’t do it that way, then the software package has to “know” what a voice is and how to recognize it. That’s easy for you, hard for software. That process comes with its own problems and many times a price tag. It doesn’t work if someone added effects, filters, or processing to the voice. It’s not just a plain voice any more and the algorithms stop working. I think somewhere in this thread someone mentioned the voice “poking up” out of the clean mix. That’s what causes that problem. A series of vocal effects didn’t work out quite right.

And none of this works right with an MP3. MP3 gets its tiny convenient files by causing nearly inaudible sound distortion. It rearranges the vocal tones so they’re almost perfect, but fit together much better into a small file. It’s that “almost perfect” that kills you.

If you have an MP3 made from an MP3, nobody can fix that.