extract common part of two audios

Posted: Sat Jan 01, 2011 1:00 pm
by mathmax
Hello

I have two audios (in mono). Both contain the same vocals, only the backgrounds differ. Is it possible to extract the common vocals getting rid of the backgrounds?
Thank you in advance for any advice :)

max

Re: extract common part of two audios

Posted: Sat Jan 01, 2011 2:25 pm
by steve
A qualified yes.
If the vocals are identical they can easily be removed, but they must be absolutely identical.

Import both (mono) tracks into Audacity.
If the tracks import as "2 channel" tracks, use "Tracks menu > Stereo Track to Mono" to convert them to 1 channel mono.
Click on the name of the upper track and select "Make Stereo Track".
Apply "Effect menu > Vocal Remover".

If the vocal is not removed, then it is not identical on both tracks.
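
For illustration only, here is the idea behind that test as a plain Python sketch (not Nyquist, and not Audacity's actual Vocal Remover code). It assumes that in the simplest case the effect amounts to subtracting one channel from the other, so anything identical in both channels cancels and only the difference between the two backings is left. The function name and the sample values are made up.

```python
# Sketch only: models "remove what is identical in both channels" as a
# per-sample subtraction. This is an assumption about the simplest case,
# not Audacity's real implementation.

def remove_common(left, right):
    """Subtract the right channel from the left, sample by sample."""
    return [l - r for l, r in zip(left, right)]

# Hypothetical toy data: the same vocal over two different backings.
vocal     = [0.5, -0.25, 0.75, -0.5]
backing_a = [0.125, 0.25, -0.125, 0.0]
backing_b = [-0.25, 0.125, 0.25, 0.5]

left  = [v + a for v, a in zip(vocal, backing_a)]
right = [v + b for v, b in zip(vocal, backing_b)]

# The vocal cancels; what remains is backing_a minus backing_b.
residue = remove_common(left, right)
print(residue)
```

If the vocal differs between the channels in level or timing, it no longer cancels, which is why the test only works for truly identical vocals.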

Re: extract common part of two audios

Posted: Sat Jan 01, 2011 8:40 pm
by mathmax
They are identical.. but I don't want to remove them. On the contrary I would like to keep only them... and get rid of the backgrounds.

Please hear these samples, you'll see what I mean:
http://www.mediafire.com/?nd2t2s1o3dc42cx
http://www.mediafire.com/?cx1i22ydzf7g87x

I want to have a clean acapella of "DJ Stolen" (the common part I want to extract).

Re: extract common part of two audios

Posted: Sat Jan 01, 2011 9:09 pm
by steve
mathmax wrote: They are identical.. but I don't want to remove them. On the contrary I would like to keep only them... and get rid of the backgrounds.
In that case the answer is no. Or at least, not with the tools available in Audacity.
There is a VST plug-in (that works in Audacity 1.3.12) that claims to be able to do this.
It is available as a free demo version, or a non-free full version from here: http://www.paulrharvey.co.uk/elevayta/product13.htm

Some of the demos are quite impressive, though reviews that we have seen on this forum indicate that the results are highly dependent on the source material and the sound quality of the result is often not very good. Let us know how you get on.

Re: extract common part of two audios

Posted: Sat Jan 01, 2011 10:31 pm
by mathmax
Thank you.
I installed it but I don't know how to use it... :-(
I put the two signals in phase and made a stereo track from both of them. Then I tried the option "Isolate Vocals" but it gives me something really distorted... :-/ Probably I'm not using the right option... I also don't have the slightest idea how to adjust the settings...

Could you give me some hints?

Re: extract common part of two audios

Posted: Sat Jan 01, 2011 10:53 pm
by steve
I've not used it myself, but there is a manual on the download page http://www.paulrharvey.co.uk/elevayta/product13.htm

Re: extract common part of two audios

Posted: Sat Jan 01, 2011 11:18 pm
by mathmax
how do you know this tool can do it? Which option should I use? I tried all.. nothing seems to work properly.. :(

Re: extract common part of two audios

Posted: Sat Jan 01, 2011 11:24 pm
by steve
mathmax wrote: how do you know this tool can do it?
The task that you are describing is essentially the same as "vocal isolation", which Elevayta Extra Boy claims to be able to do. The examples on the download page show that with the right kind of material it can indeed perform this difficult task, though the examples are clearly chosen to showcase its ability. With other material your mileage may vary.

Re: extract common part of two audios

Posted: Sun Jan 02, 2011 11:59 am
by mathmax
Ok.. so the vocal isolation doesn't really work in my case.

I have another idea... Is it possible to compare the waves point by point and, for each pair of points, take the one with the lowest amplitude to create a new wave? I could even do that taking 4 or 5 different samples into account. This way I would be able to take the best acapella parts from each version. How can I achieve this? It should be easy to script... but is it possible with Audacity, or should I use another tool?

Re: extract common part of two audios

Posted: Sun Jan 02, 2011 7:32 pm
by steve
mathmax wrote: Is it possible to compare the waves point by point and, for each pair of points, take the one with the lowest amplitude to create a new wave?
Yes, that can be scripted in Audacity using Nyquist, but I don't think it will produce the result that you are expecting.

Both the backing and the vocal are waveforms that are oscillating independently with positive and negative values. At any given point, the value of a sample is the sum of the backing component and the vocal component, but as either of these may be positive or negative, the sum may also be positive or negative. There is no direct correlation between the minimum sample value and whether the sound comes from the vocal or the backing. Although there will be some degree of reduction in the background (assuming identical voice), the result will also contain distortion/noise.
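
A small sketch of this point in plain Python (for illustration outside Audacity; the helper name and sample values are invented). It builds two mixes from the same vocal and different backings, keeps the sample nearer to zero at each position, and compares the result with the vocal.

```python
# Sketch only: toy numbers, chosen to be exactly representable as floats.

def smaller_magnitude(a, b):
    """Return whichever of the two samples is closer to zero."""
    return a if abs(a) <= abs(b) else b

vocal     = [0.5, -0.25, 0.75, -0.5]
backing_a = [0.125, 0.25, -0.125, 0.0]
backing_b = [-0.25, 0.125, 0.25, 0.5]

mix_a = [v + a for v, a in zip(vocal, backing_a)]
mix_b = [v + b for v, b in zip(vocal, backing_b)]

picked = [smaller_magnitude(a, b) for a, b in zip(mix_a, mix_b)]
error  = [p - v for p, v in zip(picked, vocal)]

print(picked)  # what the "lowest sample" rule keeps
print(error)   # how far that is from the vocal at each position
```

In the last position mix_a equals the vocal exactly (its backing is silent there), yet the rule keeps the sample from mix_b because it happens to be nearer to zero. The smaller sample is not necessarily the one closer to the vocal.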

The simplest way would be to do sample-by-sample comparisons in a loop (read one sample of each and compare them, then read the next sample from each), but that would be extremely slow. A much faster method is to use the (s-min) function, as this can compare entire sounds sample by sample without having to script a loop through individual samples.

If you want to try it, import both tracks into Audacity and make them into one stereo track by clicking on the name of the upper track and selecting "Make Stereo Track".
Select the stereo track and from the Effect menu select "Nyquist Prompt".
Copy and paste the following code into the Nyquist Prompt box and click OK.

Code:

(let*
    ((PosL (s-max (aref s 0) 0))
     (PosR (s-max (aref s 1) 0))
     (NegL (s-min (aref s 0) 0))
     (NegR (s-min (aref s 1) 0))
     ; compare positive parts
     (MinP (s-min PosL PosR))
     ; compare negative parts
     (MinN (s-max NegL NegR))
     ; compare PosL with NegR
     (MinPN (s-min PosL (mult -1 NegR)))
     ; compare NegL with PosR
     (MinNP (s-min (mult -1 NegL) PosR))
     ; correct sign for MinPN and MinNP
     (MinPN (mult -1 (clip (mult ny:all (sum PosL NegR)) 1) MinPN))
     (MinNP (mult -1 (clip (mult ny:all (sum NegL PosR)) 1) MinNP)))
  (sim MinP MinN MinPN MinNP))

A brief summary of how the code works:
(s-min) is a function that computes the minimum of two sounds.
(s-max) is a function that computes the maximum of two sounds.

As the amplitude may be positive or negative we need to calculate the minimum amplitude of the positive and negative parts of the wave, then stick the parts back together.
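
For anyone who finds the Nyquist hard to follow, here is a rough per-sample translation into plain Python. It is only a sketch for reading alongside the script, not something Audacity can run; `sign_clip` is an invented helper standing in for the `(clip (mult ny:all ...) 1)` trick, which turns any non-zero value into +1 or -1.

```python
# Sketch only: mirrors the Nyquist script one sample at a time.

def sign_clip(x):
    """Stand-in for (clip (mult ny:all x) 1): returns -1, 0 or +1."""
    return (x > 0) - (x < 0)

def lowest_common_sample(l, r):
    """Keep whichever of the two samples is closer to zero, with its sign."""
    pos_l, pos_r = max(l, 0.0), max(r, 0.0)   # positive parts
    neg_l, neg_r = min(l, 0.0), min(r, 0.0)   # negative parts
    min_p = min(pos_l, pos_r)                 # both samples positive
    min_n = max(neg_l, neg_r)                 # both samples negative
    min_pn = min(pos_l, -neg_r)               # left positive, right negative
    min_np = min(-neg_l, pos_r)               # left negative, right positive
    # restore the sign of whichever sample had the smaller magnitude
    min_pn = -sign_clip(pos_l + neg_r) * min_pn
    min_np = -sign_clip(neg_l + pos_r) * min_np
    # at most one of the four parts is non-zero, so the sum recombines them
    return min_p + min_n + min_pn + min_np

left  = [0.5, -0.5, 0.5, -0.25, 0.25]
right = [0.25, -0.25, -0.25, 0.5, -0.5]
result = [lowest_common_sample(l, r) for l, r in zip(left, right)]
print(result)
```

Note that when the two samples have equal magnitude but opposite sign, this translation returns zero.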

A couple of other points:
It is a bit risky using special characters in file names (such as back-slash and apostrophe) as such characters are not valid on many machines and incompatible with some programs. It is generally safest to stick with alpha-numeric characters, space, hyphen and underscore.

Although the voice in the two samples sounds as if it comes from the same original recording, the two are not identical. The voice is about 4 dB louder in "1'55_left (with amplification t 0 b 0.6)" than in "1'35_left (with amplification t -0.6 b 0)" and about 5 milliseconds (250 samples) earlier. To avoid just producing bad distortion you will need to carefully align the two tracks.
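
As a rough illustration of what "aligning" involves, the plain Python sketch below converts a level difference in dB to a gain factor and estimates a time offset by brute-force cross-correlation. This is not how the figures above were measured, and all function names and test data are invented for the example.

```python
# Sketch only: pure Python, far too slow for real audio, but it shows the idea.

def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)

def best_lag(a, b, max_lag):
    """Return how many samples b lags behind a (negative if b is earlier)."""
    def score(lag):
        return sum(a[i] * b[i + lag]
                   for i in range(len(a)) if 0 <= i + lag < len(b))
    return max(range(-max_lag, max_lag + 1), key=score)

def advance(samples, lag):
    """Move a track lag samples earlier (lag >= 0), padding with silence."""
    return samples[lag:] + [0.0] * lag

# Hypothetical test signal: b is a copy of a delayed by 2 samples.
a = [0.0, 0.0, 1.0, 0.0, -1.0, 0.0, 0.0, 0.0]
b = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, -1.0, 0.0]

lag = best_lag(a, b, 3)
aligned = advance(b, lag)
print(lag, aligned == a)

# A 4 dB level difference corresponds to a factor of roughly 1.58,
# so the louder track would be scaled by about 0.63 to match.
print(round(db_to_gain(4), 3), round(db_to_gain(-4), 3))
```

Bear in mind that scaling a whole track changes the level of its backing as well as the voice.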

The result of this "lowest common sample" script applied after adjusting the time position is attached:
lowest-common-sample.wav (147.2 KiB)