Assuming that the amplitude of the vocal wave is always larger than the amplitude of the background waves (which is true most of the time), each point of the signal should be randomly turned down or up (with a 50-50 probability), right?
As a result, stacking the background on top of the vocals shouldn't make the signal louder overall (except when the amplitude of the background is higher than the amplitude of the vocals)... that is strange... Intuitively, I would have said that a sound becomes louder when two signals are stacked.
I wonder what makes the values positive or negative? I mean, what is the physical meaning of a positive or negative sign?
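To check the 50-50 reasoning, here is a small sketch using two made-up sine tones in place of the real vocals and backing (the frequencies, amplitudes and sample rate are arbitrary assumptions, not taken from the actual tracks). It counts how often adding the backing pushes a point further from zero, and compares the overall RMS level before and after mixing.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# Hypothetical test signals: one second at an assumed 8000 Hz sample rate.
n = 8000
vocals = [0.8 * math.sin(2 * math.pi * 440 * i / n) for i in range(n)]
backing = [0.3 * math.sin(2 * math.pi * 313 * i / n) for i in range(n)]
mix = [v + b for v, b in zip(vocals, backing)]

# Fraction of points where the mix is further from zero than the vocals alone.
fraction_louder = sum(1 for v, m in zip(vocals, mix) if abs(m) > abs(v)) / n

print("fraction of points pushed away from zero:", round(fraction_louder, 3))
print("RMS vocals:", round(rms(vocals), 4))
print("RMS mix:   ", round(rms(mix), 4))
```

With these tones, roughly half of the individual points move away from zero and the rest move toward it, yet the RMS of the mix is still higher than the RMS of the vocals alone. For signals that are uncorrelated the mean powers add, so both intuitions can hold at once: point by point it looks like a coin flip, but on average the stacked sound is louder.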
The sample you attached seems to be a better acapella... but I guess the vocals are altered, and since it's intended to be subtracted from other parts of the song (to remove tags), I wonder whether the final result will be better or not. When the amplitude of the backing is high, I guess reducing it matters more than the distortion of the vocals... so the result should be slightly better overall.
However, I wonder if there is a better algorithm to improve the acapella more efficiently... especially with more than 2 samples, it should be possible to write a script that causes less distortion and reduces the backing more.
For example, assuming we have 3 samples, I can now compare a triplet of points. If two have almost the same value and the third is different, there is a good chance that the value of the point in the original vocals is closer to the two matching values. Maybe I could even improve the script by working on wider motifs, taking into account more than 1 point... I'm sure we can do a better job with more than two samples. If you have any ideas... Here is a third sample:
http://www.mediafire.com/?7hlazcz5ja753sl
We could even work on 6 samples by taking the left and right tracks separately, as each sample is originally stereo.
Please let me know if you have any ideas or advice for enhancing the acapella. I really need something as clean as possible.
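To make the triplet idea concrete, here is a minimal sketch. A per-point median is just one simple way to "trust the values that agree"; it is not an existing script, and it assumes the samples are already time-aligned point for point and at the same gain (the function name `median_combine` and the toy values are made up for illustration).

```python
from statistics import median

def median_combine(takes):
    """Per-point median across time-aligned takes (lists of sample values).

    Assumes every take is aligned sample-for-sample and at the same gain.
    The output is cut to the length of the shortest take.
    """
    length = min(len(t) for t in takes)
    return [median(t[i] for t in takes) for i in range(length)]

# Toy data: each take equals the "true" vocals except at one point,
# where a different backing leaks in.
vocals = [0.0, 0.5, -0.4, 0.2]
take_a = [0.3, 0.5, -0.4, 0.2]
take_b = [0.0, 0.1, -0.4, 0.2]
take_c = [0.0, 0.5, -0.9, 0.2]

recovered = median_combine([take_a, take_b, take_c])
print(recovered)
```

With 3 takes the median simply picks the middle value, so a single outlier at a point is ignored. The same function accepts 6 channels; with an even count, `statistics.median` averages the two middle values. On real recordings the takes will never match exactly, so this would only reduce the backing rather than remove it.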
Thanks
max