Karaoke, Rotation, Panning & more

Robert_J_H · July 1, 2013, 12:39pm

Hello everyone
I’ve been working on this for a long long time…
Currently, the Voice Removal effect in Audacity can neither return a stereo sound nor isolate the center.
External plug-ins can do that (somewhat) but I thought it would be nice to have such an effect written in Nyquist - where it can be tweaked easily.
Here it is:
rjh-stereo-tool.ny (14.8 KB)

Update from July 16th, 2015
The effect (without rotation, panning etc.) now ships with Audacity (starting with version 2.1.1).
It is called “Vocal Reduction and Isolation”
It has its own help page online or locally installed.

The following is therefore more or less obsolete:
(Update, you can find a slim version for Audacity 2.1.0 and higher in the following post:
http://forum.audacityteam.org/viewtopic.php?p=278787#p278787 )

History:

 v1.0: July 1st, 2013
 v1.1: Faster by about 33 % (3:00 min in 52 s)
 v1.2: No messaging for aborted or finished preview
 v1.3: Faster by about 25 % (3:00 min in 37 s; without filter in 1:14 min)
       Added playback for left and right channel
       Error message for play attempts on non-Windows OS
       Only one Effect Parameter (context dependent)
       Remove/isolation split into filtered and not filtered versions
       Some catalog entries renamed

(A newer feature description will be posted soon)

A short feature description:
Catalog control:

Remove Center - Returns a Stereo version of the center-less track.

Isolate Center - The center without most of the extremely panned audio (the slope is linear, so some background will still be audible)

Same as before, but it can be used on a duplicated track to control the attenuation with the gain control. Both tracks together = no center; New track alone: emphasis of the center (see example below)

Carousel and Rotary effects - Rotate the Stereo image by degrees/s (Rotary = 90 times faster). Set the degrees under “Stereo Field Rotation”

Some Fade effects which go in the direction of the set “Pan Angle”. Note that in principle all settings made (rotation, pan, delay and width) influence the outcome of a chosen action from the catalog.

Fixed Rotation/Panning (…Delay and Width) This applies the current transformation settings without variable modification (or center removal). Useful to apply a single 45 degree Rotation or a widening effect (again, all settings are used simultaneously).

This example illustrates the use of the (inverted) center isolation.

The Options control just does what it says.
Please note that the center removal is very time-consuming (but now 50 times faster than my earliest attempts). Therefore, I’ve included the ability to play the result while it is calculated.
The nice thing is that the execution can be aborted any time - no bad surprises.
This feature runs badly or not at all on Linux systems!
The analyse option tells you something about the (cross-) correlation of the two channels.

There are 4 controls for Rotation, Panning (gain difference), Delay (panning with time difference) and Stereo Width. The latter does not have much impact on the different effects - it is here for completeness’ sake.
I know that some people will complain that there are too many controls. Although it is not obvious, they can improve the result for a center removal task. For example, many old tunes are panned such that the voice is on one side and the band on the other. You can try rotating the stereo field by +/- 45 degrees to solve this problem.
You’ll see what I mean if you first apply the Carousel effect and then Removal/Isolation.
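A minimal sketch of what a fixed stereo field rotation could look like in Nyquist. This is not the code from rjh-stereo-tool.ny: the function name `rotate-stereo`, the plain 2-D rotation matrix and the sign convention are assumptions made for illustration. It expects a stereo selection in `s`, as used elsewhere in this thread (newer Audacity versions may use a different variable name).

```lisp
;; Hypothetical helper, not taken from the plug-in.
;; Rotates the stereo image of a two-channel sound by `degrees`.
(defun rotate-stereo (sig degrees)
  (let* ((rad (* degrees (/ 3.141592653589793 180.0))) ; degrees -> radians
         (c (cos rad))
         (sn (sin rad))
         (left (aref sig 0))
         (right (aref sig 1)))
    (vector
      (diff (mult c left) (mult sn right))     ; new left  = cos*L - sin*R
      (sum  (mult sn left) (mult c right)))))  ; new right = sin*L + cos*R

;; Rotate the selection by 45 degrees (stereo tracks only).
(if (arrayp s)
    (rotate-stereo s 45.0)
    "Please select a stereo track.")
```

With this convention, a source that is present only in the left channel ends up equally loud in both channels, i.e. in the center, after a rotation by +45 degrees. That is the situation where a subsequent center removal or isolation can get hold of it.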

There are another three controls:

Low- and highpass/-cut. Only the frequencies between those values are regarded while removing/isolating.

Playback control (if you hear clipping during a preview).

A last sample sound for now. The (mono) track has been “improved” with an oversized stereo reverb.
I’ve used the Center Isolation to dry it up again.

Have fun - as much as I’ve had so far.

steve · July 1, 2013, 3:22pm

Congratulations Robert.

With appropriate source material the vocal removal/isolation works really well, and the clicking problem in your previous experimental code seems to be fixed.

As you say, it is rather slow - on Linux (which tends to run Nyquist slower than Windows) it takes about a minute to process 10 seconds of audio. Also, (again as you noted) there is a problem with the playback feature on Linux (nothing to do with your code - that’s a bug in Nyquist).

The number of features in this plug-in is a good demonstration of what can be achieved with the techniques that you are using, but I think it is probably too complicated for many users. In spite of the slowness I think it would be worthwhile to have a simplified version (as well as this version) that deals only with vocal isolation/removal. Vocal isolation is a popular request, but often the people that want it are not very technically inclined and would probably benefit most from a very simplified version that is robust against user errors.

A quick question about the code - what is the “pre-evaluation dummy” code for?

Robert_J_H · July 1, 2013, 3:59pm

Thank you Steve for the feedback.
As I’ve said, the additional controls are made for worst case scenarios.
Average users will most likely just start the plug-in and press OK - all that is needed for simple removal.
I have the preview option as default - in this manner I am able to play existing tracks without polluting my HD/Projects with additional audio data.
The pre-evaluation gives the play function some head room in order to play smoothly, without stuttering.
Besides, the speed is proportional to my wages…

steve · July 1, 2013, 5:21pm

OK, I thought it might be that, though unfortunately the effect is so slow on Linux (about 6 times slower than real time) that it would need a much bigger head start.

I’d definitely like to put this plug-in on the wiki (if that’s OK with you) but there’s some details that we need to sort out first.
I don’t think that we can include the Play feature in a “release” version of this plug-in until the “Play on Linux” bug has been fixed because it can cause normal playback in Audacity to fail. What I suggest here is that the Play feature is just commented out, plus a comment note to re-enable it when the bug has been fixed.

Speaking of the “Play” bug, I thought that it was logged on bugzilla, but I don’t see it there so I’ll raise the issue with the developers.

What are the references to 1e-15, 1e-15 and 1e-20?

I may have more questions about the code in due course - there’s a lot of code to look at.

Robert_J_H · July 1, 2013, 6:43pm

I’ve definitely seen your bug report somewhere - maybe on the mailing list.
But it is certainly years ago.
I’d rather prefer two versions or a control via the separator variable. Some Windows users may not be able to remove the ‘;;;’ themselves…
It is a pity to castrate the ram because some ewe sheep are black.
I am in the lucky position that my computer can do it all in real time but this may not hold true for the greater part of our “audience”.
So, there will probably be some demand for a low performance version. A small speed improvement is still possible, but not more than 5 to 10 %.

‘1E-15’ is nothing but a floating point number written in scientific notation - it’s the shortest way to write a very small or very big number. It only looks strange because I’ve not written ‘1.0E-015’. It is always used to prevent a division by zero error.
I’m looking forward to your further questions - they let me recapitulate some points from another perspective.

steve · July 1, 2013, 6:53pm

That seems reasonable, but I’d rather not give the developers the impression that fixing the bug is unimportant.
It would need to be tested on Mac to see if Play works there.

That’s what I thought, but I’m not sure how safe it is to do it that way. Isn’t there a chance that a “valid” user setting could cancel out the small value and produce a hard to trace divide by zero error? Although it is a bit more code, I prefer to see divide by zeros prevented by a conditional rather than an arbitrary offset.
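For plain numbers, the conditional approach could look like the following minimal sketch. The function name `safe-recip` and the idea of passing in a fallback value are hypothetical and not taken from the plug-in.

```lisp
;; Hypothetical helper: reciprocal of a number with an explicit zero check
;; instead of adding a small offset such as 1e-15.
(defun safe-recip (x fallback)
  (if (zerop x)
      fallback        ; the caller decides what "1/0" should mean
      (/ 1.0 x)))     ; 1.0 forces a floating point division

(print (safe-recip 4.0 0.0))   ; prints 0.25
(print (safe-recip 0.0 0.0))   ; returns the fallback instead of raising an error
```

This only covers numbers; a test like `zerop` cannot be applied to a sound as a whole.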

Robert_J_H · July 1, 2013, 8:38pm

I’ve been asking for about a year if the play functionality works with the Mac, but there doesn’t seem to be any Mac user out there that uses my plug-ins regularly.
By the way, I’ve tested my plug-in’s performance with a 3 min song. It took “only” 1:15 min - not 18 as it seems to take on Linux…
There are obviously serious memory management issues involved too - apart from the play problem.

Who would enter a number with 15 or 20 zeros?
Only joking - there is always a chance that different settings somehow add up to a nasty little zero. However, a conditional is perhaps not the right thing because we are dealing with sounds as well as numbers, and an ‘if’ statement only works fine for the latter.
For the sound case, one could write:

(s-min ny:all (recip <expression>))

This works because the error is not raised immediately if a division by zero occurs in a sound behaviour; thus we can replace the #inf value by ny:all.
That’s at least a point for my little red book… Go ahead!

steve · July 2, 2013, 12:12am

Roger Dannenberg is aware of the poor performance on Linux, but sadly we have no answer to this as yet.

That certainly does not work for numbers. Can we be sure that it always works for sounds and will continue to work in future releases of Nyquist?
The manual warns:

Note that the reciprocal of 0 is undefined (some implementations return infinity), so use this function with care on sounds.

The phrase “some implementations” (rather than “all implementations”) makes me nervous (though I’m not aware of any implementation where that does not work).

Robert_J_H · July 2, 2013, 3:58am

I think that he meant LISP implementations in general - or any higher language for this matter.
The same danger arises with s-log and so on. The simplest way for positive cases (like the tangent law panning in the transform procedure) is to clamp the value to a small minimum above zero before taking the reciprocal. This works for numbers and sounds.

(recip (s-max 1e-15 <expr>))

Recently, I’ve come across another nan-value that I’ve not seen before (apart from #inf, #-inf and #-ind), but now I can’t reproduce it - something with ‘op’ in it.
Am I getting paranoid? It is like waiting for the little bombs on the old Ataris, that told you how seriously you’ve crashed.

steve · July 2, 2013, 9:12am

I’ve not seen that either, so do post if you find it again.
These are some common ones:

(log -3.0)        ; nan
(/ 0.0 0.0)       ; error: division by zero
(sqrt -3.0)       ; error: square root of a negative number
(power 0.0 0.0)   ; -nan

Interesting that “power” produces “negative nan”.

and an easy test for (non-error) nans:

(setq x (log -3.0))
(if (= x x)(print "number")(print "nan"))

Gale_Andrews · July 2, 2013, 9:57am

I don’t think it’s on Bugzilla if it only affects users in a rare Nyquist plug-in that has a preview feature.

What happens on Linux so I know what to look for on Mac?

I have not really tried the plug-in much, Robert, but I liked the Analyze feature and found it a good predictor of how much isolation there would be.

I found the pop up when you cancel preview a bit annoying.

It crossed my mind that “fake stereo” would be a possible addition to effects that mostly did centre removal and centre isolation respectively.

Gale

Robert_J_H · July 2, 2013, 12:21pm

Hi Gale, don’t try too much, I’ll post a new version presently.
The 3 minute audio (44100 Hz, on Windows 7/64, 8 GB, medium fast 2 TB HD) now processes in 53 s instead of 1:15 min - i.e. in about 2/3 of the previous time.
The value varies with the setting for high-cut. I’ve taken all measurements with the default (8000 Hz). Without filtering, the value is 71 s. That’s just in case you want to compare with your system.

What pop-up message would you prefer? (I know, it currently sounds like a last minute escape from being shot…)
For the time being, I’ll replace the message with “Processing aborted” until you come up with something better.
I’ve changed the dummy evaluation from ‘format nil’ to ‘format t’ because I don’t know if the evaluation takes place without printing on a physical screen (debug window).
So, Steve could give the preview another try (maybe with a higher value than 200000 samples).
(See first post for the new version 1.1) - obsolete

steve · July 2, 2013, 12:31pm

It’s a rare feature because it’s buggy.
If it worked reliably it could benefit accessibility for many plug-ins.

It’s easy to test if it works or not. Simply select some audio, then run the following in the Nyquist Prompt effect:

(play s)

The problem on Linux is that Nyquist does not use Audacity’s playback system, so there may be a conflict in accessing the sound card.

On my (Debian stable) system, the default sound system is ALSA + Pulse Audio.
Audacity uses Pulse, but Nyquist tries to access ALSA directly, which usually either fails, or stalls PulseAudio. If PulseAudio stalls then Audacity loses audio in/out until restarted. However, it is not guaranteed to fail - sometimes it works! It depends if anything else is accessing the sound system (for example a web browser). PulseAudio is designed to handle multiple applications accessing the audio hardware, but can’t do so if applications are bypassing PulseAudio.

It is not really a problem for standalone Nyquist because it is not running inside another audio application and normally it would be the only audio application running.

Two possible solutions could be:

1. Add PulseAudio support to Nyquist.
2. Integrate Nyquist in Audacity more closely so that it accesses the sound system through Audacity/Portaudio.

Option 1 would have benefits to stand-alone Nyquist on Linux, but is probably not a job for the Audacity developers.
I think that option 2 would be the preferable solution (if it is possible) as it would then allow the “play” feature of Nyquist to work on all platforms that Audacity works on.

There is a workaround for Linux, which is to launch Audacity with the command:

pasuspender audacity

and then configure Audacity to use ALSA directly by using the [example] (hw,0:0) options in the device toolbar.
The downside of this workaround is that Audacity loses access to PulseAudio, so recording sounds playing on the computer won’t work, jackd will not run, and Audacity will conflict with other applications that are accessing ALSA (so it’s not a good workaround for most users).

Whether or not Nyquist’s “play” function works on Mac is currently unknown - please could you try it (a few times) Gale?

Robert_J_H · July 2, 2013, 12:57pm

It may be better to test the play function with the following:

(s-save s ny:all "" :play t)

The ‘play’ function always saves a temporary *.wav file somewhere in the Audacity folder.
If you have no write access for any reason, the attempt will fail in the first place.
Whereas ‘s-save’ plays without storage if the file name is “”.
The ‘play’ command fails on my system because there is a ‘ä’ in my last name.

Gale_Andrews · July 3, 2013, 12:17pm

I’m not sure why it needs a message, Robert. Isn’t the “Cancel” button the same as “Stop” when previewing in a built-in effect? Can we prevent the “Nyquist did not return audio” after the preview or “play” in Nyquist prompt completes?

Gale

steve · July 3, 2013, 12:29pm

Nyquist plug-ins must return “something” to Audacity, otherwise Audacity pops up the “Nyquist did not return audio” message.
“Play” does not “return” a value to Audacity, it simply plays the sound (assuming it works).
To prevent the “Nyquist did not return audio” message we need to return either audio (which Audacity will use to replace the current selection), or labels, or text.
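A minimal sketch of the “return something” idea for the Nyquist Prompt, using the playback call quoted earlier in this thread. It assumes that playback works on the platform at all, and the message text is made up for illustration.

```lisp
;; Sketch only: play the selection, then hand a text value back to Audacity
;; so that the "Nyquist did not return audio" message is not triggered.
(progn
  (s-save s ny:all "" :play t)   ; play the selection without keeping a file
  "Playback finished.")          ; hypothetical text returned to Audacity
```

The returned text is shown in a message box, so this trades one pop-up for another; it only illustrates that the value of the last expression is what Audacity receives.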

Robert_J_H · July 3, 2013, 12:58pm

Hi Gale
Normally, I would return the input sound.
Unfortunately, that one is “killed” to free some memory.
I only thought that something more constructive could be returned, like the elapsed playtime. That could be useful if you’ve heard something that needs special attention. Clipping, for example, is always audible because the play function works with 16-bit audio.
However, I can turn it back to a quiet stopping if this is the preferred behaviour.

Gale_Andrews · July 3, 2013, 4:49pm

On Mac, I tried the two Nyquist commands to play audio in the command prompt several times - but there is no playback at all. The file contains the correct audio if you do (play s).

Sighted users could see the elapsed time if the progress dialogue for Play was accurate - but it is not accurate on Windows.

I think an information box when Play stops by user intervention or it comes to an end is fine, but the user having to dismiss it every time will get very irritating IMHO.

Thanks,

Gale

steve · July 3, 2013, 5:23pm

Thanks for testing.
Could you try it using the debug button and post the debug output?

Robert_J_H · July 4, 2013, 7:45am

New Version: 1.2
(see first post for download)
No message box appears after finished or stopped preview (“Play only” option).