Want to do time-delay scaled add with channel swap/mix

Audacity 2.0.3, Win 7 32-bit for now (occasionally use on MacOSX and Linux)


I had an idea and was going to try implementing it using stereo cross-channel convolution in foobar2000 foo_dsp_stereoconv, but discovered that its 1024-point FFT limits the delay I can apply when convolving. Nyquist sounds like it might be capable of what I want, and I’m very familiar with other aspects of Audacity.

I’m trying to get my head round Nyquist or SAL, mainly by experimenting in the Nyquist prompt, but I can’t understand the syntax or the descriptions in the Nyquist reference well enough to make much beyond simple scaling work (e.g. Return s * 0.5). I also can’t completely work out which parts refer to audio streams in some of the .ny files I’ve downloaded from here or found in the plugins folder.

I want to make a sort of repeated decaying echo effect with a short enough time (a few tens of milliseconds) that it ‘belongs’ to the same sound, like room-reflections (or reverb) rather than sounding like separate sounds (echo, hundreds of milliseconds). This so far sounds like a fairly simple delay and add a scaled version, and repeat until the added signal has decayed by, say, 90 dB.

However, I want to move some or all of the left source channel to the right channel at the first reflection and some or all of the right channel to the left channel too.

Rather than room modelling, I happen to be visualizing each stereo sample as a complex number, where the sample value on the left channel is the REAL component and that on the right channel is the IMAGINARY component. I consider it in polar coordinates, so with each reflection I multiply the radius r by the attenuation factor a to decay the reflection, and add x to the angle theta, then pass that reflection on for a subsequent reflection. So the attenuation on the n’th reflection is a^n, and the angle added to theta is x*n (that’s equivalent to multiplying by the n’th power of a complex number with radius a and angle x).

The conversion from polar to real+imaginary parts returns us to the left and right channels.

I understand we need to think not of an array of samples, but of a sound and a modified sound (much like a circuit design), where the modification could be a delay of d milliseconds per reflection, and we can apply simple scaling by multiplying by an attenuation factor a, where a = 10^(negative decibels/20). For reflection n, the scale is a^n (a raised to the nth power).
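As a quick check of the arithmetic, here is a minimal Python sketch (Python rather than Nyquist, purely to verify the numbers). The function names are made up for illustration, and the 90 dB floor is the figure mentioned earlier in this post.

```python
import math

def db_to_gain(decay_db):
    """Linear attenuation factor a for a decay of decay_db decibels per reflection."""
    return 10.0 ** (-decay_db / 20.0)

def reflections_needed(decay_db, floor_db=90.0):
    """Smallest n for which the n'th reflection is at least floor_db down."""
    return math.ceil(floor_db / decay_db)

a = db_to_gain(12.0)
print(round(a, 4))               # 0.2512
print(round(a ** 3, 4))          # 0.0158 (third reflection, 36 dB down)
print(reflections_needed(12.0))  # 8
```

So at 12 dB per reflection, eight reflections are enough to get below -90 dB.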

DestinationLEFT = sum from n = 0 to Nmax of (a^n) * (sin(xn) * SourceRIGHT_delayedBy(dn) + cos(xn) * SourceLEFT_delayedBy(dn))
DestinationRIGHT = sum from n = 0 to Nmax of (a^n) * (cos(xn) * SourceRIGHT_delayedBy(dn) - sin(xn) * SourceLEFT_delayedBy(dn))
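The two sums can be written out sample by sample to check that they behave as intended. This is only a Python sketch of the formula, not Nyquist code. The function name and parameters are invented for illustration, the delay is given in samples rather than milliseconds, and both channels are assumed to have the same length.

```python
import math

def rotating_echo(left, right, delay_samples, decay_db, angle_deg, n_max):
    """Sum reflections n = 0..n_max, each delayed by n*delay_samples,
    scaled by a**n and rotated by n*angle_deg between the channels."""
    a = 10.0 ** (-decay_db / 20.0)
    x = math.radians(angle_deg)
    # The output is longer than the input by the delay of the last reflection.
    length = len(left) + n_max * delay_samples
    out_l = [0.0] * length
    out_r = [0.0] * length
    for n in range(n_max + 1):
        g = a ** n
        c, s = math.cos(x * n), math.sin(x * n)
        offset = n * delay_samples
        for t in range(len(left)):
            out_l[offset + t] += g * (s * right[t] + c * left[t])
            out_r[offset + t] += g * (c * right[t] - s * left[t])
    return out_l, out_r

# Unit impulse on the left channel, 20 dB decay, 90 degrees per reflection.
out_l, out_r = rotating_echo([1.0, 0.0], [0.0, 0.0], 2, 20.0, 90.0, 2)
```

With these settings the impulse stays at 1.0 on the left at sample 0, appears at about -0.1 on the right at sample 2, and returns at about -0.01 on the left at sample 4 (tiny rounding residues aside).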

There’s also a special case, where x is 90°, and that means at each pass, either sin(xn) or cos(xn) is 0 so one channel can be left out of the calculation and a simple sequence of +1, 0, -1, 0 is repeated over and over, reducing the computation again.
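The repeating pattern is easy to confirm numerically. This small Python sketch (the helper name is hypothetical) prints cos(xn) and sin(xn) for x = 90°, rounding away floating-point residue so the exact 0 and ±1 values show up.

```python
import math

def step_coefficients(angle_deg, n):
    """(cos, sin) of n*angle_deg, rounded to 12 places; adding 0.0 turns -0.0 into 0.0."""
    x = math.radians(angle_deg) * n
    return round(math.cos(x), 12) + 0.0, round(math.sin(x), 12) + 0.0

for n in range(4):
    c, s = step_coefficients(90.0, n)
    print(n, c, s)
# 0 1.0 0.0
# 1 0.0 1.0
# 2 -1.0 0.0
# 3 0.0 -1.0
```

Plugged into the formulas above, reflection 1 is (right, -left), reflection 2 is (-left, -right), reflection 3 is (-right, left), and then the cycle repeats.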

For that case, I’d be happy just to swap channels and scale.

Computationally, I’m sure it’s most efficient to take input sample at time t and for each channel, map to the destination samples at time t + d, t+2d, t+3d and so on, each scaled by a from the previous version, but that might not suit Nyquist’s way of doing things.
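A closely related way to get the same sum is a feedback form, where each output sample adds the rotated, attenuated output from one delay earlier. This is a different technique from the per-sample scatter described above, shown only as a Python sketch. The function name and the tail_samples parameter are assumptions for illustration, and delay_samples must be at least 1.

```python
import math

def feedback_echo(left, right, delay_samples, decay_db, angle_deg, tail_samples):
    """y[t] = x[t] + a * rotate(y[t - d]); equivalent to summing all reflections
    that fit within the output length."""
    a = 10.0 ** (-decay_db / 20.0)
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    # Start from the dry signal, padded with silence for the echo tail.
    out_l = list(left) + [0.0] * tail_samples
    out_r = list(right) + [0.0] * tail_samples
    for t in range(delay_samples, len(out_l)):
        prev_l = out_l[t - delay_samples]   # already includes earlier reflections
        prev_r = out_r[t - delay_samples]
        out_l[t] += a * (c * prev_l + s * prev_r)
        out_r[t] += a * (c * prev_r - s * prev_l)
    return out_l, out_r

out_l, out_r = feedback_echo([1.0], [0.0], 2, 20.0, 90.0, 4)
```

Each sample needs only one rotation and one scale regardless of how many reflections there are, but it does loop over individual samples.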

I did try one or two examples some time ago by delaying and channel-mixing the waveforms as appropriate and using MixPaste in Cool Edit 96. Attenuation of about 12-18 dB per pass was fine, and 3 or 4 passes (with the 4th being 48 to 72 dB down) was enough to give a bit of live-room effect. This is rather labour intensive, however, and I thought Nyquist could make it simple.

Can anyone point me at some code snippets that will do each step I require:

  1. Delay the audio and add a scaled-down echo version
  2. Swap or partially mix the channels of the added audio.

Once I understand that I can probably create input boxes such as those in the Delay plugin to make it configurable and could share the code here for review and clean-up and testing and possible refinement to suit boundary cases (e.g. awareness that the effect slightly lengthens the sample).

Thanks in advance.

I have just found the following plugin in the archives, which is very close to the special case of 90 degrees I’m hoping to achieve.

The main difference is that it forces Peak Normalization (though I guess I can edit).


Hopefully I’ll be able to understand it with the help of the Nyquist Prompt and if my understanding is sufficient I should be able to generalize it to remove Normalization and optionally take an input for the angular step size.

All of the regular postings on this forum use LISP syntax rather than SAL.
SAL is a newer syntax that was intended to be more comfortable for C/C++ programmers.
Any code examples you find on this forum are likely to use LISP syntax.
Nyquist is based on XLISP (a dialect of LISP). There is a good guide to XLISP, including a language reference here: http://www.audacity-forum.de/download/edgar/nyquist/nyquist-doc/xlisp/xlisp-index.htm
The main Nyquist manual is here: http://www.audacity-forum.de/download/edgar/nyquist/nyquist-doc/manual/home.html
The Nyquist language reference here: http://www.cs.cmu.edu/~rbd/doc/nyquist/indx.html
and the Plug-in reference for Nyquist plug-ins for Audacity is here: http://wiki.audacityteam.org/wiki/Nyquist_Plug-ins_Reference

The code for “Channel Mixer” may help you see how to manipulate stereo channels: http://wiki.audacityteam.org/wiki/Nyquist_Effect_Plug-ins#Channel_Mixer

Nyquist is an interpreted language, and consequently looping through individual samples can be extremely slow. However, Nyquist includes many DSP primitives that are written in highly optimised (computer generated) C code, so working on “sounds” tends to be very much faster than working with individual samples, so as a general rule, work with sounds where you can rather than samples.
As an example of the difference in speed, try these two code snippets on a SHORT mono track (no more than a few seconds)

(mult s 0.5)

(do ((output (s-rest 0))
     (val (snd-fetch s)(snd-fetch s)))
    ((not val) output)
  (setf output 
    (sim (snd-from-array 0 *sound-srate* (vector (* 0.5 val)))
         (at-abs (/ *sound-srate*)(cue output)))))

As well as the first example being much shorter and easier to read, it is very much faster than the second, but both do the same thing.
A brief explanation of the second code snippet (how not to do it):
OUTPUT is initialised to a null sound.
VAL is initialised to the value of the first sample, and then on each loop in the DO block it fetches the next sample.
The loop runs until VAL is “nil” (no samples left) and the loop returns OUTPUT.
Within the loop, VAL * 0.5 is converted to a one sample sound and added onto the beginning of OUTPUT.

Here’s a simple snippet for a mono delay (echo)

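(defun echo (sig delay gain repeats)
  (let ((output sig)
        (dly delay)
        (amp gain))
    (dotimes (i repeats output)
      ;; add the next echo to the mix, delayed by dly and scaled by amp
      (setf output (sim output
                        (at-abs dly (cue (mult sig amp)))))
      ;; each echo is quieter than the previous one by the factor gain
      (setq amp (* amp gain))
      (setq dly (+ dly delay)))))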

(echo s 0.5 0.8 3)

Here’s a snippet to swap left and right channels of a stereo track:

(vector (aref s 1)(aref s 0))

Thanks, Steve,

Your replies are certainly helping me understand each aspect.

I think those links and your description will help me understand what’s going on.

I was able to remove the normalize function from the delay_flip plugin and replace it with a scale factor (default 1.0) and change the defaults to closer to my requirements (12.0 dB decay, 0.03 s delay, 8 delays, scaling 1.0).

I tested it on an impulse function (+1.0 on the first sample, zero elsewhere in the right channel) and it switched channel as expected, but didn’t ever invert the signal to a negative-going impulse as I had expected. I guess this is to do with. It still had the sort of audible effect I was after when I applied it to music and speech.

As far as I’m aware, it seems to have generated vectors of samples for the original and channel-swapped (flipped) waveforms, which seems to require vast amounts of memory, which is probably why it crashed on a single 70 minute file.

That seems to match what you did in the third post with the vector

(setq s:orig (vector (aref s 0) (aref s 1)))
(setq s:flip (vector (aref s 1) (aref s 0)))
(setq s (s-rest 0))

I’m not sure what (setq s (s-rest 0)) is for, however.

It sounds like my 90 degree version requires me to create 4 vectors:

(setq s:orig (vector (aref s 0) (aref s 1)))
(setq s:flip1 (vector (aref s 1) (* -1 (aref s 0))))
(setq s:flip2 (vector (* -1 (aref s 0)) (* -1 (aref s 1))))
(setq s:flip3 (vector (* -1 (aref s 1)) (aref s 0)))

I can then cycle through those 4 vectors with appropriate scaling and delay as is done in the delay_flip.ny plugin.

I’m not sure if (* -1 (aref s 0)) is the correct syntax, but I need to invert one channel only, not both.

I guess the alternative is to decompose into separate left and right vectors and left-invert and right-invert vectors.

The “(setq s (s-rest 0))” simply resets the original input sound to one of zero length - to free memory.
“(aref s 0)” refers to the left channel, and that’s a sound (not a number). Therefore, you should use ‘mult’ or ‘prod’ or ‘scale’ for multiplication, instead of ‘*’ which works only with numbers.
I suggest that you use the cos and sin functions instead of 1 and -1, maybe with an angle argument that follows the current reflection time.
Something like

(setf L (sim (mult (cos rad) (aref s 0)) (mult (sin rad) (aref s 1))))

You could of course also change the angle continuously over time by employing low frequency oscillators instead of the two trigonometric functions.
But to be honest, it is not yet clear to me what kind of reverb you’re trying to get eventually.
Maybe, I need to re-read the first post a dozen times…

Thanks, Robert, that helps a great deal.

It’s a bit like room modelling but with single echoes decaying and separated by 25-50 ms or so, so that I avoid comb filtering of audible frequencies (which I’d get mildly with a shorter delay) and avoid distinct echoes (which I’d get with a longer delay outside the Haas effect, I believe). It’s also simpler to implement.

The sound will be pretty similar to what you get with the delay_flip effect set to about 12 to 18 dB decay, 0.03 seconds delay and about 8 repeats and whatever normalize setting you want, but it will give me room to experiment a bit more with a range of settings. (Each track you use it on needs to be stereo already)

It tends to widen out the stereo field and makes a flat recording (e.g. close-miked) or a near mono recording or a hard-panned recording which goes into one ear on headphones, feel like it’s played back in a room with a healthy dose of room reflections (rather than a dead room covered with absorbent materials and bass traps) with no reflecting panels. It also seems to spread background noise and tape hiss around a bit spatially. I’m not sure if this makes it easier to localize the desired signal and ignore the hiss or not, but I get that impression. In a way it makes some rather flat recordings sound a bit more ‘produced’ with a bit of ‘shimmer’ or ‘sheen’

Those sorts of effects also come out of the delay_flip effect (delayfli.ny), so it might not be a worthwhile advance and I might just share my modified version of that without the normalise function (it uses an optional scale ratio instead). I haven’t fully amended the comments, but the code works and has reasonable defaults to hear the sort of effect I’m after.

;nyquist plug-in
;version 1
;type process
;name "Delay (Stereo Flip no normalize)..."
;action "Applying delay, flipping stereo..."
;info "modified from Delay(Stereo Flip) by David R. Sky\nModifications by Ryan K. Harding\nReleased under terms of GNU Public License"

;control decay "Decay amount" real "dB" 12.0 0.0 24.0
;control delay "Delay time" real "seconds" 0.03 0.0 5.0
;control count "Number of delays" int "times" 8 1 100
;control norm-scaling "Scaling" real "" 1.0 0.0 1.0

; Delay with Stereo Flip by David R. Sky
; December 2, 2004; updated January 3, 2006
; a delay effect which flips stereo channels with each delay
; Released under terms of the GNU Public License
; http://www.opensource.org/licenses/gpl-license.php
; Inspired by a sound effect in the opening track of
; Mike Oldfield's "Songs From Distant Earth"
; Thanks to Steven Jones for illustrating
; how to check for even/odd numbers

; set original and flipped audio samples
(let (s:orig s:flip i x)
(setq s:orig (vector (aref s 0) (aref s 1)))
(setq s:flip (vector (aref s 1) (aref s 0)))
(setq s (s-rest 0))

; function to produce next delay
(defun nextflip (i decay delay s s:orig s:flip)
(setf pos (* i delay))
(setf vol (* -1 i decay))
(if (evenp i) ; if i is even, do not flip
(sim (cue s) (at-abs pos (loud vol (cue s:orig)))) 
(sim (cue s) (at-abs pos (loud vol (cue s:flip)))))) 

; normalize function
(defun normalize (signal)
(setf x 
(max (peak (aref signal 0) ny:all) (peak (aref signal 1) ny:all)))
(scale norm-scaling signal))

; generating the delays
(normalize (simrep (i (+ 1 count))
(nextflip i decay delay s s:orig s:flip)))
) ; close let

That’s certainly a starting point.
I would have kept the normalizing - you’ll never know how loud the whole stuff gets in the end, especially if you’re still experimenting.
You could now implement the projection formulas (those with sin and cos) in the ‘nextflip’ function. You could thus pass an angle which rotates the stereo field at each stage (maybe another control?).
90° would automatically swap the two channels (inverting one of them) - no need for the if conditional.
But for this to work, you have to replace s:orig and s:flip with left and right.
Or do you have something else in mind?
There are endless possibilities that can be explored.

That sounds like what I’m intending, with an extra control for the angle per pass.

I removed the normalizing for testing purposes, to compare like with like without significantly increased volume making the louder version sound better. With a decay of over 10 dB per pass, there’s a chance of about a 35% peak increase, though I’d expect that to be rare, and the volume increase should be less than 0.5 dB ~= 10 * log10(1.1111).
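Those estimates can be reproduced with a short Python sketch. The function names are invented for illustration. The peak bound assumes every reflection adds in phase and that each reflection is a plain swap or inversion (the 90° case), and the level figure assumes the reflections are uncorrelated so that their powers add.

```python
import math

def worst_case_peak_ratio(decay_db):
    """Upper bound on peak growth if all reflections add in phase: 1 / (1 - a)."""
    a = 10.0 ** (-decay_db / 20.0)
    return 1.0 / (1.0 - a)

def average_level_increase_db(decay_db):
    """Level increase if the reflections' powers add: 10 * log10(1 / (1 - a^2))."""
    a = 10.0 ** (-decay_db / 20.0)
    return 10.0 * math.log10(1.0 / (1.0 - a * a))

for decay_db in (10.0, 12.0):
    print(decay_db,
          round(worst_case_peak_ratio(decay_db), 2),
          round(average_level_increase_db(decay_db), 2))
# 10.0 1.46 0.46
# 12.0 1.34 0.28
```

At 12 dB per pass the worst case is roughly a third more peak level, and at 10 dB the power ratio is 1.1111, a little under 0.5 dB.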

My final version might allow peak normalize but will certainly have an option to disable it entirely or apply loudness scaling correction.

I’ve had a further thought. This sort of approach, as with using the Haas effect for time-delayed sound reinforcement at concerts without making the sound appear to localize to the reinforcing speaker because it’s delayed with respect to the original source on stage, might also have some use in making the volume louder without requiring severe dynamic compression, peak limiting, clipping and distortion. I don’t really approve of the Loudness War, but with a different decay curve or even a boost of up to 10 dB in the early reflections, it might work - either with or without the channel swapping.

That link also mentions Haas kickers which add to the stereo spaciousness using physical reflective panels which have an effect not entirely dissimilar to what I’m aiming for.

By the way, does Audacity use floating point internally or fixed point? In other words do I need to worry about clipping at every step or can it be left until the end and tamed with a look-ahead limiter? I know the Envelope tool shows values greater than 1.0, but that could just be the graphical representation prior to ‘rendering’ the envelope.

Clipping is only an issue for the export. All is calculated with 32 bit floating point numbers.
Here’s some modified code for the Nyquist prompt, implementing the fixed rotation/reflection:

;; hard-coded values (= control parameters of the plug-in)
(psetq delay 0.03 decay 12.0 count 10
       angle 90.0 norm-scaling 1 )
(let* (reflection
       (s-left (aref s 0))
       (s-right (aref s 1))
       (s (s-rest 0))
       (omega (* (/ angle 180.0) pi)))
   ; function to produce next delay
   (defun next-reflection (i decay delay s s-left s-right &aux 
          (pos (* i delay)) (vol (* -1 i decay)) (rad (* i omega)))
     (setf reflection (vector
           (diff (mult (cos rad) s-left) (mult (sin rad) s-right))
           (sum (mult (sin rad) s-left) (mult (cos rad) s-right))))  
     (sim (cue s) (at-abs pos (loud vol (cue reflection)))))
   ; normalize function
   (defun normalize (signal)
     (setf x (multichannel-max signal ny:all)); not active
     (scale norm-scaling signal))
   ;; generating the delays
   (normalize (simrep (i (+ 1 count))
                      (next-reflection  i decay delay s s-left s-right))))

You may want to add a special treatment for the first reflection (following your previous post that refers to the precedence effect).
A start offset for the decay could be a possible solution.
Nice would also be a “prevent distinct echoes” algorithm.
Something that recomposes the reflection signal in order to have less delay (or gain) where an impulse-like event occurs. Either Delay or Decay would thus only be target values and not fixed ones.