Trouble with sound-warp

I’m working on a processing plugin that implements time-dependent pitch shifting using sound-warp and pwlv, but am getting no joy. The problem surely involves duration and the warp value, as I can never get the resulting audio to match the window Audacity gives it (the length is wrong, often being the inverse of what it should be, and there’s sometimes a whole lot of silence added to the end).

In order to get some traction, I tried writing a simple constant pitch shift plugin using the same tools, but can’t get even that going (same problems). Here’s the code; would anyone have any suggestions?

;nyquist plug-in
;version 3
;type process
;name "Pitch Shift Test..."
;action "Boogity, boogity, boogity..."
;info "This is a brain-dead way to do a simple pitch shift.\nThe intent is to implement a more complex time-dependent pitch shifting\nusing the pwlv function, but first I have to get this version working."
;control shifted-ratio "Shift ratio (>1 == sped up)" real "" 1 0.5 2

(defun warper (d)
	(pwlv 0 (* shifted-ratio d) d))

(defun process (sound)
	(sound-warp (warper (get-duration 1)) sound 10))

(if (arrayp s)
	(dotimes (j (length s))
		(setf (aref s j) (process (aref s j))))
	(setq s (process s)))
s

Thanks,
Dan

“Warp” and the “Environment” are the two most mind-numbing concepts in Nyquist, particularly in the Audacity implementation of Nyquist.
In Audacity, the “logical duration” of the selection is always 1.0.
This puts much of the Nyquist documentation at odds with what actually happens in Audacity.

Have a play with this - it can be applied using the Nyquist prompt on a mono track:

(setq seconds 12)

(sound-warp (pwlv 0 (get-duration 1)(get-duration seconds)) s)

or as a plug-in:

;nyquist plug-in
;version 1
;type process
;name "Pitch Shift Test..."

;control newlength "New length in seconds" real "" 1 0.5 20

(defun stretch (sig seconds)
  (sound-warp (pwlv 0 (get-duration 1)(get-duration seconds)) sig))

(multichan-expand #'stretch s newlength)

This code may be easier to follow:

;nyquist plug-in
;version 1
;type process
;name "Pitch Shift Test..."

;control newlength "New length in seconds" real "" 1 0.5 20

(defun stretch (sig orig new)
  (setq seconds (* new orig))
  (sound-warp (pwlv 0 orig seconds) sig))

(setq oldlength (/ len *sound-srate*))

(if (arrayp s)
  (vector
    (stretch (aref s 0) oldlength newlength)
    (stretch (aref s 1) oldlength newlength))
  (stretch s oldlength newlength))

Most excellent: that works. Now I just gotta stare at it for a while and figure out why.


Thanks,
Dan

OK, here’s the same code, with the variable names rewritten to indicate their use and units:

;nyquist plug-in
;version 1
;type process
;name "Pitch Shift Test 2..."

;control newseconds "New length in seconds" real "" 1 0.5 20

(setq oldseconds (/ len *sound-srate*)) ;; same as (get-duration 1)
(setq warpednewseconds (* oldseconds newseconds)) ;; same as (get-duration newseconds)

(defun stretch (sig)
  (sound-warp (pwlv 0 oldseconds warpednewseconds) sig))

(if (arrayp s)
  (vector
    (stretch (aref s 0))
    (stretch (aref s 1)))
  (stretch s))

There seem to be two separate warpings of time going on:

  1. The environment that this code is executing in is pre-un-stretched by the length of the original sound
  2. The length of the final sound is changed by moving the X value of the endpoint of the pwlv sequence.

Without the first warp, the pwlv call would be (pwlv 0 1 newseconds). (I think.)

(Did I get it all right?)

And, a related question. It looks like I’ll have to know the final length of the resulting sound before I calculate any of the pwlv breakpoints. My best shot at doing this would be to calculate the breakpoints once, scale the list of breakpoints, and then hand it to pwlv-list. But, the devil is in the details; any suggestions on how to do this?


Thanks much,
Dan

If you get to a sufficient understanding where you can explain it clearly enough for anyone to understand, please explain it to me and I’ll put it in the Nyquist documentation.
I spent about a week staring at this type of thing to get an understanding of it, but I still can’t explain it clearly and I still sometimes get caught out - I think it’s a bit magic ;)

There’s only really one “warp” occurring, but there are two different “contexts” (my word). There is “real” time (24 hours in a day) and there is “Audacity” time (selection = 1 second). “Audacity” time is the “logical” or “local” time. “Real” time is “global” time.
Here’s some text to read through a hundred times: http://www.cs.cmu.edu/~rbd/doc/nyquist/part4.html#index125

“pre-un-stretched” time has both “logical” and “real” values.
The “logical” time for the selected audio starts at t=0 and ends at t=1
The “real” time for the selected audio starts at the start of the selection and ends at the end of the selection BUT…
Nyquist does not know the “track” time. Audacity passes the audio data in the variable “S” but it does not pass start and end times, so Nyquist only gets the actual audio data. As far as Nyquist is concerned the “real” time at the start of the selection is t=0 and the end time is equal to the duration.

On your second point: the length of the final sound is changed by moving the Y value of the endpoint, not the X value.

Take 5:
PWLV http://www.cs.cmu.edu/~rbd/doc/nyquist/part8.html#index384
“Creates a piece-wise linear envelope with breakpoints at (0, l1), (t2, l2), etc., ending with (tn, ln).”
This envelope is created at the “control rate” (2205 Hz by default in Audacity).
We can change (force) the sample rate of a PWL envelope like this:

(force-srate 44100 (pwlv 0.3 0.7 0.9 1 0))

or to force it to the sample rate of the audio track (the default Sound sample rate):

(force-srate *sound-srate* (pwlv 0.3 0.7 0.9 1 0))

Note that for PWLV the initial start time of 0.0 is implied, so the first number (0.3) is the value at time=0
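
To make the argument order concrete, here is a small sketch that lists the (time value) pairs described by a PWLV argument list. PWLV-BREAKPOINTS is a hypothetical helper written for this post, not part of Nyquist; it only rearranges the numbers and does not create a sound.

;; Hypothetical helper (not part of Nyquist): return the list of
;; (time value) breakpoints described by PWLV arguments l1 t2 l2 ... tn ln.
(defun pwlv-breakpoints (args)
  (let ((result (list (list 0 (car args))))   ; implied start time of 0
        (remaining (cdr args)))
    ;; consume the remaining arguments two at a time: time, then level
    (do ()
        ((or (null remaining) (null (cdr remaining))) (reverse result))
      (setq result (cons (list (car remaining) (cadr remaining)) result))
      (setq remaining (cddr remaining)))))

;; The arguments used above describe the points (0, 0.3), (0.7, 0.9), (1, 0).
(print (pwlv-breakpoints '(0.3 0.7 0.9 1 0)))

In the Nyquist prompt the printed list should show up in the debug output.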

Back to the Time Warp:

(pwlv 0 oldseconds warpednewseconds)

This creates a “Sound” (at the Control Rate) that rises from 0,0 to oldseconds,warpednewseconds
The final X = oldseconds
The final Y = warpednewseconds

The PWL envelope creates a mapping:
The “X” values represent the “Real” time of our original signal.
The “Y” values represent the times that we are mapping them to.

(warp fn beh) http://www.cs.cmu.edu/~rbd/doc/nyquist/part8.html#index570
“Evaluates beh with warp modified by fn. The idea is that beh and fn are written in the same time system, and fn warps that time system to local time.”

Strictly speaking we should not be applying “warp” to “S” because “S” is a “Sound” and not a “behaviour”, but in this case it does what we want because we want to work with the “real” start and stop time of both “S” and “fn”.

(get-duration 1) tells us the “real” time for the duration of the sound “S”.
(/ len *sound-srate*) is the number of samples divided by the sound sample rate, so that is also the “real” time for the duration of our input sound “S”.
(pwlv y0 (get-duration 1) y1) creates a control signal that has a duration equal to the duration of our input sound “S”. The initial level is y0 and the final level is y1.

If we want to do something fancier and apply a variable stretch to the sound, then we need to do it properly and apply warp to a behaviour.
We can do that by using the function “SOUND” to apply warp to the sound “S”.

sound http://www.cs.cmu.edu/~rbd/doc/nyquist/part8.html#index320

;; X values are the logical times of the input sound
;; Y values are the logical times of the stretched sound
(setq X0 0)  ;start of input sound
(setq Y0 0)  ;start of stretched sound
(setq X1 0.5)  ;half way through input sound
(setq Y1 0.1)  ;one tenth of the way through stretched sound
(setq X2 1)  ;end of input sound
(setq Y2 1)  ;end of stretched sound

;; convert logical times to real times
(setq X0 (get-duration X0))
(setq Y0 (get-duration Y0))
(setq X1 (get-duration X1))
(setq Y1 (get-duration Y1))
(setq X2 (get-duration X2))
(setq Y2 (get-duration Y2))

;; apply *warp* to sound
(setq beh (sound s))

(sound-warp (pwlv y0 x1 y1 x2 y2) beh)

Try applying this to a faded out tone or a click track so that you can see what happens.
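
On the earlier question about scaling a list of breakpoints before handing it to pwlv-list, here is a minimal sketch. SCALE-BREAKPOINTS is a hypothetical helper written for this post, not part of Nyquist. It assumes the list uses the PWLV layout (level, time, level, ... time, level), that a mono track is selected, and that the environment is a plain stretch (as it appears to be for an Audacity process plug-in), so that multiplying by (get-duration 1) gives the same numbers as calling get-duration on each value as in the example above.

;; Hypothetical helper (not part of Nyquist): scale a PWLV-style
;; breakpoint list (l1 t2 l2 ... tn ln). Levels are multiplied by
;; level-scale and times by time-scale.
(defun scale-breakpoints (breakpoints time-scale level-scale)
  (let ((result nil)
        (is-level t))   ; the list starts with a level, then alternates
    (dolist (item breakpoints)
      (setq result
            (cons (* item (if is-level level-scale time-scale)) result))
      (setq is-level (not is-level)))
    (reverse result)))

;; Same breakpoints as above, written once in logical time (0 to 1):
;; Y0 X1 Y1 X2 Y2
(setq logical-bp (list 0 0.5 0.1 1 1))

;; Convert logical times to real times in one step.
(setq real-bp
      (scale-breakpoints logical-bp (get-duration 1) (get-duration 1)))

(sound-warp (pwlv-list real-bp) (sound s))

Because the breakpoints are kept in a plain list, they can be calculated once and rescaled later with different factors when the final length is known.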