Code to Select Audio

Hello,

is there a code to select a certain range of samples within an audio selection, let’s say something like “from sample 35000 to sample 45000” (could also be in seconds or any other form of time code like that)?

The idea is, for example, to find the loudest section in a track (using ReplayGain or RMS etc.).

Thanks in advance!

Selecting from 10.0 to 11.0 seconds within the selection:

;version 4
(setf start 10)
(setf end 11)
(extract-abs start end (cue *track*))

Selecting from sample 35000 to sample 45000 within the selection:

;version 4
(setf start 35000)
(setf end 45000)

(extract-abs (/ start *sound-srate*)
             (/ end *sound-srate*)
             (cue *track*))



I wouldn’t do it that way.

Steve,

you kinda should’ve known what comes next: how would you have done it? :slight_smile:

Just so you don’t think I have no (bad) ideas of my own: I would take the shortest selection to run either RMS or ReplayGain, slide it (so to speak) through a track, remember the loudest slice and its location, and forget the others. Actually I’m looking for a way to compare the loudness of tracks and amplify to a given average. I just don’t agree with RMS, RMS (A) and ReplayGain, as they take the average of the whole track. Take a track and a second version of it with the same peak but with a quiet interlude added: all three, RMS, RMS (A) and ReplayGain, will report a different loudness for the two versions. Comparing only the loudest part (which would be the same for both versions) should reveal that they have the same loudness. In theory…

I was wondering how long it would be 'till you asked :smiley:

RMS is not bad as a first approximation for comparing loudness. For a better approximation, apply a bit of bass and treble roll-off (approximating an inverse “equal loudness curve”) before measuring the RMS. For simplicity, I’ll use plain RMS as an example.

There are a couple of problems with extracting each short section in turn:

  1. It’s slow
  2. It uses a lot of RAM

Fortunately, Nyquist has a built-in RMS function (documented under “Nyquist Functions”).
This function is pretty quick, and is not inherently memory hungry.

When measuring RMS, (the square root of the mean of the signal squared), because we are looking at averages (the mean), each RMS value represents a measurement within a time window. By default, Nyquist’s RMS function measures every 1/100th of a second, but we can set different rates.

In your initial question, you specified a “time window” of 10000 samples, which at 44100 sample rate is a little under 1/4 second, so let’s use that as an example. We can get the RMS for each 1/4 second of audio, by setting the “rate” parameter to 4 Hz.

For a mono track:

;version 4
(rms *track* 4)

If you apply the above code to a 30 second mono track, it produces a signal that is 120 samples long (4 measurements per second x 30 seconds).
Each sample represents 1/4 second of the original audio.

We can then loop through the 120 samples, and find the highest value, which will tell us the maximum RMS of the selected audio in any 1/4 second window.

The complete code for a mono track (this can be run in the Nyquist Prompt):

;version 4

(let ((rms (rms *track* 4))
      (highest 0))
  (do ((val (snd-fetch rms)(snd-fetch rms)))
      ((not val) (linear-to-db highest))
    (setf highest (max highest val))))
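If an exact 10000-sample window is wanted rather than the 1/4 second approximation, the optional third argument of RMS (the window size in samples) can be used, and the rate can be chosen so that windows overlap, which is closer to the "sliding" idea. This is only a sketch for a mono track: WINDOW and HOP are illustrative names, and the hop of 2500 samples is an assumed value, not something from the posts above.

```lisp
;version 4

(setf window 10000)  ; window length in samples
(setf hop 2500)      ; assumed step between window starts, in samples

;; rate = sample rate / hop, so RMS takes one measurement every HOP samples,
;; each one averaged over WINDOW samples (windows overlap when hop < window)
(let ((rms-sig (rms *track* (/ *sound-srate* (float hop)) window))
      (highest 0))
  (do ((val (snd-fetch rms-sig) (snd-fetch rms-sig)))
      ((not val) (linear-to-db highest))
    (setf highest (max highest val))))
```

A smaller hop gives a finer search at the cost of more measurements to loop through.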

Wouldn’t this mean that you work on each channel on a stereo track independently? (Don’t get me wrong: I guess for my personal usage that would be a close enough approximation. Just/still trying to understand/learn!)

I’m not quite sure about the let construct. Could one say (idiot’s level-wise) that…

(let ((rms (rms *track* 4))
	(highest 0))

…works as an extremely localized pseudo-function in the following code…

(do ((val (snd-fetch rms)(snd-fetch rms)))
      ((not val) (linear-to-db highest))
    (setf highest (max highest val))))

…and only there?

Could it work also something like that? :

(defun myrms (rms *track* 4)
	(highest 0))

And then? :

(format nil (((val (snd-fetch myrms)(snd-fetch myrms)))
		  ((not val) (linear-to-db highest))
		(setf highest (max highest val)))

Of course the code doesn’t work, just trying to turn it into a function that can be called from somewhere else.

At least I got the following code to work in the Nyquist Prompt:

(setq FMT-db "%#.1f")
(setq FMT-val "%#.6f")

(defun fmt-pretty (dformat &rest args)
  (prog2 
    (setf *float-format* dformat)
    (apply #'format args)
    (setf *float-format* "%g")))

(setf myrms (let ((rms (rms *track* 4))
       (highest 0))
   (do ((val (snd-fetch rms)(snd-fetch rms)))
       ((not val) (linear-to-db highest))
    (setf highest (max highest val)))))

(fmt-pretty FMT-val nil "~a" myrms)

(format nil "Maximum mean RMS: ~a" myrms)

Here’s a pretty print version:

;version 4

(setf *float-format* "%.2f")  ;2 decimal places

(let ((rms (rms *track* 4))
      (highest 0))
  (do ((val (snd-fetch rms)(snd-fetch rms)))
      ((not val))
    (setf highest (max highest val)))
  ;; HIGHEST is a linear value, so convert it to dB when printing
  (format nil "Highest RMS is ~a dB" (linear-to-db highest)))

A good description of “LET” here: https://www.audacity-forum.de/download/edgar/nyquist/nyquist-doc/xlisp/xlisp-ref/xlisp-ref-148.htm

And a description of a “DO” loop here: https://www.audacity-forum.de/download/edgar/nyquist/nyquist-doc/xlisp/xlisp-ref/xlisp-ref-093.htm

An even more basic introduction can be found here: let (Programming in Emacs Lisp)

Not quite like a function but its description reminded me of one for a reason, upon reading that.

I’ll try to find a way to analyze a stereo track. Prof. Steve will be there to correct my code, right? :wink:

I seem to have succeeded halfway:

;version 4

(setq FMT-db "%#.1f")
(setq FMT-val "%#.6f")

(defun fmt-pretty (dformat &rest args)
  (prog2 
    (setf *float-format* dformat)
    (apply #'format args)
    (setf *float-format* "%g")))

(if (arrayp *track*)
	(vector (setf mra0 (let ((rms (rms (aref *track* 0) 4))
       (highest 0))
   (do ((val (snd-fetch rms)(snd-fetch rms)))
       ((not val) (linear-to-db highest))
    (setf highest (max highest val)))))
	(setf mra1 (let ((rms (rms (aref *track* 1) 4))
       (highest 0))
   (do ((val (snd-fetch rms)(snd-fetch rms)))
       ((not val) (linear-to-db highest))
    (setf highest (max highest val)))))
	(setq myrms (max mra0 mra1)))
	(setq myrms (let ((rms (rms *track* 4))
       (highest 0))
   (do ((val (snd-fetch rms)(snd-fetch rms)))
       ((not val) (linear-to-db highest))
    (setf highest (max highest val))))))

(fmt-pretty FMT-val nil "~a" myrms)

(format nil "Maximum mean RMS: ~a" myrms)

It does return the right RMS for a mono track, and it seems to work on a stereo track, too. But I need to test it further.

If it does work, next step would be running it in a multi-track environment so I can compare albums too.

I’ve reformatted your code so that it is easier to read and debug:

;version 4

(setq FMT-db "%#.1f")
(setq FMT-val "%#.6f")

(defun fmt-pretty (dformat &rest args)
  (prog2 
    (setf *float-format* dformat)
    (apply #'format args)
    (setf *float-format* "%g")))

(if (arrayp *track*)
  (vector (setf mra0
                (let ((rms (rms (aref *track* 0) 4))
                      (highest 0))
                  (do ((val (snd-fetch rms)(snd-fetch rms)))
                      ((not val) (linear-to-db highest))
                    (setf highest (max highest val)))))
          (setf mra1
                (let ((rms (rms (aref *track* 1) 4))
                      (highest 0))
                  (do ((val (snd-fetch rms)(snd-fetch rms)))
                      ((not val) (linear-to-db highest))
                   (setf highest (max highest val)))))
          (setq myrms (max mra0 mra1)))
  (setq myrms
        (let ((rms (rms *track* 4))
              (highest 0))
          (do ((val (snd-fetch rms)(snd-fetch rms)))
              ((not val) (linear-to-db highest))
            (setf highest (max highest val))))))

(fmt-pretty FMT-val nil "~a" myrms)

(format nil "Maximum mean RMS: ~a" myrms)

Do you now see that your VECTOR statement creates an array with 3 elements: mra0, mra1, and myrms?
(Tip: Use spaces for indenting, and not TABs.)
What is the purpose of that array?

The correct way to get the RMS of a stereo track, is to calculate the “mean square” of each channel, then calculate the average of the two mean square signals, then take the square root.
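As a rough sketch of that recipe (not a tested plug-in): the code below squares each channel, averages each squared signal per window with SND-AVG, averages the two mean-square signals, and takes the square root with S-SQRT. MEAN-SQUARE and STEREO-RMS are hypothetical helper names introduced here for illustration, and the 4 Hz rate is carried over from the earlier examples.

```lisp
;version 4

(defun mean-square (sig step)
  ;; average of the squared signal over consecutive blocks of STEP samples
  (snd-avg (mult sig sig) step step op-average))

(defun stereo-rms (sig rate)
  (let* ((step (round (/ (snd-srate (aref sig 0)) rate)))
         (ms-l (mean-square (aref sig 0) step))
         (ms-r (mean-square (aref sig 1) step)))
    ;; average the two mean-square signals, then take the square root
    (s-sqrt (mult 0.5 (sum ms-l ms-r)))))

(if (arrayp *track*)
    (let ((rms-sig (stereo-rms *track* 4))
          (highest 0))
      (do ((val (snd-fetch rms-sig) (snd-fetch rms-sig)))
          ((not val) (linear-to-db highest))
        (setf highest (max highest val))))
    "Please select a stereo track.")
```

Because the channels are combined window by window before the search for the maximum, a window counts as loud only if the two channels together are loud in that same window.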

If you only require an approximation, and the stereo tracks have similar left and right channels, then it may be sufficient (and easier, and quicker) to only look at one channel of stereo tracks.
Whereas the audio from a mono track can be accessed with *TRACK*, the audio from the left channel of a stereo track can be accessed with

(aref *track* 0)

Do you see how this works?

;version 4

(setf *float-format* "%.2f")  ;2 decimal places

(defun max-rms (sig)
  (let ((rms (rms sig 4))
        (highest 0))
    (do ((val (snd-fetch rms)(snd-fetch rms)))
        ((not val))
      (setf highest (max highest val)))
    ;; HIGHEST is a linear value, so convert it to dB when printing
    (format nil "Highest RMS is ~a dB" (linear-to-db highest))))

(if (arrayp *track*)
    (max-rms (aref *track* 0))
    (max-rms *track*))

Hi Steve,

I’ll try to give some insight into my thoughts on this. For starters, I prefer a result that is as exact as possible and as little approximate as feasible.

The idea of my array was to calculate the highest RMS (per time window) for each channel (2 for stereo, 1 for mono tracks) and “store” it in a “variable” that can be returned later on. A stereo track will result in two highest RMS values, one for each channel (stored in mra0 and mra1), and since I am looking for the highest RMS, I take the higher one, which I store in myrms. For a mono track I go directly for myrms in one go. (I hope you can follow me. LISP doesn’t seem to have variables the same way as a procedural language, the programming paradigm I’m more familiar with.)

The major problem was turning your code into a function, though my own code looked almost identical, except for the (format…) and possibly *track* instead of sig. Btw, for anyone like me: I’ve checked, sig can be replaced by any other name (unless already used). I could even use my first name :mrgreen:

I wonder why the following code returns nothing:

;version 4

(setf *float-format* "%.3f")  ;3 decimal places

(defun max-rms (sig)
  (let ((rms (rms sig 4))
        (highest 0))
    (do ((val (snd-fetch rms)(snd-fetch rms)))
        ((not val))
      (setf highest (max highest val)))
    (format nil "Highest RMS is ~a dB~%" (linear-to-db highest))))

(if (arrayp *track*)
    (vector
        (max-rms (aref *track* 0))
        (max-rms (aref *track* 1)))
    (max-rms *track*))

The debug output remains empty.

While this remains to be sorted out, for the next step I want to be able to select several tracks in Audacity and run the code to get the highest RMS across all selected tracks, not the highest RMS for each track, one after the other (as the last working code does). For that I thought of using the *SCRATCH* symbol.

Naturally and needless to say - it doesn’t work:

;version 4

(setf *float-format* "%.2f")  ;2 decimal places
; (putprop '*SCRATCH* (linear-to-db 0) 'hbb-album-highestrms)  ;; doesn’t work with or without and I am not sure whether it is needed.

(defun max-rms (sig)
  (let ((rms (rms sig 4))
        (highest 0))
    (do ((val (snd-fetch rms)(snd-fetch rms)))
        ((not val) (linear-to-db highest))
      (setf highest (max highest val)))
    (setf highestrms (get '*SCRATCH* 'hbb-album-highestrms))
    (setf highestrms (max highest highestrms))
    (putprop '*SCRATCH* highestrms 'hbb-album-highestrms)
  )
)

(if (arrayp *track*)
    (max-rms (aref *track* 0))
    (max-rms *track*))

(format nil "Highest RMS is ~a dB" (get '*SCRATCH* 'hbb-album-highestrms))

Is there a Nyquist plugin out there that operates with a similar approach so I can have a look at its code?

Some pointers:

LISP does have variables, but they may look a bit different.

LISP based languages use “S-expressions” (“symbolic expression”). This takes a little getting used to, but it’s rather nice once you’ve got used to it because just about everything works the same way. Every expression is enclosed in parentheses “( )”
The first item in the parentheses is the command, operator, macro name or function name, which is then followed by the function’s arguments.
Example:
“setf” is a function that assigns a value to a variable.
Whereas in Basic (if I recall correctly) we would write:

variable = value

where “=” is the operator / function that assigns a value to a variable
In a LISP based language we would write:

(setf variable value)

There’s an example of this in my previous code:

(setf *float-format* "%.3f")

Here we assign the string literal (a string value) “%.3f” to the variable *float-format*.
When you see a variable that has asterisks either side, that should alert you to the fact that it is special in some way. In this case, the variable *float-format* is a variable that is used by Nyquist itself, to define how decimal (floating point) numbers are formatted when printed. The value “%.3f” is a “magic” string value that tells Nyquist to print decimal numbers with 3 decimal places.


There’s a problem with taking the higher of the two channel values (if you want accuracy). If in one part of the stereo track you have a loud sound in one channel and silence in the other, and in another part you have a fairly loud sound (but not quite as loud as the first loud sound) in both channels, the part with the fairly loud sound in both channels will sound louder than the loud sound in one channel only.

As for your code that returns nothing: when applied to a mono track, it returns a string value from the final line of the function “max-rms”.

When applied to a stereo track, the code creates an array (a “vector”) containing two string values. An array of strings is not directly printable, so you see no output. However, if you look in Audacity’s log, you will see the error:

Nyquist returned nyx_error:

One way to show the two strings, would be to concatenate them like this:

(if (arrayp *track*)
    (strcat (max-rms (aref *track* 0)) (max-rms (aref *track* 1)))
    (max-rms *track*))

or: (both versions are identical)

(if (arrayp *track*)
    (strcat (max-rms (aref *track* 0))
            (max-rms (aref *track* 1)))
    (max-rms *track*))

I wasn’t sure about the variables as the term never ever appears in any tutorial or reference. It sometimes looked like a variable, sometimes it doesn’t, and since I’m new to Lisp and the likes I didn’t think I got that right. Are these symbols/variables accessible from anywhere or is a symbol that has been established in a function only available in the function? (I believe they’re accessible from anywhere during a single execution of the plugin.)

What you’ve said about the stereo track leads me to believe that the max-rms function alone will not do the trick on a stereo track, as it always operates on the whole channel and only one channel at a time.

I kinda get back to your extract-abs idea from the beginning. Here I could get the RMS for the time window, first left, then right, calculate the joint RMS (something like (RightRMS * LeftRMS) ^ 0.5), store it in some variable, get to the next time window and so on. Maybe I could even use your ReplayGain plugin… at least, part of it. (Isn’t ReplayGain the better option for measurements of loudness? Yes, far more complicated…)

Variables may be “global” (available anywhere), or “local” (only available within a limited scope).
Try running these examples using the Debug button in the Nyquist Prompt:

(setf I-am-a-global-variable 42)  ;sets the value of a global variable to 42

(defun a-function ()
  (print I-am-a-global-variable))  ;prints "42" to the debug window if this function is called

;; call the function:
(a-function)

(print "Done")



(defun test ()
  (setf  I-am-a-global-variable 42))

;; call the function:
(test)

(print  I-am-a-global-variable) ;This only exists after we have called the function



(defun test ()
  (let ((i-am-a-local-variable 42))
    (print i-am-a-local-variable))) ;prints "42"

;; call the function:
(test)

(print i-am-a-local-variable) ;error. The local variable does not exist here.

The RMS function does only act on one channel at a time, but we can temporarily assign the RMS of each channel to a variable. In Nyquist, sounds and signals are data types, just like numbers, characters and strings.

Example:

(if (arrayp *track*)
    (progn ;starting a block of code
      (setf rms-l (rms (aref *track* 0) 4))
      (setf rms-r (rms (aref *track* 1) 4))
      (mult 0.5 (sum rms-l rms-r))) ;return the average of the two RMS signals
    (rms *track* 4))
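Building on that, here is a sketch of how the combined signal could be searched for the loudest window and its position. It uses the simple average of the two RMS signals from the example above (an approximation, not the mean-square method). COMBINED-RMS, RATE and LOUDEST-INDEX are illustrative names, not part of Nyquist.

```lisp
;version 4

(setf rate 4)  ; windows per second, as in the earlier examples

(defun combined-rms (sig)
  ;; one value per window: average of left and right RMS for stereo,
  ;; plain RMS for mono
  (if (arrayp sig)
      (mult 0.5 (sum (rms (aref sig 0) rate)
                     (rms (aref sig 1) rate)))
      (rms sig rate)))

(let ((rms-sig (combined-rms *track*))
      (highest 0)
      (loudest-index 0))
  ;; I counts the windows while VAL steps through the RMS values
  (do ((val (snd-fetch rms-sig) (snd-fetch rms-sig))
       (i 0 (1+ i)))
      ((not val))
    (when (> val highest)
      (setf highest val)
      (setf loudest-index i)))
  (format nil "Highest RMS is ~a dB, in the window starting at ~a seconds"
          (linear-to-db highest)
          (/ loudest-index (float rate))))
```

The reported time is measured from the start of the selection. Since left and right are combined for each window before the loop runs, the maximum found is the window with the highest combined value, not the separate maxima of the two channels.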

Right now I don’t have much time, I’ll get back to that later.

I think I understand variables better now.

  1. You’ve created a global variable outside a function and thus it is available everywhere after that line of code.
  2. You’ve created a function that creates a global variable. The variable is available everywhere once the function has been called.
  3. You’ve created a strictly local variable which is only available to the code within the LET inside the function.

As for your arrayp code: so it is possible to code it that way (as I intended), but using a vector (as described in the Stereo Track Tutorial) was the wrong way. But still, wouldn’t it find the highest RMS on the left channel (say in the 4th time window) and the highest RMS on the right channel (say in the 6th time window), while it is the 5th time window that has the highest average of left and right RMS?