Dereverb

steve · October 7, 2022, 2:40pm

Comments are not really necessary here.

(defun process_band (sig ln params) ;; params = (hpass lpass R_offset T_offset)
  ;; Extract parameters
  (setf hpass (nth 0 params))
  (setf lpass (nth 1 params))
  (setf R_offset (nth 2 params))
  (setf T_offset (nth 3 params))
  ...

This is just as clear, since it is obvious that you are unpacking “params”, and it avoids the risk of names colliding with global variables:

(defun process_band (sig ln params)
  (let ((hpass (nth 0 params))
        (lpass (nth 1 params))
        (R_offset (nth 2 params))
        (T_offset (nth 3 params)))
    ...
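
A small sketch of the name-collision risk, which should run as-is in the Nyquist Prompt (use Debug to see the printed values). The function names unpack-with-let and unpack-with-setf and the numbers are made up for illustration; they are not part of the plug-in:

```lisp
(setf hpass 100)  ; pretend this global is used elsewhere in the plug-in

(defun unpack-with-let (params)
  ;; LET creates a new local HPASS; the global is left untouched
  (let ((hpass (nth 0 params)))
    hpass))

(defun unpack-with-setf (params)
  ;; no local HPASS exists here, so SETF assigns to the global
  (setf hpass (nth 0 params))
  hpass)

(unpack-with-let '(500 2000))
(print hpass)   ; still 100
(unpack-with-setf '(500 2000))
(print hpass)   ; now 500, the global was overwritten
```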

jozefh · October 7, 2022, 3:17pm

In this case, “sig” is local to the LET block, which Nyquist deals with as a “block” (limited “scope”). Nyquist knows that “sig” is local to the block, so it knows that it does not need to hang onto the samples beyond the scope of the block.

Thank you. I have applied this to the band processing function, but the same amount of RAM is still occupied. So why does it still keep the data of already processed bands in RAM? Sorry, I still do not understand this.

(defun process_band (sig ln params)
  (let ((hpass (nth 0 params))
        (lpass (nth 1 params))
        (R_offset (nth 2 params))
        (T_offset (nth 3 params))
        (gatefollow 0)
        (output 0))
        ;; Isolate frequency band with HPF and LPF
        (when (> hpass 0) (setf sig (highpass8 sig hpass)))
        (when (> lpass 0) (setf sig (lowpass8 sig lpass)))
        ;; calculate the input variables for the noisegate command
        (setf gatefollow (gate-follow sig))
        (setf reduce (db-to-linear (+ reduction R_offset)))
        (setf threshold (db-to-linear (+ (get-rms sig ln) sensitivity T_offset)))
        ;; process the fq band with noisegate
        (setf output (multichan-expand #' noisegate sig gatefollow look attack release reduce threshold))))

jozefh · October 7, 2022, 3:34pm

I am thinking about how to overcome the selection duration limitation. I have this idea:

Identify all blocks of the recorded speech and store the time identification in the list array.
Then process the blocks one by one.

In the RAM there should be only this amount of data:
[data of the track] + [multiple times of one block size] + [output signal]

I expect that this should make it possible to select and process audio longer than one hour.

Do you think that this can work?

steve · October 7, 2022, 3:52pm

Getting this right (so that garbage collection can work) will be tricky.
My guess is that it will be possible, but to do so we need to ensure that we’re not holding onto samples anywhere in the code.

Here’s an example of the type of problem.
So that we don’t need to test on really long tracks, we can limit the amount of RAM available for holding audio data by calling snd-set-max-audio-mem with a small value.

Here’s a filter that can do low-pass, high-pass, or band-pass (depending on what parameters we send):

(defun filter (sig low high)
  (when low
    (setf sig (highpass8 sig low)))
  (if high
    (lowpass8 sig high)
    sig))

and use it like this:

(filter *track* nil 1000)   ;low-pass

(filter *track* 1000 2000)  ;band-pass

(filter *track* 2000 nil)   ;high-pass

Set up a list of parameters:

(setf params (list '(nil 1000) '(1000 2000) '(2000 nil)))

Now if we try this, it will run out of sample memory and fail:

(snd-set-max-audio-mem 1000)

(defun filter (sig low high)
  (when low
    (setf sig (highpass8 sig low)))
  (if high
    (lowpass8 sig high)
    sig))


(setf params (list '(nil 1000) '(1000 2000) '(2000 nil)))


(setf band1 (filter *track* nil 1000))
(setf band2 (filter *track* 1000 2000))
(setf band3 (filter *track* 2000 nil))

(sum band1 band2 band3)

The problem is that we are hanging onto the samples in band1, band2, and band3 so that we can use them in the final line.

On the other hand, this should work:

(snd-set-max-audio-mem 1000)

(defun filter (sig low high)
  (when low
    (setf sig (highpass8 sig low)))
  (if high
    (lowpass8 sig high)
    sig))


(setf params (list '(nil 1000) '(1000 2000) '(2000 nil)))

(sum (filter *track* nil 1000)
     (filter *track* 1000 2000)
     (filter *track* 2000 nil))

and so should this:

(snd-set-max-audio-mem 1000)

(defun filter (sig low high)
  (when low
    (setf sig (highpass8 sig low)))
  (if high
    (lowpass8 sig high)
    sig))


(setf params (list '(nil 1000) '(1000 2000) '(2000 nil)))

(setf out 0)
(dolist (p params out)
  (setf out (sum out (filter *track* (first p) (second p)))))

and even this:

(snd-set-max-audio-mem 1000)

(defun filter (sig low high)
  (when low
    (setf sig (highpass8 sig low)))
  (if high
    (lowpass8 sig high)
    sig))


(setf params (list '(nil 1000) '(1000 2000) '(2000 nil)))

(let ((band1 (filter *track* nil 1000))
      (band2 (filter *track* 1000 2000))
      (band3 (filter *track* 2000 nil)))
  (sum band1 band2 band3))

steve · October 7, 2022, 3:54pm

Note that it’s not only in the filter that we need to be careful. Anywhere that there are parallel processing paths is potentially at risk, including the multiple “gate” functions.

steve · October 7, 2022, 4:03pm

Did I say previously that using these types of filter is probably not the best approach for making a de-verb effect? This is a terrific learning exercise, so it’s worth working with for a while, but I don’t think it’s worth getting too stressed about.

A better (although quite difficult) approach would be to use FFT to create a large number of frequency bands and approach it as a DSP problem. I think this would be an excellent project for the future (when you’ve had a chance to become much more familiar with Nyquist). There’s an FFT tutorial somewhere - I’ll see if I can dig it out.

steve · October 7, 2022, 4:04pm

As I said, it’s quite complicated, but here it is: Nyquist FFT and Inverse FFT Tutorial

jozefh · October 7, 2022, 5:48pm

I think I finally understood. That’s a great explanation!

As for FFT - I will look into it, but not on a Friday night.
I will keep you posted.

jozefh · October 10, 2022, 3:52pm

Steve, good news! I was able to overcome the selection limitation! The limit for a selection is now raised to 2 hours and 25 minutes.

I have attached the updated version to the first post. I spent a couple of hours during the weekend optimizing the code, and the biggest game changer was swapping gate for snd-gate. I do not understand why, but gate keeps everything in RAM until the processing is over, whereas snd-gate does not consume RAM at all.

The RAM consumed by Dereverb is now only the size of the selected audio (32-bit).
For example, when I select and process 60 minutes (48000 Hz), the RAM consumption increases by about 660 MB, which matches 48000 Hz * 32 bit * 3600 s / 8 / 1024 / 1024 ≈ 659 MB.

I set the limit to 417600000 samples, which is 2 h 25 min at 48000 Hz. If the selection is longer, an error message is shown.
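
The arithmetic above as a small sketch, for checking other sample rates or durations. The function names are hypothetical (not from the plug-in), and 4 bytes per sample is assumed from the 32-bit figure:

```lisp
;; Estimated RAM, in MB, for a mono selection held as 32-bit samples.
(defun selection-megabytes (srate seconds)
  (/ (* (float srate) seconds 4.0)   ; total bytes (4 bytes per sample)
     (* 1024.0 1024.0)))             ; bytes per MB

;; Duration, in minutes, of a given number of samples.
(defun samples-to-minutes (samples srate)
  (/ (float samples) (* 60.0 srate)))

(print (selection-megabytes 48000 3600))      ; about 659.18 MB for 60 minutes
(print (samples-to-minutes 417600000 48000))  ; 145 minutes = 2 h 25 min
```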

steve · October 10, 2022, 9:17pm

Congratulations. I’m impressed.

I’m not able to look at the new code right now, but will do as soon as I get a chance.

steve · October 11, 2022, 11:33am

Actually SND-GATE also retains samples, but it’s probably easier for Nyquist to do garbage collection after each run.
I think there’s a clue to why the two behave differently in the documentation:
https://www.cs.cmu.edu/~rbd/doc/nyquist/part8.html#index738

The result is delayed by lookahead, so the output is not actually synchronized with the input. To compensate, you should drop the initial lookahead of samples. Thus, snd-gate is not recommended for direct use. Use gate instead.

So GATE uses SND-GATE, but also trims the output to compensate for the lookahead delay. In order to trim the output of SND-GATE, it must temporarily store the samples in memory.

In the latest version of Audacity, GATE has been updated. Here’s an extract from the GATE code: https://github.com/audacity/audacity/blob/master/nyquist/nyquist.lsp
The long comment is interesting, and shows that garbage collection can be tricky to get right, even for Roger.

  (let (s) ;; s becomes sound after collapsing to one channel
    (cond ((arrayp sound)           ;; use s-max over all channels so that
           (setf s (aref sound 0))  ;; ANY channel opens the gate
           (dotimes (i (1- (length sound)))
             (setf s (s-max s (aref sound (1+ i))))))
          (t (setf s sound)))
    (setf s (snd-gate (seq (cue s)
                           (stretch-abs 1.0 (s-rest lookahead)))
                      lookahead risetime falltime floor threshold))
    ;; snd-gate delays everything by lookahead, so this will slide the sound
    ;; earlier by lookahead and delete the first lookahead samples
    (prog1 (snd-xform s (snd-srate s) (snd-t0 s)
                      (+ (snd-t0 s) lookahead) MAX-STOP-TIME 1.0)
           ;; This is *really* tricky. Normally, we would return now and
           ;; the GC would free s and sound which are local variables. The
           ;; only references to the sounds once stored in s and sound are
           ;; lazy unit generators that will free samples almost as soon as
           ;; they are computed, so no samples will accumulate. But wait! The
           ;; 2nd SEQ expression with S-REST can reference s and sound because
           ;; (due to macro magic) a closure is constructed to hold them until
           ;; the 2nd SEQ expression is evaluated. It's almost as though s and
           ;; sound are back to being global variables. Since the closure does
           ;; not actually use either s or sound, we can clear them (we are
           ;; still in the same environment as the closures packed inside SEQ,
           ;; so s and sound here are still the same variables as the ones in
           ;; the closure. Note that the other uses of s and sound already made
           ;; copies of the sounds, and s and sound are merely references to
           ;; them -- setting to nil will not alter the immutable lazy sound
           ;; we are returning. Whew!
           (setf s nil) (setf sound nil)))

steve · October 11, 2022, 11:51am

Be careful when copy/pasting snippets of code.

You have:

  (when (< len 100) ; 100 samples required 
    ;; Work around bug 2012.
    (throw 'err (format nil (_ "~%Insufficient audio selected.
Make the selection longer than ~a ms.")
                        (round-up (/ 100000 *sound-srate*)))))

I guess you didn’t look up what “bug 2012” is.
It’s an old bug, logged here: https://bugzilla.audacityteam.org/show_bug.cgi?id=2012
The error message:

(_ "~%Insufficient audio selected.
Make the selection longer than ~a ms.")

is written as a translatable string, but this is not a built-in effect and there are no translations for this plug-in.
Better to write it like this:

  (when (< len 100) ; 100 samples required
    (throw 'err (format nil "~%Insufficient audio selected.~%~
                             Make the selection longer than ~a ms."
                        (round-up (/ 100000 *sound-srate*)))))

Note that the message can be formatted using normal indentation rules for readability.
“~%” is a “format specifier” that means: “start a new line”.
The final “~” at the end of a line is a format specifier that means: “ignore the line break and any leading whitespace on the next line”.
Format specifiers are documented here: https://www.audacity-forum.de/download/edgar/nyquist/nyquist-doc/xlisp/xlisp-ref/xlisp-ref-121.htm

You are also using “translatable strings” in other places, which should really be ordinary strings, unless you add translations to the plug-in.
There’s some info about translations here: https://wiki.audacityteam.org/wiki/Nyquist_Plug-ins_Reference#Plug-in_Translations
but personally I wouldn’t bother - unless you’re writing a plug-in to be shipped as part of the standard Audacity bundle, just use normal strings. The translation mechanism for Nyquist plug-ins is a rather hacky add-on imo.

steve · October 11, 2022, 11:54am

This function is very badly named (misleading):

(defun round-up (num)
  (round (- num 0.5)))
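
To see why the name is misleading: assuming ROUND rounds to the nearest integer, (round (- num 0.5)) gives 2 for an input of 2.7, so the function rounds down rather than up. A minimal sketch of a helper that does round up, using only TRUNCATE; the name ceiling-int is hypothetical and not part of the plug-in:

```lisp
;; Round NUM up to the nearest integer.
(defun ceiling-int (num)
  (let ((n (truncate num)))   ; TRUNCATE drops the fractional part (rounds toward zero)
    (if (> num n)
        (1+ n)                ; positive fractional part: go up to the next integer
        n)))                  ; already whole, or negative: truncation is the ceiling

(print (ceiling-int 2.083))   ; 3 (100 samples at 48000 Hz is about 2.083 ms)
(print (ceiling-int 2.0))     ; 2
```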

steve · October 11, 2022, 12:16pm

Testing this plug-in in the Nyquist Prompt in Audacity 3.2.1 (AppImage version for Linux), the “Preview” button doesn’t work. This confused me for a while because I couldn’t see anything wrong in your plug-in code to account for it.
This problem is not in your code. The problem is in Audacity 3.2.1: “Preview” no longer works when running a plug-in from the Nyquist Prompt.
I’ll log this bug.

I think that concludes my review. Well done - there were some tricky issues to deal with, and you solved them.

steve · October 11, 2022, 12:27pm

Done: Preview fails when running plug-in from Nyquist Prompt · Issue #3798 · audacity/audacity · GitHub

jozefh · October 12, 2022, 8:27am

Thank you, Steve, for your kind mentoring. I have fixed the issues that you pointed out above. The plugin is updated.

Just for the record… What next?

Stereo signal - I have tried to implement the command (multichan-expand #’ …), but I have realized that it is not that straightforward anymore. I do not pass the signal through the whole processing chain as we normally do. Of course, it is possible to overcome this. Although I do not see huge value in adding it, I will try to work on it later.
FFT craziness - I went through the FFT tutorial. I think I understood the concept. In my mind, I walked through implementing it in this plugin. When I ran the simulations in the mathematical model, I found two points of concern:

The core postulate, RMS = Gate Threshold as a sweet spot, may not work when the signal is broken down into frequencies. It is not a problem though; I just need to test it, and if it does not work, I need to come up with a different, out-of-the-box core postulate.
I wonder how the memory consumption will behave… I expect to work with the envelopes of the respective frequencies, which requires storing the data of multiple blocks temporarily. Again, I think this will work, but I do not know exactly how it will affect the optimization already implemented.

steve · October 12, 2022, 10:40am

First, let’s look at what multichan-expand is and does:

MULTICHAN-EXPAND is a macro that takes as its arguments “a function and its list of arguments”.
Example, the function MULT:

(mult a b)

To use MULTICHAN-EXPAND with the above function:

(multichan-expand #'mult a b)

What the macro does, is to look for arrays in the function’s arguments. If an array is found, then the function is applied to each element of the array in turn, and returns an array containing each of the results.
When printed, an array looks like: #(e1 e2 …)

Example:

(setf a (vector 1 2 3))  ;an array with 3 elements
(setf b 2)

(format nil "~a" (multichan-expand #'mult a b))  ;prints #(2 4 6)

If more than one argument is an array, then all of them must have the same number of elements. The function will then be run using the first element of each array, then the second element, and so on.
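
For instance, following the same pattern as the example above (a sketch with arbitrary values), two arrays of equal length are processed element by element:

```lisp
(setf a (vector 1 2 3))      ;an array with 3 elements
(setf b (vector 10 20 30))   ;must be the same length as a

;; MULT is applied to the pairs (1 10), (2 20) and (3 30),
;; and the three products are returned as a new array
(format nil "~a" (multichan-expand #'mult a b))
```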

The name “multichan-expand” refers to its common usage (and the reason it was written) for handling multi-channel sounds. Multi-channel sounds are arrays. A “stereo sound” is an array of two “sounds” (left and right channels).

I suspect that the reason that you had difficulty with multichan-expand is due to your use of *TRACK* within functions.

Consider this code:

(defun sampleval ()
  (print (snd-fetch *track*)))

(sampleval)

The function SAMPLEVAL fetches and prints the value of the first sample of the selected sound from *TRACK*.
Because SND-FETCH requires a (mono) “sound”, SAMPLEVAL also only works with mono tracks.

Perhaps we can use MULTICHAN-EXPAND to make SAMPLEVAL work with a stereo track ?

Try this (spoiler, it doesn’t work, but use Debug so that you can see the error message)

(defun sampleval ()
  (print (snd-fetch *track*)))

(multichan-expand #'sampleval)

The reason that it doesn’t work is that we are NOT expanding the stereo sound array. The only argument passed to MULTICHAN-EXPAND is the function name “sampleval”. What we need to do is to expand *TRACK*, and pass each element of the stereo array to the function SAMPLEVAL.

So the SAMPLEVAL function needs to look like this:

(defun sampleval (sig)
  (print (snd-fetch sig)))

I use the variable name “sig” as an abbreviation of “signal”. Other common names that you may encounter are “snd” or “sound” or “s”, but personally I’m not keen on these names as they can easily be confused with other things, hence I usually use “sig” as the variable name.
We then need to call the function like this (mono track):

(defun sampleval (sig)
  (print (snd-fetch sig)))

(sampleval *track*)

NOW we can use multichan-expand and send each channel in *TRACK* to SAMPLEVAL in turn.

(defun sampleval (sig)
  (print (snd-fetch sig)))

(multichan-expand #'sampleval *track*)

(use Debug to see the two printed values)

steve · October 12, 2022, 10:42am

If you want to discuss FFT, I’d suggest starting a new topic for that.

fearomen · October 16, 2022, 7:58pm

Hello,

I was trying to use the plugin, but nothing happens to the audio file.
If I hit the “Debug” button I get some errors on the Subr-SND-AVG function.
I’m using Windows 10 and Audacity 3.2.1 (64-bit).
Is there any particular procedure I should be following to use the plugin once downloaded?

Attached is the dump of the Debug page.

Thanks.
Dump.txt (2.52 KB)

jozefh · October 17, 2022, 5:04am

Is it possible that you selected a stereo track? If so, please note that the plug-in processes mono signals only. In Audacity, you can easily split a stereo track into two mono tracks.