Dereverb

Possible improvement to the algorithm

Currently, the effect works as a multi-band gate (the “Simple” interface uses the same settings for every band). However, reverb will be pronounced after a loud sound, and be almost non-existent for quiet sounds. An unwanted side effect of normal multi-band gating is that quiet sounds will be blindly suppressed whether they are actual reverb, or just quiet (non-reverb) sounds.

I think that a better result could be achieved if the envelope tracking code was modified so that it takes into account the preceding level. As an example, consider just one frequency band:

  1. If the signal is high, then the envelope is high.
  2. When the signal level falls from high to low, the envelope drops low, but at a faster rate than the signal is falling.
  3. If the signal remains low, then the envelope floats high.

The gate takes the envelope as its input, and closes the gate when the envelope is low.
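To make the three steps above concrete, here is a minimal sketch in plain XLISP. It is not the plug-in's code: it works on a simple list of linear level values for one band rather than on a sound, and the function name, the DECAY parameter and the squaring step are all assumptions chosen only to illustrate the idea.

```lisp
;; Sketch only: LEVELS is a list of linear level values for one band,
;; one value per analysis step. All names here are hypothetical.
(defun context-envelope (levels decay)
  ;; DECAY (between 0 and 1) controls how quickly the memory of the
  ;; preceding level fades on each step.
  (let ((held 0.0)      ; slowly decaying memory of the preceding level
        (result nil))
    (dolist (level levels)
      ;; The memory jumps up with loud input and decays slowly afterwards.
      (setf held (max (float level) (* held decay)))
      ;; The ratio is 1.0 while the signal holds its level, falls when the
      ;; signal drops below the remembered level, and recovers as the
      ;; memory decays. Squaring makes it fall faster than the signal.
      (let ((ratio (if (> held 0.0) (/ (float level) held) 1.0)))
        (setf result (cons (* ratio ratio) result))))
    (reverse result)))

;; Two loud steps followed by four quiet ones:
(context-envelope '(1.0 1.0 0.1 0.1 0.1 0.1) 0.5)
```

With these values the envelope is 1.0 for the loud steps, drops to about 0.04 on the first quiet step (lower than the signal itself, which fell to 0.1), and climbs back to 1.0 by the sixth step because the signal has stayed quiet. A gate driven by this envelope would close just after the loud sound and reopen for sustained quiet material.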

Thank you all for amazing comments! It looks like there is still a long way ahead. But this is for me a good exercise to learn programming of Nyquist plugins.
Let me comment on the last post at first:

the “Simple” interface uses the same settings for every band

I am not sure I understood this correctly. In Simple mode too, the plugin calculates the gate threshold for every band separately, based on the band RMS. So every band gets a different threshold, and the values differ for every processed sound. For standard spoken word, this seems to be sufficient. For example, for some sample audio the plugin calculates these values:

  • Gate threshold for Low band = -17 dB
  • Gate threshold for Low-mid band = -25 dB
  • Gate threshold for High-mid band = -33 dB
  • Gate threshold for High band = -40 dB

(A -40 dB threshold is very light and preserves many soft sounds in the high band. On the other hand, the -17 dB threshold is very high and helps to reduce a lot of loud reverb. Splitting into multiple bands improves precision, because statistically every band has a different but very specific dynamic range over time. This aspect is lost when using a noise gate on the full spectrum. Of course, the best would be a threshold that varies over time, and that can be addressed by manipulating the envelope that the gate follows, as explained by Steve in the previous comment.)
BTW, if you want to see the calculated threshold values, you can uncomment the last row in the plugin code. Steve, you mentioned a “semi-automated way to create appropriate settings”, and this was my goal. I have tested this plugin with multiple friends and we all got very good results. Of course, there is still a lot of room for improvement.

I think that a better result could be achieved if the envelope tracking code was modified…

This is gold! Thank you!! I will try to implement this idea in the code. In addition, I am also thinking about splitting the signal internally into more bands than just four. This will allow the plugin to respect more the right side of the shapes of letters. The problem is that joining the bands back together into one signal is very tricky. For example, I have to use a combination of a 1500 Hz low-pass filter for the second band and a 1330 Hz high-pass filter for the third band, so that when they are mixed back into one signal the frequency profile matches the original.

Question: Why does Factory Presets > Defaults not work for this plugin?

I have applied all these suggestions:

  • choice control fixed (Sorry Paul for that!)
  • debug button is enabled
  • bullet points are replaced with simple dashes

I am still searching for the signal that needs to be wrapped in (cue …). I did not find it, yet.

Glad to see that you are still proceeding with development of the plugin.
Please don’t be disheartened by the negative comments, they are simply feedback to help you on the way.
Wish more people would try creating plugins using Nyquist.

I think that may have been a red herring. I’m not seeing the warning message that Paul mentioned.


Easier if you use filters that approximate the “ideal filter” more closely.
Perhaps worth sticking with the simple lowpass8 / highpass8 filters for now, but later on you could replace them with “sinc filters”. See the “Spectral Delete” plug-in for an example: https://github.com/audacity/audacity/blob/master/plug-ins/spectral-delete.ny
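As a rough way to judge how well a pair of crossover filters recombines, the two bands can be summed and the result compared with the original (for example on white noise, using Plot Spectrum). This is only a sketch for the Nyquist Prompt: the 1500 Hz and 1330 Hz values are the ones quoted earlier in this thread, the variable names are made up, and it is meant for a short mono selection.

```lisp
;; Sketch: split the selection into two adjacent bands and mix them back.
;; LOW-BAND-TOP and HIGH-BAND-BOTTOM are illustrative names; the values are
;; the crossover settings mentioned earlier in this thread.
(let ((low-band-top 1500)
      (high-band-bottom 1330))
  (sum (lowpass8 *track* low-band-top)
       (highpass8 *track* high-band-bottom)))
```

If the spectrum of the result shows a bump or a dip around the crossover, the two cutoff frequencies need adjusting.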

Steve wrote:

I think that may have been a red herring. I’m not seeing the warning message that Paul mentioned.

That was on the Linux machine, will try again later.

I just created an account just to say that the plug-in is amazing! No highpass filters are worth using compared to it. It negates around 80% of room echo. No guide and no Reddit thread helped as much as your work. Truly, big thanks for it! The only issue I had was Audacity freezing a bit while using it, no clue why, but this software does that a lot so it’s nothing new.

Thank you!

The only issue I had was Audacity freezing a bit while using it, no clue why, but this software does that a lot so it’s nothing new.

Let me read this back to you. You found the software useful on an unstable machine. No, Audacity doesn’t freeze a lot. You should find out why Audacity does that for you. The next step in this process is you posting that Audacity froze during a valuable production…and didn’t come back. The kiss of death forum postings go something like: “In the past when Audacity crashed, the auto recovery worked perfectly. This last time it didn’t. What am I going to do?”

You do not want to be that posting.

But this is for me a good exercise to learn programming of Nyquist plugins.

Not as good as it could be. It would be really good if your goal was achievable. In my opinion, it’s not, or at least not with conventional programming and non-AI tools.

Let me read the goal to you: Echoes are your voice bouncing from the walls (for example) and arriving at the microphone late. To suppress echoes requires the software to remove your voice from itself. The only key you have is volume, echoes are always lower volume than the main performance, but you can’t use volume as a key because, as you’re finding with gate tuning, you can’t perform vocal emphasis, interpretation, and theatrical expression in a show because they all change volume.

Koz

this software does that a lot so it’s nothing new.

Back to that posting for a minute. Audacity does not much like External, USB, Network, Internet, or Cloud Drives. Audacity always assumes it is running from stable, clean, roomy, internal drives. To do some of its production tricks, it has to have a white-knuckle tight grip on drive stability and timing. It can’t do that if its production drive, for example, is two time zones away in the case of a cloud drive.

Koz

Thank you very much for your feedback! This is very encouraging. There is still room for improvement, but I am happy that even now the plugin is helpful to somebody.

The thing is, everything runs nicely besides the fact that editing a podcast can be painful, because audacity is freezing whenever I try to edit at once more than 10 minutes. I have the newest Audacity, 3.1.3. So editing 2 hours of it can be unpleasant; however, the quality after the filter is still good, whereas a lot of plugins like this leave some static noise or change the pitch.

audacity is freezing whenever I try to edit at once more than 10 minutes.

What have you done so far to try and fix that? How much room do you have on your internal drive? As in comments above, are you trying to use drives other than your computer internal drive?

When was the last time you did the heavy virus scan that takes all night? You start it when you go to bed and it checks everything, not just simple problems while you’re editing.

Have you tried to Clean Shutdown your machine? Shift+Shutdown > OK > Wait. That closes the machine and doesn’t leave any programs, apps, or utilities running in the background.

But you’re most likely running out of drive space. How big is the drive and how much space do you have left?

Windows has a tool called CHKDSK (check disk) that can analyze your drive and find errors.

Koz

I can confirm that the latest version of the Dereverb plugin works on Audacity 3.2. Please note that the plugin works only for mono audio. In the future, I will enable it for stereo as well, but first the memory usage must be improved.

It doesn’t work for long audio samples :frowning:

I’m trying to use it for podcast episodes which are up to 2h long, and Audacity (64bit) freezes after a while when using the plugin. Apparently the plugin errors out in some way since it’s not even available in the Undo/Redo stack - it is as if it hasn’t done anything. The waveform remains unchanged.

I don’t think it’s about memory usage - I’ve monitored Audacity’s memory usage while using the plugin and it does not change that much. This machine has plenty of RAM. There’s also about 500 GB of free space on the drive so it’s not that either. The disk drive is inactive while the plugin works.

The plugin does work when I select e.g. a 15 minute interval - but slicing everything up into 15 minutes intervals isn’t a viable solution.

Does anyone know of a similar dereverb plugin which works for long audio samples?

Thank you for this feedback. I appreciate your effort to test this plug-in.

I expect that it should be possible to update the code so that it processes audio longer than 15 minutes. But this is something that may take me some time to do. So before I jump on this and solve it for you, here is the question: Are you happy with the quality of the result? Does it reduce the room reverb to a satisfactory level for you?

I am waiting for your confirmation.

About the code…

  1. Use spaces not tabs :wink:

  2. Nyquist / Xlisp is a “garbage collected” language. Memory management is automatic, and when working with small data objects the developer can usually leave this entirely up to Nyquist to deal with, but when working with very large data objects (such as long sounds) it is necessary to have at least some idea about what is going on. This is a rather “advanced” topic, but here’s a very brief account:

When a program runs, the data that is being operated on is allocated space in the computer’s RAM memory. When that data expires, the garbage collector (part of Nyquist) will attempt to free the memory, deleting the obsolete data so that the memory space may be used again.

When working with long sounds, a well designed program allows Nyquist to free up memory as it goes.

For example (simplified model):
Say we have an input sound “TRACK” which I’ll represent as lowercase letters:

abcdefghijklmnopqrstuvwxyz

and the code processes the sound and outputs a result which I’ll represent as uppercase letters:

ABCDEFGHIJKLMNOPQRSTUVWXYZ

If possible, Nyquist will work in this fashion:

read: a
a -> A
write: A
garbage collect: a and A
read: b
b -> B
write: B
garbage collect: b and B
read: c
c -> C
write: C
garbage collection
...

Because Nyquist is operating on blocks of samples, and is able to garbage collect the used blocks as it goes, it can work with extremely long sounds without ever running out of memory.

Now let’s look at another example - this one is a classic example of when Nyquist’s garbage collection can’t do its job.
Find the peak level, then amplify the track so that the new peak level has a specified value (“normalization”).
In this case, Nyquist must search the entire TRACK data to find the peak value, but cannot garbage collect because it needs to go back and multiply each sample by a value.

(setf target-val 1.0)
(setf absolute-peak (peak *track* ny:all))  ; Find the peak level
;; *track* data must still exist, so must not be garbage collected
(mult *track* (/ target-val absolute-peak))

I’m not sure if it’s possible to “fix” your code to allow garbage collection. (Nyquist does have some advanced techniques for this kind of job, but that’s too much for a forum post).


  3. “DRY” (Don’t Repeat Yourself)
    Notice that your code repeats something very similar to this, 4 times:
(setf sig (highpass8 sigtrack 450))
(setf sig (lowpass8 sig 1500))
(setf gatefollow (gate-follow sig))
(setf reduce (db-to-linear (+ reduction 2_band_R_offset)))
(setf threshold (db-to-linear (+ (+ (get-rms sig (truncate len)) sensitivity) 2_band_T_offset)))
(setf 2-bandgated (multichan-expand #' noisegate sig gatefollow look attack release reduce threshold))
(setf msg (format nil
                  "~a~%Low-mid: Reduce: ~a; Threshold: ~a"
                  msg (linear-to-db reduce) (linear-to-db threshold)))

Better to extract that out from your main function into a separate function, so that it can then be reused as many times as you want.

If you make the parameters for each band into a list, your “main function” could be reduced to something like this:

(defun process (param-list)
  (error-check)
  (let ((sigtrack *track*)
        (ln (truncate len))
        (output 0))
    (setf *track* nil)
    (dolist (params param-list output)
      (setf output (sum output
                        (process-band sigtrack ln params))))))

Note also that calculations such as “(truncate len)” only need to be done once, whereas your code calculates it multiple times.
(in the specific case of “(truncate len)”, the cost of repeating the calculation is insignificant, but in other situations repetition can be costly in terms of performance and memory use.)
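For illustration, the extracted per-band function might look roughly like the sketch below. It is not self-contained: it assumes the plug-in's own helpers and controls (gate-follow, get-rms, noisegate, reduction, sensitivity, look, attack, release) exist as in the repeated block quoted above, and the layout of each parameter list is invented here purely as an example.

```lisp
;; Hypothetical layout of one PARAMS entry (an assumption, not the plug-in's):
;;   (label low-hz high-hz reduce-offset threshold-offset)
;; LOW-HZ or HIGH-HZ may be NIL for the lowest and highest bands.
;; Example entry using the band edges from the quoted code, with
;; placeholder offsets:  (list "Low-mid" 450 1500 0 0)
(defun process-band (sigtrack ln params)
  (let* ((low-hz (nth 1 params))
         (high-hz (nth 2 params))
         (sig sigtrack)
         reduce threshold)
    ;; Isolate the band; skip a filter when its edge is NIL.
    (if low-hz (setf sig (highpass8 sig low-hz)))
    (if high-hz (setf sig (lowpass8 sig high-hz)))
    ;; Same calculations as in the repeated block, done once per band.
    (setf reduce (db-to-linear (+ reduction (nth 3 params))))
    (setf threshold
          (db-to-linear (+ (get-rms sig ln) sensitivity (nth 4 params))))
    ;; Return the gated band so that the caller can sum it into the output.
    (multichan-expand #'noisegate sig (gate-follow sig)
                      look attack release reduce threshold)))
```

Keeping everything band-specific in the parameter list means that adding a fifth or sixth band only requires another list entry, not another copy of the code.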


Hi Steve,

thank you again for the great help and explanation! I have updated the code - your point 3 (DRY) is applied.
Now the maximum selection duration is a little over 30 minutes.

Can you Steve please explain this? Why is memory not partially released after processing one band?

When I process the first band and I sum it with the output, I do not need to keep the first band in the memory. It can be removed. I just need to keep output from the beginning to the end, but each band is needed only up to the moment when it is summed with the output.

I have tried to play with prog1 command, but with no success.

Second question is related with this:

(SND-SET-MAX-AUDIO-MEM 2000000000)

I have tested different values and it does not make any difference if I set higher value than 2000000000 on my PC. I have 16GB of RAM (usually 7-8GB is free), but when I run Dereverb, I can see that Audacity memory consumption goes approx. up to 1700MB. I have tried to increase that value to 4000000000 or 8000000000 but with no success.

Why is this happening?

This is the easier question to answer - I’ll come back later re. your updated code.

Read this first:
https://www.cs.cmu.edu/~rbd/doc/nyquist/part8.html#index304

Notice that it is talking about the amount of audio data that can be held in memory.

Try this:

  1. Generate a 30 second tone
  2. Apply this code using the Debug button:
(snd-set-max-audio-mem 1000)
(mult *track* 0.8)
  3. Notice that the code runs without error

The reason that the above works is that Nyquist can process *track* incrementally, releasing memory as it goes.

Now try this:

  1. Generate a 30 second tone
  2. Apply this code using the Debug button:
(snd-set-max-audio-mem 1000)
(setf sig *track*)
(mult *track* 0.8)
  3. Notice that there is an error

The error message should look something like this (below) and is fairly self-explanatory:

The maximum number of sample blocks has been
reached, so audio computation must be terminated.
Probably, your program should not be retaining
so many samples in memory. You can get and set
the maximum using SND-SET-MAX-AUDIO-MEM.
error: audio memory exhausted

The reason that we get the error here is that Nyquist cannot release data from *track* when it is processing the final line without corrupting “sig” (Nyquist does not know that we won’t be using “sig” later).

However, we can do this:

(snd-set-max-audio-mem 1000)
(let ((sig *track*))
  (mult *track* 0.8))

In this case, “sig” is local to the LET block, which Nyquist deals with as a “block” (limited “scope”). Nyquist knows that “sig” is local to the block, so it knows that it does not need to hang onto the samples beyond the scope of the block.

This illustrates one of the reasons why it’s best to limit the scope of variables - that is: avoid using global variables unless you really have to.


A bit more…

Notice that even this isn’t safe:

(snd-set-max-audio-mem 1000)
(let ()
  (setf sig *track*)
  (mult *track* 0.8))

but this is safe because we explicitly tell Nyquist that “sig” is local to the LET block:

(snd-set-max-audio-mem 1000)
(let (sig)
  (setf sig *track*)
  (mult *track* 0.8))

and another bit more…

Nyquist is basically a 32-bit app, so don’t expect it to handle more than 2 GB properly.
Roger Dannenberg (creator of the Nyquist language) has done some clever stuff in Nyquist so that it rarely runs into this problem, but be aware that exceeding 2GB can cause weird and wonderful bugs.

I’ve got a bit of time to start looking at your new code, so I’ll post short comments as I go (which hopefully will be useful :wink:)

First one:

(+ (+ (get-rms sig ln) sensitivity) T_offset)

The “+” operator takes one or more arguments - it’s not restricted to two.
All of these are valid:

(+ 3)         ;returns 3

(+ 1 2)       ;returns 3

(+ 1 2 3 4)   ;returns 10

(+ (get-rms sig ln) sensitivity T_offset)

(+ (get-rms sig ln)
   sensitivity
   T_offset)