Conditional Replacement by Silence

jbret · June 22, 2020, 7:42pm

Despite some research and borrowing from code written for similar tasks, I’m still not able to solve the following.

Given:

a sound selection;
a numerical threshold (e.g. 0.9); and
a (time-based) parameter named ‘radius’,
the code should:
map selection samples exceeding the threshold;
at each of those mapped points, should replace the interval (centered at the mapped point, length = 2*radius) in the original selection by a silenced interval of same length.

In other words, outcome should be:
A sound of same length as the input one, with silenced intervals replacing the corresponding intervals in the original sound selection.
Each interval being centered at an offending sampled point (i.e. above the threshold), having a length of twice the ‘radius’.

I provide the latest (failed) attempt:

; set threshold
(setq threshold 0.9)	
				
; silence radius around a given offender (in seconds)
(setq radius 0.005)    		
	
; duration of sound signal
(setq duration (get-duration 1))	

; silence interval same length as original
(setf silenced (abs-env (s-rest duration)))	

; initialize "output" as selected sound
(setf output s)		

;start loop
(do
    (
		;initialize and increment variables
		(count 0 (+ count 1)) 		
		(timepoint 0 (/ count *sound-srate*))
		(lbound 0 (- timepoint radius))
		(ubound 0 (+ timepoint radius))
		(val (snd-fetch s)(snd-fetch s))
	)
 
    
    (
		;loop until run out of samples	
		(not val output) 
    )

	  ;Loop body. Run code if offender. (This is where code will probably get most of changes in order to work as intended.)
	  (if (> (abs val) threshold)			
			(abs-env
				(setf output
					(sim (
							 (at-abs 0      (cue (extract-abs 0      lbound   output)))
							 (at-abs lbound (cue (extract-abs lbound ubound   silenced)))
							 (at-abs ubound (cue (extract-abs ubound duration output)))
						 )
					)					
				)
			)
	  )
)

I’ve tried other things such as SEQ, MULT… Failure probably stems from inadequate use of behaviours, sounds, samples etc. Still could not get my head around all those different concepts and their corresponding treatments.

Also, I’m sure there’s a much better way to approach it other than looping; any code that would get the job done would be much appreciated.

Thanks for any help in advance.

steve · June 22, 2020, 9:35pm

There’s a few things I’d mention about your code before looking at a solution for the task.

Firstly, well done for having a go!

Some of the details below indicate updates to Nyquist (notably the preferred use of *TRACK* rather than “S”), and some are common conventions for writing LISP code (Nyquist is based on a version of LISP called XLISP).

When writing new code, it is highly recommended to use “version 4” syntax. The main difference with version 4 is that it uses *TRACK* to pass the selected audio to Nyquist rather than the single character S. The old syntax may well be dropped in future versions of Audacity, so it’s worth getting used to the new syntax now.

Regarding indentation and layout: Sticking to conventional LISP indentation rules is a big help for others reading your code. Note the absence of hanging parentheses, and use of 2 space indentation (not tabs).

;version 4

(setq threshold 0.9)
(setq duration (get-duration 1))

; silence radius around a given offender (in seconds)
(setq radius 0.005)    	

; silence interval same length as original
(setf silenced (abs-env (s-rest duration)))

; initialize "output" as selected sound
(setf output *track*)	

(do* ((count 0 (+ count 1))
      (timepoint 0 (/ count *sound-srate*))
      (lbound 0 (- timepoint radius))
      (ubound 0 (+ timepoint radius))
      (val (snd-fetch *track*)(snd-fetch *track*)))
    ((not val) output)
  (if (> (abs val) threshold)
      (abs-env
        (setf output
          (sim (at-abs 0 (cue (extract-abs 0 lbound output)))
               (at-abs lbound (cue (extract-abs lbound ubound silenced)))
               (at-abs ubound (cue (extract-abs ubound duration output))))))))

I do like your use of meaningful names for variables - that’s allowed me to remove many of the comments without losing any of the meaning.

There’s an error in the DO syntax. In the second line of the DO loop, “count” is undefined.
For “count” to be properly defined, the list of bindings must be evaluated sequentially, and to do that, use DO* rather than plain DO (see: https://www.audacity-forum.de/download/edgar/nyquist/nyquist-doc/xlisp/xlisp-ref/xlisp-ref-094.htm)
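As a minimal sketch of the difference (the variable names here are made up for illustration and are not from the code above), the following can be run in the Nyquist Prompt with the Debug button. The init form of “double” refers to “i”, which relies on the bindings being made one after another, as DO* does. With plain DO, all init forms are evaluated before any of the variables are bound, so unless “i” happens to be defined globally, the same code would be expected to stop with an unbound variable error.

```lisp
;version 4

;; Hypothetical example: 'i' and 'double' are illustration names only.
;; DO* binds sequentially, so the init form of 'double' can already see 'i'.
(do* ((i 0 (1+ i))
      (double (* 2 i) (* 2 i)))
     ((= i 3) "done")
  (print (list i double)))
```

The debug output should show (0 0), (1 2) and (2 4), one per line, before the loop returns “done”.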

There was also an extra “(” in the SIM statement near the end of the code, and missing parentheses in "(not val output) ".

I think that the above code is syntactically correct, but it still does not work, and on my machine it crashes.

So that this post does not get too long, I’ll start a new post to look at an alternative solution.

steve · June 22, 2020, 9:42pm

If I understand correctly, then this image illustrates what you want.

The upper track is the original track.
The lower track is what you want, where the silent labelled regions A, B and C each have a duration of 2x Radius.
Is that right?

jbret · June 24, 2020, 12:44am

Thanks for the help, compliments, and words of encouragement, Steve!

Let me apologize for my trial code’s basic flaws and its lack of conventional practices using LISP.

Regarding the new post illustrating a potential solution, indeed it seems to be exactly the outcome I need!

I noticed the labels below the track. Was your approach based on first labeling the (somehow previously set/bounded) sections in order to finally silence them leveraging on the list of labels? Looks neat, anyways.

Lastly, I’d like to congratulate you on your posts; they are all very clear and helpful for anyone trying to learn this tricky framework/language.

steve · June 24, 2020, 12:04pm

That was just an illustration (done “manually”)

There’s already a plug-in called “Pop Mute” that does something very similar (see: Missing features - Audacity Support),
but if you’re interested in learning Nyquist, I’d be happy to go through the approach that I would take for programming this kind of task.

Thank you for the kind words.
When I started learning Nyquist there was a lot less documentation than now, but I was fortunate to receive help from other Audacity users, notably David Skye and Edgar-rft. You may well have come across their names as authors of most of the early Nyquist plug-ins. Edgar-rft collected, updated and published the XLISP documentation that we still use today (XLisp). From my perspective, the sharing of knowledge lies at the heart of “open source” and makes the world a better place - long may it continue.

jbret · June 24, 2020, 3:56pm

Thanks!
I have several audio files that need this treatment, probably due to how they were recorded. At first it seemed like a periodic issue that one would be able to pin down to a number. Yet, after some calculations I am now convinced it’s not predictable after all. So, I tried to come up with code to treat it. Therefore, I’d certainly be interested if you could share how you would approach the task programmatically.

Regarding the Pop Mute plugin:
How did you set the plugin parameters in order to get that outcome? Was it accomplished with just one pass? Not sure on how to deal with those dB setups.

You most certainly share the knowledge according to that open source perspective.

steve · June 24, 2020, 5:12pm

Happy to.

Rather than cutting and splicing, I’d create a “control signal” to modulate the selected audio.
Digital audio is a sequence of evenly spaced “samples”, and a “sample” is just a numeric value at a point in time. Arithmetic operations on digital audio are much like arithmetic operations on numbers. If you have a “control signal” that has a constant value of 0.5, and multiply that signal with some audio, it halves the amplitude of the audio.

As a demonstration of multiplying signals, here’s a little script that can be run in the Nyquist Prompt.
By setting “;type generate”, 1 unit of time = 1 second. This differs from “effects” (;type process) which treat the length of the selection as “1 unit” of time.

;version 4
;type generate

;; Generate a stepped signal from 0 to 1.
;; The sample rate is 1/20th of the track sample rate.
;; Each segment is 1 second duration (at 2205 Hz sample rate)
;; giving a total duration of 6 seconds.
(setf control
  (seq (const 0 1)
       (const 0.2 1)
       (const 0.4 1)
       (const 0.6 1)
       (const 0.8 1)
       (const 1 1)))

;; Generate a 6 second sine tone at 440 Hz.
;; The amplitude is +/- 1.
(setf tone (osc (hz-to-step 440) 6))

;; Multiply the sine tone by the control signal
(mult tone control)

So now, onto the task.
If we can generate a control signal that is normally a constant amplitude of +1, and drop the level down to zero each time the selected audio exceeds “threshold”, then we will be able to get the effect that we want by multiplying the selected audio with our control signal.

So that this post doesn’t get too long, I’ll post this now and continue in a new post…

steve · June 24, 2020, 5:51pm

… Continued:

For simplicity, I’ll assume that the track is mono. It may be adapted for stereo tracks later, but mono is easier to work with during development.
I’m also going to assume that we don’t need to apply this code to massively long selections, so we don’t need to worry about optimisations or the amount of RAM used. (If necessary, longer tracks can be processed by manually applying the code to consecutive sections).

To find the peaks in the selection, we ‘could’ simply loop through every sample of the selection, but Nyquist loops are quite slow, and looping 44100 times for each second of audio will be pretty inefficient.

From your original code, you had a “radius” of 0.005 seconds. At a sample rate of 44100, that’s 220.5 samples.
Rather than looping through every sample, we can loop through, say, 44 samples at a time, looking at the highest absolute value within each group of 44 samples. 44 samples at 44100 Hz sample rate is close to 1 millisecond, so the result will not be exact, but I’m guessing that will be close enough (and very much quicker than looping through every sample).
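To check that arithmetic for whatever sample rate your own track uses, here is a small sketch for the Nyquist Prompt. The variable names are just for illustration, and it assumes *SOUND-SRATE* holds the track sample rate, as in the earlier code. It returns a message rather than audio.

```lisp
;version 4

;; Illustration only: express 'radius' in samples, and the length of
;; one block of samples in milliseconds, at the track sample rate.
(setf radius 0.005)
(setf blockstep 44)

(format nil "Radius: ~a samples~%One block of ~a samples: ~a ms"
        (* radius *sound-srate*)
        blockstep
        (* 1000 (/ blockstep *sound-srate*)))
```

At 44100 Hz the radius should come out as 220.5 samples, and one block as just under 1 ms.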

To get the highest peak in every 44 samples, Nyquist has a useful function: SND-AVG

;version 4

(setf block 44)
(setf step 44)
(setf signal (snd-avg (snd-abs *track*) block step op-peak))
(print (snd-srate signal))

;; Force to the same sample rate as the track so we can view the result more clearly
(force-srate *sound-srate* signal)

If you run the above code in the Nyquist Prompt but use the Debug button, you will see that the sample rate of “signal” is 1002.27 Hz, or to be exact, it’s 44100 / 44 = 1002.27272727… Hz
We will need this value later.

So now we can loop through the samples, and detect when the level goes above “threshold”.
If you run this code in the Nyquist Prompt with the Debug button, it will show you a list of times that the peak level goes over the threshold.
(Tip: Ensure that your selection will not create thousands of entries or you could be waiting a long time for the output to be printed)

;version 4

(setf threshold 0.9)
(setf blockstep 44)
(setf signal (snd-avg (snd-abs *track*) blockstep blockstep op-peak))
(setf srate (snd-srate signal))

;; The sample period for each sample in 'signal' is 1/srate
(setf sp (/ srate))

;; Actual time at the start of the selection:
(setf t0 (get '*selection* 'start))

(do ((val (snd-fetch signal) (snd-fetch signal))
     (time t0 (+ time sp)))
    ((not val))
  (when (> val threshold)
    (format t "Above threshold at ~a seconds.~%" time)))

It’s getting close to time for dinner, so I’ll be back later with other pieces of the puzzle…

steve · June 24, 2020, 7:32pm

Continued again…

To generate our “control signal” we will use PWLV-LIST (see: https://www.cs.cmu.edu/~rbd/doc/nyquist/part8.html#index410)
This function can handle hundreds (thousands?) of breakpoints, so should be adequate for our purpose.

Note that the breakpoints are a list in the form:
l1, t2, l2, t3, l3, … tn, ln
where “l” is the level and “t” is the time.

The list is constructed by pushing the next pair of values onto the start of the list, so the list will be in reverse chronological order. Before we use the list we shall reverse it so that it is the right way round.

Another thing to note is that because this is a “;type process” effect, times are relative to the length of the selection. Thus all times that we calculate (in seconds) will need to be divided by the duration to get relative times.
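To make the breakpoint format and the relative times concrete, here is a hand-written sketch that mutes one fixed region. The region (1.0 to 1.5 seconds from the start of the selection) and the names “mute-start” and “mute-end” are made up for illustration, and it assumes a mono selection at least 2 seconds long so that the breakpoint times stay in ascending order.

```lisp
;version 4

(setf duration (get-duration 1))

;; Hypothetical region to mute, converted from seconds to a
;; proportion of the selection length.
(setf mute-start (/ 1.0 duration))
(setf mute-end (/ 1.5 duration))

;; Breakpoints in the form: l1, t2, l2, t3, l3, ... tn, ln
;; Repeating a time with a different level gives an abrupt step.
;; The final time is a little past 1 so that the control signal
;; is slightly longer than the selection.
(setf breakpoints
  (list 1
        mute-start 1
        mute-start 0
        mute-end 0
        mute-end 1
        1.01 1))

(mult *track* (pwlv-list breakpoints))
```

Because this list is written in chronological order by hand, it does not need reversing before it is passed to PWLV-LIST.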

;version 4

(setf threshold 0.9)
(setf radius 0.005)
(setf blockstep 44)
(setf duration (get-duration 1))
(setf signal (snd-avg (snd-abs *track*) blockstep blockstep op-peak))
(setf srate (snd-srate signal))

;; The sample period for each sample in 'signal' is 1/srate
;; and as a proportion of 'duration'
(setf sp (/ (* duration srate)))
;; radius as a proportion of 'duration'
(setf radius (/ radius duration))


;; breakpoints is a list in the form:
;; l1, t2, l2, t3, l3, ... tn, ln
;; Initial level  is 1.
(setf breakpoints (list 1))

;; Assume we start below the threshold
(setf below t)

(do ((val (snd-fetch signal) (snd-fetch signal))
     (count 0 (1+ count)))  ; just count the samples to avoid cumulative errors.
    ((not val))
  (cond
    ((and below (> val threshold))
      (push (- (* count sp) radius) breakpoints)
      (push 1 breakpoints)
      (push (- (* count sp) radius) breakpoints)
      (push 0 breakpoints)
      (setf below nil))
    ((and (not below)(< val threshold))
      (push (+ (* count sp) radius) breakpoints)
      (push 0 breakpoints)
      (push (+ (* count sp) radius) breakpoints)
      (push 1 breakpoints)
      (setf below t))))

;; Add the final breakpoint.
;; To avoid a glitch at the end, ensure that the control signal is longer
;; than the selection.
(push (+ 1 sp) breakpoints)
(push 1 breakpoints)

;; And finally, create our control signal and multiply by *track*
(mult *track* (pwlv-list (reverse breakpoints)))

You may notice that there is a lot of repetition in the above code. That could (“should”) be improved, which could be done by moving the code that adds break points to the list, out into a separate function. However, I’ll leave it here for now for you to try. Feel free to ask questions.
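One possible way to remove that repetition is sketched below. The helper name ADD-STEP and its parameters are hypothetical, not part of Nyquist; it simply pushes the same four values that each branch of the COND pushed by hand, so the behaviour should be unchanged. The rest of the script is the code from above, still assuming a mono track.

```lisp
;version 4

(setf threshold 0.9)
(setf blockstep 44)
(setf duration (get-duration 1))
(setf signal (snd-avg (snd-abs *track*) blockstep blockstep op-peak))

;; Sample period of 'signal' and radius, both as a proportion of 'duration'
(setf sp (/ (* duration (snd-srate signal))))
(setf radius (/ 0.005 duration))

(setf breakpoints (list 1))
(setf below t)

;; Hypothetical helper: push an abrupt change from level 'from' to
;; level 'to' at 'time' onto the global 'breakpoints' list
;; (which is built in reverse order).
(defun add-step (time from to)
  (push time breakpoints)
  (push from breakpoints)
  (push time breakpoints)
  (push to breakpoints))

(do ((val (snd-fetch signal) (snd-fetch signal))
     (count 0 (1+ count)))
    ((not val))
  (cond
    ((and below (> val threshold))
      (add-step (- (* count sp) radius) 1 0)
      (setf below nil))
    ((and (not below) (< val threshold))
      (add-step (+ (* count sp) radius) 0 1)
      (setf below t))))

;; Final breakpoint, slightly after the end of the selection.
(push (+ 1 sp) breakpoints)
(push 1 breakpoints)

(mult *track* (pwlv-list (reverse breakpoints)))
```

Keeping the list as a global variable is the simplest option for a short script like this; passing the list in and returning the new list would be tidier in a larger plug-in.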

jbret · June 25, 2020, 10:29am

Wow, Steve! This is a full-fledged tutorial.

I learnt from virtually every line of your code and appreciated the exposure of the whole problem-solving/thinking process. I have now a much better grasp of the framework and its power.

Highlights:
After reading one of your previous posts, I had a hunch MULT would be a piece of the puzzle. Coupled with those block steps to improve efficiency - with no relevant sacrifice to precision - you proved it was a great approach. The piece-wise list of breakpoints to generate the control signal was new to me. It got me reading further about piece-wise approximations which clearly are extremely useful.

The proof is in the pudding:
After some tweaking got it to work on a couple of my files. As they were mono, spot on! The time taken to process each file being - at least to my standards - very short by the way. No crashes, smooth.

Let me thank you once more for all the help with this and patience to walk me through the whole process. I’ll go over your explanations a couple of times more so to not miss any details.

The forum is extremely rich in content, and anyone willing to put in the work can build on previous posts to address a specific problem, or go all the way and contribute to the community.

“Long may it continue.”