My narrator's pause-trimmer

Paul L
Posts: 1782
Joined: Mon Mar 11, 2013 7:37 pm
Operating System: Please select

Post by Paul L » Sat Mar 30, 2013 2:20 am

Steve, here is the tool I was talking about; critiques welcome. My preliminary tests are promising.

Preparation: speak some sentences leaving an excessive pause between.
(Perhaps apply a high-pass filter at 20 Hz to remove any subsonic content with noticeable amplitude in the dB view.)

Parameters (which I may hard-code in my own practice): a "Resolution" more or less determining a frequency floor for on-glide sounds; a "Loud" threshold defining the onset of voice; a lower "Quiet" threshold determining on-glides (like the breathing of letter h) before "voice" that should be preserved.

Usage:
1) Select a range including the end of one sentence and the start of the next.
2) Play, then "Stop and Set Cursor" just where you judge the next sentence should start. (Ctrl-A by default but I bind it to G.) (Note that the right edge of selection is unchanged and the left has moved rightward.) Mentally note what time you stopped at.
3) Invoke the effect. (Nothing happens yet but the selection length is remembered in a global.)
4) Move left edge of the selection leftward "somewhat." (At least as long as any on-glide to the voice.)
5) Invoke the effect again.

Result: some sound is deleted from the selection, bringing the onset of "loud" voice forward to the place noted in 2, but preserving any on-glide up to the voice that crosses the threshold and remains above it. The tool also allows some error in the length of the deletion (1/"resolution" frequency) at each end to make a neat deletion at 0-crossings.
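
As a rough illustration of the result described above (a sketch under assumptions, not the code in Trim Pause.ny): suppose the widened selection has already been reduced to a list of per-block peak values, with index 0 at the point chosen in step 2. The names find-onset, glide-start and deletion-span are hypothetical, and the zero-crossing adjustment is left out.

```lisp
;; Sketch only: works on a plain list of block peaks, not on a sound.

(defun find-onset (peaks loud)
  ;; Index of the first block at or above LOUD, or NIL if there is none.
  (let ((i 0) (found nil))
    (dolist (p peaks)
      (if (and (null found) (>= p loud)) (setq found i))
      (setq i (+ i 1)))
    found))

(defun glide-start (peaks onset quiet)
  ;; Walk left from ONSET while the preceding block stays at or above QUIET.
  (let ((i onset))
    (do ()
        ((or (= i 0) (< (nth (- i 1) peaks) quiet)) i)
      (setq i (- i 1)))))

(defun deletion-span (peaks loud quiet)
  ;; Returns (start end) in blocks, relative to index 0 (the chosen point).
  ;; The span is ONSET blocks long and ends where the on-glide begins,
  ;; so START is negative whenever there is an on-glide to preserve.
  (let ((onset (find-onset peaks loud)))
    (if onset
        (let ((g (glide-start peaks onset quiet)))
          (list (- g onset) g)))))
```

For peaks (0.001 0.001 0.001 0.02 0.03 0.2 0.4) with a loud threshold of 0.1 and a quiet threshold of 0.01, deletion-span gives (-2 3): five blocks are deleted, starting two blocks to the left of the chosen point. That negative start is the reason the selection has to be stretched leftward in step 4.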

Do you understand what I'm doing?

My Nyquist wish-list posting a while ago was for elimination of steps 4 and 5 by somehow allowing me to examine context of the selection. But even with the necessity of steps 4 and 5 I like it that this tool eliminates other manual work.

Finally, a weird thing I didn't expect: sometimes this effect splits the track at the left edge of the selection! Even in the do-nothing first invocation! I think it only happens if the right edge is past the end of the track or clip. But why?
Attachment: Trim Pause.ny (6.85 KiB)

steve
Site Admin
Posts: 80677
Joined: Sat Dec 01, 2007 11:43 am
Operating System: Linux *buntu

Post by steve » Wed Apr 03, 2013 12:58 am

Sorry about the delay - I've not had much time for Audacity things recently. I'll post a proper reply shortly.
Paul L wrote:Do you understand what I'm doing?
Not really.
Does the plug-in do what you expect it to do?
I'm unsure of what precisely I should be expecting it to do.
9/10 questions are answered in the FREQUENTLY ASKED QUESTIONS (FAQ)

Post by Paul L » Wed Apr 03, 2013 6:04 am

I don't know if I can describe it more simply than I have. I can say that I have been finding it a useful simplification for a common task. It spares me some zooming in and out to find precise endpoints for deletion. I'd like to march through a narration, fixing pauses as quickly as I can, without losing too much of the sense of the narrative flow.

The main thing is I want to just pick "by ear" where a long pause should be trimmed to, and have a program calculate what to delete so that the start of loud voice is moved up to just that point, but on-glides of the word are preserved too and may be moved even left of that point. I don't have the simplicity of a single pick and keystroke but I have achieved important simplification.

The whole thing is only three pages of printout, and I want to know that I didn't do anything too weird and crazy or inefficient.

Robert J. H.
Posts: 3633
Joined: Thu May 31, 2012 8:33 am
Operating System: Windows 10

Post by Robert J. H. » Wed Apr 03, 2013 6:56 am

It is indeed a little confusing with all the selecting steps.
Let me see if I got it right:
- The purpose of the first selection is to preview the section you're working with.
- You then play this section and re-set the left margin where (according to your "feel") the sentence should start.
- difference between desired and momentary start (+ the beginning of the sentence) is stored.
- This selection could be 0.5 s long. However, the louder part at the end is preserved because it lies above the first threshold. Thus, the actual difference could be 0.3 s.
- You now expand the selection again to the left into the region between sentence end and breath taking.
- The plug-in excludes then also the quieter sound from the stored selection (maybe 0.1 s) and we end up with 0.2 s that have to be deleted.
- after the detection of the zero crossing, the sound without the removed silence of about 0.2 s is returned.

It may be advantageous for people that do not use a mouse to save the first selection (edit menu), and to restore it before the actual silence removing.

I guess I would prefer a one-click effect that removes in each call the quietest rms-sections.
Something like this:
- You select the pause including ending and beginning of the concerned sentences.
- You call the plug-in.
- It takes the RMS measurement (let's say at 20 Hz)
- The curve is now multiplied by a raised half-period sine curve (bowl shape).
- This should ensure that start and ending of the selection aren't affected.
- You make a list with the time indexes and the weighted RMS values.
- After sorting out the most silent ones, you can search for the zero crossings and remove these parts.
- The amount of RMS values that have to be removed is of course hard-coded.

20 Hz would mean 50 ms per value.
You could now tell the program to remove 10 of these.
When you additionally multiply this with a random factor of 1, 2 or 3, it is going to be easier to remove longer silences.
By pressing play, you can control the result and undo the last step (and try with a hopefully smaller random value) if necessary.
You could of course also start with relatively high values for longer selections.
Besides, I've posted a snippet that returns a list of zero crossings without the use of snd-fetch.
http://forum.audacityteam.org/viewtopic ... 39&t=67895
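
A minimal sketch of the weighted selection outlined in this post, working on a plain list of RMS values rather than on a sound. The exact weighting is an assumption: 2 - sin(pi * i / (n - 1)), which is 2.0 at both ends of the selection and 1.0 in the middle, so blocks near the edges look louder and are picked last. The names bowl-weight, quietest-frames and removed-seconds are hypothetical.

```lisp
;; Sketch only: the weighting curve and all names are assumptions.

(setq my-pi 3.141592653589793)

(defun bowl-weight (i n)
  ;; Inverted raised half-period sine: 2.0 at both ends, 1.0 mid-way.
  (if (< n 2)
      1.0
      (- 2.0 (sin (/ (* my-pi i) (float (- n 1)))))))

(defun quietest-frames (rms-list count)
  ;; Indexes (ascending) of the COUNT blocks with the lowest weighted level.
  (let ((n (length rms-list)) (pairs nil) (picked nil) (i 0))
    (dolist (v rms-list)
      (setq pairs (cons (list (* v (bowl-weight i n)) i) pairs))
      (setq i (+ i 1)))
    (setq pairs (sort pairs #'(lambda (a b) (< (car a) (car b)))))
    (dotimes (k (min count n))
      (setq picked (cons (cadr (nth k pairs)) picked)))
    (sort picked #'<)))

(defun removed-seconds (count frame-rate)
  ;; Duration of COUNT blocks measured at FRAME-RATE blocks per second.
  (/ (float count) frame-rate))
```

With the figures above, (removed-seconds 10 20) is 0.5: ten values at 20 Hz (50 ms each) remove half a second.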

Post by Paul L » Wed Apr 03, 2013 1:27 pm

Robert J. H. wrote: It is indeed a little confusing with all the selecting steps.
Let me see if I got it right:
- The purpose of the first selection is to preview the section you're working with.

That, and to set the common right boundary of the selections passed to the two invocations of the tool. I would error-check that the right boundary is the same both times, but I don't know how. I choose something to play, listening to the end of one sentence and some of the pause between. The right boundary should be in the next sentence, but I stop before reaching it.

Robert J. H. wrote: - You then play this section and re-set the left margin where (according to your "feel") the sentence should start.
- difference between desired and momentary start (+ the beginning of the sentence) is stored.

I don't know what "momentary start" means. The snd-length of the selection (as shrunk after stopping play) is remembered in a *scratch* property. That's all. I would remember the track time of the end of selection if I knew how. The place where play started has no importance in the calculations. It only matters to my intuition of where to stop afterward.

The selection contains the beginning of the next sentence, and I could locate it in this pass but I don't. I do that in the second. I suppose there would be savings if I did, with less to scan. It is the start of the second sentence that matters.

Robert J. H. wrote: - This selection could be 0.5 s long. However, the louder part at the end is preserved because it lies above the first threshold. Thus, the actual difference could be 0.3 s.
- You now expand the selection again to the left into the region between sentence end and breath taking.

The difference between the start of this selection and the start of sound above the loud threshold is important, but as mentioned, not calculated yet. That is the length that the deletion should have. But the right end of the deletion may need to be before the start of sound, meaning the left end of the deletion may need to be before the selection, meaning we can't operate on this selection and need to stretch it left and call the tool again.

The left edge of the selection is somewhere in the pause between sentences. I move it left "some": at least as far as the breath before the sound is long. (Sometimes it's not a breath; sometimes it's the faint "m" before initial "b"... I use "on-glide" as the general term.)

Robert J. H. wrote: - The plug-in excludes then also the quieter sound from the stored selection (maybe 0.1 s) and we end up with 0.2 s that have to be deleted.
- after the detection of the zero crossing, the sound without the removed silence of about 0.2 s is returned.

I'm not sure what these numbers mean; I don't think they describe what the tool does.

Robert J. H. wrote: It may be advantageous for people that do not use a mouse to save the first selection (edit menu), and to restore it before the actual silence removing.

I guess I would prefer a one-click effect that removes in each call the quietest rms-sections.

I'd like a one-click effect too, but do you understand now why that can't work with Nyquist's limitations? I don't know how to communicate a selection, plus a certain point in the middle of it, into Nyquist in one call. The workaround is to make the "middle" the left edge on the first call.
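
For readers unfamiliar with the two-call workaround, here is a hedged sketch of how a value can be carried from one invocation to the next on the property list of *scratch*, using the XLISP functions get, putprop and remprop. It is not taken from Trim Pause.ny; the property key trim-pause-length and the function two-pass are hypothetical, and len stands in for whatever length the plug-in measures.

```lisp
;; Sketch only: key name and function name are assumptions.
(setq key 'trim-pause-length)

(defun two-pass (len)
  ;; LEN: length (in samples) of the current selection.
  (let ((stored (get '*scratch* key)))
    (cond
      ((null stored)
       ;; First call: remember the length and change nothing.
       (putprop '*scratch* len key)
       'remembered)
      (t
       ;; Second call: use the stored value, then clear it so the
       ;; next pair of calls starts fresh.
       (remprop '*scratch* key)
       (list 'first-length stored 'second-length len)))))
```

Calling (two-pass 100) and then (two-pass 250) returns REMEMBERED the first time and (FIRST-LENGTH 100 SECOND-LENGTH 250) the second. Nothing here checks that the right edge of the selection is the same in both calls.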

Post by Robert J. H. » Wed Apr 03, 2013 2:11 pm

Obviously, I didn't get it right. lol.
I somehow had the impression that the whole pause, including short sections of the sentences, was selected in the beginning.
I wonder, wouldn't it be much more comfortable to select an arbitrary chunk within the pause, to preview the remainder with the C key, and to delete it when the right length is selected?
Perhaps I am hopelessly off the trail and fail to see the concrete advantage of your procedure.
On the other hand, I don't have to worry about proper zooming and scrolling, though.

Post by Paul L » Wed Apr 03, 2013 5:58 pm

Of course that's a simple way to do it, but that might take some fiddling each time. This operation is something I will do so repetitively that investment in simplification of the procedure seems really worth it to me. I want to listen and just have a "right there!" reaction with one finger. I want to take some of the tedium out of the work and keep up my "flow" as best I can.

I suppose there are still the problems of my reaction time and some playback latency, but I could probably adjust and train myself so that I get my pauses just right on the first try.

You made suggestions I don't fully understand for using rms. I wonder if I should use rms instead of peak as my criterion for sounds loud enough to define the onset, soft enough to be noise, and on-glides in-between. What advantage might there be either way?

Post by steve » Wed Apr 03, 2013 6:31 pm

Paul L wrote:Usage:
1) Select a range including the end of one sentence and the start of the next.
2) Play, then "Stop and Set Cursor" just where you judge the next sentence should start. (Ctrl-A by default but I bind it to G.) (Note that the right edge of selection is unchanged and the left has moved rightward.) Mentally note what time you stopped at.
3) Invoke the effect. (Nothing happens yet but the selection length is remembered in a global.)
4) Move left edge of the selection leftward "somewhat." (At least as long as any on-glide to the voice.)
5) Invoke the effect again.

Result: some sound is deleted from the selection, bringing the onset of "loud" voice forward to the place noted in 2, but preserving any on-glide up to the voice that crosses the threshold and remains above it. The tool also allows some error in the length of the deletion (1/"resolution" frequency) at each end to make a neat deletion at 0-crossings.
OK, I see what it is doing.

Is there a reason why the following approach could not be used?
1) Select a range including the end of one sentence and the start of the next.
2) Play, then "Stop and Set Cursor" (shift+A) just where you judge the next sentence should start.
3) Apply an effect that deletes "silence" (below the threshold) up to the start of the sound.

Paul L wrote:I would remember the track time of the end of selection if I knew how.
Unfortunately Nyquist does not have access to that information.


I'm still working through your code, but a couple of general points:

In LISP / Nyquist programming it is strongly preferred to use spaces rather than tabs. Indentation makes a huge difference to the readability of LISP, but indentation is rather hit or miss when using tabs as it depends on the tab settings in the editor. Unfortunately many of the older Nyquist plug-ins have very poor indentation, but this is not surprising as many of the older plug-ins were written by David Sky (now sadly departed) who was blind. Indentation is largely irrelevant for blind coders but for sighted users it can be a great help in seeing where commands start and end, and for seeing the structure of a program. Code written by Roger Dannenberg and Edgar-rft provides good examples of code indentation. There is also a good guide here: http://dept-info.labri.u-bordeaux.fr/~i ... ation.html

Code: Select all

;; following two functions cribbed from SilenceMarker.ny, with constant
;; .........
(defun mono-s (s-in)
  (if (arrayp s-in)
      (snd-add (aref s-in 0) (aref s-in 1))
      s-in))

(setq my-srate-ratio 1.0)

(defun my-s (s-in)
  (setq my-srate-ratio (truncate (/ (snd-srate (mono-s s-in)) resolution)))
  (snd-avg (mono-s s-in) my-srate-ratio my-srate-ratio OP-PEAK))
Unfortunately some of the older plug-ins do not provide great models for Nyquist programming. In this case the (mono-s sound) function is probably not the best approach as it simply adds together the left and right channels, which will of course make the "silences" have higher amplitude, so that the threshold settings will be relatively too low. For your plug-in it may be better to either:

Code: Select all

;; Take an average of the left and right channels

(defun mono-s (s-in)
  (if (arrayp s-in)
      (mult 0.5 (sum (aref s-in 0) (aref s-in 1)))
      s-in))
or:

Code: Select all

;; Take the absolute maximum of the left and right channels

(defun mono-s (s-in)
  (if (arrayp s-in)
      (s-max (snd-abs (aref s-in 0))(snd-abs (aref s-in 1)))
      s-in))
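
To make the difference between the three ways of combining channels concrete, here is a small sketch that applies each rule to a single pair of sample values instead of to sounds. The function names are hypothetical and only mirror the arithmetic of the versions above.

```lisp
;; Sketch only: plain numbers stand in for one left/right sample pair.

(defun combine-sum (l r)
  ;; What the original mono-s does: add the channels.
  (+ l r))

(defun combine-avg (l r)
  ;; Average of the two channels.
  (* 0.5 (+ l r)))

(defun combine-max (l r)
  ;; Larger of the two absolute values.
  (max (abs l) (abs r)))
```

For identical noise of 0.01 on both channels, the sum is 0.02 (twice as high, about 6 dB), while the average and the maximum both stay at 0.01. For a pair that is exactly out of phase, such as 0.3 and -0.3, the sum and the average are both 0.0 but the maximum is still 0.3, so the third version cannot mistake cancelled channels for silence.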
A handy function to simplify your zero crossing detection: (plusp expr) http://www.cs.cmu.edu/~rbd/doc/nyquist/ ... #index1447
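
A minimal sketch of how plusp can be used to spot sign changes, assuming the samples are already available as a plain list of numbers. The name zero-crossings is hypothetical, and a value of exactly zero is treated as not positive.

```lisp
;; Sketch only: operates on a list of sample values, not on a sound.

(defun zero-crossings (samples)
  ;; Indexes i at which the sign of sample i differs from sample i-1.
  (let ((result nil) (prev nil) (i 0))
    (dolist (x samples)
      (if (and (> i 0) (not (eq (plusp x) prev)))
          (setq result (cons i result)))
      (setq prev (plusp x))
      (setq i (+ i 1)))
    (reverse result)))
```

For the list (0.2 0.1 -0.1 -0.3 0.4 0.5) this returns (2 4), the two positions where the waveform changes sign.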

More to follow :)

Post by Paul L » Wed Apr 03, 2013 10:01 pm

What effect deletes silence?

I believe the more complicated thing I am doing may match my intent better: define "sound" as crossing some threshold, move that crossing forward in time to the desired point, BUT do not simply delete from the point to the threshold: preserve on-glides too for more natural transitions. I believe I'm better at picking where that threshold crossing should be, not where the start of the transition is. I will have some dials to fiddle to figure that out.

As for indentation... I just re-briefed myself in long-forgotten Emacs and am just doing whatever lisp-mode does by default for indentation. Do you know how to improve that?

Post by steve » Wed Apr 03, 2013 10:14 pm

Paul L wrote:What effect deletes silence?
(extract .....) or (extract-abs ....)
Paul L wrote:just doing whatever the lisp-mode is doing by default for indentation. Do you know how to improve that?
First thing, set it to use spaces rather than tabs.
There are also some Emacs specific tips in the "good guide" link that I posted.
