Zero crossings -- Help me, Robert (or steve)

Robert J H referred me to this piece of code for finding zero crossings efficiently. I think I get what it’s doing, but I don’t understand some things.

What is the unused variable “zero” for?
Is the loop calling sref-inverse really efficient (the builtin function doing some one time caching) or it there a repeated scanning of the array which gives it quadratic time behavior?

You know I wrote my own naive zero crossing finder as a part of something else. There is no sort of “compilation” of my .ny files? Is interpretive overhead great enough that it’s worth it to write (plusp x) and not (> x 0) if I must write my own loops?

    ;;indexing
(Defun split (sig size)
  (let* ((sr (snd-srate sig)) (split-list (list 0))
	 (single (snd-from-array 0 sr (vector (/ (float size)))))
	 (zero (snd-from-array 0 sr #(0)))
	 (index (mult  sr (integrate (trigger sig  single))))
	 (last (sref (sound index) (* (- (snd-length index ny:all)
					 1)  (/ sr)))))
    (display "" last)
    (dotimes (num (truncate last))
      (setf split-list (cons (float (sref-inverse index (1+ num))) split-list)))
    (reverse (cons (second (snd-extent sig ny:all)) split-list))
    ))					; end split

Original discussion here https://forum.audacityteam.org/t/zero-crossing-detection/26237/1

The Zero variable is not used because the function is only an extract from a greater script.
It is initially used to extent s by a Zero sample.
This is necessary because integrate doesn’t otherwise return the last running sum for the Signal it is fed with.
The internal integrate function has some pecularities that I am not looking for in certain situations.

  • The first value is always 0
  • The running sum is divided by the sound-srate
  • The las sample value is omitted and not added up.
    Example code:
(setf sig (snd-from-array 0 44100 
  (vector   1 0 0.1 0.3 0.9 0.1 0 -0.4 -0.3 -0.1 0.5)))
(print (snd-samples sig ny:all))
(print (snd-samples
   (mult 44100 (integrate sig)) ny:all))
(print (snd-samples
   (snd-biquad sig 1 0 0 1 0  0 0) ny:all))

Returns the original Signal along with the Output from integrate (scaled) and an alternative Approach.

#(1 0 0.1 0.3 0.9 0.1 0 -0.4 -0.3 -0.1 0.5)
#(0 1 1 1.1 1.4 2.3 2.4 2.4 2 1.7 1.6)
#(1 1 1.1 1.4 2.3 2.4 2.4 2 1.7 1.6 2.1)

I do not quite understand, what you’re hinting at with the Loop construct and snd-inverse.
It functions as follows:
The Trigger function places a 1 at each Zero crossing.
(0 1 0 0 0 1 0 0)
After Integration, we have a step function:
(0 1 1 1 1 2 2 2)
sref-inverse searches for those hard steps.
This may not be ideally with increasing distance of the Zero crossing in comparison with T0 and causes each search to take longer.
However, usually only a part of the Signal (e.g 200 MS) is passed to the function.
I’ve not studied your code, but I guess it does its Job.
Something like:

  • get 2 samples
  • if one is 0 take this time index.
  • if one is positive and the other negative, control that the minimal value is small enough and take the time index.
    Example: You search for the Zero crossing in a saw wave.
    The first positive/negative pair you encouter could be 1 and -1.
    Since the smallest (absolute) value is still big, you will search further till you find a Zero or a pair like -0.00013 and 0.0021.

To clarify, the questions, about how snd-inverse is implemented under the hood, are these:

(1) if you call snd-inverse repeatedly on the same sound, are the later calls constant-time, because the first call precomputes and remembers an inverse lookup for the whole sound?

Or, (2) does snd-inverse restart from nothing each time you call it, scanning the sound always from the start until it finds a value? so the time it takes to answer (for a function that increases at a constant rate) is proportional TO the answer?

or – this just occurs to me – (3) does it “partially” generate its inverse only as needed, so neither of the above is right?

If (2) is the case (the obvious, dumb implementation), then calling snd-inverse on a nondecreasing sound, for every value that the sound takes, will take a total time that is not proportional to the length of the sound but (assuming the same constant slope of function) varies quadratically with the length of the sound. But if (1) or (3) is the case, it could vary linearly.

Maybe I just shouldn’t worry for typical short enough inputs.

“Neither of the above is right” or all…
You’ve mixed up two different functions:

  • (sref-inverse)
  • (snd-inverse)
    The former searches for the first appearance of a value and Returns the time index.
    The latter constructs lookup-tables.
    snd-inverse is not used by the Zero crossing detection.

Assuming that you mean sref-inverse, it’s the latter. sref-inverse searches for the first point at which it achieves value and return the corresponding (linearly interpolated) time. This is obviously not ideal for processing long sections, but for processing short sections it’s not a problem, and it’s much faster than looping through the samples in LISP.

sref-inverse requires care - if the specified value is not found it returns an error.

There’s not much difference in speed between “>” than “plusp”. (On Linux, plusp is fractionally quicker but hardly significant). Looping in LISP will always be relatively slow, so for looping through samples it is usually much faster to use one of the built in functions (written in C).

Ah, so you mean sref-inverse could give the routine “quadratic” complexity as I said… but the linearity of the LISP solution would have such a big coefficient that it is still a loser.

The LISP loops I have already written and used seem tolerably fast for the amounts of data I give it. Now I’m experimenting with snd-fft frames and have to write loops within loops. But if this is the better trick for zero crossing finding I’ll use it. Indeed I’ll want just one crossing, not a list of them.

I just looked up the distinction between sref-inverse and snd-inverse. Might the use of snd-inverse be an improvement in time complexity here?

It is probably easiest to search for the Zero crossing from a given time index via an ordinary Loop where the search in both directions is possible.
However, a list is sometimes very useful and the above method can easily be adapted to search for pauses greater than an arbitrary length.
Besides, Sref-inverse can totally be ignored, when the time Indexes is taken from within the triggerr function itself (low Level snd-trigger).
The nice Thing about lists is that you can sort, arrange and modify them any way you like.
In comparison, Arrays are very poorly implemented.

  • only vectors
  • no redimensioning possible
  • only “length” works with it.
  • no pointers to segments (like cdr)
  • no individual function calls for each element.
    As to your question from “elementary FFT”: mult, rrecip, integrate etc. do not work with (whole) Arrays.

(you’ve written meanwhile…)
snd-inverse won’t be a great improvement I guess, but some things you have to simply try by yourself.

But I suppose I could use snd-from-array and snd-fetch-array to convert?

Would this change a frame to the array of squares, faster than loops written in lisp? It is less code to read.

(let* ((snd (snd-from-array 0.0 1.0 frame))
       (sndsq (mult snd snd))
       (len (length frame)))
  (snd-fetch-array sndsq len len))

Then… what if I do this, and access only the odd-numbered entries? Is this the “cumulative” power spectrum, telling me how much energy is at or below each frequency? (zeroth is 0, first counts the DC which I ignore, third sums that with the squares of the first sine and cosine, etc. There is a scaling factor of (len/2)^2 that I ignore.)


(let* ((foo (setf (aref frame 0) 0.0)) ; just ignore the DC
       (snd (snd-from-array 0.0 1.0 frame))
       (sndsq (mult snd snd))
       (integral (integrate sndsq))
       (len (length frame)))
  (snd-fetch-array integral len len))

I think I account for the first issue below as described, for the second by making 1.0 the second argument of snd-from-array, and for the third (loss of the dog-whistle Nyquist frequency) by just ignoring it.

  • The first value is always 0
  • The running sum is divided by the sound-srate
  • The las sample value is omitted and not added up.

Yes that’ll be pretty fast.

A good way to get an idea of how quick or slow a function will be is to write a simple test case that can be run in the Nyquist Prompt effect, for example:

(if (boundp '*scratch*)
    (let* ((snd (snd-from-array 0 1 *scratch*))
           (sndsq (mult snd snd))
           (len (length *scratch*)))
      (setf result (snd-fetch-array sndsq len len))
      (setq *scratch* '*unbound*)
      (format nil "Got ~a squares in array" (length result)))
    (let* ((len (truncate len))
           (step len))
      (setf *scratch* (snd-fetch-array s len step))
      (print "Array copied to *scratch*")))

The first run is just to get an array that we can use - in this case, from “S” (the selected audio) and saved in scratch.
We are interested in the speed of the second run when we calculate the squares.


and then we can compare it with a simple LISP loop:

(dotimes (i (truncate len))
  (let ((val (snd-fetch s)))
    (* val val)))
(format nil "Returns nil at end: ~a" (snd-fetch s))

Note that we need to reference the loop in some way, or “lazy evaluation” will just skip running the loop, so here I’m printing one more (snd-fetch s) to ensure that the loop is executed.

I was rethinking and editing my last post, I think our messages crossed.