wow & flutter spectrum analysis

Hi

Id like to work on a plug-in to allow spectrum analysis of wow & flutter, but as to my limited skills, both in Nyquist coding as well as in digtal signal processing I could need same help.
:slight_smile:

My aim is to prepare a given sound file of approximate 30 sec length - containing a record of a testtone in the range of 3kHz - to allow spectral analysis down to as low frequencies as possible.

What I guess is need so far are four steps:

  1. preparing for demodulation and pitch detect
  2. performing pitch detect
  3. demodulating the testtone by multiplying with the detected pitch
  4. preparing for spectral analysis

part 3) seems to be the most easy part
Where “demod” is the demodulated signal and “carrier” is the pitch of the testtone :

(setf demod (mult s (hzosc carrier)))

for part 2) I already got a working solution from steve
https://forum.audacityteam.org/t/pitch-detection-plug-in/29126/8

This leavs part 1) and 4) for now.

  1. preparing for demodulation and pitch detect I guess must include
  • DC offset removal and bandpass limiting which I think is sufficient like this
(hp s 2800)
(lp s 3300)
  • trimming the start fo the testtone (for correct demodulation)
    My idea would be to do a search for the first sample that is the closest to 0 within lets say the first second of the testtone and delete anything in front.
    Furthermore the sample following the found sample should be greater, as to ensure that the testtone sine starts out in the positive direction.

Any ideas how to get that trimming done ?

Is this for recordings on tape?

If you have a sine wave as the test tone, then assuming negligible distortion and negligible DC offset, you can calculate the frequency very accurately by counting the zero crossing points (the places at which the waveform crosses the central “zero” line) over a given period.

To avoid problems with DC offset, apply a low frequency high pass filter to the sound, and then discard the first bit of the recording.
In Nyquist you can do that (for mono tracks) with something like:

(setf s (extract-abs 1 11 (highpass2 s 20)))

This will apply a 20 Hz high-pass filter to the waveform, and then extract from 1 to 11 seconds of the selected audio (assuming that the initial selection was at least 11 seconds long).

The time from one zero crossing point to the next will be half the wavelength of the tone.
Taking an average over enough cycles will give a very accurate value.
Note that because this only gives the zero crossing point to the nearest sample, to get an accurate figure you will need to average over a sufficiently large audio section, say 1 second.

Assuming that your test tone is around 400 Hz, the following code should give a pretty accurate frequency measurement:

(defun count-samples (sig crossings)
  (do* ((count 0)    ; number of samples between crossing points      
        (total 0)    ; total counter
        ; get the next sample value
        (val (snd-fetch sig) (snd-fetch sig))
        (zcount 0)   ; zero crossing counter
        (flag val))    ; sign of last value.
       ; stop when crossed zero twice and return 'count'
       ((> zcount crossings) total)
    (when (< (* val flag) 0)  ; crossed zero
      (setq zcount (1+ zcount))
      (setq total (+ total count))
      (setq count 0))
    (setq flag val)
    (if (> zcount 0) (setq count (1+ count)))))

(setq cycles 400) ; about 1 second for a 400 Hz signal

(setq sample-count (count-samples s (* 2 cycles)))
(print (/ (* cycles *sound-srate*) sample-count))

Note that this code is pretty slow because it is looping through the audio one sample at a time.
For a succession of values, just repeat the evaluation (try running this with “Debug” rather than “OK” in the Nyquist Prompt:

(defun count-samples (sig crossings)
  (do* ((count 0)    ; number of samples between crossing points      
        (total 0)    ; total counter
        ; get the next sample value
        (val (snd-fetch sig) (snd-fetch sig))
        (zcount 0)   ; zero crossing counter
        (flag val))    ; sign of last value.
       ; stop when crossed zero twice and return 'count'
       ((> zcount crossings) total)
    (when (< (* val flag) 0)  ; crossed zero
      (setq zcount (1+ zcount))
      (setq total (+ total count))
      (setq count 0))
    (setq flag val)
    (if (> zcount 0) (setq count (1+ count)))))

(setq cycles 400) ; about 1 second for a 400 Hz signal

; High pass and extract.
(setf s (extract-abs 1 11 (highpass2 s 20)))

(dotimes (i 10)
  (setq sample-count (count-samples s (* 2 cycles)))
  (print (/ (* cycles *sound-srate*) sample-count)))

Thaks Steve for your kind help

Yes, its primarily meant for tape analysis but may be useful for the vinyl guyes just as well.


I’ll try the code you suggest and will report back soon
:smiley:

Two questions so far

I’d like to apply a low pass at say 5kHz as well, as to avoid false zero detection by high frequnecy garbage.
(lp s 5000) does not have any effect when I check spectrum
(highpass2 s 1000) for the 3kHz testtone on the other hand works great

Your code seems to provide extreme precise frequency detection, at least when I run the non looped version in the prompt.
Sadly I badly failed to assign the frequency to a variable for further processing.

Why is it, that I cant write something like that:
(setq carrierfreq (/ (* cycles sound-srate) sample-count))

Instead of the “print” command

Uhhh, my bad
(setf s (lp s 5000)) works, forgot to assign

You can. That looks perfectly valid.

Ok.

Next line is

(setf demod (mult s (hzosc carrier)))

which works perfect in a former version, but this time audacity crashes fully and has to be restarted

here ist the full code so far:
(please remember the testtone is about 3kHz and the file is about 30sec)

;nyquist plug-in
;version 3
;type process
;name "TEST for WOW FLUTTER 3 - Demoduation and Normalization..."
;action "preparation for for analyzing wow and flutter..."


    (defun count-samples (sig crossings)
; number of samples between crossing points     
      (do* ((count 0)    
; total counter
            (total 0)    
; get the next sample value
            (val (snd-fetch sig) (snd-fetch sig))
; zero crossing counter
           (zcount 0)   
; sign of last value 
           (flag val))    
; stop when crossed zero twice and return 'count'
           ((> zcount crossings) total)
        (when (< (* val flag) 0)  
; crossed zero
          (setq zcount (1+ zcount))
          (setq total (+ total count))
          (setq count 0))
        (setq flag val)
        (if (> zcount 0) (setq count (1+ count)))))

; about 2 second for a 3kHz signal
    (setq cycles 6000) 


;;debugflags trace

; remove garbage lowpass at 5kHz
	(setf s (lowpass2 s 5000))
; remove garbage High pass at 1kHz and extract
	(setf s (extract-abs 1 21 (highpass2 s 1000)))
    
(setq peak-s (peak s NY:ALL))
(mult (/ 1.0 peak-s) s)

(setq sample-count (count-samples s (* 2 cycles)))
(setq carrier  (/ (* cycles *sound-srate*) sample-count))

(setf demod (mult s (hzosc carrier)))

Without the last line added, anything is fine and “carrier” pops up in display

The problem is that the function snd-fetch destroys the sound samples. http://www.cs.cmu.edu/~rbd/doc/nyquist/part8.html#index260
Snd-fetch is used in the function (defun count-samples (sig crossings)

To avoid the problem, when you call count-samples ensure that you pass a copy of “s” and not the actual sound “s”.
To do that, change line 43 of your code from:

(setq sample-count (count-samples s (* 2 cycles)))

to:

(setq sample-count (count-samples (snd-copy s) (* 2 cycles)))

ahh - thanks, now it works excellent.
The 3,15kHz carrier is completely suppressed

What you think, is the low freqeuncy rise of the am-demodulation process “real” data or an artefact to be corrected ?
am_demod_test.png

This is the test record by teh way:
test_record.png

but that spike at around 6.3 kHz is (the_carrier x the_carrier).
The signal that represents the “wow” is the low frequency component.

To eliminate the carrier, try a notch filter:
(notch2 sig 6300 1)
http://www.cs.cmu.edu/~rbd/doc/nyquist/part8.html#index444

The frequency of that very low frequency should be proportional to the change in frequency of the carrier.
That can give you an idea of the amount of wow, but not really a “measurement”.
For a measurement you need to convert the “change in carrier frequency” to amplitude. In other words you need FM demodulation rather than AM demodulation.

There are several approaches to FM demodulation, none of them very easy to do digitally (radios usually do that with an analogue demodulator). I’ll be interested to see what you come up with.
If you get stuck I can probably find a couple of code snippets that may help point toward a solution.

Ok thanks.

The 6kHz peak is of no concern for me, as the whole analysis is meant to be valid only up to roughly 1khz


About the FM demodulation, well might be I have something like a “concept” - well, surely nothing really thought through - I’ll be back.

In which case you can just use a low-pass filter to remove it.

I look forward to seeing what you come up with.

What, if we start to create a FM demodulated signal “from scratch” - rather than to try to “knead” the original signal?

If we sufficiently precise could detect every zero crossing point of the signal given - and thus could calculate the phase difference to the carrier - we then could assign this phase difference as an amplitude value at every zero crossing point of the carrier frequency, resulting in a signal where the phase / frequency information is then present in the amplitude.
Of course, the sampling frequency would then be different - eg related to the carrier frequency.

What you think ?

Here’s a code snippet that might be useful.
It creates a control signal with the proper zero crossing rate.
It should theoretically demodulate the signal as well, if the hop size is one.
One has also to divide by half the peak and subtract 1 to get a signal going from -1 to 1.
You will observe that the frequency goes down in the example output towards the end. This is because 44100 samples are averaged at each measuring point but there will be only silence for the last samples at time 3 seconds.
For a demodulation, the size and amp values have to be adapted (according to the expected carrier frequency). There are lots of other demodulators possible. This procedure is at least faster than a sample-by-sample analysis.

;; Zero crossing rate
;; Size = amount of samples that are averaged
;; hop = samples to advance,
;; determines the sample rate of the result
;; amp is used to disable the averaging 
(defun zcr (snd size hop amp)
  (let* ((zero (snd-from-array 0 (snd-srate snd) (vector amp))))
         (snd-avg (s-max 
            (trigger (mult -1 snd) zero) 
            (trigger snd zero)) size hop op-average)))
;; Modulated signal (sweep)
(setf mod-signal (abs-env (hzosc (pwl 0 440 1 440 3 880 3) )))
;; Output to debug screen
;; We gather all 44100 samples at a sample rate of 20 Hertz
(snd-display (zcr mod-signal 44100 2205 22050))

Sorry, I’m completely in the woods

What I can see is, that your code is genereating a signal at full amplitude with a very strange spectrum, that develops towards lower frequencies as the time of investigation (3sec in your example) is set to longer periodes.

Could you please elaborate on what this code is actually doing and why you have chosen to do it that way - meaning - in which way could it possibly benefit the purpose of creating a signal with the T of crossing points translated to amplitude values ?

please bear with me, as I’m no more than a bloody beginner here.
:sunglasses:

What I ment to do was:

  • calculate the time between the first and the second zero crossing of the signal under investigation

  • calculate the difference of this time to half the T of the carrier

  • set the result as the amplitude value at new sample point 1

  • calculate the time between the second and the third zero crossing of the signal under investigation

  • calculate the difference of this time to half the T of the carrier

  • set the result as the amplitude value at new sample point 2

and so on…

Without doing some fancy and complicated interpolations, measurements for single cycles will not be accurate enough to get meaningful results.
If you are simply counting the number of sample, then the error will be up to 1/sample-rate, which at 44100 Hz sampling frequency is around 0.023 milliseconds.
For a 3000 Hz sine wave, one full cycle is about 0.333 milliseconds, so your error margin is around 7%. The simplest way to achieve greater accuracy is to average the calculated duration over a longer period.

Ahhh … silly me

of course, the spectra is - and must be - limited towards low frequencies, as teh signal under investigation becomes shorter.
FM_sectra.png