Framing & windowing speech signal

steve · November 16, 2020, 1:06pm

“Framing” usually means splitting the data into short, sequential, possibly overlapping chunks (frames).
“Windowing” means multiplying each chunk by a tapering function (a window) so that overlapping chunks blend smoothly from one to the next. (More info about window functions here: https://en.wikipedia.org/wiki/Window_function)
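As a rough illustration of the two ideas above, here is a minimal Python sketch (standard library only). The function names, the frame length of 8 samples and the hop of 4 samples are my own choices for the example and are not taken from Audacity. It splits a signal into frames that overlap by half, applies a periodic Hann window to each frame, and adds the windowed frames back together. With this particular window and overlap, the overlapping windows sum to 1, so the interior of the signal is rebuilt unchanged.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into frames of frame_len samples, one starting every hop samples."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames


def hann_window(length):
    """Periodic Hann window: 0.5 * (1 - cos(2*pi*n/length))."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / length))
            for n in range(length)]


def apply_window(frame, window):
    """Multiply a frame by the window, sample by sample."""
    return [s * w for s, w in zip(frame, window)]


def overlap_add(frames, hop, total_len):
    """Sum the frames back into one signal, each offset by hop samples."""
    out = [0.0] * total_len
    for i, frame in enumerate(frames):
        start = i * hop
        for n, value in enumerate(frame):
            out[start + n] += value
    return out


# Hypothetical test signal: 32 samples of constant value 1.0.
signal = [1.0] * 32
frame_len, hop = 8, 4  # frames overlap by half

window = hann_window(frame_len)
frames = [apply_window(f, window)
          for f in frame_signal(signal, frame_len, hop)]
rebuilt = overlap_add(frames, hop, len(signal))
```

Only the first and last `hop` samples differ from the input, because they are covered by a single window. In a real effect, each windowed frame would be processed (for example, scaled by a per-frame gain) before being added back.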

This code from Audacity’s Noise Removal effect gives an example of how windowing may be implemented in C++: https://github.com/audacity/audacity/blob/master/src/effects/NoiseRemoval.cpp

The following code, taken from the “AGC” effect (https://forum.audacityteam.org/t/agc-automatic-gain-control/26706/1), gives an example of how windowing may be implemented in an Audacity Nyquist script:

;nyquist plug-in
;version 4
;type process
;name "Automatic Gain Control..."
;action "Applying AGC..."
;author "Steve Daulton and Nicholas Kudriavtsev"
;helpfile "agc-help.html"
;copyright "Released under terms of the GNU General Public License version 2"

;control prefilter "Pre-filter audio source" choice "None (Music),Voice,Telephone" 1
;control mix "AGC strength" int "%" 100 0 100

;; To enable the "Gain reaction speed" control, remove one semicolon from the start of the next line:
;;control framesize "Gain reaction speed" real "seconds" 0.5 0.1 10

;control floor "Squelch threshold level" int "dB" -60 -60 -6
;control gate "Squelch attenuation" real "dB" 0 -30 0


(defun chan-max (sig)
"If stereo, return max of L/R"
  (if (arrayp sig)
      (s-max (snd-abs (aref sig 0))
             (snd-abs (aref sig 1)))
      sig))


(defun raised-cos (phase)
"Generate raised cosines for overlapping smoothing windows."
  (sum 1 (osc (hz-to-step (/ framesize)) 1 *sine-table* phase)))


(defun agc (sig)
  (let ((g0list '(0.0))     ; break-point lists for stepping gain adjustments
        (g1list '(0.0))
        (g2list '(0.0))
        ;; Each sample of "peaks" is the peak level of a 'framesize' window
        ;; where each window starts 1/3 of a frame after the previous one
        ;; (so successive windows overlap by 2/3)
        (peaks (snd-avg (chan-max sig) frameln step OP-PEAK)))
    (do* ((peak (snd-fetch peaks) (snd-fetch peaks))
          (phase 0 (if (= phase 2) 0 (1+ phase)))
          (start 0 (+ start stepsize))
          (end framesize (+ start framesize)))
         ((not peak))
      (setf gain
        (if (> peak floor)  ; Squelch threshold
            (/ peak)        ; Normalize
            gate))          ; Squelch level
      (case phase
        (0 (setf g0list (append g0list (list start gain end gain))))
        (1 (setf g1list (append g1list (list start gain end gain))))
        (t (setf g2list (append g2list (list start gain end gain))))))
    (mult sig (/ 3.0)
      (sim
        (mult (abs-env (pwlv-list g0list)) (raised-cos 270))
        (mult (abs-env (pwlv-list g1list)) (raised-cos 150))
        (mult (abs-env (pwlv-list g2list)) (raised-cos 30))))))


;; The global variable "framesize" may be set by a control (if the control is enabled).
;; If the control is disabled (default), set it conditionally to 0.5 seconds.
(if (not (boundp 'framesize))
    (setf framesize 0.5))

(setf step (truncate (/ (* framesize *sound-srate*) 3.0)))  ; step is 1/3 of a frame (in samples)
(setf stepsize (/ step *sound-srate*))      ; step size in seconds
(setf frameln (* 3 step))                   ; frame size in samples
(setf framesize (/ frameln *sound-srate*))  ; actual frame size seconds

(setf wet (/ mix 100.0))
(setf dry (- 1 wet))

;; The effect GUI could be simplified by omitting the "Squelch attenuation" control
;; and calculating a value based on the "floor" threshold level.
;; In practice, the added flexibility of a separate control is often useful.
(if (boundp 'gate)
    (setf gate (db-to-linear gate))
    (setf gate (/ (db-to-linear (* -1.0 (+ floor 30))) 2.0)))
(setf floor (db-to-linear floor))


;; Prefiltering:
(case prefilter
  (1  (setf *track*
        (if (>= *sound-srate* 16000)
            (lowpass2 (highpass4 *track* 150) 7000)
            (highpass4 *track* 150))))
  (2  (setf *track*
        (if (>= *sound-srate* 11025)
            (lowpass4 (highpass4 *track* 300) 5000)
            (highpass4 *track* 300)))))

;; wet/dry mix
(sum
  (mult wet (agc *track*))
  (mult dry *track*))
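
The `agc` function above multiplies its three stepped gain envelopes by raised cosines with phases 270, 150 and 30 degrees, adds them, and scales the result by 1/3. The short Python check below suggests why that works. It assumes that `osc` with `*sine-table*` produces a sine wave with its phase given in degrees, so `raised_cos` here is my own stand-in for the Nyquist `raised-cos`, not a translation of the plug-in. Three such windows spaced 120 degrees apart always add up to 3, so dividing by 3 gives an overall window weight of 1.

```python
import math


def raised_cos(t, frame_seconds, phase_degrees):
    """1 + sin(2*pi*t/frame_seconds + phase), a window that ranges from 0 to 2."""
    angle = 2.0 * math.pi * t / frame_seconds + math.radians(phase_degrees)
    return 1.0 + math.sin(angle)


frame_seconds = 0.5       # default "framesize" in the plug-in
phases = (270, 150, 30)   # phases used for g0list, g1list and g2list

# Sample the sum of the three windows over one second (two frame periods).
times = [i * 0.01 for i in range(100)]
sums = [sum(raised_cos(t, frame_seconds, p) for p in phases) for t in times]

# Value of the first window at the moment its first frame begins.
first_window_at_start = raised_cos(0.0, frame_seconds, 270)
```

Every value in `sums` is 3 (to within rounding error), and `first_window_at_start` is 0. Under the same assumption, each window is at zero exactly where its own gain envelope steps to a new value, which is what hides the steps.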