Automation using Audacity

Hello Guys,

I am trying to automate some tasks in Python, and part of that requires automating Audacity as well.
Use case:

  1. There are two WAV files; open them one at a time.
  2. In the WAV file that contains the audio stream for which it was generated, there will be multiple audio tracks → select the one audio track that contains the spoken utterance.
  3. In the selected audio track, let's say it plays the words "left" and "right" → select the spoken word "left" → and get the timestamp of the point where the stream actually starts (where the waveform rises in the positive direction).
  4. Record the timestamp and the file name (the WAV file is selected at random from the two available).
  5. Import a PCM file into Audacity with some input parameters.
  6. Repeat steps 2 and 3 (only the spoken word "left" should be selected, because that is the word that was selected in the WAV file).
  7. Get the sample number of the starting point of the "left" utterance.


I would be really thankful if anyone could help me with this.

The first thing you need is a version of Audacity that has “mod-script-pipe”.
It is planned to include mod-script-pipe in the next Audacity release. If you don’t have mod-script-pipe, then there is a pre-release “alpha” version available here: https://www.fosshub.com/Audacity-devel.html

To access mod-script-pipe from Python, I’d suggest that you import the pipeclient module which is available here: https://github.com/audacity/audacity/blob/master/scripts/piped-work/pipeclient.py
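
As a rough sketch of how that might look from Python, assuming pipeclient.py has been saved next to your script and Audacity is already running with mod-script-pipe enabled. The command names used here (Import2, SelectTracks) are my assumption about the scripting command set in your build, so check them against its scripting reference; the helper function names are made up for illustration.

```python
import time


def import_command(path):
    """Build the command that imports an audio file (assumed command name: Import2)."""
    return f'Import2: Filename="{path}"'


def select_track_command(track_index):
    """Build the command that selects one track by zero-based index (assumed: SelectTracks)."""
    return f"SelectTracks: Track={track_index} TrackCount=1 Mode=Set"


def send_commands(commands, timeout=10.0):
    """Send each command to Audacity in turn and collect the replies."""
    # Imported here so the command builders above can be used without pipeclient.
    import pipeclient  # pipeclient.py from the Audacity repository

    client = pipeclient.PipeClient()
    replies = []
    for command in commands:
        client.write(command)
        reply = ""
        start = time.time()
        # read() returns an empty string until Audacity has replied.
        while reply == "" and time.time() - start < timeout:
            time.sleep(0.1)
            reply = client.read()
        replies.append(reply)
    return replies
```

For example, send_commands([import_command("utterance.wav"), select_track_command(0)]) would import a file and select its first track, where "utterance.wav" is only a placeholder name.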


Audacity does not have a built-in way to detect the start of a sound, so you would need to make a Nyquist plug-in to do that. See: https://wiki.audacityteam.org/wiki/Nyquist_Plug-ins_Reference
If you know that the tracks will always be mono, you could do something like this (this code may be run as is in the Nyquist Prompt):

(setf labels ())  ;this list will hold the labels
(setf threshold 0.001) ;Threshold level for non-silence


;; Function to add a label at the n'th sample position.
(defun add-label (n)
  (setf seconds (/ n *sound-srate*))
  (setf time (format nil "~a" seconds))
  (push (list seconds seconds time) labels))

;; Loop through sound until sample > threshold.
(do ((i 0 (1+ i))
     (val (snd-fetch *track*) (snd-fetch *track*)))
    ((not val) "Error.\nNo sound found.")
  (when (> val threshold)
    (add-label i)
    (return labels)))
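
If you want to cross-check the result outside Audacity, the same threshold test can be sketched in plain Python. This is only an illustration under some assumptions: the file is a mono, 16-bit PCM WAV, samples are scaled to the range -1.0 to 1.0 so that the threshold is comparable to the one above, and the function name is made up. It returns both the sample number and the time in seconds.

```python
import struct
import wave


def first_sample_above(path, threshold=0.001):
    """Return (sample_index, seconds) for the first sample above threshold.

    Returns None if no sample exceeds the threshold.
    Assumes a mono, 16-bit PCM WAV file.
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles mono 16-bit PCM")
        rate = wav.getframerate()
        index = 0
        while True:
            frames = wav.readframes(4096)
            if not frames:
                return None
            # Unpack little-endian signed 16-bit samples.
            samples = struct.unpack("<%dh" % (len(frames) // 2), frames)
            for value in samples:
                # Scale to -1.0..1.0 and test the positive direction only.
                if value / 32768.0 > threshold:
                    return index, index / rate
                index += 1
```

Like the Nyquist code, this only looks for samples that rise above the threshold in the positive direction.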