Need advice about a bio-acoustics project using Audacity

bftech · November 9, 2018, 11:57am

Hi, I’m a technician in a French university ethology laboratory, and we use software to analyse bird and mammal songs. For a long time we have used software developed by someone from the lab years ago (the first version was on the Amiga!), but it is becoming harder and harder to install and we need something else. So my idea was to extract (or replicate) its useful features and add them to an up-to-date “host” like Audacity, which would take care of the “common things” like file I/O, simple effects, cutting and manipulating sounds… instead of me redoing everything.

But I’m stuck on HOW to do it. So far I have found these possibilities:

Forking Audacity: maybe the most ambitious option, but our goal is to make “real software” for bio-acoustics rather than a few small utilities, so it may not be a bad idea (plus I could change the interface).
Developing plug-ins for Audacity: maybe the easiest and fastest way? But which kind? Nyquist, LV2, VST, Vamp…?
Using mod-script-pipe: I could use Python, which I’m more familiar with, but in the end isn’t it like forking Audacity?

For my requirements, I would like it to be:

able to analyse sounds and extract data (frequencies and harmonics first, later pattern searching, cross-correlation, etc.); I don’t really need to edit the sounds.
cross-platform: Windows and Linux (Ubuntu) at least
easy to update: mainly in the case of a fork, I would like to benefit from regular Audacity updates without redoing everything at each release
easy to install: even if I have to learn how to package, create an installer or write a custom install script, I would like it to be easy for non-technical end users to install (that is mainly why we can’t use our old software anymore).

Sorry if this is not the right place to post or the right way to ask; I’m reading the wiki and the forum but maybe I’m missing something.

steve · November 9, 2018, 2:03pm

There are pros and cons to each approach, but you need not use just one approach.

Forking Audacity is, as you say, the most ambitious. It is the most flexible, and probably the most difficult. The code for Audacity is very complex, with around 600,000 lines of code (mostly C++). Even if you are a highly experienced C++ developer, it is likely to require a substantial amount of time and effort just to gain familiarity with the code base. As we use GitHub to host the Audacity code, making a fork is easy, and so long as your custom version does not diverge too far, it should be relatively easy to merge updates from the Audacity code, and to push updates from your version back “upstream” to Audacity.

Nyquist is probably the easiest. It is based on a dialect of the LISP programming language (“XLISP”) with a strong focus on audio. Prototyping can be very quick as there is a large library of audio related functions. Although Nyquist runs as an interpreted language, many of the primitives are written in C, so with careful programming it is often possible to achieve performance similar to a pure C/C++ application.

To give an idea of how rapidly features can be developed in Nyquist (and how little code is needed), here is a script that can be run from the Nyquist Prompt effect. It adds a label each time the selected audio rises above the specified threshold:

;type analyze
;version 4

;control thresh "Threshold" float "dB" -24 -72 -6

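(defun label-sounds (sig)
  ;; Add a label wherever the peak level rises above THRESH (linear scale).
  (let* ((step (round (/ *sound-srate* 100)))   ; block size: 1/100 second
         (sig (snd-avg sig step step OP-PEAK))  ; peak level of each block
         (srate (snd-srate sig))                ; blocks per second
         (labels ())
         (sound-flag nil))                      ; T while above threshold
    (do ((val (snd-fetch sig)(snd-fetch sig))
         (count 0 (+ 1 count)))
        ((not val) labels)                      ; end of audio: return labels
      (cond
        ((> val thresh)
            (unless sound-flag                  ; label only the first block
              (push (list (/ count srate) "Sound") labels))
            (setf sound-flag t))
        (t (setf sound-flag nil))))))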

(setf thresh (db-to-linear thresh))
(if (soundp *track*)
    (label-sounds *track*)
    (print "Mono track required"))
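
For readers more at home in Python, the logic of the script above can be sketched roughly as follows. This is only an illustration of the algorithm (block peaks compared against a threshold), not something Audacity runs: the names `db_to_linear` and `label_sounds` are made up for this sketch, the audio is assumed to be a plain list of float samples, and details such as how a final partial block is handled may differ from Nyquist's `snd-avg`.

```python
def db_to_linear(db):
    """Convert a level in dB to a linear amplitude (0 dB -> 1.0)."""
    return 10.0 ** (db / 20.0)


def label_sounds(samples, sample_rate, thresh_db=-24.0):
    """Return a (time_in_seconds, "Sound") pair for each upward crossing.

    Hypothetical helper: `samples` is assumed to be a list of floats
    in the range -1.0 to 1.0 for a single (mono) channel.
    """
    thresh = db_to_linear(thresh_db)
    step = max(1, round(sample_rate / 100))   # blocks of 1/100 second
    labels = []
    sound_flag = False                        # True while above threshold
    for start in range(0, len(samples), step):
        # Peak absolute value of this block (the role of OP-PEAK above).
        peak = max(abs(s) for s in samples[start:start + step])
        if peak > thresh:
            if not sound_flag:                # label only the first block
                labels.append((start / sample_rate, "Sound"))
            sound_flag = True
        else:
            sound_flag = False
    return labels
```

Calling `label_sounds(samples, 44100, thresh_db=-24.0)` on a mono recording would give one entry per burst of sound, which is the same information the Nyquist version hands back to Audacity as labels.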

Although Nyquist is quite an easy language, it may not be the best approach for everything. In particular, working with FFT is very slow and quite difficult.

“Mod-script-pipe” is the new kid on the block, and has recently received a major upgrade.
Basically, mod-script-pipe allows you to control Audacity from an external script (such as Python). Almost everything that you can do in Audacity manually (via the GUI) can be done via “scripting commands”.
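
As a rough idea of what such an external script looks like, here is a minimal Python sketch. It assumes Audacity is already running with mod-script-pipe enabled, and that the pipe names match those used by the `pipe_test.py` example shipped with the Audacity source; the helper names `pipe_paths`, `format_command` and `do_command` are invented for this sketch, and quoting every string parameter is a simplification rather than a documented rule.

```python
import os
import sys


def pipe_paths(platform=sys.platform):
    """Return (to_audacity, from_audacity, end_of_line) for a platform."""
    if platform.startswith("win"):
        return r"\\.\pipe\ToSrvPipe", r"\\.\pipe\FromSrvPipe", "\r\n\0"
    uid = str(os.getuid())
    return ("/tmp/audacity_script_pipe.to." + uid,
            "/tmp/audacity_script_pipe.from." + uid,
            "\n")


def format_command(name, **params):
    """Build a scripting command string such as 'SelectTime: Start=0 End=10'."""
    parts = []
    for key, value in params.items():
        if isinstance(value, str):
            value = '"' + value + '"'     # assumption: quote all strings
        parts.append("{}={}".format(key, value))
    return (name + ": " + " ".join(parts)).rstrip()


def do_command(command):
    """Send one command to a running Audacity and return its reply."""
    to_path, from_path, eol = pipe_paths()
    with open(to_path, "w") as to_pipe, open(from_path, "r") as from_pipe:
        to_pipe.write(command + eol)
        to_pipe.flush()
        reply = []
        while True:
            line = from_pipe.readline()
            # A blank line after some output marks the end of the reply.
            if line == "" or (line == "\n" and reply):
                break
            reply.append(line)
    return "".join(reply)
```

With Audacity running, something like `do_command(format_command("SelectTime", Start=0, End=10))` would select the first ten seconds of the project. Nothing is sent until `do_command` is called, so the two helper functions can be tried out without Audacity.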

Most of the scripting commands are also available to Audacity’s “Macro” feature (see: https://manual.audacityteam.org/man/macros.html), though with macros you can only run a linear sequence of commands.

A large portion of the scripting commands are also available to Nyquist via a special “AUD-DO” function (https://manual.audacityteam.org/man/nyquist_macros.html). The main limitation for running scripting commands from Nyquist is that you can’t run Nyquist scripts from an aud-do command (Nyquist currently does not support multiple simultaneous processes or threads).

I’d suggest not bothering with LADSPA as it is becoming obsolete.

VST “should” work on Windows and Linux, though we have to use 3rd party open source headers for VST, so there can be compatibility problems.

I don’t know much about VAMP, but I’ve seen some impressive VAMP plug-ins, and it is open source, and they are intended specifically for audio analysis. There’s some information here: https://www.vamp-plugins.org/

Regarding a starting point:

Audacity is an extremely versatile open source multi-track audio recorder and editor. This forum has had visitors who use Audacity in other fields of zoological research, including someone studying orcas and someone studying the nocturnal behaviour of apes. I think that time spent becoming familiar with Audacity would be time well spent.

Although Nyquist may not be suitable for ‘all’ of your requirements, it is probably the easiest and quickest way in to making custom plug-ins. Because Nyquist plug-ins are just plain text files (compiling not required), installing and modifying Nyquist plug-ins is very easy. We also have a forum board specifically for questions about Nyquist: https://forum.audacityteam.org/viewforum.php?f=39

Trebor · November 9, 2018, 3:54pm

IMO Audacity’s (less powerful) competitor ocenaudio has a better spectrogram: its settings are easier to adjust…

Whilst I’m being heretical, have a look at…

http://ravensoundsoftware.com/software/raven-lite/

bftech · November 12, 2018, 10:13am

@steve
Thanks for this very detailed answer!
As you recommend, I think I will start with Nyquist (and AUD-DO), or maybe Vamp, as I found a Python wrapper for it (VamPy). My only concern with Nyquist is Lisp, which seems very different from the languages I use (mostly Python, then JS, Java, …), but learning one more shouldn’t be that hard!
A fork would have been ideal for making a version dedicated to bio-acoustics, but I’m not a “highly experienced C++ developer”…

@Trebor
Thank you. I already looked at ocenaudio, but it only supports VST plug-ins, and we don’t need the real-time spectrogram settings that much (our researchers need fixed parameters to be able to compare results).
I tried Sonic Visualiser, and I really like the way you can instantly read data such as frequencies from the spectrogram with the mouse (I think I would need to fork Audacity to add that), but we also need to edit our recordings, clean them, reduce noise… and to keep things easy for our less tech-oriented users I would like to use only one program.
As for Raven, it seems the lab already tried it, but that was before I joined. I think it didn’t fit their “workflow”. Also, I can’t add our own functions to it to adapt it to our needs: it has no plug-ins and is not open source. But as it’s made for bio-acoustics, I will check Raven myself; maybe they just didn’t work out how to use it correctly. (It’s surprisingly hard for some people to change their habits ^^)

steve · November 12, 2018, 10:39am

Ooh, that’s interesting. I presume that you mean this: VamPy: Vamp Plugins in Python
I wasn’t aware of that. I’ll have to take a look (when I find time… sigh), then we can share notes.

LISP is a lovely language once you get used to all of the parentheses (though not as beautiful as Python).

Essential ingredients for Nyquist (LISP) programming:

A text editor that does parentheses matching (preferably one that has some form of LISP syntax highlighting). On Linux I use Scite, and on Windows Notepad++. LISP Syntax highlighting for Nyquist is not 100%, but it is close enough to be useful.

Careful indentation (you should be used to this from using Python). Although LISP does not “require” indentation in the way that Python does, it makes the code MUCH more readable. You should not be relying on counting parentheses to read the code.
Here’s an example from a plug-in that I wrote recently:

(defun stereo-rms(ar)
  ;;; Stereo RMS is the root mean of all (samples ^ 2) [both channels]
  (let ((left-mean-sq (* (aref ar 0)(aref ar 0)))
        (right-mean-sq (* (aref ar 1)(aref ar 1))))
    (sqrt (/ (+ left-mean-sq right-mean-sq) 2.0))))

You should be able to see (even without knowing LISP syntax) that:

This defines a function called “stereo-rms”
The function takes one argument called “ar” (the name could be more descriptive, but this is my shorthand for an array variable)
Nyquist is normally written in lower case (Nyquist is NOT case sensitive, but case is preserved in quoted string literals)
Semicolons (any number) are used for comments
The LET form in this example has two assignments: “left-mean-sq” and “right-mean-sq”.
The function returns the result of (sqrt (/ (+ left-mean-sq right-mean-sq) 2.0))
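
For comparison, a rough Python equivalent of the same function might look like this. It is only a sketch: it assumes, as the squaring in the LISP version suggests, that `ar` holds the two per-channel RMS values (left, then right), and the Python name `stereo_rms` simply mirrors the original.

```python
import math


def stereo_rms(ar):
    """Combine per-channel RMS values [left, right] into one stereo RMS."""
    left_mean_sq = ar[0] * ar[0]      # square of left RMS = left mean square
    right_mean_sq = ar[1] * ar[1]     # square of right RMS = right mean square
    return math.sqrt((left_mean_sq + right_mean_sq) / 2.0)
```

Line for line the structure is the same: the two local names play the part of the LET bindings, and the final expression is what the function returns.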

The other thing that you will probably notice is the “s-expressions” (symbolic expressions). In this kind of notation, functions are called as:
(function-name arguments)
so rather than writing (x + y + z), where the function is “+”, in LISP it is written (+ x y z), and no commas!

The example code above “could” have been written as:

(defun stereo-rms(ar)
(let ((left-mean-sq (* (aref ar 0)(aref ar 0)))
(right-mean-sq (* (aref ar 1)(aref ar 1))))
(sqrt (/ (+ left-mean-sq right-mean-sq) 2.0))))

or (even worse) as

(defun stereo-rms(ar) (let ((left-mean-sq (* (aref ar 0)(aref ar 0))) (right-mean-sq (* (aref ar 1)(aref ar 1)))) (sqrt (/ (+ left-mean-sq right-mean-sq) 2.0))))

but I hope you agree that the first form is much more readable.

I would highly recommend taking a glance at the links in this post: Manuals and reference material
Any questions regarding Nyquist may be posted in this forum board: Nyquist - Audacity Forum

Trebor · November 12, 2018, 11:13am

It’s been many years since I last used Raven; it has improved since I last saw it…

[Screenshot: Raven Lite (on Windows 32-bit OS)]
The lite version lacks the search-for-similar-sounds function you’re looking for,
I think you’ll need the Pro ($) version for that.

steve · November 12, 2018, 12:51pm

Unfortunately it looks like development of Vampy has slowed down to nothing over the last few years.
I’ve not tried building from source, but I’ve not been able to get Audacity to recognise the pre-built binaries (testing with Audacity 2.3.1 alpha on Linux). Have you had any success with it?