I have a few very large Audacity projects (1-2 hours each, 8 mono tracks) which I would like to open and analyse in C++. I have been going through the source code trying to find a simple way of reading these projects, but so far in vain.
Well, I’m only getting started and am not completely sure of all the details yet, but it will surely involve a few Fourier transforms, comparisons between the channels, …
I just need easy and fairly “transparent” access to the data. Converting to wav files is alright, but they are such large files that it can be a little inconvenient.
You could export manageably sized sections of the recording in raw PCM format.
By default Audacity stores the data in 32 bit float format, so exporting as 32 bit float PCM RAW will be an exact copy of the project data.
“File menu > Export Selection”
then select “Other uncompressed formats” as the file format,
then click the Options button to select RAW (headerless), 32 bit float.
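For the C++ side, here is a minimal sketch of loading such an export. It assumes the file is headerless 32 bit float, that it was written on a machine with the same byte order as the one reading it, and that it holds a single mono track (a multi-channel export would have interleaved samples). The function name readRawFloat32 is only illustrative.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Reads a headerless 32-bit float PCM file into memory.
// Assumption: the file uses the native byte order of this machine.
std::vector<float> readRawFloat32(const std::string& path)
{
    static_assert(sizeof(float) == 4, "this sketch expects a 32-bit float");

    // Open at the end so tellg() reports the file size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in)
        throw std::runtime_error("cannot open " + path);

    const std::streamsize bytes = in.tellg();
    in.seekg(0, std::ios::beg);

    // Any trailing bytes that do not make up a whole sample are ignored.
    std::vector<float> samples(static_cast<std::size_t>(bytes) / sizeof(float));
    in.read(reinterpret_cast<char*>(samples.data()),
            static_cast<std::streamsize>(samples.size() * sizeof(float)));
    if (!in)
        throw std::runtime_error("short read on " + path);

    return samples;
}
```

Calling readRawFloat32("section1.raw") (a hypothetical file name) returns the samples in order. The sample rate is not stored in a RAW file, so it has to be noted separately.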
The analysis however will be done in C++, hence why I was trying to find a way of easily opening an Audacity project and reading the data. I know how to export the data from the Audacity GUI.
I guess it comes down to the question, do you want to spend time and effort developing the code to read a project, or spend that time developing the code to analyze the data.
Audacity uses an open XML file format to encourage interoperability. Beginning with version 1.3.0, we are publishing the following DTD to aid those who want to develop software that reads and writes Audacity project files:
audacityproject-1.3.0.dtd
More documentation on the Audacity project file format is forthcoming.
This is an invitation to freely develop compatible software and I think the Audacity team should keep encouraging this philosophy. It is true that trying to retrieve audio from an Audacity project is not exceptionally easy, but it is far from being impossible for a C programmer.
First of all you need to download the DTD document from http://audacityteam.org/xml/audacityproject-1.3.0.dtd. This document describes the structure of your .aup project file, which can be browsed with any text editor. Though this official document is outdated, it will suffice to locate and retrieve audio information. Essentially you have wavetracks that contain waveclips that contain waveblocks, each one pointing to an .au file. The naming of these files (each one at most about 1 MB) is awkward, but their order is unequivocal.
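As a rough sketch of that first step in C++, the block file names can be collected in document order without a full XML parser. This assumes each block file element in the .aup carries a filename="….au" attribute. Check that against your own project file, since the DTD is outdated. The function name listBlockFiles is only illustrative.

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns every .au file name referenced in the .aup text, in document order.
// Assumption: block files appear as filename="something.au" attributes.
std::vector<std::string> listBlockFiles(const std::string& aupText)
{
    static const std::regex pattern(R"re(filename="([^"]+\.au)")re");

    std::vector<std::string> names;
    for (std::sregex_iterator it(aupText.begin(), aupText.end(), pattern), end;
         it != end; ++it)
    {
        names.push_back((*it)[1].str());  // capture group 1 is the file name
    }
    return names;
}
```

This returns the names for all tracks in one list. With several mono tracks you would first split the text at each wavetrack element and run the function on each piece. A proper XML library would be more robust than a regular expression.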
Second, you need the format of the .au files that are mentioned in the .aup XML document. You will find them by digging inside the project’s data directory, which sits in the same directory as the .aup file.
This is the more difficult part since there is no complete official information about its structure (it is similar but not identical to Sun Microsystems’ .au files). However, after doing some research I managed to put most pieces together. The structure seems to be something like this:
Bytes                 Content
4                     “dns.” [exact reverse of .snd, the “magic number” of Sun Microsystems’ .au]
4                     Offset to audio data (little endian)
4                     0xFFFFFFFF [code for unknown length, may be irrelevant?]
4                     0x06000000 [in Sun’s .au format this field is the encoding; 6, stored little endian, means 32 bit IEEE float]
4                     Sampling rate in Hz (little endian)
4                     Number of interleaved channels (little endian)
8                     "Audacity"
12                    "BlockFile112" [Why "112"?]
Offset - 44           1/85 subsampling of audio data, presumably for quick waveform rendering at very low zoom
File_size - Offset    Audio in 32 bit IEEE-754 floating point
The fields you are interested in are the second one (the offset) and the last one (the audio itself). Each file contains only a tiny portion of audio (about 6 s).
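To make the layout concrete, here is a sketch in C++ that pulls the sampling rate, channel count and samples out of one block file already loaded into memory. It follows the field positions in the table above, which come from my own research rather than an official specification, so treat them as assumptions. The names BlockAudio, readU32LE and parseBlockFile are only illustrative.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

struct BlockAudio {
    std::uint32_t sampleRate = 0;
    std::uint32_t channels = 0;
    std::vector<float> samples;
};

// Assembles a 32-bit little-endian value regardless of host byte order.
static std::uint32_t readU32LE(const std::vector<unsigned char>& b, std::size_t pos)
{
    return  static_cast<std::uint32_t>(b[pos])
         | (static_cast<std::uint32_t>(b[pos + 1]) << 8)
         | (static_cast<std::uint32_t>(b[pos + 2]) << 16)
         | (static_cast<std::uint32_t>(b[pos + 3]) << 24);
}

// Assumed layout (from the table above): magic at byte 0, offset at 4,
// sampling rate at 16, channels at 20, audio from 'offset' to end of file.
BlockAudio parseBlockFile(const std::vector<unsigned char>& bytes)
{
    static_assert(sizeof(float) == 4, "this sketch expects a 32-bit IEEE float");

    if (bytes.size() < 24 || std::memcmp(bytes.data(), "dns.", 4) != 0)
        throw std::runtime_error("not a little-endian block file");

    const std::uint32_t offset = readU32LE(bytes, 4);
    if (offset < 24 || offset > bytes.size())
        throw std::runtime_error("bad audio offset");

    BlockAudio out;
    out.sampleRate = readU32LE(bytes, 16);
    out.channels   = readU32LE(bytes, 20);

    const std::size_t count = (bytes.size() - offset) / 4;
    out.samples.resize(count);
    for (std::size_t i = 0; i < count; ++i) {
        const std::uint32_t bits = readU32LE(bytes, offset + 4 * i);
        std::memcpy(&out.samples[i], &bits, 4);  // reinterpret the bits as a float
    }
    return out;
}
```

Concatenating the samples of each block file, in the order the .aup lists them, should give back the full track. A file starting with ".snd" instead of "dns." would be big endian and is rejected here.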
Please comment if this has been of help.
Federico Miyara
Just a comment:
Developing an application that is interoperable with Audacity projects does not require the C/C++ programming language. Any computer language could be used to create a program that is capable of reading/writing Audacity projects. The project format is independent of the programming language.
Interoperability with the Audacity project format is not in itself required to access the data of an Audacity project. There are many other approaches that can be taken. The “best” approach depends on what you are trying to achieve.
An example of where it might be best to develop direct reading of Audacity projects could be if you already have a working application and you want to add support in that application for Audacity projects.
An example of where developing a module might be the best approach could be if you wish to develop a program that adds functionality to Audacity.
An example of where using the Audacity GUI to export audio data might be the best approach could be if you have a few Audacity projects and you want to analyze data from those projects.
Other approaches include:
Using mod-script-pipe.
Developing your application as a plug-in (for example, a LADSPA, Nyquist, AU or VST plugin).
Modifying the Audacity source code so as to add the required functionality into Audacity.
Developing an application that is interoperable with Audacity projects does not require the C/C++ programming language. Any computer language could be used to create a program that is capable of reading/writing Audacity projects. The project format is independent of the programming language.
I agree. I just mentioned C programming because Enedrox was programming in C. I myself feel more at ease with Matlab or Scilab than with C.
As regards
Other approaches include:
Using mod-script-pipe.
Developing your application as a plug-in (for example, a LADSPA, Nyquist, AU or VST plugin).
Modifying the Audacity source code so as to add the required functionality into Audacity.
the fact is that for those of us who are used to other languages, it is easier to dig into .aup / .au / .auf files and recycle well-tested pieces of code that we have developed before than to try to express our ideas in a different, unfamiliar framework, at least to begin with.