How to translate/decompose words in label track?

djulien · November 10, 2012, 7:24pm

How would I go about translating/decomposing Labels into phonemes using Audacity? The Label Track feature is very useful for adding and positioning lyrics, markers or “cues” into the audio, so it seems like most of the functionality of Papagayo is already there (http://www.lostmarble.com/papagayo/index.shtml). The only missing part seems to be the automatic decomposing of words into phonemes using a dictionary, so I was thinking that it should be possible to do this using Labels in Audacity. (There would also be the part about displaying mouth shapes, but that seems like an easier problem to solve).
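A minimal sketch of the dictionary-lookup step described above, in Python. It assumes a plain-text pronunciation dictionary with one entry per line: the word first, then its phonemes, separated by whitespace, with `;;;` marking comment lines. That layout, and the names `load_pronunciations` and `decompose`, are assumptions for illustration and are not taken from Papagayo or Audacity.

```python
def load_pronunciations(lines):
    """Build a word -> phoneme-list mapping from dictionary lines.

    Assumed (hypothetical) format: "WORD  PH1 PH2 ...", one entry per line,
    with lines starting with ";;;" treated as comments.
    """
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue
        word, *phones = line.split()
        # Keep only the first pronunciation listed for a word.
        table.setdefault(word.lower(), phones)
    return table


def decompose(text, table):
    """Return (word, phonemes) pairs; unknown words map to an empty list."""
    return [(w, table.get(w.lower(), [])) for w in text.split()]
```

Words missing from the dictionary come back with an empty phoneme list, so the caller can decide whether to skip them or flag them for manual editing.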

I don’t see any pre-built functions for manipulating the text within Labels and then splitting/positioning those, so I assume that I would need to write some kind of script or plug-in (unless I overlooked something already available). There seem to be multiple ways to do this: scripting, Nyquist, plug-in, etc. I am not sure which approach would be the easiest or most appropriate, and it looks like there would be a learning curve for any of those, so I was wondering if someone could point me in the right direction (I don’t mind writing code).

(Sorry if I posted in the wrong place - I didn’t think I should post in the Features forum yet until I knew if there was an easy DIY way to do this).

don

steve · November 11, 2012, 4:49am

You’re correct, Audacity does not have any functions for manipulating the label text.
It is possible to import and export labels.

Plug-ins are not able to directly read label tracks. Nyquist probably comes closest as it is at least able to write labels, but not read them.

The easiest approach would probably be an indirect approach: Export the labels with the words (creates a text file) then write a program in your preferred language to manipulate the text, then import the modified label track back into Audacity.
See the bottom of this page for importing/exporting labels: Audacity Manual
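The export, modify, re-import approach could be sketched roughly as follows in Python. This assumes the exported label file holds one label per line as tab-separated start time, end time, and text, and it simply spreads the phonemes of each label evenly across that label's time span. The tiny `PHONEMES` dictionary and the function names are hypothetical placeholders, not part of Audacity.

```python
# Hypothetical toy dictionary; a real one would be loaded from a file.
PHONEMES = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}


def split_label(start, end, text, dictionary):
    """Split one label into per-phoneme labels spread evenly over its span."""
    phones = []
    for word in text.lower().split():
        # Unknown words are kept whole so nothing is silently dropped.
        phones.extend(dictionary.get(word, [word]))
    if not phones:
        return [(start, end, text)]
    step = (end - start) / len(phones)
    return [(start + i * step, start + (i + 1) * step, p)
            for i, p in enumerate(phones)]


def convert_labels(src, dst, dictionary):
    """Read tab-separated labels from src and write phoneme labels to dst."""
    for line in src:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        start, end, text = float(fields[0]), float(fields[1]), fields[2]
        for s, e, p in split_label(start, end, text, dictionary):
            dst.write(f"{s:.6f}\t{e:.6f}\t{p}\n")


if __name__ == "__main__":
    # File names are placeholders for the exported and re-imported label files.
    with open("labels.txt") as src, open("phonemes.txt", "w") as dst:
        convert_labels(src, dst, PHONEMES)
```

Even spacing is only a starting point; the resulting phoneme labels would still need to be nudged into place by ear after importing them.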

If you are an experienced C++ developer then there are other approaches that you could take.

djulien · November 12, 2012, 5:46pm

Steve,

Thanks for the info!

It is possible to import and export labels.

Plug-ins are not able to directly read label tracks. Nyquist probably comes closest as it is at least able to write labels, but not read them.

I think I read that Nyquist can do file I/O, so could I export the labels to a file, feed the file to Nyquist, then write the new labels using Nyquist?

The easiest approach would probably be an indirect approach: Export the labels with the words (creates a text file) then write a program in your preferred language to manipulate the text, then import the modified label track back into Audacity.

Could I use Audacity scripting to perform the export, shell out or invoke an external program to update the label file, then re-import the new label file? I’ve read that the scripting feature is development/experimental only, but if it is stable for triggering a menu item and invoking an external program, then this sounds like a good solution. (trying to minimize the number of manual steps for the user).

See the bottom of this page for importing/exporting labels: Audacity Manual

I had looked over an older version of that info but I missed the Labels Editor. Is that built into the core Audacity code, or is it something like a plug-in where I could wedge in some additional code to alter the labels?

don

djulien · November 12, 2012, 5:55pm

If you are an experienced C++ developer then there are other approaches that you could take.

(sorry, I forgot to reply to this part) Yes, I am comfortable with C++ so if this feature would be of general interest I could take a try at adding it directly to the Audacity source code (probably Label Editor). That would have been my default approach if none of the other approaches worked; I’m just trying to reduce the number of learning curves needed in order to obtain the dictionary/decompose feature.

don

steve · November 12, 2012, 7:37pm

Yes, Nyquist can read/write files, though it is not very convenient because there is no file browser widget for Nyquist plug-ins, so the absolute file address needs to either be hard coded or (more usually) entered manually by the user.
An example of file output can be found in the “Sample Data Export” plug-in.

I’ve done a bit of testing with scripting support (using Python and Bash on Linux) and it’s worked pretty well. Apparently it is not suitable in a corporate environment as there are security issues that have not yet been addressed (I believe this is the main reason that it is not currently built into the release version). To enable scripting support it is necessary to build Audacity from the source code, then build the scripting module (must have the same build date), then there is an option in Audacity Preferences (Edit menu > Preferences) to enable the module.

Setting up the build environment on Windows is not a trivial task for non-programmers. Instructions are given here: Missing features - Audacity Support

It’s built into the core code.

That opens up a lot of possibilities.

Yes it could perhaps be built into the Label Editor, but that is already quite a “busy” (complex) interface.
Alternatively it could be developed as a separate module. The Audacity developers are keen to implement more modules as it can add features with little risk of destabilising the core code.

Apparently one of our forum users, Edgar, has done some work with hacking into Audacity to automate tasks, so he may be able to help in some areas (he mostly posts in the Programming section of the forum).

Another approach could be to look at how to make label tracks accessible (for read) to Nyquist. This is a feature that I would very much like to see.
If Nyquist could read label tracks then this job (and many others) would be trivial to accomplish in a Nyquist plug-in.

djulien · November 12, 2012, 8:49pm

Yes it could perhaps be built into the Label Editor, but that is already quite a “busy” (complex) interface.

I was thinking just one more button near the button, which would then pop up another window or expand the existing window if any additional prompts were needed (naming and location conventions could be used to avoid most of the prompts, so there might not need to be any other UI stuff, although that takes away a little user flexibility).

Alternatively it could be developed as a separate module. The Audacity developers are keen to implement more modules as it can add features with little risk of destabilising the core code.

I got the impression from the Requested Features page that there are a lot of other items that would take priority over this, that would be competing for their time (assuming you meant that they would implement this feature rather than me), but I may be incorrect.

Another approach could be to look at how to make label tracks accessible (for read) to Nyquist. This is a feature that I would very much like to see.
If Nyquist could read label tracks then this job (and many others) would be trivial to accomplish in a Nyquist plug-in.

I’m willing to implement it whichever way is mutually most beneficial (assuming it’s not a lot more effort and learning curve one way or the other). Would there need to be some kind of design review process for that, or would I just do it and then it would need some adjustments or rework by the developers if it were to be added back into the main source tree?

don