Annotating speech

Does something like this exist?

I do a lot of speech recording and editing with Audacity. After a recording session with multiple retakes, finding the places to cut the takes together from the waveform alone can be tricky. I’m thinking of some kind of plugin that would:

  • Take the selected audio and run it through a speech-to-text library
  • Send back a set of words/sentences that it recognized along with timestamps
  • Show a rough summary of what it heard, aligned to the waveform

The screenshot shows the level of accuracy I’d be happy with; the text wouldn’t even have to be particularly accurate to be recognizable :) It also looks like a sync-locked label track would work for displaying the result generated by the plugin.
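To make the label-track idea concrete, here is a rough Python sketch of the last step only: turning recognized words with timestamps into text that Audacity can import as a label track. It assumes the recognizer has already produced (word, start, end) tuples in seconds. The `Word` type, the sample data, and the 0.5 second pause threshold are all made up for illustration, and no real speech-to-text library is called. The output uses Audacity's tab-separated label file layout (start, end, label text, one label per line).

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    """One recognized word (hypothetical recognizer output, times in seconds)."""
    text: str
    start: float
    end: float


def group_into_phrases(words: List[Word], max_gap: float = 0.5) -> List[Word]:
    """Merge consecutive words into one phrase when the pause between them
    is at most max_gap seconds (the threshold is an arbitrary assumption)."""
    phrases: List[Word] = []
    for w in words:
        if phrases and w.start - phrases[-1].end <= max_gap:
            last = phrases[-1]
            # Extend the current phrase to cover this word as well.
            phrases[-1] = Word(last.text + " " + w.text, last.start, w.end)
        else:
            # Pause was long enough: start a new phrase (likely a new take).
            phrases.append(w)
    return phrases


def to_audacity_labels(phrases: List[Word]) -> str:
    """Format phrases as label-file text: start<TAB>end<TAB>text per line."""
    return "".join(f"{p.start:.6f}\t{p.end:.6f}\t{p.text}\n" for p in phrases)


if __name__ == "__main__":
    # Made-up sample data standing in for real recognizer output.
    sample = [
        Word("hello", 0.0, 0.4),
        Word("world", 0.5, 0.9),
        Word("again", 2.0, 2.5),
    ]
    print(to_audacity_labels(group_into_phrases(sample)), end="")
```

Saving that text to a file and loading it with File > Import > Labels should give one label per phrase, which could then sit in a sync-locked label track next to the audio.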

Has anyone done something like this already, or would anyone find it useful if I were to look into Audacity plugin development?

Audacity does not do speech-to-text.
Furthermore, speech-to-text is technically very difficult and is not likely to be incorporated into Audacity in the foreseeable future.