Does something like this exist?
I do a lot of speech recording and editing with Audacity. After a recording session with multiple retakes, finding the right places to cut the takes together using the waveform alone can be tricky. I’m thinking of some kind of plugin that would:
- Take the selected audio and run it through a speech-to-text library
- Send back a set of words/sentences that it recognized along with timestamps
- Show a rough summary of what it heard, aligned to the waveform
The screenshot shows the level of accuracy I’d be happy with: it wouldn’t even have to be particularly accurate to be recognizable. It also looks like a sync-locked label track would work for displaying the plugin’s output.
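To make the label-track idea a bit more concrete, here’s a rough sketch of the last step only. It assumes the speech-to-text library hands back (start, end, text) tuples with times in seconds; that input shape, the function name, and the sample data are all made up for illustration. The output is the tab-separated text layout that Audacity reads through File > Import > Labels.

```python
def to_audacity_labels(segments):
    """Format (start_sec, end_sec, text) tuples as Audacity label-track text.

    Each output line is: start<TAB>end<TAB>label
    """
    lines = []
    for start, end, text in segments:
        # Collapse tabs/newlines in the text so they can't break the
        # tab-separated layout.
        clean = " ".join(text.split())
        lines.append(f"{start:.6f}\t{end:.6f}\t{clean}\n")
    return "".join(lines)


if __name__ == "__main__":
    # Hypothetical recognizer output, not from a real library.
    segments = [
        (0.0, 1.25, "does something like"),
        (1.4, 2.9, "this exist"),
    ]
    print(to_audacity_labels(segments), end="")
```

Saving that text to a file and importing it would give one label per recognized chunk, which could then be sync-locked to the audio track.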
Has anyone done something like this already, or would anyone find it useful if I were to look into Audacity plugin development?