Intelligent splitting of audio files?

I’m trying to run an interesting experiment. I am testing the use of the Google speech-to-text API to auto-transcribe audio files. The problem is, it only allows clips up to about 15 seconds long, so I have split my audio file into sections. However, when I blindly split an audio file, some words are inevitably split in half. I know it wouldn’t be perfect, but I need some way to automatically split an audio file by voice detection to avoid this issue.

I don’t care if heavy, destructive processing is needed to do this. I only need to know the timecodes, because I could automatically split the original audio by that list of timecodes. For instance, is there some way to (heavily) process the audio until all that remains is mostly voices, the rest being total silence? If so, then I can split by silence detection at whatever time interval I choose – for example, five seconds.
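A rough sketch of the silence-detection idea, in Python: measure the level of short frames, treat runs of quiet frames as gaps, and cut at gap midpoints so that no chunk runs past the 15-second limit. All function names, the 20 ms frame size, the 0.02 level threshold, and the 0.2 s minimum gap are assumptions for illustration, and samples are assumed to be floats in the range -1.0 to 1.0. Noisy recordings would probably need the threshold tuned, or noise reduction applied first.

```python
import math

def frame_rms(samples, frame_len):
    """Return the RMS level of each consecutive, non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(s * s for s in frame) / frame_len))
    return levels

def silent_gaps(samples, rate, frame_sec=0.02, threshold=0.02, min_gap_sec=0.2):
    """Return (start_sec, end_sec) pairs where the level stays below threshold."""
    frame_len = max(1, int(rate * frame_sec))
    levels = frame_rms(samples, frame_len)
    gaps, run_start = [], None
    # The trailing infinity is a sentinel that closes a quiet run at the end.
    for i, level in enumerate(levels + [float("inf")]):
        if level < threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            start = run_start * frame_len / rate
            end = i * frame_len / rate
            if end - start >= min_gap_sec:
                gaps.append((start, end))
            run_start = None
    return gaps

def split_points(gaps, total_sec, max_len=15.0):
    """Pick the latest gap midpoint within max_len of the previous cut."""
    mids = [(a + b) / 2 for a, b in gaps]
    points, last = [], 0.0
    while total_sec - last > max_len:
        candidates = [m for m in mids if last < m <= last + max_len]
        # With no usable gap in range, fall back to a blind cut at max_len.
        cut = candidates[-1] if candidates else last + max_len
        points.append(cut)
        last = cut
    return points
```

The resulting list of timecodes could then be used to cut the original, unprocessed audio. When no gap is found within 15 seconds, the sketch falls back to a blind cut, so a split word is still possible there.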

Again, I know such a method wouldn’t be flawless. I am simply looking for a way to split an audio file with a minimum of split words. (It seems to me that any method with some intelligence to it would be better than blindly splitting the audio.) Another important note is that the content of these recordings can be somewhat random – they can be clear or noisy, and they can be loud or quiet.

I hope I’ve communicated the general idea. Any suggestions on how to approach this? Thanks!

(P.S.: I’m using Audacity 2.0.3 on Windows Vista 64-bit.)

I suspect somebody would have to write that. Further, you appear to need a collection of tools and filters that don’t work too well when used by themselves, and would be a time-consuming disaster when combined.


Yes, that is what I am starting to realize. I’m looking into simpler ways to run this experiment. Thanks for the response.

Assuming that you don’t want to automatically process huge amounts of speech:
Use the “Regular Interval Labels” feature (Analyze menu) to generate labels every 12 or 13 seconds.
Then go through and manually adjust each label so that it falls in a gap between words. To adjust the label position, click carefully on the circle on the stem of the label and drag it left/right.
You can then use “Export Multiple” to create separate files for each labelled section.
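
If the label positions are worked out outside Audacity, the same workflow might be driven from a label file instead. The sketch below, in Python, writes point labels at a regular interval in the tab-separated "start, end, name" layout that Audacity label files are assumed to use here, on the further assumption that the file can be loaded with File > Import > Labels. The function names and the "part" prefix are made up for illustration.

```python
def regular_label_times(total_sec, interval_sec=12.0):
    """Return label positions every interval_sec, starting at 0 seconds."""
    times, t = [], 0.0
    while t < total_sec:
        times.append(t)
        t += interval_sec
    return times

def audacity_label_text(times, prefix="part"):
    """Format point labels (start == end) as tab-separated lines."""
    lines = []
    for n, t in enumerate(times, start=1):
        lines.append(f"{t:.6f}\t{t:.6f}\t{prefix}{n:02d}")
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    # Example: a 40-second recording labelled every 12 seconds.
    with open("labels.txt", "w") as handle:
        handle.write(audacity_label_text(regular_label_times(40.0)))
```

The labels would still need to be nudged by hand into the gaps between words before using “Export Multiple”.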