help (and wisdom) wanted: How to record and name 34,000 very small files as efficiently as possible?

aud-dude · September 23, 2018, 10:15am

Hi, I’m not a power Audacity user, but I need help from those of you who are.

We’ve literally created a dictionary for a minority language, and now want to create a corresponding audio resource for each of the 17,000 words and their 17,000 corresponding example sentences.

Our text database has 17,000 lines in it that include:
(a) a number, starting at 1 and going to 17,000,
(b) a word, and
(c) an example sentence that corresponds to the word.

Though I am eager for wisdom about the best audio file type and settings to use, I am posting in hopes that you will help me figure out a super fast way to record these words and example sentences while keeping the file names organized as we go.

How would you do this without having to type in the file names one at a time, hit save, etc.?

Can a macro (?) be set up so that the reader can hit a button to start, read a word, hesitate (while it sequentially names the file), read the sentence, hesitate (while it sequentially names the file), etc. etc.?

Seems like there needs to be some flexibility for the times when a recording is goofed up, so that words can be redone.

THANK YOU.

kozikowski · September 23, 2018, 2:17pm

Google didn’t work for you, either?

I don’t think you’re going to be able to automate it in the way you want. For one thing, it presupposes that nobody ever has to breathe or cough. Presentation errors will just kill you. Did somebody write that pronouncing the word and using it in a sentence have to be separate? I know that’s how it’s done, but it’s done by corporations, not someone producing a reference work for a little known language.

That reduces your workload from 34000 to “only” 17000.

“Procol [pause] Procol [pause] Long walks with bad shoes hurt my procol.”

That’s one entry, not two.

You can insert Labels on the fly into an Audacity recording.

https://manual.audacityteam.org/man/label_tracks.html

When you get to the end, File > Export Multiple and Audacity will produce sequentially numbered sound files based on the label positions.
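To make the naming side concrete, here is a minimal sketch in Python of renaming the sequentially exported files after the fact. It assumes the database is a tab-separated text file (number, word, sentence per line) and that the reader recorded every word followed by its sentence, in database order. The function names and the `00001_word.wav` / `00001_sentence.wav` naming scheme are hypothetical, chosen only for illustration.

```python
def planned_filenames(db_text, ext="wav"):
    """Build the target filenames, in recording order, from the database text.

    Assumes one entry per line: number <TAB> word <TAB> example sentence.
    The naming scheme is a made-up example, not an Audacity convention.
    """
    names = []
    for line in db_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        number, _word, _sentence = line.split("\t", 2)
        n = int(number)
        names.append(f"{n:05d}_word.{ext}")
        names.append(f"{n:05d}_sentence.{ext}")
    return names


def rename_plan(exported, db_text, ext="wav"):
    """Pair each exported file (already sorted in recording order)
    with its planned name. Refuses to guess if the counts differ."""
    planned = planned_filenames(db_text, ext)
    if len(exported) != len(planned):
        raise ValueError(
            f"{len(exported)} exported files but {len(planned)} expected"
        )
    return list(zip(exported, planned))
```

Each `(old, new)` pair could then be passed to `os.rename`. The count check matters: a single missed or extra label would otherwise shift every later name by one, so it is safer to work in small batches and fix the labels whenever the counts disagree.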

Some audiobook readers make a distinctive sound (blowing a whistle, for example) when they make a mistake, so that the mistake is easier to find later by inspecting the blue waves and can then be corrected manually.

I don’t see a normal human pushing a button, announcing an entry, pushing a button, announcing a second entry, and so on. It’s not unusual for audiobook readers, who are naturals for automated production, to manually correct words and sentences extensively throughout the work.

All that is assuming you get your microphone working correctly and clearly in a quiet, echo-free room.

Koz

aud-dude · September 23, 2018, 3:08pm

Agreed. In addition to being able to redo mistakes, there needs to be a way to hit pause. I think those two things will go a long way.

Thanks for brainstorming, but we need unique files for the words and their example sentences.

We could experiment with the idea of pushing a button per entry but need the naming to be automated and easily edited. Still eager to hear ideas on this.

steve · September 24, 2018, 8:44am

How about using a “text to speech” application rather than recording a real voice?
You may be able to automate a text to speech application using an automation app such as “Hammerspoon”, “Karabiner”, “Keyboard Maestro” or AppleScript.
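As a rough sketch of the text to speech route, the Python below builds one command per word and per sentence for the macOS `say` tool, which can write speech to a file with `-o` and select a voice with `-v`. It assumes a suitable voice exists for the language, which may well not be the case for a minority language. The function name and the output naming scheme are hypothetical.

```python
def say_commands(rows, voice=None):
    """Build macOS `say` command lines for each (number, word, sentence) row.

    Only builds the argument lists; nothing is executed here.
    Output names like 00001_word.aiff are an illustrative assumption.
    """
    commands = []
    for number, word, sentence in rows:
        for kind, text in (("word", word), ("sentence", sentence)):
            cmd = ["say"]
            if voice:
                cmd += ["-v", voice]  # voice name must be installed on the Mac
            cmd += ["-o", f"{number:05d}_{kind}.aiff", text]
            commands.append(cmd)
    return commands
```

On a Mac, each command could be run with `subprocess.run(cmd, check=True)`. Because the file names come straight from the database numbers, a bad entry can be regenerated on its own without touching the rest.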

aud-dude · September 25, 2018, 12:05pm

Interesting idea. We’ll explore. Thank you!