Automating the insertion of separate voice recordings to create a conversation output. Has anyone done it?

Has anyone created a plugin that automates inserting the recordings of Character A into Character B’s recording to create a conversation output (maybe using markers)? This is needed when the two character voices have been recorded at separate times and you need to edit them together, which is common with voice recordings.

Highlighting and cutting and pasting from one track to another is the standard method and is okay, but if you have long files and do this often, it is a repetitious process. Has anyone created an automatic solution for this already?

(If not) I would think that it could be done by using markers and labelling them sequentially: label the first marker, where the first character starts talking, with a ‘1’, then label the reply from the second character with a ‘2’, the start of the first character’s next reply with a ‘3’, etc. (So the length of a character’s speech goes until the next marker on their track.)
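
A minimal sketch of how the numbering scheme above could be read outside Audacity. It assumes each label track has been exported to Audacity's tab-separated label text format (start time, end time, label text on each line). The helper names `parse_labels` and `find_label` are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

Label = Tuple[float, float, str]  # (start seconds, end seconds, label text)


def parse_labels(text: str) -> List[Label]:
    """Parse the contents of an exported label file into (start, end, text)."""
    labels = []
    for line in text.splitlines():
        parts = line.split("\t")
        # Skip blank or short lines, and the extra lines that start with a
        # backslash (these hold frequency ranges, not times).
        if line.startswith("\\") or len(parts) < 3:
            continue
        labels.append((float(parts[0]), float(parts[1]), parts[2].strip()))
    return labels


def find_label(tracks: Dict[str, List[Label]], number: int) -> Optional[Tuple[str, float]]:
    """Return (track name, start time) of the label whose text is `number`."""
    for name, labels in tracks.items():
        for start, _end, title in labels:
            if title == str(number):
                return name, start
    return None


# Usage with two made-up voice tracks
voice_a = parse_labels("0.000000\t0.000000\t1\n12.500000\t12.500000\t3\n")
voice_b = parse_labels("4.250000\t4.250000\t2\n")
print(find_label({"A": voice_a, "B": voice_b}, 2))
```

Searching every track for each number in turn is the same idea as looking on track 2 for a label and moving on to track 3 if it is not there.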

Happy to pay for such a plugin as it will save time.
Can anyone help with a solution?

There are no plug-ins for this that I know of.
Audacity’s usual scripting tool “Nyquist” would not really be suitable because it can only access tracks sequentially one at a time (track 1, then track 2, then track 3… and no going back).

The next version of Audacity has a substantial amount of new “scripting” capability, which “may” be sufficient for this task, but as this is a brand new feature there are currently very few ready made scripts, so it’s a matter of reading the documentation and experimenting. The new version is due to be released within the next couple of weeks. There is some documentation for this new feature here:

Thank you very much, Steve.
But I think it should be possible to solve that ‘accessing tracks one at a time’ problem by just searching through a track until you hit the next sequential label number (e.g. search track 2 until you find label 44; if it is not there, search track 3, if it exists, etc.). What do you think?
I saw you answered a query on Audacity’s batch file capability, and told the enquirer to use Python (see
I have programmed in a few languages, but not Python yet, though I gather it is not that hard (I hope that is true, as I would only be creating a simple loop that increments).
Which way do you recommend? (Sorry, I am brand new to Audacity programming.) Is there a batch facility? Do I use Python as you recommended to that guy, or do I post this question on the ‘Programming and Development’ forum instead (now that I know there is no plugin and I have to make one myself)?

Thanks for your guidance
All the best

Nyquist cannot (directly) read label tracks.

In the next version of Audacity (2.3.0), Nyquist has, with some constraints, access to the same Audacity scripting commands that are available to Python. This includes being able to access label track data. The major advantage of Nyquist over Python is that Nyquist is built into Audacity as standard, whereas Python is a totally separate thing, and Python requires Audacity to be built with mod-script-pipe enabled (which currently it isn’t).

Unfortunately, because this is a totally new feature, there is not yet much documentation / support material for using Audacity’s scripting with Nyquist.

If you are interested in taking a look, there is now a “release candidate” for Audacity 2.3.0 (see: The manual for the development version of Audacity is here:

This is probably the most relevant documentation:

There’s lots of documentation for Nyquist. See:

Good on you Steve.

I appreciate the guidance.
At this point, I fear that this is beyond my time-constrained abilities. I could get lost there for some time just effectively ‘setting up’.
So how would I find someone interested in being paid to do that for me?
Do I post something on your Board index > Programming and Development > Nyquist forum?


That’s really outside of the scope of this forum. We don’t offer commercial services. This is a “help forum” where we help people to get the best out of the Audacity software.

Perhaps if you told us more about what you are trying to do, we may be able to make suggestions of other ways to streamline the process.

What is your starting point? Do you have two long continuous recordings, one of “voice A” and the other of “voice B”, or lots of short recordings, or something else?

What’s the end product? A radio show? Should the end result sound like natural conversation?

Is it a scripted dialogue (like a play)?

Thanks, Steve.

Yes, this is normal for our industry. And yes, you can think of it like a radio show. Each character is played by a different voice.
The various actors do their voiceovers at different times and often different venues. So each voice has to be inserted into the script in sections.
So the usual practice is to cut and paste a section of one track then next find its corresponding reply off another track and paste it in. And keep repeating that process manually.
It is repetitious and does your head in. (Especially if one recording does not suit the other adequately and you have to adjust a voice and repeat the whole process from scratch again.)
It would be so much simpler to put markers at points in the recordings and then press a button so that it cuts and pastes the pieces from the various tracks into the first track at the correct positions. Then all is instantly done. (Is that described clearly enough to make sense?)

Hope you can help make it simple for me. I can program but I have not used Nyquist (or Python), and there is always a learning curve. So when you said I was going to be frontier-ing new paths, that had me worried.

But regardless, thanks for your help. And thanks for the prompt replies. Your service is great.

David Miers

Which industry is that? Language courses?

The reason I’m asking is so that I can get a feel for what you are trying to achieve. With natural dialog, the timing between one voice and another is fluid (there may be gaps, or voices may overlap), but for a listener the precision of the timing is vitally important as it can change the whole mood of the exchange.

There’s an old joke:
Comic: What’s the secret of comedy?
Audience: I don’t know, what is the sec…
Comic: Timing.

What are the timing constraints for this job, and how do you intend to create the labels?

Hi Steve,

Thank you so much for taking an interest.
I like that joke.

The industry is audiobooks.

So as the voices are recorded separately, yes, there is always a timing aspect to set manually. But at least some programming could get all the bits into the right spots automatically.
And hopefully, you can minimize the effort with the timing by leaving the correct amount of space at the end of a section of speech (This would be the pause before the next marker - in my proposed way to use the automated system that I am hoping to create).

The editor might still listen through the dialogue to make sure the correct spacing is there and edit some so the timing is right (particularly, as you said, if there was talking over the top of each other). But at least the bits of speech would all be in the right places already. And the editor would have to listen through to make sure the markers were put in the right spots and check everything anyway. This would at least minimize the editor’s time.

Hope this helps for guiding how I should try and attack this.

All the best Steve

Hi Steve,
I am thinking the logic is something like the following:

Instructions :

The aim is to build up a new track by adding the appropriate new voice in the script.
So first the 3 steps to prepare the tracks in Audacity:

Step 1: Chronologically label markers on each bit of conversation according to the order they appear in the story. Do this for each voice / track.
Step 2: Create a new track (that all the voice segments will be pasted into)
Step 3: Insert a marker titled “Insert-here” at the end of that new track 1.

(Now, make sure that each voice/track has been edited and production done on the voices before activating the plugin.)


The plugin programming would follow something like this:

For n = StartingNumber To FinishingNumber (say 1 to 1000; jump out of the loop if number n is not found on any of the tracks)
    For track = 2 To LastTrackNumber (no. of voices/tracks open + 1)
        If marker.title = n Then (this finds the track that has the marker with that number)
            Select & copy the area up to the next marker on that track.
            Then in the first track, find the marker that is labelled “Insert-here”.
            Paste the copied section just before the “Insert-here” marker
            (and leave the “Insert-here” marker at the end of the track for the next paste).
        End If
    Next track
Next n
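
The loop above can be sketched in Python on plain label data, without touching Audacity itself. This only works out the order and extent of the pieces to copy; it does not do the copying. The data layout (a list of `(start_seconds, label_number)` per track, plus each track's length) and the function name `build_paste_order` are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Segment = Tuple[str, float, float]  # (track name, start seconds, end seconds)


def build_paste_order(
    labels: Dict[str, List[Tuple[float, int]]],
    track_lengths: Dict[str, float],
) -> List[Segment]:
    """Return the segments to paste, in label-number order (1, 2, 3, ...)."""
    segments: Dict[int, Segment] = {}
    for track, marks in labels.items():
        ordered = sorted(marks)  # sort the markers of this track by start time
        for i, (start, number) in enumerate(ordered):
            # A piece runs until the next marker on the same track,
            # or to the end of the track for the last marker.
            if i + 1 < len(ordered):
                end = ordered[i + 1][0]
            else:
                end = track_lengths[track]
            segments[number] = (track, start, end)

    order: List[Segment] = []
    n = 1
    # Jump out of the loop as soon as a number is not found on any track.
    while n in segments:
        order.append(segments[n])
        n += 1
    return order


# Usage with two made-up voice tracks
labels = {"A": [(0.0, 1), (5.0, 3)], "B": [(0.0, 2), (4.0, 4)]}
lengths = {"A": 9.0, "B": 7.5}
for track, start, end in build_paste_order(labels, lengths):
    print(track, start, end)
```

Building the number-to-segment table once avoids rescanning every track for every number, but the result is the same as the nested loop.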


Any help or guidance would be appreciated

OK, here’s how I can see this working:

First you need to label each track, with each piece numbered sequentially, so that the first bit is numbered 1, the next bit numbered 2 and so on (regardless of who said which bit). So you now have a project:
Audio Track 1
Label Track 1
Audio Track 2
Label Track 2

Then use “Export Multiple” based on labels (
Note that Export Multiple mixes all unmuted tracks, and uses the topmost label track, so to export the segments from Audio Track 1, you will need to mute Audio track 2. Then to export the segments from Audio track 2, you will need to delete Audio Track 1 and Label Track 1.

You now have a folder with lots of little audio files named sequentially:

That’s the preparation done. So now you need a script to import all of the clips in numeric order, end to end.
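
One possible shape for such a script, as a rough sketch that runs outside Audacity using only Python's standard `wave` module. It assumes the exported clips are uncompressed PCM WAV files that all share the same sample rate, sample width and channel count, and that each file name contains its sequence number (for example "1.wav", "2.wav", "10.wav"). The function names `numeric_key` and `join_clips` are made up for this example.

```python
import re
import wave
from pathlib import Path


def numeric_key(path):
    """Sort key: the first run of digits in the file name, as an integer."""
    match = re.search(r"\d+", path.stem)
    # Files with no number in their name sort last (a choice made for this sketch).
    return int(match.group()) if match else float("inf")


def join_clips(folder, output):
    """Write every .wav clip in `folder` end to end into `output`."""
    target = Path(output).resolve()
    clips = sorted(
        (p for p in Path(folder).glob("*.wav") if p.resolve() != target),
        key=numeric_key,
    )
    if not clips:
        raise ValueError("no .wav clips found in " + str(folder))

    fmt = None
    with wave.open(str(target), "wb") as out:
        for clip in clips:
            with wave.open(str(clip), "rb") as src:
                clip_fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
                if fmt is None:
                    # The first clip decides the format of the joined file.
                    fmt = clip_fmt
                    out.setnchannels(fmt[0])
                    out.setsampwidth(fmt[1])
                    out.setframerate(fmt[2])
                elif clip_fmt != fmt:
                    raise ValueError(clip.name + " has a different audio format")
                out.writeframes(src.readframes(src.getnframes()))
    return clips
```

Sorting on the number rather than on the raw file name matters because a plain alphabetical sort would put "10.wav" before "2.wav". Calling `join_clips("clips", "conversation.wav")` (both names are placeholders) would write the joined file and return the clips in the order they were used.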

Does that sound like it will produce the desired result, or do you need something different?

That was great, Steve. That is definitely the way to attack it. Job done!!
Thank you so much. You are a legend!!! :smiley:

The bit that’s missing:
Align End to End

Sometimes the Hollywood people have someone from production reading the mirror part of the dialog so the actor has something to work from to suggest rhythm and timing, as opposed to reading the script cold which most actors hate.

The industry is audiobooks.

I know ACX suggests a split between audiobooks and radio theater. How did you get, essentially, multi-voice radio programs past them? Or is this a special application?

Would we know any of these works?