I have several files with a speaker in a foreign language, whose speech is followed by a translation. I want to create files that contain only (or mostly) the translation without the foreign language speech.
I marked the beginning of each foreign-language section with “d” and the beginning of each translation with “t” (using Ctrl-M).
I would like to remove the sections between “d” and “t” and keep the sections between “t” and “d”.
Are there any ideas? I’m not familiar with Audacity but have some knowledge of Python.
The way that I would have approached this would have been with region labels rather than point labels (https://manual.audacityteam.org/man/label_tracks.html)
If all of the parts to be deleted were marked with region labels (and the parts to be retained were NOT marked), then the procedure to remove the marked regions is straightforward. See “Edit menu > Labeled Audio > Delete” https://manual.audacityteam.org/man/edit_menu_labeled_audio.html
If you have already done a lot of marking with the scheme outlined in your post, then perhaps you could write a Python script (or do some spreadsheet magic) to create a new label track with region labels, using the data from your current label track. To do that, you would Export the label track (https://manual.audacityteam.org/man/file_menu_export.html), convert it with your Python script, then Import it back into the project (https://manual.audacityteam.org/man/file_menu_import.html)
The format for label track files is described here: https://manual.audacityteam.org/man/importing_and_exporting_labels.html
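As a rough illustration of that format: each line of an exported label track holds a start time, an end time and the label text, separated by tabs, with times in seconds. A point label has identical start and end times. The sketch below parses such text; the names `parse_labels` and `is_point_label` and the sample data are made up for illustration, and skipping lines that begin with a backslash (the extra frequency lines written for spectral selections) is an assumption about your export.

```python
def parse_labels(text):
    """Parse exported label-track text into (start, end, label) tuples."""
    labels = []
    for line in text.splitlines():
        # Skip blank lines and (assumed) spectral-selection lines.
        if not line.strip() or line.startswith("\\"):
            continue
        fields = line.split("\t")
        start, end = float(fields[0]), float(fields[1])
        label = fields[2] if len(fields) > 2 else ""
        labels.append((start, end, label))
    return labels


def is_point_label(label):
    """A point label starts and ends at the same time."""
    start, end, _ = label
    return start == end


# Hypothetical sample: two point labels and one region label.
sample = (
    "1.500000\t1.500000\td\n"
    "12.250000\t12.250000\tt\n"
    "30.000000\t41.750000\ttoDelete\n"
)
labels = parse_labels(sample)
for lab in labels:
    kind = "point" if is_point_label(lab) else "region"
    print(kind, lab)
```

A conversion script only has to read the point labels this way and write out new lines whose start and end times differ.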
Just so we’re clear: when you delete a segment, do you want that time to vanish so that the two kept sections butt up against each other, or do you want the deleted stretch to become silence so that all the timings are kept?
People who manage translations and classes frequently want silent stretches for the students to try the translation. That’s much more popular than what it sounds like you want to do.
Koz
Thanks to your hints, I was able to solve the problem with the following Python script:
"""
This script creates a file with region labels from point labels in Audacity.
It is designed for the case where the regions will then be deleted in Audacity.
Point labels "b" mark the beginning of the regions to be created.
Point labels "e" mark the end of the regions to be created.
Usage:
Insert "b" and "e" labels into Audacity.
Export the point labels from Audacity.
Adjust path and input_file_name near the top of the script.
Run the script.
Import regions.txt into Audacity.
"""
import sys


class Label:
    def __init__(self, begin, end, text):
        self.begin = float(begin)
        self.end = float(end)
        self.text = text


path = "PathToYourDocument/"
input_file_name = "YourInputFile.txt"
infile = open(path + input_file_name)
point_labels = []  # create empty list for all labels

# read point labels from infile into list point_labels
line = infile.readline()
while line:
    tmp = line.strip("\n").split("\t")
    point = Label(tmp[0], tmp[1], tmp[2])
    point_labels.append(point)
    line = infile.readline()
infile.close()

# test, if there are only "b" and "e" as texts by putting all texts into set s
valid = {"b", "e"}
s = set()
for p in point_labels:
    s.add(p.text)
if not s == valid:
    print("there are invalid texts:" + str(s))
    sys.exit()

# write regions to delete to text file
out_file_name = "regions.txt"
outfile = open(path + out_file_name, "w")
# tolerance was inserted to make sure that not too much will be deleted
tolerance = 1.5
for i in range(len(point_labels) - 1):
    this_point = point_labels[i]
    next_point = point_labels[i + 1]
    if this_point.text == "b":  # a region to delete begins with text "b"
        if next_point.text == "e":  # write region only, when text "b" is followed by text "e"
            outfile.write(str(round(this_point.begin + tolerance, 1)) + "\t")
            outfile.write(str(round(next_point.end - tolerance, 1)) + "\t")
            outfile.write("toDelete" + "\n")
        else:
            print("wrong label in line " + str(i))  # if "b" is not followed by "e"
            sys.exit()
outfile.close()
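One thing to watch with the tolerance: if a "b" and its following "e" are less than 2 × tolerance apart, the region that gets written ends before it starts. A possible guard is sketched below as a small function. The name `make_regions` and the sample points are only illustrative and not part of the script above, and skipping short pairs rather than reporting them is just one choice.

```python
def make_regions(points, tolerance=1.5):
    """Turn ("b", "e") point-label pairs into shrunken regions.

    points: list of (time_in_seconds, text) tuples in track order.
    Returns (start, end) tuples; pairs too short for the tolerance are skipped.
    """
    regions = []
    for (t0, text0), (t1, text1) in zip(points, points[1:]):
        if text0 != "b":
            continue  # only a "b" label opens a region
        if text1 != "e":
            raise ValueError(f'"b" at {t0} is not followed by "e"')
        start, end = t0 + tolerance, t1 - tolerance
        if start < end:  # skip pairs shorter than 2 * tolerance
            regions.append((round(start, 1), round(end, 1)))
    return regions


# Hypothetical labels: a 10 s region and a 2 s region (too short to keep).
points = [(0.0, "b"), (10.0, "e"), (20.0, "b"), (22.0, "e")]
regions = make_regions(points)
for start, end in regions:
    print(f"{start}\t{end}\ttoDelete")
```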
Super. Thanks for the update.