Improving Sound Finder

steve
Site Admin
Posts: 81229
Joined: Sat Dec 01, 2007 11:43 am
Operating System: Linux *buntu

Re: Improving Sound Finder

Post by steve » Tue Dec 08, 2020 1:38 pm

Do the red arrows indicate where you want the splits to occur?

I don't know how much leading / trailing silence you want, so I've set 0.02 seconds as an arbitrary amount.
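As a rough illustration of what that padding does to the labels, here is a minimal Python sketch. It is not the plug-in's code: the function name `pad_regions`, its parameters, and the (start, end) tuple format in seconds are all assumptions made for the example.

```python
def pad_regions(regions, pad=0.02, track_end=None):
    """Widen each (start, end) region by `pad` seconds on both sides.

    Starts are clamped at 0.0; ends are clamped at `track_end` when it is given.
    (Hypothetical helper, for illustration only.)
    """
    padded = []
    for start, end in regions:
        new_start = max(0.0, start - pad)
        new_end = end + pad
        if track_end is not None:
            new_end = min(track_end, new_end)
        padded.append((new_start, new_end))
    return padded


# Two detected sounds in a 3.01 second track.
labels = pad_regions([(0.01, 1.0), (2.0, 3.0)], pad=0.02, track_end=3.01)
```

Note that if two sounds are separated by less than twice the padding, the padded regions will overlap; this sketch does not try to resolve that.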

[Attachment: finder.png]
[Attachment: Tracks000.png]
9/10 questions are answered in the FREQUENTLY ASKED QUESTIONS (FAQ)

tlm
Posts: 91
Joined: Thu Nov 22, 2012 4:32 am
Operating System: macOS 10.15 Catalina or later

Re: Improving Sound Finder

Post by tlm » Wed Dec 09, 2020 12:24 pm

The red arrows represent the longer silences, and my point was that these need to be prioritized over the short silences.

My application is language learning. Listening to a native speaker and repeating what they just said is a good way to learn. However, the learner can only handle so much. So it is good to break the audio into phrases, the shorter the better. However, if the phrases are too short they will lack enough context to make sense.

So the idea is that longer silences correspond to the most significant phrases (such as full sentences) so should have priority over short silences, which usually correspond to clauses, or even words. In practice, it's not that clean, of course.

So now, hopefully, you see why I wanted code that labels the audio by (1) prioritizing the longer silences, while at the same time (2) respecting a maximum sound duration. The noise floor is also important, of course. The two histograms below show the sound lengths I generated for one audio file using -40 dB and -30 dB. In each case, I used 5 silence tracks using 0.5, 0.4, 0.3, 0.2 and 0.1 s, and had the min and max sound durations set at 2 s and 3.5 s.
[Attachment: histos.jpg]
Sometimes shorter and longer sounds get through, both due to the reality of the audio and the fact that my code isn't perfect.
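To make the idea concrete, here is a minimal Python sketch of one way to prioritize longer silences while respecting a maximum sound duration. It is not the code that produced the histograms. The function name `split_by_silence_priority` and its parameters are hypothetical, and it assumes the silences have already been detected and are given as (start, end) pairs in seconds.

```python
def split_by_silence_priority(start, end, silences, min_len=2.0, max_len=3.5):
    """Split the span [start, end] into phrases, cutting at the longest usable silence first.

    silences: list of (silence_start, silence_end) tuples in seconds.
    A span that is already short enough, or that has no usable silence,
    is returned whole.
    """
    if end - start <= max_len:
        return [(start, end)]
    # A silence is usable only if both resulting pieces are at least min_len long.
    candidates = [
        (s, e) for (s, e) in silences
        if s - start >= min_len and end - e >= min_len
    ]
    if not candidates:
        return [(start, end)]  # no suitable place to split: keep the long sound
    # The longest silence has priority.
    s, e = max(candidates, key=lambda se: se[1] - se[0])
    return (split_by_silence_priority(start, s, silences, min_len, max_len)
            + split_by_silence_priority(e, end, silences, min_len, max_len))


# A 9 second passage with three silences of different lengths.
phrases = split_by_silence_priority(
    0.0, 9.0, [(2.0, 2.1), (4.0, 4.5), (6.5, 6.7)]
)
```

In this example the 0.5 s silence is used first. The first piece (0.0 to 4.0 s) then stays longer than the maximum, because cutting at the only remaining silence inside it would leave a piece shorter than the minimum, which is one way a longer sound can get through.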

steve

Re: Improving Sound Finder

Post by steve » Wed Dec 09, 2020 1:22 pm

When programming, it is essential that the program can handle every possible case that it could be exposed to. For a robust program there must be no possible input that can create an "undefined" result.

The problem with having a maximum sound length setting can be seen in this example. Say that we set the minimum sound length to 0.5 seconds and the maximum sound length to 2.5 seconds:

[Attachment: sf.png]
The program needs to define "something" for that long section in the middle.
  • It could throw an error and cancel the effect, but that could be rather annoying.
  • It could add a label after 2.5 seconds of the long sound, even though there is no silence there.
  • It could ignore the max sound length control as there is no obvious place to split.
  • It could split in the exact middle of that long sound (or, if it is longer still, into 3, 4 or however many pieces are necessary).
  • ...
Whichever way, it's unlikely that it will reliably do what the user wants it to do, and it makes the plug-in more complicated.
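The options listed above can be sketched in Python to show how each one turns the "undefined" case into a defined result. This is an illustration only, not code from any existing plug-in: the function name `split_long_sound` and the strategy names are made up for the example, and times are assumed to be in seconds.

```python
import math


def split_long_sound(start, end, max_len, strategy="ignore"):
    """Return a list of (start, end) regions for one sound with no internal silence."""
    length = end - start
    if length <= max_len:
        return [(start, end)]  # nothing to do
    if strategy == "error":
        # Option 1: refuse to continue.
        raise ValueError("sound is longer than the maximum and cannot be split")
    if strategy == "fixed":
        # Option 2: add a label every max_len seconds, silence or not.
        regions = []
        t = start
        while end - t > max_len:
            regions.append((t, t + max_len))
            t += max_len
        regions.append((t, end))
        return regions
    if strategy == "ignore":
        # Option 3: ignore the maximum and keep the sound whole.
        return [(start, end)]
    if strategy == "even":
        # Option 4: split into the fewest equal pieces that all fit.
        n = math.ceil(length / max_len)
        step = length / n
        return [(start + i * step, start + (i + 1) * step) for i in range(n)]
    raise ValueError("unknown strategy: " + strategy)


# A 6 second sound with the maximum set to 2.5 seconds.
even_pieces = split_long_sound(0.0, 6.0, 2.5, strategy="even")
fixed_pieces = split_long_sound(0.0, 6.0, 2.5, strategy="fixed")
```

Every branch returns something or fails explicitly, so no input is left undefined, but the extra control is also where the added complexity comes from.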

If you used point labels rather than region labels, then you could easily just add the occasional extra label by hand if required. Is there a reason why you need to use region labels rather than point labels?

tlm

Re: Improving Sound Finder

Post by tlm » Wed Dec 09, 2020 6:44 pm

steve wrote:
Wed Dec 09, 2020 1:22 pm
It could ignore the max sound length control as there is no obvious place to split.
I suggest the above. It is realistic: long sounds that cannot be split are easy for a user to understand and, in my experience, common. But the existence of exceptions like this doesn't lessen the value of having the option when it's what the user is after.

For the final output, I feel region labels make more sense for a "sound finder", since the things you're trying to find (sounds) have a beginning and an end. Regarding adding labels manually, it's obviously fine as long as there are not too many. But my experience is that it's often "too many", which is the whole reason we program.

One thing I find quite interesting is that the "sound finder" methods you've made assess sound amplitude (average, RMS, peak), while the workaround I've made does not, since it really just manipulates the output of a silence finder. Of course, in both cases the dB value is set.
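For anyone comparing the measures, here is a small Python sketch of how a peak or RMS level in dB could be computed for a block of samples and compared against a threshold. It is not taken from either approach in this thread: the function names are hypothetical, samples are assumed to be floats in the range -1.0 to 1.0, and the block is assumed to be non-empty.

```python
import math


def to_db(amplitude):
    """Convert a linear amplitude (1.0 = full scale) to dB."""
    return 20.0 * math.log10(amplitude) if amplitude > 0 else float("-inf")


def peak_db(samples):
    """Level of the largest absolute sample value, in dB."""
    return to_db(max(abs(s) for s in samples))


def rms_db(samples):
    """Root-mean-square level of the block, in dB."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return to_db(math.sqrt(mean_square))


def is_silent(samples, threshold_db=-30.0, measure=peak_db):
    """True if the chosen level measure falls below the threshold."""
    return measure(samples) < threshold_db


block = [1.0, 0.0, -1.0, 0.0]
levels = (peak_db(block), rms_db(block))
```

The same block can give different levels depending on the measure (here the RMS level is about 3 dB below the peak level), so the same dB setting may classify borderline audio differently.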
