Following the suggestion of leaving Silence Finder and Sound Finder more or less as they are (simple and unsophisticated) and adding “Advanced Sound Finder” as a new plug-in, here’s an updated version of Silence Finder and Sound Finder.
The original code is incorrect (a bug) in that the silence threshold is the same regardless of the number of channels, but stereo tracks are summed which will raise (roughly x2) the background noise level. The update fixes this by using the maximum of the two channels.
Superfluous text removed from the ;info (in line with other bundled effects).
Square brackets in GUI replaced with round brackets (in line with other bundled effects).
License information added to code (in line with Nyquist plug-in recommendations).
Minus sign for silence threshold level moved to the value (in line with other bundled effects).
Add label at end option removed from Sound Finder (as previously discussed).
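To illustrate the stereo threshold bug mentioned above, here is a rough Python sketch (not the plug-in's Nyquist code). It assumes the worst case where the background noise is identical in both channels; the helper name `peak_db` and the -40 dB noise level are made up for the example.

```python
import math

def peak_db(samples):
    """Peak level of a block of samples in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

# Hypothetical background noise: the same -40 dB noise in both channels.
noise_amp = 10 ** (-40 / 20)
left = [noise_amp, -noise_amp, noise_amp]
right = [noise_amp, -noise_amp, noise_amp]

# Old behaviour: channels summed before comparing with the threshold.
summed = [l + r for l, r in zip(left, right)]
# Updated behaviour: take the larger of the two channels at each sample.
maxed = [max(abs(l), abs(r)) for l, r in zip(left, right)]

summed_db = peak_db(summed)  # roughly 6 dB above the per-channel noise level
maxed_db = peak_db(maxed)    # equal to the per-channel noise level
```

With summing, a threshold that is correct for a mono track sits about 6 dB too low for this stereo track, which is the "roughly x2" effect. Taking the maximum keeps the threshold meaning the same for mono and stereo.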
I’m still not convinced by the extremely long control names. They are out of keeping with other Audacity effects and look ugly, but are perhaps easier for novice users.
Following on from the idea of retaining the simple versions I’ve expanded the functionality of the “Advanced” Sound Finder.
Here’s what it looks like with the current default settings. (I’m happy to tweak the defaults if there is a case for alternative default settings).
New silence detection methods.
Peak Level : Same as SilenceFinder and Sound Finder. Most accurate label positions but most prone to false labels due to clicks.
RMS Level : Detects sounds based on the RMS level of the sound. Less accurate label positions but improved rejection of clicks.
Filtered Peak : DC offset and high cut filter to improve click rejection. Very slightly less accurate than Peak method.
Filtered RMS : Best click rejection but least accurate label positions (roughly +/- 20 ms)
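As a rough illustration of why the RMS methods reject clicks better than the peak methods, here is a small Python sketch. It is not the plug-in's code: the window size, the threshold and the function name `window_levels` are all assumptions for the example, and the filtering used by the two "Filtered" methods is not shown.

```python
import math

def window_levels(samples, window):
    """Return (peak, rms) for each consecutive non-overlapping window."""
    levels = []
    for start in range(0, len(samples) - window + 1, window):
        block = samples[start:start + window]
        peak = max(abs(s) for s in block)
        rms = math.sqrt(sum(s * s for s in block) / window)
        levels.append((peak, rms))
    return levels

# Hypothetical data: 100 samples of silence containing one full-scale click.
silence_with_click = [0.0] * 100
silence_with_click[50] = 1.0

threshold = 0.25  # linear amplitude, an arbitrary illustrative value
(peak, rms), = window_levels(silence_with_click, 100)

peak_detects = peak > threshold  # the click alone exceeds the threshold
rms_detects = rms > threshold    # rms is sqrt(1/100) = 0.1, below the threshold
```

The trade-off is the one described in the list: averaging over a window smooths out the click, but it also blurs exactly where a sound starts and ends, so the label positions are less precise.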
This is a renaming of “Minimum Label Length”. It describes what this plug-in feature actually does rather than just the effect on region labels. I think the new name provides a better idea of how this feature might be used, and can be more easily explained in the manual.
“Number of Digits” and “Digit Position” combined into one control. Only one combination will ever be used at a time, so it saves space to combine them. The options are:
“None-text only” / “1 before label” / “2 before label” / “3 before label” / “1 after label” / “2 after label” / “3 after label”
“Number Only” is achieved by leaving the “Label Text” field empty.
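Here is one way the combined numbering control could map onto label text, as a Python sketch. The zero padding and the single space between number and text are my assumptions, not something confirmed by the plug-in; `make_label` is a hypothetical helper.

```python
def make_label(text, index, digits=0, position="before"):
    """Build the text for one label.

    digits:   0 for "None-text only", otherwise the number of digits (1 to 3).
    position: "before" or "after" the label text.
    Assumption: numbers are zero-padded and separated from the text by a space.
    """
    if digits == 0:
        return text
    number = str(index).zfill(digits)
    if not text:
        # An empty "Label Text" field gives "Number Only".
        return number
    if position == "before":
        return f"{number} {text}"
    return f"{text} {number}"
```

For example, `make_label("Track", 7, 2, "before")` corresponds to the "2 before label" choice, and `make_label("", 7, 2)` to a number-only label.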
“What To Label”.
Start of Sound : Useful for splitting long recordings without removing intentional pauses/silences.
End of Sound : Counterpart to the option above.
Sound Region (default) : Same as SoundFinder
Silent Region : Marks the spaces between sounds. Useful for cutting out silences when a single output file is required.
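The four "What To Label" choices can all be derived from the same list of detected sounds, roughly as in this Python sketch. The mode names and the function `labels_for` are hypothetical, and I am assuming that "Silent Region" only labels gaps between two sounds (not silence before the first or after the last sound).

```python
def labels_for(sounds, mode):
    """Turn detected sounds into labels.

    sounds: sorted list of (start, end) times in seconds.
    Returns a list of (start, end) labels; a point label has start == end.
    """
    if mode == "start":
        return [(start, start) for start, _ in sounds]
    if mode == "end":
        return [(end, end) for _, end in sounds]
    if mode == "sound":
        return list(sounds)
    if mode == "silence":
        # The gaps between consecutive sounds; no separate silence detection.
        return [(prev_end, next_start)
                for (_, prev_end), (next_start, _) in zip(sounds, sounds[1:])]
    raise ValueError(f"unknown mode: {mode}")
```

The point of the sketch is that the detection step is the same in every mode; only the way the result is turned into labels differs.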
There is fairly comprehensive error checking and hopefully meaningful error messages if wrong data is entered.
In cases where user settings will produce weird labels (such as before zero or labels with start points after end points) the plug-in attempts to make an “intelligent” decision about where to place the label.
There’s still a little tidying up to do to the code, but functionally this does everything that I want it to do.
Update to the (proposed) manual page to follow.
Marking Silences does not mean that the plug-in is “detecting silences”. The sound detection algorithm is identical to the other labelling methods. The only difference is that the gaps (silences) between detected sounds (or groups of sounds) are labelled rather than labelling the sounds themselves.
AdvancedSoundFinder.ny (12.9 KB)
I agree with what I take as your comment here that we don’t need two Sound Finders. I have never liked the similarity of Silence Finder and Sound Finder, and suspect that few novices use Sound Finder given most documentation bundled with turntables and cassette decks will only mention Silence Finder. So I would think we could bring the current “Sound Finder” to the end of the line with your bug fix and see if we can get (Advanced) Sound Finder to replace it in Audacity.
OK, good points. Though I don’t think the end point default should be set to assume fade outs.
Is this the same as the erstwhile “Minimum duration of silence between sounds”? Someone who e-mailed me was having a lot of trouble with “allowing” something “less than”, but less trouble with ignoring “less than”.
Any reason not to use “Ignore”? It’s shorter too. I am not sure I like “isolated” either; it may confuse people who are not thinking exactly on your wavelength. Does isolated mean any more than “between recognised sounds”?
Could be pushing it as I would prefer on balance not to retain a “simple” sound finder in Audacity.
I thought we abandoned something like that because it gave the impression that the labels were joined together? Region labels will be the main use rather than point labels. “Use one label for sounds less than:”? I don’t think that’s ideal, but I think it is better than “Group Sounds” even for point labels.
So as far as I can see, “label poition before start of sound” affects the position when labelling the start of the sound, but not when labelling the end of sound, and vice-versa? Not very convinced we need these, but the Manual will need to mention the effect of choosing these on the “Label Position” controls.
My first reaction was that it should actually be “Silent Region between Sounds”, which makes clearer your point that it isn’t silence detection. But again we have interactions with the “Label Position” controls which are potentially confusing. “Label position before start of sound” refers to the sound following the silence, and “label position after end of sound” to the sound preceding the silence, so these controls contract the region until it becomes a point equidistant in the silence. OK, it wouldn’t be reasonable to expand the region, but it will need to be clear in the Manual.
I had not looked at this or Regular Interval Labels for a while and completely mistook “Begin numbering from” as “begin numbering from the nth label”, but of course it means “start the numbering sequence from the first label at n”. “First label numbered as:” could mean “only number the first label”, so that is out. Is “Numbering from first label starts with:” worth the extra words? It doesn’t look too bad in the interface.
Were we going to have a new control in Silence Finder “Minimum distance between labels [seconds]” as in this experimental version? It has four votes on Feature Requests (plus a couple of others not yet added) from users who asked for this. Maybe there are forum votes not officially counted too? I think this might be more useful than adding silence regions to Silence Finder, especially if we have “silent regions between sounds” in (Advanced) Sound Finder.
Thanks for the comprehensive feedback Gale. It’ll take me a little while to work through all of the points.
Regarding wording in the GUI I think that a lot of the problems come from trying to help the user (perhaps too much sometimes). I’m looking at my amplifier and it has a control for “Increase or decrease the low frequency level”, but it does not say that, it just says “BASS”, and the “Input Selector Switch” doesn’t say anything, it just has options for “MD/Tape, Tuner, CD, AV/DVD, AUX/Phono”. The amp came with a manual.
By the way, I’ve been having trouble with the “S” key on my keyboard (hence “label poition”) but I’ve fixed that now (the letter “S” is used a lot in Nyquist and “setq”/“setf” were regularly coming out as “etq”/“etf”).
I agree, hence the proposed defaults of zero.
What do you suggest for the maximum slider ranges for “Label position before start / after end of sound”?
0.0 to 1.0 (default 0.0) for Label position before start?
0.0 to 10.0 (default 0.0) for Label position after end?
The control allows “gaps” (silences) in the sound up to a specified duration.
Perhaps a better name for this control would be:
“Allow gaps in sound up to: … seconds”
I think the meaning is more obvious than “Minimum duration of silence [seconds]”
“Minimum duration of silence” could be read to imply that this amount of silence is allowed at the end of a sound, thus if a sound ends at 10 seconds and “Label position before/after” are set to 0.0 and “Minimum duration of silence” is set to 5.0 seconds, the user may (wrongly) assume that 5.0 seconds of silence before/after the sound is “allowed”, thus Sound Finder will mark the end of the sound at 15.0 seconds (in fact it will mark at around 10.0 seconds, unless trailing silence less than 5 seconds is included in the selection - another somewhat strange behaviour).
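To make the "allow gaps" behaviour concrete, here is a small Python sketch of how I would describe it. This is an illustration only, not the plug-in's code; `merge_short_gaps` is a hypothetical name, and treating a gap exactly equal to the setting as "allowed" is an assumption.

```python
def merge_short_gaps(sounds, max_gap):
    """Join detected sounds separated by a gap of up to max_gap seconds.

    sounds: sorted list of (start, end) times in seconds.
    """
    merged = []
    for start, end in sounds:
        if merged and start - merged[-1][1] <= max_gap:
            # Gap is short enough: extend the previous sound.
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged

# A 0.5 s gap is bridged; the 6 s gap after 10.0 s is not.
result = merge_short_gaps([(0.0, 4.0), (4.5, 10.0), (16.0, 20.0)], 5.0)
```

Note that the first merged sound still ends at 10.0 seconds, not 15.0. The setting decides which gaps are bridged; it does not add silence to the end of a sound.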
I prefer the word “disregard”.
As an example, let’s say that we have:
2 second sound → 1 second gap → 1 second sound → 1 second gap → 2 second sound.
If we “disregard” the 1 second sound in the middle, then effectively we have a 3 second gap (from 2.0 to 5.0 seconds). This is what the plug-in does.
If we “ignore” the 1 second sound in the middle then it is ambiguous as to whether we have effectively a 2 second gap or a 3 second gap.
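The example above, written out as a Python sketch (illustrative only; `drop_short_sounds` is a hypothetical name and the 1.5 second setting is just a value that catches the 1 second sound):

```python
def drop_short_sounds(sounds, min_length):
    """Remove detected sounds shorter than min_length seconds."""
    return [(start, end) for start, end in sounds if end - start >= min_length]

# 2 s sound, 1 s gap, 1 s sound, 1 s gap, 2 s sound.
sounds = [(0.0, 2.0), (3.0, 4.0), (5.0, 7.0)]
remaining = drop_short_sounds(sounds, 1.5)

# The gap between the two remaining sounds now runs from 2.0 to 5.0 seconds.
gap = (remaining[0][1], remaining[1][0])
```

Because the short sound is removed entirely, the silence on either side of it joins into a single 3 second gap rather than staying as a 2 second total.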
"Does isolated mean any more than “between recognised sounds”? "
No, but it’s shorter.
“Disregard isolated sounds less than:…seconds”
“Disregard sounds between recognised sounds less than:…seconds” (longer and looks wrong)
“Sounds less than…seconds in silences between recognised sounds” (text after value much too long)
Any better suggestions?
I agree, but as “Advanced Sound Finder” was posted as a culmination of 7 pages of discussion I was not anticipating any major changes.
I can go back and add a date to each version if you think that necessary (the date is the same as the forum post date).
Following the feedback from Gale, Peter and Bill, I’ve made a few tweaks to Advanced Sound Finder:
Corrected typo “Label position”.
Disabled “Begin numbering from” option. Now always numbers from 1 if numbering is selected. (may be re-enabled by removing one semicolon from the start of line 22).
Changed error message from: “Try setting ‘Disregard isolated sounds’ to a smaller value” to: “Try setting ‘Disregard isolated sounds’ to a smaller value, or select a longer track region.”
“Label position before start of sound” range 0 to 5 seconds, default 0.1 seconds to help avoid clipping the beginning of tracks.
“Label position after end of sound” range 0 to 10 seconds, default 0.3 seconds, to help avoid clipping the end of tracks.
“Allow gaps in sound less than … seconds” changed to “Allow gaps in sound up to: … seconds”
“What to label” default changed to “Start of Sound” (highly recommended by Peter and Bill)
There is one new feature -
Option: “Allow overlapping labels” [No / Yes] default=No.
When set to “No” region labels will be constrained so that they do not overlap.
AdvancedSoundFinder-23May2012.ny (13.7 KB)
That looks much better providing it does that - within the margin of error, a gap at the stated value is “allowed”.
I don’t think users will see the distinction between disregard and ignore. “Ignore” is simple - it is not treated as a recognisable sound. “Disregard” starts me thinking it has some “special” meaning. Same problem with “isolated” - does it have some special technical meaning for which you have to RTFM?
Is “ignore sounds less than” misleading? Or (if “disregard” is better), “Disregard sounds shorter than”?
I would really be guessing what that was without trying it or RTFM. Is it ever going to produce more than one label for that chosen length?
I suppose due to space constraints we have to choose if this or “Allow overlapping labels” is more important. Stopping overlap is quite a nice idea but isn’t it preventing what the user asked for in the label positioning? Possibly the numbering feature would be more widely used than this.
But not by me - it defeats the main distinguishing purpose of the tool! I would hate having to change that whenever I launch Sound Finder in a new session. You can use Silence Finder if you actually want silence or low level noise included in the exported files.
I think that the current wording is a more precise description of what the plug-in actually does. The plug-in has a “rule” to mark sounds that are above the threshold level, but this control creates a “special case” for short sounds that have recognised silence on both sides. In this special case, the effect is told to disregard the first rule and so not mark that sound.
If you’re happy that “ignore sounds less than … seconds” conveys the idea adequately then we can go with that.
As is the case with most of the more “advanced” tools (e.g. Compressor, Contrast, Auto Duck, Noise Removal…)
We can just squeeze in both controls if we reduce the ;info to one line of text.
Stopping overlap overrules the label position settings. In practice this can be quite useful as it provides a means to mark the middle of each silence, which can work very nicely when transferring a cassette album to CD.
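A rough Python sketch of how the overlap constraint could interact with the label position settings. This is my simplified reading, not the plug-in's actual logic: the function name, clamping at the midpoint of the silence, and clamping starts at zero are all assumptions.

```python
def position_labels(sounds, before, after, allow_overlap=False):
    """Place region labels around detected sounds.

    sounds: sorted list of (start, end) times in seconds.
    before: seconds to move each label start earlier.
    after:  seconds to move each label end later.
    """
    # Assumption: a label start is never placed before time zero.
    labels = [(max(0.0, start - before), end + after) for start, end in sounds]
    if allow_overlap:
        return labels
    for i in range(len(sounds) - 1):
        if labels[i][1] > labels[i + 1][0]:
            # Overlap: pull both labels back to the middle of the silence.
            mid = (sounds[i][1] + sounds[i + 1][0]) / 2
            labels[i] = (labels[i][0], min(labels[i][1], mid))
            labels[i + 1] = (max(labels[i + 1][0], mid), labels[i + 1][1])
    return labels

sounds = [(0.0, 10.0), (14.0, 20.0)]
constrained = position_labels(sounds, before=5.0, after=10.0)
overlapping = position_labels(sounds, before=5.0, after=10.0, allow_overlap=True)
```

With deliberately large "before" and "after" values, the first label ends and the second begins at 12.0 seconds, the middle of the 4 second silence.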
As described in the Sound/Silence Finder enhancements topic, the “distinguishing purpose” that you are referring to is an illusion. Both Sound Finder and Silence Finder create labels relative to detected sounds - in the case of Silence Finder they are point labels and in the case of Sound Finder they are region labels - other than that they are near identical (and share much of the same code).
The real distinguishing feature is that Advanced Sound Finder is far more flexible with a more powerful algorithm than either of the “simple” effects.
Whatever defaults we choose it will not suit everyone. The best that we can try to do is to please most of the people most of the time. If we have any way of knowing which option most of the people want then that’s the one to go for. In the absence of such information my guess is that most users would want point labels most of the time.
I believe so. Though I prefer “ignore sounds shorter than … seconds”.
I feel “Group” implies something “plural” whereas I think (without studying use cases) that we will only produce one label for this length. If that’s true I think it easier for the user to visualise the labels that are created than ones the algorithm sees but “groups”.
Leave “overlapping” in then, in principle, though I think it is more “advanced” than the numbering feature.
No, the “main distinguishing feature” I would not want to lose the Sound Finder default for is the region distinction. This is a very visually obvious distinction. It is a distinction which at default settings of both effects results in much less “silence” being included when using “Sound Finder” (that result is aurally obvious).
As already observed, the fact that Silence Finder really places labels based on detected sound is near-irrelevant to users of Silence Finder. That is not to say it couldn’t be improved. Do you want to rewrite Silence Finder so that it technically detects silence rather than sounds? That’s fine too if it remains “simple” and ideally avoids labelling sound regions. I don’t see what it achieves for most users to put sound regions in Silence Finder. There will probably be resistance to renaming that effect after all this time. We have marking of sound regions in Sound Finder (it’s default at the moment).
Is your aim to have Advanced Sound Finder as a non-shipped effect and replace Silence Finder with a “simple” Silence and Sound Finder? I cannot see that working without over-complicating Silence Finder.
Then that is the flexible one that can be more “complex”.
As you say, information is sparse as to what users of Sound Finder want. But there is no strong demand I know of for changing it to use point labels by default, which seems to me a clumsy way of trying to mark sound regions (excluding silence). I would want strong evidence a long-established default was wrong before changing it. I think that is the general principle in Audacity. This is even more true given we cannot store preferences for Nyquist effects.
OK, I’ve changed it to “Ignore sounds shorter than … seconds”
In a way I guess it is.
The user case in the front of my mind is for splitting lectures/audio books.
The user wants to split in reasonable size chunks, but avoid splitting the natural flow (which is likely to happen with Regular Interval Labels).
If the effect is set to place labels between sentences and similar pauses, then it is likely to produce far too many labels.
Here we see a speech recording. The first label track shows the sounds that were detected with a threshold of -30 dB and minimum gap duration of 1.0 seconds. This has detected reasonable gaps, but some of the detected sounds are very short and this would export far too many files, some of which are only a few seconds duration.
The second label track shows what happens if we “group” the detected sounds. In the second track, if a detected sound is less than 2 minutes then it will be “grouped” with the next sound. This process continues until the length of the group exceeds the specified duration. In this case we have grouped sounds that are less than 2 minutes duration. The objective has thus been achieved in that the splits occur in natural pauses (> 1.0 seconds) but the exported files will all have a reasonable duration (at least 2 minutes).
It is perhaps more “advanced”, but it should perhaps be the default.
Can you think of any user cases where it would be desirable to allow labels to overlap?
I’d rather not remove the possibility of overlapping labels in case anyone actually does use that feature, though I can’t think of such a situation.
Which assumes that marking sound regions is not available in the simple effect (an assumption that I’m questioning).
That is one option that I have considered, but I’m not sure how useful it would be.
As you wrote in the other topic: “I think they are wanting a simple tool that places a point label somewhere sensible that gives them a bit of “silence” between the tracks.”
I agree that this is likely to be the main user case.
Ideally what I’d like to do is to re look at that task and provide a simple and logical tool to perform that task, and then have an “Advanced Sound Finder” to provide the flexibility for users that have more specific requirements.
I agree that if the underlength final label is needed, that would rule out “Minimum label length”. In that case I would be happier with “Combine sounds shorter than:” (you mention “combine” in your docs). That doesn’t imply quite so strongly that the labels are going to be strung together inside another label. Still better IMO would be “Make one label for sounds shorter than:”. It would extend the width by a couple of characters if no other wording changed.
The other reason I prefer “Minimum label length” is that it cannot possibly be misconstrued as a verb. “Label position” (after all those verbs above it) looks as if you might be “labelling a position” instead of “positioning a label”. I know this can never be perfect and sometimes you have to use verbs.
Not right now, but there is bound to be someone out there who wants that.
Which glosses over the issue that the simple effect is rather less “simple” if sound regions are included (as your mockup demonstrated). Which glosses over that there is also a demand (seen again in the votes) for minimum distance between labels in silence finder. We can overload the simple effect simply by putting too much in it.
For users of the current Silence Finder. I think those that use Sound Finder like its “simple” way to exclude as much silence as possible, but that may be too complex for Silence Finder users. Even if Silence Finder users can live with Sound Regions in the effect, sound regions could not be default. So under your scheme, neither the “simple” nor “advanced” effect would have a default for sound regions. I find that unacceptable.
Again, I think it may simply be too late to rename Silence Finder (other than replace “Finder” with some other term). Users of Silence Finder don’t see your lack of logic, though I can see them appreciating 1) consistent positioning (and possibly an option for) the last label and 2) a solution to sometimes missing the first song when there is not enough trailing silence. But I think the only thing they may actually understand easily is an option to “put an additional label at the start if needed”.
I would still rather see “Silence Finder” bug fixed, not radically altered or extended and not including sound regions (on the basis of what you presented so far).
The first sound starts after a few seconds.
It is less than 2 minutes, so the next sound is added to the group.
It is still less than 2 minutes so the next sound is added to the group.
At the 2 minute mark the “group length” is just under 2 minutes duration, so the next sound is added.
The next sound is about 10 seconds duration bringing the total group length to about 2:08
As the group length is now greater than the required 2 minute minimum, the next sound starts a new label.
What if all of the sounds are less than this value? Doesn’t “Make one label for sounds shorter than:” imply that all of the sounds will be grouped into one label? That’s not what happens. The sounds are only combined up to the specified “group size”.
(If after the final full 2 minutes there are, say, 3 sounds of 10 seconds each with 2 second gaps between, they will form the final group of 34 seconds duration. The final group is less than 2 minutes because we have run out of sounds.)
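The grouping steps described above can be sketched in Python like this. It is a simplified illustration, not the plug-in's code: `group_sounds` is a hypothetical name, and closing a group once its length reaches the setting exactly (rather than strictly exceeding it) is an assumption.

```python
def group_sounds(sounds, min_group_length):
    """Combine consecutive sounds until each group reaches min_group_length.

    sounds: sorted list of (start, end) times in seconds.
    The final group may be shorter if the sounds run out.
    """
    groups = []
    current = None
    for start, end in sounds:
        if current is None:
            current = (start, end)          # this sound starts a new group
        else:
            current = (current[0], end)     # add this sound to the group
        if current[1] - current[0] >= min_group_length:
            groups.append(current)
            current = None
    if current is not None:
        groups.append(current)              # underlength final group
    return groups

# One long sound, then three 10 s sounds with 2 s gaps between them.
sounds = [(0.0, 130.0), (132.0, 142.0), (144.0, 154.0), (156.0, 166.0)]
groups = group_sounds(sounds, 120.0)
final_length = groups[-1][1] - groups[-1][0]
```

Here the last three sounds form a final group of 34 seconds, which is shorter than the 2 minute setting only because there are no more sounds to add.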
The documentation is currently incomplete. I’ll update it when the plug-in is finalised.
That will be tricky with the existing SilenceMarker.ny code. It’s already like a plate of spaghetti without adding new logic to it. (Adding region labels does not require changes to the algorithm because the start/end times are already calculated (but not used) in Silence Finder.)
I’m not very keen on having no simple effect for labelling sound regions, so perhaps we need to retain simple versions of both Silence Finder and Sound Finder?
Silence Finder (I prefer the name “Silence Marker” or “Label Silences”):
mark silences (no exceptions)
Minimum distance between labels (default 0 seconds)
marks sounds with region labels by default but with an option for point labels.