Sound Finder / Silence Finder improvements

Thanks for your comments Gale.
I think discussion of the points in your most recent post fits better in the “Sound/Silence Finder enhancements” topic, so I’ll respond to them there.

I agree, hence the proposed defaults of zero.
What do you suggest for the maximum slider ranges for “Label position before start / after end of sound”?
How about
0.0 to 1.0 (default 0.0) for Label position before start?
0.0 to 10.0 (default 0.0) for Label position after end?


The control allows “gaps” (silences) in the sound up to a specified duration.
Perhaps a better name for this control would be:
Allow gaps in sound up to: … seconds

I think the meaning is more obvious than “Minimum duration of silence [seconds]”
“Minimum duration of silence” could be read to imply that this amount of silence is allowed at the end of a sound. For example, if a sound ends at 10 seconds, “Label position before/after” are set to 0.0 and “Minimum duration of silence” is set to 5.0 seconds, the user may (wrongly) assume that 5.0 seconds of silence before/after the sound is “allowed”, and so expect Sound Finder to mark the end of the sound at 15.0 seconds. In fact it will mark at around 10.0 seconds, unless trailing silence of less than 5 seconds is included in the selection (another somewhat strange behaviour).
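
To make the “Allow gaps in sound up to” reading concrete, here is a minimal Python sketch. It is not the plug-in’s Nyquist code: the function name `merge_short_gaps` and the representation of detected sounds as `(start, end)` tuples in seconds are assumptions made purely for illustration.

```python
def merge_short_gaps(sounds, max_gap):
    """Join consecutive (start, end) sounds whose separating gap is <= max_gap.

    Hypothetical helper: 'sounds' is a list of (start, end) times in seconds.
    """
    merged = []
    for start, end in sorted(sounds):
        if merged and start - merged[-1][1] <= max_gap:
            # The gap is short enough to be "allowed": extend the previous sound.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


sounds = [(0.0, 4.0), (4.5, 10.0), (17.0, 20.0)]
print(merge_short_gaps(sounds, max_gap=5.0))
# prints [(0.0, 10.0), (17.0, 20.0)]
```

The 0.5 second gap is absorbed into the first sound, but the sound still ends at 10.0 seconds: the setting joins sounds across short gaps and does not add 5 seconds of silence to the end.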


I prefer the word “disregard”.
As an example, let’s say that we have:
2 second sound → 1 second gap → 1 second sound → one second gap → 2 second sound.
gaps.png

  • If we “disregard” the 1 second sound in the middle, then effectively we have a 3 second gap (from 2.0 to 5.0 seconds). This is what the plug-in does.
  • If we “ignore” the 1 second sound in the middle then it is ambiguous as to whether we have effectively a 2 second gap or a 3 second gap.
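
The example above can be sketched in a few lines of Python. This is an illustration only, not the plug-in’s code: the function name `silences` is hypothetical, and as a simplification it treats every detected sound as “isolated”, whereas the plug-in only applies the rule to short sounds with recognised silence on both sides.

```python
def silences(sounds, ignore_shorter_than):
    """Return the gaps left after disregarding sounds shorter than the threshold.

    Hypothetical helper: 'sounds' is a sorted list of (start, end) in seconds.
    """
    kept = [(s, e) for s, e in sounds if e - s >= ignore_shorter_than]
    # Each gap runs from the end of one kept sound to the start of the next.
    return [(a_end, b_start) for (_, a_end), (b_start, _) in zip(kept, kept[1:])]


# 2 second sound -> 1 second gap -> 1 second sound -> 1 second gap -> 2 second sound
sounds = [(0.0, 2.0), (3.0, 4.0), (5.0, 7.0)]
print(silences(sounds, ignore_shorter_than=1.5))
# prints [(2.0, 5.0)]
```

With the 1 second sound disregarded, the result is a single 3 second gap from 2.0 to 5.0 seconds.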

"Does isolated mean any more than “between recognised sounds”? "
No, but it’s shorter.
“Disregard isolated sounds less than:…seconds”
“Disregard sounds between recognised sounds less than:…seconds” (longer and looks wrong)
“Sounds less than…seconds in silences between recognised sounds” (text after value much too long)
Any better suggestions?


I agree, but as “Advanced Sound Finder” was posted as a culmination of 7 pages of discussion I was not anticipating any major changes.
I can go back and add a date to each version if you think that necessary (the date is the same as the forum post date).

Following the feedback from Gale, Peter and Bill, I’ve made a few tweaks to Advanced Sound Finder:

  • Corrected typo “Label position”.
  • Disabled “Begin numbering from” option. Now always numbers from 1 if numbering is selected. (may be re-enabled by removing one semicolon from the start of line 22).
  • Changed error message from: “Try setting ‘Disregard isolated sounds’ to a smaller value” to: “Try setting ‘Disregard isolated sounds’ to a smaller value, or select a longer track region.”
  • “Label position before start of sound” range 0 to 5 seconds, default 0.1 seconds to help avoid clipping the beginning of tracks.
  • “Label position after end of sound” range 0 to 10 seconds, default 0.3 seconds, to help avoid clipping the end of tracks.
  • “Allow gaps in sound less than … seconds” changed to “Allow gaps in sound up to: … seconds”
  • “What to label” default changed to “Start of Sound” (highly recommended by Peter and Bill)

There is one new feature -
Option: “Allow overlapping labels” [No / Yes] default=No.
When set to “No” region labels will be constrained so that they do not overlap.
AdvancedSoundFinder-23May2012.ny (13.7 KB)

The “don’t overlap” feature is not quite right.
If we want this feature I’ll work on it.

That looks much better providing it does that - within the margin of error, a gap at the stated value is “allowed”.

I don’t think users will see the distinction between disregard and ignore. “Ignore” is simple - it is not treated as a recognisable sound. “Disregard” starts me thinking it has some “special” meaning. Same problem with “isolated” - does it have some special technical meaning for which you have to RTFM?

Is “ignore sounds less than” misleading? Or (if “disregard” is better), “Disregard sounds shorter than”?

I would really be guessing what that was without trying it or RTFM. Is it ever going to produce more than one label for that chosen length?

I suppose due to space constraints we have to choose if this or “Allow overlapping labels” is more important. Stopping overlap is quite a nice idea but isn’t it preventing what the user asked for in the label positioning? Possibly the numbering feature would be more widely used than this.

But not by me - it defeats the main distinguishing purpose of the tool! I would hate having to change that whenever I launch Sound Finder in a new session. You can use Silence Finder if you actually want silence or low level noise included in the exported files.


Gale

I think that the current wording is a more precise description of what the plug-in actually does. The plug-in has a “rule” to mark sounds that are above the threshold level, but this control creates a “special case” for short sounds that have recognised silence on both sides. In this special case, the effect is told to disregard the first rule and so not mark that sound.

If you’re happy that “ignore sounds less than … seconds” conveys the idea adequately then we can go with that.


As is the case with most of the more “advanced” tools (e.g. Compressor, Contrast, Auto Duck, Noise Removal…)


We can just squeeze in both controls if we reduce the ;info to one line of text.
Stopping overlap overrules the label position settings. In practice this can be quite useful as it provides a means to mark the middle of each silence, which can work very nicely when transferring a cassette album to CD.
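
A rough Python sketch of how stopping overlap could end up marking the middle of each silence. The midpoint rule here is an assumption inferred from the description above, and the function name `region_labels` is hypothetical; the actual plug-in may resolve overlaps differently.

```python
def region_labels(sounds, before, after, allow_overlap=False):
    """Pad each (start, end) sound, then optionally stop neighbours overlapping.

    Assumption: overlapping neighbours are both clamped to the midpoint of the
    silence between the two sounds. 'sounds' must be sorted and non-overlapping.
    """
    labels = [(max(0.0, s - before), e + after) for s, e in sounds]
    if allow_overlap:
        return labels
    for i in range(len(labels) - 1):
        (s1, e1), (s2, e2) = labels[i], labels[i + 1]
        if e1 > s2:
            # Overrule the label position settings: meet in the middle of the gap.
            mid = (sounds[i][1] + sounds[i + 1][0]) / 2
            labels[i] = (s1, mid)
            labels[i + 1] = (mid, e2)
    return labels


sounds = [(0.0, 10.0), (14.0, 20.0)]
print(region_labels(sounds, before=5.0, after=10.0))
# prints [(0.0, 12.0), (12.0, 30.0)]
```

With both label position settings at their maximum, the padded labels would overlap, so each one is cut back to 12.0 seconds, the middle of the 4 second silence.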


As described in the Sound/Silence Finder enhancements topic, the “distinguishing purpose” that you are referring to is an illusion. Both Sound Finder and Silence Finder create labels relative to detected sounds - in the case of Silence Finder they are point labels and in the case of Sound Finder they are region labels - other than that they are near identical (and share much of the same code).

The real distinguishing feature is that Advanced Sound Finder is far more flexible with a more powerful algorithm than either of the “simple” effects.

Whatever defaults we choose it will not suit everyone. The best that we can try to do is to please most of the people most of the time. If we have any way of knowing which option most of the people want then that’s the one to go for. In the absence of such information my guess is that most users would want point labels most of the time.

I believe so. Though I prefer “ignore sounds shorter than … seconds”.

I feel “Group” implies something “plural” whereas I think (without studying use cases) that we will only produce one label for this length. If that’s true I think it easier for the user to visualise the labels that are created than ones the algorithm sees but “groups”.

Leave “overlapping” in then, in principle, though I think it is more “advanced” than the numbering feature.

No, the “main distinguishing feature” I would not want to lose the Sound Finder default for is the region distinction. This is a very visually obvious distinction. It is a distinction which at default settings of both effects results in much less “silence” being included when using “Sound Finder” (that result is aurally obvious).

As already observed, the fact that Silence Finder really places labels based on detected sound is near-irrelevant to users of Silence Finder. That is not to say it couldn’t be improved. Do you want to rewrite Silence Finder so that it technically detects silence rather than sounds? That’s fine too if it remains “simple” and ideally avoids labelling sound regions. I don’t see what it achieves for most users to put sound regions in Silence Finder. There will probably be resistance to renaming that effect after all this time. We have marking of sound regions in Sound Finder (it’s the default at the moment).

Is your aim to have Advanced Sound Finder as a non-shipped effect and replace Silence Finder with a “simple” Silence and Sound Finder? I cannot see that working without over-complicating Silence Finder.

Then that is the flexible one that can be more “complex”.

As you say, information is sparse as to what users of Sound Finder want. But there is no strong demand I know of for changing it to use point labels by default, which seems to me a clumsy way of trying to mark sound regions (excluding silence). I would want strong evidence a long-established default was wrong before changing it. I think that is the general principle in Audacity. This is even more true given we cannot store preferences for Nyquist effects.




Gale

OK, I’ve changed it to “Ignore sounds shorter than … seconds”


In a way I guess it is.

The use case at the front of my mind is splitting lectures/audio books.
The user wants to split in reasonable size chunks, but avoid splitting the natural flow (which is likely to happen with Regular Interval Labels).
If the effect is set to place labels between sentences and similar pauses, then it is likely to produce far too many labels.

Here we see a speech recording. The first label track shows the sounds that were detected with a threshold of -30 dB and minimum gap duration of 1.0 seconds. This has detected reasonable gaps, but some of the detected sounds are very short and this would export far too many files, some of which are only a few seconds in duration.

The second label track shows what happens if we “group” the detected sounds. In the second track, if a detected sound is less than 2 minutes then it will be “grouped” with the next sound. This process continues until the length of the group exceeds the specified duration. In this case we have grouped sounds that are less than 2 minutes duration. The objective has thus been achieved in that the splits occur in natural pauses (> 1.0 seconds) but the exported files will all have a reasonable duration (at least 2 minutes).
window000.png

It is perhaps more “advanced”, but it should perhaps be the default.
Can you think of any use cases where it would be desirable to allow labels to overlap?
I’d rather not remove the possibility of overlapping labels in case anyone actually does use that feature, though I can’t think of such a situation.


Which assumes that marking sound regions is not available in the simple effect (an assumption that I’m questioning).

That is one option that I have considered, but I’m not sure how useful it would be.
As you wrote in the other topic:
“I think they are wanting a simple tool that places a point label somewhere sensible that gives them a bit of “silence” between the tracks.”
I agree that this is likely to be the main use case.

Ideally what I’d like to do is to look again at that task and provide a simple and logical tool to perform it, and then have an “Advanced Sound Finder” to provide the flexibility for users that have more specific requirements.

I am not very clear why the first grouped label in your image doesn’t end at about 2 mins. The sound that ends at 2 mins looks like the first one to end after the 1 minute specified.

I am not sure if we discussed it before, but have we agreed the best behaviour is to leave a final label that can be much shorter than the chosen “grouped” length? I don’t see this point in your docs http://manual.audacityteam.org/man/User:Stevethefiddle .

I agree that if the underlength final label is needed, that would rule out “Minimum label length”. In that case I would be happier with “Combine sounds shorter than:” (you mention “combine” in your docs). That doesn’t imply quite so strongly that the labels are going to be strung together inside another label. Still better IMO would be “Make one label for sounds shorter than:”. It would extend the width by a couple of characters if no other wording changed.

The other reason I prefer “Minimum label length” is that it cannot possibly be misconstrued as a verb. “Label position” (after all those verbs above it) looks as if you might be “labelling a position” instead of “positioning a label”. I know this can never be perfect and sometimes you have to use verbs.


Gale

Not right now, but there is bound to be someone out there who wants that.

Which glosses over the issue that the simple effect is rather less “simple” if sound regions are included (as your mockup demonstrated). Which glosses over that there is also a demand (seen again in the votes) for minimum distance between labels in silence finder. We can overload the simple effect simply by putting too much in it.

For users of the current Silence Finder. I think those that use Sound Finder like its “simple” way to exclude as much silence as possible, but that may be too complex for Silence Finder users. Even if Silence Finder users can live with Sound Regions in the effect, sound regions could not be default. So under your scheme, neither the “simple” nor “advanced” effect would have a default for sound regions. I find that unacceptable.

Again, I think it may simply be too late to rename Silence Finder (other than replace “Finder” with some other term). Users of Silence Finder don’t see your lack of logic, though I can see them appreciating 1) consistent positioning of (and possibly an option for) the last label and 2) a solution to sometimes missing the first song when there is not enough trailing silence. But I think the only thing they may actually understand easily is an option to “put an additional label at the start if needed”.

I would still rather see “Silence Finder” bug fixed, not radically altered or extended and not including sound regions (on the basis of what you presented so far).



Gale

The first sound starts after a few seconds.
It is less than 2 minutes, so the next sound is added to the group.
It is still less than 2 minutes so the next sound is added to the group.

At the 2 minute mark the “group length” is just under 2 minutes duration, so the next sound is added.
The next sound is about 10 seconds in duration, bringing the total group length to about 2:08.
As the group length is now greater than the required 2 minute minimum, the next sound starts a new label.



What if all of the sounds are less than this value? Doesn’t “Make one label for sounds shorter than:” imply that all of the sounds will be grouped into one label? That’s not what happens. The sounds are only combined up to the specified “group size”.


(If after the final full 2 minutes there are, say, 3 sounds of 10 seconds each with 2 second gaps between, they will form the final group of 34 seconds duration. The final group is less than 2 minutes because we have run out of sounds.)
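
The grouping walk-through above can be sketched in Python as follows. This is only an illustration of the behaviour as described in this thread, not the plug-in’s Nyquist code; the function name `group_sounds` and the example times are made up, and group length is assumed to run from the start of the first sound in the group to the end of the last (gaps included).

```python
def group_sounds(sounds, min_length):
    """Combine consecutive (start, end) sounds until each group reaches min_length.

    Hypothetical helper: the final group may be shorter than min_length
    because there are no more sounds left to add to it.
    """
    groups = []
    current = None
    for start, end in sounds:
        # Start a new group, or extend the open group to the end of this sound.
        current = (start, end) if current is None else (current[0], end)
        if current[1] - current[0] >= min_length:
            groups.append(current)
            current = None
    if current is not None:
        groups.append(current)  # under-length final group
    return groups


sounds = [(5, 50), (52, 90), (92, 123), (125, 133),   # adds up to 2:08
          (140, 150), (152, 162), (164, 174)]         # three 10 s sounds, 2 s gaps
groups = group_sounds(sounds, min_length=120)
print(groups)                         # prints [(5, 133), (140, 174)]
print([e - s for s, e in groups])     # prints [128, 34]
```

The first group only closes once it passes the 2 minute minimum (128 seconds), and the last three sounds form a final group of 34 seconds.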

The documentation is currently incomplete. I’ll update it when the plug-in is finalised.

That will be tricky with the existing SilenceMarker.ny code. It’s already like a plate of spaghetti without adding new logic to it. (Adding region labels does not require changes to the algorithm because the start/end times are already calculated (but not used) in Silence Finder.)


I’m not very keen on having no simple effect for labelling sound regions, so perhaps we need to retain simple versions of both Silence Finder and Sound Finder?

How about

Silence Finder (I prefer the name “Silence Marker” or “Label Silences”):

  • bug fixed
  • mark silences (no exceptions)
  • Minimum distance between labels (default 0 seconds)

Sound Finder:

  • bug fixed
  • marks sounds with region labels by default but with an option for point labels.

Advanced Sound Finder:

  • A downloadable plug-in on the wiki

Is “Label placement” better?

Now that I’ve had chance to sleep on this, I do see your point that if we have two “song labelling tools” (as we have now) then it is useful if the default for one is point labels and the default for the other is region labels (as we have now, though currently there is no choice of label type within the plug-ins).

What I find problematic is the idea that the default setting for a plug-in should be chosen just to make up the numbers.
The idea that Advanced Sound Finder should use region labels as the default because the other effect uses point labels is imo the wrong reason.
The decision about which should be the default setting in any plug-in should imo be based on which option is most useful to most users and for the Advanced Sound Finder I think that the most useful default is point labels.

Sadly there is no mechanism for a user-set default (other than editing the plain text .NY file) so I think we need a different solution.

One possibility is for 3 plug-ins: SilenceMarker (point labels), SoundMarker (region labels), Advanced Sound Finder (choice of label type).
The only problem that I see with this is that there are then 3 song labelling plug-ins which is probably a bit excessive but given your comment (above) I don’t see an alternative.


The other issue that I find highly problematic is the idea of being tied into a bad decision made in the past.
Is it not possible to gradually supersede and replace a feature with a better feature?

In my opinion Sound Finder should replace Silence Finder:

  • First songs can be handled logically with or without leading silence.
  • Trailing silence can be handled logically.
  • Excessive silences between “songs” can be handled.

The only thing that is missing in Sound Finder that is available in Silence Finder is the ability to use point labels and that is easily added with a single user option. If this change is too much of a shock for some users, is there some way that we can lessen the shock?


Minimum distance between labels is available in the Advanced Sound Finder (“groups”). If we really want to keep Silence Finder as simple as possible then why do we need to duplicate this feature in (simple) Silence Finder? It would be more simple without it.

@Steve, @Gale,
Would there be any advantage, either now or at some point in the (near?) future, in my offering my services as a “beta tester” for these plug-ins? I’m aware of some of the discussions that have taken place between the pair of you, but have not tried to follow all the details. I believe I would, in reality, be coming at these tools with no previous “baggage” to cloud my judgement or influence my thinking. There’s just one small problem: the only media that I have is all CD. Somebody would have to provide me with, say, the digital file from a transcribed LP and/or a transcribed cassette for me to play with.
As always from me… just a thought!
Peter

Thank you very much for the offer Peter.
As a general point, beta testers for plug-ins are always extremely welcome.
With regard to these particular plug-ins there will be a need for beta testing before release of any new versions but I don’t think that we are quite at that stage yet.

If you’d like to do some beta testing now, there are two new plug-ins that are currently waiting for testing before being uploaded to the wiki download pages:
https://forum.audacityteam.org/t/shelf-filter/24512/1
https://forum.audacityteam.org/t/turntable-warping-v3/24650/1

Should your reference above to “if a detected sound is less than 1 minute then it will be “grouped” with the next sound” have read “2 minutes” ?

I don’t see that “Group sounds that are less than:” avoids that objection except in so far as it’s vaguer so is open to more interpretations. Both ideas miss the explicit “up to”.

“Minimum label length” seems more likely to me to be interpreted as what we mean (except for the problem with the last label, but that is not I suppose an insurmountable objection). Does “Group short sounds until at least:”, “Group short sounds to minimum:” or similar help? Can you remind me of the other reasons why you now don’t like “Minimum label length”? To me it says all this with much less potential for confusion. I don’t care if the effect does what you are describing to make the label. I care about describing the end result the user sees.

Is that more useful than completing a label even if it is rather longer than the length specified?


Gale

Yes +1. It matches with current Silence Finder and Regular Interval Labels.



Gale

A lot hinges on my perception that new users will struggle both with sound regions and with the concept of splitting songs by labelling the songs. If you have songs separated by silence then it is easy to understand that you can split the songs by making a single chop in the silence. This is proved - it creates almost no issues for users; the main issue it creates has a well documented workaround (include audio before first label).

I think that labelling sound regions is sufficiently complex for new users (in concept and effect on the interface if combined with point labels) that there is probably no point putting it in a “simple” plug-in. That leaves a problem that an “advanced” plug-in may still be too complex for people of only average ability; and the suggestion that point labels are preferable for users even when marking sounds. I still see these as much lesser problems than putting sound regions into Silence Finder.

I understand that “minimum distance” may be difficult to code in the current Silence Finder and I think I am getting the impression that it may be easier to start over with a “simple effect” than patch up Silence Finder? In that case probably you should start over because I think “Groups” in Sound Finder is likely to be harder to understand in amongst all its features than putting it into a Silence Finder that places labels in silence.

I think most users who are not intimidated by labelling sounds rather than silence will cope OK with point labels in Sound Finder. I don’t think it is an important option (people who want point labels will probably prefer Silence Finder; people who are comfortable with sound regions will find it easier to actually label the region) but I would like it in if there is vertical room. My strong concern is that I don’t want a point label option to be default in Sound Finder.

All this still seems to be coming round to having

I concur with all of that, though I am not quite convinced Sound Finder will be so complex that it requires splitting into “simple” and “advanced”. If we can have minimum distance in Silence Finder then most novice to average users will probably not need to concern themselves with Sound Finder. Most who want to trim the silences on an album will probably be more comfortable doing it manually than using Sound Finder.

So in terms of the number of users it will help, the first task is probably to get a bug fixed Silence Finder out (specification above).

I too think there is a (fairly small) group who want a “simple” Sound Finder not much more complex than now, but I think trying to accommodate them by good design in a revamped Sound Finder will be easier and much less risky than trying to make an easily intelligible “Silence Finder” that is presented as marking points close to sounds and also offers sound regions.

I also see a group who find the current Sound Finder underpowered and inflexible, so I think it would be better to give them a shipped effect if it can be done.



Gale