Sound / Silence Marker

Discussed many times before but never resolved… When Sound Finder was first introduced it was proposed that it should eventually be integrated with Silence Finder.

Would this be too complicated?
If it is, then which features would you wish to retain/discard?
[Screenshot: SoundSilenceMarker.png]
Controls:

  • Sound/Silence threshold (dB): The detection threshold below which audio is considered “silent” and above which “sound”.
  • Min detected silence (seconds): Silences shorter than this are ignored.
  • Min detected sound (seconds): Sounds shorter than this are ignored.
  • What to label: Choice - Sounds / Silences.
    • Can label either sounds or silences.
  • Where to label: Choice - Before Start / Middle / After End / Region.
    • Can produce point labels:
      • before the start of the detected region,
      • in the middle of the detected region,
      • after the end of the detected region,
    • or can label the region (region label).
  • Before Start max offset (s): If a label is placed before the start of a sound/silence, this sets the maximum time (seconds) before the start. Detected regions are not allowed to overlap and the first label cannot be before the start of the selection.
  • After End max offset (s): If a label is placed after the end of a sound/silence, this sets the maximum time (seconds) after the end. Detected regions are not allowed to overlap.
  • Label text (optional): If not empty, each label will have this label text.
  • Number from (optional): If not empty, each label will start with a number, counting up from this number. Leading zeros are honoured.
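
To illustrate how the threshold and the two minimum-duration controls might interact, here is a minimal Python sketch. It is not the plug-in's actual code. It assumes the audio has already been reduced to one level in dB per analysis frame, that a level exactly at the threshold counts as silence, and that short silences are absorbed before short sounds (the order is a guess). The function and parameter names are hypothetical.

```python
from itertools import groupby


def merge_runs(runs):
    """Join neighbouring runs that share the same sound/silence flag."""
    merged = []
    for is_sound, frames in runs:
        if merged and merged[-1][0] == is_sound:
            merged[-1][1] += frames
        else:
            merged.append([is_sound, frames])
    return merged


def absorb_short(runs, target_is_sound, min_s, frame_s):
    """Reclassify runs of the target kind that last less than min_s seconds."""
    flipped = []
    for is_sound, frames in runs:
        too_short = is_sound == target_is_sound and frames * frame_s < min_s
        flipped.append([(not is_sound) if too_short else is_sound, frames])
    return merge_runs(flipped)


def detect_regions(levels_db, frame_s, threshold_db, min_silence_s, min_sound_s):
    """Return a list of (kind, start_s, end_s), kind being 'sound' or 'silence'."""
    # Frames above the threshold are "sound", everything else is "silence".
    flags = [level > threshold_db for level in levels_db]
    runs = [[flag, len(list(group))] for flag, group in groupby(flags)]
    # Assumed order: ignore short silences first, then ignore short sounds.
    runs = absorb_short(runs, False, min_silence_s, frame_s)
    runs = absorb_short(runs, True, min_sound_s, frame_s)
    regions, position = [], 0
    for is_sound, frames in runs:
        kind = "sound" if is_sound else "silence"
        regions.append((kind, position * frame_s, (position + frames) * frame_s))
        position += frames
    return regions
```

With 0.25 s frames, a 0.5 s dip inside a sound and a single-frame blip inside a silence both disappear when the minimums are 1.0 s and 0.5 s, leaving one silence, one sound and one silence.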

Looks good to me. I don’t think you can really do without any of those options (aside from label name and number, but that is so simple and useful it should be included).

It would certainly be simpler than having the two different effects.

Is this a GUI mockup, or do you have code ready to post?

– Bill

Part way between the two.
It is partly coded. I don’t want to waste a lot of time on this if it is not what people want.
Thanks for the feedback Bill.

I actually prefer two effects instead of one.
The Analyse menu does not contain an overwhelming number of effects, and if I can decide right at the very beginning whether I am searching for sound or silence, what’s wrong with that?
Of course both effects have to be identical in structure.
OK, I use them only once in a decade, so my opinion is rather biased and unreliable.

Which currently they are not, so if you want to search for sounds and use point labels, then hard luck, or if you want to search for silences and use region labels, again, hard luck, or if you want the labels numbered and search for silences, then yes you’ve guessed it, hard luck again. The proposed effect offers a lot more flexibility, which I hope is without too much more complexity.

Yes, they are not, but they should be equally structured.
I actually embrace a redefinition of those effects.
The only thing that I propose is that they are separated within the menu and not within the plug-in itself.
Of course, one can put a compressor and an expander into the same effect, but both are often available as separate devices.
So, if the user knows what he’s looking for, he can decide right away from within the menu. This could make the parameters a little bit more concise and readable.
As I said, this works only in the Analyse menu, which doesn’t hold a hundred items at once.
Anyway, as long as labels are poorly accessible, I don’t much care, so don’t give too much weight to my opinion.

OK, I’ll agree to disagree on this occasion :smiley:

After the “where to label” option I’ve added one more option.

  • Crackle rejection: Choice Enabled / Disabled. This will particularly benefit users that are splitting vinyl recordings as it makes silence detection immune from the effects of crackles.

I have also reversed the order of “Label text” and “Number from” as I think that it will generally be more useful for label numbers to be before other label text.
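
For what it's worth, the "number first, then text" ordering and the "leading zeros are honoured" behaviour could look something like the following Python sketch. It assumes the number is padded to the width of whatever was typed into “Number from” and that a single space separates number and text; `make_label` is a hypothetical helper, not a function from the plug-in.

```python
def make_label(index, number_from="", label_text=""):
    """Build the text for the label at 0-based position `index`."""
    parts = []
    typed = number_from.strip()
    if typed:
        # Keep the width the user typed, so "01" gives 01, 02, ... 10.
        number = int(typed) + index
        parts.append(str(number).zfill(len(typed)))
    if label_text:
        parts.append(label_text)
    return " ".join(parts)
```

So “Number from” = `01` with “Label text” = `Track` would give `01 Track`, `02 Track` and so on, while leaving “Number from” empty gives plain `Track` labels.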

Don’t be obsequious…
That’s a great idea about the crackling.
Am I right that instantaneous peaks are ignored?

You are, but reading that has made me realise that it’s not quite right. When enabled, very short “isolated” sounds within a period of silence are ignored (not counted). In fact this will make the silence count slightly wrong - the clicks should be added to the silence count.
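
The correction described above (clicks should be added to the silence count rather than simply dropped) can be sketched in Python as follows. This is only an illustration under assumptions: the audio is already split into alternating sound/silence runs with durations in milliseconds, and a "click" is any sound run no longer than a hypothetical `max_click_ms` that has silence on both sides.

```python
def reject_crackle(runs, max_click_ms):
    """Fold short isolated sounds into the surrounding silence.

    runs: list of (kind, duration_ms) with kind 'sound' or 'silence'.
    """
    cleaned = []
    for i, (kind, duration) in enumerate(runs):
        isolated_click = (
            kind == "sound"
            and duration <= max_click_ms
            and 0 < i < len(runs) - 1
            and runs[i - 1][0] == "silence"
            and runs[i + 1][0] == "silence"
        )
        if isolated_click:
            kind = "silence"  # the click's duration now counts as silence
        if cleaned and cleaned[-1][0] == kind:
            cleaned[-1] = (kind, cleaned[-1][1] + duration)
        else:
            cleaned.append((kind, duration))
    return cleaned
```

A 5 ms click between a 1000 ms and a 1500 ms silence then yields a single 2505 ms silence, rather than two separate silences or one of 2500 ms.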

From the point of view of power or energy it is indeed silence as soon as it lies below the threshold.
For certain applications, the use of a constant threshold is not adequate. But you’re working with the RMS values in the first place, which evens out the irregularities to a certain degree.
Sound events are a fascinating subject to analyse.
If I play a note on the trumpet sforza (bAAahuuuuaAAAAh), you will immediately know that it is one tone, but the Sound Finder will most likely detect two tones.
And it won’t detect the first transition or attack, and will decide that the tone ends in the middle of the exponential decay (of which there are two in a sforza).
A really excellent sound finder must therefore work with more parameters than only energy.
Maybe the YIN function could be examined too. It offers a “cheap” way to determine whether the so-called silence is only the cyclic tail/decay of a sound. It appears to me that this differentiation is important when the user decides to mark the end of sounds or the overall range. Am I wrong?
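
On the point about RMS evening out irregularities: a minimal Python sketch of a per-frame RMS level in dB is below. It assumes samples are floats in the range -1.0 to 1.0 and uses an arbitrary -120 dB floor for digital silence; the function names and the frame length are illustrative only and not taken from either effect.

```python
import math


def rms_db(samples, floor_db=-120.0):
    """RMS level of a block of samples in dB relative to full scale."""
    if not samples:
        return floor_db
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square <= 0.0:
        return floor_db
    # 10*log10(mean square) equals 20*log10(RMS).
    return max(floor_db, 10.0 * math.log10(mean_square))


def frame_levels_db(samples, frame_len):
    """Split the samples into consecutive frames and measure each one."""
    return [
        rms_db(samples[i:i + frame_len])
        for i in range(0, len(samples), frame_len)
    ]
```

A single full-scale click in an otherwise silent frame of 1000 samples peaks at 0 dB but has an RMS level of about -30 dB, which is why an energy-based threshold is fairly tolerant of isolated spikes.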

A Sound or Silence Finder will never be perfect. Trumpet playing aside, end-of-sound or region labeling will almost always fail on songs with fade-outs. That’s why I never use it - it is always necessary to tweak the results and the time spent tweaking is about the same as doing the whole job manually. That said, I know this is an often-requested effect and getting it “as right as possible” is a worthwhile endeavour.

– Bill

+1 for a unified tool

WC

I would be +1 on a unified tool if the parameters were saved so that in the 99.9% of the times that I use it for Silence Finder the odds are very high that it will start out configured for Silence Finder.

I have a whole bunch of albums recorded, but not yet split into tracks, on my hard drive. I could give this a good work out when you’re ready.

and +1 to that finessing

WC

@Steve:

  1. is the ball in your court on this one?
  2. Are you planning to write a proposal?
  3. Are you planning to write and submit the code?
  4. Can I archive this thread - or do you want (a simplified version of) it transferred to the Wiki>PFR?

Peter.

  1. Yes
  2. No
  3. Yes
  4. I’d like the votes to be recorded.

Progress so far is “slow but steady”. I dip into this when I have time. Getting it right is quite tricky and I’m not rushing it.

Negative votes aren’t really allowed on Feature Requests, but I’m still of the opinion that the current effects have the right idea a) a simple Silence Finder b) as complex as you like Sound Finder (and I guess it “could” find silence too). Think “Bass and Treble” versus EQ, Compressor versus Leveler etc.

The proposal here is I think seriously unusable for novices. Sorry, I have to say it, although I agree there are weaknesses and inflexibilities in both effects.

Also I think we should link here to the other long discussions we’ve had about one or both effects.


Gale

Your response raises more questions than it answers.

  1. a) a simple Silence Finder b) as complex as you like Sound Finder
    So what about users that want to mark silences but require more flexibility than the simple Silence Finder provides?
  2. The proposal here is I think seriously unusable for novices.
    because ?

One of the disagreements that we frequently have is that you strongly feel that more verbose control descriptions are easier for novices to understand, whereas I strongly disagree. We are not going to resolve that difference of opinion, though I note that virtually every hardware manufacturer agrees with my point of view. As one typical example (chosen simply because it is on the table next to me), a “beginners” DAB radio (branded “Bush”):

Control Labels // More verbose labels

Volume // Volume for playing audio
Standby Mode // Standby mode
Menu // Menu for additional options and functions
Alarm // Alarm clock settings
Sleep // Deactivate Alarm clock
Snooze // Pause Alarm clock’s alarm for 5 minutes
Back // Back to previous menu setting or option
Info // Information about current selection
Scan // Search for stations
Preset // Select station previously detected by scanning stations
Mute // Mute audio (overrides volume level)
Tune/Select // Multifunction control for manual adjustment of station tuning or for scrolling through menu options.

The professional designers at Bush (in common with virtually all hardware designers throughout the world) decided that the short labels are better - I agree with them, though I understand that all DAB radios may be too complicated for some technically challenged consumers.

Note that the posted screenshots are “pre-alpha” and so are subject to change.

Comparing the proposed effect (PE) with the current “simple Silence Finder” (SSF):

  • Both have a “threshold” control. SSF has the more verbose name “Treat audio below this level as silence [-dB]”. I would argue that the new “threshold” is “easier” because it uses negative values (as used elsewhere in Audacity) rather than negative units.
  • Both have a similarly named “minimum duration of silences” control.
  • PE has a “minimum duration of sounds” control. The control name is subject to change, but the concept is no more difficult than for silences.
  • PE has a “what to label (sounds/silences)”. Easy decision - do you want to label sounds, or do you want to label silences? If you don’t know, give up.
  • PE has a “where to label (Before start/After end/region)”. Easy decision - do you want the label before the start of the detected sound/silence, or after it, or mark the entire region (with a region label). If in doubt, leave at the default, which is the same as SSF.
  • Both effects have a “Before start offset” setting.
  • PE has a similar “After end offset”, which is conceptually no more difficult, but is required if the user marks the end or a region.
  • “Label text (optional)” - Easy.
  • “Number from (optional)” - Easy.

So where are the hard bits? I would argue that by far the hardest part of “simple Silence Finder” is getting the thing to work while avoiding false detections. The proposed effect is designed to make accurate detection considerably easier by using a more intelligent algorithm internally.

The “simple Silence Finder” is not at all simple if the user wants region labels.

I would also argue that the “simple Silence Finder” is conceptually confusing because it is actually detecting “sounds”, and then deducing that “silences” are the regions between the detected sounds, then placing a point label toward the end of the previous sound so that it is before the start of the deduced silence. The subtle difference between what the effect says and what the effect does will be lost on most users.
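
The "detect sounds, then deduce the silences" step described above amounts to taking the gaps between sound regions. A small Python sketch of that idea follows, assuming the sounds are given as (start, end) pairs in seconds within a selection; `gaps_between` is a hypothetical name and this is not how the existing effect is actually coded.

```python
def gaps_between(sounds, sel_start, sel_end):
    """Return the (start, end) stretches of the selection not covered by sounds."""
    silences = []
    cursor = sel_start
    for start, end in sorted(sounds):
        if start > cursor:
            silences.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < sel_end:
        silences.append((cursor, sel_end))
    return silences
```

Note that nothing here measures silence directly: a "silence" is just whatever was not classified as sound, which is the subtle difference being pointed out.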

Conceptually the proposed effect is much simpler in that if you wish to mark sounds, the effect will detect sounds and label them, but if you wish to mark silences then the effect will detect silences and label them.

The major difference between the proposed effect and the current effect is that the interface of the proposed effect breaks away from strict algorithmic descriptions and aims to produce more intuitive results, leaving the complex decisions about whether a “click” is a “sound” to the built-in AI rather than burdening the user with defining every parameter.

The major fallacies in saying that the current “Silence Finder” is “easy” are:

  1. The effect frequently fails to accomplish what the user wants (false positives or false negatives).
  2. It only deals with one case out of 6:
    • Point label before start of sound - Pass
    • Point label after end of sound - Fail
    • Point label before start of silence - Fail
    • Point label after end of silence - Fail
    • Mark sound with region label - Fail
    • Mark silence with region label - Fail
  3. “Label placement [seconds before silence ends]” :confused: What a convoluted way to say “before start of sound”. Double negatives are invariably a bad idea in UIs.

Essential controls for detecting silences are:

  1. Duration of what you mean by “silence”.
  2. Threshold below which you mean “silence”.

Essential controls for detecting sounds are:

  1. Duration of what you mean by “sounds”
  2. Threshold above which you mean “sounds”.

The proposed effect uses both “Duration of what you mean by silence” and “Duration of what you mean by sounds” in order to intelligently determine what is “sound” and what is “silence”, thus making the effect far more accurate with real world audio (fewer false positives/negatives).

Do you want to mark sounds, or do you want to mark silences?
Not a hard question with the proposed effect, but as described above is confused in the current “Silence Finder”.

All other controls are about where to put the labels in relation to the detected sounds/silences. This can be as flexible or inflexible as we deem appropriate.
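
As a rough illustration of the label-placement controls, here is a Python sketch of where a label might land for each “Where to label” choice. The clamping rule is an assumption: offsets are simply cut short at the neighbouring detected region (or at the selection start for the first one) so that nothing overlaps. `place_label` and its parameters are hypothetical names.

```python
def place_label(region, where, prev_end, next_start, before_max=0.0, after_max=0.0):
    """Return (label_start, label_end); a point label has start == end.

    region:     (start, end) of the detected sound or silence, in seconds
    prev_end:   end of the previous detected region, or the selection start
    next_start: start of the next detected region, or the selection end
    """
    start, end = region
    # Offsets are maximums: they shrink rather than overlap a neighbour.
    early = max(start - before_max, prev_end)
    late = min(end + after_max, next_start)
    if where == "before_start":
        return (early, early)
    if where == "middle":
        middle = (start + end) / 2.0
        return (middle, middle)
    if where == "after_end":
        return (late, late)
    if where == "region":
        return (early, late)
    raise ValueError("unknown placement: " + where)
```

For a sound from 2.0 s to 5.0 s with a 0.5 s “Before Start max offset”, the point label would go at 1.5 s. If the sound started at 0.25 s in a selection starting at 0.0 s, the label would be pulled in to 0.0 s.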

I wasn’t intending to provide answers to the plan you want to adopt, just trying to argue for what may be an easier solution to such issues as these effects have.

We sometimes have that argument, but I think that is only part of the issue here. The other part may be the sheer number of controls and the number of different concepts to be taken in. Unfortunately, removing helpful descriptions in the “harder to understand” controls may be the last straw when combined with layering in many extra controls as well.

Yes.

Why do we dumb down Bass and Treble for technically challenged consumers, but increase the complexity of Silence Finder which I think is often used by equally challenged people, if not more so?

Two reactions I had to that screenshot from users:

OMG Please don’t do that! I want to download my tapes to CD not take a degree in sciences

What’s wrong with it now? I don’t need all those features in my face making it harder to use.
Give it an advanced mode, perhaps ?

Silence Finder is already a “difficult” effect for many users. After Vocal Removal it’s the number one effect that I get personal requests for help with.

Hardly any users want region labels in Silence Finder. There are no votes for region labels in Silence Finder on Feature Requests and I don’t seem to have any stored that are waiting to be added. If there are votes uncounted (apart from yours) please show me where they are so they can be added.

Also IMO it would complicate the Tutorials e.g. Audacity Manual to discuss region labels in depth. And it seems you want sounds detection to be default.

I do think a control that offers “minimum duration of sound” or similar is wanted but at least we know a few users are asking for that. As the effects are now, Sound Finder may be the best place for that.

Precisely. So it is not really an issue for users. It is something you see by looking at the code.

Currently the user makes that decision by choosing the effect with the appropriate name.

:confused: I don’t see any “strict algorithmic descriptions” in the current Silence Finder interface.

More accurate results without too many false positives or missed detections is highly welcome. :wink:

This is not about claiming both effects are perfect - they are not so. It’s all about the user practicality of combining them. Logically I have always said I would prefer they would be combined, but often it just isn’t intuitively possible, or we might have had a combined Adjustable Fade and Text Envelope.

So I am clear, what is your objection to rewriting as separate simple “silence finder” and a complex “silence and sound finder”?

We did eventually get a very functional and very intuitive Adjustable Fade thanks to your efforts. The fact that there seem to have been few support issues with it is I think an indication that the effort was worthwhile.

I think combining Silence Finder and Sound Finder will be much harder because so many new and challenged users will be using it who have spent money on their USB tape and record converters. This effect is right up front in those users’ eyes and right up front in the damage it can do if the effect is seen as too complicated. Many will of course be trying to use this without looking at the Audacity Manual (or reading some manual that came with the hardware which describes something like the long-established Silence Finder). We’ve really got to be careful with this, IMHO.

In sum I see no compelling need to combine Silence Finder and Sound Finder. There are weaknesses in both but (as far as I know) there are not large numbers of complaints about either nor are there user demands to combine them that I know of.

If you are adamant we must combine these effects then I’ll try and work with you, that’s all I can say, but I am not sure it is as good a use of time as improving separate effects.


Gale

Most of Audacity is “seriously unusable for novices” at least for those that don’t bother to RTFM - digital audio editing is not trivial …

I’m basically with Steve here with regard to shorter labels etc. - longer labelling is tedious for the power user and the seasoned regular user (of whom there are many).
And let’s not forget the nub of this particular proposal was to consolidate the two analyzers into a single one since they are so closely related - and I strongly support that underlying idea.

So what we should be exploring here is how to make the GUI for this proposed unified analyser understandable, approachable and effective.

I, for one, certainly don’t want to end up with two separate analyzers for this: one for novices and t’other for power users. That would be a nonsense imo. Maybe though we could explore a default dialog which was simplified and an “Advanced Usage” (or whatever) button to reveal a more complex dialog interaction - plus a preference setting to always get the advanced dialog.

Peter.

P.S. I moved this back from the archive as it was locked for writing there (apart from Steve and Gale) - and I don’t think we are done with this topic quite yet.