label-sounds.ny not working for me, both mac and windows.

Hello, I posted the issue here: https://github.com/audacity/audacity/issues/1032 I’m using 3.0.2.

Maybe it’s user error (probably, I’m still pretty new with this program), but I thought not, because all the other nyquist plug-ins worked just fine for me (find sound, find silence, etc.).

Basically, I can’t get label-sounds to detect more than one sound in a very long track. I tried selecting just a small section of the long track and still got the same behavior. I tried every possible value for the dB threshold, as well as every other option in label-sounds, on both macOS Big Sur and Windows 10. I’m very stuck because the other sound/silence detecting plug-ins work just fine for me. For now I’m happy using Find Sounds instead, but I would really love to get label-sounds working, as it has that neat “Minimum label duration” setting, which can help me avoid making clips that are too short and which the other plug-ins do not have.

Thanks,
Spatula

Let’s try a test case to check that the plug-in is actually working.

  1. Launch Audacity
  2. “Generate menu > Risset Drum” with default settings (as shown here), then “OK”.
  3. “Effect menu > Repeat” (below “Repair”), set the number of repeats to 9, then “OK”
  4. “Ctrl + F” (fit to screen, so that we can see the entire track)
    You should see the entire track selected, with 10 audio clips.
  5. “Analyze menu > Label Sounds”
  6. Click on “Manage button > Factory Presets > Defaults”
  7. Click “OK”

You should see a result like this:


I followed your steps and it did label the sounds, the first time I’ve seen it work. Exciting. So I opened my long track and tried the same settings (changing nothing from the defaults) and still only got one label. The only differences I can think of are that my track is 5 hours long, the source is an MP3, its sample rate is 22050 Hz, and it is stereo (but I have also mixed it down to mono previously and it still didn’t work). Really strange. The source track is very clean (an audiobook), the silences look very silent to me, and it does work when I use those other plug-ins, so I’m really confused now. :smiley:

https://media.discordapp.net/attachments/842259121189158962/855852412212215858/unknown.png?width=1570&height=935

And as a test, I did 10,000 risset drums which came to 5hrs 33mins, and it was able to label all 10,000 of them, so it doesn’t appear to be an issue of length?

Super :smiley:

Let’s turn our attention to the audiobook recording.
What are you trying to label? Your screenshot looks like one continuous noise, but I assume there are some small gaps that you are trying to detect. How long are those gaps?

Tip: To measure the peak amplitude of a selection of audio:

  1. Select the audio that you want to measure.
  2. Open the “Amplify” effect.
  3. Observe the default “Amplification (dB)” setting.
    By default, the Amplify effect offers to amplify the selection so that its peak reaches 0 dB, so the “Amplification (dB)” setting shows the number of dB required to bring the selection up to 0 dB. Put another way, it shows how many dB the peak is below full track height.
  4. Click the “Cancel” button to close the Amplify effect.

Example: If the peak amplitude of the selected audio is -20 dB, then the Amplify effect will offer to amplify by +20 dB. Thus, if the default “Amplification (dB)” says “20”, then we know that the peak amplitude of the selection is -20 dB.
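
For anyone who prefers to see the relationship as arithmetic, here is a minimal Python sketch. It is not Audacity code; the function names `peak_db` and `amplify_offer_db` are made up for illustration, and samples are assumed to be floats where 1.0 is full track height.

```python
import math

def peak_db(samples):
    """Peak amplitude in dB relative to full scale (1.0 = 0 dB)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # absolute digital silence has no finite dB value
    return 20 * math.log10(peak)

def amplify_offer_db(samples):
    """Gain in dB needed to bring the peak up to 0 dB."""
    return -peak_db(samples)

# A selection whose largest sample is 0.1 of full scale.
samples = [0.0, 0.05, -0.1, 0.02]
print(round(peak_db(samples), 1))           # -20.0
print(round(amplify_offer_db(samples), 1))  # 20.0
```

So an offer of “20” corresponds to a peak of -20 dB, matching the example above.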


Pick a loud part of the recording and measure the peak amplitude. What is it?

Select a short section within a “gap” that you want to detect. What is the peak amplitude of that “gap”?
Measure a few of the gaps for both the length of the gap, and amplitude of the noise within the gap.

Gaps are varied, typical spoken narration pauses. A short gap is about 0.303s and a long gap is about 1.324s. Average gaps seem to be about 0.766s.
On a typical loud peak it offers about 6.0 dB to 7.0 dB of amplification, with a New Peak Amplitude of 0.0 dB. On a silent section it offers 50 dB of amplification with a New Peak Amplitude of -9.165 dB, and another silent part showed 50 dB of amplification with a New Peak Amplitude of -31.145 dB. This is what a typical waveform looks like in “normal zoom”: https://media.discordapp.net/attachments/842259121189158962/855884612240146442/unknown.png?width=1582&height=937

Referring to: Label Sounds - Audacity Manual



Threshold level (dB): (default -30 dB)
When audio is below this level, it is considered to be ‘silence’. The lower (more negative) this setting, the quieter the background level must be to be recognized as “silent”. If set below the track’s noise floor level, the entire track will be seen as one continuous sound.


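So the first “silence” has a peak of about -59 dB (50 dB of amplification still leaves the new peak at -9.165 dB),

and the second “silence” has a peak of about -81 dB (50 dB of amplification leaves the new peak at -31.145 dB).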
So we need to set the “Threshold level (dB)” a little bit higher than the silences. Let’s try -50 dB.
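
The figures above come from subtracting the offered amplification from the reported new peak. A small Python sketch of that arithmetic, using the numbers reported in this thread (the helper names are hypothetical, not anything in Audacity):

```python
def original_peak_db(amplification_db, new_peak_db):
    """Peak of the selection before amplifying."""
    return new_peak_db - amplification_db

def threshold_separates(threshold_db, silence_peaks_db, loud_peaks_db):
    """True if the threshold sits above every silence and below every loud peak."""
    return max(silence_peaks_db) < threshold_db < min(loud_peaks_db)

silences = [original_peak_db(50.0, -9.165), original_peak_db(50.0, -31.145)]
loud = [original_peak_db(6.0, 0.0), original_peak_db(7.0, 0.0)]

print(round(silences[0], 3))                       # -59.165
print(round(silences[1], 3))                       # -81.145
print(threshold_separates(-50.0, silences, loud))  # True
print(threshold_separates(-70.0, silences, loud))  # False
```

A threshold of -70 dB fails the check because the louder of the two silences (about -59 dB) would still count as sound.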



Threshold measurement: (default “Peak level”)
Peak level: The threshold measurement is based on the peak amplitude in each 10ms period.

We’ve been using peak amplitude measurements, so let’s leave this set to “Peak level”.
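
To illustrate what “peak amplitude in each 10ms period” means, here is a rough Python sketch. It is only my illustration of the idea, not the plug-in's actual code (which is written in Nyquist); `block_peaks` and `above_threshold` are made-up names, and a 1000 Hz sample rate is used just to keep the numbers small.

```python
def block_peaks(samples, sample_rate, block_seconds=0.01):
    """Peak absolute sample value in each consecutive block of about 10 ms."""
    block = max(1, round(sample_rate * block_seconds))
    return [max(abs(s) for s in samples[i:i + block])
            for i in range(0, len(samples), block)]

def above_threshold(peaks, threshold_db):
    """Mark each block as sound (True) or silence (False)."""
    limit = 10 ** (threshold_db / 20)  # convert dB to a linear amplitude
    return [p >= limit for p in peaks]

# 10 ms of speech, 10 ms of near-silence, 10 ms of speech at 1000 Hz.
samples = [0.5] * 10 + [0.001] * 10 + [0.3] * 10
peaks = block_peaks(samples, 1000)
print(peaks)                          # [0.5, 0.001, 0.3]
print(above_threshold(peaks, -50.0))  # [True, False, True]
```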



Minimum silence duration: (default 1 second)
When ‘silence’ of this duration (or longer) is found, preceding sound and following sound are considered to be separate sounds


So to detect gaps that are 0.303 seconds we need to set the “Minimum silence duration” to 0.303 seconds or less. Let’s try 0.25 seconds.
(When set to the default “1 second”, gaps less than 1 second are ignored)
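
As a quick check against the gap lengths measured earlier in this thread, here is a tiny Python sketch of the rule “silence of this duration (or longer) separates sounds”. The function name is hypothetical and this is just the comparison, not the plug-in's code.

```python
def gaps_that_split(gap_lengths, min_silence):
    """Gaps (seconds) long enough to separate one sound from the next."""
    return [g for g in gap_lengths if g >= min_silence]

gaps = [0.303, 0.766, 1.324]  # short, average and long gaps from the recording

print(gaps_that_split(gaps, 1.0))   # [1.324]
print(gaps_that_split(gaps, 0.25))  # [0.303, 0.766, 1.324]
```

With the default of 1 second only the longest pauses would split the narration; at 0.25 seconds all three measured gaps do.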



Minimum label interval: (default 1 second)
Allows short sounds to be grouped within a label region. This ensures that labels will be at intervals of no less than this length. In effect this combines short sounds to create a group of sounds that is at least the specified length. Valid values are between 0.01 seconds to 2 hours.

Looking at your second screenshot (https://media.discordapp.net/attachments/842259121189158962/855884612240146442/unknown.png?width=1582&height=937), each of the sounds is longer than 1 second, so let’s leave this control set to 1 second so that every sound longer than 1 second will be labelled individually.



Label type: (default: Region around sounds)
Sounds / silences are labelled either with point labels or region labels.
Point before sound: This option places a point label before each detected sound or group of sounds.

You can choose whichever you want to suit your needs, but for now let’s leave this as “Region around sounds”.



Maximum leading silence: (default: 0 seconds)
When labeling sounds, a point label, or the start of a region label will be placed before the beginning of the sound by up to this amount.

So this option provides an offset for the position of each label. Let’s leave this at zero for now.
If you want to shift the start of each of the labels a little bit earlier so that there is 0.1 seconds of space included in the labelled region before the start of the sound, then you could set this to 0.1 seconds.



Maximum trailing silence: (default 0 seconds)
This setting is used by region labels only.
When labeling sounds, the end of a region label will be placed this distance after the end of a sound, provided that there is room to do so before the next sound.

This can provide an offset for the end of each region label. As the sounds that we are labelling mostly end with a gradually fading tail, let’s allow 0.1 seconds for the end of the tail. Set this to 0.1 seconds.
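
Here is a rough Python sketch of how the leading and trailing offsets could move the label edges. This is my reading of the manual text (“up to this amount”, “provided that there is room to do so before the next sound”), not the plug-in's actual code. The function name and the clamping to the neighbouring sounds are assumptions.

```python
def label_regions(sounds, lead=0.0, trail=0.0):
    """sounds: sorted, non-overlapping (start, end) pairs in seconds.

    Returns region labels widened by up to `lead` before each sound and
    `trail` after it, without running into the neighbouring sounds.
    """
    regions = []
    for i, (start, end) in enumerate(sounds):
        prev_end = sounds[i - 1][1] if i > 0 else 0.0
        next_start = sounds[i + 1][0] if i + 1 < len(sounds) else float("inf")
        label_start = max(prev_end, start - lead)  # assumed: stop at previous sound
        label_end = min(next_start, end + trail)   # assumed: stop at next sound
        regions.append((label_start, label_end))
    return regions

sounds = [(1.0, 2.0), (2.05, 3.0), (4.0, 5.0)]
for start, end in label_regions(sounds, trail=0.1):
    print(round(start, 2), round(end, 2))
```

In this example the first label can only extend by 0.05 seconds, because the next sound starts at 2.05 seconds; the other two get the full 0.1 seconds of tail.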



Label text: (default “Sound ##1”)
This is the text that will be entered in each label.

This can be customised to suit your needs, but let’s leave this at defaults for now.



If you do the above, the settings should look like this:

I tried exactly those settings. Still labels the entire track as one sound. Something is definitely strange with it. I can’t comprehend what would cause this when you’ve shown it does work. So strange! :smiley: here’s my screenshot (I just re-opened label-sounds to show what settings were used).

https://media.discordapp.net/attachments/856232479413108757/856232540293038110/unknown.png?width=1576&height=935

I ran the ACX plug-in on a one hour section of it. This is what that shows:
https://media.discordapp.net/attachments/856232479413108757/856235239634632735/unknown.png?width=1582&height=936

Select a section of about 10 seconds duration.
“File menu > Export > Export Selected Audio”
Export the selection as a FLAC file.
Attach the FLAC file to your reply (See: https://forum.audacityteam.org/t/how-to-attach-files-to-forum-posts/24026/1)

Sample is attached. Also if I open that sample and run label sounds with our settings, it labels it as one sound also.

Here is a longer 36 second sample also, in case you need a longer clip to test with.

Mystery solved. There’s a bug in “Label Sounds”.

You can work around the problem by resampling the track to 44100 Hz. See: https://manual.audacityteam.org/man/tracks_menu.html#resample

I’ll submit a bug fix, which hopefully may be in time for the Audacity 3.0.3 release.

I’ve sent the fix to the developers.