ACX Check fails in 2.4.1

Hmm, the behavior that Koz describes suggests that perhaps Nyquist sound memory isn’t getting freed up, or is getting fragmented.

Steve: Is there a Lisp idiom to only call the snd-set-max-audio-mem function if it exists?

Koz: I’ll be sending you a test version.

OK
Koz

Steve: Is there a Lisp idiom to only call the snd-set-max-audio-mem function if it exists?

Never mind, figured it out:

(if (fboundp (quote snd-set-max-audio-mem)) (snd-set-max-audio-mem 1073741824) )

Exactly what I would have suggested :slight_smile:
well … very nearly, I’d have used a quote character rather than the word :smiley:

(if (fboundp 'snd-set-max-audio-mem)
    (snd-set-max-audio-mem <bytes>))

Terrific.

Mac Mini
16GB memory
======================

Clean the Machine
Clear Desktop
Restart

Audacity 2.4.1

30 minute mono show
Pink Noise
Low Rolloff
RMS Normalize
Analyze > ACX Check


30 minutes Too Long

13 minutes Too Long

10 minutes Succeeds 
-- Seems to be correct.
-- No spinning beachballs.

10 minutes Success


30 minutes  Too Long

10 minutes  Success

12 minutes  Too Long

11 minutes  Success

03 minutes  Success


11:30 minutes  Too Long

11:15 minutes Success
11:15 minutes Success
11:15 minutes Success
11:15 minutes Success


========================================

120 minute mono show
Pink Noise
Low Rolloff
RMS Normalize
Analyze > ACX Check

120 minutes  Too Long
90 minutes   Too Long
60 minutes   Too Long
30 minutes   Too Long

15 minutes   Too Long
11 minutes   Success

11 minute Stereo
             Success
15 minutes   Too Long



15 Seconds   Success
10 Seconds   Success

10 Second Stereo
             Success



==================================
Analyzed "Furries" mono voice track

31.1 seconds  Success
-- Compared with legacy method -
-- analysis is correct.

-end-

What am I doing to the code?
Koz

What am I doing to the code?

Line 58: replace

(setq max-len 30000000)

with

(setq max-len 50000000)

And test again to see if it breaks. (This will increase the limit to about 18.9 minutes at a 44.1 kHz sample rate.)
If it doesn’t break, make it even bigger.
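For anyone wanting to check the arithmetic, here is a minimal Python sketch of how max-len (a sample count) maps to a maximum selection length. It assumes a 44.1 kHz track, which is not stated in the thread but fits the test results above (11:15 succeeds and 11:30 is too long with the 30M limit). The function name is made up for illustration.

```python
SAMPLE_RATE_HZ = 44100  # assumption: not stated in the thread


def max_minutes(max_len, sample_rate=SAMPLE_RATE_HZ):
    """Longest selection, in minutes, that fits within max_len samples."""
    return max_len / sample_rate / 60


for max_len in (30_000_000, 50_000_000):
    print(f"{max_len:,} samples -> {max_minutes(max_len):.1f} minutes")
```

With these assumptions, 30M samples comes out at a little over 11 minutes and 50M at just under 19 minutes.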

Once we figure out a safe ratio for the two parameters (with probably a factor of 2 safety margin) I’ll make a version that computes the memory request from the number of samples so that users with lots of memory can increase the parameter if desired.

OK
Koz

OK. I can’t focus any more.

On my machine, the hero setq max-len appears to be between 65M and 80M. 50M and 65M succeed. 80M crashes and gives a Debug Report.

nyx_error returned from ACX Check.
The maximum number of sample blocks has been
reached, so audio computation must be terminated.
Probably, your program should not be retaining
so many samples in memory. You can get and set
the maximum using SND-SET-MAX-AUDIO-MEM.
error: audio memory exhausted
Function: #<Subr-SND-LENGTH: #7fbe05a244f0>
Arguments:
  #<Sound: #147cbc520>
  2000000
Function: #<FSubr-SETQ: #7fbe05a2b6d8>
Arguments:
  BSIZE
  (SND-LENGTH RESULT CHUNK)
Function: #<Closure-MY-RMS: #7fbe05bb51c8>
Arguments:
  #<Sound: #147cbc2c0>
  7.938e+07
Function: #<FSubr-SETQ: #7fbe05a2b6d8>
Arguments:
  LARMS
  (* FUDGE (MY-RMS SA LEN))
Function: #<Closure-ANALYZE: #7fbe05bb06a8>
Arguments:
  #<Sound: #112412460>
  7.938e+07
Function: #<Closure-ANALYZE-MONO: #7fbe05bb0438>
Arguments:
  #<Sound: #112412460>
  7.938e+07
Function: #<FSubr-IF: #7fbe05a29860>
Arguments:
  (ARRAYP S)
  (ANALYZE-STEREO S LEN)
  (ANALYZE-MONO S LEN)
Function: #<FSubr-IF: #7fbe05a29860>
Arguments:
  (> LEN MAX-LEN)
  (FORMAT NIL "Selection too long for analysis, please select shorter section~%")
  (IF (ARRAYP S) (ANALYZE-STEREO S LEN) (ANALYZE-MONO S LEN))
1>

At 65M, the content cutoff point appears to be between 24:30 and 25 minutes.

Mac Mini
16GB memory
======================
80M to 65M


Clean the Machine
Clear Desktop
Restart

Audacity 2.4.1

30 minute mono show
Pink Noise
Low Rolloff
RMS Normalize
Analyze > ACX Check


30 minutes  Too Long

15 minutes  Succeeds

22 minutes  Succeeds

26 minutes  Too Long

25 minutes  Too Long

22 minutes  Succeeds

23 minutes  Succeeds

24 minutes  Succeeds

24:30 minutes
            Succeeds

25 minutes  Too Long


============================

120 minutes  Too Long

90 minutes   Too Long

60 Minutes   Too Long

30 Minutes   Too Long

15 minutes   Succeeds


-end-

Koz

OK, so 65M works out to a factor of about 16 between max-len and the memory allocation. So I’m going to use 30 to provide a buffer for fragmentation or other system-dependent issues.
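A rough Python sketch of that ratio, for illustration only. It assumes the test version requested 1073741824 bytes (the value passed to snd-set-max-audio-mem earlier in the thread) and that the request simply scales with the number of samples. The function names are hypothetical, not part of any plug-in.

```python
TESTED_MEMORY_BYTES = 1_073_741_824  # assumption: value from the earlier snd-set-max-audio-mem call
SAFETY_FACTOR = 30                   # bytes requested per sample, as proposed above


def observed_factor(memory_bytes, max_len):
    """Bytes of audio memory available per sample at the largest working max-len."""
    return memory_bytes / max_len


def memory_request_bytes(num_samples, factor=SAFETY_FACTOR):
    """Memory to request for a selection of num_samples samples."""
    return num_samples * factor


print(round(observed_factor(TESTED_MEMORY_BYTES, 65_000_000), 1))
print(memory_request_bytes(65_000_000))
```

Under these assumptions the observed factor is a little over 16, so a factor of 30 leaves a bit less than a 2x margin.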

Look for a new version late tomorrow.

How about something like this?
(This is not a full plug-in, so it has to be run in the Nyquist Prompt. It requires a fairly recent version of Audacity)

(defun getfloor (sig &aux (chunk 30))
  ;; Return guesstimate of noise floor.
  ;; 'sig' is the RMS level and has around 100 Hz sample rate.
  ;; We look for the lowest 0.1 s block as the local 'floor'
  ;; in each 'chunk' seconds, then pick the highest local floor from 'floor-list'.
  (setf sig (snd-avg sig 10 10 op-peak))
  (let ((floor 999)
        (floor-list ())
        (chunkcount 0))
    (setf srate (snd-srate sig))
    (setf chunk (* chunk srate))  ;chunk size in samples
    (do ((val (snd-fetch sig) (snd-fetch sig)))
        ((not val))
      (cond
        ((<= chunkcount chunk)
            (incf chunkcount)
            (when (< val floor)
              (setf floor val)))
        (t  (setf chunkcount 1)
            (push floor floor-list)
            (setf floor val))))
    (push floor floor-list)
    (setf floor 0)
    (dolist (val floor-list)
      ;(print (linear-to-db val))
      (setf floor (max val floor)))
    (linear-to-db floor)))


(defun to-mono (sig)
  ;;; coerce sig to mono.
  (if (arrayp sig)
    (s-max (s-abs (aref sig 0))
           (s-abs (aref sig 1)))
    sig))


(defun stereo-rms (ar)
  ;;; Stereo RMS is the root mean of all (samples ^ 2) [both channels]
  (let ((left-mean-sq (* (aref ar 0)(aref ar 0)))
        (right-mean-sq (* (aref ar 1)(aref ar 1))))
    (sqrt (/ (+ left-mean-sq right-mean-sq) 2.0))))


(setf peak (linear-to-db (get '*selection* 'peak-level)))

(setf rms
  (let ((rms (get '*selection* 'rms)))
    (if (arrayp rms)
        (linear-to-db (stereo-rms rms))
        (linear-to-db rms))))

(setf noisefloor
  (let ((sig (rms (to-mono *track*))))
    (setf *track* nil)  ;free *track* from memory
    (getfloor sig)))

(format nil "Peak level:  ~a dB.~%~
            RMS level:  ~a dB.~%~
            Noise floor: ~a dB."
        peak
        rms
        noisefloor)

How about something like this?

What’s this code’s talent in English?

Koz

Peak and RMS measurements are extremely fast and use very little resources.

The noise floor calculation is the only part that requires significant work from Audacity.
This calculation processes the entire track, and releases memory as it goes.
The algorithm is as follows:

  1. If the track is stereo, convert to mono as the maximum of left and right channels.
    If the track is true stereo (rather than dual mono), this may give a result a little higher than expected, though it is arguably more accurate as a “floor” measurement than calculating the stereo average.
  2. Calculate the RMS with a 0.01s window (the root mean square for each 1/100th second)
  3. Find the peak of each 10 samples of the RMS signal.
    Each result represents the maximum RMS for a 1/10th second interval.
  4. Search for the lowest 1/10th second in each 30 second “chunk”.
    This gives the lowest level that is sustained for at least 0.1 seconds within the 30 seconds - the noise floor for that chunk.
  5. Compare the noise floor in each chunk and pick the highest.
    This is our final noise floor figure.
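
The five steps above can be sketched in plain Python as follows. This is an illustration of the description, not a translation of the Nyquist code: it works on ordinary lists of samples, ignores the exact chunk-boundary handling of the original, and the function names and the `rate` / `chunk_seconds` parameters are made up for the example.

```python
import math


def rms_windows(samples, window):
    """RMS of each complete block of `window` samples."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + window]) / window)
        for i in range(0, len(samples) - window + 1, window)
    ]


def block_peaks(values, size):
    """Maximum of each complete block of `size` values."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


def noise_floor_db(left, right=None, rate=44100, chunk_seconds=30):
    # Step 1: stereo to mono as the per-sample maximum of |left| and |right|.
    if right is None:
        mono = [abs(x) for x in left]
    else:
        mono = [max(abs(l), abs(r)) for l, r in zip(left, right)]
    # Step 2: RMS over 0.01 s windows.
    rms = rms_windows(mono, max(1, rate // 100))
    # Step 3: peak of every 10 RMS values, one result per 0.1 s.
    tenths = block_peaks(rms, 10)
    if not tenths:
        raise ValueError("need at least 0.1 s of audio")
    # Step 4: lowest 0.1 s value in each chunk (the last chunk may be short).
    per_chunk = chunk_seconds * 10
    floors = [min(tenths[i:i + per_chunk])
              for i in range(0, len(tenths), per_chunk)]
    # Step 5: the highest of the per-chunk floors is the reported figure.
    worst = max(floors)
    return 20 * math.log10(worst) if worst > 0 else float("-inf")
```

For example, with two chunks whose quietest 0.1 s intervals sit at 0.001 and 0.01 linear, the function reports the higher of the two, which is -40 dB.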

Note that this measurement is not suitable for advertising purposes as it is showing the maximum noise floor level for the entire track.

On my i7 Linux laptop, processing 20 hours of stereo audio takes about 16 seconds.

processing 20 hours of stereo audio

And your SSD got warm.

I think I had you right up to (5). Is that a typo? You picked the highest of the low points in each chunk?

Koz

Can we assume there’s no way this is ever going to see the room tone at the beginning and end of an audiobook chapter?

I’m still trying to wrap my head around what’s actually going to show up in this analysis. I don’t trust my not being able to do that.

Select the highest of the .01 samples over .1 seconds. Select the lowest .1 second sample over 30 seconds. Select the highest 30 second block.

Jot that down in your copybook being careful not to blot anything.

Koz

:smiley: Not noticeably.


It’s not a typo.

Say that you have a track that is 1 minute 30 seconds. That’s 3 chunks.
Now say that the first chunk has a noise floor of -64 dB, the second chunk has a noise floor of -58 dB, and the third chunk has a noise floor of -65 dB.
Which result do you want to see as the result? -64, -58 or -65?

My working assumption is that according to the specification, the noise floor should not exceed -60 dB RMS anywhere in the recording.

My thinking is that the analysis would be more useful if we showed all three results, which we could do with labels. A user would then be able to see if they have one noisy section in an otherwise good recording.
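
As a rough illustration of the label idea, here is a Python sketch that turns a list of per-chunk noise floors into (start, end, text) labels and flags any chunk above the limit. The -60 dB limit is the working assumption stated above; the function name and label text are hypothetical.

```python
def chunk_labels(floors_db, chunk_seconds=30, limit_db=-60.0):
    """One (start_s, end_s, text) label per chunk; 'FAIL' if above limit_db."""
    labels = []
    for i, db in enumerate(floors_db):
        verdict = "ok" if db <= limit_db else "FAIL"
        labels.append((i * chunk_seconds, (i + 1) * chunk_seconds,
                       f"{db:.0f} dB {verdict}"))
    return labels


# The three chunks from the example above.
for label in chunk_labels([-64.0, -58.0, -65.0]):
    print(label)
```

Only the middle chunk is flagged, which is the "one noisy section in an otherwise good recording" case.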

It does see the room tone at the start, but it sees and analyzes the rest of the track as well.

Consider the case of a rubbish recording with a noise floor around -40 dB. The user then adds a section of “room tone” at -65 dB to the start and end of the recording. Does that recording pass ACX specs?

I expect that a recording as described would be rejected, even though the room tone at the start and end meet the specification.
The old ACX check would give it a pass, because the start (-65 dB RMS) has 0.5 seconds below -60 dB.
The new code would give it a fail, because the noise floor of the middle section (-40 dB RMS) is above -60 dB.
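
The difference between the two outcomes can be sketched in Python. This is a deliberate simplification: the old check is reduced to "the quietest section is at or below the limit" and the new code to "the loudest per-chunk floor is at or below the limit". Neither function is the actual plug-in logic, and the names are made up.

```python
LIMIT_DB = -60.0


def old_style_pass(section_floors_db, limit_db=LIMIT_DB):
    """Simplified old check: passes if any section is quiet enough."""
    return min(section_floors_db) <= limit_db


def new_style_pass(section_floors_db, limit_db=LIMIT_DB):
    """Simplified new check: passes only if every section is quiet enough."""
    return max(section_floors_db) <= limit_db


# Room tone at -65 dB added to both ends of a -40 dB recording.
recording = [-65.0, -40.0, -40.0, -65.0]
print("old:", old_style_pass(recording))
print("new:", new_style_pass(recording))
```

For the recording described above, the old-style rule passes and the new-style rule fails.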

Which result do you want to see as the result? -64, -58 or -65?

As a performer eager to close on that beach house, I’d like to publish the -65 value and go make coffee.

I think what’s bothering me is theatrical variables and accidents. It is possible to get a theatrical presentation with breathing effects—as I understand it, that’s allowed—whose “noise floor” never drops below -40.

However, in the absence of evil tricks, it’s hard to get a -65 value by accident. Chances are good that’s the “real” noise floor as currently understood: stop moving and hold your breath.

Koz

:slight_smile: Hence my comment that this code is not good for advertising purposes.
“Oh yes, and this amp has a noise floor of -140 dB (A-weighted with the input shorted)”


That’s why the effect only looks for 0.1 seconds. If you manage to get through 30 seconds without a gap of at least 0.1 seconds, your efforts are likely to be rejected for reading style.

This is also why I think labelling each “chunk” is a good idea. If say just one section fails from a long continuous take, then that could be a false negative.
Say you got labels like this:
-61 … -60 … -62 … -61 … -40 … -61 … -62 … -60 …

If this was all one take, then that’s a pretty clear indication that the noise floor is [fuzzy] around -61 dB [/fuzzy].
What’s gone wrong in that 5th section that’s showing -40 dB?

  • Does it sound OK?
  • Was it a “theatrical variable” or “accident”?
  • Was that an overdub replacement?
  • Was that when the washing machine in the kitchen hit the spin cycle?

This part could be improved. As it is, it will read a bit too high.
I think it would actually be better to just calculate the RMS with a 0.1 second window.

The code as it is was intended to:
a) check that the plug-in could be made to work without using a lot of RAM
b) present some alternative ideas.
c) help flynwill with some tricky memory management.

As you (Koz) usually field the questions re. ACX, and maintain a guide for meeting ACX specs, I think that earns you the right to say what kind of tool you want to work with. Flynwill and I can then try to give you a tool that works.

This is what the little voice was warning me about. This is a presentation I shot a while ago. It’s a real human talking into a real microphone into Audacity. It’s fully mastered with DeEssing and moderate Noise Reduction. The room was quiet, but it wasn’t studio quiet.

[Attached images: Screen Shot 2020-06-04 at 18.41.18.png, Screen Shot 2020-06-04 at 18.41.54.png]
So there’s some 7dB difference in opinion of where the noise floor is—and only one of them is submittable (barely).


This odd discrepancy carries forward. This is the show cut and mastered, but with nothing else done to it. The Analyze > Contrast selections were chosen at random from the spaces where the performer relaxed.

[Attached images: ACXCheck.png, SteveCheck.png, Contrast1.png, Contrast2.png]
Koz