Compex Dynamics Processor

Compex (Compressor/expander) Dynamics Processor

Picking up where I left off with my first training exercise at Code to Select Audio and Expander / DR Processor, I’ll build a more comprehensive plug-in that shall include a compressor and an expander. The major obstacles at the moment are that the old version actually requires seven individual files that have to be called one after another, and that it is hard-knee only.

So here’s my eventual goal: a Nyquist plug-in that will automatically compress or expand the sound selection, adjusting its internal variables according to the parameters entered via the user controls (like the desired dynamic range) and to the analyzed audio.

Basically it should have a very simple interface asking for the desired DR, with two options: whether it should use a hard or a soft knee, and whether it should try to compensate for the soft knee (I guess the last feature will remain experimental).

Used Terms and Definitions

  • Hard knee
    In an unaltered sound, the input and output are exactly identical at any given sample (the basic idea of “unaltered”) and would appear as a straight line on a graph. A compressor or expander changes that: beyond a given threshold, input and output deviate by a specified ratio. On the graph this shows up as a bent line, like a knee (with a sharp point).
    A soft knee tries to reduce the perceived distortion in the sound by gradually changing the ratio from 1:1 to the specified ratio, thus softening/rounding the knee on the graph. There are various approaches to when the gradual change of the ratio starts.
  • Loudness
  • RMS
    The sound that we hear in real life (the analogue audio) is a cacophony of audio waves. Our ears and microphones perceive it as one single wave (in the case of our ears, as we have two of them, it will be duophonic or stereo, that is two waves). Digitizing a wave means slicing the analogue wave into samples, where the sample rate indicates the number of slices per second and the bit depth the precision with which the amplitude of each sample is recorded. Each of these samples has a value representing the amplitude of its slice of the original wave.
    RMS stands for root mean square and is a way of calculating an average of the sound that gives us an idea of its perceived loudness.
    It squares the value of each sample, finds the arithmetic mean of these squared values (by adding them together and dividing the sum by the number of samples) and takes the square root of this mean (see the small sketch after this list).
  • Soft knee
    See: hard knee.
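
To make that definition a bit more concrete, here is a minimal Nyquist sketch of the calculation, assuming a version 4 plug-in where *track* holds a mono selection (snd-avg with a single block spanning the whole selection gives the mean of the squared samples):

```
;; Minimal sketch: RMS of a whole (mono) selection in a version 4 plug-in.
(let* ((len (snd-length *track* ny:all))   ; number of samples in the selection
       (sq  (mult *track* *track*))        ; square every sample
       ;; one block covering the whole selection gives a 1-sample sound
       ;; whose value is the arithmetic mean of the squared samples
       (msq (snd-avg sq len len op-average)))
  (sqrt (snd-fetch msq)))                  ; root of the mean square = RMS
```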

If you feel that my explanations are wrong, feel free to PM me, but only if my explanation is wrong - not if you consider it incomplete!
For further explanations refer to the Sources and Resources section below. I might add images later to better illustrate these definitions.

Sources and Resources

Here I’ll list links to posts, articles and webpages that I deem useful.

Above all, check the Introduction to Nyquist and Lisp Programming.

Audacity Forum posts

External Articles

You may find some useful information in this old thread: https://forum.audacityteam.org/t/peak-limiter-expander/17790/1

Hello Steve,

still running things smoothly (I hope), I see :smiley:

I’m sure I’ll have a lot of questions coming up! :laughing:

I’ll check this out later; right now I’m trying to set up the project specifications.

The old set of plug-ins that I created in September, with the amiable support from steve, consists of seven .ny files.

  • dr-proc-chain-pt1-max-rms.ny
    This plug-in analyzes the sound selection and stores the returned data in a global SCRATCH variable.
  • dr-proc-chain-pt2-peak-processor.ny
    This plug-in does the processing. It either compresses or expands, depending on the SCRATCH data and the control input.
  • dr-proc-chain-pt3-normalize.ny
    This plug-in normalizes the sound selection. (I haven’t used it a lot lately.)
  • dr-proc-chain-copy-globals.ny
    This one and the following one are small but necessary helpers. This one more or less duplicates the SCRATCH data for comparison. Without this step the following plug-in won’t work.
  • dr-proc-chain-display-properties.ny
    It will display the global SCRATCH data.
  • dr-proc-chain-remove-globals.ny
    The whole chain can be repeated. This one deletes the data from each individual run.
  • dr-proc-chain-remove-globals-final-run.ny
    This one deletes all SCRATCH data.
dr-proc-chain-copy-globals.ny (497 Bytes)
dr-proc-chain-pt3-normalize.ny (410 Bytes)
dr-proc-chain-pt2-peak-processor.ny (3.3 KB)
dr-proc-chain-pt1-max-rms.ny (4.01 KB)

And here are the remaining files.

As you can see, this is quite cumbersome. (A rough sketch of the SCRATCH hand-off the chain relies on follows after the file list.)
dr-proc-chain-remove-globals-final-run.ny (524 Bytes)
dr-proc-chain-remove-globals.ny (429 Bytes)
dr-proc-chain-display-properties.ny (2.32 KB)
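
For anyone wondering how these plug-ins talk to each other: the chain relies on Audacity’s global *scratch* symbol. A rough illustration of the pattern, with a made-up property name rather than the actual code of the chain:

```
;; Illustration of the *scratch* hand-off (made-up property name).
;; Analysis plug-in (pt1): store a value on the *scratch* property list.
(putprop '*scratch* (peak *track* ny:all) 'compex-peak)

;; Processing plug-in (pt2): read it back (NIL if pt1 has not been run).
(let ((stored-peak (get '*scratch* 'compex-peak)))
  (if stored-peak
      (format nil "Stored peak: ~a" stored-peak)
      "Please run the analysis plug-in first."))

;; Clean-up plug-in: remove the property again.
(remprop '*scratch* 'compex-peak)
```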

No matter where your audio comes from, unless the pieces have already been mastered to be played together, chances are they will sound very different from one another, and usually this has nothing to do with the kind of sound or music we’ll be dealing with. So we will ignore for now that a piece by Johann Sebastian Bach will be much quieter than some pieces of Aggrotech or Psytrance or whatever (fill in your own favourite style of dance music). It will also not matter to us whether it is a work in progress set up for its first mastering or a piece that has already been released to the public, and thus already mastered and sometimes even remastered and re-remastered. Having some code that will bring your different sound pieces to a similar level will matter to you no matter what your sound files are.

Loudness is a very subjective characteristic of a sound. While RMS has been used in the past, it has mostly been discarded as a measure of perceived loudness. My own plug-in Max-RMS was a way to deal with its major problem (it uses an arithmetic mean): the RMS of two sound selections will be different if one has longer quiet parts than the other, even if the loudest parts are identical. Thus ReplayGain is not an option, as it is built around RMS.

MeldaProduction’s MLoudnessAnalyzer is an EBU R128 and ITU-R BS.1770-3 compliant loudness meter collection. I am especially interested in its true peak meter and its integrated loudness meter.

True peak meter shows the true peak level. Most digital-to-analogue (D/A) interfaces first convert the incoming audio to a higher sampling rate and then generate the output analogue signal fed into the audio monitors. The filtering involved in this conversion can cause peaks higher than the original peak level. True peak level simulates this conversion and displays the level at this higher sampling rate. The goal is to avoid peaks over 0 dB, otherwise the actual D/A converter may get overloaded and produce mild clicks or distortion. The usual practice is to use a limiter as the last stage of your processing chain and set the ceiling to, say, −0.5 dB, which is usually enough to avoid the overload. Note that since each converter is different, the true peak level cannot be correctly measured, and different software provides different true peak levels.

The true peak is especially important for normalization of sound.
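
As a first idea for a Nyquist version, a true-peak estimate could follow the same principle as described above: oversample the signal and take the peak of the result. This is only a rough sketch (4x oversampling via resample, not the exact method of the recommendation):

```
;; Rough true-peak estimate: oversample 4x, then take the absolute peak.
;; An approximation only - not the exact method of the recommendation.
(defun true-peak-db (sig)
  (let ((over (resample sig (* 4 (snd-srate sig)))))
    ;; linear-to-db of 0 is undefined, so this assumes a non-silent signal
    (linear-to-db (peak over ny:all))))
```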

Integrated loudness shows the overall loudness, hence it is affected by the whole track from the beginning of the playback until you reset it by clicking on the value field. The host may reset it too; it depends on your host. Please note that the Integrated loudness is NOT the same as an averaged loudness, as it ignores quiet passages. Imagine a track which is generally quiet but has a few loud sections. The averaged loudness will be less than the Integrated loudness. Its calculation uses gating to ignore those quiet passages (levels less than 10 LU less than the current ungated level) of the track. Essentially, Integrated loudness is a measure of the loudest sections of the track.

Unfortunately the MLoudnessAnalyzer, while working in Audacity, cannot operate on a multiple selection.

So we have already identified three goals of the Compex Dynamics Processor:

  • It must operate on a selection of several audio sections - High Importance
  • It must be able to calculate the True Peak - Low Importance as normalizing could be done separately
  • It must calculate a loudness similar to the integrated loudness of the MLoudnessAnalyzer plug-in - Very High Importance

Here’s a deep link to the latest Recommendation ITU-R BS.1770-4, PDF in English.

True peak and integrated loudness (I shall borrow the term from MeldaProduction) are the two elements for which target values can be set. Or rather: the difference between the two values (measured in LU), the PLR (peak-to-loudness ratio), and the loudness itself shall be the targets that the user can alter (standard values should be defined).

So the Compex Dynamics Processor shall analyze the selected audio. If the actual PLR is higher than the target PLR, compression will be applied, otherwise expansion. The parameters of both compression and expansion (threshold and ratio) will be calculated by the Compex Dynamics Processor.
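
To make the decision rule concrete, here is a minimal sketch with placeholder names (levels assumed to be in dB/LUFS):

```
;; Minimal sketch with placeholder names (not final code).
;; PLR = true-peak level minus integrated loudness, e.g.
;; -1.0 dBTP and -14.0 LUFS give a PLR of 13 LU.
(defun plr (true-peak integrated-loudness)
  (- true-peak integrated-loudness))

;; If the measured PLR exceeds the target, the dynamic range is too wide
;; and we compress; otherwise we expand.
(defun choose-mode (actual-plr target-plr)
  (if (> actual-plr target-plr) 'compress 'expand))
```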

I am still undecided about Attack and Release (common controls) and even more about RMS length and knee size (rarer ones).

The most difficult task is calculating a compensation for soft-knee processing. The reason for wanting it is simple enough. It is already almost impossible to calculate a ratio precise enough that the outcome hits the targeted PLR (you’d have to predict the post-processing loudness), but we’ll try that anyway. Now, a soft knee changes the average compression or expansion, so even a halfway precise ratio will behave differently in the knee area around the threshold, which again changes the resulting loudness. My guess is that Nyquist might have a hard time doing that. But I’ll try…
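
For reference, here is what the knee does to the static transfer curve (input and output levels in dB): a hard knee next to a common quadratic soft knee of width knee dB. These are only placeholder helpers to illustrate the problem, not the compensation itself:

```
;; Placeholder static curves (input IN and threshold THR in dB).
;; Hard knee: unity below the threshold, full ratio above it.
(defun hard-knee (in thr ratio)
  (if (< in thr)
      in
      (+ thr (/ (- in thr) ratio))))

;; Quadratic soft knee of width KNEE dB around the threshold: the ratio
;; is blended in gradually, so the average gain near the threshold differs
;; from the hard-knee case - which is exactly the compensation problem.
(defun soft-knee (in thr ratio knee)
  (cond ((< (- in thr) (- (/ knee 2.0)))               ; below the knee
         in)
        ((> (- in thr) (/ knee 2.0))                   ; above the knee
         (+ thr (/ (- in thr) ratio)))
        (t                                             ; inside the knee
         (+ in (/ (* (- (/ 1.0 ratio) 1.0)
                     (expt (+ (- in thr) (/ knee 2.0)) 2))
                  (* 2.0 knee))))))
```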

So here are the final goals of the Compex Dynamics Processor:

  • A control allowing the user to set Target PLR and Target Loudness - High Importance (a possible plug-in header for these controls is sketched after this list)
  • A control allowing the user to set Attack and Release - Middle Importance
  • A control allowing the user to set RMS length and Knee Size - Very Low Importance
  • Compressing or expanding with a soft knee - Very High Importance
  • Calculating threshold (something close to the loudness) and ratio - Very High Importance
  • Adjusting threshold and ratio to compensate for soft-knee processing - Very Low Importance
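
A possible plug-in header for these controls might look like this. It is only a sketch; the control names, ranges and defaults are placeholders, and the low-importance controls could of course be dropped:

```
;nyquist plug-in
;version 4
;type process
;name "Compex Dynamics Processor"
;control target-plr "Target PLR (LU)" float "" 12 3 30
;control target-loudness "Target loudness (LUFS)" float "" -16 -36 -6
;control attack "Attack (ms)" float "" 20 1 500
;control release "Release (ms)" float "" 200 10 2000
;control knee "Knee size (dB)" float "" 6 0 12
;; Sketch only - control names, ranges and defaults are placeholders.
```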

Next the technical specs…

Input: Audio Selection from Audacity, multiple tracks containing one or two channels (mono/stereo)

Step 1: Reading data provided by Audacity
Step 2: Analysis - getting the true peak (based on ITU-R BS.1770-4) and the integrated loudness (based on ITU-R BS.1770-4 and similar to MLoudnessAnalyzer, but more likely limited to 6 dB instead of 10 dB)
Step 3: Calculating threshold and ratio based on the data provided via the controls and coming from the analysis
Step 4: Processing

It is clear that Step 3 will be the most complicated one.

Placeholder for the Project Specifications

Hi Steve,

it seems like you’ve already looked at the ITU recommendation at https://sourceforge.net/p/audacity/mailman/message/34871790/.

Have you tried to implement ITU-R BS.1770-4 somewhere for Audacity?

I have only found C++ and Python code, here (C++) and here (Python), both licensed under MIT.

There’s also a GPL-licensed C++ program for Linux here, based on EBU R128, and there seem to be some others, but none in Nyquist.

The next Audacity release will have a loudness normalizing effect, based on ITU-R BS.1770-4.

I know - I saw the post for the release candidate ready for testing. But that part needs “porting” to Nyquist :slight_smile:

I hope it’s doable. I had a look at the ITU recommendation. I’m not an engineer (nor a doctor) and as we say in German: I was looking at it like a swine looks into a clockwork. Almost gibberish to me :blush:

Still going to give it a try, though having to juggle a job, a family and some off-time will make for slow progress.

A brief summary limited to the immediate scope of this project.

The verbal recommendation consists of barely one and a half pages, of which about one page sets out the basic background of thinking (the “considering this” and “further considering that”).

Then we have six basic recommendations, of which only two are of any importance to us:

The ITU Radiocommunication Assembly […] recommends

  1. that when an objective measure of the loudness of an audio channel or programme, produced with up to 5 main channels per Recommendation ITU-R BS.775 (mono source, stereo and 3/2 multichannel sound), is required to facilitate programme delivery and exchange, the algorithm specified in Annex 1 should be used;
  2. […]
  3. […]
  4. that when an indication of true-peak level of a digital audio signal is required, the measurement method should be based on the guidelines shown in Annex 2, or on a method that gives similar or superior results,

[…]

Part 1 is for measuring the loudness, Part 4 is for (kinda) measuring the true peak.

Specification of the objective multichannel loudness measurement algorithm

The algorithm consists of four stages:

  1. “K” frequency weighting
  2. mean square calculation for each channel
  3. channel-weighted summation (in stereo and mono each channel has the same weight)
  4. gating of 400 ms blocks overlapping by 75% with two thresholds, the first at −70 LKFS and the second at −10 dB relative to the level measured after application of the first threshold

For each channel it should look like this:
x → K-filter → y → Mean square → z → G

Since the weight is 1 for mono and stereo tracks, it would actually look more like this:

x → K-filter → y → Mean square
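
The K-filter stage could perhaps be approximated with Nyquist’s built-in filters. This is only a rough stand-in, since BS.1770-4 defines both stages by exact biquad coefficients:

```
;; Very rough stand-in for the "K" frequency weighting - NOT the exact
;; BS.1770-4 biquads: roughly a +4 dB high shelf around 1.5 kHz, followed
;; by a second-order high-pass near 38 Hz (Q about 0.5).
(defun k-weight (sig)
  (highpass2 (eq-highshelf sig 1500 4.0) 38 0.5))
```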

These per-channel mean squares are added together; the sum is put through 10·log10 and then through the gate.
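
A Nyquist sketch of the block stage might look something like this (unverified against the recommendation; the gating itself, with the absolute −70 LKFS threshold and the relative threshold, would then be applied to the returned list):

```
;; Rough sketch of the block stage, assuming KSIG is a single K-weighted
;; channel; for stereo the per-block mean squares of both channels would
;; have to be added before the 10*log10 step. 400 ms blocks with 75 %
;; overlap (step 100 ms), each converted to -0.691 + 10*log10(mean square).
;; The gating (-70 LKFS absolute, then the relative gate) is not shown.
(defun block-loudness-list (ksig)
  (let* ((sr       (snd-srate ksig))
         (blocklen (truncate (* 0.4 sr)))
         (steplen  (truncate (* 0.1 sr)))
         ;; one sample per block: the mean of the squared samples
         (msq      (snd-avg (mult ksig ksig) blocklen steplen op-average))
         (result   '()))
    (do ((v (snd-fetch msq) (snd-fetch msq)))
        ((not v) (reverse result))
      (if (> v 0.0)
          ;; linear-to-db is 20*log10, so half of it gives 10*log10
          (setf result (cons (+ -0.691 (* 0.5 (linear-to-db v)))
                             result))))))
```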

tbc