Someone pointed me to this plug-in, which is supposed to be a compressor, limiter, or expander depending on the settings. It is a few years old, but I can’t find any discussion of it anywhere. I think the controls are weird, and the code is a bit weird too (and inefficiently written to implement under-explained mathematical formulas).
The description sounds like Chris’s Compressor. Non-linear compander is how off-air broadcast sound chains work. Companding is determined by the show itself.
“These are the legal specifications and requirements for the air show and it doesn’t matter what the original show was.”
They’re difficult to fudge. I once worked for a station that was given a mild warning/suggestion for loud overmodulation. The station was so terrified that it overcorrected and later got an actual broadcast standards violation citation for undermodulation. Obviously this was before the Loudness Wars.
I should have kept a copy of it. Nobody would believe me.
Koz
I have read the code for both. This other is very unlike Chris’s Compressor. I can’t detect that it has any reputation good or bad.
The fancy part of Chris’s is defining the envelope just so; a simple gain curve is then applied. The fancy part of this one is all in the gain curve, but strangely it does nothing to define an envelope, not even a simple snd-avg. I think that sounds like a recipe for distorting the timbre of everything.
It might be. The process is more difficult than it seems. Every few years, someone would come up with a different way to process the broadcast sound chain. CBS Labs, Dorough, etc. The last one (I forget the name) built the companding into the electronics for the broadcast transmitter. That was a very unusual step because it required replacing a portion of the certified, expensive transmitter with a separate product instead of just buying a little box with flashing lights on it. However, they did such a good job acoustically that they mopped the floor with everybody else.
That’s the goal. The show mostly sounds the same as before, except it now conforms to some desirable standard.
That sounds simple, doesn’t it?
Koz
If I remember correctly, the effect is a fancy and complicated way to do some wave shaping. It is nearer to a guitar distortion/compression effect than a multi-tool compressor.
By the way, have you looked at RBD’s compressor? The calculation of the non-linear, two-part compression/expansion curve is quite intriguing.
It’s in the library section of Nyquist.
Can you give me a more exact pointer? Is it undocumented? Is it in C or Lisp?
compress.lsp (13.3 KB)
I found it in Nyquist 3.08 source. So nobody adapted it into a .ny for Audacity? But that should be easy enough for me to try.
Igor likes his “magic formulae”.
Robert is right: what I am looking at is billed as a compressor/expander, but with no smoothing of the envelope at all, “wave shaper” is the better name.
And he could have written it more efficiently with only 5 calls to mult instead of 12.
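The difference can be sketched in a few lines of Python (a hypothetical illustration, not the plug-in’s actual code; the toy gain curve and the 100-sample window are made up): a wave shaper applies the gain curve sample by sample, which bends the waveform itself and adds harmonics, while a compressor drives the same curve from a smoothed envelope, which preserves the timbre.

```python
import numpy as np

def gain_curve(level):
    # Toy soft-compression curve: unity gain below 0.5, then 2:1 above.
    return np.where(level <= 0.5,
                    1.0,
                    (0.5 + (level - 0.5) / 2.0) / np.maximum(level, 1e-12))

def waveshape(x):
    # Per-sample gain: reshapes the waveform itself -> harmonic distortion.
    return x * gain_curve(np.abs(x))

def compress_env(x, window=100):
    # Gain driven by a smoothed envelope (a crude moving average here,
    # standing in for something like snd-avg): timbre is preserved.
    env = np.convolve(np.abs(x), np.ones(window) / window, mode="same")
    return x * gain_curve(env)
```

With no envelope at all, every half-cycle of a loud sine gets squashed individually, which is exactly the guitar-distortion behaviour described above.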
I am reading compress.lsp and I see it defines a gain function with three straight slopes and two soft knees that are defined by quadratics. Has anyone adapted this as a .ny? I could do it for myself but maybe someone has already done it.
Igor did something strange with just a single polynomial curve over the whole range of dB.
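For reference, the shape described above (three straight slopes joined by two quadratic soft knees) can be sketched in Python. The thresholds, knee width, and slopes below are made-up illustrative values, not the defaults from compress.lsp:

```python
# Hypothetical parameters, chosen only for illustration.
T_LO, T_HI = -40.0, -10.0   # knee centres in dB
KNEE = 6.0                  # knee width in dB
EXP_SLOPE, MID_SLOPE, COMP_SLOPE = 2.0, 1.0, 0.5  # dB out per dB in

def gain_map_db(x_db):
    """Output level (dB) for input level x_db: three linear segments
    joined by two quadratic soft knees of width KNEE."""
    w = KNEE
    mid = lambda x: T_LO + MID_SLOPE * (x - T_LO)   # middle segment
    if x_db < T_LO - w / 2:                          # expansion below
        return T_LO + EXP_SLOPE * (x_db - T_LO)
    if x_db < T_LO + w / 2:                          # lower soft knee
        return (T_LO + EXP_SLOPE * (x_db - T_LO)
                + (MID_SLOPE - EXP_SLOPE) * (x_db - T_LO + w / 2) ** 2 / (2 * w))
    if x_db < T_HI - w / 2:                          # linear middle
        return mid(x_db)
    if x_db < T_HI + w / 2:                          # upper soft knee
        return (mid(T_HI) + MID_SLOPE * (x_db - T_HI)
                + (COMP_SLOPE - MID_SLOPE) * (x_db - T_HI + w / 2) ** 2 / (2 * w))
    return mid(T_HI) + COMP_SLOPE * (x_db - T_HI)    # compression above
```

The quadratic terms make the slope ramp linearly from one segment’s value to the next across each knee, so both the curve and its slope are continuous.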
The basic compression (compress…) is already implemented in Nyquist for Audacity.
However, the art is to make the proper transfer function.
I’ve tried a totally different way, where the compression is locally linear but the length of the corrected segments changes. It is actually the best I’ve come across so far, for spoken word at least.
The only two drawbacks are that the noise floor has to be a bit lower than normal, because all long-term high-level segments are attenuated, and that the number of iterations has to be defined by the user.
In other words, it lacks a low-level expansion (aka noise gate) and a probability analysis step.
I am tardy in updating my Audacity. I have a function called compress defined in follow.lsp which doesn’t look quite like this one.
I wonder whether complicated following as in Chris’s combined with a more complicated gain function is worth trying.
No compress function is mentioned in http://www.cs.cmu.edu/~rbd/doc/nyquist/part8.html#91
I think it is only mentioned in the library section.
The basic compression is the same, but the interesting part, namely the map creation, is unfortunately not automatically loaded.
The problem with all those compressors (in contrast to hardware/real-time ones) is that finding the right thresholds for noise floor and signal level is always error-prone and tedious.
In my opinion, the greatest improvement would be to let the plug-in find the optimal settings by itself, e.g. by means of hidden Markov models.
Of course, I’m always referring to single-voice compression and not music, because only the former offers the chance to employ a general energy distribution, i.e. a modelled histogram.
The user would thus only set the target values, e.g. -3 dB peak, -20 dB Rms, -60 dB noise floor and a general compression ratio or a style preset.
I am confused, is this or ain’t this part of the Audacity distribution? I still can’t find it in the latest svn repository.
Does this mean you are using a gain function with two segments and a hard knee, just like the built-in compressor, but with the threshold and slope varying adaptively across regions of the input?
Have you shared the code for these experiments?
This would make a lot of narrators happy. It would take the guesswork out of hitting ACX guidelines.
I have read compress.lsp and I think compress-map may be a useful piece of a compressor-expander, but I think there are some confusions in compress and agc. It looks like they introduce undocumented delays in the signal. And in compress, the conversion of gain from dB back to linear omits a factor of (log 10).
I think I’ve encountered the same issues some time ago.
I had e.g. to correct for the delay in the follow function.
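To make the (log 10) point concrete, here is the correct dB-to-linear conversion next to what the code effectively computes if that factor is dropped (a Python sketch of my reading of the bug, not the actual Lisp code):

```python
import math

def db_to_linear(db):
    # Correct: 10^(dB/20) == exp(dB * ln(10) / 20)
    return math.exp(db * math.log(10) / 20)

def db_to_linear_missing_log10(db):
    # What you get if the (log 10) factor is omitted: exp(dB / 20),
    # which badly understates the gain for any non-zero dB value.
    return math.exp(db / 20)
```

For example, 20 dB should come back as a linear factor of 10; without the (log 10) factor it comes back as e ≈ 2.72.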
I am confused, is this or ain’t this part of the Audacity distribution? I still can’t find it in the latest svn repository.
I meant the CMU libraries documentation that includes naturally the libs of Standalone Nyquist.
Does this mean you are using a gain function with two segments and a hard knee, just like the built-in compressor, but with the threshold and slope varying adaptively across regions of the input?
The Rms vector is iteratively searched for the longest segment that is above the threshold (e.g. -20 dB), and that section is attenuated by 1 dB. The search then begins all anew.
In this fashion, whole phrases are corrected, followed by words and syllables.
The slope is at the moment just one frame/block length (linear transition).
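One iteration of that search could look like the following Python sketch (a hypothetical re-creation of the described method: the block length, threshold, and step size are my own choices, and the one-frame linear transition is left out):

```python
import numpy as np

FRAME = 1024        # block length in samples (assumption)
THRESH_DB = -20.0   # Rms threshold
STEP_DB = 1.0       # attenuation per iteration

def frame_rms_db(x):
    # Rms level of each whole frame, in dB.
    frames = x[: len(x) // FRAME * FRAME].reshape(-1, FRAME)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    return 20 * np.log10(np.maximum(rms, 1e-12))

def attenuate_longest(x):
    """Find the longest run of frames above THRESH_DB and attenuate
    that stretch by STEP_DB; returns (audio, whether anything changed)."""
    above = frame_rms_db(x) > THRESH_DB
    best_len = best_start = 0
    i = 0
    while i < len(above):
        if above[i]:
            j = i
            while j < len(above) and above[j]:
                j += 1
            if j - i > best_len:
                best_len, best_start = j - i, i
            i = j
        else:
            i += 1
    if best_len == 0:
        return x, False
    y = x.copy()
    lo, hi = best_start * FRAME, (best_start + best_len) * FRAME
    y[lo:hi] *= 10 ** (-STEP_DB / 20)
    return y, True
```

Repeating this until no run remains above the threshold corrects whole phrases first, then words and syllables, exactly as described.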
Have you shared the code for these experiments?
No, it is still under development.
The code is relatively slow because the input is alternately analysed and multiplied by the correction envelope.
Of course, one could work with the initial Rms vector alone.
For the time being, I want to explore different block lengths, for example vary them by the golden ratio, in order to have the frame boundaries optimally distributed.
Also, since the start and end have a short taper, the Rms value of the analysed audio can differ from the calculated one (a 0.5 dB cut at those points).
The nice thing about this random-access method is that you can make an inverted copy and run the effect on it. Everything will be silent except the corrected passages.
It also works with peaks although they are generally very short.
[quote=“Robert J. H.”]
In my opinion, the greatest improvement would be to let the plug-in find the optimal settings by itself, e.g. by means of hidden Markov models.
Of course, I’m always referring to single-voice compression and not music, because only the former offers the chance to employ a general energy distribution, i.e. a modelled histogram.
The user would thus only set the target values, e.g. -3 dB peak, -20 dB Rms, -60 dB noise floor and a general compression ratio or a style preset.
[/quote]This would make a lot of narrators happy. It would take the guesswork out of hitting ACX guidelines.
Yes, indeed.
My histograms show a relatively constant shape for different audio books. The trick will be to combine peaks and Rms in a meaningful manner.
It is also conceivable to use some filter curves as well, e.g. the threshold of hearing for the noise floor calculation.
Lots of interesting thought experiments.
Hm, you take histograms of mean-square over 100 ms intervals, or something like that?
Does that tend to a bimodal distribution of “speech” and “pauses?”
Exactly so. I’ve been pondering the problem of levelling a two-sided conversation (Skype).
I wanted to study the histograms in order to categorize the segments with a Gaussian mixture model. It is clear that one voice plus background is itself already a mixture of two distributions (not necessarily normal/Gaussian).
The first peak in the histogram is at -inf dB, at least if a lot of digital silence is present. If dithered, the peak will be at -73 to -87 dB. Ordinary, stationary background voice can go up to ~-35 dB. The goal for a compression map would be to find the local minimum between speech (between -15 and -30 dB Rms) and the background to set the transition for the expansion part (if needed).
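A sketch of how such a frame-level histogram could be computed (the sample rate, frame length, and 3 dB bin width are arbitrary choices for illustration, not anything fixed in this discussion):

```python
import numpy as np

def rms_db_histogram(x, sr=1000, frame_ms=100):
    """Histogram of per-frame Rms levels in dB, 100 ms frames.
    Returns (counts, bin_edges); the local minimum between the speech
    peak and the background peak is where the expansion would start."""
    n = int(sr * frame_ms / 1000)
    frames = x[: len(x) // n * n].reshape(-1, n)
    db = 10 * np.log10(np.maximum((frames ** 2).mean(axis=1), 1e-12))
    return np.histogram(db, bins=np.arange(-90, 1, 3))
```

On speech with pauses this does tend to come out bimodal: one lobe for voiced frames around -15 to -30 dB, one for the background.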
We will also have to rule out unvoiced segments in order to leave them be, by means of spectral flatness for instance.
However, the first step is in any case to normalize the whole audio to -20 dB Rms, a feature that I actually miss in Audacity.
This is probably followed by a first limiter pass (very subtle of course). In any case, all has to be multi-pass in order to minimize distortion by expansion, compression and limiting.
Again, our advantage is that we don’t have to do it all in real time; otherwise we could simply inject the audio into a mobile phone and let the noise cancellation and voice activity detection produce one of those bubbly outputs.
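The -20 dB Rms normalization mentioned above is itself trivial; as a minimal sketch (the function name and the 1e-12 guard against silence are my own):

```python
import numpy as np

def normalize_rms(x, target_db=-20.0):
    """Scale the whole clip so its overall Rms level hits target_db."""
    rms = np.sqrt(np.mean(x ** 2))
    gain = 10 ** (target_db / 20) / max(rms, 1e-12)
    return x * gain
```

Everything after that (limiting, expansion, compression) can then work from known absolute levels.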