Normalizing tracks doesn't "really normalize" them

There is something that I can’t quite understand about the “normalize” function. The attached screenshot shows two tracks that were individually normalized using the same settings. However, it is clear from the screenshot that the upper track is louder than the lower one. Isn’t this what normalization is supposed to solve? If not, how does one normalize the volume of two tracks of an interview?
Screen Shot 2021-09-23 at 7.04.56.jpg

Normalizing isn’t what you think… ;-) It’s a mathematical/statistical concept.

Normalizing adjusts the volume so that the highest peak lands at (approximately) 0dB, i.e. “maximized”. There is a “competitor” to Audacity called GoldWave and they call it “maximizing”, which is a better English word, but “normalizing” is the correct audio terminology. All it takes is one short peak, which might not even sound that loud, to limit how loud your file can go.

Peak levels don’t correspond well with perceived loudness so if you normalize all of your files they won’t necessarily be equally-loud.
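
To make that concrete, here is a rough Python sketch (not Audacity's actual code). The two test signals and the helper names are made up for illustration: one track is steady, the other is quiet with a single short spike. Both get peak-normalized to the same target, yet their average (RMS) levels end up far apart.

```python
import math

def db_to_linear(db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10 ** (db / 20)

def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)

def rms(samples):
    """Root-mean-square level, a crude proxy for how loud a track sounds."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def peak_normalize(samples, target_db=-1.0):
    """Scale the whole track by ONE gain so its highest peak hits target_db."""
    gain = db_to_linear(target_db) / peak(samples)
    return [s * gain for s in samples]

# Hypothetical tracks: a steady tone, and a quiet tone with one short spike.
steady = [0.5 * math.sin(2 * math.pi * 5 * n / 1000) for n in range(1000)]
spiky = [0.05 * math.sin(2 * math.pi * 5 * n / 1000) for n in range(1000)]
spiky[500] = 0.9  # a single short peak decides the gain for the whole track

a = peak_normalize(steady)
b = peak_normalize(spiky)

print("peaks:", round(peak(a), 3), round(peak(b), 3))  # the same for both
print("rms:  ", round(rms(a), 3), round(rms(b), 3))    # very different
```

The spike in the second track uses up nearly all of the available headroom, so everything else in that track stays quiet.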

Audacity does have a Loudness Normalization effect, which adjusts for perceived loudness.

These are both “linear” adjustments that apply the same adjustment to the whole file (or the selection) as-if you adjusted the volume up or down before starting playback. They are NOT automatic volume control or “leveling”.
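
A minimal sketch of the "one gain for the whole track" idea, with loudness as the target instead of the peak. Plain RMS stands in for a proper perceived-loudness measurement here, which is a simplification, and the `host`/`guest` signals and the -20 dB target are invented for the example.

```python
import math

def rms_db(samples):
    """RMS level in dB relative to full scale (assumes the track is not pure silence)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)

def loudness_normalize(samples, target_db=-20.0):
    """Apply a single, constant gain so the track's RMS level equals target_db."""
    gain_db = target_db - rms_db(samples)
    gain = 10 ** (gain_db / 20)
    return [s * gain for s in samples], gain_db

# Hypothetical interview tracks recorded at very different levels.
host = [0.4 * math.sin(2 * math.pi * 5 * n / 1000) for n in range(1000)]
guest = [0.05 * math.sin(2 * math.pi * 5 * n / 1000) for n in range(1000)]

host_out, host_gain_db = loudness_normalize(host)
guest_out, guest_gain_db = loudness_normalize(guest)

print("gain applied (dB):", round(host_gain_db, 2), round(guest_gain_db, 2))
print("resulting RMS (dB):", round(rms_db(host_out), 2), round(rms_db(guest_out), 2))
```

Because the gain is constant, a quiet passage inside a track is still just as much quieter than a loud passage in the same track afterwards. Nothing is being "leveled".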

There is a potential issue with Loudness Normalization: you can end up pushing your peaks into clipping (distortion), depending on your particular audio file and your loudness setting. You have to watch out for that.

The recommended audiobook procedure applies limiting after loudness adjustment to prevent clipping (and to meet the audiobook spec of -3dB maximum peaks). A limiter might also work for your podcast. If your peaks are below the limit, the limiter won’t do anything.
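
Here is a rough illustration of both points: checking for clipping after a boost, and a limiter leaving audio alone when it is already under the ceiling. The "limiter" below is just a hard clamp, the crudest possible form of limiting (real limiters reduce gain more smoothly), and the sample values and the +6 dB boost are made up.

```python
def db_to_linear(db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10 ** (db / 20)

def count_clipped(samples):
    """Number of samples that exceed full scale (0 dB = 1.0)."""
    return sum(1 for s in samples if abs(s) > 1.0)

def hard_limit(samples, ceiling_db=-3.0):
    """Clamp every sample to +/- the ceiling; samples below it pass unchanged."""
    ceiling = db_to_linear(ceiling_db)
    return [max(-ceiling, min(ceiling, s)) for s in samples]

track = [0.1, -0.2, 0.15, 0.6, -0.1]              # hypothetical samples
boosted = [s * db_to_linear(6.0) for s in track]  # a +6 dB loudness adjustment

print("clipped after boost:", count_clipped(boosted))
limited = hard_limit(boosted)
print("peak after limiting:", round(max(abs(s) for s in limited), 3))
print("untouched if already below the ceiling:", hard_limit(track) == track)
```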

What, then, is the intended use of the “regular” normalization?

To amplify up to a specified level.

A lot of people (myself included) like to amplify the audio close to maximum before exporting (For WAV files, I generally normalize to -1 dB).

What, then, is the intended use of the “regular” normalization?

GoldWave calls it “maximizing” which is a better English word for it but normalizing is the proper audio terminology. …There are 3rd-party “maximizer” effects, but these are non-linear (using compression and limiting) and they alter the character of the sound.

Most commercial recordings are normalized/maximized. Plus, most commercial recordings use compression and limiting to boost the loudness without clipping the peaks.

You can record at a low-enough level to leave headroom for unexpected peaks and then normalize later… The most important thing is that you don’t “try” to go over 0dB and clip (distort) during recording, so it’s better to amplify later.

For example, if you are digitizing a vinyl record, you can normalize a whole side (or both sides combined) to maximize the volume of the album while maintaining the relative loudness of loud & quiet songs as was originally intended.
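
As a sketch of why normalizing the whole side at once keeps the songs in proportion: one gain is computed from the loudest peak anywhere on the side and applied to every song. The two "songs" below are invented three-sample lists, just to show the arithmetic.

```python
def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)

def normalize_together(tracks, target_db=-1.0):
    """Find the single highest peak across all tracks and apply ONE shared gain."""
    overall_peak = max(peak(t) for t in tracks)
    gain = 10 ** (target_db / 20) / overall_peak
    return [[s * gain for s in t] for t in tracks]

loud_song = [0.8, -0.7, 0.6]     # hypothetical samples
quiet_song = [0.2, -0.1, 0.15]

out = normalize_together([loud_song, quiet_song])

ratio_before = peak(loud_song) / peak(quiet_song)
ratio_after = peak(out[0]) / peak(out[1])
print("loud/quiet peak ratio before and after:", round(ratio_before, 3), round(ratio_after, 3))
```

Normalizing each song separately would instead bring both peaks to the same level and lose the intended difference between them.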

With a podcast you might want to manually adjust the volume here-and-there as necessary, or you can use volume-leveling or dynamic compression, etc., to even-out (and possibly boost) the volume.
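
For anyone wondering how dynamic compression differs from the normalize effects, here is a bare-bones sketch. It assumes an instant attack and a simple exponential release. The threshold, ratio and release values are arbitrary, and this is not how any particular Audacity effect is implemented. The point is only that the gain changes over time, following the signal level.

```python
import math

def compress(samples, threshold_db=-20.0, ratio=4.0, release=0.99):
    """Turn down whatever is above the threshold; leave quieter audio alone."""
    envelope = 0.0
    out = []
    for s in samples:
        # Envelope follower: jumps up immediately, decays slowly.
        envelope = max(abs(s), envelope * release)
        gain = 1.0
        if envelope > 0.0:
            level_db = 20 * math.log10(envelope)
            if level_db > threshold_db:
                # Above the threshold, the overshoot is divided by the ratio.
                wanted_db = threshold_db + (level_db - threshold_db) / ratio
                gain = 10 ** ((wanted_db - level_db) / 20)
        out.append(s * gain)
    return out

# Hypothetical recording: a quiet stretch followed by a loud stretch.
quiet_part = [0.05, -0.05] * 50
loud_part = [0.8, -0.8] * 50
recording = quiet_part + loud_part

evened = compress(recording)
print("quiet peak:", max(abs(s) for s in evened[:100]))
print("loud peak: ", round(max(abs(s) for s in evened[100:]), 3))
```

The loud stretch comes out much closer to the quiet one, and you could then boost the whole result with a normal (linear) amplify or normalize step.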

how does one normalize the volume of two tracks of an interview?

Darn good question. Most of these tools are generically intended to set volume for completed projects. Completed projects don’t have twenty or thirty seconds of dead silence.

The legacy answer is to turn the show over to the recording engineer, who has enough smarts to set the host at a specific volume no matter how long the guest takes to answer. The guest is on a different fader on the mixing console and here, too, the recording engineer sets a good volume no matter how long the guest talks.

Since you have the host and the guest on different tracks, it should be possible to select each track and set the volumes manually. Audacity will play back everything at once (unless you prevent it with the SOLO and MUTE buttons) so it should be a simple matter to play the combination show to your headphones to make sure the balance is good.

Audacity will push everything together into one single show when you export the WAV Edit Master. It’s at that magic point you can apply automated tools so the posted MP3 podcast comes out the overall right volume.

This is also the place to apply broadcast limiter simulations such as Chris’s Compressor which will be happy to “ride gain” on the show and even out anything you may have missed earlier.

Koz

how does one normalize the volume of two tracks of an interview?

By ear or by software. If you’re lucky enough to have the two speakers on different tracks, then it’s not much work to do it “by ear”, as long as there’s not much intra-track variation. If they are on the same track, or you do have lots of intra-track variation (say one person kept changing their distance to the mic), then… oh boy… you need (1) segmentation into clips and (2) [auto-]normalization of the clips relative to each other. Software that supports non-destructive envelopes across clip groups makes the latter a little easier to experiment with. That’s not a big deal in Audacity since you can undo changes, but you can’t really apply that enveloping to non-contiguous clip groups in Audacity unless you bounce them to yet more tracks or write your own effect (e.g. in Nyquist) that processes just some clips, for instance based on labels. The lack of a relative (clip-to-clip) normalization feature is a bit more of a slow-downer in Audacity. Alternatively, you can just use a compressor and hope for the best, as it is essentially a fully dynamic version of clip-to-clip normalization that doesn’t require defining anything like clips, but you’d probably have to tune its parameters a little.
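
To illustrate what I mean by clip-to-clip normalization based on labels, here is a small Python sketch (Python rather than Nyquist, to keep it readable). The `normalize_segments` helper, the label positions and the target level are all hypothetical. Each labeled region gets its own gain so that all regions end up at the same RMS level, and anything outside the labels is left alone.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def normalize_segments(samples, segments, target_rms=0.1):
    """Give each (start, end) region its own gain so every region hits target_rms."""
    out = list(samples)
    for start, end in segments:
        level = rms(samples[start:end])
        if level == 0:
            continue  # a silent region cannot be scaled to a target level
        gain = target_rms / level
        for i in range(start, end):
            out[i] = samples[i] * gain
    return out

# Hypothetical track: a loud region, a short gap, then a quiet region.
track = [0.3, -0.3, 0.3, -0.3, 0.0, 0.0, 0.03, -0.03, 0.03, -0.03]
labels = [(0, 4), (6, 10)]  # sample index ranges standing in for label regions

leveled = normalize_segments(track, labels)
print("region levels:", round(rms(leveled[0:4]), 3), round(rms(leveled[6:10]), 3))
```

In a real project the gain would need to change smoothly at the region boundaries, otherwise the background noise audibly jumps. That is the part where envelopes are handy.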

I could add that some tools like iZotope RX try to have it both ways, in that they’ll produce a user-editable envelope from what’s essentially their compressor (they call it a leveler, IIRC). The catch with that approach is that the envelope is going to have so many control points that you’ll die of old age adjusting them manually if the software didn’t get them right automagically, so it’s mostly an illusion of control compared to manually splitting the offending regions that didn’t come out right.