Signal to Noise Ratios / Intensity Normalisation

steve
Site Admin
Posts: 80752
Joined: Sat Dec 01, 2007 11:43 am
Operating System: Linux *buntu

Re: Signal to Noise Ratios / Intensity Normalisation

Post by steve » Wed May 04, 2011 8:30 pm

RobH_Lab wrote: Is there any way of editing this code so that it sets the mean RMS level of a track to the specified value?
For the noise track, that's not a problem.
Create a noise track (Generate menu > Noise) as long as you like (within reason).
Select the entire track and apply the effect.
The "gain" (amplification) is calculated based on the first one second, but is then applied to the entire selection. Because the noise is constant throughout the duration of the track, all of the track will have the same RMS level.


For the "words" recording, it is a bit more complicated.
Let's say that you have one track that is a recording of someone quietly saying:
"mumble mumble mumble whisper whisper mumble mumble mumble mumble whisper whisper mumble mumble mumble"
Let's say you have another track where someone shouts:
"ONE......................................(long pause)...........................................................TWO"
Question: which is the louder of the two?
Clearly the second one will be a lot louder for the duration of each word, but the overall average may be greater for the quiet but continuous mumbling.
For normal speech, much of the recording will be silent, so measuring the "mean RMS level of a track" is less about the loudness of the voice, and more about the proportion of words to silence. I chose "1 second" as an approximation to "an average word length" so that it will provide a reasonable, quantifiable measure of the loudness of a word.
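To put rough numbers on the mumble/shout comparison, here is a small sketch in Python (not Nyquist, and not part of the plug-in). The sample rate, amplitudes and durations are made-up values, and a sine burst stands in for speech, so treat it as an illustration only.

```python
import math

RATE = 1000  # hypothetical sample rate, kept small for the example


def rms(samples):
    """Root-mean-square of a list of float samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def max_window_rms(samples, window):
    """Highest RMS among consecutive, non-overlapping windows."""
    return max(rms(samples[i:i + window])
               for i in range(0, len(samples) - window + 1, window))


def tone(amplitude, seconds):
    """A 50 Hz sine burst standing in for speech."""
    n = int(RATE * seconds)
    return [amplitude * math.sin(2 * math.pi * 50 * i / RATE) for i in range(n)]


# Ten seconds of quiet, continuous "mumbling".
mumble = tone(0.3, 10.0)
# Two loud half-second "words" separated by nine seconds of silence.
shout = tone(0.8, 0.5) + [0.0] * (9 * RATE) + tone(0.8, 0.5)

# Mean RMS over the whole track: the mumbling comes out higher.
mean_mumble, mean_shout = rms(mumble), rms(shout)

# Loudest half-second window: the shouting comes out higher.
word_mumble = max_window_rms(mumble, RATE // 2)
word_shout = max_window_rms(shout, RATE // 2)
```

With these made-up values the whole-track mean favours the mumbling, while the loudest-window measure favours the shouting, which is the distinction described above.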
RobH_Lab wrote: is there any way of changing the length of section it works on (i.e. from 1s to 2s)?
Yes. The last number (100) is the length of the measured section, in hundredths of a second.

The code works like this:
  • The RMS level is calculated 100 times per second for a given period. In effect, the audio is "sliced" into sections of 0.01 seconds duration, and the RMS level is calculated for each slice.
  • For the code given there are 100 "slices" of 0.01 seconds each, so the first 1 second is measured.
  • The maximum of these RMS values is then found.
  • An "amplify" amount is then calculated, based on the RMS value found in the previous step and the "target" level.
  • The entire audio selection is then amplified by that amount.
The result of this is that the maximum RMS level in the first 100 slices will be equal to the "target" level.
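The steps above can be sketched outside Audacity roughly as follows. This is Python, not Nyquist, and the function names are my own. It assumes plain non-overlapping slices, which may differ in detail from how Nyquist's rms function windows the audio.

```python
import math


def db_to_linear(db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def slice_rms(samples, rate, slices_per_second=100):
    """RMS of consecutive slices, each 1/slices_per_second seconds long."""
    size = rate // slices_per_second
    return [math.sqrt(sum(x * x for x in samples[i:i + size]) / size)
            for i in range(0, len(samples) - size + 1, size)]


def normalize_peak_rms(samples, rate, target_db=-12.0, n_slices=100):
    """Amplify the whole selection so that the loudest of the first
    n_slices slices ends up at target_db."""
    loudest = max(slice_rms(samples, rate)[:n_slices])
    if loudest == 0:
        raise ValueError("the measured section is silent")
    gain = db_to_linear(target_db) / loudest
    return [x * gain for x in samples]


# Example: a 3 second, 50 Hz test tone at a hypothetical 1000 Hz sample rate.
RATE = 1000
tone = [0.1 * math.sin(2 * math.pi * 50 * i / RATE) for i in range(3 * RATE)]
louder = normalize_peak_rms(tone, RATE, target_db=-12.0, n_slices=100)
```

Only the first n_slices slices are measured, but the gain is applied to every sample, just as in the description above.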

To test the first 200 slices (2 seconds), just change the last number to 200 like this:

Code:

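(setq target-level -12)  ; target level in dB
;; (rms s) gives 100 RMS values per second; (peak ... 200) takes the
;; maximum of the first 200 of them, i.e. the first 2 seconds.
(mult s (/ (db-to-linear target-level) (peak (rms s) 200)))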

RobH_Lab
Posts: 6
Joined: Thu Apr 28, 2011 8:14 am
Operating System: Please select

Post by RobH_Lab » Thu May 05, 2011 8:19 am

steve wrote:
RobH_Lab wrote: Is there any way of editing this code so that it sets the mean RMS level of a track to the specified value?
For the "words" recording, it is a bit more complicated.
Let's say that you have one track that is a recording of someone quietly saying:
"mumble mumble mumble whisper whisper mumble mumble mumble mumble whisper whisper mumble mumble mumble"
Let's say you have another track where someone shouts:
"ONE......................................(long pause)...........................................................TWO"
Question: which is the loudest?
Clearly the second one will be a lot louder for the duration of each word, but the overall average may be greater for the quiet but continuous mumbling.
For normal speech, much of the recording will be silent, so measuring the "mean RMS level of a track" is less about the loudness of the voice, and more about the proportion of words to silence. I chose "1 second" as an approximation to "an average word length" so that it will provide a reasonable, quantifiable measure of the loudness of a word.
As the stimuli are single words, I was hoping to remove any silence before/after the word and then apply a mean value to just the word on its own.

Post by steve » Thu May 05, 2011 5:28 pm

Here's another version of the code.
This version will set the average (mean) RMS level of the selected audio to the "target level".

Code:

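(setq target-level -18)  ; target mean RMS level in dB
(if (<= (get-duration 1) 10)
    (let* ((ssq (mult s s))       ; squared samples
           (step (round len)))    ; one block spanning the whole selection
      ;; mean of the squares, then square root = mean RMS of the selection
      (mult s (/ (db-to-linear target-level)
                 (snd-fetch (snd-sqrt (snd-avg ssq step step op-average))))))
    (print "Error.\nSelection must be no more than 10 seconds"))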
Note that if your selection includes some silence, that silence is taken into account when calculating the mean RMS. The selection will therefore measure lower, so it will be amplified more and the processed waveform will be bigger.
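To show the effect of included silence with numbers, here is a rough Python sketch (not the Nyquist plug-in itself). The test signal, sample rate and function names are assumptions made for the example.

```python
import math


def db_to_linear(db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def mean_rms(samples):
    """RMS taken over the whole selection, silence included."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def gain_for_target(samples, target_db=-18.0):
    """Amplification needed to bring the mean RMS to target_db."""
    return db_to_linear(target_db) / mean_rms(samples)


RATE = 1000  # hypothetical sample rate
# A one-second burst standing in for a spoken word.
word = [0.1 * math.sin(2 * math.pi * 50 * i / RATE) for i in range(RATE)]
# The same word with one second of silence on each side.
padded = [0.0] * RATE + word + [0.0] * RATE

gain_word = gain_for_target(word)
gain_padded = gain_for_target(padded)
```

Here the padded selection is three times as long with the same energy, so its mean RMS is lower by a factor of sqrt(3) and the word is amplified by sqrt(3) times more (about 4.8 dB). Trimming the silence first, as suggested earlier in the thread, avoids that.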

I've set the "target level" to -18 dB, which should avoid clipping when processing single words. If you require a different RMS level, change the "-18" in the first line.
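If you want to check whether a given target level leaves enough headroom, one approach is to compare the peak and the mean RMS of the selection (the "crest factor"). The sketch below is Python, not Nyquist, and the function name and test signal are my own assumptions.

```python
import math


def headroom_check(samples, target_db=-18.0):
    """Return (crest factor in dB, predicted peak in dB after normalising
    the mean RMS to target_db, whether that peak would clip)."""
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    peak = max(abs(x) for x in samples)
    crest_db = 20.0 * math.log10(peak / rms)
    peak_after_db = target_db + crest_db
    return crest_db, peak_after_db, peak_after_db > 0.0


# Example with a plain sine wave, whose crest factor is about 3 dB.
RATE = 1000  # hypothetical sample rate
sine = [0.5 * math.sin(2 * math.pi * 50 * i / RATE) for i in range(RATE)]
crest_db, peak_after_db, clips = headroom_check(sine, target_db=-18.0)
```

With a -18 dB target, clipping only occurs if the peak is more than 18 dB above the mean RMS. Real speech has a much higher crest factor than a sine wave, so it is worth checking actual recordings rather than relying on this test tone.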

Again, this only works on mono tracks.
The maximum length selection that can be processed is 10 seconds.
