Signal to Noise Ratios / Intensity Normalisation
Forum rules
Audacity 1.2.x is now obsolete. Please use the current Audacity 2.1.x version.
The final version of Audacity for Windows 98/ME is the legacy 2.0.0 version.
Hi, hope this is the correct forum for this sort of question.
I'm looking at producing some stimuli for an experiment on auditory processing. Each stimulus will be a spoken word which is masked by white noise. To this end I have recorded separate .wav files containing various spoken words, and I have also generated a separate .wav file of white noise. It's easy enough to 'mix' the noise and each word together into one .wav file, however I also need to do the following:
1) Normalise the intensity levels of each track (i.e. so each stimulus will have the same intensity level). As each word track will vary in intensity over time, I suspect normalising will involve ensuring that the average intensity of each track is the same. Is there any way of 'snapping' a set of tracks to the same intensity level - so that they have the same average intensity?
2) I need to play around with the signal to noise ratio (i.e. the relative intensity of the white noise vs the word) of each stimulus. The only way I can find to do this at the moment is to play around with the gain slider bars of each track (i.e. the white noise and the word) before they are mixed together. Is there a more accurate way of doing this in Audacity? My objective here is to make the words hard, but not impossible, to hear through the white noise.
I am using version 1.2.6 on the lab computer (which is part of a managed service, so I can't myself install ver 1.3). However I have ver 1.3 at home on my laptop so I could use that if there are features in that version that would be helpful for this problem.
Thanks
Rob
Re: Signal to Noise Ratios / Intensity Normalisation
The main problem here is the word "intensity". What exactly do you mean?
"Intensity" could mean:
- Peak level
- Average peak level
- RMS level
- Maximum RMS level with a specified window size
- Band limited peak or RMS measurement
- Any of the above while ignoring silences between words
- Loudness
- Something else entirely
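To illustrate why the choice of measure matters, here is a minimal Python sketch (not something Audacity runs) comparing two of the measures listed above. The function names `peak_db` and `rms_db` and the two test signals are made up for illustration; samples are assumed to be floats in the range -1.0 to 1.0, so levels come out in dBFS.

```python
import math

def peak_db(samples):
    """Peak level in dBFS: the largest absolute sample value."""
    return 20 * math.log10(max(abs(x) for x in samples))

def rms_db(samples):
    """RMS level in dBFS: root of the mean squared sample value."""
    mean_square = sum(x * x for x in samples) / len(samples)
    return 10 * math.log10(mean_square)

# Hypothetical test signals: one second at 8 kHz, both with a full-scale peak.
rate = 8000
sine = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
square = [1.0 if x >= 0 else -1.0 for x in sine]

# Same peak level, different RMS level (the sine is about 3 dB lower).
print(round(peak_db(sine), 2), round(rms_db(sine), 2))
print(round(peak_db(square), 2), round(rms_db(square), 2))
```

Two sounds can therefore match on one measure and differ on another, which is why the measure needs to be pinned down before normalising.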
9/10 questions are answered in the FREQUENTLY ASKED QUESTIONS (FAQ)
Re: Signal to Noise Ratios / Intensity Normalisation
The study I'm trying to replicate says this:
In addition, all sound stimuli were normalized to the same decibel level, such that all ‘‘noise-word’’ and ‘‘noise-alone’’ stimuli were matched on sound level.
I need an objective measure of loudness - I assume sound pressure measured in decibels would suffice. Given that this is going to fluctuate over the verbal stimulus I think average sound pressure over the course of the spoken word would be good enough. Is there a way of measuring this in Audacity?
Thanks
Rob
Re: Signal to Noise Ratios / Intensity Normalisation
RobH_Lab wrote: all sound stimuli were normalized to the same decibel level, such that all ‘‘noise-word’’ and ‘‘noise-alone’’ stimuli were matched on sound level.
That is horribly vague.
RobH_Lab wrote: I need an objective measure of loudness - I assume sound pressure measured in decibels would suffice.
"Loudness" is (by definition) subjective. There are "standardized" scales of loudness (see "Equal Loudness Curves" in the "Loudness" link from my previous post), but there are several different "standards". Similarly there are different ways of measuring sound pressure level. The best method would probably be to use an SPL meter and to fully specify the SPL and "weighting" that is being used, but even this method has limitations. We can't do that in Audacity as it would require calibrating the microphones and playback system, but we can probably come up with a "close enough" approximation.
Are the "spoken words" just single words? That is, you have one test with white noise and someone saying "banana", then you have another test with someone else saying "banana"?
Re: Signal to Noise Ratios / Intensity Normalisation
All the relevant literature is vague - most papers don't even specify the signal to noise ratio or even the software they used to create the stimuli!
Basically what I want is to try and ensure that all the stimuli that we subject the participant to are the same 'loudness'. As perceived loudness is subjective and therefore can't be measured accurately, I need to match on the closest objective measure - which would I suppose be sound pressure.
The stimuli are a selection of different single words in white noise. I want to make sure that none of the individual stimuli are different in terms of sound pressure, and I also need to be able to calculate/adjust a signal-to-noise ratio between the white noise part of the track and the speech part of the track, so that the SNR is constant across the different stimuli. SNR is reported as the difference in relative dB between two stimuli, so an SNR of -5 means the noise has 5 dB more power/intensity(?) than the signal. Messing around with the gain sliders for the noise and the signal on each track gives an approximation of this, but I don't know how to calculate the SNR from this, and it would be invalid if the signals for each stimulus start at different dB levels anyway (the same snippet of white noise would be used in every stimulus).
Re: Signal to Noise Ratios / Intensity Normalisation
If you run the following code in the "Nyquist Prompt" (in the Effect menu), it will amplify the selection so that the first 1 second has a peak RMS level of -12 dBFS. (only works with mono tracks).
If you require a different RMS level, change the "-12" in the first line to whatever you want the target RMS level to be.
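; Target maximum RMS level in dBFS
(setq target-level -12)
; Scale the selection so the highest RMS value in the first 100 windows (1 second) hits the target
(mult s (/ (db-to-linear target-level)(peak (rms s) 100)))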
How to use it:
Select the part of the audio track that contains the "word".
Ensure that the selection is at least 1 second, and that the word is in the first 1 second of the selection.
Select "Nyquist Prompt" from the Effect menu.
Copy and paste the code (above) into the Nyquist Prompt box.
Click OK.
The entire selection will be amplified such that the maximum RMS level in the first 1 second is at the target level.
Technical detail: The RMS value is calculated with a window size of 0.01 seconds.
You can also use this on your noise sample (mono only).
Note that different types of noise with the same RMS level will have different peak levels and will have different masking effects.
Audacity 1.3.13 contains a noise generator that can produce white, pink or brown noise.
I've only tested this in the current 1.3.13 version of Audacity. No guarantees that it will work correctly in older versions.
You can get Audacity 1.3.13 from here: http://audacityteam.org/download/
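For anyone who wants to see the arithmetic outside Audacity, the steps above can be sketched in plain Python. This is only an approximation of my reading of the Nyquist snippet, not a port of it: the names `windowed_rms`, `normalise_max_rms` and `measure_seconds` are made up for illustration, samples are assumed to be mono floats in the range -1.0 to 1.0, and the windows are assumed not to overlap.

```python
import math

def windowed_rms(samples, rate, window=0.01):
    """RMS of consecutive, non-overlapping windows of `window` seconds."""
    size = max(1, int(round(rate * window)))
    values = []
    for start in range(0, len(samples) - size + 1, size):
        block = samples[start:start + size]
        values.append(math.sqrt(sum(x * x for x in block) / size))
    return values

def normalise_max_rms(samples, rate, target_db=-12.0, measure_seconds=1.0):
    """Scale every sample so that the largest windowed RMS value found in
    the first `measure_seconds` seconds equals `target_db` dBFS."""
    head = samples[:int(rate * measure_seconds)]
    loudest = max(windowed_rms(head, rate))
    if loudest == 0:
        return list(samples)  # pure silence: nothing to scale
    gain = 10 ** (target_db / 20) / loudest  # dB target converted to linear
    return [x * gain for x in samples]

# Hypothetical "word": half a second of a quiet 400 Hz tone, then silence.
rate = 8000
word = [0.1 * math.sin(2 * math.pi * 400 * n / rate) for n in range(rate // 2)]
word += [0.0] * rate
levelled = normalise_max_rms(word, rate)
```

Because the gain is derived from the loudest 10 ms window rather than from the whole word, two words with the same maximum can still differ in average level. In this sketch a longer measurement period is just a larger `measure_seconds`.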
Re: Signal to Noise Ratios / Intensity Normalisation
Hi
I tried the code, by splitting my audio tracks to mono. It seems to have worked for most of the word stimuli, apart from a couple which didn't seem to change at all and (subjectively) seem a bit quieter than the rest - however that is probably due to the pattern of lexical stress on those words so I might just re-record them and see if I get a better result.
To mix the noise and words together I was just importing both tracks into the same project and editing the volume levels by moving the 'gain' sliders to the left of each track. Is this the best way of varying signal ratios when mixing two tracks together? Also if I (for example) move the noise track to a gain of 6 dB and leave the signal at 0, does this equal an SNR of -6, or does the gain not work in that way?
Thanks for your help
Rob
Re: Signal to Noise Ratios / Intensity Normalisation
All that and you may already be up to your ankles in the "loudness war" that TV commercials have. All the sound meters read proper, but somehow the commercials always seem to be louder than the show.
At one time, the advertisers pushed the idea that they could be as loud as the loudest part of the show -- say a gunshot or explosion. That was a good try. Didn't work.
Koz
Re: Signal to Noise Ratios / Intensity Normalisation
RobH_Lab wrote: I tried the code, by splitting my audio tracks to mono.
Probably easier to just record the audio track in mono (unless you already have all of the "word" recordings).
Alternatively you can use "Tracks menu > Stereo Track to Mono" which will "mix" the two channels to a mono track. This may work better than splitting if the channels are not identical, and is probably a bit quicker to do (probably only available in Audacity 1.3.x and not 1.2.6).
RobH_Lab wrote: (subjectively) seem a bit quieter than the rest
"Subjective" is the important word.
"Loudness" is always subjective. The measure provided by the code posted will usually provide an approximation to "loudness", but it is an absolute (objective) measurement and will not always correlate to (subjective) loudness.
When using the code it is important that the entire word fits into the first 1 second of the selected audio. If some of your words are longer than 1 second, I can modify the code to measure over a longer period. (1 second is probably about enough to accommodate most words).
RobH_Lab wrote: Also if I (for example) move the noise track to a gain of 6db and leave the signal at 0, does this equal a SNR of -6, or does the gain not work in that way?
If you move the gain slider of one track to -6 dB, then that track will play 6 dB lower than before.
Signal-to-noise ratio is defined as the power ratio between a signal (meaningful information) and the background noise (unwanted signal).
As the level of noise is essentially constant throughout the duration of the word, but the level of audio varies as the word is being spoken, the SNR will be constantly changing.
Before the word is spoken, and after the word has been spoken there is only noise, so the SNR is 0:1 (all noise).
If the average power while the word is being spoken is equal to the average power of the noise, then the SNR is 1:1
Measured in dB, SNR is P(signal) - P(noise)
This value is usually an "average power" measurement over a specified period (within a specified frequency range).
In the case of the code provided, the specified period is 1 second.
If both the noise and the word are "normalised" using this code to -12 dB, then the SNR measured over the first 1 second will be:
(-12 dB signal) - (-12 dB noise) = -12 + 12 = 0 dB SNR
If the noise is reduced by 6 dB, then the SNR becomes:
(-12 dB signal) - (-18 dB noise) = -12 + 18 = 6 dB SNR
Reducing the noise by a further 6 dB gives:
(-12 dB signal) - (-24 dB noise) = -12 + 24 = 12 dB SNR
See here for more info: http://en.wikipedia.org/wiki/Signal-to-noise_ratio
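The arithmetic above is simple enough to script. The following Python sketch is illustrative only: the function names `snr_db` and `noise_gain_for_snr` are made up, and both levels are assumed to have been measured the same way (for example with the code posted earlier in this thread).

```python
def snr_db(signal_level_db, noise_level_db):
    """SNR in dB is the signal level minus the noise level (both in dB)."""
    return signal_level_db - noise_level_db

def noise_gain_for_snr(signal_level_db, noise_level_db, target_snr_db):
    """Gain in dB to set on the noise track to reach the target SNR,
    leaving the signal track untouched."""
    return signal_level_db - target_snr_db - noise_level_db

# The worked examples above: word at -12 dB, noise at -12, -18 and -24 dB.
examples = [snr_db(-12.0, noise) for noise in (-12.0, -18.0, -24.0)]

# Word and noise both normalised to -12 dB, wanted SNR of -5 dB:
# the noise track's gain slider would go to +5 dB.
gain = noise_gain_for_snr(-12.0, -12.0, -5.0)
```

So a gain slider setting maps directly onto SNR only when both tracks start at the same measured level. If the word and the noise start at different levels, that difference has to be included, which is what the second function does.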
Re: Signal to Noise Ratios / Intensity Normalisation
steve wrote: If you run the following code in the "Nyquist Prompt" (in the Effect menu), it will amplify the selection so that the first 1 second has a peak RMS level of -12 dBFS. (only works with mono tracks). If you require a different RMS level, change the "-12" in first line to whatever you want the target RMS level to be.
(setq target-level -12)
(mult s (/ (db-to-linear target-level)(peak (rms s) 100)))
Is there any way of editing this code so that it sets the mean RMS level of a track to the specified value? Also, is there any way of changing the length of the section it works on (i.e. from 1s to 2s)?
Thanks
Rob