Need to create specific SNR for Hearing Research Project

I apologize, I am very new to Audacity.

I am doing a research project where we would like to present speech stimuli (short sentences) amid speech babble at a specific signal-to-noise ratio. The sentence files are mono tracks and 4 seconds long. The sentence presentation is only about 2-2.5 seconds so there is some silence on either side of the actual presentation. I also have a 6-talker speech babble file. For this experiment, I want the sentence to be 3 dB louder than the babble (+3dB SNR). I understand you could “Normalize” each track so that the sentence track is -12 dB (for example) and the babble is -15 dB, but if I understand correctly, this will normalize to the highest peak of the signal, correct? I don’t think I want to do that because the babble has several high peaks, whereas the sentence only has a few. The noise ends up sounding louder than I feel like it should. I’ve learned that the RMS value is a more accurate way of describing the loudness of a signal because it takes into account the fact that there are peaks and quieter parts of the signal.
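To see why peak-based normalization is misleading here, a quick pure-Python sketch (the signals and numbers are illustrative, not from any real recording): two signals with identical peaks can differ in RMS level by many dB, and a 3 dB RMS offset corresponds to a linear amplitude ratio of 10^(3/20) ≈ 1.41.

```python
import math

def peak_db(samples):
    """Peak level in dB relative to full scale (0 dB = 1.0)."""
    return 20 * math.log10(max(abs(x) for x in samples))

def rms_db(samples):
    """RMS level in dB relative to full scale."""
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    return 20 * math.log10(rms)

# A sparse signal (one brief burst, mostly quiet) vs. a dense one:
sparse = [0.9] * 10 + [0.05] * 990    # sentence-like: short loud burst
dense = [0.9] * 500 + [0.05] * 500    # babble-like: loud most of the time

# Both have the same peak, so peak normalization treats them as equally
# loud, yet their RMS levels differ by roughly 16 dB here.
same_peak = peak_db(sparse) == peak_db(dense)
rms_gap = rms_db(dense) - rms_db(sparse)

# A +3 dB RMS difference is a linear amplitude ratio of about 1.41:
ratio = 10 ** (3 / 20)
```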

I found some Nyquist Prompt code (from an old post) that supposedly changes the RMS value of the track:
(setq target-level -12)
(mult s (/ (db-to-linear target-level)(peak (rms s) 100)))
According to that post, as long as the entire presentation of the word fell within the first second, this would set the RMS to -12 dB. I assume I need code that sets the RMS of the sentence (which is longer than one second) to -12 dB, and different code to set the RMS of the babble to -15 dB.

Yes, Normalize and Amplify work on the highest peak.

Change the values “-12” and “100”. For example:

(setq target-level -15)
(mult s (/ (db-to-linear target-level)(peak (rms s) 500)))

amplifies such that the maximum RMS level in the first 5 seconds is at the target level of -15 dB.
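For readers who don't use Nyquist, here is a rough pure-Python equivalent of what that snippet does (my own sketch with made-up function names, assuming Nyquist's default RMS rate of 100 frames per second): compute the RMS in short windows, take the maximum over the first 500 windows (5 seconds), and derive the gain needed to bring that maximum to the target level.

```python
import math

def max_windowed_rms_db(samples, sample_rate, frame_rate=100, n_frames=500):
    """Max RMS over the first n_frames analysis windows, mimicking
    (peak (rms s) 500): rms defaults to 100 frames/second, so 500
    frames covers the first 5 seconds."""
    win = int(sample_rate / frame_rate)
    frames = []
    for i in range(n_frames):
        chunk = samples[i * win:(i + 1) * win]
        if not chunk:
            break  # ran past the end of the selection
        frames.append(math.sqrt(sum(x * x for x in chunk) / len(chunk)))
    return 20 * math.log10(max(frames))

def gain_to_target(samples, sample_rate, target_db, n_frames=500):
    """Linear gain that brings the max windowed RMS to target_db."""
    current_db = max_windowed_rms_db(samples, sample_rate, n_frames=n_frames)
    return 10 ** ((target_db - current_db) / 20)
```

Applying the returned gain to every sample is the Python analogue of the `mult` step in the Nyquist code.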


To make the RMS level of the “selected” audio a specified level, try this code:

(setq target-level -18)

(defun rms-normalize (sig target-db)
  ;; (rms sig rate) with rate = 1/duration yields a single RMS frame
  ;; covering the whole selection; snd-fetch reads that one value.
  (let ((rms-val (snd-fetch (rms sig (/ (get-duration 1)))))
        (level (db-to-linear target-db)))
    ;; Scale so the selection's RMS equals the target level.
    (mult sig (/ level rms-val))))

(if (> len 1000000)
    "Error\nMaximum selection length 1000000 samples."
    (multichan-expand #'rms-normalize *track* target-level))

This requires that the selection is no more than 1 million samples duration (a bit over 20 seconds for a sample rate of 44100 Hz).
Use this code in the Nyquist Prompt effect (requires a recent version of Audacity with the “Use legacy syntax” option NOT selected).
If applied to a stereo track, each channel will have the specified RMS level.

Thanks for responding. After some more research and playing around with a sound level meter in the lab (we are presenting from a single speaker in sound field), the track I created where the mean RMS of the babble was 3 dB lower than the sentence did not actually read out that way on the sound level meter (I measured each independently, and the difference was nowhere near 3 dB). It makes me think I’m manipulating the wrong parameter of the signal. Should I just create each sentence, measure the actual SPL output with the sound level meter, use Audacity to adjust the signal until I reach the desired SNR (regardless of what the wave statistics say), and go from there?

How reliable do you think the SPL meter measurements are, if they are of Audacity playing the tracks?

Are all the factors in the experiment controlled - same voice, same babble, same computer speakers, same room, same distance of the meter from the speakers, and so on?

If you can say in Audacity that the sentence has 3 dB higher RMS than the babble, then that is a quantifiable fact.


That’s a good point… I don’t know how reliable it is because there are a lot of things I can’t control for. I think I can defend using the mean RMS.