Understanding Audio and Amplitude Levels

Hello. I am interviewing people with my iPhone, and need help better understanding at what levels things should be as far as the audio goes.

First off, when I listen to a final edit of an audio interview, at what level should the volume be set on my MacBook Pro?

I guess I assumed that if the volume on my laptop goes from 1 - 10, then you would want your final audio recording to sound good with your laptop’s volume set in the middle at “5”.

Is this a correct assumption?

Secondly, when I am working with the audio file (.m4a) in Audacity - that I separated from my .mov file using QuickTime - at what level should the amplitude be set on the audio of me interviewing someone? (Or really, this applies to music too.)

I assume that you want to “amp up” things so that the amplitude is as high as you can get and not clip most of the recording.

Is this correct?

I am asking these things, because it seems like the audio I have been recording is too quiet…

If I listen to interviews that I have done on my iPhone with my laptop’s internal speaker/volume set to “10”, then the interviews sound okay, but if I have the volume set to “5” then I personally can’t hear things - of course my hearing isn’t so great…

Can someone help me better understand the whole process of recording and then editing and then listening so I can get usable recordings to play on my website?

“Loudness” is a bit complicated… Perceived loudness is related to the short-term average (or RMS) and the frequency content. (The frequency content isn’t a big issue when comparing voice files.) But digital levels are limited by their peak level which shouldn’t exceed 0dB. (Some formats can go over 0dB and Audacity can “internally” go over 0dB but your digital-to-analog converter can’t.)
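To make the peak-versus-RMS distinction concrete, here is a rough Python sketch. The function names and the sample lists are made up for illustration, with samples assumed to be on a linear scale where 1.0 is full scale (0dB). It ignores frequency content entirely, so it is only a crude stand-in for perceived loudness.

```python
import math

def peak_db(samples):
    """Peak level in dB relative to full scale (1.0 = 0 dB)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

def rms_db(samples):
    """RMS (average) level in dB relative to full scale."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

# Hypothetical quiet recording with a single loud spike.
quiet_with_spike = [0.05, -0.05] * 500 + [1.0]
# Hypothetical steady signal at half of full scale.
steady = [0.5, -0.5] * 500

# The spike reaches 0 dB peak, yet the RMS is only about -24.6 dB.
print(peak_db(quiet_with_spike), rms_db(quiet_with_spike))
# The steady signal peaks lower (about -6 dB) but its RMS is also about -6 dB.
print(peak_db(steady), rms_db(steady))
```

The first signal already hits the 0dB peak limit but would sound much quieter than the second, which is why maximizing peaks alone does not always fix a quiet recording.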

I am asking these things, because it seems like the audio I have been recording is too quiet…

The 1st thing is to run the Amplify effect and accept the default. Audacity has pre-scanned your file and the Amplify effect will default to whatever gain (or attenuation) is needed for “maximized” 0dB peaks. (The Normalize effect is similar.)
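The arithmetic behind that default can be sketched in a few lines of Python. This is not Audacity's code, just the same idea under the assumption that samples are on a linear scale with 1.0 as full scale; the function names and the sample values are hypothetical.

```python
import math

def gain_to_full_scale_db(samples):
    """Gain in dB that would bring the highest peak exactly to 0 dB."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        raise ValueError("silent audio has no peak to maximize")
    return -20 * math.log10(peak)

def apply_gain_db(samples, gain_db):
    """Scale every sample by the linear factor matching gain_db."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]

# Hypothetical quiet recording whose loudest sample is 0.25 (about -12 dB).
quiet = [0.1, -0.25, 0.2, -0.05]
gain = gain_to_full_scale_db(quiet)   # about +12 dB
louder = apply_gain_db(quiet, gain)   # loudest sample is now about 1.0
print(round(gain, 2), max(abs(s) for s in louder))
```

Note that one stray spike sets the gain for the whole file, so a recording with a single loud click gets very little boost from this step.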

If it’s still not loud enough, try going through the Recommended Audiobook Mastering Process, except you set the limiter to 0dB. (The -3dB Audiobook standard is an “odd” requirement that only applies to audiobook publishing.) The RMS setting determines the loudness and the limiter “pushes down” the peaks (if necessary) to prevent clipping (distortion).
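As a rough illustration of why the limiter is needed, the Python sketch below works out how much gain a chosen RMS target calls for and where the peaks would land afterwards. It does not implement a limiter or Audacity's effects; `plan_loudness`, the -20dB default, and the sample data are assumptions for the example.

```python
import math

def level_db(value):
    """Convert a linear level (1.0 = full scale) to dB."""
    return 20 * math.log10(value) if value > 0 else float("-inf")

def plan_loudness(samples, target_rms_db=-20.0):
    """Return (gain_db, new_peak_db, needs_limiter) for an RMS target."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    peak = max(abs(s) for s in samples)
    gain_db = target_rms_db - level_db(rms)
    new_peak_db = level_db(peak) + gain_db
    # Peaks above 0 dB would clip unless a limiter pushes them down.
    return gain_db, new_peak_db, new_peak_db > 0.0

# Hypothetical quiet interview with one loud moment.
interview = [0.01, -0.01] * 500 + [0.5]
gain_db, new_peak_db, needs_limiter = plan_loudness(interview)
print(round(gain_db, 1), round(new_peak_db, 1), needs_limiter)
```

Here the RMS target needs roughly 14.6dB of gain, which would put the loud moment about 8.5dB over the top, so those peaks have to be pushed down before the file is exported.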

If it’s still not loud enough you can try again with a higher RMS level.*

  • P.S.
    Remember these are negative numbers so -17dB is +3dB louder than the standard -20dB audiobook recommendation. And, you’ll probably want to go at least 3dB louder… That’s a “noticeable difference”.
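For anyone who wants to check the arithmetic, a small Python sketch follows. The variable names are just for illustration, and the conversion formulas are the usual ones for amplitude and power ratios.

```python
standard_rms_db = -20.0   # the audiobook recommendation mentioned above
louder_rms_db = -17.0     # a louder target

# A less negative number is louder; the difference is the boost in dB.
difference_db = louder_rms_db - standard_rms_db

# What that difference means as plain ratios.
amplitude_ratio = 10 ** (difference_db / 20)  # about 1.41 times the amplitude
power_ratio = 10 ** (difference_db / 10)      # about 2 times the power

print(difference_db, round(amplitude_ratio, 2), round(power_ratio, 2))
```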


Thanks for the reply!

Where can I learn more about “loudness theory”?

From the little I know, and what you said above, I’m guessing that the lower the frequency the more you perceive something being loud, right? (Which is why HipHop/Rap music is so annoying…)

Is the human voice considered high or low frequency?

Yes, that is one of the few things I know how to do in Audacity, which I started using maybe 12 years ago to edit radio shows that I record off the Internet.

When I “amp up” my Internet stream recordings, I know enough by looking at music/radio streams to visually know how much to amp things up to a level that doesn’t clip.

The problem with my recordings is that if I am working with the audio track from my iPhone, then I have two people speaking, and my voice tends to be louder since I enunciate and project better than my interviewees.

Yesterday after posting, I played around with a few files, and if I ignored mini spikes and let them clip, and manually amped up my interviewee’s spoken parts word-by-word or by phrase, and then I DE-amped my parts, I made decent progress getting my interviews to sound good at say volume 7 of 10.

But that would be a painstaking process if I have to do a 1-hour interview, right?

Is there a way to use Audacity to help me automate some of this?

I am out in the field and traveling to Montana today, but will read that as soon as I can.

Can you explain more what that means?

Why are decibel numbers negative? I do not understand that at all!

And when you are editing music or spoken tracks, is amping things up so the peaks are a tad under max (i.e. 0dB) okay?

I don’t have Audacity on this notebook, but when I “amp up” my radio shows, or when I played with my interviews yesterday, I just amp things up to where the peaks are maybe at 90% of the scale which on my Audacity isn’t in dB but I think it goes from something like -1 to 1 or maybe -5 to 5 or something weird like that? (Will have to check it out when I have my laptop booted up.)

Finally, can you please address my question about, “At what volume level should you be listening to music or in my case interviews on your laptop?”

It seems that you would want your audio files to be able to be heard when a listener’s volume is set to “5”, because if people have to crank up the volume to “10” then there is no way to make things really loud, right?

Also, if you have to crank up your laptop/computer’s volume to “10” to be able to hear an interview, what does someone who is hard-of-hearing do? Set things to “11”? No, they would be screwed…

This is a good place to start: https://en.wikipedia.org/wiki/Loudness

The ReplayGain specification provides an interesting perspective of how “loudness” relates to digital signals: https://wiki.hydrogenaud.io/index.php?title=ReplayGain_specification

Human voices are mostly in the range 300 Hz to 7000 Hz, which is approximately in the middle of the human hearing range (usually quoted as 20 Hz to 20,000 Hz)

I’ve been wondering that too. I found this thread while searching the forums for the answer, but actually found the answer, such as it is, on Quora:

To me, that still sort of begs the question of why the top is 0. Maybe because there is a practical maximum value and 0 is a slightly more logical choice of arbitrary number than any positive one would be?

Even before digital, 0dB was the approximate maximum for electrical audio signals. But it is an “arbitrary” level. 0dBV is 1 Volt, etc. 0dB VU is even more arbitrary but it’s a defined standard. (A lot of what we call “VU meters” aren’t actually measuring VU.) Wikipedia has a picture of an analog VU meter.

The digital maximum is important because you’ll get clipping (distortion) if you “try” to go over. The actual maximum “count” depends on the number of bits. For example, with 16 bits you can count from −32,768 to +32,767. If your positive & negative peaks hit those values that’s 0dB. A flat-line at zero (“dead digital silence”) is negative-infinity dB so you can’t use the value of zero as the 0dB reference.
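A short Python sketch shows why the numbers come out negative. The helper name is made up, and it assumes 16-bit samples with 32,768 taken as the full-scale reference.

```python
import math

FULL_SCALE_16_BIT = 32768  # largest magnitude a 16-bit sample can hold

def sample_to_db(value, full_scale=FULL_SCALE_16_BIT):
    """Level of one sample in dB relative to the digital maximum."""
    if value == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(abs(value) / full_scale)

print(sample_to_db(-32768))  # 0.0, the digital maximum
print(sample_to_db(16384))   # about -6.02, half of the maximum
print(sample_to_db(328))     # about -40, a very quiet sample
print(sample_to_db(0))       # -inf
```

Every sample is some fraction of the maximum, and the logarithm of a fraction smaller than 1 is negative, so everything below the maximum gets a negative dB value.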

0dB SPL is approximately the quietest sound you can hear. Dead acoustical silence would be -infinity dB SPL, but that doesn’t exist on earth.

Annoying is not thumping bass. Annoying is baby screaming on a jet. Most of that energy settles in around 3000Hz the “sweet spot” for human sensitivity. Also see: fingernails on blackboard.

There is a positive usage of this sensitivity. Wired telephones cram as much energy as possible into the pitch range around 3000Hz to make people, not easier to hear, but easier to understand.

The standard rock band microphone, the Shure SM58, is not a “flat” microphone. It has a loudness boost roughly between 2000Hz and 7000Hz. Few people use an SM58 in a studio unless it’s for special effect.


A lot of what we call “VU meters” aren’t actually measuring VU.

Volume Units are a little strange. I got to ride technical for my TV Station Air Show and I sat down to a sound mixing board with “real” VU meters. It was like learning sound all over again.

Both dB and dB-SPL use convenient limits. Sound Pressure Level sets 0 where it’s so quiet most people’s hearing can’t find it and then works up. Most people aren’t interested in the racket their bacteria are making and would just as soon not find out how loud a thermonuclear explosion is (though it probably is a real number).

0dB is where the digital transmission/storage system is so loud it runs out of numbers. This is an important limit because storage/transmission systems have limited room. The Audio CD format 44100/16-bit/Stereo was a massive fist-fight over show length, sound quality, and storage capacity.

There are variations. Audacity uses 32-bit floating point internally. It’s a special digital format that, in essence, doesn’t overload. It keeps making up new digits as the sound gets louder. This is needed so you can experiment with your edit and make mistakes without flushing your show down the drain. Audacity will allow you to make a sound file to that standard, but for the most part, you can’t, with assurance, use that format anywhere else.
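Here is a toy Python comparison of the two approaches. It is not how Audacity is written; the `clamp16` helper and the sample values are invented, and ordinary Python floats stand in for the 32-bit floating point format.

```python
def clamp16(value):
    """Round to an integer and clip to the 16-bit range."""
    return max(-32768, min(32767, int(round(value))))

original = [1000, 20000, -30000]  # hypothetical 16-bit samples
gain = 4.0                        # about +12 dB, an over-enthusiastic boost

# Integer path: the boost clips, and undoing it cannot bring the peaks back.
boosted_int = [clamp16(s * gain) for s in original]
restored_int = [clamp16(s / gain) for s in boosted_int]

# Floating-point path: values above full scale (1.0) are simply kept.
boosted_float = [s / 32768 * gain for s in original]
restored_float = [round(s / gain * 32768) for s in boosted_float]

print(restored_int)    # [1000, 8192, -8192], the loud samples are damaged
print(restored_float)  # [1000, 20000, -30000], the mistake is undone
```

The floating-point copy survives the mistake, but it still has to be brought back under 0dB before exporting to a plain format such as 16-bit.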

You should stick with plain, recognized digital formats if you plan to send your show out. That’s the limitation of New and Improved formats. You have to convince everybody else to use them.

There was one explanatory web page that decided to abandon the negative decibels in favor of positive ones. That may have helped their immediate users, but it wouldn’t be of any use if they went anywhere else, and it made my head hurt.