Audio export, sampling rate and quality?

I’m a little confused about Audacity audio export. I am asked for sampling rate and quality, both in kHz. But sampling rate IS quality, as far as I understand. What exactly is the “quality” I’m being asked for? How does the resulting file size depend on the two?

I manage hour-long seminars. My recorded file size for each is 50MB, and Mac Finder “info” tells me it has a 48kHz sampling rate. I trim a little bit, and export at 24 kHz sampling rate and 16 kHz quality. My resulting file size is 5 MB, which is handy, and the audio is quite acceptable. I would have thought that half the sampling rate would export to a file half the size, but that’s not the case. Something else is going on.

What is the bit depth of the raw and converted files? It also makes a difference.

Others will probably explain it better than me.

I’m using a Mac, and all I’m told in “Get Info” is the sampling rate. How do you inspect a file to get bit depth? It makes sense that file size will depend on both sampling rate and bit depth. Also, as I asked, what is “Quality” in Audacity export, and, for that matter, how do you specify bit depth in Audacity export?

Long answer…

This little tutorial explains how digital audio “works”.

FILE SIZE -
For uncompressed files (i.e. WAV) it’s easy to calculate file size. There are 8 bits in a byte, so “CD quality” is 16 bits / 8 × 44,100 samples per second × 2 channels (stereo) = 176,400 bytes (about 176kB) per second, or about 10MB per minute.
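That arithmetic can be checked in a few lines. This is just a sketch using the “CD quality” numbers from the post; the helper name is made up for illustration:

```python
# Uncompressed (PCM/WAV) data rate: (bit depth / 8) bytes per sample,
# times the sample rate, times the channel count.
def pcm_bytes_per_second(bit_depth, sample_rate_hz, channels):
    return bit_depth // 8 * sample_rate_hz * channels

cd = pcm_bytes_per_second(16, 44100, 2)
print(cd)                   # 176400 bytes per second for CD quality
print(cd * 60 / 1_000_000)  # ≈ 10.6 MB per minute
```

Halving the sample rate halves this number, and halving the bit depth halves it again, which is why an uncompressed export scales so directly with those two settings.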

FLAC is lossless compression and it’s usually about half the size of the uncompressed data.

MP3 and MP4 (aka M4A or AAC) are lossy compression, and data is thrown away to make a smaller file. A good quality (high bitrate) MP3 is about 1/5th the size of CD quality files. It’s “smart” and it tries to throw away details you can’t hear, and it can often sound as good as the uncompressed original, but information is being thrown away. As you go lower in bitrate, more data is thrown away, and at some point you’ll notice quality loss.

MP3 doesn’t store individual samples so it doesn’t have a “bit depth”. It does have a sample rate (up to 48kHz).

Audacity AUP3 project files use 32-bit floating point data, plus it collects “undo” information so project files can be very large.

The bitrate is also related to file size. The bitrate for audio files is usually expressed in kbps (kilobits per second) and it’s often used as an indication of quality for compressed files. You can divide by 8 to get file size in kilobytes per second. We don’t usually “talk about” the bitrate for uncompressed audio, but CD audio is 16 x 44.1 x 2 = 1411kbps. The highest bitrate for MP3 is 320kbps.
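The bitrate figure quoted above can be reproduced the same way (a quick sketch; the function name is made up):

```python
# Bitrate of uncompressed audio: bit depth * sample rate * channels, in kbps.
def bitrate_kbps(bit_depth, sample_rate_hz, channels):
    return bit_depth * sample_rate_hz * channels / 1000

print(bitrate_kbps(16, 44100, 2))  # 1411.2 kbps for CD audio
print(1411.2 / 8)                  # ≈ 176 kB per second
```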

Audio files often contain metadata (“tags”) and any embedded artwork adds to the file size and it’s not included in the bitrate calculation. (Audacity doesn’t support embedded artwork.)

QUALITY -
CD quality (16/44.1) is generally better than human hearing but some “audiophiles” like higher resolution and the only downside to higher resolution is bigger files.

But up-sampling doesn’t automatically improve the sound. It’s sort of like copying a VHS tape to Blu-ray… That doesn’t give “Blu-ray” quality.

Audio is two-dimensional. The bit depth represents the amplitude resolution. With 16 bits you can “count” from -32,768 (for the negative half of the wave) to +32,767 (for the positive half). With 8 bits you can only count to 255. (1)

The higher the resolution the more “steps” so the finer the resolution. (The “steps” are filtered-smoothed at the output of the DAC.)
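The counting ranges above follow directly from the bit depth. A small sketch (helper name made up; the unsigned 8-bit case matches footnote (1) below):

```python
# Amplitude range for a given bit depth.
# Signed: -2^(bits-1) .. 2^(bits-1) - 1.  Unsigned: 0 .. 2^bits - 1.
def amplitude_range(bits, signed=True):
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1

print(amplitude_range(16))               # (-32768, 32767)
print(amplitude_range(8, signed=False))  # (0, 255) — 8-bit WAV is unsigned
```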

If you want to know what “low resolution” audio sounds like you can export as 8-bit WAV. (2)

The sample rate (kHz) determines the frequency resolution. The audio frequency is limited to half of the sample rate, i.e. CD audio can’t go above 22,050Hz. The filtering/smoothing isn’t perfect so you can’t quite go that high, but CDs usually go to 20kHz, which is the “traditional” human hearing limit.

If you start with a good music file and export at 8kHz you’ll notice the loss of highs.
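The half-the-sample-rate rule (the Nyquist limit) is simple enough to put in one line (sketch only; the function name is made up):

```python
# Highest audio frequency a given sample rate can capture (Nyquist limit).
def nyquist_hz(sample_rate_hz):
    return sample_rate_hz / 2

print(nyquist_hz(44100))  # 22050.0 — CD audio can't go above this
print(nyquist_hz(8000))   # 4000.0 — why an 8kHz export loses the highs
```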

(1) Just to complicate things… 8-bit WAVs don’t store negative values. The data is biased or offset but it’s “corrected” when played so the electrical signal coming-out of the DAC does go negative.

(2) If you do that, go to Edit → Preferences → Quality and set both dither settings to “None”. Dither is added noise that’s supposed to sound better than the “natural” quantization noise. So dither will “mess up” the low-resolution experiment.

I’m using a Mac, and all I’m told in “Get Info” is the sampling rate. How do you inspect a file to get bit depth?

I’m a Windows guy but MediaInfoOnline may give you more details.

Also, as I asked, what is “Quality” in Audacity export, and, for that matter, how do you specify bit depth in Audacity export?

-Quality (bitrate) shows up for certain compressed formats.
-For WAV you’ll see a bit depth option.
-For FLAC you’ll also see “level”, which doesn’t affect quality, since FLAC is always lossless. At higher “levels” the encoder does more computing, and takes longer, as it tries to make a smaller file. It’s usually fast, so there’s no reason to use a low level.

Thanks to you both. That makes a lot of sense. I think the key point is that mp3 is a compressed and lossy medium, so bit depth is kind of meaningless. That being the case, I’m still a little confused about Quality in Audacity. I am just exporting spoken words, so audio quality is not that important. My export default in Audacity is 24 kHz “Sample Rate” and 16 kb/s for “Quality”. Now, if Audacity “Quality” were really bitrate, that wouldn’t make a lot of sense. That would mean I’d be talking about on the order of one bit per sample.

Don’t forget it’s compressed, so the individual samples aren’t stored. When you play it, it’s decompressed and it will be 24,000 samples per second, with a bit depth to match your DAC (your soundcard/hardware).

Theoretically, you could compress a very-long silent file, or a “simple tone” with a few bytes but audio compression formats aren’t THAT good. In VBR mode, MP3 does make smaller files with silence or simpler sounds. And it uses fewer bytes for any silence between words.

How does it sound?

16kbps is “low quality”, but voice compresses more easily than music so it may be OK. If it sounds OK and you want a small file, fine. If you are starting out with a better quality (and better sounding) file you’ll probably hear a difference.

If you had a good-sounding music file you’d certainly hear the difference and you probably wouldn’t like it. Although on a phone-speaker or laptop speakers you’ll get limited sound quality in any case.

There are some special formats for voice that can make even smaller files but of course your software has to be able to play it. (I don’t know much about those formats.)

If by quality you are referring to the presets in Audacity, I think of them as kind of arbitrary, but they were borrowed from the LAME documentation.

The issue isn’t what gives me the best quality. I KNOW that 24 kHz “Sample Rate” and 16 kb/s for “Quality” works fine for me, since my recordings are voice. I’m just trying to understand how you can ask for 16 kb/s data rate when the sample rate is 24 kHz. That would be about one bit per sample. Compression can’t be that good.

With MP3 we are not talking about a one-to-one bit comparison. Maybe this page will help.

Well, it doesn’t really help. I use 24 kHz “Sample Rate” and 16 kb/s for “Quality”, and it works fine for voice. But if you sample 24k times per second, and you spit out 16k bits/second, you’re spitting out less than one bit per sample. I’m just not sure how any compression can manage that.

Like I said, there are no “samples” in the MP3 compressed file. The samples come-back when it’s played or otherwise decompressed, But since it’s lossy you’re not getting the exact-same samples back.

You can use a converter program to decompress the file to 32-bit, 192kHz WAV, or any other format.

If you re-open the file in Audacity you’ll get the usual Audacity default of 32-bit floating point, at 24kHz, because MP3 “remembers” the original sample rate and Audacity (and most decoders/decompressors) will use that as the default.

I could say “play a 1kHz tone for one hour” and that just takes a few bytes so that could be considered a simple type of compression for simply-defined audio.

Have you ever used ZIP compression? Normally every text character requires one byte but with ZIP compression you can compress a simple text file and there can be more characters than bytes. (Word processor formats have more overhead but “zipping” still makes a smaller file than the original format.)

ZIP is a completely different algorithm and it’s lossless.
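The ZIP analogy is easy to demonstrate with Python’s standard-library zlib (the DEFLATE algorithm that ZIP uses). This is a sketch, not a claim about what ZIP achieves on real documents:

```python
import zlib

# Lossless compression demo: repetitive text shrinks dramatically and
# decompresses back to exactly the original (unlike lossy MP3).
text = b"the quick brown fox " * 200   # 4000 bytes of repetitive text
packed = zlib.compress(text)

print(len(text), len(packed))          # 4000 vs a few dozen bytes
print(zlib.decompress(packed) == text) # True — bit-for-bit identical
```

The key contrast with MP3: here the round trip is exact, whereas a lossy codec gives you back something that only sounds like the original.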

There was once a really good website about how MP3 works but it’s gone. :frowning: I didn’t understand it completely… It’s a super-complicated algorithm with lots of “parts”.

Thanks. That does make sense. Though Mac Finder does tell me the “Sample Rate” when it looks at an mp3 file. Not sure what it’s measuring. And when a conversion is being done, the input must be sampled at a certain rate. I use zip compression all the time, but if I’m lucky, I only get a factor of two compression for plain text.

I should add this. I also have the Switch Sound File Converter app, and when I use it (set to bit rate 16 kb/s, no allowance for sample rate) it produces a file that is half the size of the one from Audacity export (set to bit rate 16 kb/s AND sample rate 24 kHz), and the audio is substantially worse. So the compression seems to depend on a lot more than the resultant bit rate. Not clear how to get them to do the same thing.

I probably should have stopped at my first post, but here we are. In for a penny, in for a pound.

I found a web page that describes Run Length Encoding. It is a different means of compression than MP3, but it is easier to understand. It is a good example of how compressed data looks nothing like the uncompressed data until it is decoded.
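Run Length Encoding is simple enough to write out in full. This toy version (function names made up) shows the point made above: the encoded data looks nothing like the original until it is decoded.

```python
# Toy run-length encoder: "aaaabbc" -> [('a', 4), ('b', 2), ('c', 1)].
def rle_encode(s):
    runs = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((ch, 1))              # start a new run
    return runs

def rle_decode(runs):
    return "".join(ch * n for ch, n in runs)

data = "aaaaaaaabbbcccccccc"
encoded = rle_encode(data)
print(encoded)                      # [('a', 8), ('b', 3), ('c', 8)]
print(rle_decode(encoded) == data)  # True — lossless round trip
```

MP3 is vastly more complicated, but the same principle applies: the stored data is a transformed description of the audio, not the samples themselves.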

LAME can re-sample before compressing, depending on the settings. Then it “remembers”, so that’s normally what you get back when you decode.

There are LOTS of LAME settings but Audacity doesn’t allow access, or easy access, to most of them:
LAME command line usage

I assume the developers know what they are doing so I don’t mess with the defaults. But I do have a Windows LAME utility with a GUI interface that makes it easy to make an MP3 with any setting you want.

MediaInfoOnline should give you the TRUE bitrate of both files that are SUPPOSED to have the same bitrate. Assuming no embedded artwork, the bitrate is directly related to file size. 24kbps is 3 kilobytes per second… Multiply the time in seconds by 3 and you have the file size in kB. Or divide to get the true bitrate.
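That size-from-bitrate rule is just the kbps-to-kB division from earlier in the thread (sketch; the helper name is made up, and metadata/artwork overhead is ignored):

```python
# Estimated size of a constant-bitrate file: (kbps * 1000 / 8) bytes/s * seconds.
def cbr_size_bytes(bitrate_kbps, seconds):
    return bitrate_kbps * 1000 // 8 * seconds

one_hour = cbr_size_bytes(24, 3600)
print(one_hour / 1_000_000)  # 10.8 MB for an hour at 24 kbps
```

Running it backwards (file size / duration × 8) is exactly how you recover the “true” bitrate of a suspect file.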

LAME has a VBR (variable bitrate) option where you choose a quality setting (0–9, with 0 being the best) and the encoder adjusts the bitrate moment-to-moment depending on how complicated the sound is. (Silent parts use a lower bitrate, etc.)

There is also ABR (average bitrate), where you give the overall bitrate and, again, it uses a higher bitrate when needed and a lower bitrate on easier-to-compress sounds.

I assume your file is not stereo, but another option Audacity uses is Joint Stereo. Joint stereo takes advantage of information that’s common to both channels and only encodes that information once. That makes better use of the limited bits, giving better quality than the regular stereo option.

That is handy, and true for Audacity. Of course, if you’re using another bitrate, it scales accordingly. But as noted, that rule doesn’t work with Switch. VERY useful about MediaInfoOnline! Thank you.

This is how bitrate is mathematically defined so maybe there’s something wrong with Switch.

Audio/Video files are the same except the calculation will give you the total combined audio & video bitrates.