bit rate/sample rate for spoken word

DougL · September 23, 2021, 3:15pm

I am offering mp3 audio from a seminar (spoken word) with Big Sur and Audacity 3.0.4, and I’d like to be maximally economical on file size. Not entirely clear how to decide on which bit rate/quality and sample rates to use. 16 kbps/constant “quality” and 8 kHz sample rate yield reasonable audio (with a very slight echo) and OK file size. How do I choose these two quantities to get what I want? I used to use Switch, which asked for just bitrate. I used 16 kbps on that, and the results were very nice. Why do I need two numbers to specify bandwidth?

steve · September 23, 2021, 3:43pm

Does it have to be MP3? You could achieve better quality and smaller file size with Opus, but Opus is not supported by some audio players.

MP3 has lots of options for optimising according to need (https://svn.code.sf.net/p/lame/svn/trunk/lame/USAGE).

If the files will be played by an app / device that supports VBR, then VBR is likely to provide slightly better quality than CBR. Some apps / MP3 players have poor support for VBR, so if in doubt it is safer to use CBR.

For reasonable quality and good compatibility, I’d suggest:

Bit Rate Mode = Constant (CBR)
Quality = 32 kbps
Channel Mode = “Force Export to Mono”
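
For anyone scripting this outside Audacity, the three settings above map roughly onto LAME's command-line flags described in the USAGE file linked earlier. This is only a minimal sketch: it assumes a `lame` binary is installed, and the helper name `lame_command` and the file names are made up for illustration.

```python
import shlex

def lame_command(src, dst, kbps=32, resample_khz=None):
    """Build a LAME command line for mono, constant bit rate speech."""
    # --cbr  : constant bit rate
    # -b N   : bit rate in kbps
    # -m m   : mono output (stereo input is downmixed)
    cmd = ["lame", "--cbr", "-b", str(kbps), "-m", "m"]
    if resample_khz is not None:
        # --resample takes the output sample rate in kHz
        cmd += ["--resample", str(resample_khz)]
    cmd += [src, dst]
    return cmd

# Print the command instead of running it, so nothing is encoded by accident.
print(shlex.join(lame_command("seminar.wav", "seminar.mp3")))
print(shlex.join(lame_command("seminar.wav", "seminar.mp3", kbps=16, resample_khz=16)))
```

To actually run it, the returned list can be passed to `subprocess.run`.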

DougL · September 23, 2021, 5:50pm

OK, but when I do

Bit Rate Mode = Constant (CBR)
Quality = 32 kbps
Channel Mode = “Force Export to Mono”

I get somewhat higher quality than I need and larger files than I’d like. So I drop down to Quality = 16 kbps. That gives OK audio quality. But then Audacity says, gee, you need to choose another sample rate! The “project sample rate” of 48 kHz is too big! So I try 24K, 12K, and 8K for sample rates. Same file sizes come out for each of these, and they all sound the same. So what’s the deal with sample rates?

Also, when I do this with Switch (I’ve been using an old version, 1.50), specifying a 16 kbps (higher quality) bit rate, I get slightly better-sounding audio and a file that’s about 2/3 the size of what I get from Audacity at 16 kbps. How does that old Switch manage to do the conversion better for a given bitrate? Also, Switch never asks me or complains to me about sample rates. FWIW, the newer version of Switch (9.34) makes a 16 kbps file that is half as big as the one I get from Audacity at 16 kbps, with awful sound quality. Duh?

Why, when I use the SAME PARAMETERS for the conversion, do I get very different results with three different applications?

steve · September 23, 2021, 6:01pm

Be aware that some people find bad quality audio more annoying than others.

MP3 cannot encode 48 kHz or 44.1 kHz audio at less than 32 kbps. That’s why it is prompting you to select a different sample rate.
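
The restriction comes from the bit rate tables in the MP3 format: the high sample rates (32, 44.1 and 48 kHz) start at 32 kbps, while the lower sample rates go down to 8 kbps. Here is a rough sketch of that lookup. The tables are the standard Layer III ones as I understand them, the function name is made up, and a particular encoder may restrict the combinations further.

```python
# Bit rates (kbps) defined for MPEG Layer III, grouped by sample-rate family.
# Assumption: standard tables; individual encoders may allow fewer combinations.
HIGH_RATES = {
    "rates_hz": (32000, 44100, 48000),
    "kbps": (32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320),
}
LOW_RATES = {
    "rates_hz": (8000, 11025, 12000, 16000, 22050, 24000),
    "kbps": (8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160),
}

def sample_rates_for(kbps):
    """Return the sample rates (Hz) that the tables allow at this bit rate."""
    rates = []
    for family in (HIGH_RATES, LOW_RATES):
        if kbps in family["kbps"]:
            rates.extend(family["rates_hz"])
    return sorted(rates)

print(sample_rates_for(16))  # only the low sample rates
print(sample_rates_for(32))  # low and high sample rates
```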

The sample rate specifies the available audio frequency range. The highest possible audio frequency that digital audio can represent is half the sample rate. Thus, for a sample rate of 8000 Hz, the maximum possible audio frequency is 4000 Hz.

An upper frequency limit of 4000 Hz (a sample rate of 8000 Hz) will make speech sound very dull, and it may be difficult for listeners to distinguish “F” from “S” and “T” from “B”.
A sample rate of 16000 Hz supports audio frequencies up to 8000 Hz, which is sufficient for clear speech recordings.
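
The "half the sample rate" rule is easy to tabulate for the sample rates mentioned in this thread. A small sketch (the function name is just for illustration):

```python
def max_audio_frequency_hz(sample_rate_hz):
    """Highest audio frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2

for rate in (8000, 12000, 16000, 24000, 48000):
    limit = max_audio_frequency_hz(rate)
    print(f"{rate} Hz sample rate -> audio up to {limit:.0f} Hz")
```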

With a constant bit rate, the sample rate does not affect the size of an MP3. The size of the MP3 is determined by the “kbps” (kilobits per second) multiplied by the length of the recording.
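
As a back-of-the-envelope check, the size of a constant bit rate file can be estimated from the bit rate and the duration alone. A minimal sketch, ignoring the small overhead from tags and headers (the function name is just for illustration):

```python
def cbr_size_bytes(kbps, seconds):
    """Approximate size of a constant bit rate file (1 kbps = 1000 bits per second)."""
    return kbps * 1000 * seconds / 8

one_hour = 60 * 60
for kbps in (16, 32):
    megabytes = cbr_size_bytes(kbps, one_hour) / 1_000_000
    print(f"{kbps} kbps for one hour is about {megabytes:.1f} MB")
```

Note that the sample rate does not appear anywhere in the calculation, which is why 8 kHz, 12 kHz and 24 kHz exports at the same bit rate come out the same size.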

Perhaps Switch picks the sample rate for you when the sample rate is too high for the chosen bit rate. Audacity gives you the choice.

DougL · September 23, 2021, 9:08pm

Be aware that I never said that 16 kbps yields “bad quality audio”. Might be bad to you, but it isn’t to me and my customers, and that’s what counts.

I’m still gobstruck by the fact that a given bit rate yields widely different results from different apps.

But it’s useful to know that the final file size depends only on bit rate and not on sample rate. That’s what I would have guessed.