So I export 32-bit audio (MP3) and then save it into an MKV. When I drag the MKV into Audacity, it picks it up as a 16-bit file. However, if I use MKVExtract to extract the audio from the MKV and open that in Audacity, it comes out as 32-bit.
Internally, Audacity always works with 32 bit floating numbers for precision. If you import a 16 bit audio file, it will be converted to 32 bit float. Even 8 bit audio will be converted to 32 bit on import.
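To illustrate why going from 16-bit integer to 32-bit float loses nothing, here is a minimal sketch in plain Python. The scale factor of 32768 is an assumption (a common convention), not necessarily what Audacity uses internally, and the function names are made up for this example.

```python
import struct

def int16_to_float32(sample: int) -> float:
    """Scale a signed 16-bit sample into [-1.0, 1.0) and round it to float32."""
    value = sample / 32768.0  # assumed scale factor, see note above
    # Pack and unpack as a 4-byte float to emulate 32-bit float storage.
    return struct.unpack("<f", struct.pack("<f", value))[0]

def float32_to_int16(value: float) -> int:
    """Inverse of the conversion above."""
    return int(round(value * 32768.0))

# Every possible 16-bit value survives the round trip unchanged.
round_trip_ok = all(
    float32_to_int16(int16_to_float32(s)) == s for s in range(-32768, 32768)
)
print(round_trip_ok)
```

A 32-bit float has a 24-bit significand, so each 16-bit integer value fits exactly. The conversion adds headroom and precision for later processing, but no new information.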
When exporting, you can choose the bit depth. 16 bit is sufficient, unless you want to work with the audio in another DAW. In that case, choose 24 bit.
An MP3 is always 16 bit max. MP3 is a compressed, lossy format, so some samples might be a lot less than 16 bit. You can’t choose the bit depth. The compressor chooses that for you, according to the content of the sample.
WAV is an uncompressed format. It can be 16, 24 or 32 bit. 32 bit is rarely used as an export format, as it creates huge files and is only needed when going from one DAW to another, e.g. for mastering.
FLAC is a compressed, lossless format and will take any bit depth and resolution you throw at it. There are others, like Wavpack.
The import behaviour depends on the importer. You are using FFmpeg to import MKV, and for MP3 content, that means the MP3 audio in the MKV will import as 16-bit resolution PCM audio.
MP3’s are a 32-bit float stream as I understand it. The author of the libmad MPEG decoder library told me so. Regardless, most applications will just report MP3’s as “16-bit”. What bit depth they are decoded at depends on the application playing it. Libmad provides 24-bit PCM output.
While that is technically true, Gale, it has nothing to do with the audio. It’s the stream that is transported in 32 bit morsels.
That’s why most programs will report it as 16 bit, although it is really 15 bits. The 32 bits in an MP3, are data bits, not audio samples. MP3 uses Huffman symbols to store audio. As such, there is no “bit depth” in an MP3. The frames an MP3 is holding, can not be individually decoded, as some information may reside in previous frames.
Unfortunately, the only site I know that has some docs about it, xingtech.com, seems down for me at the moment. Unless you want to go to Fraunhofer, and pay $$$ to read the original specs. There’s some good info here, though:
That kind of confusion is also why I don’t applaud it when Audacity imports a 16 bit audio file and shows 32 bit in the left window pane. In my mind, it should report the original bit depth, as the conversion only adds zeroes. Too many people look at the numbers without understanding and suppose there’s some magic that adds resolution.
The bit-depth code I’ve provided at another place does usually report 29 bit for an imported MP3 file. This corresponds to a bit depth of 14.5 bit which is in accordance with the specs.
Btw, single frames can be read if the bit reservoir had been set to false during encoding.
Robert
I feel with you.
The most efficient compression algorithms work with fractional bits, such as https://en.wikipedia.org/wiki/Arithmetic_coding
However, the reason for the MP3 oddity is based on the spectral representation/quantization, rather than the coding itself.
Robert
I am not totally clear about “corresponds to”, Robert. Is that about the capability of the decoder, which would often be 16-bit?
It seems ambiguous to me to call MP3’s 16-bit. Given the “bit depth” of the audio samples isn’t constant, how would you describe their “bit depth”, and is it float or integer? What provides the ability of MP3 to store audio above 0 dB?
The samples are stored as quantized frequency bins after going through a modified Discrete Cosine Transform. The whole compression takes place in this domain.
There, we don’t have the +/-1 restriction as in the time domain. The peak can therefore go over 0 dB once the samples are transformed back.
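A rough way to see this effect, as a sketch only: the code below uses a naive DFT instead of MP3's MDCT, and simply discards the high bins instead of quantizing them. The signal, the bin count and the `KEEP` value are arbitrary choices for illustration. A full-scale square wave goes in, and the "decoded" signal peaks above full scale.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform (fine for a short demo signal)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

N = 64
square = [1.0] * (N // 2) + [-1.0] * (N // 2)  # peaks exactly at full scale

spectrum = dft(square)
KEEP = 3  # highest frequency bin kept; everything above is thrown away
lossy = [c if (k <= KEEP or k >= N - KEEP) else 0
         for k, c in enumerate(spectrum)]
decoded = idft(lossy)

original_peak = max(abs(s) for s in square)
decoded_peak = max(abs(s) for s in decoded)
print(original_peak, decoded_peak)
```

The original never exceeds 1.0, yet the reconstruction does, because nothing in the frequency domain constrains the time-domain peak. A float track keeps that overshoot, while an integer track has to clip it.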
It is nigh impossible to relate the different transforms in a 1:1 manner to the impact on bit depth. I can only provide the empirical observation that the audio has 29 bit after decoding. Or in other words, 3 bits of the possible 32-bit float number aren’t used. I have no idea what happens in the final decoding state, whether zero padding is applied or whatever. There are cleverer guys than me out there that can do the reverse engineering if they want to.
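For anyone who wants to experiment, one simple way to estimate how many bits a set of float samples actually occupies is sketched below. This is not the bit-depth code mentioned above, only an assumed approach: find the smallest power-of-two scale at which every sample becomes a whole number. The function name is hypothetical.

```python
def fractional_bits_used(samples, max_bits=32):
    """Return the smallest n such that every sample * 2**n is a whole number.

    Returns None if no n up to max_bits works.
    """
    for n in range(max_bits + 1):
        scale = 2 ** n
        if all((s * scale).is_integer() for s in samples):
            return n
    return None

# Samples that came from 16-bit integers (scaled by 1/32768) need 15
# fractional bits, plus one sign bit, which gives the familiar 16.
from_16_bit = [1 / 32768, 12345 / 32768, -0.5]
print(fractional_bits_used(from_16_bit))
```

On a track imported from a 16-bit file this kind of test finds nothing below the 15th fractional bit, which is the "conversion only adds zeroes" point made earlier in the thread.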
I’m waiting for a lossless 32-bit float compression format. That would be beneficial for Audacity as well.
So what I got from this discussion is that MP3s are lossy and always 16-bit maximum bit depth anyway. But my whole point is: when importing an .mp3 outside of a .mkv, it imports at 32-bit, but not when dragging the .mkv itself into Audacity, where the MP3 is imported via FFmpeg.
Why is it this way? You guys are kind of telling me what I already knew from the OP (that it imports at 32 bit) which it’s not doing unless the .mp3 is extracted. In any case I don’t think it would be wise to change the bit depth of the .MP3 from a .mkv import to 32 bit…don’t think that makes sense if the Importer for MKVs is screwy and importing only at 16 bit.
There are significant benefits to working in 32-bit float format.
1) Processing is more precise (better quality)
2) Processing does not produce quantization or dither noise (better quality)
3) If the peak level during processing exceeds 0 dB, the audio is not clipped.
4) If the peak level after processing exceeds 0 dB, the audio is not clipped and may be brought back below 0 dB with Amplify or Normalize without damage.
and a less significant benefit.
5) A properly decoded MP3 can go a little over 0 dB, which will be clipped (very slightly) if decoded to an integer format.
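The clipping points in the list above can be sketched in a few lines of Python. The gain of 4 and the scale factor of 32767 are arbitrary choices for the example, and `clip_int16` is a made-up helper, not an Audacity function.

```python
def clip_int16(x: float) -> int:
    """Convert a float sample to a 16-bit integer, clipping at full scale."""
    scaled = int(round(x * 32767))
    return max(-32768, min(32767, scaled))

peak = 0.5
gain = 4.0
boosted = peak * gain  # 2.0, well above 0 dB full scale

# Float path: the over-range value is kept, so attenuating restores it.
float_restored = boosted / gain

# Integer path: the value is clipped when stored, so attenuating afterwards
# cannot bring back what was lost.
int_stored = clip_int16(boosted)
int_restored = (int_stored / 32767) / gain

print(float_restored, int_restored)
```

The float sample comes back at 0.5, while the integer one comes back at 0.25 because it was flattened to full scale before the gain was undone.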
Audacity’s built-in MP3 decoder is able to decode to 32-bit float format, so when the Quality Preferences are set to 32-bit float (recommended), that’s what it will do.
I’ve not tested, but sounds like a limitation of FFmpeg import that your mkv files are imported as 16-bit. If you intend to process the files (not just cut/copy/paste type editing), I would recommend that you manually convert to 32-bit float format immediately after importing. If you only intend to do cut/copy/paste/delete type editing, then it does not matter - just leave it as 16-bit.
According to http://www.underbit.com/products/mad/ libmad (Audacity’s built-in MPEG decoder) has “24-bit PCM” output. So is it not Audacity’s choice to expand that to 32-bit float?
Not probably. I’ve already said that using FFmpeg is why the MKV containing MP3 audio is imported as 16-bit. Audacity chooses not to expand formats to 32-bit float that FFmpeg decodes as 16-bit.
There is a simple proof. With Audacity’s Default Sample Format set to 32-bit float, set the “Files of type” filter in the import window as “MP3 Files”, and the MP3 imports as 32-bit float. Set the filter to “FFmpeg-compatible files” and the MP3 imports as 16-bit.
Thanks, Robert. So I read from that, that “final decoding” means playback, and it remains ambiguous for a media inspection utility to describe MP3 as “16-bit”.
That doesn’t happen here.
I get 32-bit float tracks in Audacity when importing MP3s regardless of whether I set the filter to “FFmpeg-compatible files” or not.
I’m using Audacity 2.1.3 alpha, compiled with “--disable-dynamic-loading” on Debian Linux.
I was describing what happens using FFmpeg 2.2.2 on Windows, which is (presumably) the system that TheLastOfUs is using, given (s)he posted in the Windows board.
I believe the behaviour is dependent on FFmpeg version.
After Windows versions of Audacity upgraded from FFmpeg 0.6.2 to the current 2.2.2, some formats imported via FFmpeg that used to import as 16-bit changed to importing as 32-bit float, if Audacity’s Default Sample Format was set to 32-bit float.
I assume you are using system FFmpeg on Debian which will be later than 2.2.2? On Ubuntu 14.04 or 16.04 I build Audacity linked to a self-built FFmpeg 2.2.3, which is recommended given Audacity does not officially support greater than FFmpeg 2.3.x. For me on Ubuntu 14.04 or 16.04, MP3’s imported with “FFmpeg files” import as 16-bit.
I can concur with Gale that when I have file type set on import to FFMPEG Compatible files on Windows it imports it only as 16-bit. When I change this to “All files” it goes to 32-bit float on import.
It definitely seems to be an issue with Windows Audacity + FFMPEG Importing. Audacity 2.1.0 fyi.