It’s because the MP3 format does not specify encoder / decoder delay.
When audio is encoded to MP3, it is processed in “frames” (fixed-size chunks of audio data). The first frame must include the start of the audio, and the final frame must include the end of the audio, but the format does not specify how much delay there is in the first frame before the start of the audio, or how much space there is in the final frame between the end of the audio and the end of the frame. Consequently, MP3s have a bit of “padding” at both ends of the file.
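To put rough numbers on that, here is a small Python sketch of the arithmetic. It is a simplified model, not what any particular encoder is guaranteed to do: it assumes 1152 samples per frame (MPEG-1 Layer III) and an encoder delay of 576 samples, which is only an illustrative value. Real encoders may use a different delay or append extra frames, and the decoder adds its own delay on top.

```python
import math

SAMPLES_PER_FRAME = 1152  # frame size for MPEG-1 Layer III


def mp3_padding(n_samples, encoder_delay=576):
    """Simplified model of MP3 padding.

    encoder_delay is an assumed, illustrative value; the MP3 format
    itself does not define it. Returns (frames, leading_pad, trailing_pad).
    """
    # The audio is pushed back by the encoder delay, then the total is
    # rounded up to a whole number of frames.
    frames = math.ceil((n_samples + encoder_delay) / SAMPLES_PER_FRAME)
    trailing_pad = frames * SAMPLES_PER_FRAME - encoder_delay - n_samples
    return frames, encoder_delay, trailing_pad


# One second of audio at 44100 Hz:
print(mp3_padding(44100))  # (39, 576, 252)
```

So in this model a clip of exactly 44100 samples comes back as 44928 samples, with the extra samples split unevenly between the start and the end.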
The overall encoder/decoder delay is not defined, which means there is no official provision for gapless playback. However, some encoders such as LAME can attach additional metadata that lets players that support it deliver seamless playback.
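Conceptually, a player that understands this metadata just drops the padded samples after decoding. A minimal Python sketch of that idea follows. The function name and the `start_delay` / `end_padding` parameters are hypothetical, and reading the actual values from the LAME metadata is not shown here.

```python
def trim_padding(decoded, start_delay, end_padding):
    """Drop padding from a sequence of decoded samples.

    start_delay and end_padding are sample counts that would come from
    encoder metadata (hypothetical inputs; parsing is not shown).
    """
    if start_delay < 0 or end_padding < 0:
        raise ValueError("delay and padding must be non-negative")
    if start_delay + end_padding > len(decoded):
        raise ValueError("padding is longer than the decoded audio")
    return decoded[start_delay:len(decoded) - end_padding]


# Three padded samples at the start, two at the end:
decoded = [0, 0, 0, 1, 2, 3, 4, 0, 0]
print(trim_padding(decoded, start_delay=3, end_padding=2))  # [1, 2, 3, 4]
```

Without that metadata there is no reliable way to know how many samples to drop, which is why the padding stays in place.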
Some older versions of Audacity used the metadata from LAME to automatically trim off the padding at the ends (you can observe this in Audacity 2.4.2), but in Audacity 3.2.0 the MP3 decoder was changed and this no longer happens, so the padding is still present at the start / end of the track when imported. I logged this as a regression back in October: https://github.com/audacity/audacity/issues/3778
Even after many years, I learned something new about MP3. Would I face the same problem if I used the WAV format?
My use case is that I have a set of sounds that I stitch together, with some silent parts in between. The total length needs to be below a specific threshold. For the silent parts, I assumed a fixed length, which of course doesn’t fit, as we can see.
Getting the correct length is, of course, possible, but the silent pauses are not exactly the length I need.
I can access the frames of my audio. Would it solve the problem if I skipped the first and last frames when stitching things together?
If I open the files in Audacity, trim them to the exact length, and save them again, does that only change the padding, or will that shorten the audio content? I’m wondering how anyone can work with a set of audio files that are not precisely the length as generated/expected.
No, the issue with padding only applies to some compressed formats, most notably MP3. It doesn’t happen with lossless formats like WAV, AIFF or FLAC, and it doesn’t happen with some more modern compressed formats such as Ogg Vorbis or Opus.
In an ideal world, no-one would ever use MP3 during production.
MP3 is a convenient “delivery format” because of the relatively small file size and because it has been around for so long that it is very widely supported, but it’s a terrible format to work with. Best to work with lossless formats (such as WAV) from start to end, and only convert to MP3 after saving a high quality archive / backup copy of the work in a lossless format.
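For the stitching use case, a rough sketch of doing the whole job in WAV with Python's standard `wave` module could look like this. The function name `stitch_wavs` and its parameters are made up for illustration, and it assumes all clips are uncompressed PCM WAV files with the same sample rate, sample width and channel count.

```python
import wave


def stitch_wavs(paths, gap_seconds, out_path, max_seconds):
    """Join WAV clips with exact silent gaps; return total duration in seconds.

    Hypothetical helper: assumes PCM WAV clips that all share one format.
    """
    with wave.open(paths[0], "rb") as first:
        fmt = (first.getnchannels(), first.getsampwidth(), first.getframerate())
    channels, width, rate = fmt

    gap_frames = round(gap_seconds * rate)
    # 8-bit WAV is unsigned (silence is 0x80); wider samples are signed (0x00).
    silence_byte = b"\x80" if width == 1 else b"\x00"
    silence = silence_byte * (gap_frames * width * channels)

    chunks = []
    total_frames = 0
    for i, path in enumerate(paths):
        with wave.open(path, "rb") as clip:
            clip_fmt = (clip.getnchannels(), clip.getsampwidth(), clip.getframerate())
            if clip_fmt != fmt:
                raise ValueError(f"{path} has a different format: {clip_fmt}")
            n = clip.getnframes()
            chunks.append(clip.readframes(n))
            total_frames += n
        if i < len(paths) - 1:  # silence between clips, not after the last one
            chunks.append(silence)
            total_frames += gap_frames

    duration = total_frames / rate
    if duration > max_seconds:
        raise ValueError(f"total length {duration:.3f}s exceeds {max_seconds}s")

    with wave.open(out_path, "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(width)
        out.setframerate(rate)
        out.writeframes(b"".join(chunks))
    return duration
```

Because WAV stores an exact sample count, the clip lengths and the silent gaps add up to the sample, so the threshold check is reliable. The result would then be converted to MP3 once, as the final step.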