Time mismatch in exported file


I use Audacity to clean up and enhance the audio for my tutorial videos. My videos are usually no longer than 20 minutes, and this is the first time I have processed one that is 50 minutes long. I think the problem has happened before, though, and I just didn’t notice.

Here is my workflow:

I record the videos in OBS, as mkv files with MPEG AAC (mp4a) audio, 44100 Hz, 32 bit.
I use FFMPEG scripts to cut and convert to .ts files (now the audio is ADTS, 44100 Hz, 32 bit). The .ts files are then concatenated via FFMPEG into a .mp4 file with MPEG AAC audio, still 44100 Hz, 32 bit.
I import the concatenated video to Audacity.
I use a few effects (noise reduction, compressor, filter curve), and adjust volumes manually - neither should change the timing.
I export the audio to .ogg (I also tried other formats, such as WAV and mp4a), still 44100 Hz, 32 bit.
I use FFMPEG to combine the concatenated file’s video with the new audio file into a single video file (MPEG AAC audio, still 44100 Hz, 32 bit). Toward the end of that video, the audio becomes unsynchronized.
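
One way to check each file in this workflow is to compare the audio and video stream durations that ffprobe reports. The following is only a minimal sketch, assuming ffprobe is installed and on the PATH; the helper names `probe_streams`, `to_float` and `audio_video_gap` are made up for illustration. Some containers do not store per-stream durations, in which case ffprobe may report "N/A" and the gap comes back as None.

```python
import json
import subprocess


def probe_streams(path):
    """Run ffprobe on `path` and return its per-stream metadata (list of dicts)."""
    cmd = [
        "ffprobe", "-v", "error",
        "-show_entries",
        "stream=index,codec_type,codec_name,sample_rate,start_time,duration",
        "-of", "json",
        path,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)["streams"]


def to_float(value):
    """Convert an ffprobe string field to float; return None for missing or 'N/A'."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def audio_video_gap(streams):
    """Return (first audio duration - first video duration) in seconds, or None."""
    durations = {}
    for stream in streams:
        duration = to_float(stream.get("duration"))
        if duration is not None:
            # Keep only the first stream of each type.
            durations.setdefault(stream.get("codec_type"), duration)
    if "audio" in durations and "video" in durations:
        return durations["audio"] - durations["video"]
    return None
```

For example, `audio_video_gap(probe_streams("concatenated.mp4"))` (the file name is a placeholder) would give the gap for the concatenated file; running the same call on the final muxed file shows whether the gap changes between steps.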

To figure out what’s going on, I imported the video and the Audacity-exported audio into Blender (video editor mode). When I click “show waveform”, the audio from the video and the audio file seem to line up perfectly. But, upon closer inspection, the audio file is shorter by a whopping 0.0121%. That is a small number, but over a 50-minute video it amounts to about a third of a second. And if I move my cursor somewhere in the file and press play, the two audio tracks are clearly out of sync; the further along the timeline I am, the bigger the mismatch.
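
The arithmetic behind that figure, as a small Python sketch. It assumes the mismatch is a constant ratio over the whole file, which is what the steadily growing lag suggests but is not confirmed; `drift_seconds` is just an illustrative helper name.

```python
def drift_seconds(position_s, mismatch_percent):
    """Accumulated offset (in seconds) at a given timeline position,
    assuming the length mismatch is a constant percentage."""
    return position_s * mismatch_percent / 100.0


MISMATCH_PERCENT = 0.0121  # figure measured in Blender

# Offset after 10, 25 and 50 minutes of playback.
for minutes in (10, 25, 50):
    offset = drift_seconds(minutes * 60, MISMATCH_PERCENT)
    print(f"{minutes:2d} min: {offset:.3f} s")
```

At 50 minutes (3000 s) this gives 0.363 s, which matches the "third of a second" above.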

This is very fishy, because visually the waveforms of the two tracks are displayed as identical in Blender. If I add the FFMPEG final result video to the timeline in Blender, I can see that the two videos match frame by frame, and the audio track from the new video matches the audio from the exported file perfectly. Neither matches the audio track from the original video. Based on this, I don’t think FFMPEG is the culprit here.

For further testing, I imported the three files into Audacity: the original video, the Audacity-exported audio, and the FFMPEG final result video. The only difference is that each video seems to have a tad more silence added at the beginning, but the offset between the three files does not change by the end of the video. So, according to Audacity, all files are fine. However, when I watch the video, there is a worsening time lag between me talking and my mouth moving.

Could you please suggest further troubleshooting steps to find out where, why, and how things go wrong?

Big thanks!

EDIT: This just in: in Blender, the faulty waveform visualization is actually coming from the original file’s audio track. So maybe it’s a problem with the import into Audacity?

Are you able to devise a test in which you can predict the exact length that the mp4 file will be?
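
A rough sketch of what such a test could look like in Python: add up the lengths of the cut segments to predict the total, then compare the prediction with the duration actually measured for the mp4. The segment lengths and the measured value below are hypothetical placeholders, not real measurements, and the helper names are made up for illustration.

```python
SAMPLE_RATE = 44100  # Hz, as stated in the workflow


def expected_samples(segment_seconds, sample_rate=SAMPLE_RATE):
    """Predicted number of audio samples for the concatenated segments."""
    return sum(round(seconds * sample_rate) for seconds in segment_seconds)


def mismatch_percent(expected_s, measured_s):
    """Relative difference between measured and expected duration, in percent."""
    return (measured_s - expected_s) / expected_s * 100.0


# Hypothetical segment lengths in seconds (three cuts adding up to 50 minutes).
segments = [600.0, 900.0, 1500.0]

samples = expected_samples(segments)
expected_s = samples / SAMPLE_RATE

# Hypothetical measured duration of the file under test, in seconds.
measured_s = 2999.637

print(f"expected: {samples} samples = {expected_s:.3f} s")
print(f"mismatch: {mismatch_percent(expected_s, measured_s):+.4f} %")
```

If the prediction and the measurement agree for the concatenated mp4 but not for the exported audio (or the other way round), that narrows down which step changes the length.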