I don't know if any of this will help...
There is ALWAYS recording latency plus playback latency. The buffers (and the delays that come with them) are required for "smooth" audio in and out on a multitasking operating system. Apart from the added delay, big buffers are generally a good thing: a bigger buffer can often prevent "glitches" (dropouts) in the audio.
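To put rough numbers on it, the delay added by one buffer is just its size in sample frames divided by the sample rate. Here's a minimal Python sketch of that arithmetic. The function name and the buffer sizes are my own picks for illustration, and a real system stacks several buffers (driver, OS, application), so treat the result as a lower bound rather than a measurement.

```python
def buffer_latency_ms(buffer_frames: int, sample_rate_hz: int, num_buffers: int = 1) -> float:
    """Delay contributed by num_buffers buffers of buffer_frames sample frames each."""
    return 1000.0 * buffer_frames * num_buffers / sample_rate_hz


# Illustrative buffer sizes only; your driver or host may offer different ones.
for frames in (64, 256, 1024, 4096):
    print(f"{frames:5d} frames at 44100 Hz: {buffer_latency_ms(frames, 44100):.1f} ms")
```

Bigger buffers mean more milliseconds of delay, but they also give the operating system more slack before a dropout occurs.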
Latency is constant (assuming you don't change the hardware or software setup). Normally latency can be compensated for automatically, but since it's constant throughout the recording it's no big deal if you have to adjust for it manually.
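Since the offset is constant, compensating manually just means sliding the recorded track earlier by a fixed number of samples. A rough sketch, assuming you have already measured the round-trip latency yourself (the 10 ms figure and the function names are made up for the example, and this is not how Audacity does it internally):

```python
def latency_to_samples(latency_ms: float, sample_rate_hz: int) -> int:
    """Convert a measured latency in milliseconds to a whole number of samples."""
    return round(latency_ms * sample_rate_hz / 1000)


def compensate(recorded: list, latency_samples: int) -> list:
    """Shift a recorded track earlier by dropping its first latency_samples samples."""
    return recorded[latency_samples:]


# Hypothetical measurement: 10 ms of round-trip latency at 44.1 kHz.
offset = latency_to_samples(10, 44100)
print(offset, "samples to trim from the start of the new recording")
```

The same offset applies to the whole recording, which is why a single adjustment is enough.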
Sometimes there are several minutes, or days or weeks, of "latency" between the time you record and the time you play back, so a few extra milliseconds are no problem!
Latency can be a BIG problem if you are monitoring yourself through the computer with headphones, because the delay makes it difficult to perform. When the "live monitoring" path goes through the computer, users try all kinds of "tricks" and adjustments to minimize latency. If the performer doesn't need to monitor himself/herself through the computer, latency is simply NOT an issue to worry about. (Again, the latency from a backing track coming from the computer can be compensated for.)
Some effects (such as reverb) will sometimes introduce a delay of their own, and you may run across something called "Plug-in Delay Compensation". That's used to keep tracks in sync when different effects are applied to different tracks.
------------------
MP3 and MP4/AAC are "lossy" compression formats... The audio data is altered... When you compress, a small amount of silence is added to the beginning (and maybe the end). That's a side effect of these formats.
WAV is lossless. If you open/import a WAV file into Audacity and export to WAV (with the same settings), NOTHING WILL CHANGE and you'll have an exact copy of the original. Even if you change the WAV settings or make some editing changes to the file, the timing will NOT change unless the edits themselves change the timing or length of the file (cutting, splicing, changing the speed, etc.).
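If you want to convince yourself of the round trip outside Audacity, here's a small sketch using Python's standard `wave` module. It writes a few 16-bit samples to an in-memory WAV, reads them back, and writes them again with the same settings. The sample values and helper names are arbitrary choices for the example, and it only demonstrates uncompressed PCM with unchanged settings.

```python
import io
import struct
import wave


def write_wav(frames: bytes, sample_rate: int = 44100, channels: int = 1, sample_width: int = 2) -> bytes:
    """Encode raw PCM frames as a complete WAV file held in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)
        w.setframerate(sample_rate)
        w.writeframes(frames)
    return buf.getvalue()


def read_wav(data: bytes) -> bytes:
    """Return the raw PCM frames stored in an in-memory WAV file."""
    with wave.open(io.BytesIO(data), "rb") as r:
        return r.readframes(r.getnframes())


# Four arbitrary 16-bit little-endian samples.
samples = struct.pack("<4h", 0, 1000, -1000, 32767)
first = write_wav(samples)
second = write_wav(read_wav(first))
print("identical after round trip:", first == second)
```

Nothing is added at the start or end, so the timing of the audio is untouched.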
Lossless compression (FLAC, etc.) results in the exact original data after decompression.
------------------
I'm trying to send this to a friend so he can add a part in Audacity.
There is another potential issue that can come up when mixing and matching different hardware. The clocks (oscillators) in two different soundcards always run at slightly different rates... Just like two clocks on the wall will always run at slightly different speeds.
You can end up with a situation where the two tracks start out in sync but slowly drift apart.
Sometimes "consumer" soundcards can be quite bad, causing the pitch to be off and/or causing the tracks to be out of sync by the end of a song. Sometimes you'll see this problem when someone is recording with a good quality USB mic while monitoring a backing track on a consumer soundcard. The USB mic has its own clock, and when mixed the two tracks can be mismatched.
Good audio interfaces usually have more precise clocks, and some regular soundcards are just fine (or plenty good enough). Again like clocks on the wall, you shouldn't get a noticeable time difference during a 3-minute song... But some cheap soundcards are that horrible!
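To get a feel for the scale, you can estimate the drift from the relative clock error, usually quoted in parts per million (ppm). The ppm figures below are invented for illustration, not measurements of any particular soundcard, and the function names are mine.

```python
import math


def drift_ms(duration_s: float, clock_error_ppm: float) -> float:
    """Timing offset accumulated over duration_s between two clocks that differ by clock_error_ppm."""
    return duration_s * clock_error_ppm / 1000.0


def pitch_error_cents(clock_error_ppm: float) -> float:
    """Pitch shift, in cents, caused by playing back at a rate off by clock_error_ppm."""
    return 1200.0 * math.log2(1.0 + clock_error_ppm / 1e6)


# Hypothetical clock errors for a 3-minute (180 s) song.
for ppm in (10, 50, 1000, 10000):
    print(f"{ppm:6d} ppm: {drift_ms(180, ppm):7.1f} ms drift, {pitch_error_cents(ppm):6.2f} cents")
```

A small error stays within a few milliseconds over a song, while a badly off clock can put the tracks audibly out of step by the end.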
Pros use a "master clock" (and interfaces with master clock inputs). With everything running off the same clock, everything stays exactly synchronized.