How do I best sync tracks with different, unknown tempo?

swaan · March 1, 2014, 4:20pm

In theory it should be easy, I know. One track is PAL, the other NTSC, so a tempo correction of 1 - (24/1.001)/25 = 4.095904096…% gets me to within a few seconds per 100 minutes. But I need exact sync. Is there a better way than trial and error? Perhaps the problem is that I cannot seem to enter more precise numbers into the Change Tempo tool (it rounds them off)?

Can’t I place markers on tracks that I want to be synced and let Audacity or some plugin figure out what the difference is and tempo+position-correct? DJ software can do that but not with sound effects tracks some 150min long.

How would you figure out the difference in Tempo?

Cheers,
Sven

Gale_Andrews · March 1, 2014, 5:40pm

Sliding Time Scale / Pitch Shift will be more accurate in giving the exact length from the exact percent change, but it will take a long time to process.

Perhaps you could do it in a couple of passes if the first pass in Time Scale still does not give you the exact length needed.

Gale

steve · March 1, 2014, 6:38pm

Use the “Change Speed” effect.
You need to correct both the tempo and the speed.
The “Change Speed” effect will not remember the full precision, but it will use the full precision if you type it in.

Gale_Andrews · March 1, 2014, 6:46pm

You know, I almost said that, but aren’t the two (PAL and NTSC) systems adjusted so the audio sounds at the same pitch despite the different speed? I don’t know - I’m asking.

For reference, do Change Tempo/Pitch and Sliding Time Scale also use (but not remember) full precision? This is quite important for very long tracks.

Gale

Gale_Andrews · March 1, 2014, 7:11pm

Answering myself, Change Tempo and Sliding Time Scale’s Tempo Change do seem to use (but not remember) the typed accuracy.

Gale

swaan · March 1, 2014, 8:10pm

OK, it looks like the best way is still Change Tempo, as both the NTSC and PAL streams are at their original pitch. If I changed the speed of the PAL version to match NTSC, the sound would have a lower (4%?) pitch. It looks like this kind of thing is still trial and error.
Even though I had the tempo matched spec-wise, it was still 900 ms per 90 min out of sync. Three more runs (at -0.001%) and I got an acceptable result. Maybe it was my fault because I only entered 4.0959% instead of 4.095904095904…% (the digits 095904 keep repeating).
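
For what it's worth, the effect of the rounding alone can be estimated in a few lines of Python. This is only a sketch: it assumes that a Change Tempo value of p percent scales the tempo by (1 + p/100), so that the output length is the input length divided by that factor, and the function name is made up for illustration.

```python
from fractions import Fraction

# Exact PAL -> NTSC-film tempo factor: (24 / 1.001) / 25 = 960/1001
EXACT_FACTOR = Fraction(24000, 1001) / 25


def drift_seconds(percent_entered, target_seconds):
    """Length error (s) from typing a rounded percent instead of the exact one.

    Assumption: output length = input length / (1 + percent / 100).
    """
    entered_factor = 1 + Fraction(percent_entered) / 100
    # PAL-speed source length that should stretch to exactly target_seconds
    source_seconds = target_seconds * EXACT_FACTOR
    return float(source_seconds / entered_factor - target_seconds)


# 90 minutes of NTSC-length audio, tempo change typed as -4.0959 %
drift = drift_seconds("-4.0959", 90 * 60)
print(f"{drift * 1000:.3f} ms")
```

Under that assumption the truncated percentage accounts for well under a millisecond over 90 minutes, which would suggest the 900 ms came from the effect itself rather than from the missing decimals.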

Robert_J_H · March 1, 2014, 8:37pm

Or in the Nyquist Prompt (Effect menu):

(resample s (* *sound-srate* (/ 24.101 25)))

This will shorten the sound (otherwise change 24.101 and 25).

Gale_Andrews · March 2, 2014, 3:49am

It will also change the pitch.

On an hour of audio, that Nyquist snippet produces 57 minutes, 50 seconds and 23990 samples (using default libsoxr). But the equation should be 24/1.001 as I understand it.
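
As a cross-check of that figure, here is a small Python sketch of the expected length. It assumes a 44100 Hz project rate (the thread does not state it) and that resampling simply scales the length by the ratio; the helper name is illustrative only.

```python
import math
from fractions import Fraction

SAMPLE_RATE = 44100  # assumption: the project rate is not given in the thread


def resampled_length(seconds, ratio, rate=SAMPLE_RATE):
    """Return (minutes, seconds, samples) after scaling the length by `ratio`."""
    total_samples = math.floor(seconds * rate * ratio)  # whole samples only
    whole_seconds, samples = divmod(total_samples, rate)
    minutes, secs = divmod(whole_seconds, 60)
    return minutes, secs, samples


one_hour = 3600
as_posted = resampled_length(one_hour, Fraction("24.101") / 25)
corrected = resampled_length(one_hour, Fraction(24000, 1001) / 25)
print(as_posted)   # (57, 50, 23990)
print(corrected)
```

The first result agrees with the measured length for the 24.101/25 snippet; the second is what the same arithmetic gives for 24/1.001.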

Entering

4.095904095904095904095904095904095904095904095904095904095904095904095904095904095904095904095904095904095904095904095904095904095904

as initial and final tempo in Sliding Time Scale gives 57 minutes 38 seconds and 33554 samples.

If you wanted to test, you could give Sliding Time Scale a go overnight. It should be more accurate than Change Tempo in giving you the required length of audio.

Gale

Gale_Andrews · March 2, 2014, 3:57am

I deduced that from the fact that four, five and six decimal places in the percent change give different results.

The Manual for Change Tempo says:

while fractional values up to three decimal places are supported for BPM changes and up to two decimal places for length changes, the resultant stretch length may have lesser accuracy.

That now seems misleading, as the support is only restricted in the accuracy of the remembered value. The length of audio resulting may still be wrong, but it should be “differently wrong” for values with more than three decimal places of accuracy.

Is that a fair summary?

Gale

Robert_J_H · March 2, 2014, 8:09am

Gale Andrews:

Robert J. H.:
Or in the Nyquist Prompt (Effect menu):
(resample s (* *sound-srate* (/ 24.101 25)))
This will shorten the sound (otherwise change 24.101 and 25).
It will also change the pitch.

This is intentional, in the case that the original speed should be achieved.
“Sliding Time” is required when you want to transfer from the original into a new standard.
However, the formula was wrong, I mis-read it.
It should probably be:

(resample s (* *sound-srate* (/ 24 1.001 25)))

However, I am a little bit confused about the frame rates used.
The Standards are:

NTSC 29.97
PAL 25
Film 24

The reduction by 1/1000 is normally used to convert from film to NTSC.
Does your original video have this intermediate frame rate?
Theoretically, you have to speed up by 1/1000 to get a frame rate of 24 with the original pitch (as in the motion picture).
And then the sliding time/pitch shift to convert from 24 to 25 frames per second.

Gale_Andrews · March 2, 2014, 3:49pm

Robert J. H.:

Gale Andrews:
Robert J. H.:
Or in the Nyquist Prompt (Effect menu):
(resample s (* *sound-srate* (/ 24.101 25)))
This will shorten the sound (otherwise change 24.101 and 25).
It will also change the pitch.
This is intentional, in the case that the original speed should be achieved.

Yes, if the pitch is actually incorrect in the video to be sped up.

As I understand it, the 24/1.001 frame rate was a variant for colour compatibility. It all depends on whether the video at 24/1.001 is already pitch-changed or not. However, pitch differences between 24 and 24/1.001 are unlikely to be audible, so it is probably not worth the extra step.

Gale

steve · March 2, 2014, 8:25pm

“Frame rate” is about video, not audio. You can have two videos, one PAL and the other NTSC, both perfectly synchronised to identical audio tracks.
I presume that the synchronisation problem is because one of the audio files was not extracted correctly.
If the audio from a 1 hour NTSC video is extracted, the length of the audio should be exactly 1 hour.
If the audio from a 1 hour PAL video is extracted, the length of the audio should be exactly 1 hour.
So what went wrong?

Robert_J_H · March 2, 2014, 9:36pm

We know that.
It seems that the video frame rate is simply misinterpreted somehow.
The best option would probably be to convert the video track itself in a suitable editor.
At least if the audio doesn’t show any pitch irregularity.

steve · March 2, 2014, 10:20pm

So the best solution would be to re-extract the audio from the video but do it without screwing up the sound and synchronisation.

swaan · March 3, 2014, 11:51am

Thanks for all the help everyone.

Just to clear up a few things. Quote from Handbrake Framerate guide:

Back in the day, NTSC TV video was indeed 30fps. However, video hasn’t “really” been 30fps since color TV broadcasts started. Before them it was 30000 frames for every 1000 seconds. But to accommodate the extra color information, the rate was very slightly dropped by stretching the frames to cover an extra second for every 1000 seconds, making it 30000/1001.

30fps == 30000/1001 == 29.97fps

24fps == 24000/1001 == 23.976fps

True, audio doesn’t flow in frames per second and our only mission when doing A/V sync is to make sure the length of both audio and video match. Since video does consist of frames you could alter FPS to match audio without quality loss.

However, when your (my) goal is to present an audio track that must line up with NTSC video, I have to do the work. The problem was that my audio was made for 25 fps PAL encoding and was originally faster (shorter in length) than the desired NTSC reference (sync). So both my PAL and NTSC audio tracks were at normal pitch.

True, video editors are up to the task of converting between those formats but I don’t have a video stream, nor do I have software for it.

Hence, my problem was syncing two audio tracks of different length (leading and trailing silence) and tempo. I now have them synced using the Change Tempo tool by trial and error. By error I mean my math skills: NTSC × (100 + 4.0959…)% is not PAL at 25 fps; NTSC × (100 + 4.2708333…)% is.
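
The asymmetry between the two percentages can be checked with exact fractions. This is a minimal Python sketch, assuming the only difference between the tracks is the 25 versus 24/1.001 frame-rate ratio; the constant and function names are made up for the example.

```python
from fractions import Fraction

NTSC_FILM = Fraction(24000, 1001)  # 23.976... fps
PAL = Fraction(25)


def tempo_change_percent(source_fps, target_fps):
    """Percent tempo change that retimes audio cut for source_fps to target_fps."""
    return (target_fps / source_fps - 1) * 100


pal_to_ntsc = tempo_change_percent(PAL, NTSC_FILM)  # negative: slow down
ntsc_to_pal = tempo_change_percent(NTSC_FILM, PAL)  # positive: speed up
print(f"PAL -> NTSC: {float(pal_to_ntsc):+.10f} %")
print(f"NTSC -> PAL: {float(ntsc_to_pal):+.10f} %")
```

The two values differ because each percentage is taken relative to a different starting tempo, so slowing down by one is undone by speeding up by the other.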

TL;DR: You can do it with the Change Tempo tool but you’d need a bit more precision than hundredths of seconds (wink at devs). Yes it accepts a greater precision but the measurement tools (seconds) also show only up to hundredths.

I also wish there was an option to mark cue-points to both tracks: point A1 is on track one where point A2 on track two. Point B1 is on track one where B2 is on track two and let it calculate the needed alignment and tempo change to sync them.
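
Until something like that exists, the calculation can be done by hand from four marker positions. The Python below is a rough sketch, not an Audacity feature: it assumes the tempo change is applied to the whole of track two, that the change scales every position measured from the start of that track by the same factor, and the function name and marker values are hypothetical.

```python
from fractions import Fraction


def sync_from_markers(a1, b1, a2, b2):
    """Tempo change and shift that map track-2 markers (a2, b2) onto (a1, b1).

    All positions use one unit (seconds or samples), measured from the start
    of each track. Returns (percent, shift): apply `percent` as a tempo change
    to all of track 2, then move it later by `shift` (earlier if negative).
    """
    a1, b1, a2, b2 = (Fraction(x) for x in (a1, b1, a2, b2))
    if b1 == a1 or b2 == a2:
        raise ValueError("the two markers on a track must differ")
    factor = (b2 - a2) / (b1 - a1)   # tempo multiplier for track 2
    percent = (factor - 1) * 100
    shift = a1 - a2 / factor         # a2 sits at a2 / factor after the change
    return percent, shift


# Hypothetical markers in seconds; track 1 is the NTSC-length reference,
# track 2 is the same material at PAL speed with a different lead-in.
percent, shift = sync_from_markers(10, 5410, 2, 2 + 5400 * Fraction(960, 1001))
print(f"tempo change: {float(percent):.10f} %, shift: {float(shift):.6f} s")
```

The further apart the two marker pairs are, the less a small placement error in either marker affects the computed percentage.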

And finally: Thank you everyone involved in making Audacity a reality!

steve · March 3, 2014, 1:50pm

This is the part that I don’t get.
Why are the PAL and NTSC versions different lengths? Which of them is the correct length?
If the NTSC version is the correct length, how was the audio in the PAL version shortened? Was it by removing bits of the audio, time stretching, or simply speeded up?
How do you know that the PAL version is the exactly correct pitch?

swaan · March 3, 2014, 2:41pm

Both are correct. You could say that the PAL version is “wrong” because the whole project (video) was sped up. There was no audio then. The audio was made for the “incorrect” PAL framing.

steve · March 3, 2014, 4:04pm

OK, I see.
You can get it to within about 16 ms per 5 minutes if you use a Chain command http://manual.audacityteam.org/o/man/edit_chains.html
Here is a Chain text file that you can put in your Chains folder
NTSC-PAL.txt (66 Bytes)
To get it bang-on, you will need to add a few ms of silence every few minutes (about 16 ms per 5 minutes if I got the calculation right).

Gale_Andrews · March 3, 2014, 4:46pm

Change Tempo is inaccurate due to limitations in the algorithm. If you want accuracy you must use Sliding Time Scale (and in my opinion it badly needs a “length” control like Change Tempo has, unless we give Change Tempo an option to use the slower/higher quality algorithm that Sliding Time uses).

You can right-click over the time digits in Selection Toolbar to change to highest precision (samples) formats, or to frames or other formats.

Do you want to vote for this existing feature request?

Time Stretching: mouse tools for both pitch-variable and pitch-constant time stretching

So if you had two different length tracks you would drag one to the end of the other to make the appropriate tempo or speed change.

Gale

swaan · March 3, 2014, 9:54pm

Yes, the Selection Toolbar gives me more precise options, but what do I do with samples if, for example, Change Tempo needs a percentage or seconds?

Yes I’d vote for that! That would make it more useful for making mixtapes. As long as I can see the waveforms while dragging. I’d love to get off the outdated MixMeister (one of a kind tool for timeline based DJing and mixtape creation). It’s not as precise as my request but still a move in that general direction - I can’t see working with long (music/beat-less) tracks unless you do it with markers as I mentioned.

Thanks for the chain tip, steve!