Pitch correction for video speedups/slowdowns?


I’ve read a lot about pitch correction for moving bars and notes, but nothing about fixing pitch after changing the speed of the audio to match the video. Can anyone point me to this? Like a lot of you who are converting videos, I’m doing the usual Film to PAL, PAL to Film, and PAL to NTSC Dropframe. Incidentally, 23.976 is not implemented in Audacity, but that’s off-topic and for the Program development group.

Anyway, I’ve got a PAL stream and I’m converting it to Film/NTSC NDF. I’ve got the audio in sync using the standard -4.27% (it’s 25.000 to 23.976) slowdown, but the narrator is a woman and mildly approaches the Orson Welles range of vocals after the conversion. Therefore, I would like to pitch correct back to normal. If this were AVISynth (TimeStretch(pitch=104.27)), I’d be done, and I use it for several conversions like this and it works well, but I’m finding that Audacity is faster for audio edits and other things I need to accomplish with audio like this.
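As a sanity check on the numbers, here is a rough sketch of the arithmetic, assuming "23.976" means the exact rate 24000/1001. The 4.27% figure corresponds to the 25/23.976 ratio (the same one behind pitch=104.27), while a tool that defines its percentage as a change in playback speed would want roughly -4.1% instead. The variable names are just for illustration.

```python
from fractions import Fraction
import math

PAL_FPS = Fraction(25)
NTSC_FILM_FPS = Fraction(24000, 1001)  # assumed exact value behind "23.976"

# Playback speed after slowing PAL audio down to film/NTSC rate.
speed_ratio = NTSC_FILM_FPS / PAL_FPS
# Factor the pitch must be raised by to sound like the original again.
pitch_ratio = 1 / speed_ratio

speed_change_pct = (float(speed_ratio) - 1) * 100
pitch_change_pct = (float(pitch_ratio) - 1) * 100
semitones = 12 * math.log2(float(pitch_ratio))

print(f"speed change:     {speed_change_pct:+.3f}%")
print(f"pitch correction: {pitch_change_pct:+.3f}% ({semitones:+.3f} semitones)")
```

So the correction needed is a bit under three quarters of a semitone upward, whichever effect ends up doing it.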

I’ve tried several of the effects - Change Pitch, Sliding Time Scale/Pitch Shift, Gsnap - with audio distortions out the yinyang. Any thoughts would be helpful!

Thank you!!

If AVISynth works better, I’d say go ahead and use it.

Changing speed and/or pitch independently is tricky. It requires FFT and you can get artifacts. There’s probably a VST plug-in with “better” FFT settings.

I’ve never worked with PAL or film, but with digital files I believe most video editors convert between PAL and NTSC by altering the video framerate, adding or removing frames (or by interpolation), and the audio remains untouched.

(From what I’ve read, most PAL DVDs are simply sped-up from the film, and most people don’t notice the 4% pitch increase. With film to NTSC, the speed & audio remain the same and extra-repeated frames are added via “3:2 pulldown”.)
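For anyone who hasn't seen the cadence written out, this is a simplified sketch of 3:2 pulldown. It assumes each film frame is held for alternately 2 and 3 video fields and ignores top/bottom field order; pulldown_32 is a made-up name, not a function from any tool.

```python
def pulldown_32(frames):
    """Spread film frames over video fields in a 2, 3, 2, 3... cadence."""
    fields = []
    for i, frame in enumerate(frames):
        repeat = 2 if i % 2 == 0 else 3  # even-indexed frames get 2 fields, odd get 3
        fields.extend([frame] * repeat)
    # Pair consecutive fields into interlaced video frames
    # (a leftover single field at the end is dropped).
    return [tuple(fields[j:j + 2]) for j in range(0, len(fields) - 1, 2)]

video = pulldown_32(["A", "B", "C", "D"])
print(video)
```

Four film frames come out as five video frames, two of them mixing fields from neighbouring film frames, which is how 24 becomes 30 without touching the running time or the audio.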

Film in Europe generally runs at 25-FPS, not 24. They have the advantage that their television and their film naturally run at the same framerate.

PAL/NTSC converters are an interesting lot. They have to write one frame on one side of a blackboard in one speed and read it in real time from the other side of the blackboard at a different speed. There is no even multiple. The problem is NTSC has great motion and lousy resolution, PAL has great resolution and terrible motion. The mixes tend to have fuzzy bad motion.

What they used to do to convert fast motion is create fuzzy video just there and nowhere else. Most people don’t notice if the football gets fuzzy for a split second. The much higher power converters analyze content and try to keep the video sharp.

Movie theater film and US television film run at slightly different speeds to make them an even multiple of each other. They don’t just duplicate frames here and there. They actually split frames carefully and manage the splits because it looks better.


Doesn’t one of the tools have the effect of dragging your finger on the record? Everything slows down – or speeds up. It should be the simplest of the tools. You could do it by resampling badly. Play a 48000 Hz track at 45000 Hz and then permanently resample it at the new rate.

It’s doing one without the other that gives you terrific artifacts. Speed change without pitch change, etc.

Yeah. Here it is. Little down arrow in the panel on the left of the track. Change the sample rate. Both the pitch and speed will change in step.


In case you have trouble finding that, an alternative description of the same thing: Click on the “Name” of the track.
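The same rate trick can be sketched outside Audacity with Python's standard wave module: the samples are copied untouched and only the rate in the header changes, so speed and pitch move together. This is an illustration under assumptions, not what Audacity does internally; retag_sample_rate is a hypothetical helper, and the demo uses one second of silence held in memory rather than a real file.

```python
import io
import wave

def retag_sample_rate(src, dst, speed_ratio):
    """Copy a WAV unchanged except for the sample rate written in its header."""
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        frames = reader.readframes(params.nframes)
    new_rate = round(params.framerate * speed_ratio)
    with wave.open(dst, "wb") as writer:
        writer.setnchannels(params.nchannels)
        writer.setsampwidth(params.sampwidth)
        writer.setframerate(new_rate)  # same samples, different playback rate
        writer.writeframes(frames)
    return new_rate

# Demo: one second of 48 kHz, 16-bit mono silence in memory.
src = io.BytesIO()
with wave.open(src, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(48000)
    w.writeframes(b"\x00\x00" * 48000)
src.seek(0)

dst = io.BytesIO()
new_rate = retag_sample_rate(src, dst, 23.976 / 25)  # PAL to film/NTSC slowdown
print(new_rate)
```

The result is an odd sample rate, so it would normally still need resampling back to a standard rate such as 48000 before going into a video mux.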