Use of time scaling to "reconduct" a piece of music?

I’m interested in developing a feature in Audacity to help me accomplish a certain goal, but am unsure which approach would be best.

I have a series of input audio files. The goal is to have the software speed up or slow down certain parts of each file to very specific guidelines: matching the length of each beat of the input WAV file to the length of each beat in a template track (for instance, a series of spikes, one per downbeat). This would let me “reconduct” a piece of music by supplying a series of taps whose timing differs from the original downbeats. I know there are good time-scaling plugins available for Audacity, and I don’t want to redo existing work, but as far as I can tell I’ll end up applying time scaling to each beat of each measure, and doing that manually for every individual beat would take far too long. I’d like to automate the process somewhat, but I’m unsure of the best approach: whether I’d have to change the Audacity source code itself or whether developing a plugin would be sufficient.

My thought about the easiest way to do this (not necessarily the most automated) would be to go through each original input track, listening to it and recording alongside it a track that just has a series of “blips”, one wherever there’s a downbeat. This could be done by playing the track and hitting a key on every downbeat; the keypress makes a sound on the desktop that is recorded via “stereo mix” as a blip in Audacity. This extra input track then provides a map of every beat in the original sound file.

To rescale the original audio file to a new series of “conducting” beats, provided by another track with the same blip format, it seems only necessary to run some peak-finder over each blip track, then apply time scaling (via a plugin like “Sliding Time Scale/Pitch Shift”) to each segment between blips, so that the distance between peaks in the original file matches the distance between peaks in the template with the desired beats. I imagine there will be a “for loop” somewhere to accomplish this, and that loop is what saves me the bulk of the time across many files; manually going through and rescaling each beat precisely to a template would take forever.
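
For the peak-finding step, something along these lines might work as a standalone script. This is a rough Python sketch using NumPy/SciPy; the file names, threshold, and minimum gap are assumptions, and a noisy recording may need smoothing before peak detection:

    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import find_peaks

    def blip_times(path, min_gap_s=0.2):
        """Return the times (in seconds) of the blips in a click-track WAV."""
        rate, data = wavfile.read(path)
        if data.ndim > 1:
            data = data.mean(axis=1)          # mix stereo down to mono
        envelope = np.abs(data.astype(float))
        # Assumed threshold: half the loudest sample. `distance` keeps one
        # blip from registering as several adjacent peaks.
        peaks, _ = find_peaks(envelope,
                              height=0.5 * envelope.max(),
                              distance=int(min_gap_s * rate))
        return peaks / rate

    orig_beats = blip_times("track2_blips.wav")  # downbeats of the original
    new_beats = blip_times("track3_blips.wav")   # desired downbeats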

So essentially, the input and output I want from this would be:

Input: 3 audio tracks in Audacity

  1. Track 1 - the original WAV file
  2. Track 2 - a series of blips marking the downbeats of track 1
  3. Track 3 - a series of blips marking the desired downbeats (the new conducting pattern)

Output:
A new track 1: a “reconducted”, rescaled version of the original track 1, time-scaled so that its downbeats move from the pattern in track 2 to the pattern in track 3. This involves speeding up some parts and slowing down others.
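
Given the two lists of beat times from above, the “for loop” could look roughly like the sketch below. Here `time_stretch(samples, ratio)` is a hypothetical placeholder for whatever stretcher is actually used (for example, the Sliding Time Scale/Pitch Shift effect applied segment by segment); it is assumed to return the samples stretched to `ratio` times their original duration with pitch preserved:

    import numpy as np

    def reconduct(audio, rate, orig_beats, new_beats, time_stretch):
        """Stretch each beat-to-beat segment of `audio` so the downbeats
        at `orig_beats` (track 2) land on the times in `new_beats` (track 3)."""
        assert len(orig_beats) == len(new_beats)
        pieces = []
        for i in range(len(orig_beats) - 1):
            start = int(orig_beats[i] * rate)
            end = int(orig_beats[i + 1] * rate)
            # Ratio of new beat length to old beat length for this segment.
            ratio = ((new_beats[i + 1] - new_beats[i]) /
                     (orig_beats[i + 1] - orig_beats[i]))
            pieces.append(time_stretch(audio[start:end], ratio))
        return np.concatenate(pieces)

Audio before the first blip and after the last one would need handling separately (copied through unchanged, say).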

Does anyone have any advice? I haven’t developed for Audacity before and would like to accomplish this project as efficiently as I can. If modifying the source code is necessary or faster, I’ll do that; if a plugin approach is better or faster, I’ll do that instead. More specific advice on what I would need to change, or where to implement things, would also be helpful.

Thank you!

The task would be much easier to achieve if you can use MIDI-generated sound rather than time-stretching an audio file.

For time-stretching an audio file, the first thing you would need to do is accurately detect the beginning of each note. Even this first step is non-trivial and beyond what Audacity is currently capable of. With MIDI music, on the other hand, the start and end time of each note is defined in the MIDI data. The next problem is that time-stretching audio is inexact and reduces the sound quality; when retiming MIDI this is not the case, as each note is simply played with its new start and end times. And so it goes on.
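
To make the contrast concrete: since MIDI notes are just pairs of start and end times, retiming them is nothing more than mapping each time onto the new beat grid, with no loss of quality. A minimal sketch with made-up note data:

    import numpy as np

    orig_beats = [0.0, 0.5, 1.0, 1.5]   # original downbeat times (s)
    new_beats = [0.0, 0.6, 1.0, 1.3]    # "reconducted" downbeat times (s)
    notes = [(0.0, 0.4), (0.5, 0.9), (1.0, 1.4)]  # (start, end) pairs

    def retime(t):
        # Interpolate linearly between the beats of the two grids.
        return float(np.interp(t, orig_beats, new_beats))

    retimed = [(retime(s), retime(e)) for s, e in notes]
    # Each note simply plays at its new start/end; nothing is resampled.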

If it is acceptable to use MIDI rather than audio, then you may be able to achieve the task relatively easily using Cakewalk or Sonar and writing a script in the Cakewalk Application Language (CAL) (certainly very much easier than time-stretching audio).

Unfortunately I can’t use MIDI for this task. Detecting the beginning of each note in Audacity is not something I hope to accomplish, as I understand it’s difficult. I want to stretch and compress beat by beat, using a template that I provide: track 2 marks the start of each beat in the original track. If that sounds confusing, what I intend is to manually create a second track such that, when I play the first and second tracks together, it sounds as if a conductor is hitting the stand with a baton on every count or beat in the music. Because these two tracks align exactly, I wouldn’t need to detect any beats automatically, and it’s less work to generate this second track by hand than to go through and stretch/compress beat by beat.

Does this change the recommendation to use Cakewalk or Sonar? Is there a way to do this in Audacity with a little programming?

I can’t see how to achieve it with “a little programming”. It looks like a major and very complex programming task to me (but I’m not a programmer).

If you are an experienced C++ programmer then you could perhaps look at the code for the Time Track feature. It’s not the same as what you are asking for, as it changes both tempo and pitch, but it may give you some ideas as a starting point. Also look at the latest code for the “Sliding Time Scale / Pitch Shift” effect, as this is a much better quality time-stretch algorithm than the standard Change Tempo effect. The latest version is available as a patch here: http://withunder.org/misc/sbsms2_TimeScale.patch

Using MIDI is considerably easier, as it only involves creating a tempo map, but Audacity currently has very limited support for MIDI (which does not include playing MIDI tracks).
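
For what it’s worth, building that tempo map from a list of tap times is itself trivial: the tempo across each gap is just 60 divided by the gap length. A sketch, assuming one quarter-note beat per tap:

    def tempo_map(beat_times):
        """(time, bpm) pairs from tap times, one quarter-note per gap."""
        return [(t0, 60.0 / (t1 - t0))
                for t0, t1 in zip(beat_times, beat_times[1:])]

    print(tempo_map([0.0, 0.6, 1.0, 1.3]))
    # [(0.0, 100.0), (0.6, 150.0), (1.0, 200.0)]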

Moved from the “Adding Features To Audacity” section to the “Audio Processing” section.

WC