Okay, I figured out part of why I couldn't figure things out: the code only seems to work with mono files and I was working with a stereo file.
A few questions:
- So the only two numbers I care about are the number of samples and the threshold, right? (The 36000 and 3600 aren't anything I'd want to change?)
- Also, is there an efficient way to do this with stereo tracks, or would I be better off converting to mono to find recurring audio?
- Is there a way to change part of the code to insert bookmarks rather than deleting the non-sampled audio? (Otherwise it seems like the easiest way for me is to paste the analyzed track below the original to find the points of recurrence.)
I think I can help you through this.
1. 1/36000 is just a scaling factor; we could just as well have scaled the threshold instead, e.g. to 10800 (0.3 × 36000) in place of 0.3.
Other pattern durations need other scaling factors.
If we keep the current behaviour (the first x samples are the pattern), then the normalization can be done after the convolution by searching for the highest peak in those x processed samples and by multiplying the rest with 1/<this value>.
The reason is that we won't encounter any higher value since we have found the first perfect match already.
Thus, a threshold of almost 1 would only mark perfect matches.
However, this will probably only produce one match since even equal audio can be off by e.g. half a sample during recording.
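As a rough illustration of that normalization (a minimal Python sketch, not the plugin code itself): the function names are made up for this example, and it assumes the analysis is a plain sliding dot product of the first x samples against the whole track.

```python
def correlate_with_prefix(signal, pattern_len):
    """Slide the first pattern_len samples over the signal (plain cross-correlation).

    Hypothetical helper: the real code may compute this differently.
    """
    pattern = signal[:pattern_len]
    scores = []
    for start in range(len(signal) - pattern_len + 1):
        window = signal[start:start + pattern_len]
        scores.append(sum(p * s for p, s in zip(pattern, window)))
    return scores


def normalize_by_self_match(scores, pattern_len):
    """Divide every score by the highest peak found in the first pattern_len scores.

    That peak is the pattern matched against itself, so it becomes 1.0.
    """
    peak = max(scores[:pattern_len])
    if peak <= 0:
        raise ValueError("pattern is silent; nothing to normalize against")
    return [score / peak for score in scores]
```

With scores normalized like this, the threshold no longer depends on the pattern duration, so a value close to 1 can be used for any pattern length.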
3600 means that the whole audio to be analysed can't be longer than 1 hour; that limit was set arbitrarily.
2. Stereo is an advantage rather than a drawback, since we can throw out false positives that are not common to both channels. However, the processing time naturally increases.
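To show what "common to both channels" could mean in practice, here is a small Python sketch. It assumes each channel has already been analysed separately into a list of normalized scores; `common_matches` is a hypothetical name, not something from the existing code.

```python
def common_matches(left_scores, right_scores, threshold):
    """Return sample positions where BOTH channels reach the threshold.

    A position that only one channel flags is treated as a false positive.
    """
    return [
        index
        for index, (left, right) in enumerate(zip(left_scores, right_scores))
        if left >= threshold and right >= threshold
    ]
```

Since every channel needs its own pass, the analysis takes roughly twice as long as for a mono track.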
3. The found matches can be returned as Labels in a separate track.
This requires searching for the samples that exceed the threshold and keeping only the highest one within each neighbourhood of pattern length.
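One possible way to do that search, sketched in Python under the assumption that the scores are already normalized; `find_label_positions` and its parameters are hypothetical names for this example only. It takes the candidates from strongest to weakest and drops any that lie within one pattern length of a match already kept.

```python
def find_label_positions(scores, threshold, pattern_len, sample_rate):
    """Return (sample_index, time_in_seconds) pairs, one per detected occurrence."""
    # Candidates are all samples at or above the threshold, strongest first.
    candidates = sorted(
        (i for i, score in enumerate(scores) if score >= threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for index in candidates:
        # Keep only the highest sample within each pattern-length neighbourhood.
        if all(abs(index - other) >= pattern_len for other in kept):
            kept.append(index)
    return [(index, index / sample_rate) for index in sorted(kept)]
```

Each returned time could then be written out as a label in a separate track.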
It would be somewhat better to wait until the next release (due this month) in order to implement the code as a plugin.
The pattern could then be in another track or be the first/last clip within the track itself.
A multiple choice would ask where to look for it.
Other controls would be the pattern length (if not already known), the threshold and the kind of output (silenced audio or labelled occurrences).
Any other ideas?