Well I don’t think that’s quite right, is it?
Computers are made for simple tasks, and what we’re talking about here is simple. Much simpler than, say, face recognition.
There’s no ‘nearly the same’ comparison to make here. Here we’ve got ‘exactly the same’.
The track is a visual picture made by telling the video chip to display a pixel here and a pixel there. Each pixel is defined exactly by its x,y co-ordinates.
I’ve forgotten the name for it, but the video memory holds the complete VDU ‘picture’, ready to throw out to the screen.
Even in BASIC you could write to the screen by specifying locations like that, and write straight to video RAM, or whatever it was called.
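To show what I mean by video memory, here is a rough sketch in Python. The screen size and the names video_ram, plot and point are all made up for illustration; real hardware lays things out in its own way.

```python
WIDTH, HEIGHT = 8, 4  # tiny made-up screen size

# One byte per pixel, stored row after row, like a very simple video RAM.
video_ram = bytearray(WIDTH * HEIGHT)

def plot(x, y, value=1):
    """Light the pixel at column x, row y."""
    video_ram[y * WIDTH + x] = value

def point(x, y):
    """Read back whatever is stored at column x, row y."""
    return video_ram[y * WIDTH + x]

plot(3, 2)  # a pixel here
plot(4, 1)  # and a pixel there
```

Every pixel has exactly one slot, so asking "is the pixel at x,y set?" is just a memory lookup.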
So let’s work out what the prog to do this would have to do:
It has to keep a record of the whole track as visually represented, with all the x,y locations.
Well it does that already. We know that because we can travel backwards and forwards along the track, and I’ll bet it is not recalculating for every movement. It has processed the audio and created this ‘video file’, I’ll call it.
Then it has to know where the different locations are. Well, it already does that too: it is indexed by time for the full length.
So we tell it that the section we are interested in is between this time and that time.
So it finds that alright.
And it takes a note of that ‘pattern’.
In particular the value of the first ‘x’.
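Picking out the section by time could look something like this. It is only a sketch: I am assuming the track is kept as a plain list of column heights and that there is a fixed number of columns per second, and section_between and columns_per_second are names I have invented.

```python
def section_between(track, start_s, end_s, columns_per_second):
    """Return the columns lying between two times given in seconds."""
    first = round(start_s * columns_per_second)
    last = round(end_s * columns_per_second)
    return track[first:last]

# A made-up track: one height value per displayed column.
track = [0, 3, 5, 3, 0, 1, 3, 5, 3, 2]

# The section we are interested in, from 0.5 s up to 2.0 s at 2 columns per second.
pattern = section_between(track, 0.5, 2.0, columns_per_second=2)
first_x = pattern[0]  # the value the search will look for first
```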
And then it runs along the track doing if/then/else all the way.
Find the first x that matches the first reference x.
Mark that location as Start.
Move to the next x. If it is the same as the next reference x, mark it as Start + 1.
If not, forget that Start and go back to looking for a new one.
See? You keep counting up from Start until you reach End, and then delete that section.
And start again.
Every series of tests that is not the pattern you want will fail before reaching End.
That’s the basic idea of this ‘pattern matching’ for a computer.
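Here is roughly how those steps might look in Python. It is a sketch of plain exact matching, not a claim about how any real audio editor does it; find_exact and delete_all are hypothetical names, and the track is again assumed to be a simple list of values. One detail I have chosen myself: when a run fails, the search resumes from the position just after the failed Start, so a copy that begins partway through a failed run is not missed.

```python
def find_exact(track, pattern):
    """Return the start index of every non-overlapping exact copy of pattern."""
    hits = []
    if not pattern:
        return hits
    start = 0
    while start + len(pattern) <= len(track):
        offset = 0
        # Count up from Start for as long as the values keep matching.
        while offset < len(pattern) and track[start + offset] == pattern[offset]:
            offset += 1
        if offset == len(pattern):
            hits.append(start)      # reached End: a full match
            start += len(pattern)   # carry on after the matched section
        else:
            start += 1              # failed before End: try the next position
    return hits

def delete_all(track, pattern):
    """Return a copy of track with every exact copy of pattern removed."""
    result = []
    pos = 0
    for start in find_exact(track, pattern):
        result.extend(track[pos:start])
        pos = start + len(pattern)
    result.extend(track[pos:])
    return result

track = [0, 3, 5, 3, 0, 1, 3, 5, 3, 2, 3, 5, 3]
pattern = [3, 5, 3]
cleaned = delete_all(track, pattern)
```

With this made-up data the three copies of the pattern are found and cut out, and everything else is left alone.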
The waveform as drawn looks symmetric about the horizontal axis, so it should only need one side to be tested.
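A rough sketch of the one-side idea, assuming each drawn column is stored as a hypothetical (top, bottom) pair measured from the centre line. It checks that the mirror-image assumption really holds before throwing the bottom half away, since not every waveform display will be perfectly symmetric.

```python
def upper_half(columns):
    """Keep only the top value of each column, after checking the mirror image."""
    for top, bottom in columns:
        if bottom != -top:
            raise ValueError("column is not symmetric about the axis")
    return [top for top, _ in columns]

columns = [(3, -3), (5, -5), (0, 0)]  # made-up symmetric columns
tops = upper_half(columns)            # half the data to compare
```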
Couldn’t be much simpler, I think. What do you think?