I would like to analyze the difference between two .wav files. The sample sets should be very similar: they will differ somewhat in harmonic content and so on, but it can be assumed that the two sets will not be wildly different. I have written a libsndfile program that does a basic sample-for-sample subtraction and writes the difference between each pair of samples to a third .wav file, so that I can see and even listen to the difference.
But it is clear that this little program is deficient in a number of ways, and it occurs to me that other people have probably written similar code. So I will describe what I think I need to do to compare two sample sets for “differences”; if it sounds like anything Audacity already does, or if there is a library that already does what I need, please point me to it.
The two sample sets are generated by using PC software and a Fast Track USB external sound card to send a “stimulus” (a sequence of tones) to the Device Under Test (DUT) while simultaneously recording its output. The DUT is then changed in some presumably subtle way and the test is performed again. The two recorded files are the sample sets to be compared.
The first problem is that even though the PC software starts recording as soon as the stimulus starts, the latency between recordings is not constant, so the samples in the two sets are not aligned. Ideally I would use hardware that takes a fixed number of clock cycles between writing a sample and reading one, so that every test run is aligned in time to within half a clock cycle, but my Fast Track USB is clearly not up to that; the offset is at least several milliseconds. So to align the samples I was going to implement an alignment algorithm. Specifically, I was thinking of dividing each sample by the previous one to get a “relative change” value, then perhaps taking log() of those values and truncating them to give a quantized set that can be compared efficiently. The algorithm would compare and shift to produce a score for each offset within plus or minus some number of samples, say 1024, which depending on the sample rate equates to roughly 20 ms. The offset whose score indicates the minimum amount of change is considered “aligned”.
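The compare-and-shift search described above is essentially what cross-correlation does, and correlation avoids the blow-ups a sample-ratio measure can suffer near zero crossings. A minimal pure-Python sketch of that standard alternative (function and variable names are my own, not from any existing library):

```python
def best_offset(ref, test, max_lag=1024):
    """Return the lag (in samples) that best aligns `test` to `ref`.

    Searches lags from -max_lag to +max_lag and picks the one that
    maximizes the cross-correlation between the two sample sets.
    A positive result means `test` is delayed relative to `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(test):
                score += r * test[j]  # accumulate correlation at this lag
        if score > best_score:
            best_score, best_lag = score, lag
    return best_lag
```

Shifting the test recording back by the returned lag then lines it up with the reference; a real implementation would use an FFT-based correlation rather than this O(n·lags) loop.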
The next problem is that even with the samples aligned, the actual audio is not necessarily aligned; it can still be off by as much as half a sample period. I’m not sure about this (I normally write networking software, so DSP is outside my experience), but I think I need to do some kind of re-sampling with interpolation: create a buffer 16 times the size of the sample set and generate 15 interpolated values between each pair of original values, then run the same alignment algorithm again on the up-sampled sets.
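The up-sampling step might be sketched like this, assuming linear interpolation for simplicity (for audio, a windowed-sinc or polyphase resampler would give a much cleaner result; this is only the crudest illustration, and the function name is mine):

```python
def upsample_linear(samples, factor=16):
    """Insert factor-1 linearly interpolated points between each pair
    of input samples, so sub-sample offsets become whole-sample shifts.

    Crude: linear interpolation attenuates high frequencies; a
    sinc-based resampler is the proper tool for audio.
    """
    out = []
    for a, b in zip(samples, samples[1:]):
        for k in range(factor):
            out.append(a + (b - a) * k / factor)  # step 1/factor of the way
    out.append(samples[-1])  # keep the final original sample
    return out
```

After up-sampling both recordings by the same factor, the same offset search can be re-run to align them to within 1/16 of the original sample period.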
Then I can perform a simple difference computation, reduce the interpolated values back to the original sample rate, and write the .wav difference file.
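Those last two steps could look roughly like the following sketch (names are illustrative; the decimation here is naive sample-dropping, which assumes the difference signal has little content above the target Nyquist frequency, otherwise a low-pass filter should come first):

```python
def difference(a, b):
    """Sample-for-sample difference of two aligned sample sets,
    truncated to the shorter of the two."""
    n = min(len(a), len(b))
    return [a[i] - b[i] for i in range(n)]

def decimate(samples, factor=16):
    """Keep every factor-th sample to return to the original rate.

    Naive: proper decimation low-pass filters first to avoid aliasing.
    """
    return samples[::factor]
```

The decimated difference can then be written out with libsndfile (or, in Python, a library such as soundfile) as the third .wav file.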
So is there any code that already does this type of thing?
Or, more generally, can you recommend a method for sending a “stimulus” out through a “device” and recording the output in a way that repeated recordings can later be compared, so that I can see and listen to the “differences” attributable only to changes in the device?