How to compare two audio files ?

Kris_A · December 11, 2014, 6:03am

Hi all,

I have two audio files, for example ingress.wav and egress.wav. How can I compare these two files to see if their contents are similar?
I know there might be variations, so I am only looking for a value that falls within an error threshold. I just want to make sure there are no dramatic changes.

Just to give more background: I am working on an IP phone and I want to ensure there are no audio issues such as static noise, jitter, one-way audio, etc. I play a simple music file from the client, receive it on the server, and play the received audio back to the client. Now I want to compare the sent and received audio files. They won’t be exactly the same, but I want to make sure they are similar.

Any ideas whether Audacity can do this? I am looking into the Audacity scripting documentation, which talks about mod-script-pipe; currently I have an issue compiling it (different thread). But I would really like to know your thoughts. I appreciate any input. Thanks.

Kris

Gale_Andrews · December 11, 2014, 8:54am

You mean apart from listening to the two files one after the other?

You can statistically compare the files in a hex editor.

Are the files exactly synchronised and mono? If so, Effect > Invert one of them and play to listen to the difference between them. To see the difference, select both tracks and choose either of the “Mix and Render” options in the Tracks Menu.
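The invert-and-mix check can also be done outside the GUI. The following is only a minimal Python sketch of the idea, assuming both signals are already loaded as lists of float samples, are mono, and start at exactly the same instant. The function names `rms` and `null_test_db` are made up for this illustration and are not Audacity commands.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def null_test_db(reference, received):
    """Subtract `received` from `reference` (same as invert + mix) and return
    the residual level in dB relative to the reference level.
    More negative means the two signals cancel better."""
    n = min(len(reference), len(received))
    residual = [reference[i] - received[i] for i in range(n)]
    ref_level = rms(reference[:n])
    res_level = rms(residual)
    if ref_level == 0.0:
        raise ValueError("reference is silent")
    if res_level == 0.0:
        return float("-inf")  # perfect cancellation
    return 20.0 * math.log10(res_level / ref_level)


# Synthetic demo: a 440 Hz tone at an assumed 8 kHz sample rate.
tone = [0.5 * math.sin(2 * math.pi * 440 * i / 8000) for i in range(8000)]
quieter = [0.9 * s for s in tone]  # same tone, 10% lower in amplitude

print(null_test_db(tone, tone))     # identical signals cancel completely
print(null_test_db(tone, quieter))  # the residual is 10% of the tone, about -20 dB
```

This only gives a meaningful number when the two signals are sample-aligned; any start offset makes the residual large even if the audio sounds the same.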

Where does the scripting come in? Because you have hundreds of these files?

Gale

steve · December 11, 2014, 12:24pm

What sort of similarity are you looking for? Obviously it is easy to compare the length of the two files, and the peak amplitude. It’s not much more difficult to compare the rms level, so that is three points of similarity. What sort of differences are you expecting might occur?
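For reference, those three measurements (length, peak, RMS) can be scripted with nothing but the Python standard library. This is a sketch under assumptions: the files are 16-bit PCM WAV, the helper names are invented for this example, and the tolerances (2 seconds, 3 dB) are arbitrary placeholders to be tuned for the actual system.

```python
import math
import struct
import wave


def read_wav_16bit(path):
    """Return (sample_rate, samples) from a 16-bit PCM WAV file.
    Samples are scaled to -1.0..1.0; only the first channel is kept."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wf.getnchannels()
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return rate, [v / 32768.0 for v in values[::channels]]


def to_db(level):
    """Convert a linear level (1.0 = full scale) to dB."""
    return 20.0 * math.log10(level) if level > 0.0 else float("-inf")


def level_stats(rate, samples):
    """Length in seconds, peak level and RMS level (both in dB)."""
    peak = max((abs(s) for s in samples), default=0.0)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) if samples else 0.0
    return {"seconds": len(samples) / rate, "peak_db": to_db(peak), "rms_db": to_db(rms)}


def failed_checks(sent, received, max_seconds_diff=2.0, max_db_diff=3.0):
    """Names of the metrics that differ by more than the (assumed) tolerances."""
    limits = {"seconds": max_seconds_diff, "peak_db": max_db_diff, "rms_db": max_db_diff}
    return [name for name, limit in limits.items()
            if abs(sent[name] - received[name]) > limit]


# Synthetic demo; with real files one would instead call, for example,
# level_stats(*read_wav_16bit("ingress.wav")).
sent = level_stats(8000, [0.5, -0.5] * 4000)
quiet = level_stats(8000, [0.25, -0.25] * 4000)  # half the amplitude, about 6 dB down
print(failed_checks(sent, sent))
print(failed_checks(sent, quiet))
```

These checks say nothing about the content itself; they only flag gross problems such as a missing, truncated, or badly attenuated recording.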

Kris_A · December 11, 2014, 5:42pm

Hi Gale,

Thank you for your reply. I have built the IP phone, yes, that part is taken care of. The issue is automating the testing. I don’t want to manually make the call and check audio in each direction by speaking into the mic and listening on the speaker on the other side, and vice versa. There will be a lot of call scenarios like hold, transfer, etc., which is why I am trying to automate this.

So I decided that if I play the audio from code on the client side, record it on the server side, and play that back, I will have two audio files (sent and received) which have to be “similar”. It won’t be a bit-for-bit match, because there will be a second or so of silence at the beginning, between the time recording starts and the time audio is received.

I can’t use the Audacity GUI for this. I need a command-line utility of some sort that takes two audio files and returns a match value. I can then set a tolerance, and when the value falls outside it I know something was wrong with the audio, which can then be looked into manually. From the documentation, Audacity’s CompareAudio command in the scripting section seemed to do this, but getting it to work is extremely hard, at least for me.

Any ideas? Thanks again.

Kris

Kris_A · December 11, 2014, 5:48pm

Hi Steve,

Thank you for your reply. I mean similarity in terms of content. I am sorry, I am not very knowledgeable about audio details, but what I am looking for is a big difference between one audio file and the other, such as noise in one of them.
For example, one file has speech that is “H e l l o” and the received file has “H l l o”. If these files were viewed as waveforms, the noise would show up as a big spike. I want to catch that and similar problems, such as no audio or missing audio. I hope that makes sense?

I can’t compare size; the two files will never be exactly the same size. I am not sure about peak amplitude and RMS level; I will research how to measure them. I have explained a bit more about what I am trying to do in my reply to Gale. Please take a look, and if you have any thoughts or ideas, that would be very helpful.

Thanks again.
Kris

steve · December 11, 2014, 6:12pm

The problem is that computers have no idea what things “sound” like. They are able to store and manipulate audio samples, but are totally incapable of “hearing”. Specialist speech recognition software is able to apply complex pattern matching algorithms that can calculate a probability of a sound being a particular word, but really a computer has no idea if a sound is an angel singing or a dump truck applying its brakes.

I’m not sure what you are referring to there. Do you mean the “Audio Diff” proposal? Missing features - Audacity Support

The way that I would approach it would be to run specific tests with synthetic audio samples that will provide easily measurable results. For example, send a 1 kHz sine tone, and then look to see if the result still has a frequency of 1 kHz, how much harmonic distortion is there, how much noise is there. Try sending silence - does that come back as silence, or is there added noise, if so, how much. Try sending pulses of pink noise, what comes back? And then after those tests, do some listening tests with real speech.
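A test-tone check along these lines can be sketched in plain Python. Everything here is an assumption for illustration: the 8 kHz sample rate, the helper names, and the added random noise, which merely stands in for whatever the real network path does to the signal. The frequency is estimated from upward zero crossings (simple, but fragile on very noisy signals), and the tone-to-residual ratio is obtained by fitting a sine of the known frequency and measuring what is left over. The fit as written is only accurate when the analysed window holds a whole number of cycles.

```python
import math
import random

RATE = 8000    # assumed sample rate in Hz
FREQ = 1000.0  # test tone frequency in Hz


def sine(freq, seconds, rate=RATE, amp=0.5):
    """Generate a sine test tone as a list of float samples."""
    n = int(seconds * rate)
    return [amp * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]


def estimate_frequency(samples, rate=RATE):
    """Estimate frequency from the spacing of upward zero crossings."""
    crossings = []
    for i in range(1, len(samples)):
        a, b = samples[i - 1], samples[i]
        if a < 0.0 <= b:
            # Linear interpolation between the two samples around zero.
            crossings.append(i - 1 + (-a) / (b - a))
    if len(crossings) < 2:
        return 0.0
    return (len(crossings) - 1) * rate / (crossings[-1] - crossings[0])


def tone_to_residual_db(samples, freq, rate=RATE):
    """Fit a sine of known frequency, subtract it, and return
    tone power / residual power in dB (residual = noise + distortion)."""
    n = len(samples)
    w = 2 * math.pi * freq / rate
    s = 2.0 / n * sum(x * math.sin(w * i) for i, x in enumerate(samples))
    c = 2.0 / n * sum(x * math.cos(w * i) for i, x in enumerate(samples))
    residual = [x - s * math.sin(w * i) - c * math.cos(w * i)
                for i, x in enumerate(samples)]
    tone_power = (s * s + c * c) / 2.0
    residual_power = sum(r * r for r in residual) / n
    if residual_power == 0.0:
        return float("inf")
    return 10.0 * math.log10(tone_power / residual_power)


rng = random.Random(0)  # fixed seed so the demo is repeatable
sent = sine(FREQ, 1.0)
# Stand-in for the received signal: the sent tone plus a little noise.
received = [x + rng.uniform(-0.01, 0.01) for x in sent]

print("frequency (Hz):", estimate_frequency(received))
print("tone/residual (dB):", tone_to_residual_db(received, FREQ))
```

A received tone whose estimated frequency has drifted, or whose tone-to-residual ratio is much lower than on a known-good call, would be a candidate for manual listening.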

Kris_A · December 11, 2014, 6:22pm

Hi Steve,

> The problem is that computers have no idea what things “sound” like. They are able to store and manipulate audio samples, but are totally incapable of “hearing”. Specialist speech recognition software is able to apply complex pattern matching algorithms that can calculate a probability of a sound being a particular word, but really a computer has no idea if a sound is an angel singing or a dump truck applying its brakes.

I agree. But what if the goal is not to understand the content, just to detect whether there are any anomalies?

> I’m not sure what you are referring to there. Do you mean the “Audio Diff” proposal? Missing features - Audacity Support

No. It’s in the Audacity Manual: scroll down and, under the batch commands, there is a “CompareAudio” command. To me that looks like what I am looking for. Thoughts?

> The way that I would approach it would be to run specific tests with synthetic audio samples that will provide easily measurable results. For example, send a 1 kHz sine tone, and then look to see if the result still has a frequency of 1 kHz, how much harmonic distortion is there, how much noise is there. Try sending silence - does that come back as silence, or is there added noise, if so, how much. Try sending pulses of pink noise, what comes back? And then after those tests, do some listening tests with real speech.

Thanks for that. I will need to start reading about this, but this is a start. Thank you.

steve · December 11, 2014, 8:35pm

Unfortunately that will be no help. It is comparing sample by sample, so if the start position is just slightly off, the comparison will show that they are completely different (which they would be on a sample by sample comparison). Other differences, that could be inaudible, will show a huge difference when compared sample by sample, such as a high quality MP3 compared with the original uncompressed audio.
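The effect of a small start offset is easy to demonstrate. The sketch below is not what CompareAudio does internally; it just compares two lists sample by sample, then shows one common workaround that was not proposed in this thread: searching for the delay that maximises the cross-correlation before comparing. The 137-sample delay, the noise burst used as the test signal, and the function names are all invented for the example, and only positive delays (received audio arriving late) are searched.

```python
import math
import random


def best_lag(reference, received, max_lag):
    """Delay (in samples) of `received` relative to `reference` that
    maximises the cross-correlation. Brute force, so keep max_lag small."""
    best, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = min(len(reference), len(received) - lag)
        if n <= 0:
            break
        score = sum(reference[i] * received[i + lag] for i in range(n))
        if score > best_score:
            best, best_score = lag, score
    return best


def residual_db(reference, received, lag=0):
    """Power of (reference - received shifted by `lag`) relative to the
    reference power, in dB. Lower means a closer sample-by-sample match."""
    n = min(len(reference), len(received) - lag)
    ref_power = sum(reference[i] ** 2 for i in range(n)) / n
    res_power = sum((reference[i] - received[i + lag]) ** 2 for i in range(n)) / n
    if ref_power == 0.0:
        raise ValueError("reference is silent")
    if res_power == 0.0:
        return float("-inf")
    return 10.0 * math.log10(res_power / ref_power)


rng = random.Random(1)  # fixed seed so the demo is repeatable
sent = [rng.uniform(-0.5, 0.5) for _ in range(2000)]  # short noise burst
received = [0.0] * 137 + sent  # identical audio, arriving 137 samples late

lag = best_lag(sent, received, 400)
print("unaligned residual (dB):", residual_db(sent, received, 0))
print("detected lag (samples):", lag)
print("aligned residual (dB):", residual_db(sent, received, lag))
```

Without alignment the residual is larger than the signal itself even though the audio is identical. Alignment removes that particular problem, but not the other one mentioned above: lossy codecs can still produce a large sample-by-sample difference that is inaudible.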

Kris_A · December 11, 2014, 10:33pm

> Unfortunately that will be no help. It is comparing sample by sample, so if the start position is just slightly off, the comparison will show that they are completely different (which they would be on a sample by sample comparison). Other differences, that could be inaudible, will show a huge difference when compared sample by sample, such as a high quality MP3 compared with the original uncompressed audio.

I see. Thanks for pointing that out. Any ideas on how to go about this? Thanks for all the help.

Kris

steve · December 12, 2014, 12:54am

I described the approach that I would take in this post: How to compare two audio files ? - #6 by steve

Gale_Andrews · December 12, 2014, 1:09pm

I am still not really clear why you need to check every call live, as opposed to something like Steve’s suggestion to pre-check the system for noise and distortion.

You could certainly use Analyze > Silence Finder… in Audacity to check for silences at the noise level in the recorded files. Silences longer than, say, a couple of seconds could indicate dropouts or completely silent recordings. I am not sure if Silence Finder can be handled well in scripting. Silence Finder writes labels, so if no labels are produced (which displays a warning), no long silences were found and the file passes the test.

How long are the calls? Analyze > Sample Data Export… writes text files containing sample levels but is limited to 1 million audio samples. You could write a script to check the text files for long stretches of audio below a threshold as another way to do a check for dropouts.
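The dropout check itself is only a few lines once the sample values are in memory. This sketch deliberately leaves out parsing the Sample Data Export text file, since its layout depends on the export options chosen; it assumes the values have already been read into a list of linear floats. The threshold (0.01 of full scale), the 2-second minimum, and the function name are placeholder assumptions.

```python
import math


def find_dropouts(samples, rate, threshold=0.01, min_seconds=2.0):
    """Return (start_seconds, end_seconds) for every run of samples whose
    absolute value stays below `threshold` for at least `min_seconds`."""
    min_len = int(min_seconds * rate)
    runs = []
    start = None  # index where the current quiet run began, if any
    for i, x in enumerate(samples):
        if abs(x) < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start / rate, i / rate))
            start = None
    # A quiet run may extend to the very end of the recording.
    if start is not None and len(samples) - start >= min_len:
        runs.append((start / rate, len(samples) / rate))
    return runs


# Synthetic demo at an assumed 8 kHz: 1 s of tone, 3 s of silence, 1 s of tone.
rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 1000 * i / rate) for i in range(rate)]
recording = tone + [0.0] * (3 * rate) + tone

for begin, end in find_dropouts(recording, rate):
    print("quiet from %.2f s to %.2f s" % (begin, end))
```

Brief zero crossings inside the tone are ignored because they are far shorter than the minimum run length; only the 3-second gap is reported.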

Gale

ansnum · March 3, 2015, 1:06pm

Hi,

I have the same need. Have you found a solution?

steve · March 3, 2015, 4:08pm

I described the approach that I would take in this post: How to compare two audio files ? - #6 by steve
As there was no further comment from Kris I assume that answer was satisfactory.

Mahender · February 1, 2021, 2:16pm

Hello,

I am working on a similar scenario. Did you find any way to compare audio files?

steve · February 1, 2021, 3:12pm

Your question is too vague to give a precise answer (like asking: “How to compare two things”).

With Audacity you can easily compare:

Length of the audio
Peak level
RMS level
Frequency bandwidth
Sample rate

If you are working with generated test tones, you can also compare:

Frequency
Wave shape
Harmonic content