audacity resampling sounds a lot better than ffmpeg - why?

tensorfoo
Posts: 8
Joined: Fri Jul 02, 2021 6:39 am
Operating System: Linux *buntu

Post by tensorfoo » Thu Jul 08, 2021 6:09 am

I am using ffmpeg-python because I have a lot of audio to process and it's very tedious to do all the operations by hand in Audacity. But the quality of the ffmpeg output is disappointing compared with resampling (and normalizing) in Audacity.

I just came across Secret Rabbit Code, which I might try wrapping if it will help.

But would it be possible to find out what Audacity uses underneath, and whether I could maybe borrow that code instead?

steve
Site Admin
Posts: 81955
Joined: Sat Dec 01, 2007 11:43 am
Operating System: Linux *buntu

Re: audacity resampling sounds a lot better than ffmpeg - why?

Post by steve » Thu Jul 08, 2021 8:30 am

Audacity uses "libsoxr", which is the resampling library used in SoX http://sox.sourceforge.net/

I've not tried it, but there is a Python wrapper available: https://pypi.org/project/sox/
9/10 questions are answered in the FREQUENTLY ASKED QUESTIONS (FAQ)

Post by tensorfoo » Thu Jul 08, 2021 8:53 am

Thanks steve, I am trying the libsoxr Python binding now. I'm finding that resampling to 16 kHz clips my output, but Audacity doesn't do that. Any ideas? It seems Audacity is doing something smarter.

Edit: oh! It turns out the clipping happens even before I resample. It happens when the data is loaded. So I need to figure out how to tell ffmpeg not to clip my data.

Post by steve » Thu Jul 08, 2021 10:28 am

tensorfoo wrote:
Thu Jul 08, 2021 8:53 am
I'm finding resampling to 16khz clips my output but audacity doesn't do that. Any ideas?

How much is it clipping? Does your input file go right up to 0 dB or have you left a little headroom?

If the input goes all the way up to 0 dB, then it is very likely that there will be a tiny bit of "clipping" when resampling. (The audio may not actually be "clipped" even if Sox or Audacity detects it as clipped; it may just be "touching" the maximum / minimum values.)

Slight clipping is more likely when up-sampling to a higher rate, but may occur (with both Sox and Audacity) with any resampling if the input is at or very close to 0 dB, especially if the audio is heavily compressed (dynamic compression / limiter effect).
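
To make the difference between "touching" and actually exceeding full scale concrete, here is a minimal Python sketch. The function names and sample values are made up for illustration, and this is not how Sox or Audacity detect clipping internally; it only assumes float samples in the nominal range -1.0 to 1.0.

```python
def peak_level(samples):
    """Largest absolute sample value (0.0 for an empty input)."""
    return max((abs(s) for s in samples), default=0.0)


def count_full_scale(samples, limit=1.0):
    """Count samples whose magnitude reaches or exceeds the limit."""
    return sum(1 for s in samples if abs(s) >= limit)


# Hypothetical float samples in the nominal range -1.0..1.0.
touching = [0.2, 1.0, -0.5, -1.0, 0.9]     # reaches full scale, never exceeds it
overshoot = [0.2, 1.07, -0.5, -1.0, 0.9]   # one sample goes past full scale

for name, data in (("touching", touching), ("overshoot", overshoot)):
    print(name, "peak:", peak_level(data), "flagged:", count_full_scale(data))
```

Both lists get the same number of flagged samples, which is why a simple "at or beyond full scale" check cannot tell harmless touching from real overshoot; the peak value is what separates them.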

The solution is to allow a little headroom.
I've not used the python wrapper, but with the command line version of Sox you can do:

sox input.wav -r 16000 output.wav gain -1

This resamples "input.wav" to a 16000 Hz "output.wav" and reduces the gain by 1 dB (giving you 1 dB of headroom to avoid clipping a 0 dB input).
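
For anyone doing the same thing on raw samples in Python, the 1 dB reduction is just a multiplication by a constant. The sketch below assumes float samples in the range -1.0 to 1.0; `db_to_gain` and `apply_gain` are illustrative names, not part of Sox or any wrapper.

```python
def db_to_gain(db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def apply_gain(samples, db):
    """Scale float samples (nominal range -1.0..1.0) by a gain given in dB."""
    factor = db_to_gain(db)
    return [s * factor for s in samples]


# Hypothetical full-scale input: peaks sit exactly at +/-1.0 (0 dB).
samples = [0.0, 1.0, 0.0, -1.0]
quieter = apply_gain(samples, -1.0)   # same amount as "gain -1" above
peak = max(abs(s) for s in quieter)

print(f"linear factor for -1 dB: {db_to_gain(-1.0):.4f}")   # 0.8913
print(f"new peak: {peak:.4f}")                              # 0.8913
```

So 1 dB of headroom means the peaks sit at roughly 89% of full scale, leaving about 11% of room for any overshoot introduced by resampling.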


This is what happens to a square wave (a "worst case" example), generated at a peak amplitude of 1 (0 dB) with a sample rate of 44100, resampled to both 16000 and 192000 Hz (without any headroom).

[Image: square-wave.png]
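
The overshoot on a square wave can be reproduced with a small idealized model: a resampler has to remove everything above half the new sample rate, and a square wave with its upper harmonics removed rings past its original peak (the Gibbs phenomenon). The sketch below is not what Sox or Audacity actually compute; it simply sums the Fourier series of a +/-1 square wave up to an assumed cutoff. The 100 Hz frequency and 8000 Hz cutoff are arbitrary example values.

```python
import math


def bandlimited_square(t: float, f0: float, nyquist: float) -> float:
    """Fourier series of a +/-1 square wave, keeping only harmonics below nyquist."""
    total = 0.0
    k = 1
    while k * f0 < nyquist:
        total += math.sin(2.0 * math.pi * k * f0 * t) / k
        k += 2  # a square wave contains only odd harmonics
    return 4.0 / math.pi * total


f0 = 100.0         # hypothetical square-wave frequency in Hz
nyquist = 8000.0   # half of a 16000 Hz target rate

# Evaluate one full period on a fine time grid to find the peak.
steps = 20000
peak = max(
    abs(bandlimited_square(n / (steps * f0), f0, nyquist))
    for n in range(steps)
)
print(f"peak of band-limited square wave: {peak:.3f}")
```

The peak comes out at about 1.18, well above the original 1.0, even though no gain was applied. That is why a square wave at 0 dB is a worst case and why a little headroom matters.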


This is a square wave resampled to 192000 Hz using Sox (original 44100 Hz track at the top, resampled version at the bottom):

sox input.wav -r 192000 192.wav gain -1

[Image: square-wave2.png]

Post by tensorfoo » Thu Jul 08, 2021 4:59 pm

steve wrote:
Thu Jul 08, 2021 10:28 am
tensorfoo wrote:
Thu Jul 08, 2021 8:53 am
I'm finding resampling to 16khz clips my output but audacity doesn't do that. Any ideas?
How much is it clipping? Does your input file go right up to 0 dB or have you left a little headroom?

If the input goes all the way up to 0 dB, then it is very likely that there will be a tiny bit of "clipping" when resampling (the audio may not actually be "clipped" even if Sox or Audacity detect it as clipped. It may just be "touching" the max / minimum values.)

Yes, you nailed it. It was going to 0 dB. By the way, I still haven't got used to the idea of using negative dB values... it seems counterintuitive to me? Thanks so much. I have managed to clean up my audio by reducing the level before resampling, as you suggested. Now it's nice even at 16 kHz! I was going crazy for the last few days.

Will study the rest of your post but it's a bit over my head at the moment. Appreciate it.

Post by steve » Thu Jul 08, 2021 5:20 pm

tensorfoo wrote:
Thu Jul 08, 2021 4:59 pm
By the way i still haven't got used to the idea of using negative db values .. it seems counterintuitive to me?

:D yes it does look a bit weird at first, but really it has to be that way when dealing with signals.

The dB scale is a logarithmic "ratio" rather than a "unit", so dB is always measured relative to an absolute reference level. The absolute reference level is the "0 dB" level, so everything above that level is a positive value, and everything below is a negative value.

When measuring "Sound Pressure Level" (the level of a sound in air), the 0 dB level is set at "the threshold of hearing" ("20 micropascals" in SI units). Thus most "sounds" are measured with positive values (though audio labs for scientific research may be much quieter than this).
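
As a quick illustration of that reference level, here is a minimal sketch using the standard formula for sound pressure level, 20 * log10(p / p_ref), with p_ref = 20 micropascals. The function name `spl_db` and the example pressures are chosen for illustration only.

```python
import math

P_REF = 20e-6  # reference pressure in pascals (threshold of hearing)


def spl_db(pressure_pa: float) -> float:
    """Sound pressure level in dB relative to 20 micropascals."""
    return 20.0 * math.log10(pressure_pa / P_REF)


# Example pressures in pascals: the reference itself, something louder,
# and something quieter than the threshold of hearing.
for p in (20e-6, 1.0, 10e-6):
    print(f"{p:g} Pa -> {spl_db(p):+.1f} dB SPL")
```

A pressure equal to the reference gives 0 dB, louder sounds give positive values, and anything quieter than the reference gives a negative value.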

When measuring signals, there is no direct equivalent to "threshold of hearing", but it is essential to have a reference level. The reference level that everyone uses for audio signals is "full scale". That is, the full height of an Audacity track, or a linear value of +/- 1.0 is the 0 dB level. This scale is sometimes written as "dBFS" (dB with reference to Full Scale). As "valid" signal levels are below the 0 dB reference, they are negative.
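
The same idea for signals, as a minimal sketch: with full scale (a linear value of 1.0) as the reference, the level in dBFS is 20 * log10(|amplitude|). The helper name `to_dbfs` is made up for this example.

```python
import math


def to_dbfs(amplitude: float) -> float:
    """Level of a linear amplitude in dB relative to full scale (1.0)."""
    if amplitude == 0:
        return float("-inf")  # digital silence has no finite dB value
    return 20.0 * math.log10(abs(amplitude))


# Example linear amplitudes: full scale, half scale, a tenth, and silence.
for a in (1.0, 0.5, 0.1, 0.0):
    print(f"amplitude {a:g} -> {to_dbfs(a):.2f} dBFS")
```

Full scale gives 0 dBFS, half scale gives about -6 dBFS, and every valid level below full scale comes out negative.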
