How to sample, if you must...

Dear Audio Processing Forum People,

Sometimes we need to sample an audio signal.
Maybe some old favorite vinyl LP?

The Shannon-Nyquist sampling theorem says
that in order to capture information about frequency f,
you must use a sampling rate greater than 2f.

Putting this the other way around, the upper half of your Fourier amplitudes
carries no new information (for a real signal they are complex conjugates of the lower half).
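
The conjugate relationship can be checked numerically. Below is a minimal sketch using only the Python standard library; the helper names `dft` and `is_conjugate_symmetric` and the 8-sample test signal are made up for illustration, and a real program would use a proper FFT library instead of this slow transform.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def is_conjugate_symmetric(spectrum, tol=1e-9):
    """True if bin N-k is the complex conjugate of bin k for every k."""
    n = len(spectrum)
    return all(
        abs(spectrum[(n - k) % n] - spectrum[k].conjugate()) < tol
        for k in range(n)
    )

# A real-valued test signal: two sinusoids over 8 samples.
N = 8
signal = [
    math.sin(2 * math.pi * t / N) + 0.5 * math.cos(2 * math.pi * 3 * t / N)
    for t in range(N)
]

X = dft(signal)
print(is_conjugate_symmetric(X))  # True
```

Bins above N/2 mirror the bins below it, so only frequencies up to half the sampling rate are independent.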

So we think like this: the human built-in microphone,
i.e. the human ear, recognizes frequencies from 20Hz to 20kHz,
therefore a sampling rate of 40kHz (or 44.1kHz) is just OK.

Similar thinking applies to voice recording.
Voice range is up to 4kHz, so a rate of 8kHz seems good.

But actual experiments show that we can obtain better
quality if we sample the signal at a much higher rate,
compute the Fourier transform and the INVERSE Fourier transform, AND re-sample
at the lower (final) rate.

For example, first sample voice audio at 20kHz (say),
compute the FFT, then the inverse FFT, and re-sample this inverse at 8kHz.
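
One way to read that recipe is as frequency-domain resampling: transform the oversampled block, keep only the bins below the new Nyquist frequency (4kHz here), and inverse-transform at the shorter length. The sketch below is only an illustration under several assumptions: it uses a slow standard-library DFT, treats the block as periodic, picks test tones that land exactly on DFT bins, and the names `dft` and `fft_resample` are hypothetical. Real audio tools normally use a low-pass filter plus a polyphase resampler instead.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) DFT; with inverse=True returns the inverse transform."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out

def fft_resample(x, m):
    """Resample len(x) samples down to m samples by truncating the spectrum."""
    n = len(x)
    if not (0 < m < n) or m % 2:
        raise ValueError("this sketch needs an even m smaller than len(x)")
    X = dft(x)
    half = m // 2
    Y = [0j] * m
    for k in range(half):
        Y[k] = X[k]              # DC and positive frequencies
    for k in range(1, half):
        Y[m - k] = X[n - k]      # matching negative frequencies
    # Y[half], the new Nyquist bin, is left at zero.
    y = dft([v * m / n for v in Y], inverse=True)
    return [v.real for v in y]

FS_HIGH, FS_LOW = 20000, 8000
N = 200                          # 10 ms of audio at 20 kHz
M = N * FS_LOW // FS_HIGH        # 80 samples at 8 kHz

# 1 kHz "voice" tone plus a 7 kHz tone that would alias at 8 kHz.
x = [
    math.sin(2 * math.pi * 1000 * t / FS_HIGH)
    + 0.5 * math.sin(2 * math.pi * 7000 * t / FS_HIGH)
    for t in range(N)
]

y = fft_resample(x, M)
expected = [math.sin(2 * math.pi * 1000 * t / FS_LOW) for t in range(M)]
max_error = max(abs(a - b) for a, b in zip(y, expected))
print(max_error < 1e-9)  # True
```

The 7kHz component is removed before the rate drops to 8kHz, so it cannot fold back into the voice band. Sampling directly at 8kHz without a filter would have recorded it as a 1kHz alias.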

My question is, what is the reasonable and best choice
for this higher frequency?

It would be stupid and impractical to sample voice at 1MHz!

I think that this is a fundamental practical question in
good-quality sampling.

But I believe that this Forum is the best one in the world…

j.

So for a 20kHz sine wave, 44.1kHz gives us only about 2 samples per cycle, which is not a particularly good likeness. And since frequencies above 22050 Hz (the Nyquist frequency) alias back into the audio band as garbage, A/D converters have to employ a low-pass filter that starts rolling off some way below the Nyquist frequency. The slope of that filter determines how close to the Nyquist frequency we can practically get, which is usually around 17kHz. (There are some fancy mathematical algorithms that can produce pretty close approximations right up to 20kHz, but let’s leave that out of the discussion for now.)
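
To put numbers on the aliasing point, here is a small Python sketch. The function name `alias_frequency` is just for illustration, and it assumes an ideal sampler with no anti-aliasing filter in front of it.

```python
def alias_frequency(f_hz, fs_hz):
    """Apparent frequency of a pure tone at f_hz after ideal sampling at
    fs_hz with no anti-aliasing filter in front of the converter."""
    f = f_hz % fs_hz              # sampling cannot tell f from f + k*fs
    return min(f, fs_hz - f)      # fold into the range 0 .. fs/2

FS = 44100
for tone in (20000, 25000, 30000):
    print(tone, "Hz is recorded as", alias_frequency(tone, FS), "Hz")

print("samples per cycle at 20 kHz:", FS / 20000)
```

A 30kHz tone ends up at 14.1kHz, right in the audible range, which is why the low-pass filter has to come before the converter.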

As for sampling at 1MHz: apart from the cost, the required data rate and the huge file sizes, you would need to take some serious preventative measures to avoid recording RF.
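
For a rough feel of the data rates involved, the sketch below computes uncompressed PCM sizes. The 16-bit stereo format is an assumption chosen for illustration (it is not stated above), and `bytes_per_minute` is a made-up helper name.

```python
def bytes_per_minute(sample_rate_hz, bits_per_sample=16, channels=2):
    """Uncompressed PCM size for one minute of audio (assumed 16-bit stereo)."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * 60

for rate in (44100, 48000, 96000, 1_000_000):
    mb = bytes_per_minute(rate) / 1e6
    print(f"{rate:>9} Hz: {mb:8.2f} MB per minute")
```

Under those assumptions 48kHz costs about 11.5 MB per minute, while 1MHz costs 240 MB per minute.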

For audio, 48kHz is usually considered a good compromise and is widely used in professional audio. Some audio equipment will double this up to 96kHz, but beyond 96kHz the “cons” outweigh the “pros”.

48kHz may seem like it is still cutting things a bit fine (less than 3 samples per cycle for a 20kHz sine wave), but in practice modern D/A converters can handle this pretty well. Also, although human hearing is quoted as 20Hz to 20kHz, “hearing” above 16kHz is not really comparable to “hearing” in the mid frequency range (even for young people with excellent hearing). The human ear can discern the difference between a 2kHz sine wave and a 2kHz triangle wave, but it does so by responding to the overtones within the sound. With real world sounds (as opposed to signal generators) there is no discernible difference between an 18kHz sine and an 18kHz triangle, since the overtones in the triangle wave are inaudible (a triangle wave has only odd harmonics, so the first overtone would be at 54kHz).
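
The overtone argument can be sketched in a few lines. An ideal triangle wave contains only odd harmonics of its fundamental; the 20kHz audibility limit and the function name `audible_overtones` are assumptions for illustration.

```python
def audible_overtones(f0_hz, limit_hz=20000):
    """Overtones of an ideal triangle wave at f0_hz that fall at or below limit_hz.
    A triangle wave has energy only at odd multiples of the fundamental."""
    overtones = []
    n = 3
    while n * f0_hz <= limit_hz:
        overtones.append(n * f0_hz)
        n += 2
    return overtones

print(audible_overtones(2000))    # [6000, 10000, 14000, 18000]
print(audible_overtones(18000))   # []
```

The 2kHz triangle has four overtones inside the audible band, while the 18kHz triangle has none, so only its fundamental can be heard.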

Short answer - 48kHz (in my opinion).

Steve,
thanks for all that info!

48000 is the television sound sample frequency. 44100 is the compromise the engineers made in order to get the most music on a music CD and satisfy most, but certainly not all, the people listening to it.

I use the older Nyquist value of 2.6 in calculations. 2.0 only works if you add “noise” (dithering) to the sound. Given that, 48000 results in very low distortion sound all the way up to 18.5 kHz (48000 / 2.6). With graceful dithering, very much higher.
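
That rule of thumb is easy to tabulate. The sketch below simply divides the sample rate by 2.6; the function name `usable_bandwidth_hz` is made up for illustration, and the 2.6 factor is the working value quoted above rather than a formal standard.

```python
def usable_bandwidth_hz(sample_rate_hz, samples_per_cycle=2.6):
    """Highest frequency that still gets samples_per_cycle samples per cycle."""
    return sample_rate_hz / samples_per_cycle

for rate in (44100, 48000, 96000):
    print(rate, "Hz ->", round(usable_bandwidth_hz(rate)), "Hz")
```

By the same rule 44100 comes out at roughly 17 kHz.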

Koz