bit depth/sample rate: Audacity vis-a-vis Audio Midi Setup

ckeeb80 · July 12, 2013, 1:31am

OS X 10.6.8 (Snow Leopard)
Audacity 2.0.3 (installed from .dmg file)

Vinyl transfer rig (basic chain info):
turntable → phono preamp → tube buffer → stereo R/L RCA outs from tube buffer go into 3.5mm (1/8 inch) adapter → adapter plugs into 3.5 mm input (line-in) on left side of Macbook Pro (I use built-in Core Audio as sound card)

Hello

I’ve been recording vinyl to my MacBook Pro for a number of years, yet I only recently discovered “Audio Midi Setup” and its apparent ability to set the bit depth and sample rate for “line-in.” I nearly always set Audacity to record at 32-bit float/192kHz, and I typically export audio files at those same settings. I’ve never noticed any problems playing these exported 32-bit/192kHz files through VLC. Moreover, when I select “media information” for a track playing in VLC, the 32/192 files show an input and stream bitrate of around 12,000 kb/s, much higher than what is shown for 24-bit/96kHz files, for instance. That leads me to believe I actually am ending up with higher-quality sound files when I opt for the 32/192 recording and exporting settings in Audacity.
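For reference, a bitrate of around 12,000 kb/s is what the raw data rate of uncompressed stereo PCM at these settings works out to. A minimal sketch, assuming two channels and no compression (the function name is made up for illustration):

```python
def pcm_bitrate_kbps(bits_per_sample, sample_rate_hz, channels=2):
    """Raw data rate of uncompressed PCM, in kilobits per second."""
    return bits_per_sample * sample_rate_hz * channels / 1000

print(pcm_bitrate_kbps(32, 192000))  # 12288.0
print(pcm_bitrate_kbps(24, 96000))   # 4608.0
```

The number only describes how much data is stored per second; on its own it says nothing about how much of that data is real signal.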

However, I now see that in Audio Midi Setup my line-in settings only allow for up to 32-bit/96kHz. Does this mean I’m not really recording at 192kHz even though Audacity shows “Actual Rate: 192000” on the bottom right of the application window? Is the built-in “Intel High Definition Audio” (Core Audio) changing everything that’s coming in to make it match whatever the settings are in Audio Midi Setup? Even when I have line-in set to 24-bit/96kHz I have no problem recording from vinyl and exporting (both in Audacity) at 32/192.

I’m completely at a loss here.

steve · July 12, 2013, 1:01pm

Let me give an analogy to shed some light on this question.

Consider the value “5”.
“5” is an integer, so has a precision of one whole number (0 decimal places).
“5.0000” has a precision of 4 decimal places.
“5” is exactly equal to “5.0000”.

How that applies to digital recordings:

16 bit, 44100 Hz PCM is a reasonably high quality audio format (CD quality). It is capable of faithfully reproducing the full fidelity of a vinyl recording. The frequency range of audio that is sampled at 44100 Hz is about 20 kHz (theoretically up to 22050 Hz, but is limited by the practicalities of technology to around 20000 Hz). People cannot hear above 20 kHz (20000 Hz) - not even so much as a hint. 16 bit audio has a dynamic range of up to about 100 dB, which far exceeds all but the very best, pristine vinyl recordings assuming perfect reproduction and ideal conditions. In reality the dynamic range of a really good vinyl record played on really good equipment in a really good listening room is unlikely to be much over 80 dB.
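The figures above can be roughly checked with the usual rules of thumb: the highest representable frequency is half the sample rate, and each bit adds about 6 dB of dynamic range. This is only a sketch of the arithmetic; it ignores dither and noise shaping, which is why it lands a little under the "about 100 dB" quoted above.

```python
import math

def nyquist_hz(sample_rate_hz):
    """Highest frequency that a given sample rate can represent."""
    return sample_rate_hz / 2

def dynamic_range_db(bits):
    """Ratio of full scale to one quantisation step, in decibels."""
    return 20 * math.log10(2 ** bits)

print(nyquist_hz(44100))               # 22050.0
print(round(dynamic_range_db(16), 1))  # 96.3
print(round(dynamic_range_db(24), 1))  # 144.5
```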

So, we have analogue audio data from the vinyl, which has a finite frequency range and a finite dynamic range, both of which can just be reproduced digitally in full at 16 bit 44.1 kHz.

Other than the limits of technology, there is nothing to stop us from storing the digital data at 10000000000 Hz sample rate and 100000000000 bits per sample, but that would clearly be overkill. What would all of those extra samples and bits actually be doing? They are like the extra 0’s in “5.0000” - they don’t actually do anything other than make the data bigger.
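The "5" versus "5.0000" point can be illustrated for bit depth with a small sketch. It assumes samples are scaled to the range -1.0..1.0, which is a common convention but an assumption here, and the helper names are made up. Every possible 16 bit sample is widened to 32 bit float and narrowed back again:

```python
import struct

def to_float32(x):
    """Round a Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def int16_to_float(sample):
    """Widen a 16-bit integer sample to a 32-bit float in -1.0..1.0."""
    return to_float32(sample / 32768.0)

def float_to_int16(value):
    """Narrow a float sample back to a 16-bit integer."""
    return int(round(value * 32768.0))

# Check every possible 16-bit sample value.
round_trip_ok = all(
    float_to_int16(int16_to_float(s)) == s for s in range(-32768, 32768)
)
print(round_trip_ok)  # True
```

The wider format holds exactly the same values; the extra bits carry no additional information about the original signal.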

So if 16 bit 44100 Hz is “just enough”, do we ever need more?
Yes we do, and “more is better” is true up to a point.

When processing audio, there will usually be errors on the order of 1 LSB (least significant bit). What that means in practice is that if we process 16 bit audio, the result will probably only be accurate to around 14 or 15 bits, which is not quite enough for “pristine quality”. Audacity uses “32 bit float” format internally, which provides extreme accuracy when processing. In effect, processing in 32 bit float format is virtually perfect in terms of accuracy - you can apply thousands of processes with no loss of sound quality at all.
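A toy illustration of that rounding error, not how Audacity itself processes audio: each sample is attenuated by 0.7 and then restored, with the result stored after every step either on a 16 bit grid or as a 32 bit float. The gain value, the number of passes and the helper names are arbitrary choices for the sketch.

```python
import struct

def to_float32(x):
    """Round a Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

def to_int16_grid(x):
    """Round to the nearest 16-bit step (samples scaled to -1.0..1.0)."""
    return round(x * 32768) / 32768

def round_trip(sample, store, passes=100):
    """Attenuate then restore the level, storing the result after each step."""
    for _ in range(passes):
        sample = store(sample * 0.7)
        sample = store(sample / 0.7)
    return sample

def worst_error_in_steps(store):
    """Largest error over a range of quiet samples, in 16-bit steps."""
    worst = 0.0
    for n in range(1, 1001):
        original = n / 32768
        worst = max(worst, abs(round_trip(original, store) - original))
    return worst * 32768

err_16 = worst_error_in_steps(to_int16_grid)
err_32 = worst_error_in_steps(to_float32)
print(err_16, err_32)
```

Stored at 16 bit, some samples come back a whole step away from where they started; stored as 32 bit float, the error stays a tiny fraction of a 16 bit step.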

44100 Hz is “just enough” to capture the full audio frequency range, but there may be a measurable amount of “ringing artefacts” in the extreme high frequency range due to the steepness of the anti-aliasing filters. Increasing the sample rate to around 80 kHz allows much gentler anti-aliasing filters to be used, so that frequencies above 20 kHz are rolled off less steeply, which removes the possibility of ringing in the extreme high frequency range. (Note: There is no evidence that reconstruction and anti-aliasing issues are audible).

So “more is better” up to a point, and that “point” is in the region of:
For recording: 24 bit 80 kHz.
For processing: 32 bit float 80 kHz.

80 kHz is not a standard rate and is not an exact multiple of the clock rate used in audio hardware, so to avoid resampling errors professional quality recording more often uses the nearest standard rate, which is 96 kHz. For audio, there are no benefits to increasing the sample rate or bit depth beyond 32 bit float 96 kHz (though there are disadvantages to doing so).
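The "exact multiple" point can be sketched with simple ratios. The post does not say which clock rate is meant, so the two base rates below (44100 Hz and 48000 Hz) are an assumption made purely for illustration:

```python
from fractions import Fraction

# Assumed base rates for illustration only; not stated in the post.
BASE_RATES_HZ = (44100, 48000)

def ratio_to_base(rate_hz):
    """Ratio of a sample rate to each assumed base rate, in lowest terms."""
    return {base: Fraction(rate_hz, base) for base in BASE_RATES_HZ}

for rate in (80000, 88200, 96000):
    ratios = ratio_to_base(rate)
    whole = [base for base, r in ratios.items() if r.denominator == 1]
    print(rate, "whole multiple of:", whole)
```

Under that assumption, 96000 Hz is exactly twice 48000 Hz and 88200 Hz is exactly twice 44100 Hz, while 80000 Hz is a whole multiple of neither.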

There is an interesting article about this and related issues here: Myths (Vinyl) - Hydrogenaudio Knowledgebase

ckeeb80 · July 12, 2013, 2:33pm

Wow, what a great explanation.
Thank you!

But is there any downside to recording and exporting 32-bit float 96kHz, opening the file in something like Izotope to run decrackle, then exporting as 24-bit 96kHz? According to your suggestions above, one should use 32-bit if any processes are going to be run; thus I assume recording 32-bit rather than 24-bit is preferable when one intends to run plug-ins, etc.

Also, a somewhat unrelated question: what’s your suggestion as far as dithering is concerned when one is exporting (and resampling) a 32/96 to 24/96? I’ve heard various contradictory opinions (don’t use any dithering; use shaped dithering; etc.)

Thanks again for taking the time to provide such an excellent reply.

steve · July 12, 2013, 3:01pm

Even the very best analogue to digital audio converters are not accurate to 24 bits. If I recall correctly the state of the art is up to 23 bits.
Ideally you should set Audacity to record in 32 bit float. This guarantees that however many bits from the sound card are accurate, whether that be 15 bits or 23 bits, the data remains accurate throughout the editing and processing stages.
Ideally you should keep the data in 32 bit float format until all editing and processing is complete. Then, and only then, you would convert the data into the final destination format (16 bit for audio CDs, perhaps 24 bit for WAV or FLAC files if your audio player supports 24 bit).

Dither makes very little difference when converting from 32 bit to 24 bit. You are talking about noise levels that are comparable to the sound of someone breathing in the adjacent room. Theoretically, shaped dither will provide the lowest overall THD+Noise.
Dither should only be applied once, and that is when you do the final conversion from 32 bit float to whatever lower bit depth format.
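As a rough sketch of what that final conversion step does, here is plain triangular (TPDF) dither added just before rounding to the lower bit depth. This is not shaped dither and is not Audacity's own code; the function name and parameters are made up for illustration, and samples are assumed to be scaled to -1.0..1.0.

```python
import random

def float_to_int(sample, bits=16, dither=True, rng=random):
    """Convert a -1.0..1.0 float sample to a signed integer of `bits` width."""
    full_scale = 2 ** (bits - 1)
    scaled = sample * full_scale
    if dither:
        # TPDF dither: the sum of two uniform values, spanning +/- 1 step.
        scaled += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    value = int(round(scaled))
    # Clamp to the valid integer range for this bit depth.
    return max(-full_scale, min(full_scale - 1, value))

# A constant signal a quarter of one 16-bit step high.
rng = random.Random(0)
quiet = 0.25 / 32768
plain = [float_to_int(quiet, dither=False) for _ in range(10000)]
dithered = [float_to_int(quiet, rng=rng) for _ in range(10000)]

plain_mean = sum(plain) / len(plain)
dithered_mean = sum(dithered) / len(dithered)
print(plain_mean, dithered_mean)
```

Without dither the quarter-step signal simply rounds away to zero; with dither it survives as an average of roughly a quarter of a step, at the cost of a little added noise. At 24 bit that noise is far lower again, which is why it makes so little difference there.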

ckeeb80 · July 12, 2013, 5:10pm

Thanks, Steve. This is all very enlightening.