Quality Settings for CD

Topic split from here: http://forum.audacityteam.org/viewtopic.php?p=184850#p184850

A very brief answer so as not to distract from the main topic.
Firstly, the difference in quality will be small. The most noticeable differences occur when the volume level (amplitude) is very low, such as at the end of a fade out.
Audacity works internally in 32-bit float format as this provides much better accuracy (and hence quality) and better performance (speed) than working in an integer format. This means that if you process the sound in any way (anything other than a simple cut/paste/delete type edit) in any format other than 32-bit float, Audacity needs to convert the audio to 32-bit float, process, then convert back to the original format. For best sound quality, this should be avoided if possible.
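
To illustrate why repeated conversions are worth avoiding, here is a rough Python sketch. It is not Audacity's code. It assumes plain rounding to 16-bit steps with no dither, uses Python's ordinary 64-bit floats as a stand-in for 32-bit float, and the gain values and test tone are made up for the example. It compares quantizing once at the end against quantizing after every processing step.

```python
import math

def quantize16(x):
    """Round a float sample to the nearest 16-bit step (no dither), then back to float."""
    return max(-32768, min(32767, round(x * 32768))) / 32768

def rms_error(a, b):
    """Root-mean-square difference between two equal-length sample lists."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))

# A quiet 440 Hz sine at about -60 dB, one second at 44.1 kHz (illustrative only).
signal = [0.001 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]

# Hypothetical chain of five gain changes whose net gain is 1.0.
gains = [0.9, 1.1, 0.8, 1.25, 1.0 / (0.9 * 1.1 * 0.8 * 1.25)]

# Path 1: stay in float throughout, quantize once at the end (like a single export).
float_path = signal
for g in gains:
    float_path = [s * g for s in float_path]
once = [quantize16(s) for s in float_path]

# Path 2: convert back to 16-bit after every step.
each = signal
for g in gains:
    each = [quantize16(s * g) for s in each]

err_once = rms_error(once, signal)
err_each = rms_error(each, signal)
print("quantized once:      ", err_once)
print("quantized every step:", err_each)
```

The second figure should come out noticeably larger than the first, because each intermediate conversion adds its own rounding error on top of the earlier ones.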

The hiss that you notice when exporting from 32-bit float to 16 bit WAV is due to dither noise that is intentionally added during the conversion from a high bit depth to a lower bit depth. You can read more about dither here: Missing features - Audacity Support

Standard audio CDs always use 16 bit 44.1 kHz stereo. This means that if processing the audio, converting from 32-bit float to 16 bit is unavoidable at least once somewhere along the line. The “trick” is to minimise the noticeable effects of the conversion. The optimum way to do this depends on what exactly you are doing with the audio.

  1. If you are only performing simple cut/paste/delete type edits, then the quickest/easiest methods are to either:
     a) set the Quality preferences to 16 bit, and/or
     b) set “dither” for the “High quality conversion” settings to “none”.
     These options are not recommended if you process the audio in any way.
  2. If you apply one and only one process to the audio, then the above method may be used, but gives little or no benefit over using the default (32-bit float with dither enabled) settings. This will incur one conversion from 32-bit float to 16-bit integer (the same as if you work in 32-bit float throughout and export to 16-bit integer).
  3. If applying more than one process, then for best quality the first method (1a) should not be used because with each process the audio will be converted from 16-bit to 32-bit float (lossless), then back from 32-bit float to 16-bit integer (not lossless). Method (1b), if used with the track set to 32-bit float, is better in that there will be just one lossy conversion (on Export). The choice of trade-off between using or not using dither depends on several factors:
     • the type of music,
     • whether you listen through headphones or speakers,
     • the volume level that you listen at,
     • which is more important to you, the quality of the sound or the quality of the silences,
     • how much trouble you want to go to.

Regarding item 3.

  • Classical music tends to show up dither noise more than other types due to the extreme dynamic range that can be present.
  • Headphones tend to show up dither noise more than speakers.
  • Dither noise is at a very low level. Normalizing (in 32-bit float format) close to 0 dB directly before export will minimise the relative level of dither noise.
  • Dither noise extends the dynamic range for a given number of “bits” and reduces quantize errors. For “sounds” it is beneficial. The problem comes when the sound level approaches silence.
  • Without dither in 16 bit, there is no sound below -96 dB.
  • Without dither in 16 bit, there is no meaningful sound below -90 dB.
  • Without dither in 16 bit, there is no musical sound below about -80 dB.
  • With dither, in 16 bit, random noise (hiss) is added with a level of around -80 dB.
  • With dither, in 16 bit, musical sound can extend down below -100 dB, but against a background of dither noise. Below about -80 dB the noise will be louder than the music, but the music may still be there.
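
The last point can be sketched in a few lines of Python. This is only an illustration under assumptions: it uses the textbook triangular dither (the sum of two uniform random values), works in units of one 16-bit step, and the test tone is made up. It is not taken from Audacity's source. The tone peaks at 0.4 of one step, which is roughly -98 dB relative to full scale.

```python
import math
import random

random.seed(1)  # fixed seed so the run is repeatable

def tpdf():
    """Textbook triangular dither: sum of two uniform values, range -1..+1 step."""
    return random.uniform(-0.5, 0.5) + random.uniform(-0.5, 0.5)

# A sine whose peak is only 0.4 of one 16-bit step (values are in steps, not in -1..1).
signal = [0.4 * math.sin(2 * math.pi * n / 100) for n in range(20000)]

plain = [round(s) for s in signal]              # no dither
dithered = [round(s + tpdf()) for s in signal]  # dither added before rounding

def mean_product(a, b):
    """Average of a*b: zero if the output carries no trace of the signal."""
    return sum(p * q for p, q in zip(a, b)) / len(a)

print("undithered output is all zeros:", all(v == 0 for v in plain))
print("signal present in dithered output:", mean_product(dithered, signal))
```

Without dither every sample rounds to zero, so the tone is simply gone. With dither the output is noisy, but its average still follows the tone, which is the sense in which the music "may still be there" underneath the hiss.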

There are different types of dither that may be used. Rectangle dither is the least effective on normal level sound, but produces no noise when the sound level drops to complete silence. Shaped dither is one of the most effective, but produces the highest peak level of noise during absolute silence (but not the “loudest” as the noise is “shaped” to occur mostly at frequencies where low level hearing is least sensitive).

In my opinion, for very best quality (this is where “how much trouble you want to go to” comes in), all work in Audacity should be:

  • done in 32-bit float format,
  • then “rendered” to 16-bit with dither enabled,
  • then rendered back to 32-bit float,
  • then short fade-outs to silence applied, either manually or with a Noise Gate at a very low level, usually around -70 to -80 dB (fading out from a level below the dither noise level would need to be done manually),
  • then exported to 16 bit with dither disabled.

Less arduous methods of minimising dither noise during silences are to either:
a) Trim tracks tightly before export, then use the CD burning software to add absolute silence between tracks.
b) Work in 32-bit float and use “rectangle” dither.

If you wish to discuss this further, please start a new topic so that we don’t get totally distracted :wink:
(yes, this was a brief answer :wink: )

At Steve’s suggestion, I am starting a new topic because I want to be 100% clear on my quality settings.

My tracks (WAV, FLAC) are almost always from classical music sources. These are either ripped from CDs or are music downloads. I always convert MP3 to WAV or FLAC before importing into Audacity. And my exports are always burned to CD-R and played through quality speakers.

First of all, in addition to basic editing (cutting, pasting, deleting, etc.), I do the following in Audacity:

  1. Sliding Time Scale/Pitch Shift to slow down the speed of music without changing the pitch. Once in a while, I will speed up but not too often.

  2. Fade outs and, once in a while, fade ins.

  3. Insert silence.

  4. Normalization to increase the overall level (only if the source material was recorded at a very low level).

Now I always had my quality settings as:
Sampling 44,100 Hz 16 bit. I changed the 16 bit to 32 bit float in 2.0.1 to apparently solve an error condition when using Sliding Time Scale/Pitch Shift (refer to my other thread on this).

Real-time Conversion: High-quality Sinc Interpolation, Dither: None
High-quality Conversion: High-quality Sinc Interpolation, Dither: None

Now Steve was kind enough to provide an explanation of these settings and how they are used. I tried to follow but, since I am not that technical, I find some of this a bit difficult to comprehend.

All I want is the best sound quality of my exports with no additional hiss competing with the music when it is low.

I printed out Steve’s reply and will try to understand it better. In the meantime, if anyone can give me some additional pointers on the quality settings and what changes I should make, I would greatly appreciate it.

Thanks.

mdubin

EDIT: After reading (and re-reading) Steve’s excellent explanation, I think I understand at least one part of the quality settings. If the only thing I am doing is cutting, pasting, etc. with no other processing, then I can leave the sampling at 16 bit and the resulting export will be lossless. However, if I am doing any kind of processing, then I should change the sampling to 32 bit float so there will be no internal conversion during the process. The only loss will occur upon export, which will be 32 bit float to 16 bit. What I am not clear on at all is dither. That is where I need some guidance.

I’ve merged your post mdubin with my previous reply to keep it all together.

There’s a fairly detailed explanation of “dither” here: Missing features - Audacity Support

The very short explanation is that digital audio is not a continuously variable waveform. The waveform goes up and down in little steps. The more “bits” in the format, the more (and smaller) the steps are. “Dither” helps to smooth out the steps, much like “dither” in graphics helps to smooth out curves.
The downside of dither (in audio) is that it “smudges” the audio a little (as dither in graphics “smudges” edges a little) and when the audio is extremely quiet, this “smudging” may be heard as a low level hiss if you have the volume loud enough.

There is also an article about dither on Wikipedia: Dither - Wikipedia

Steve:

Here is what I feel comfortable with:

When I am only going to do simple editing such as cutting and pasting with no other processing, I will set the default sample format to 16 bit since in this case, the export will be lossless (right?).

If I am going to do any other processing at all, I will set it to 32 bit float (since it would be only downconverted once when exporting)

In all cases, I will set dither to “none”. Dither will only apply when using 32 bit float and I am willing to accept the minor consequences. There are many classical passages which are low and I would rather not hear any additional hiss.

Thanks for all your help.

mdubin

One would expect so (I would have expected that) but actually, no. Whether it is a feature or a bug, when audio is exported, it passes through Audacity’s “rendering engine”. This is so that multi-track and multi-audio clip projects can be handled correctly, along with the track gain, track pan and Envelope tool settings. The “rendering engine” works in 32-bit float, so even if you export a 16 bit track to a 16 bit uncompressed format, the audio data still gets converted from 16 bit to 32 bit and back to 16 bit.

However, no need to despair - it can be done completely losslessly.
The way that I would recommend is that you set HQ conversion dither to none and leave the default sample format at 32-bit float.

Explanation:
Every 16 bit value has an exact representation in 32-bit, so converting from 16 to 32 and back to 16 bit will be perfect as long as none of the sample values are altered by anything else (no processing and no dither).
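
This is easy to check for yourself. The Python sketch below runs every possible 16-bit value through a 32-bit float and back. The scaling by 32768 and the helper names are assumptions made for the illustration, not a copy of what Audacity does internally.

```python
import struct

def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def int16_to_float(sample):
    """Map a 16-bit integer sample to a 32-bit float in the range -1.0..1.0."""
    return to_float32(sample / 32768.0)

def float_to_int16(value):
    """Map a float sample back to a 16-bit integer, with no dither."""
    return int(round(value * 32768.0))

# Try all 65536 possible 16-bit values and collect any that do not survive.
mismatches = [s for s in range(-32768, 32768)
              if float_to_int16(int16_to_float(s)) != s]
print("values changed by the round trip:", len(mismatches))
```

A 32-bit float has a 24-bit significand, so a 16-bit integer divided by a power of two always fits exactly. The list of mismatches comes out empty, provided nothing (processing or dither) touches the values in between.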

This also has the advantage that if you forget about your settings and do some processing, you will not be creating multiple format conversions.


As a violinist, I completely understand your concerns, however it is worth considering what actually happens at very low volume levels.

In 16 bit arithmetic there are 2^16 possible values for each sample. That’s 65536 discrete values: 32767 positive values, 32768 negative values and zero.
When the signal is as big as possible (0 dB), the signal swings from +32767 to -32768 and back.
Obviously, for smaller signals, the numerical range is reduced.
At about -6 dB the numerical range is 2^15, which works out as +16383 to -16384.
At about -12 dB it is down to +8191 to -8192.
With each drop of 6 dB, the numerical range is halved.
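
For anyone who wants to try other levels, the arithmetic can be sketched in Python. The function names are made up for this example. It uses the standard relation that amplitude scales as 10^(dB/20), so one halving is really about 6.02 dB, not exactly 6.

```python
import math

def peak_sample_value(level_db, full_scale=32768):
    """Approximate peak integer value of a 16-bit signal at the given level in dB."""
    return full_scale * 10 ** (level_db / 20)

def effective_bits(level_db, bits=16):
    """Roughly how many of the 16 bits are still in use at the given level."""
    return bits + level_db / (20 * math.log10(2))

for db in (0, -6, -12, -78, -84):
    print(db, "dB: peak about", round(peak_sample_value(db), 1),
          "steps, about", round(effective_bits(db), 1), "bits in use")
```

At -78 dB the peak works out to only about four steps either side of zero, which is why a sine wave at that level can no longer look (or sound) like a sine wave.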
By the time you get down to -78 dB the waveform looks like this (magnified)
firsttrack000.png
That is supposed to be a sine wave. Not surprisingly that does not sound very musical.

Here is part of a very very quiet mandolin note, recorded in 16 bit with a level of -84 dB.
firsttrack001.png
What it is supposed to look like is this (same thing but in 32-bit float format)
firsttrack002.png
I’m sure that you can appreciate that at very low levels, 16 bit format just does not have enough “bits” to give anything like a realistic sound.

Here is a longer section, first dithered, then (after a beep) not dithered. The signal has been amplified by +81 dB so that it can be clearly seen (and heard).
firsttrack000.png
And what does it sound like? Well, not very good of course because the sound has to be crammed into a mere 4 bits, but does one sound less terrible than the other?
Have a listen:

Steve:

Now I am confused. I thought from an earlier post that when the only editing being done is cutting and/or pasting, that 16 bit was acceptable as a quality setting because no processing was being done and that the result would be lossless.

So I have to pose the question again: How do I achieve lossless WAV (or FLAC) exports when the only editing I am doing is the cutting out of certain sections of audio or splitting a large WAV file into multiple smaller ones via Tracks - Add Label at Selection?

Thank you.

mdubin

That is strictly correct.
There will be no loss in quality whatsoever up to the point of Export, but that’s when it runs into this Audacity quirk of exporting via its 32-bit rendering engine.

As I wrote:
“The way that I would recommend is that you set HQ conversion dither to none and leave the default sample format at 32-bit float.”

I agree that this issue is surprisingly complex, but it’s only when we get really picky about the ultra low level performance that it really matters. For 99.9% of users, leaving Audacity Quality settings at their default settings is more than adequate. If we come out of this with an easy way to explain the issues I’ll be more than happy :wink:

Ok, Steve.

I will use 32 bit float with HQ conversion dither set to “none”.

Will stick with 2.0.0 until the next release (2.0.2) where hopefully I will not get any errors.

mdubin

It is being treated as a bug that dither noise is applied (unless you turn it off in Preferences) to 16-bit audio when exporting to a 16-bit format (http://bugzilla.audacityteam.org/show_bug.cgi?id=22). It’s just somewhat hard to fix if we want to do all mixing in floats.

@Steve - maybe it would be worth mentioning on the Wiki Dither page that dither effectively increases the dynamic range for a given number of bits?



Gale

Steve and Gale:

We’ve been talking mainly about quality settings in Audacity when working with WAV files.

But many times I will import and export FLAC since FLAC is also lossless and takes up less disk space.

Do the same “rules” we have discussed apply in regard to using 32 bit float if processing or 16 bit if cutting/pasting only?

I might try using “dither” when processing a track (such as slowing down the speed) but setting it to “none” when strictly cutting/pasting.

mdubin

Just to add my 2c worth here of real world experience.

I have transcribed a lot of LPs to digital format with Audacity and burned CDs with a lot of them.

I have my Preferences set to 32-bit float 44.1 kHz for capture and editing. For export, converting down to 16-bit 44.1 kHz, I have my Preferences set to “Triangular” dithering (this from a suggestion that Steve made to me a while back, but for the life of me I can’t remember the reasoning).

I listen on high quality speakers (QUAD electrostatics ELS-57) and on Sennheiser studio headphones (HD 25-1), and I must confess that I have never been bothered by, or noticed, dither noise, including on the classical music transfers (maybe it’s just that my ears are aging, along with the rest of me, so I’m losing the HF) :slight_smile:

WC

Ever thought of using Audacity’s tone generator to see just how much HF you’ve lost? :wink: :smiley:

It would compete too much with the tinnitus … :wink:

WC

“Dither” is all about noise shaping.

When dither is not used, there is still noise introduced when reducing the bit depth due to quantise errors. Dither is about redistributing that noise so that it is less noticeable/annoying.

Triangle dither has been around a long time and works quite well. The noise is never more than +/- 1 lsb (the lowest possible symmetrical peak amplitude) and has the frequency content shifted into the very high frequency range so as to be less noticeable at low volume.

Shaped dither is a later invention that distributes the noise according to a Fletcher/Munson type curve so as to be “least audible” at low volume. The noise level is up to +/- 2 lsb at the very highest frequencies, so it has a higher peak level than triangle dither. It has a different timbre (different sound) to Triangle dither and is intended to be “quieter” than triangle dither.

Both triangle dither and shaped dither work well but should be avoided on absolute (inter-track) silence. Unfortunately Audacity cannot do that automatically (other than with rectangle dither).
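
The difference on absolute silence can be sketched in Python. This uses the textbook definitions (rectangle: one uniform random value of up to half a step either way; triangle: the sum of two such values) and works in units of one 16-bit step. It is an assumption that this matches the behaviour described here closely enough to illustrate the point, and it is not Audacity's actual code.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

def rectangular():
    """Rectangle dither: uniform noise within half a step either side of zero."""
    return random.uniform(-0.5, 0.5)

def triangular():
    """Triangle dither: sum of two uniform values, so up to one step either way."""
    return random.uniform(-0.5, 0.5) + random.uniform(-0.5, 0.5)

silence = [0.0] * 10000  # absolute digital silence, in units of one 16-bit step

# Collect the distinct output values after adding dither and rounding.
rect_out = {round(s + rectangular()) for s in silence}
tri_out = {round(s + triangular()) for s in silence}

print("rectangle dither on silence gives:", sorted(rect_out))
print("triangle dither on silence gives: ", sorted(tri_out))
```

Rectangle dither never pushes a zero sample far enough to round to anything but zero, so silence stays silent. Triangle dither can reach a full step, so some samples land on +1 or -1 and the silence acquires a faint hiss.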

The choice between Triangle and Shaped really comes down to which you prefer the sound of. I prefer the Shaped dither. Even though the peak level is higher, it sounds quieter to me.

I have noticed that there are a few areas where Audacity’s dither could be improved, but it’s a complex area so I need to research more before writing a proposal. If there are any C++ programmers interested in this area, please get in touch and I’ll share my findings so far (pretty boring stuff for anyone else :smiley: )

waxcylinder:

Wouldn’t any dither noise be masked by the surface noise inherent in LPs? I thought the noise floor in LPs is certainly quite a bit higher than that of CDs, especially DDD.

mdubin

Quite possibly, but part of that noise floor is LF rumble, usually coming from the TT mech. You can trim this out with a High Pass Filter effect.

That “LP noise floor” is part of the “warmth” that some folk ascribe to vinyl versus digibit CDs :wink:

WC

Should I bother using Dither for Highest Quality 32 bit musical recordings? I don’t want it to dull frequencies. Is it more of a function when using microphones/white noise?

Dither adds some very quiet noise to mask nasty conversion sounds when you convert down from 32-bit to 24-bit or 16-bit. Usually, you want to have dither turned on if the Audacity Quality Preferences are set to 32-bit.



Gale