Windows version: XP, SP3
Audacity 2.0.0 installed from .exe
I recorded 74 minutes of white noise from an analog source, with “Show Clipping” selected in the View menu. At the end of the recording, there were three places where the red lines indicated clipping. Zooming in on each of the three locations, I saw that the maximum amplitude value was reached for just one sample, so I decided to use this recording and exported it to a WAV file. Now the problem is, when I open the WAV file in Audacity, it shows clipping for very many samples. I zoomed in on the first sample with a red line, saw that it was at maximum amplitude value, and noted the time value in the file. Then I closed the WAV file and opened the aup file with the original recording, zoomed in to the same time value, and saw that the sample was slightly less than maximum amplitude, so there was no red line on the sample. My question is, why are amplitude values not preserved exactly from the Audacity project to the exported WAV file?
Dither was already set to None, the default value.
It occurred to me that changes in sample values may occur in the conversion from 32-bit float, the default sample format in Audacity, to 16-bit, the sample format in WAV files. I noticed that under Edit > Preferences > Quality > Default Sample Format, I can change the selection from 32-bit float to 16-bit, so I’ll give that a try.
The default is for dither to be enabled (shaped). If you had a previous version of Audacity installed then your current version would inherit the Preference settings from the old version.
Some considerations:
16 bit audio does not have a value for +1.0, but 32 bit float format does.
The highest positive value in 16 bit format is ((2^15) - 1)/(2^15) = 0.999969482
If dither is disabled then should Audacity truncate or round?
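To make the considerations above concrete, here is a minimal sketch of a float to 16-bit conversion. It assumes samples are scaled by 2^15 and clamped to the 16-bit range; the function names and the scaling convention are illustrative assumptions, not Audacity's actual export code.

```python
INT16_MAX = 2**15 - 1   # 32767, the highest positive 16-bit value
INT16_MIN = -(2**15)    # -32768
SCALE = 2**15           # assumed scaling convention, not taken from Audacity


def to_int16(sample, rounding=True):
    """Convert a float sample in [-1.0, 1.0] to a 16-bit integer value."""
    scaled = sample * SCALE
    # round() goes to the nearest integer; int() truncates toward zero.
    value = round(scaled) if rounding else int(scaled)
    # Clamp, because +1.0 would scale to 32768, which 16-bit cannot hold.
    return max(INT16_MIN, min(INT16_MAX, value))


def to_float(value):
    """Convert a 16-bit integer value back to a float sample."""
    return value / SCALE


print(to_float(INT16_MAX))                # 0.999969482421875
print(to_int16(1.0))                      # 32767 (clamped)
print(to_int16(0.99996, rounding=False))  # 32766 (truncated)
print(to_int16(0.99996, rounding=True))   # 32767 (rounded up to the maximum)
```

Under these assumptions, a float sample slightly below full scale can land on the maximum 16-bit value when rounding is used, but not when truncating.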
@Steve Brown,
You’ve just learned a golden rule of recording. Never record right up to the limit; always leave a little headroom. I never, ever let the sound go above -1 dB.
Peter
I figured out what the problem was. It has nothing to do with conversion of samples from 32-bit float to 16-bit in the WAV file. It has to do with mixing two mono tracks to stereo, where one track is panned 75% left and the other track is panned 75% right. The adding that occurs when the two tracks are mixed to two stereo channels in the WAV file resulted in more peaks hitting maximum amplitude value. This was not apparent in Audacity, because the two tracks are separate and mixing resulting from panning only occurs during playback. To solve the problem, all I need to do is re-record the mono tracks allowing more headroom. Any changes in amplitude values resulting from conversion from 32-bit float to 16-bit are probably negligible or insignificant.
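The effect described above can be sketched numerically. This assumes a simple linear pan law in which a track panned 75% to one side keeps full gain on that side and 25% gain on the other; the pan law, the function names, and the 0.9 peak values are assumptions for illustration, not confirmed Audacity behaviour.

```python
def pan_gains(pan):
    """Return (left_gain, right_gain) for pan in [-1.0, 1.0].

    Assumed linear pan law: the near side stays at 1.0,
    the far side is reduced by the pan amount.
    """
    left = 1.0 if pan <= 0 else 1.0 - pan
    right = 1.0 if pan >= 0 else 1.0 + pan
    return left, right


def mix_to_stereo(tracks):
    """Sum (sample, pan) pairs from mono tracks into one stereo sample."""
    left = sum(sample * pan_gains(pan)[0] for sample, pan in tracks)
    right = sum(sample * pan_gains(pan)[1] for sample, pan in tracks)
    return left, right


# Two mono tracks, each peaking at 0.9 (below full scale) at the same instant,
# panned 75% left and 75% right.
tracks = [(0.9, -0.75), (0.9, 0.75)]
left, right = mix_to_stereo(tracks)

print(left, right)              # each channel is about 1.125
print(left > 1.0, right > 1.0)  # both exceed full scale and would clip on export
```

Neither track clips on its own, but each stereo channel receives the full level of one track plus a quarter of the other, so the sum can exceed full scale.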
Thanks for that tip. It occurred to me that although Audacity uses 32-bit float for sample values internally, the benefits of that precision, in terms of higher resolution of sample amplitude values, are not realized if the sound card is 16-bit. To take advantage of the high resolution of 32-bit float, a 24-bit sound card is required.
32-bit float is an excellent format when processing audio data even if the final format is only 16 bit.
One of the major advantages is that it is tolerant of “over 0 dB” sample values. With 16 or 24 bit (integer format) audio, if during the course of processing the peak level reaches 0 dB the audio will clip and be irreparably damaged whereas with 32 bit float format the audio would not be clipped and can be “amplified” back down to below 0 dB without any damage. The extremely high precision of 32 bit processing means that even after multiple processing operations the losses due to rounding errors are still negligible. Also on modern computer hardware 32-bit processing is typically faster than lower bit formats.
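As a rough illustration of the "over 0 dB" tolerance, the sketch below boosts a sample by 6 dB and then brings it back down, once in floating point and once with 16-bit integer clipping in between. The helper names and the sample value are hypothetical, and Python floats stand in for 32-bit float here.

```python
INT16_MAX = 32767
INT16_MIN = -32768


def db_to_gain(db):
    """Convert a level change in dB to a linear gain factor."""
    return 10 ** (db / 20)


def clip_int16(value):
    """Clamp a value to the 16-bit integer range, as integer formats must."""
    return max(INT16_MIN, min(INT16_MAX, value))


sample = 0.8
gain_up = db_to_gain(6.0)
gain_down = db_to_gain(-6.0)

# Float path: the intermediate value goes above full scale without harm.
boosted_float = sample * gain_up            # about 1.6, "over 0 dB"
restored_float = boosted_float * gain_down  # back to about 0.8

# Integer path: the intermediate value is clipped and the peak is lost.
int_sample = round(sample * 32768)
int_boosted = clip_int16(round(int_sample * gain_up))  # clipped to 32767
restored_int = int_boosted * gain_down / 32768         # about 0.5, not 0.8

print(restored_float)
print(restored_int)
```

The float path recovers the original value, while the integer path returns a lower peak because the information above full scale was discarded when it clipped.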
That’s interesting that 32-bit float is tolerant of amplitude values that exceed 0 dB, but I wonder how that is possible, given that the ADC of a 24-bit sound card outputs maximum value (0x00FFFFFF) at the maximum positive analog signal peak it can handle. I presume that for signal values above that peak, the digital output is still 0x00FFFFFF, so there is clipping. Or are my assumptions wrong regarding how the sound card and Audacity interact? Is it that Audacity takes a value less than 0x00FFFFFF as 0 dB, allowing a little headroom above 0 dB? If that is the case, how much headroom above 0 dB is there? I think someone in this thread mentioned that clipping begins at +1 dB for 32-bit float sample format.
The point that the high precision of 32-bit float sample format minimizes rounding errors is understood. Additionally, I think it minimizes loss of amplitude precision that occurs when a recording is made with lots of headroom, not using the full dynamic range. If the sample format is 16-bits, the effective precision might be only 13 or 14 bits. Then when the recording is normalized to 16 bits to fill the full dynamic range, the amplitude steps of lower resolution become magnified. With 32-bit float, the steps are so small to begin with, magnification of steps does not become a problem when a recording is normalized.
When editing 32-bit float, the format can handle values over 0 dB, so the signal will not be damaged. However, as you say, sound cards cannot handle over 0 dB signals and the signal will be clipped to 0 dB by the sound card if it is played, so the sound is damaged. That’s why it is essential to amplify/normalize to < 0 dB for the sound to play without clipping.
Sound cards do not allow any head room above 0 dB, in fact most will clip just below 0 dB (which I presume is a calibration error, but I’m not an audio hardware engineer).
That depends where the “lots of headroom” occurs.
In a 16 bit sound card the signal going into the A/D converter is encoded into a number between 0000 and FFFF, a range of 65536 possible values. Each of the 16 bits equates to about 6 dB, so if the full 16 bits are used then there is about 96 dB range between the quietest possible signal and the loudest possible. If the input signal has a maximum peak level of -12 dB, then the highest 2 bits are unused so in effect it is being recorded in 14 bit precision rather than 16. Take it further and if the maximum peak signal at the A/D converter is -24 dB then it’s down to 12 bit precision. Add onto this the fact that the lowest bit (lsb) is probably trash and it is in effect down to 11 bit precision. No matter what you do to the signal after that the lost precision cannot be recovered. For best quality it is essential that you have a good signal level going into the A/D converter.
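The arithmetic above can be checked with a short sketch. It uses the exact figure of 20·log10(2), about 6.02 dB per bit, rather than the rounded 6 dB, so the results come out fractionally different; the function names are illustrative.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB for each bit of resolution


def dynamic_range_db(bits):
    """Theoretical range between the quietest and loudest signal."""
    return bits * DB_PER_BIT


def effective_bits(bits, peak_db):
    """Bits actually exercised when the peak level is peak_db (0 or negative)."""
    return bits + peak_db / DB_PER_BIT


print(round(dynamic_range_db(16), 1))     # 96.3
print(round(effective_bits(16, -12), 1))  # 14.0
print(round(effective_bits(16, -24), 1))  # 12.0
```

Subtracting one more bit for a noisy lsb gives the roughly 11 bit figure mentioned for a -24 dB peak.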
Once the signal has entered the digital domain, it may be safely scaled down to leave “lots of headroom” if it is in 32-bit float format.
The benefit of 32-bit float is that if the signal is recorded low, then amplifying will not introduce additional rounding errors, whereas 16 bit processing may.
We recommend that for a 16 bit sound card the recording level should ideally have a maximum peak level of around -6 dB. This provides a manageable amount of headroom while still providing a theoretical 90 dB of dynamic range. For a 24 bit sound card you can safely leave a lot more head room.
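For comparison, here is a small sketch of how much theoretical dynamic range remains after leaving headroom. The -6 dB figure for 16 bit comes from the recommendation above; the 18 dB of headroom used for the 24 bit case is only an example value, not a recommendation from this thread.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit


def usable_range_db(bits, headroom_db):
    """Theoretical dynamic range left after reserving headroom_db below full scale."""
    return bits * DB_PER_BIT - headroom_db


print(round(usable_range_db(16, 6)))   # 90
print(round(usable_range_db(24, 18)))  # 126 (18 dB is an assumed example)
```

Even with three times the headroom, the 24 bit case keeps far more range than the 16 bit case in this calculation.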