dither on export (bug 22)

http://bugzilla.audacityteam.org/show_bug.cgi?id=22

It has long been known that due to a bug in Audacity, when exporting a 16 bit track to a 16 bit uncompressed file format, if “dither” is enabled in Preferences, dither will be applied to the exported data.

bgravato recently mentioned that he tried something similar with 24 bit integer audio and dither was not applied. :astonished:
I’ve started this topic so that we can investigate exactly when dither is applied and when it is not, to clarify the actual situation regarding bug 22.

Test 1
New Audacity Project - dither enabled in Preferences - default Quality 32 bit float. (Whether the import preference is set to copy the file or read it directly does not appear to affect the outcome.)

Open a 16 bit wav file.
Export as 16 bit wav.
The two files are different.

Test 2
As for Test 1, but using 24 bit instead of 16 bit.
The two files are identical.


How to test for identical audio data:
With “Preferences > Quality” set for 32 bit float;
Import both files.
Invert one of the tracks.
Select both tracks and “Mix and Render”.
Select the “silent” track and call up the “Amplify” effect. If the track has absolute silence, the “New peak amplitude” will show as “-Infinity”.
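The same null test can be sketched outside Audacity. This is only an illustration in Python, assuming both files have already been decoded to lists of float samples of equal length; the function name and the sample data are made up for the example.

```python
import math

def null_test_peak_db(a, b):
    """Invert b, mix it with a, and return the peak of the mix in dB (0 dB = 1.0)."""
    if len(a) != len(b):
        raise ValueError("tracks must have the same length")
    peak = max((abs(x - y) for x, y in zip(a, b)), default=0.0)
    # Absolute silence has no finite dB value, which Amplify shows as "-Infinity"
    return float("-inf") if peak == 0.0 else 20.0 * math.log10(peak)

# Hypothetical sample data, for illustration only
original = [0.0, 0.5, -0.25, 0.125]
identical = list(original)
one_step_off = [0.0, 0.5, -0.25, 0.125 + 2.0 ** -15]  # differs by one 16 bit step

print(null_test_peak_db(original, identical))     # -inf
print(null_test_peak_db(original, one_step_off))  # about -90.3
```

Any finite result means the two sets of audio data are not identical.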


Question: Is Audacity “doing the right thing” with 24 bit files by not-dithering, or is something else occurring?
I suspect that Audacity is not doing the right thing and bgravato has discovered another bug… more tests to follow.

Exporting 32-bit float sine wave as a 24 bit integer file.

The expected result would be for dither to be applied (assuming that it is set in Preferences).

With dither set to “shaped”:

  • Generate a sine tone.
  • Duplicate the track.
  • Invert track 2
  • Ctrl+shift+M (Mix and Render to new track)
  • Apply Amplify to the mixed track > “-Infinity”.

This is as expected.

  • Generate a sine tone.
  • Duplicate the track.
  • Invert track 2
  • Set track 1 to 24 bit.
  • Ctrl+shift+M (Mix and Render to new track)
  • Apply Amplify to the mixed track > “-68.5”.

This is to be expected because dither will have been applied when converting the first track to 24 bit.
Amplify this mix track by 100 dB (apply Amplify with +50 dB twice), then plot the spectrum. The characteristic “shaped dither” shows clearly.
shaped-dither.png

  • Generate a sine tone.
  • Export the track as “Other uncompressed files > 24 bit integer”
  • Re-Import the exported track.
  • Invert track 2
  • Ctrl+shift+M (Mix and Render to new track)
  • Apply Amplify to the mixed track > “-88.6”.

Dither should have been applied when exporting to 24 bit, but we would then expect to see noise at -118.5 dB (the Amplify effect should show “-68.5” as in the previous test).
Amplify this mix track by 100 dB then plot the spectrum. This is NOT dither noise.
not-dither.png
The peak is at 880 Hz (double the original tone frequency).
If the mix track is normalized it will be seen that the noise is all positive going (the exported track has been inverted).
This suggests to me that the sample values are being truncated and not dithered.
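A rough sketch of why one-sided noise points to rounding down rather than dither or rounding to nearest. The tone frequency, amplitude and sample rate here are assumptions for illustration, not the values used in the test above, and this says nothing about the 880 Hz peak.

```python
import math

SCALE = 2 ** 23  # 24 bit full scale

def residual(samples, quantize):
    """Original minus quantized value, in units of one 24 bit step."""
    return [x * SCALE - quantize(x * SCALE) for x in samples]

# 440 Hz tone, amplitude 0.8, 44100 Hz sample rate (all three are assumptions)
tone = [0.8 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]

floor_err = residual(tone, math.floor)  # rounding down
round_err = residual(tone, round)       # rounding to nearest

print(min(floor_err), max(floor_err))   # never negative: the error is one-sided
print(min(round_err), max(round_err))   # spread on both sides of zero
```

With rounding down, "original minus exported" can never go negative, which matches the all-positive residue seen after inverting the exported track.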

Steve:
You and I went 'round and 'round on a similar topic over two years ago http://forum.audacityteam.org/viewtopic.php?f=26&t=12905&start=10#p49864

Test 1 is, I believe, wrong.
When quality prefs (QP) is set to 32-bit float, a 16-bit WAV or AIF will be converted to 32-bit float when imported (the track shows 32-bit float quality). In order to get a 16-bit track from a 16-bit import the QP needs to be set to 16-bit.

So starting with QP set to 16-bit PCM …
Import AIF 16-bit PCM into 16-bit PCM track
Export as AIF 16-bit PCM
Close project
Set QP to 32-bit float
New project
Import original AIF
Import exported AIF
Invert one track then Mix and Render
The two files are different.

Close project
Set QP to 24-bit PCM
New Project
Import 24-bit WAV
For me, this imports as 32-bit float! Is this not a known bug in and of itself? 24-bit PCM imports as 32-bit float when QP is set to 24-bit PCM?

So I don’t see any way of testing Bruno’s scenario.

– Bill

FWIW confirmed results on Mac with 2.0.0 latest rc.

– Bill

This is the test I did (for testing something else not this…):

Audacity Version: 2.0.0rc4 on MacOS-X 10.6.8
Conditions: Audacity Quality preferences set to 44100Hz, 24-bit, Dither Triangle, Interface set to -145dB view mode

  1. Imported a 24-bit wav file (containing valid audio)
  2. Silenced some part of the file
  3. Exported as other uncompressed format: 24-bit WAV (Windows)
  4. Started new project and imported both tracks mentioned in 1) and 3)
  5. Inverted one of the tracks and mixed and rendered both tracks

Outcome:

  • graphically there’s still silence in the silenced part of the exported track
  • after step 5) the sum of the inverted parts was not pure silence, there was some noise visible just below 138dB, which I think indicates dither in the non-silence parts

I can later try the same with Audacity set to 32-bit float and post back.

Bruno:
When you import the 24-bit file into Audacity set for 24-bit PCM, what does it say on the Track Control Panel?

– Bill

Oddly it says Stereo, 44100Hz, 32-bit float

Since I have nothing set to 32-bit float in Audacity that sounds very wrong…

Edit: the file’s headers say 24-bit:

macbookpro:audacity-tests bruno$ file 24bit.wav 
24bit.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 24 bit, stereo 44100 Hz

Back in the old thread I referenced earlier, Gale posted this table from the source code:

    Quality setting    File format    Imports as
    16                 16             16
    16                 24             32 (24)
    16                 32             32

    24                 16             24
    24                 24             32 (24)
    24                 32             32

    32                 16             32
    32                 24             32
    32                 32             32

So there seems to be some ambiguity about how 24-bit PCM files are imported when Audacity is set to 16-bit or 24-bit PCM quality. What does “32 (24)” mean?

Based on our limited tests it seems that 24-bit PCM files are always imported as 32-bit float regardless of the quality setting.

– Bill

Yes. The table is not from the source code but one I made by testing a while ago. “(24)” in “32 (24)” means that 24 is expected (by me), but not what happens.

From ImportPCM.cpp:

// In general, go with the user's preferences.  However, if
// the file is higher-quality, go with a format which preserves
// the quality of the original file.

   if (mFormat != floatSample &&
       sf_subtype_more_than_16_bits(mInfo.format))
      mFormat = floatSample;
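
As a cross-check, here is that rule restated as a small Python sketch and compared with the "Imports as" column of the table. It assumes that sf_subtype_more_than_16_bits simply means "the file has more than 16 bits per sample" and that "32" in the table means 32-bit float; the function name import_format is made up for the example.

```python
def import_format(quality_pref_bits, file_bits):
    """Track format chosen on import: 16 or 24 (integer PCM) or 32 (float)."""
    # Mirrors the quoted condition: a non-float preference is overridden
    # when the file holds more than 16 bits per sample.
    if quality_pref_bits != 32 and file_bits > 16:
        return 32
    return quality_pref_bits

# (quality setting, file format) -> observed import format, from the table
observed = {
    (16, 16): 16, (16, 24): 32, (16, 32): 32,
    (24, 16): 24, (24, 24): 32, (24, 32): 32,
    (32, 16): 32, (32, 24): 32, (32, 32): 32,
}

mismatches = [key for key, fmt in observed.items() if import_format(*key) != fmt]
print(mismatches)  # an empty list means the rule reproduces every row
```
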

So Audacity is behaving as intended, but you could preserve the quality of 24-bit files by importing at 24-bit, and so avoid forcing a dither down from 32-bit (if that actually happened). But no: on Windows too, if I export 32-bit silence as 24-bit PCM WAV with shaped dither on, no dither is added to the WAV. If I export the same silence as 16-bit PCM WAV, dither is added.

I happened to discuss upsampling on import with Michael Chinen a couple of years ago while discussing something else, and he gave no indication he felt the above code was bad behaviour.


Gale

It looks like that was written at a time when there was no 24 bit option. Has there ever been such a time?

I think there is now.
It looks to me like 24 bit files are converted to 32 bit float (surprising perhaps, but perhaps not a “bad” thing), but then on Export to 24 bit, the sample values are truncated to 24 bit. Export should properly convert to 24 bit (with dither if enabled) and not just truncate to 24 bit.

@Gale - I vaguely recall some fuss a long time ago about something in the manual that said that converting from 32 bit float to 24 bit integer was done by truncating. Do you remember that? Does it relate to this issue?

Converting from float to int by truncating sounds wrong on many levels :slight_smile: I can’t properly understand what truncate means in this context…

rounding down.

I understand the “idea”, my failure to understand is in terms of programming/computing… what could it mean to truncate a float and get an int without float->int conversion?
But never mind, I’m just being picky about linguistics (Ed influences? hehe)

It’s something like:

Multiply the original 32 bit float value by 2^23 then round down to get the signed 24 bit integer.
In 32 bit float notation, 0 dB is +/- 1
In 24 bit integer notation, 0 dB is +/- 2^23
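
A minimal sketch of that conversion in Python, with round-to-nearest alongside for comparison. This is not Audacity's code; the function names and the clamping to the signed 24 bit range are assumptions added for the example.

```python
import math

FULL_SCALE_24 = 2 ** 23  # 0 dB in signed 24 bit integer terms

def clamp24(n):
    """Keep the result inside the signed 24 bit range."""
    return max(-FULL_SCALE_24, min(FULL_SCALE_24 - 1, n))

def to_int24_round_down(x):
    """Scale a float sample (0 dB = +/-1.0) by 2^23, then round down."""
    return clamp24(math.floor(x * FULL_SCALE_24))

def to_int24_nearest(x):
    """Same scaling, but round to the nearest integer (halves go up)."""
    return clamp24(math.floor(x * FULL_SCALE_24 + 0.5))

x = 548848.75 / FULL_SCALE_24    # a float sample lying between two 24 bit steps
print(to_int24_round_down(x))    # 548848
print(to_int24_nearest(x))       # 548849
print(to_int24_round_down(1.0))  # 8388607 (clamped)
```
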

Multiplying and rounding is not truncating… it’s multiplying and rounding :wink:

Those lines were last modified in 2002 (around the time of 1.0.0) when there was no sample format preference. 1.2.0 (2004) had the 24-bit format preference.


Is it possible to be a bit more definitive e.g. by testing this in 1.2.6 and HEAD and comparing sample values like for like?

I think all that was about the Manual stating that Audacity “by default” processed internally at 32-bit, when in truth it always does that irrespective of the sample format of the audio - there is no option.


Gale

Yes, certainly.

Sadly no, because there was a different bug in early versions of Audacity that caused 24 bit sample values to be calculated incorrectly and that didn’t get fixed until well into 1.3.x versions.


Nyquist code to generate a sequence of every 24 bit sample value (about 6 min 20 seconds duration):

;; generate ramp with one sample at each 24 bit value
(abs-env 
  (control-srate-abs *sound-srate* 
    (pwlv -1 (/ (power 2 24) *sound-srate*) 1)))
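
The values this ramp should hold at a given time can be worked out by hand. A small Python sketch, assuming a project rate of 44100 Hz (the rate is not stated for these tests, so this is an assumption); ramp_value is a made-up helper name.

```python
SAMPLE_RATE = 44100      # assumed project rate
TOTAL_SAMPLES = 2 ** 24  # one sample per 24 bit value

def ramp_value(seconds, start=-1.0, end=1.0):
    """Expected sample value, scaled by 2^23, at a given time into the ramp."""
    n = seconds * SAMPLE_RATE
    return (start + (end - start) * n / TOTAL_SAMPLES) * 2 ** 23

print(TOTAL_SAMPLES / SAMPLE_RATE)    # about 380.4 seconds (6 min 20 s)
print(ramp_value(240))                # 2195392.0
print(ramp_value(60))                 # -5742608.0
print(ramp_value(240, -0.25, 0.25))   # 548848.0
```

These agree with the first values in the "4min" and "1 min" example outputs listed under the test results.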

Nyquist code to read sample values in 24 bit format. All values should be integers. The sample values are displayed with one decimal place just to show that they are integers.

;; print the 24 bit value of the first x samples
(setq x 20)
(setq *float-format* "%1.1f")  ; show one decimal place
(setq output "")
(dotimes (i x)
  (setq output
    (format nil "~a~%~a" output
      (* (power 2 23) (snd-fetch s)))))
(print output)

Note: Nyquist runs in 32 bit float. If dither is enabled and the track that it is generating into is less than 32 bit float, the output will be dithered.
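
For reference, a generic sketch of what triangular dither does to values that are already whole numbers. This is the textbook TPDF idea, not Audacity's dither code, and the function name and fixed seed are assumptions for the example.

```python
import math
import random

def quantize_tpdf(value, rng):
    """Add triangular noise of up to one step either way, then round to nearest."""
    noise = rng.random() - rng.random()  # triangular distribution on (-1, 1)
    return math.floor(value + noise + 0.5)

rng = random.Random(0)                   # fixed seed so the run is repeatable
ramp = [2195392 + n for n in range(10)]  # whole-number inputs, like the ramp
dithered = [quantize_tpdf(v, rng) for v in ramp]

for before, after in zip(ramp, dithered):
    print(before, after)  # still integers, each within one step of the input
```

So a dithered ramp should still read as integers, but no longer as a perfectly regular sequence.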

Test results:

  1. Generate samples into a 32 bit float track.
    The samples test as sequential integer values. Example output from 4min into the track :
2195392.0
2195393.0
2195394.0
2195395.0
2195396.0
2195397.0
2195398.0
2195399.0
2195400.0
2195401.0
  2. Generate samples into a 24 bit track, dither = none.
    Same results as 1 (as expected). Example output from 1 min into the track:
-5742608.0
-5742607.0
-5742606.0
-5742605.0
-5742604.0
-5742603.0
-5742602.0
-5742601.0
-5742600.0
-5742599.0
  3. Generate samples into a 24 bit track, dither = triangle.
    Sample values are still integers (as they must be for 24 bit) but there is a degree of random variation in the values due to the triangle dither.
    Example output from 4min into the track :
2195392.0
2195393.0
2195394.0
2195394.0
2195396.0
2195398.0
2195398.0
2195399.0
2195400.0
2195400.0
  4. Generate samples into a 32 bit float track. (the sample values are in sequence).
    Export as 24 bit with dither = none.
    Import the track and test the sample values.
    The sample values are still integers in sequence as expected.
    Example output from 4min into the track :
2195392.0
2195393.0
2195394.0
2195395.0
2195396.0
2195397.0
2195398.0
2195399.0
2195400.0
2195401.0
  5. Generate samples into a 32 bit float track. (the sample values are in sequence).
    Export as 24 bit with dither = triangle.
    Import the track and test the sample values.
    The sample values are still integers in sequence. Dither has not been applied.
    Example output from 4min into the track (Note that the sample values are identical to test number 4) :
2195392.0
2195393.0
2195394.0
2195395.0
2195396.0
2195397.0
2195398.0
2195399.0
2195400.0
2195401.0
  6. Modify the generator code so that we get some fractional values:
;; generate ramp from -0.25 to +0.25
(abs-env
  (control-srate-abs *sound-srate* 
      (pwlv -0.25 (/ (power 2 24) *sound-srate*) 0.25)))

Modify the sample reading code so that we get 2 decimal places by changing (setq *float-format* "%1.1f") to (setq *float-format* "%1.2f")
Generate samples into a 32 bit float track.
The samples test as sequential values in steps of 0.25. Example output from 4min into the track :

548848.00
548848.25
548848.50
548848.75
548849.00
548849.25
548849.50
548849.75
548850.00
548850.25
  7. Export the track from test 6 as 24 bit with dither = none.
    I would have expected integer results rounded to the nearest value but we actually get rounded down values.
    Example output from 4min into the track :
548848.00
548848.00
548848.00
548848.00
548849.00
548849.00
548849.00
548849.00
548850.00
548850.00
  8. Export the track from test 6 as 24 bit with dither = Triangle.
    We would expect integer results with a degree of randomness to the values due to the dither.
    We actually get results identical to test 7 - the values are rounded down.
548848.00
548848.00
548848.00
548848.00
548849.00
548849.00
548849.00
548849.00
548850.00
548850.00
  9. Same as test 8 but with dither = shaped.
    Again we get identical results as test 7.

  10. Same as test 8 but with dither = rectangle.
    Again we get identical results as test 7.


    Conclusion:
    Exporting 32 bit float audio to “signed 24 bit Integer PCM”, using “Other uncompressed files” always rounds down the sample values and dither is never applied.
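
The conclusion can be checked mechanically against the numbers posted above. A short Python sketch using the ten source values from the fractional ramp and the ten values read back after export; the classify helper is made up for the example.

```python
import math

# Values read from the 32 bit float track (fractional ramp, 4 min in)
source = [548848.00, 548848.25, 548848.50, 548848.75, 548849.00,
          548849.25, 548849.50, 548849.75, 548850.00, 548850.25]
# Values read back after exporting as 24 bit (same with every dither setting)
exported = [548848, 548848, 548848, 548848, 548849,
            548849, 548849, 548849, 548850, 548850]

def classify(src, out):
    """Say which simple rounding rule, if any, explains the exported values."""
    down = all(o == math.floor(s) for s, o in zip(src, out))
    nearest = all(o == math.floor(s + 0.5) for s, o in zip(src, out))
    if down and nearest:
        return "ambiguous (source values were already integers)"
    if down:
        return "rounded down"
    if nearest:
        return "rounded to nearest"
    return "neither (consistent with dither or something else)"

print(classify(source, exported))  # rounded down
```
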

Results in Audacity 1.3.12 are the same as Audacity 2.0

Results in Audacity 1.3.10 are slightly different - the values appear to be rounded rather than rounded down. Dither is not applied even when selected in Preferences.
Audacity 1.3.4 also appears to round to the nearest. Dither is not applied even when selected in Preferences.

Thanks for the tests.

So I’m clear, what values would you expect in 7 (what direction should the rounding go in)? And 1.3.10 and 1.3.4 do this?

I tested 32-bit float silence exported to 24-bit WAV with shaped dither set in Preferences in 1.3.2, 1.3.0 and 1.2.6. All WAV’s had no dither applied.

So it looks like two separate bugs in addition to #22 (or three if you regard 24-bit quality setting importing 16- or 24-bit files as 32-bit as a bug).


Gale

Mathematically speaking “rounding” rounds to the nearest integer. Therefore “.00” through “.49” should round down to “.00” and “.50” through “.99” should round up to next “.00”