Possible bug in exporting files?

Help for Audacity on GNU/Linux.
audidash
Posts: 3
Joined: Thu May 14, 2020 1:34 am
Operating System: Linux Mint

Possible bug in exporting files?

Post by audidash » Thu May 14, 2020 2:20 am

Linux Mint 18.3 / Audacity 2.1.2 / Installed distro release

I loaded a WAV and exported it to FLAC. Without changing anything, I immediately exported it again. The two exported FLAC files have different checksums. (No metadata was included.)
Then I used a CLI tool (flac) to convert the same WAV file to 2 x FLAC files. Those two output files had matching checksums.

Also, I exported the WAV as WAV 3 times to see what happens, again different checksums on each save.
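For anyone wanting to repeat the comparison, here is a minimal Python sketch of the check described above. The function names `sha256_of` and `first_difference` are made up for illustration, and SHA-256 is only one possible checksum; any hash tool (for example `md5sum`) serves the same purpose.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, reading it in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def first_difference(path_a, path_b):
    """Return the offset of the first differing byte, or None if identical."""
    a = Path(path_a).read_bytes()
    b = Path(path_b).read_bytes()
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        # One file is a prefix of the other.
        return min(len(a), len(b))
    return None
```

Calling `first_difference("export1.wav", "export2.wav")` (hypothetical file names) shows whether the mismatch sits in the header or in the audio data.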

To narrow down the issue:
. generate a sine wave
. trim it down to a single cycle of the wave
. export it to WAV twice
. load both WAV files and visually you can see the slight difference in the position of a single point of the line

The change is tiny, but shouldn't the exported files be identical at the binary level every time?

Attached are the sample sine WAV files and screenshots of the visual difference.
files.zip

steve
Site Admin
Posts: 81627
Joined: Sat Dec 01, 2007 11:43 am
Operating System: Linux *buntu

Re: Possible bug in exporting files?

Post by steve » Thu May 14, 2020 2:31 pm

The reason for the different checksum is that Audacity works in 32-bit float format.
The conversions that occur:

Import:
FLAC -> 32-bit float (lossless)

Export:
32-bit float -> 16-bit FLAC (or 24-bit FLAC)


As Audacity is designed for recording, editing and audio processing, the default behaviour when exporting to less than 32-bit is to apply "dither". It is the "dither" that causes the difference in the checksum.

In audio processing, dither is used to eliminate harmonic distortion caused by "quantizing" errors ("rounding" from a higher to lower number of bits).
The type of dither (including "no dither") can be selected in Preferences: https://manual.audacityteam.org/man/qua ... ences.html
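To illustrate why dither changes the checksum, here is a rough Python sketch of quantizing float samples to 16-bit with and without dither. It is not Audacity's implementation: the function name `quantize_16bit`, the scale factor of 32767, and the simple triangular (TPDF) noise are all assumptions chosen for the example.

```python
import random


def quantize_16bit(samples, dither=False, rng=None):
    """Convert float samples in [-1.0, 1.0] to 16-bit integer values.

    With dither=True, triangular noise of less than one step (LSB) is
    added before rounding, so the result depends on the random source.
    """
    rng = rng or random.Random()
    out = []
    for s in samples:
        scaled = s * 32767.0
        if dither:
            # Difference of two uniform values gives triangular noise in (-1, 1).
            scaled += rng.random() - rng.random()
        value = int(round(scaled))
        # Clamp to the 16-bit signed range.
        out.append(max(-32768, min(32767, value)))
    return out
```

Without dither the same input always gives the same integers. With dither, each run draws fresh noise, so a few samples land one step higher or lower, which is enough to change the file's checksum.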
9/10 questions are answered in the FREQUENTLY ASKED QUESTIONS (FAQ)

audidash
Posts: 3
Joined: Thu May 14, 2020 1:34 am
Operating System: Linux Mint

Re: Possible bug in exporting files?

Post by audidash » Thu May 14, 2020 11:35 pm

That's quite interesting!
So keeping everything at 32-bit avoids that dithering issue - I'm still learning :-)

One other question about exporting.
"WAV (Microsoft) 32-bit float PCM" vs "Other uncompressed > WAV (Microsoft) > Signed 32-bit PCM"
The same file exported via each method produces different checksums. I can't find the difference explained in the manual. Can you please explain it? (if it matters at all)

https://manual.audacityteam.org/man/oth ... tions.html

Thanks again for the help!

kozikowski
Forum Staff
Posts: 69374
Joined: Thu Aug 02, 2007 5:57 pm
Operating System: macOS 10.13 High Sierra

Re: Possible bug in exporting files?

Post by kozikowski » Fri May 15, 2020 7:47 am

Same file exported via each method produce different checksums.
Did you turn dithering off? The dithering signal is customized random noise added to the export. It's always going to be different each export. Working in 32-bit just eliminates the need for dither.

Koz

steve
Site Admin
Posts: 81627
Joined: Sat Dec 01, 2007 11:43 am
Operating System: Linux *buntu

Re: Possible bug in exporting files?

Post by steve » Fri May 15, 2020 8:39 am

audidash wrote:
Thu May 14, 2020 11:35 pm
"WAV (Microsoft) 32-bit float PCM" vs "Other uncompressed > WAV (Microsoft) > Signed 32-bit PCM"
"Signed 32-bit PCM" is 32-bit integer (each sample is represented by a 32-bit integer number).
"32-bit float PCM" is 32-bit floating point.

32-bit integer has even higher precision than 32-bit float, but has an absolute limit of 0 dB.
32-bit float still has extremely high precision, and can go over 0 dB. The fact that it can go over 0 dB is a very helpful feature when editing and processing.
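The difference between the two encodings can be sketched in a few lines of Python. This is only an illustration: the helper names are invented, and the integer scale factor of 2147483647 is an assumption, not necessarily the scaling Audacity or its export library uses.

```python
import struct

INT32_MAX = 2147483647
INT32_MIN = -2147483648


def encode_float32(sample):
    """Store the sample as a little-endian 32-bit float, unchanged in range."""
    return struct.pack("<f", sample)


def encode_int32(sample):
    """Scale the sample to a 32-bit integer, clipping anything over full scale."""
    scaled = int(round(sample * INT32_MAX))
    clipped = max(INT32_MIN, min(INT32_MAX, scaled))
    return struct.pack("<i", clipped)


def decode_float32(raw):
    return struct.unpack("<f", raw)[0]


def decode_int32(raw):
    return struct.unpack("<i", raw)[0] / INT32_MAX
```

A sample of 1.5 (above 0 dB) survives a round trip through `encode_float32`, but comes back as 1.0 from `encode_int32` because it is clipped at full scale. The two encodings also store the same in-range sample as different bytes, which is why the two export options give different checksums.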

audidash
Posts: 3
Joined: Thu May 14, 2020 1:34 am
Operating System: Linux Mint

Re: Possible bug in exporting files?

Post by audidash » Fri May 15, 2020 10:38 pm

steve wrote:
Fri May 15, 2020 8:39 am
32-bit integer has even higher precision than 32-bit float, but has an absolute limit of 0 dB.
32-bit float still has extremely high precision, and can go over 0 dB. The fact that it can go over 0 dB is a very helpful feature when editing and processing.
OK, thanks steve, that's very helpful to know!
kozikowski wrote:
Fri May 15, 2020 7:47 am
Did you turn dithering off? The dithering signal is customized random noise added to the export. It's always going to be different each export. Working in 32-bit just eliminates the need for dither.
Yes, I did after it was pointed out to me, and as you suggest now I work only in 32 bit. Also I get the difference in the 2 formats as steve explained. Kind of obvious when I rethink that out! :oops: But very good to know about the 0 dB limit.

Interesting side note: multiple saves using "WAV (Microsoft) 32-bit float PCM" were puzzlingly still producing files with different checksums!
Comparing everything in a hex editor, literally a single byte is the difference. The 32-bit float files have two additional chunks in the header compared to the 32-bit integer files, a "fact" chunk and a "PEAK" chunk, the latter being the one that differed by one byte. There is very little info about the PEAK chunk, but I finally found this:
https://web.archive.org/web/20081201144 ... Chunk.html
It turns out there is a timestamp in the chunk:


<timeStamp> is the number of seconds since 1/1/1970. This is used to
 see if the date of the peak data matches the modification date of
 the file. If not, the file should be rescanned for new peak data.
 
Mystery solved! Back to sound editing... :geek:
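For reference, a small Python sketch of how the PEAK timestamp could be located and ignored when comparing two exports. It assumes the layout from the linked description (a 32-bit version field followed by the 32-bit timestamp, both little-endian) and only walks top-level RIFF chunks; the function names are hypothetical.

```python
import struct


def iter_chunks(data):
    """Yield (chunk_id, payload_offset, size) for each top-level WAVE chunk."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        yield chunk_id, pos + 8, size
        pos += 8 + size + (size & 1)  # chunks are padded to an even length


def peak_timestamp(data):
    """Return the PEAK chunk timestamp (seconds since 1/1/1970), or None."""
    for chunk_id, offset, size in iter_chunks(data):
        if chunk_id == b"PEAK" and size >= 8:
            _version, timestamp = struct.unpack_from("<II", data, offset)
            return timestamp
    return None


def without_peak_timestamp(data):
    """Return a copy of the file bytes with the PEAK timestamp zeroed."""
    buf = bytearray(data)
    for chunk_id, offset, size in iter_chunks(data):
        if chunk_id == b"PEAK" and size >= 8:
            buf[offset + 4:offset + 8] = b"\x00" * 4
    return bytes(buf)
```

Hashing `without_peak_timestamp(...)` of each export instead of the raw bytes should give matching checksums if the timestamp really is the only difference.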
