My project is using GSM 6.10 audio compression and I’d like to use Audacity to “play” some of the sound clips. To create the appropriate file, I read through the Audacity 2.1.2 source code. I believe that I have the header for the file figured out, but am confused by the data section.
As background, GSM (full rate) uses an 8 kHz sampling rate with 16-bit samples. GSM works on 20 ms frames (160 samples) and encodes each one into a fixed-length packet. The encoder used in Audacity is the same one I am using, and it creates a 33-byte representation of each 20 ms burst.
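To make the numbers above concrete, here is the frame arithmetic as a small sketch. The constant names are my own and are not identifiers from Audacity or libsndfile; only the values come from the description above.

```python
# Frame arithmetic for GSM full rate, using the figures quoted above.
# All names are illustrative, not taken from Audacity or libsndfile.
SAMPLE_RATE_HZ = 8000      # 8 kHz sampling rate
BYTES_PER_SAMPLE = 2       # 16-bit samples
FRAME_MS = 20              # one GSM packet covers 20 ms of audio
ENCODED_FRAME_BYTES = 33   # bytes the encoder emits per packet

samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000
pcm_bytes_per_frame = samples_per_frame * BYTES_PER_SAMPLE
frames_per_second = 1000 // FRAME_MS
encoded_bytes_per_second = frames_per_second * ENCODED_FRAME_BYTES

print(samples_per_frame)         # 160
print(pcm_bytes_per_frame)       # 320
print(encoded_bytes_per_second)  # 1650
```

So each 320-byte block of PCM input becomes one 33-byte packet, at 50 packets per second.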
Audacity appears to wrap the GSM packets with a WAVE file format.
What is confusing is that Audacity seems to pack two GSM packets per disk write. I would expect the disk write buffer to be 66 (33 * 2) bytes, but Audacity uses a 65-byte buffer. When writing to the buffer, the first packet appears to be written at the beginning of the buffer (as expected), but the second packet is written at an offset of (WAV_W64_GSM610_BLOCKSIZE/2). Since WAV_W64_GSM610_BLOCKSIZE is defined as 65, integer truncation would put the beginning of the second packet 32 bytes from the beginning of the buffer. So I figured that, for some reason, the authors wanted a 65-byte buffer and decided to overwrite the last byte of the first packet.
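The byte ranges implied by that reading can be sketched as follows. This assumes both packets really are a full 33 bytes, which is exactly the assumption I am unsure about; the variable names are hypothetical and only the value 65 comes from the source.

```python
# Byte ranges implied by my reading of the encoder, ASSUMING each packet
# occupies a full 33 bytes. Names are illustrative, not libsndfile's.
BLOCKSIZE = 65       # value of WAV_W64_GSM610_BLOCKSIZE quoted above
PACKET_BYTES = 33    # assumed size of each encoded packet

first = range(0, PACKET_BYTES)
second_start = BLOCKSIZE // 2    # integer division, like 65/2 in C
second = range(second_start, second_start + PACKET_BYTES)

# Buffer indices that both packets would touch under this assumption.
overlap = sorted(set(first) & set(second))

print(second_start)   # 32
print(second.stop)    # 65
print(overlap)        # [32]
```

Under that assumption the second packet ends exactly at the end of the 65-byte buffer, and the only shared index is byte 32.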
Then I looked at the decode function (both functions are in audacity-minsrc-2.1.2/libsndfile/src/gsm610.c). It uses an offset of (WAV_W64_GSM610_BLOCKSIZE + 1)/2, which rounds the result up to 33. So when reading the packets back, the second packet from each disk read comes from an offset of 33!
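Putting the two offset expressions side by side, as a sketch of the arithmetic only. Python's `//` stands in for C integer division here, and the names are mine rather than libsndfile's.

```python
# Compare the encode-side and decode-side offsets for the second packet.
# Only the value 65 comes from the source; the names are illustrative.
BLOCKSIZE = 65   # value of WAV_W64_GSM610_BLOCKSIZE

encode_offset = BLOCKSIZE // 2          # BLOCKSIZE/2 in C
decode_offset = (BLOCKSIZE + 1) // 2    # (BLOCKSIZE + 1)/2 in C

# Bytes left in the 65-byte block starting from each offset.
bytes_from_encode_offset = BLOCKSIZE - encode_offset
bytes_from_decode_offset = BLOCKSIZE - decode_offset

print(encode_offset, bytes_from_encode_offset)   # 32 33
print(decode_offset, bytes_from_decode_offset)   # 33 32
```

The two offsets differ by one byte, and only 32 bytes remain in the block after the decode-side offset, which is one short of a 33-byte packet.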
Recording an example, exporting it as GSM610/Wave, then importing it back in and playing it sounds pretty good. Is every other packet really corrupted by a single-byte offset?
What am I missing? Is there documentation that defines these details (versus reading the code)?