What does PCM encoding actually do when importing raw data?

shoufeng · August 18, 2017, 3:53am

Hi All,

I’ve got some binary files that I can import into Audacity as PCM, 16-bit signed, big-endian. I can also convert them to .wav files there, and they sound all right.

However, when I tried to batch convert the binary files in MATLAB by grouping the data into 16-bit signed, big-endian numbers (which I assume represent the sound amplitudes), the resulting signals were not correct. The envelope is ‘flat’ between the full-scale limits. - Is there anything more that the PCM encoding does? I guess I need more basic insight into PCM. Can someone please help with this?

Thanks in advance.

steve · August 18, 2017, 8:49am

A proper WAV file must conform to the file specification, which defines a structure for the data and is a subset of the RIFF format. Converting to WAV is not simply adding “.wav” to the file name of a RAW PCM file. To be a legitimate WAV file it must have WAV file headers.
Lots of detailed technical information here: WAVE Audio File Format
Short version here: WAV - Wikipedia

A PCM file without the proper headers ‘may’ play in some applications, but that is application specific.

PCM consists of a series of numeric values that represent the waveform amplitude at regular intervals.
The numeric values may be signed integer, unsigned integer or floating point. For example in a standard CD quality WAV file the sample values are signed 16-bit integer. In Audacity, sample values are 32-bit float.

To convert from integer format sample values to floating point values, the range of integer values is “normalized” to a range of +/- 1. That is, the maximum possible integer value is scaled to +1.0 and the minimum possible integer value scaled to -1.0.
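As a rough illustration of that scaling, here is a minimal Python sketch for 16-bit samples. It assumes one common convention, dividing by 32768, under which the most negative value maps to exactly -1.0 and the most positive value lands just below +1.0. The function name is made up for this example, and other software may scale slightly differently.

```python
def int16_to_float(samples):
    """Scale signed 16-bit integer samples to floats in the range [-1.0, 1.0)."""
    # Assumed convention: divide by 2**15 so that -32768 maps to exactly -1.0.
    return [s / 32768.0 for s in samples]


floats = int16_to_float([-32768, 0, 16384, 32767])
print(floats)
```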

My guess is that you have an incorrect conversion between float and integer somewhere in your MATLAB code.

shoufeng · August 18, 2017, 10:07am

Hi Steve,

Thanks a lot for your reply. I guess I need to clarify my question a bit.

My problem is basically how to convert a binary file into a sound signal (the correct sequence of numbers) in MATLAB.
(Once I have the correct signal, the simple MATLAB function wavwrite() will do the job of saving it as a .wav file.)

DVDdoug · August 18, 2017, 3:32pm

If you know it’s 16-bit, the most likely problem is that the endianness is reversed. Getting the bit depth wrong can also scramble the data, but most other screw-ups will just give you the wrong number of channels and/or the wrong speed/pitch. When you open a raw file in Audacity, the wrong offset (odd or even number) can also reverse the high/low bytes, but there should be no offset in a raw PCM file and I assume you’re not applying an offset in MATLAB.

There is no standard for “raw” PCM files, but it’s just a sequence of values, each value representing the instantaneous amplitude of one sample. The tricky thing is… The file is a sequence of bytes and those bytes have to be re-assembled into sample-values (unless it’s an 8-bit mono file).
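To make the re-assembly step concrete, here is a small Python sketch that groups bytes into signed 16-bit samples using the standard `struct` module. The function name and the four example bytes are invented for illustration. Decoding the same bytes with the wrong byte order gives completely different numbers, which is the kind of scrambling described above.

```python
import struct


def decode_pcm16(raw, big_endian=True):
    """Group raw bytes into signed 16-bit samples (mono, no header assumed)."""
    count = len(raw) // 2                      # ignore a trailing odd byte
    fmt = (">" if big_endian else "<") + str(count) + "h"
    return list(struct.unpack(fmt, raw[:count * 2]))


raw = bytes([0x12, 0x34, 0xFF, 0xFE])          # made-up example data
be = decode_pcm16(raw, big_endian=True)        # 0x1234, 0xFFFE
le = decode_pcm16(raw, big_endian=False)       # 0x3412, 0xFEFF
print(be, le)
```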

A WAV file is a PCM file preceded by a header. There are standards for WAV files, and the additional information needed (sample rate, bit depth, etc.) is identified in the header.
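For anyone who wants to see the "header plus PCM data" idea in code, the sketch below uses Python's standard `wave` module to wrap raw 16-bit big-endian samples in a WAV container. The function name is hypothetical, and the sample rate and channel count are assumptions you would have to supply yourself, since a raw file does not record them. 16-bit WAV data is stored little-endian, so the bytes of each sample are swapped first.

```python
import io
import wave


def raw_be16_to_wav_bytes(raw, sample_rate=44100, channels=1):
    """Wrap big-endian 16-bit PCM in a WAV container and return the file bytes."""
    raw = raw[: len(raw) // 2 * 2]             # drop a trailing odd byte, if any
    swapped = bytearray(len(raw))
    swapped[0::2] = raw[1::2]                  # low byte first (little-endian)
    swapped[1::2] = raw[0::2]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)               # assumed, not stored in raw data
        w.setsampwidth(2)                      # 2 bytes = 16 bits per sample
        w.setframerate(sample_rate)            # assumed, not stored in raw data
        w.writeframes(bytes(swapped))
    return buf.getvalue()


wav = raw_be16_to_wav_bytes(bytes([0x12, 0x34, 0xFF, 0xFE]))
print(wav[:4], wav[8:12], len(wav))
```

Writing the returned bytes to a file ending in .wav should give something that ordinary players and MATLAB can open, provided the assumed sample rate and channel count match the real data.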

I’ve never used MATLAB but I’m 99% sure it can open a WAV file. So, you should be able to convert to WAV in Audacity and then open it in MATLAB. (Make sure dithering is disabled so the data isn’t altered.)

This probably won’t help, but you can “look at” the bytes (in any file) with a hex editor. (I guess you could do the same thing in MATLAB.) That might be helpful if you had some control over the PCM data (which I assume you don’t). For example, if it’s a stereo file you could make one channel silent (zeros) or you could create low-level signals (below ~ -50dB) where the high-byte is zero… Otherwise, with thousands of samples per second of audio it’s hard to “see” anything in the raw bytes. (And, I guess it’s tricky if you’re not used to hexadecimal, but zero is still 00 as a hex byte.)
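If no hex editor is to hand, a few lines of Python can print the bytes in a similar way. This is only a sketch with a made-up function name. Using a width of 2 puts one 16-bit mono sample on each line, which makes it easier to spot which byte of each pair stays near zero in quiet passages.

```python
def hex_dump(data, width=16):
    """Return a simple hex listing: offset, then `width` bytes per line."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        lines.append(f"{offset:08x}  {hex_part}")
    return "\n".join(lines)


dump = hex_dump(bytes([0x00, 0x10, 0xFF, 0xF0]), width=2)
print(dump)
```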

steve · August 18, 2017, 3:47pm

Where does Audacity come into this?

shoufeng · August 19, 2017, 6:59am

Hi DVDdoug (and Steve),

Thanks a lot for your reply. - We always have to get down to the bottom of the problem, don’t we.

I thought I had checked everything else (word length, sampling rate, signedness, normalization, endianness, etc.), so I suspected that the PCM encoding in Audacity might be doing something I didn’t know about, hence my question. It turns out, as you pointed out, that PCM samples are just digitized amplitudes in this case, nothing else…

The problem is solved now. I appreciate your comments. Just to add to your comments about the binary file: the binary file is a sequence of bits (not bytes, to be strict), and this is where the mistakes came from. As in my first post, the data format is 16-bit big-endian, so we read 16 bits at a time as one sample. The first 8 bits are the most significant byte and the second 8 bits are the least significant byte, and the bit order has to be reversed within each byte. The two bytes are then joined into a 16-bit number, which is the digitized amplitude of the sound.
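The original MATLAB code could not be attached, so the following is only a Python sketch of the steps described in this post, not the actual code used. It assumes the bytes have already been read from the file with their bits in reversed order. Each byte has its bit order reversed, the first byte of each pair is taken as the high byte, and the 16-bit result is interpreted as a signed number. The function names and example bytes are invented for illustration.

```python
def reverse_bits(byte):
    """Reverse the bit order of one byte, e.g. 0b01001000 -> 0b00010010."""
    return int(f"{byte:08b}"[::-1], 2)


def decode_bit_reversed_be16(raw):
    """Decode 16-bit big-endian samples whose bytes have reversed bit order."""
    samples = []
    for i in range(0, len(raw) - 1, 2):
        hi = reverse_bits(raw[i])              # first byte: most significant
        lo = reverse_bits(raw[i + 1])          # second byte: least significant
        value = (hi << 8) | lo
        if value >= 0x8000:                    # two's complement sign handling
            value -= 0x10000
        samples.append(value)
    return samples


samples = decode_bit_reversed_be16(bytes([0x48, 0x2C, 0xFF, 0x7F]))
print(samples)
```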

Hope this explains the problem. I actually tried to attach my matlab code and the binary file in my last post, but it got blocked in submission, so I could only post part of my reply to Steve (sorry Steve).

BTW, Audacity is awesome software for people who play with sounds. Kudos to you guys, and please do keep up the good work!