Hi! New to the board. I’ve been using Audacity for a while now, but I’ve finally gotten around to joining the forums.
So, I’m sure a good number of you have tried importing an .exe or .jpg as raw data. It’s a lot of fun if you haven’t.
What I want to know is why this happens. Where does this glitchy audio come from? Directly from the hex code, or something? Is it an arbitrary connection? I’m sure I could examine the source code, but I’m not code-savvy enough that I think I would understand it. I’m curious mainly because a lot of my graduate work is in new media studies, and I’m currently looking into interfaces that use sound in addition to graphics as their points of interaction.
When you “Import Raw”, Audacity reads “numbers” from the file. It decodes the “bits” of the file according to the import options that you have selected. For example, if you select “Signed 16 bit PCM, Little-endian”, it will interpret the binary digits of the file (bits) assuming that format, so if it reads in the bits:
1110 1000 then 0000 0011
E8 then 03 (Hex)
then it will interpret that as 03E8 (Hex) = 1000 decimal
which in 16 bit PCM encoding is equivalent to 1000/32767 = (approx) +0.030518 linear
so it will create a sample with an amplitude of (approximately) +0.030518
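That decoding step can be sketched in Python (a minimal illustration of the idea, not Audacity’s actual code):

```python
import struct

# The two bytes from the example above, in the order they are read
raw = bytes([0xE8, 0x03])

# "<h" = little-endian signed 16-bit integer, so 0xE8 is the
# least-significant byte: 0x03E8 = 1000 decimal
(value,) = struct.unpack("<h", raw)

# Scale to the +/-1.0 linear range (dividing by 32767 as above;
# some decoders divide by 32768 instead)
amplitude = value / 32767
print(value, amplitude)   # 1000 and approximately +0.030518
```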
The sample rate is set by the user in the “Import RAW” dialogue.
If, for example, you set the sample rate to 44100 Hz, then Audacity will create an audio track with the sample rate set to 44100 Hz and will place each sample at a sample position, so…
Let’s say that the sample values are: 0.1, 0.15, 0.2, 0.25, 0.3 … (mono)
Then the time positions of the samples (if the sample rate has been set at 44100) will be:
Time = 0/44100 = 0.000000000 seconds : sample value = 0.1
Time = 1/44100 = 0.000022676 seconds : sample value = 0.15
Time = 2/44100 = 0.000045351 seconds : sample value = 0.2
Time = 3/44100 = 0.000068027 seconds : sample value = 0.25
Time = 4/44100 = 0.000090703 seconds : sample value = 0.3
…
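The same arithmetic in a short Python sketch (my own illustration):

```python
sample_rate = 44100                       # as set in the Import Raw dialogue
samples = [0.1, 0.15, 0.2, 0.25, 0.3]     # the mono sample values above

# Sample number n is placed at time n / sample_rate seconds
for n, value in enumerate(samples):
    t = n / sample_rate
    print(f"Time = {n}/{sample_rate} = {t:.9f} seconds : sample value = {value}")
```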
Any and every file is just hex/binary. It’s the header that tells software how the file should be read. So when imported as raw data, the header is ignored and the binary in this case is read as if it were a .wav file (or whatever is selected). Thus, what I’m really trying to understand is just digital-to-analog conversion and how pitch/amplitude/duration are encoded/decoded in binary?
Would you mind verifying my understanding (based on 16 bit PCM, 44.1k, little endian):
So, when we import NTOSKRNL.exe as a sound file into Audacity, we are interpreting the binary code that it contains as if we were interpreting any other audio file. For example, say we cull these two arbitrary bytes/16 bits from the file: 0101 0100 and 1110 1000 (Hex: 54 E8). The order in which these bytes are read depends on the endianness. If we choose “little endian” then the least-significant byte is read first. Thus, we would have “54E8,” or in decimal, 21,736. If our bit depth is 16, we have 65,536 possible amplitude levels of the sound. However, since we imported the file as 2-channel stereo, this number is halved: thus, we have 32,768. Based on the 16 bit PCM encoding, the amplitude of the given sample is then approximately 21,736/32,768, or +0.66333. This process repeats based on the sample rate, in our case, 44,100 times a second, until the file is totally read and rendered as an analog wave.
That’s pretty close, except that “16 bit PCM” means 16 bits per channel. If there are two channels (stereo) then 2 bytes (16 bits) are sent for one sample in one channel, then 16 bits are sent for one sample in the other channel, then on to the next sample in the first channel, and so on.
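That interleaving can be shown with a small Python sketch (sample values invented for illustration):

```python
import struct

# Six 16-bit samples packed little-endian, interleaved: L0 R0 L1 R1 L2 R2
raw = struct.pack("<6h", 100, -100, 200, -200, 300, -300)

ints = struct.unpack(f"<{len(raw) // 2}h", raw)
left = ints[0::2]    # every other sample, starting at index 0
right = ints[1::2]   # every other sample, starting at index 1
print(left)          # (100, 200, 300)
print(right)         # (-100, -200, -300)
```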
Right. That was a stupid question. So is this more correct?
So, when we import NTOSKRNL.exe as a sound file into Audacity, we are interpreting the binary code that it contains as if we were interpreting any other audio file. For example, say we cull these two arbitrary bytes/16 bits from the file: 0101 0100 and 1110 1000 (Hex: 54 E8). The order in which these bytes are read depends on the endianness. If we choose “little endian” then the least-significant byte is read first. Thus, we would have “54E8,” or in decimal, 21,736. If our bit depth is 16, we have 65,536 possible amplitude levels of the sound. However, since we imported the file as 2-channel stereo, this number is halved: thus, we have 32,768. Based on the 16 bit PCM encoding, the amplitude of the given sample is then approximately 21,736/32,768, or +0.66333. This process repeats based on the sample rate, in our case, 44,100 times a second, until the file is totally read and rendered as an analog wave. Since we imported the file as 2-channel stereo, PCM encoding dictates that the channels are interleaved, meaning that sample 1 goes to the left channel, sample 2 to the right, sample 3 to the left, and so on. Frequency/pitch is a product of wavelength/time, and if we know the amplitude at any given time, we also know the distance from one crest to the next and thus the frequency of a given sample.
No it is not halved. If the bit depth is 16 then there are 65,536 possible amplitude levels for each and every sample.
16 bit audio usually uses signed notation so there are 32,768 values below zero and 32,767 values above zero. (see: http://en.wikipedia.org/wiki/Two's_complement)
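In Python, reading the same two bytes as signed versus unsigned shows the difference (illustration only):

```python
import struct

# 0x8000: the most negative signed 16-bit value...
print(struct.unpack("<h", b"\x00\x80")[0])   # -32768
# ...but a large positive number when read as unsigned
print(struct.unpack("<H", b"\x00\x80")[0])   # 32768

# The positive extreme of signed 16-bit two's complement
print(struct.unpack("<h", b"\xff\x7f")[0])   # 32767
```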