Hi! New to the board. I’ve been using Audacity for a while now, but I’ve finally gotten around to joining the forums.
So, I’m sure a good number of you have tried importing an .exe or .jpg as raw data. It’s a lot of fun if you haven’t.
What I want to know is why this happens. Where does this glitchy audio come from? Directly from the hex code, or something? Is it an arbitrary connection? I’m sure I could examine the source code, but I’m not code-savvy enough that I think I would understand it. I’m curious mainly because a lot of my graduate work is in new media studies, and I’m currently looking into interfaces that use sound in addition to graphics as their points of interaction.
When you “Import Raw”, Audacity reads “numbers” from the file. It decodes the “bits” of the file according to the import options that you have selected. For example, if you select “Signed 16 bit PCM, Little-endian”, it will interpret the binary digits of the file (bits) assuming that format, so if it reads in the bits:
1110 1000 then 0000 0011
E8 then 03 (Hex)
then it will interpret that as 03E8 (Hex) = 1000 decimal
which in 16 bit PCM encoding is equivalent to 1000/32767 = (approx) +0.030518 linear
so it will create a sample with an amplitude of (approximately) +0.030518
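That decoding step can be sketched in Python (a minimal illustration of the idea, not Audacity’s actual code):

```python
import struct

# The two bytes from the example above, in the order they are read
raw = bytes([0xE8, 0x03])

# "<h" = little-endian signed 16-bit integer, so 0xE8 is the
# least-significant byte: 0x03E8 = 1000 decimal
(value,) = struct.unpack("<h", raw)

# Scale to the +/-1.0 linear range (dividing by 32767 as above;
# some decoders divide by 32768 instead)
amplitude = value / 32767
print(value, amplitude)   # 1000 and approximately +0.030518
```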
The sample rate is set by the user in the “Import RAW” dialogue.
If, for example, you set the sample rate to 44100 Hz, then Audacity will create an audio track with the sample rate set to 44100 Hz and will place each sample at a sample position, so…
Let’s say that the sample values are: 0.1, 0.15, 0.2, 0.25, 0.3 … (mono)
Then the time positions of the samples (if the sample rate has been set at 44100) will be:
Time = 0/44100 = 0.000000000 seconds : sample value = 0.1
Time = 1/44100 = 0.000022676 seconds : sample value = 0.15
Time = 2/44100 = 0.000045351 seconds : sample value = 0.2
Time = 3/44100 = 0.000068027 seconds : sample value = 0.25
Time = 4/44100 = 0.000090703 seconds : sample value = 0.3
…
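The same arithmetic in a short Python sketch (my own illustration):

```python
sample_rate = 44100                       # as set in the Import Raw dialogue
samples = [0.1, 0.15, 0.2, 0.25, 0.3]     # the mono sample values above

# Sample number n is placed at time n / sample_rate seconds
for n, value in enumerate(samples):
    t = n / sample_rate
    print(f"Time = {n}/{sample_rate} = {t:.9f} seconds : sample value = {value}")
```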
Any and every file is just hex/binary. It’s the header that tells software how the file should be read. So when imported as raw data, the header is ignored and the binary in this case is read as if it were a .wav file (or whatever is selected). Thus, what I’m really trying to understand is just digital-to-analog conversion and how pitch/amplitude/duration are encoded/decoded in binary?
Would you mind verifying my understanding (based on 16 bit PCM, 44.1k, little endian):
So, when we import NTOSKRNL.exe as a sound file into Audacity, we are interpreting the binary code that it contains as if we were interpreting any other audio file. For example, say we cull these two arbitrary bytes/16 bits from the file: 0101 0100 and 1110 1000 (Hex: 54 E8). The order in which these bytes are read depends on the endianness. If we choose “little endian” then the least-significant byte is read first. Thus, we would have “54E8,” or in decimal, 21,736. If our bit depth is 16, we have 65,536 possible amplitude levels of the sound. However, since we imported the file as 2-channel stereo, this number is halved: thus, we have 32,768. Based on the 16 bit PCM encoding, the amplitude of the given sample is then approximately 21,736/32,768, or +0.66333. This process repeats based on the sample rate, in our case, 44,100 times a second, until the file is totally read and rendered as an analog wave.
That’s pretty close, except that “16 bit PCM” means 16 bits per channel. If there are two channels (stereo) then 2 bytes (16 bits) are sent for one sample in one channel, then 16 bits are sent for one sample in the other channel, then on to the next sample in the first channel, and so on.
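That interleaving can be shown with a small Python sketch (sample values invented for illustration):

```python
import struct

# Six 16-bit samples packed little-endian, interleaved: L0 R0 L1 R1 L2 R2
raw = struct.pack("<6h", 100, -100, 200, -200, 300, -300)

ints = struct.unpack(f"<{len(raw) // 2}h", raw)
left = ints[0::2]    # every other sample, starting at index 0
right = ints[1::2]   # every other sample, starting at index 1
print(left)          # (100, 200, 300)
print(right)         # (-100, -200, -300)
```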
Right. That was a stupid question. So is this more correct?
So, when we import NTOSKRNL.exe as a sound file into Audacity, we are interpreting the binary code that it contains as if we were interpreting any other audio file. For example, say we cull these two arbitrary bytes/16 bits from the file: 0101 0100 and 1110 1000 (Hex: 54 E8). The order in which these bytes are read depends on the endianness. If we choose “little endian” then the least-significant byte is read first. Thus, we would have “54E8,” or in decimal, 21,736. If our bit depth is 16, we have 65,536 possible amplitude levels of the sound. However, since we imported the file as 2-channel stereo, this number is halved: thus, we have 32,768. Based on the 16 bit PCM encoding, the amplitude of the given sample is then approximately 21,736/32,768, or +0.66333. This process repeats based on the sample rate, in our case, 44,100 times a second, until the file is totally read and rendered as an analog wave. Since we imported the file as 2-channel stereo, PCM encoding dictates that the channels are interleaved, meaning that sample 1 goes to the left channel, sample 2 to the right, sample 3 to the left, and so on. Frequency/pitch is a product of wavelength/time, and if we know the amplitude at any given time, we also know the distance from one crest to the next and thus the frequency of a given sample.
No it is not halved. If the bit depth is 16 then there are 65,536 possible amplitude levels for each and every sample.
16 bit audio usually uses signed notation so there are 32,768 values below zero and 32,767 values above zero. (see: http://en.wikipedia.org/wiki/Two's_complement)
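In Python, reading the same two bytes as signed versus unsigned shows the difference (illustration only):

```python
import struct

# 0x8000: the most negative signed 16-bit value...
print(struct.unpack("<h", b"\x00\x80")[0])   # -32768
# ...but a large positive number when read as unsigned
print(struct.unpack("<H", b"\x00\x80")[0])   # 32768

# The positive extreme of signed 16-bit two's complement
print(struct.unpack("<h", b"\xff\x7f")[0])   # 32767
```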