Newbie asking for a kickstart, please!

Sorry guys, I just can’t get my head around how to start with this.

All I wish to do (to get me pointed in the right direction) is average the values of each sequential pair of samples in a waveform.

My guess was something like this:

( / ( + (first s) (first (rest s))) 2)

But I’m not sure if that just does the first two samples, and not the rest?
I tried creating a recursive loop (not knowing if it was needed or not), but just kept running into various errors.

Please can someone give me a hint as to what this should really look like if entered at the Nyquist prompt?

Just for clarity, in case I’m using the wrong terminology, when I say ‘first two samples’ I mean the first two points on the waveform.

Does this do what you want?

;version 4
(snd-avg *track* 2 2 op-average)
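For anyone unsure what the two numbers mean: in SND-AVG the first is the block size (how many samples go into each average) and the second is the step size (how far the block moves each time). Here is a rough model of that in plain Python, not Nyquist. The function name `block_average` is made up for illustration, and dropping an incomplete block at the end is an assumption; the real function may treat the tail differently.

```python
def block_average(samples, block_size, step_size):
    """Average block_size consecutive samples, advancing step_size each time.

    Assumption: an incomplete block at the end is dropped.
    """
    out = []
    start = 0
    while start + block_size <= len(samples):
        block = samples[start:start + block_size]
        out.append(sum(block) / block_size)
        start += step_size
    return out


wave = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
halved = block_average(wave, 2, 2)   # non-overlapping pairs: half as many values
rolling = block_average(wave, 2, 1)  # overlapping pairs: a rolling average
print(halved)
print(rolling)
```

With a step of 2 there is one output value per pair, so the result has half as many samples as the input; with a step of 1 each sample is averaged with its neighbour and the length is (almost) unchanged.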

Thanks, Steve - I think that’s what I’m after, although it would be 2 1 in my case.

Trying to run it though doesn’t seem to affect anything, and the debug prompt gives me this in the output:

error: bad argument type - #(#<Sound: #6920328> #<Sound: #6920568>)
Function: #<Subr-SND-AVG: #93ac2c8>
#(#<Sound: #6920328> #<Sound: #6920568>)

I do have 2 channels loaded, does that make a difference?
Also, does it matter what the original file type was (I assume once it’s imported it’s all the same to Audacity)?


That shows a stereo sound.
Stereo sounds are handled as an array of two “sounds”.
An “array of two sounds” is NOT a “sound”, it’s an “array” containing two “sounds”.

The function SND-AVG requires a “sound” as the first argument (an “argument” just means one of the ‘parameters’ that is passed to the function).
Try it with a mono track.

Thanks Steve.

Due to your hint, I now have it working in stereo, and it does as expected.

Unfortunately I’ve also just discovered that although I can import DSF files with ffmpeg, I can’t export them. sigh

Median filter? Robert J. H made a plugin that does that; see the topic “Help needed cleaning screeches/peaks from voice recording”, post #4 by Trebor.

Hi Trebor,

Again, I’m not sure of the terminology, but I would say this averaging creates a variable high-pass filter; the higher the frequency, the more it is removed.

(snd-avg sound 2 1 op-average) gives a ‘rolling average’ of each sample with the previous sample, which acts as a low-pass filter where the corner frequency is 1/4 of the sample rate.
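The 1/4 figure can be checked numerically. This is a small sketch in Python (not Nyquist) that models the rolling average as y[n] = (x[n] + x[n-1]) / 2 and evaluates its gain at a few frequencies. The function names are made up for illustration, and the frequencies printed are arbitrary picks.

```python
import cmath
import math


def two_point_average_gain(freq_hz, sample_rate_hz):
    """Magnitude response of y[n] = (x[n] + x[n-1]) / 2 at freq_hz."""
    omega = 2.0 * math.pi * freq_hz / sample_rate_hz
    h = (1.0 + cmath.exp(-1j * omega)) / 2.0
    return abs(h)


def gain_db(freq_hz, sample_rate_hz):
    """Same gain expressed in decibels."""
    gain = two_point_average_gain(freq_hz, sample_rate_hz)
    return 20.0 * math.log10(gain) if gain > 0.0 else float("-inf")


SAMPLE_RATE = 44100
for freq in (1000, 5000, SAMPLE_RATE / 4, 20000):
    print(f"{freq:>8.0f} Hz  {gain_db(freq, SAMPLE_RATE):7.2f} dB")
```

The gain works out to cos(pi * f / sample_rate), which is about 0.707 (roughly -3 dB) at a quarter of the sample rate and falls to zero at half the sample rate.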

Here’s the frequency response for a sample rate of 44100 Hz:

That’s what I meant, LOL. :unamused:

With DSD128 the sample rate is 5.6 MHz; I was curious as to what would happen if you attempted to remove all the spurious ultrasonic noise. The waveform looks a lot better (especially if you do the averaging more than once), but I can’t recreate the DSF to see if there are any audio anomalies. :frowning:

Does anyone know of any DSF decode/encode libraries in a language other than C++?
Let’s just say that’s not one of my fortes. :laughing:

FFmpeg includes “dsfdec” which is a DSD / DSF decoder.
I think that dBpoweramp also has an optional DSD decoder.

I don’t know of any software DSF encoders. Conversion from analog to DSD is generally done in hardware. Software transcoding from PCM to DSD would defeat the point of DSD (which, according to advocates of DSD, avoids limitations that are intrinsic to PCM).

I did play with a format converter to do DSD → PCM → DSD, and that process also removed some of the noise from the signal, but it messed up the timing in the file too - waves weren’t where they used to be!

That was one of the reasons I thought I’d try to remove the noise natively (relatively) and leave the stream untouched as far as timing is concerned. Of course the noise might be the reason DSD sounds different to some, and removing it might make it more PCM-like. On the other hand, it might make DSD64 sound like DSD128 and save lots of hard drive space!

Like I said, just curious.

Since posting I have found a patch for SoX that appears to add DSF encoding, but again it’s C++. I’ll have to dig some books out and see if I can work out what it’s doing.

Could you post a link to that? I’d be interested in having a look.

The patch I initially found is here:

Although it looks like there’s more to the thread than just that patch…

I assume the sigma-delta routine is for PCM → DSD conversion, so I think all I need to look at in this case is the initial patch (2/6) to repackage the stream. At least I hope so!


I know I’m drifting off topic now, but I’ve taken a look at the file format, and it seems relatively straightforward, but - this is the important bit - how does the data in the DSF file translate into a waveform?

Any clues appreciated, but I do understand if you don’t want to wander down that route… :wink:

See here:

That’s just the file layout again.

If I assume that the binary bits are added for 1 and subtracted for 0, then it would be impossible to get the up and down spikes that we see in Audacity, unless the bits are summed every n samples. But if that’s the case, it’s hardly bitstream, is it?

…or is the Audacity display a summation of what’s actually going on? I haven’t tried counting the 5,000,000 samples per second to find out. :open_mouth:

Edit: I just took a look and that is what it seems to be doing; one sample in Audacity appears to be something like 50-60 bits?

Further edit: That’s at a sample rate of 705600, or about 1/8th of the original. Which means the samples (rightly or wrongly) appear to be measured in bytes.
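A quick arithmetic check of that ratio, written in Python. It assumes DSD128 means 128 × 44100 one-bit values per second; the 705600 figure is the rate quoted above. That one imported sample corresponds to one byte of the stream is only what the numbers suggest, not something confirmed here.

```python
DSD128_RATE_HZ = 128 * 44100   # one-bit values per second for DSD128
IMPORTED_RATE_HZ = 705600      # sample rate reported after import (from the post)

# How many DSD bits fall within one imported sample period.
bits_per_imported_sample = DSD128_RATE_HZ / IMPORTED_RATE_HZ
print(DSD128_RATE_HZ, bits_per_imported_sample)
```

The ratio comes out as exactly 8, which fits the idea of one imported sample per byte of bitstream data.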

OK, I’ve finally twigged why bitstream editing is so difficult, LOL. :unamused:
You don’t have a simple amplitude value to adjust, you have to recalculate the whole wave… :frowning:

There isn’t a direct relationship between “bits” in DSD and “samples” in PCM. They are related only through the analog waveform that they represent.

In PCM digital audio we represent the amplitude at a given point in time with a number. The time / amplitude value pair is called a “sample”. The continuous analogue waveform is thus described by interpolation between sample values that are equally spaced in time, where the frequency of samples per second is the “sample rate”.

In DSD (Direct Stream Digital) audio, there are no “samples”, there is just a continuous stream of “bits”. A “bit” is a “binary digit”, which means that it is either an “on” state or an “off” state. The continuous analogue waveform is described by the “density” of “on” states. A continuous stream of zeros (off states) thus represents low voltage output, and a continuous stream of ones (on states) thus represents high voltage output. This is called “Pulse Density Modulation”:

Here’s an image to illustrate the scheme. Each vertical bar represents a “bit”. If the “bit” is a “one” (an “on” state) it is shown in blue, and if it is a “zero” (“off”) it is shown in white. The red line represents the analogue waveform, which is “high” where the density of “ones” is greatest, and “low” where the density of “ones” is least.
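To make the “density of ones” idea concrete, here is a toy encoder in Python. It is a first-order delta-sigma loop, one of the simplest ways to produce such a bitstream; it is not claimed to be what DSD hardware or any particular encoder does, and the function names are made up for illustration.

```python
import math


def pdm_encode(samples):
    """Turn samples in [-1.0, 1.0] into 0/1 bits (first-order delta-sigma)."""
    bits = []
    error = 0.0  # running difference between the input and the pulses emitted
    for x in samples:
        error += x
        if error >= 0.0:
            bits.append(1)   # "on" pulse, standing for +1
            error -= 1.0
        else:
            bits.append(0)   # "off" pulse, standing for -1
            error += 1.0
    return bits


def ones_density(bits):
    """Fraction of bits that are "on"."""
    return sum(bits) / len(bits)


def render(bits):
    """Draw "on" bits as # and "off" bits as . for a quick look."""
    return "".join("#" if b else "." for b in bits)


one_cycle = [math.sin(2.0 * math.pi * n / 64) for n in range(64)]
print(render(pdm_encode(one_cycle)))
print(ones_density(pdm_encode([0.5] * 1000)))
```

In this model a steady input of x gives a density of ones close to (x + 1) / 2, so silence (x = 0) is an even mix of ones and zeros rather than all zeros. It also shows why there is no single amplitude value to edit: the level only exists as an average over many bits.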