A different wave drawing style

Nat · September 5, 2008, 12:00am

As far as i understand, strictly speaking just connecting the dots of the sample values with straight lines is not really representing the resultant waveform. See: http://en.wikipedia.org/wiki/Whittaker–Shannon_interpolation_formula (There also exist windowed versions). The “real” waveform, as produced by an DAC can look quite different and even exceed the sample values.

To have a display mode, that could more accurately display the resultant waveform would be good, because it would clear up some misconceptions about digital vs. analog and it would make the visual comparison of waves easier and more reliable.

A subsample timeshift feature would be nice too.

Any thoughts on this ? Good idea, bad idea ?

steve · September 5, 2008, 9:07am

The most accurate representation would be to just show the dots (samples) and not join them up at all. The reconstruction of the analogue wave from the samples is handled by the sound card, not Audacity, so the method used to reconstruct the analogue signal may be different from one sound card to another. This is not something that Audacity can predict, but a simple dot-to-dot linking of samples is in most cases a reasonable representation (approximation) of what the sound card is likely to do. An exception to this is with very high frequencies (close to the Nyquist frequency) at which point the sound cards D/A conversion could be quite unpredictable. However, at normal sampling rates, these frequencies are so high that it is unlikely that the loudspeakers that are used for listening will produce the sound at all accurately. Fortunately these frequencies are virtually inaudible.

Nat · September 5, 2008, 11:47am

Of the digital data yes, I agree. Or plot lines going from the baseline straight up.

The reconstruction of the analogue wave from the samples is handled by the sound card, not Audacity, so the method used to reconstruct the analogue signal may be different from one sound card to another. This is not something that Audacity can predict, but a simple dot-to-dot linking of samples is in most cases a reasonable representation (approximation) of what the sound card is likely to do.

I cannot really agree with this. The sound card is not “free” to do whatever it wants with the samples, as it should attempt a reconstruction of the originally sampled wave. There is a mathematical “optimum” that the soundcard should try to come as close to as possible. The closer it gets the better it will sound.

An exception to this is with very high frequencies (close to the Nyquist frequency) at which point the sound cards D/A conversion could be quite unpredictable. However, at normal sampling rates, these frequencies are so high that it is unlikely that the loudspeakers that are used for listening will produce the sound at all accurately. Fortunately these frequencies are virtually inaudible.

Hmm, i can’t really follow/agree with you on the unpredictable bit either.

Anyway, I am wondering, if I shouldn’t try to do this myself. If I imagine that the screen drawing is really just another way to do ADC conversion, then instead of using the “draw straight line” routine, I would just generate points to plot with a windowed sinc function and I wouldn’t have to massage the input values at all. (?)

steve · September 5, 2008, 12:38pm

and down?

Some examples:
All sound cards have self noise (how much? what sort of noise?)
All sound cards produce distortion (perfect amplifiers have not been invented yet)
All sound cards low pass filter the output (at what frequency? what order filter?)
Some sound cards work internally at 16 bit, others at 24 bit, some even higher. A sound card running at 24 bit to render 16 bit data may anti alias the output. Sound cards may or may not use oversampling.
Some sound cards use noise shaping, others do not.

Should D/A converters use high pass filters? DC offset may be valid data in the digital realm, but it is not “sound”.

What if the sound card works internally at 48 kHz and is called upon to resample 44.1 kHz data? At low frequencies the conversion can be very accurate, but at very high frequencies the errors can become very high.

Consider a “perfect” sound card. I generate a triangle wave at 20kHz and record it through the sound card at 44.1 kHz. I then generate a sine wave at 20 kHz and record it at 44.1 kHz. I play the two recordings back through the perfect sound card. How does the sound card know what shape wave to produce?

Try this on a piece of graph paper: Set your X axis to correspond to 48kHz sample rate, and plot a sine wave at 20 kHz (well below the Nyquist frequency?) Look at the dots you have plotted and they look nothing like a sine wave. Could any other wave fit those dots?

Nat · September 5, 2008, 2:48pm

yeah. But I don’t care really about that drawing mode anyway

From my point of view, I think you are much too concerned about actually emulating what a specific sound card may produce. I am not. I am only concerned with what the soundcard should ideally produce. Like a 3D graphics card that gets a polygon to render. There is a mathematical model and the 3D card renders and gets it hopefully close or good enough. Same goes for sampling theory, Dirac, Nyquist, Sinc etc.

Good idea, excuse the coarse drawing… The blue line is what the soundcard is supposed to output and what I like to see on the screen (as an alternative rendering mode). The red line is the status quo. Imagine just looking at the red line, could you draw the blue line mentally in your head ? I couldn’t.

steve · September 5, 2008, 4:07pm

I hope you are enjoying this discussion and finding it interesting. I am

That is more or less the point - No you probably wouldn’t reconstruct the blue line in your head, but the sound card most likely would not either.

Here I have 3 recordings of a sine wave recorded at different sample rates. In the top recording, the sine wave is close to (but below) the nyquist frequency. How should the soundcard reproduce each of these waves? How is it likely to reproduce each of these waves? Will all sound cards reproduce the waves the same?

Let’s imagine that we just had the lower track (high sample rate) wave in Audacity, but our sound card was only capable of working at frequencies shown in the upper track. Will it reproduce the theoretical sine wave that we began with, or something different? From doing some experimental tests, I can confidently say that the original sine wave will not be reproduced, but rather something more like the top track with bezier curves rather than straight lines.

Your comparison with graphics is an interesting one. If we zoom in close on a photograph on a computer, we begin to notice that what at first appeared to be continuous changes of shade, curves and colours now reveals itself as rectangles of different colours. Similarly, as we zoom in very close on a wave form, we see it represented as dots. Perhaps it would be better if these dots were joined by curves rather than straight lines - would that have any impact on the audio performance of Audacity? Would the benefit to users justify the additional program code?

Nat · September 5, 2008, 5:59pm

But, and I think I am just repeating myself, assuming that the the sample values are a correct representation of the source waveform, then it’s possible (see link in my first mail) to reconstruct it for viewing. At least we get a nice view of the status quo., that being the sampled signal not necessarily the rendered (by the DAC) signal.

Here I have 3 recordings of a sine wave recorded at different sample rates. In the top recording, the sine wave is close to (but below) the nyquist frequency. How > should > the soundcard reproduce each of these waves?

Excellent question! Wouldn’t that be nice to view in audacity ? If we had that drawing mode and assuming nyquist and friends are right, we should be seeing the same sine wave. Or we would not and could gauge the distortion right there just by looking at it.

How is it likely to reproduce each of these waves? Will all sound cards reproduce the waves the same?

Again the sound card, as far as I know, should ideally take the samples and apply the Whittaker–Shannon interpolation formula and use that to reconstruct the waveform. I don’t know that much about actual DACs, it would depend on the filter design. Maybe simulating a common converter would be nice feature too (but a bit more complicated).

Let’s imagine that we just had the lower track (high sample rate) wave in Audacity, but our sound card was only capable of working at frequencies shown in the upper track. Will it reproduce the theoretical sine wave that we began with, or something different? From doing some experimental tests, I can confidently say that the original sine wave will not be reproduced, but rather something more like the top track with bezier curves rather than straight lines.

Hmm you mean, play back at 8000 Hz and record at 44100 again ? Hmm I can’t do that here, I only can record and play at the same speed

Your comparison with graphics is an interesting one. If we zoom in close on a photograph on a computer, we begin to notice that what at first appeared to be continuous changes of shade, curves and colours now reveals itself as rectangles of different colours. Similarly, as we zoom in very close on a wave form, we see it represented as dots.

(I meant more like a polygon on say a Quake model, but anyway.)

Perhaps it would be better if these dots were joined by curves rather than straight lines - would that have any impact on the audio performance of Audacity? Would the benefit to users justify the additional program code?

If you consider that the curves I am talking about are comparable to the Lanczos interpolation (extremely similar math!) in graphics, compared to the straight line being linear interpolation, I think you’re know where I am coming from. Since is only a visualization thing it obviously doesn’t make the audio performance any different.

Benefits:

o See part of your wave being “lopped off” by the 0dB line, even if you didn’t suspect it. _Edit: Found an interesting article about that: http://www.audioholics.com/education/audio-formats-technology/issues-with-0dbfs-levels-on-digital-audio-playback-systems_
o Educational benefit for students of DSP
o Easier visual comparison between sampled data
o Get me off your back

Anyway it would only make the drawing slower, since the drawline algorithm has to be optionally exchanged with the custom drawer (or so I believe currently!).

steve · September 5, 2008, 7:23pm

I’m not sure about that - zooming out so that scrolling during playback does improve performance (more tracks simultaneously). Would the more complex interpolation make any difference? Certainly switching to spectrum view has a major hit on performance - I don’t know enough about how the visual rendering is done to answer that.

I was wondering if all converters would actually "lop off " those peaks, or not - read the article in the link you posted and it seems that some do and some don’t, depending mostly as far as I can tell, on the design of the analogue circuitry after the DAC.

In real world audio, the effect is a lot less serious than indicated in those tests, since the effect is very small at low frequencies and high level signals at high frequencies are quite rare (typically you would expect frequencies above 5kHz to peak below -24dB for music normalized to 0dB). Nevertheless, it does indicate that (contrary to the general practice of many professional mastering sound engineers) it is better to normalize below 0dB for CD.

If it wasn’t for the straight lines, we wouldn’t be having this conversation - leave it as it is and students of DSP can argue about it

Possibly, but its not such a major issue 'till you are dealing with frequencies that are predominantly above about 8 kHz, so for editing music it will rarely if ever be a problem. Anyway, I’ve done lots of audio editing and never found it to be a problem (though curves would also be prettier).

OK you’ve convinced me
How good are your programming skills? Help yourself to the source code and get hacking.

Nat · September 5, 2008, 11:45pm

OK

I tried to build Audacity on my Mac, took a few hours and the app even starts, but it’s strangely not debuggable. When I run it I get a “Program exited with code 055.”

Later…

Breakpoint 3, 0x9460b05c in ptrace ()
(gdb) where
#0 0x9460b05c in ptrace ()
#1 0x163b245c in globals_key ()
#2 0x163b30fc in CAUWrapperEntry ()
#3 0x9484d994 in CallComponentOpen ()
#4 0x9484be24 in OpenAComponent ()
#5 0x0022e714 in AudioUnitEffect::AudioUnitEffect ()
#6 0x0022e7a4 in AudioUnitEffect::AudioUnitEffect ()
#7 0x0022e5c4 in LoadAudioUnits ()
#8 0x00140af4 in LoadEffects ()
#9 0x0001000c in AudacityApp::OnInit ()
#10 0x000123f4 in wxAppConsole::CallOnInit ()
#11 0x00728b1c in wxEntry ()
#12 0x0000d444 in main ()
(gdb) p $r3
$1 = 31
(gdb)

walitza:~ nat$ grep PT_DENY_ATTACH /usr/include/sys/*
/usr/include/sys/ptrace.h:#define PT_DENY_ATTACH 31

well no wonder…

I will try to code something in Cocoa as a demo first.

kozikowski · September 6, 2008, 2:17am

<<<DC offset may be valid data in the digital realm, but it is not “sound”.>>>

An audio system should not pass DC. This is an exercise in fantasy that engineers go through. Yes, a theoretical perfect sound system would pass and process DC, but that would mean that the microphone could capture wind and the speaker system should deliver that wind at the other end. DC, right, one direction air movement? AKA wind.

Cool to think about, but impractical and usually dangerous.

Yes, indeed 44.1 will not pass 20 KHz with any accuracy at all. If you stick with undithered audio signals, the highest reliable pitch tone is 17-something KHz. Nyquist 2.6. An interesting thing happens in the output filters, though. No audio system will pass a square or triangle wave that far up. By definition, a distorted waveform consists of the base tone and multiple harmonics. Since the system will clearly not pass a 40 KHz audio tone, the triangle and square waves magically turn back into a sine waves losing all, or most of their distortion.

They may not be accurate, but they will not be anything but smooth sine waves. This is why “oversampling” and fancy-pants filters were a part of CD advertising for a long time. What do you do with 44.1KHz which can’t really do what it’s being asked to do.

I can’t hear any of that, but I think the dog appreciates it.

Koz

Nat · September 6, 2008, 9:46am

I pondered about this a little but I have no idea how this fits into the discussion or if this is even addressing me, so I am ignoring it for now

Yes, indeed 44.1 will not pass 20 KHz with any accuracy at all. If you stick with undithered audio signals, the highest reliable pitch tone is 17-something KHz.

It would seem to me that depends on the criteria for accuracy. Here’s a 20 Khz sine wave generated by Audacity and below it (scroll) the signal as recorded by Audacity using an Apogee Duet.

If I ignore the phase difference, to me it looks close enough for “any accuracy”. Edit: after thinking about it, I remembered that the Project Rate is 44100 but the Duet was set to 48000 (though what would it matter, since this rate conversion step could be also attributed to the DAC). So I set it to 44.1Khz. Same result.

An interesting thing happens in the output filters, though. No audio system will pass a square or triangle wave that far up. By definition, a distorted waveform consists of the base tone and multiple harmonics. Since the system will clearly not pass a 40 KHz audio tone, the triangle and square waves magically turn back into a sine waves losing all, or most of their distortion.

This reads to me that you are saying that the DAC will/should anti-alias above Nyquist. Well yes! But since the signal is supposed to be bandwidth limited already anyway, there is no distortion “lost”, since the DAC is not supposed to generate frequencies in the alias bands.

They may not be accurate, but they will not be anything but smooth sine waves.

They surely will not look like a smooth sine wave. They will look like a combination of (quoting you) base tone and multiple harmonics, basically the distorted waveform. Just try it for yourself in Audacity.

steve · September 6, 2008, 4:43pm

I think that this does fit into the discussion if we are talking about making the Audacity wave display reflect more accurately the theoretical output from the sound card, and it poses something of a conundrum.
Audacity currently displays “silence” which has a DC offset, by simply joining the dots (as it does with all samples). Some DACs will output this signal in the same way, and produce a constant DC voltage on the output, but others will apply a subsonic high pass filter to block the DC offset. Most sound cards block DC from the output, and certainly all audio amplifiers should block this (DC offset is a real good way to burn out speakers). But in the audio realm, what does DC offset represent? A constant pressure differential, otherwise known as wind. An accurate audio reproduction of DC offset would be a constant flow of air either into or out of the loudspeaker, which obviously does not happen. So would it be better for Audacity to show the dots in the actual sample positions, but draw the line moving away from the samples and dropping to zero (-infinite dB)? This would show more accurately the signal that comes out of the soundcard/amp.

They will produce a combination of sine waves, all below the Nyquist frequency. Check it out on a spectrum analyser. Fast Fourier analysis of an square wave will show an “infinite” series of harmonics, but as previously said, this will be band limited. With high frequency square waves (or sawtooth waves…) the DAC just has to make the best approximation that it can, (which is not very close at all, often producing huge amounts of modulation distortion).

Nat · September 7, 2008, 4:46pm

Again if you want to emulate the behaviour of specific soundcard maybe you want to do so (I wouldn’t).

But we are just running around in circles on that subject, I find the emulation of actual or hypothetical soundcards and their foibles/features not as interesting as seeing what the samples are supposed to represent. The straight line interpolation between samples is very crude and the potential amount of error I have already shown in my previous drawing:

And that’s not just for dog’s ears. If we assume the sample rate is 44.1Khz that sampled wave is in the very audible range of ca. 10Khz. And as can be deduced form the 0dbFS articles it’s not just theory, the DACs actually do emit waveforms similiar to the blue curve, rather than what the red lines show.

Yeah, but that’s just what I wrote (combination of base tone and multiple harmonics) in different words.

steve · September 7, 2008, 6:53pm

Perhaps we are - maybe we’re getting tied up on semantics, but I’m having trouble with your phrase “what the samples are supposed to represent”.

As I see it, the data in the computer is nothing more than sample values - these are represented by the dots. As far as the procesor is concerned, there is no analogue waveform, just sample values. To represent the data correctly, there should be no line at all. However, in most situations, the joining of the dots makes the visual data easier to read for the user.

The line between the dots, whether straight or curved, is indicating inter-sample values, but these values do not really exist except in the the analogue domain, and that is either pre A/D or post D/A. In either case, drawing a curve suggests that Audacity can predict either, the analogue wave prior to digitisation, or the analogue wave post sound card. In either case we would be looking at what the DACs do, not the digital data which has no inter-sample values unless we extrapolate the data (essentially up-sampling).

Here we see a 3000 Hz sine wave that has been recorded at 8kHz (upper track).
The lower track shows that 8kHz sample rate wave after it has been rendered by the sound card and re-recorded at 48 kHz.

This would suggest that it would be quite reasonable for Audacity to interpolate the data as you suggest, but now let us look at another example.

In this next example the top track is a recording of a 2770 Hz square wave - recorded at 8kHz sample rate.
Now should our “analogue line” (inter-sample values) represent the theoretical waveform prior to A/D conversion, or post D/A conversion. As we will see, the two are very different.

The second wave is the 8kHz recording, rendered by the sound card and re-recorded at 48kHz. We can immediately see the difficulty that the sound card had, though I doubt that the result differs much from the mathematical optimum.
It is interesting to note here that this track has considerably higher peak amplitude than the original, although the record setting were the same for each of these tracks.

The third track is the original square wave at 48kHz sample rate, and is a fair indication of a square wave.

The final track is the 48kHz wave, rendered through the sound card and re-recorded. It appears that the sound card has produced errors in the sample values, (the horizontal lines are no longer straight, but jitter up and down), but this is not due calculation errors in the sound card, but rather by ripple caused by bandwidth limiting. Perhaps both the third and fourth tracks should indicate this ripple?

The questions here are:

Looking at the first track - would it be better for Audacity to draw a line that represents the wave before or after conversion to the digital domain - noting that in the digital domain there should be no line at all.

Looking at the third track, should the line show a square wave as it does, or should it show the ripple that will be produced due to bandwidth limiting?

Looking at the fourth track, should the line be a curve indicating the overshoot and subsequent ripples?
(I’m not really asking for answers here, just voicing my contemplations.)

Finally, let’s look at a couple of milliseconds of real world audio rather than test signals:

For this scenario, is there really a problem with using straight lines to join the dots? Is it worth the effort of programming a windowed sinc function into the rendering? After all, Audacity is intended as an audio editor, not a scientific signal analysis tool.

You may be interested in this project: Sonic Visualiser

Nat · September 7, 2008, 10:38pm

Ok that’s how I see it too (or maybe vertical lines only).

However, in most situations, the joining of the dots makes the visual data easier to read for the user.

The line between the dots, whether straight or curved, is indicating inter-sample values, but these values do not really exist except in the the analogue domain, and that is either pre A/D or post D/A.

And that is not really true. And that’s because of Nyquist and friends. Each inter-sample value has a definite value that is dependent on the other sample values. Think of the sample values as more like parameters to an underlying function. These inter-sample values will appear in the digital domain, f.e. with sample rate conversion. Upsample your file and you will see these intermediate values . Such a sample rate conversion is incidentally: drawing to the screen at a zoom.

In either case, drawing a curve suggests that Audacity can predict either, the analogue wave prior to digitisation, or the analogue wave post sound card. In either case we would be looking at what the DACs do, not the digital data which has no inter-sample values unless we extrapolate the data (essentially up-sampling).

Yes it would suggest that, because it can. The beauty of mathematics and sampling theory That’s why I suggested to you a glance at Whittaker–Shannon interpolation formula - Wikipedia (and maybe also Nyquist–Shannon sampling theorem - Wikipedia).

I suspect you haven’t read it, so lets cite it here:

The sampling theorem states that, under certain limiting conditions, a function x(T) can be recovered exactly from its samples

and (both apply to us)

**There are two limiting conditions that the function x(t) must satisfy in order for the interpolation formula to be guaranteed to reconstruct it exactly:

x(t) must be bandlimited.
The sampling rate, fs, must exceed twice the bandwidth,**

and then it goes on and reveals what the “underlying” function is (->the Whittaker–Shannon interpolation formula)

…
Now should our “analogue line” (inter-sample values) represent the theoretical waveform prior to A/D conversion, or post D/A conversion. As we will see, the two are very different.

Looking at the fourth track, should the line be a curve indicating the overshoot and subsequent ripples?
(I’m not really asking for answers here, just voicing my contemplations.)

The nice thing is, you don’t really need to ponder this. There is only one way to draw the line based on the sample data and the sample frequency. Everything else is emulation of more or less accurate hardware. As the source of the data and the eventual render is often unknown (e.g. load/save files), it’s probably mostly pointless and uninterestung to try to emulate hardware.

Finally, let’s look at a couple of milliseconds of real world audio rather than test signals:

For this scenario, is there really a problem with using straight lines to join the dots? Is it worth the effort of programming a windowed sinc function into the rendering? After all, Audacity is intended as an audio editor, not a scientific signal analysis tool.

Well I would say yes, because if I it weren’t for real world samples, where I would like to have this feature, I wouldn’t even be bothering you or myself with this. And I actually I suspect it would be a pretty big feature for Audacity, as it would set it apart from other editors. We’ll see when I get my demo renderer going.

You may be interested in this project: > Sonic Visualiser

Thanks for the link! But I like Audacity much better