Shaded waveform view (when zoomed out)

h-h
Posts: 110
Joined: Tue Jul 28, 2015 2:37 am
Operating System: Please select


Post by h-h » Tue Aug 18, 2015 9:37 pm

Hello,

there are issues with the current zoomed-out waveform view, i.e., the waveform view that doesn't show individual samples. It's a kind of two-layer "monochrome" rendering (pixel on/pixel off) with two kinds of information about the samples in the time frame of the pixel column: the maximum peak and the RMS ("root mean square", a kind of average).
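As a point of reference, the two quantities mentioned above (peak and RMS per pixel column) can be sketched in a few lines of Python. This is only an illustration of the idea, not Audacity's actual rendering code; the function name `column_peak_and_rms` and the test signal are made up for this example.

```python
import math

def column_peak_and_rms(samples, samples_per_column):
    """Return (min, max, rms) for each block of samples_per_column samples."""
    columns = []
    for start in range(0, len(samples), samples_per_column):
        block = samples[start:start + samples_per_column]
        # RMS: square every sample, take the mean, then the square root.
        rms = math.sqrt(sum(s * s for s in block) / len(block))
        columns.append((min(block), max(block), rms))
    return columns

# Full-scale sine wave with 100 samples per period, 100 samples per pixel column.
sine = [math.sin(2 * math.pi * n / 100) for n in range(400)]
for lo, hi, rms in column_peak_and_rms(sine, 100):
    print(f"min={lo:+.3f} max={hi:+.3f} rms={rms:.3f}")
```

For a full-scale sine wave the RMS is about 0.707 of the peak, which is why the inner RMS layer is always narrower than the outer peak layer.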

This waveform view can be rendered better in my opinion, i.e., with shades. The outer shape of the waveform would stay the same and the shades within the waveform would provide more information than before--potentially (and in my opinion indeed) more useful information. The RMS value now visible would be somehow contained in a newly rendered pixel column. In my understanding the RMS value is "just a kind of representation" and doesn't contain exact information about frequencies or the loudness (there might be frequencies in it too low for the human ear). An important part of a waveform is to provide a shape for orientation and hints to the acoustical characteristics. So in my opinion the RMS value isn't something to desperately keep.

I'm no professional on this topic. The proposal of the algorithm I make might have to be refined with respect to edge cases. I expect that this can be done by the developers with a sufficient technical base provided here. The good news is that the algorithm is relatively easy to implement--not harder than the current one I think.

Let me try to explain how I suggest rendering a pixel column. Please see the image "Algorithm" (you might be able to view it better by dragging it into a separate browser tab). See also this explanation:


             DESCRIPTION OF THE CALCULATION OF THE VALUE ASSOCIATED
                WITH EVERY PIXEL (UNIT: SAMPLES; SEE ALSO IMAGE)

          x   x   x                                                3
         / \ / \ / \
        x   x   x   x   x                                          8 = 3 + 5
       /             \ / \
      x               x   x                                       11 = 8 + 3
     /                     \
    x                       x                                     13 = 11 + 2
---/-------------------------\----------------------------------- Average sample value
  x                           x       x       x       x       x   18 = 12 + 6
                               \     / \     / \     / \     /
                                x   x   x   x   x   x   x   x     12 = 4 + 8
                                 \ /     \ /     \ /     \ /
                                  x       x       x       x        4

Number of samples: 31. Each value associated with a pixel is divided by the
number of samples. A value of 0 leads to full transparency; otherwise the
minimum visibility of the waveform would be pointless.
There are these concepts:
  • Value associated with a pixel: This value is a sample count that is later divided by the number of samples of the pixel column to get the intensity factor of a pixel.
  • Maximum value: When the intensity value for a pixel is calculated this way, experience shows that even the darkest pixel comes out gray. This is because of the alternating character of audio waves. The value, which has a maximum possible value of 1.0, therefore has to be stretched according to a maximum value. Example: With a maximum value of 0.5, a value of 0.4 becomes 0.8. The attached images apply a maximum value of 0.5. See also the image "Electric guitar, 1000 samples per pixel column, without applying maximum value".
  • Minimum visibility/opacity/intensity: With a minimum visibility of 0 %, the waveform would fade out into the background without giving information about the peak. So there has to be a minimum opacity over the background to be able to see peaks. The attached images apply a minimum intensity of 0.15, if not stated otherwise. See also the images "Electric guitar, 1000 samples per pixel column, minimum visibility of 0.0" and similar.
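The diagram and the three concepts above can be put together in a short Python sketch. It is a reading of the description, not the code of the attached converter: the function names and the row numbering are hypothetical, and the post does not spell out how the minimum visibility is combined with the stretched value, so taking the larger of the two is an assumption.

```python
from collections import Counter

def cumulative_counts(rows):
    """For each pixel row, count the samples at that row or farther from the
    column's average sample value (on the same side of the average)."""
    mean = sum(rows) / len(rows)
    hist = Counter(rows)
    values = {}
    for row in range(min(rows), max(rows) + 1):
        if row >= mean:  # at or above the average: this row and everything higher
            values[row] = sum(c for r, c in hist.items() if r >= row)
        else:            # below the average: this row and everything lower
            values[row] = sum(c for r, c in hist.items() if r <= row)
    return values

def pixel_opacity(value, n_samples, max_value=0.5, min_visibility=0.15):
    """Turn a sample count into an opacity between 0.0 and 1.0."""
    if value == 0:
        return 0.0  # nothing reaches this pixel: fully transparent
    stretched = min(1.0, (value / n_samples) / max_value)
    # Assumption: the minimum visibility acts as a lower bound.
    return max(min_visibility, stretched)

# The 31 samples of the diagram, quantized to pixel rows (numbering is hypothetical).
rows = [4] * 3 + [3] * 5 + [2] * 3 + [1] * 2 + [0] * 6 + [-1] * 8 + [-2] * 4
counts = cumulative_counts(rows)
for row in sorted(counts, reverse=True):
    print(row, counts[row], round(pixel_opacity(counts[row], len(rows)), 2))
```

The counts come out as 3, 8, 11, 13 above the average and 18, 12, 4 below it, matching the diagram. For a wave that is roughly symmetric about its average, about half of the samples lie on each side, so the unstretched value rarely gets much above 0.5; that is one way to read the choice of 0.5 as the maximum value.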
How do I describe what the waveform view shows to a user who doesn't read a technical specification? I can only try: Imagine a sine wave. Let the positive and negative parts be termed "filled". The darker a pixel, the more the wave is filled at that spot and above. Why also above? Because a lower part of a wave can be viewed as the base of a higher part (hence the darker representation of lower parts), and experience shows that no useful waveform results if you consider the spot alone (see attached image "Electric guitar, 1000 samples per pixel column, pixels without higher peaks").

Conclusions:
  • Most peak pixels are light gray. If someone has a mathematical/algorithmic alternative, let me know. For most waveforms shown, nothing about this dissatisfies me. Personally I can easily think of it as related to the data, since in most files only a few samples reach the high peaks.
    • See this image to prove that the data is correctly represented: "1000 Hz sine wave with 1000 Hz square wave, 500 samples per pixel column".
    • See this image to prove that higher peaks should be counted to lower peaks (the image doesn't do this): "Electric guitar, 1000 samples per pixel column, pixels without higher peaks".
  • You might see the DC offset of a subrange better. See image "Female voice singing, 200 samples per pixel column".
  • Dark pixels might reach a peak pixel. See this image for an explanation of this: "Subwaves on high peaks of low waves". The flute waveform shows the difference. Dark parts are high and shrill. Parts with a little amount of dark are low.
  • Please view waveforms in Audacity to see the same effect if you ask yourself about the alternating character of waveforms like: "Male speech, 200 samples per pixel column".
  • In my experience the current logarithmic waveform view removes too much shape information to still provide proper orientation hints. Otherwise I would use it more often. I tried to show a logarithmic scaling of the samples, but it doesn't look like Audacity's shape. I'm unsure whether I'm doing it correctly since I'm just handling intensity values in the range from 0.0 to 1.0. This is the formula (C#):
    • sample = Math.Log(1 + Math.Abs(sample) * (logarithmBase/*10*/ - 1), logarithmBase) * Math.Sign(sample);
  • When having a nearly unrecognizable waveform of a song amplified to the maximum you might be better able to see the beats. See the image: "Trance music with strong rhythm, 500 samples per pixel column".
  • You are better able to identify areas of a song with alternating loud and silent parts. See image: "Orchestral music with strings, 5000 samples per pixel column".
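For anyone who wants to try the logarithmic formula from the list above outside the converter, here is a Python transcription of the C# line. It assumes samples in the range -1.0 to 1.0; the function name `log_scale` is made up for this sketch, and nothing here says how Audacity's own logarithmic view is computed.

```python
import math

def log_scale(sample, base=10.0):
    """Map a sample in [-1, 1] onto a logarithmic curve in [-1, 1], keeping its sign."""
    # 1 + |sample| * (base - 1) runs from 1 to base, so the logarithm runs from 0 to 1.
    magnitude = math.log(1 + abs(sample) * (base - 1), base)
    return math.copysign(magnitude, sample)

for s in (0.0, 0.01, 0.1, 0.5, 1.0, -0.5):
    print(s, round(log_scale(s), 3))
```

The mapping keeps 0 at 0 and full scale at full scale and lifts everything in between, so quiet passages take up more of the vertical range.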
The attached images (see ZIP file) were created with a demo converter program I wrote to be able to see the effects of the waveform-drawing algorithm. You should view the images pixelated when you zoom. All underlying sound files have a sample rate of 44100 Hz. All generated waveform images are 500 px wide, so an image with 500 samples per pixel column has a duration of 5.7 seconds.
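The duration figure follows directly from image width, samples per pixel column and sample rate. A small Python check, with a hypothetical helper name:

```python
def image_duration_seconds(width_px, samples_per_column, sample_rate=44100):
    """Duration of audio covered by an image of width_px pixel columns."""
    return width_px * samples_per_column / sample_rate

# The zoom levels used in the attached images, all 500 px wide at 44100 Hz.
for samples_per_column in (200, 500, 1000, 5000):
    seconds = image_duration_seconds(500, samples_per_column)
    print(samples_per_column, round(seconds, 1))
```

So the images range from about 2.3 seconds (200 samples per pixel column) to about 56.7 seconds (5000 samples per pixel column).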

Note: There has been some discussion before about this topic, but mixed with other topics and not as thoroughly described as here. You can find the old discussion here if, e.g., you want to learn more about the future of the zoomed-in waveform view.

Please share your opinion with some details in case it's not already been said. Please share also the technical insight you might have that would improve the concept, if you're as well interested in Audacity having this feature.

See also my second post as an addition to this one.

You can generate waveform images with the attached demo converter program. It's a command-line program that might be best used with a batch file.
Attachments
Example images of waveforms.zip (1.32 MiB)
Algorithm.png (170.61 KiB)
Waveform Image Generator 1.1 (for Windows, .NET 4.0 required).zip (177.37 KiB)
Last edited by h-h on Wed Aug 26, 2015 4:43 pm, edited 5 times in total.

Gale Andrews
Quality Assurance
Posts: 41761
Joined: Fri Jul 27, 2007 12:02 am
Operating System: Windows 10

Re: Shaded waveform view (when zoomed out)

Post by Gale Andrews » Wed Aug 19, 2015 12:51 pm

It may have been useful to say at the top that this was the same request as what you previously called an "anti-aliased" waveform.

Pretend we released this as an alternative waveform that does not show RMS. How would the one sentence in the release notes describe it, so that anyone would want to use it?

Say in another sentence - does the top of a pixel width ever show values above the peak?

Say in another sentence - if a zoomed out pixel width is darker than its neighbouring pixel width, what does that mean?

Say in another sentence - can the darker section of a pixel width start above the centre line - or is that an optical illusion in your images?

Say in another sentence - do we expect blurrier images the farther we are zoomed out, as appears to be the case to me?


Gale

steve
Site Admin
Posts: 81609
Joined: Sat Dec 01, 2007 11:43 am
Operating System: Linux *buntu

Re: Shaded waveform view (when zoomed out)

Post by steve » Wed Aug 19, 2015 1:46 pm

Gale Andrews wrote: It may have been useful to say at the top that this was the same request as what you previously called an "anti-aliased" waveform.
That's perhaps my fault. I suggested that h-h avoid the term "anti-aliasing" as that caused so much confusion last time (and "anti-aliasing" is something completely different from what h-h is now describing).


@ h-h
If I understand correctly:
  • The vertical scale is divided into amplitude bands
  • The algorithm looks at the audio samples that lie within each pixel width, and counts how many samples lie within each band
  • The colour / transparency of each pixel is then calculated, based on the proportion of samples within the pixel width that lie within each amplitude band.
  • For each pixel width, the more samples present within an amplitude band, the darker the colour.
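Taken literally, the four points above describe a per-column histogram over amplitude bands. A minimal Python sketch of that reading, assuming samples in the range -1.0 to 1.0 and equally sized bands (the function name is hypothetical); note that this is the non-cumulative reading, whereas the diagram in the first post accumulates counts towards its dashed line:

```python
def band_fractions(samples, n_bands):
    """Fraction of samples falling into each of n_bands equal amplitude bands
    covering [-1, 1]; band 0 is the lowest band."""
    counts = [0] * n_bands
    for s in samples:
        # Map [-1, 1] to a band index; a sample of exactly 1.0 goes into the top band.
        index = min(int((s + 1.0) / 2.0 * n_bands), n_bands - 1)
        counts[index] += 1
    return [c / len(samples) for c in counts]

# One pixel width with six samples, four bands: the top band gets half the samples.
print(band_fractions([-1.0, -0.5, 0.0, 0.5, 1.0, 0.9], 4))
```

The larger the fraction for a band, the darker the corresponding pixel would be drawn.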
This bit I don't understand. In your first illustration (the "ASCII art" image) you show a cumulative score for the pixel values as the amplitude bands approach the dashed "center" line. Thus the colour for a single pixel column will always be darkest at that "center line". What is that "center line"? Is that the same as the "silence level" center line shown in the current Audacity waveform track, or is that the mean average of all samples within the pixel width, or is it something else?
What is happening at points A and B in this image?
Female voice singing, 200 samples per pixel column.png (53.77 KiB)
9/10 questions are answered in the FREQUENTLY ASKED QUESTIONS (FAQ)

anahuj
Posts: 92
Joined: Fri Dec 06, 2013 8:14 pm
Operating System: Please select

Re: Shaded waveform view (when zoomed out)

Post by anahuj » Thu Aug 20, 2015 2:57 am

I remember times when GNU/Linux's only audio editor did not show a min-max waveform. (Did I add it?) The RMS waveform was confusing and misleading. Now I have a hard time understanding what the suggested waveform displays. Let's not add confusion. Min-max is what I use because it has been best for years.

The proposed waveform display seems to count high frequencies and whether they happen at high or low amplitude, or so it looks. A spectrogram is better. The question is how to incorporate some useful features of the spectrogram into the waveform display, and whether we have to embed the information within the waveform, between the min/max values. A simple overlay of waveform and spectrogram could be less confusing: plot min/max lines on the spectrogram display.

h-h

Re: Shaded waveform view (when zoomed out)

Post by h-h » Thu Aug 20, 2015 3:48 am

First, I want to add some more to the first post (i.e., everything that follows up to the first separator line).

I suggest additionally attenuating the peak pixels (i.e., 2 per pixel column) according to how far the samples really reach towards the top (for the upper half) or the bottom (for the lower half) of the peak pixel. See the attached images of this post.
  • It gives a little bit more peak information
  • It can lead to smoother edges of the waveform based on real data
  • It removes the solid line otherwise present on silence
  • It gives the most useful view of very small waves
As you can see in the attached images, you aren't restricted to grayscale waveforms, of course.
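One possible reading of the peak-pixel attenuation, sketched in Python: the outermost pixel of a column is weighted by the fraction of its height that the peak actually reaches into. The function name, the parameters and the idea of multiplying the pixel's opacity by this fraction are assumptions for illustration; the post does not give a formula.

```python
def peak_pixel_coverage(peak, pixel_height):
    """Fraction (0..1) of the outermost pixel that the peak reaches into.

    peak: distance of the peak from the column's average, in amplitude units.
    pixel_height: amplitude range covered by one pixel row.
    """
    if peak <= 0:
        return 0.0  # silence: no peak pixel at all, so no solid line
    fraction = (peak / pixel_height) % 1.0
    # A remainder of 0 means the peak ends exactly at the top edge of its pixel.
    return fraction if fraction > 0 else 1.0

# One pixel row covers 0.125 amplitude units in this example.
print(peak_pixel_coverage(0.3125, 0.125))  # peak ends halfway into the third pixel
print(peak_pixel_coverage(0.25, 0.125))    # peak fills the second pixel completely
print(peak_pixel_coverage(0.0, 0.125))     # silence
```

Under this reading, a peak pixel that is barely touched is drawn faintly, which would give the smoother edges described above, and a silent column draws nothing.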

Please see the two images "Square waves 60, 1000, 3000, 6000, 10000 Hz ...". They indicate that lines or curves can tell of a break in the loudness of different frequencies, e.g., but they don't tell what frequency is present. An interesting waveform in this context is shown by the following picture from the zooming-out ZIP file: "0272 s.p.px.col." (center).

This is an attempt to explain why the center of a waveform is darker:


             x                                  lightest
            /#\
           x###x                                lighter
          /#####\
         x#######x                              light
        /#########\
       x###########x                            dark
      /#############\
     x###############x                          darker
    /#################\
   x###################x                        darkest
------------------------\---------------------------------
                         x###################x  darkest
                          \#################/
                           x###############x    darker
                            \#############/
                             x###########x      dark
                              \#########/
                               x#######x        light
                                \#####/
                                 x###x          lighter
                                  \#/
                                   x            lightest

There's a series of waveforms in a separate ZIP file attached to this post that shows a waveform being zoomed out. When you look at the first zoomed-in image, you see that the representation lacks visual quality. If someone knows a mathematical/algorithmic way to change that, please tell.

____________________________________________________________________________________________________
Gale Andrews wrote: It may have been useful to say at the top that this was the same request as what you previously called an "anti-aliased" waveform.
I mentioned it at the bottom.
Gale Andrews wrote:Pretend we released this as an alternative waveform that does not show RMS. How would the one sentence in the release notes describe it, so that anyone would want to use it?

Say in another sentence - does the top of a pixel width ever show values above the peak?

Say in another sentence - if a zoomed out pixel width is darker than its neighbouring pixel width, what does that mean?

Say in another sentence - can the darker section of a pixel width start above the centre line - or is that an optical illusion in your images?

Say in another sentence - do we expect blurrier images the farther we are zoomed out, as appears to be the case to me?
AUDACITY RELEASE NOTES wrote: A shaded waveform view was added that gives you more hints about acoustical characteristics, such as multiple amplitude-wise frequency breaks. It shows the same peak as the traditional waveform view, but slightly more precisely, which can lead to smoother edges. When the dark area of a pixel column is nearer to the peak than in another pixel column with the same peak, it means that more samples of the respective time frame are located nearer to the peak. When the dark part of a pixel column is not vertically centered, this indicates a shifted average sample value, which on a larger scale may be a shifted DC offset of a subrange. Since every pixel column is rendered by itself, it shows the audio data as carefully as the traditional waveform view, without lossy blurring when zooming out.
____________________________________________________________________________________________________
steve wrote:I suggested that h-h avoided the term "anti-aliasing" as that caused so much confusion last time (and "anti-aliasing" is something completely different from what h-h is now describing).
The ability to see waveforms clears things up for me now, too. See the image "Chirp" to see how it has started as an idea for smoother edges (different than in last attachment).
steve wrote:If I understand correctly: The vertical scale is divided into amplitude bands
Each "band" goes down/up to the DC offset (average sample value) and the "bands" are added up on the z-axis.
steve wrote:The algorithm looks at the audio samples that lie within each pixel width, and counts how many samples lie within each band. The colour / transparency of each pixel is then calculated, based on the proportion of samples within the pixel width that lie within each amplitude band.
If I understood correctly, yes.
steve wrote:For each pixel width, the more samples present within an amplitude band, the darker the colour.
See also the above code block.
steve wrote:This bit I don't understand. In your first illustration (the "ASCII art" image) you show a cumulative score for the pixel values as the amplitude bands approach the dashed "center" line. Thus the colour for a single pixel column will always be darkest at that "center line". What is that "center line"? Is that the same as the "silence level" center line shown in the current audacity waveform track, or is that the mean average of all samples within the pixel width, or is it something else?
The waveform is the darkest at the DC offset, i.e., the average sample value, which can differ from pixel column to pixel column. Otherwise zoomed-in waves would not show correctly; they have to be the darkest at the DC offset (of the very few samples). It's also useful information to see the DC offset when zoomed out.
steve wrote:What is happening at points A and B in this image?
You see this kind of "interlaced view" in the current waveform view, too. I try to explain it by means of the underlying audio data: The current waveform view (to my understanding) and the proposed one always pick a fixed number of samples and render a pixel column. It can be that from a wave that's rather constant for a few pixel columns, first, samples are picked that mostly are large positive values, then samples are picked that mostly are small negative values. Further, the DC offset (average sample value) might differ. There's a kind of shifting happening if you know what I mean.

____________________________________________________________________________________________________
anahuj wrote:Now I have hard time to understand what the suggested waveform displays.
I'm sure it's a matter of getting used to reading the acoustical characteristics of the type of waveforms an individual is working with. You should get used to it rather intuitively.
anahuj wrote:The proposed waveform display seems to count high frequencies and if they happen at high or low amplitude. Looks like.
Please see the two images "Square waves 60, 1000, 3000, 6000, 10000 Hz ..." in this context. They don't distinguish between frequency values, but they show frequency breaks.
anahuj wrote:Spectrogram is better. The question is how to incorporate some useful features of spectrogram to waveform display. And do we have to embed the information within the waveform - between min/max values.
This would cut away information from the spectrogram. These views have completely different scales.
Attachments
Waveforms.zip (301.36 KiB)
Female voice singing, zooming out.zip (1.93 MiB)
Last edited by h-h on Thu Aug 20, 2015 3:41 pm, edited 5 times in total.

steve

Re: Shaded waveform view (when zoomed out)

Post by steve » Thu Aug 20, 2015 11:38 am

h-h wrote: First, I want to add some more to the first post.
Please don't change the content of older posts after there has been a reply. It is very confusing to do so. Fixing spelling errors is of course OK, but if the content is changed then that demands that everyone engaged in the topic has to re-read the entire topic, which I for one do not have time to do. So please, if you wish to modify something that you said previously, do so in a new post.

h-h

Re: Shaded waveform view (when zoomed out)

Post by h-h » Thu Aug 20, 2015 11:42 am

steve wrote:
h-h wrote: First, I want to add some more to the first post.
Please don't change the content of older posts after there has been a reply.
I did it by means of my second post.

steve

Re: Shaded waveform view (when zoomed out)

Post by steve » Thu Aug 20, 2015 1:02 pm

Gale Andrews wrote:Pretend we released this as an alternative waveform that does not show RMS. How would the one sentence in the release notes describe it, so that anyone would want to use it?

Say in another sentence - does the top of a pixel width ever show values above the peak?

Say in another sentence - if a zoomed out pixel width is darker than its neighbouring pixel width, what does that mean?

Say in another sentence - can the darker section of a pixel width start above the centre line - or is that an optical illusion in your images?

Say in another sentence - do we expect blurrier images the farther we are zoomed out, as appears to be the case to me?
Perhaps I can try to answer some of these and h-h can correct me if I'm wrong.
  • Pretend we released this as an alternative waveform that does not show RMS. How would the one sentence in the release notes describe it, so that anyone would want to use it?
    ? Sorry, I don't know that one.
  • Say in another sentence - does the top of a pixel width ever show values above the peak?
    No. The proposed algorithm only changes the colour / transparency of the pixels that display the waveform, not which pixels are used.
  • Say in another sentence - if a zoomed out pixel width is darker than its neighbouring pixel width, what does that mean?
    There are more samples within that pixel width that are at, or below the level of that pixel, relative to the average amplitude of pixels in that pixel width, than there are in the neighbouring lighter pixel.
  • Say in another sentence - can the darker section of a pixel width start above the centre line - or is that an optical illusion in your images?
    Yes, the darkest colour occurs at the mean average amplitude of all samples within the pixel width, which may be higher or lower than the track centre line.
  • Say in another sentence - do we expect blurrier images the farther we are zoomed out, as appears to be the case to me?
    Not "blurrier", but generally "lighter" (more transparent) at higher amplitudes. (an exception to this is constant tones, which will retain the same appearance regardless of zoom level, provided that we are zoomed out far enough to show the "zoomed out" type view rather than the sample view).
h-h wrote:I did it by means of my second post.
Thanks, that's how we like it ;)

steve

Re: Shaded waveform view (when zoomed out)

Post by steve » Thu Aug 20, 2015 1:11 pm

h-h wrote:Each "band" goes down/up to the DC offset and the "bands" are added up on the z-axis.
How are you measuring "DC offset"?
I thought that you were taking the mean of samples within each pixel width, but if that is the case you cannot determine that it is "DC offset".
DC offset is by definition a constant value, but at low zoom levels, each pixel width may be only a few samples.
If for example, the zoom level is such that there are 4 samples per pixel, then even high frequencies will cause the so called "DC offset" to jump up and down in consecutive pixels, which is clearly not a case of "DC offset". So what does that actually indicate?
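The effect is easy to reproduce. The following Python sketch (hypothetical helper name, made-up test tone) takes a pure tone without any DC offset and averages it in blocks of 4 samples, as a pixel column would at that zoom level:

```python
import math

def column_means(samples, samples_per_column):
    """Mean sample value of each complete block of samples_per_column samples."""
    last_start = len(samples) - samples_per_column
    return [sum(samples[i:i + samples_per_column]) / samples_per_column
            for i in range(0, last_start + 1, samples_per_column)]

# Tone with a period of 8 samples (5512.5 Hz at 44100 Hz) and no DC offset.
tone = [math.sin(2 * math.pi * n / 8) for n in range(32)]
print([round(m, 3) for m in column_means(tone, 4)])
print(round(sum(tone) / len(tone), 3))
```

The per-column means alternate between roughly +0.604 and -0.604 although the mean of the whole tone is zero, so at this zoom level the per-column average follows the waveform itself rather than any offset.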

h-h

Re: Shaded waveform view (when zoomed out)

Post by h-h » Thu Aug 20, 2015 3:05 pm

steve wrote:How are you measuring "DC offset"?
I thought that you were taking the mean of samples within each pixel width, but if that is the case you cannot determine that it is "DC offset".
DC offset is by definition a constant value, but at low zoom levels, each pixel width may be only a few samples,
If for example, the zoom level is such that there are 4 samples per pixel, then even high frequencies will cause the so called "DC offset" to jump up and down in consecutive pixels, which is clearly not a case of "DC offset". So what does that actually indicate?
Sorry if I mixed up some terms with respect to common understanding. I had to describe matters some way. I didn't come across another term for the average sample value. I think I should speak of just the average sample value if you also can't name a dedicated term.

Please see also the "Algorithm" image of my first post which I updated.
