Default window size.

Permanent link to this post Posted by Piotr Grochowski » Sun Jun 25, 2017 8:25 am

In Audacity, the default window size for spectrograms is 256. However, this gives very poor frequency resolution. I suggest a default of around 4096.

Listen to 256, 4096 and 32768 spectrograms: https://www.youtube.com/watch?v=ig9GCtJi594
Piotr Grochowski
 
Posts: 81
Joined: Fri Jun 23, 2017 12:45 pm
Operating System: Windows 10

Re: Default window size.

Permanent link to this post Posted by steve » Mon Jun 26, 2017 1:51 pm

The choice of default window size is a compromise between frequency resolution and processing speed. For long tracks with a large window size, calculating the spectrogram can be very slow. I agree that for modern machines, 256 is a bit small as the default, but I think that 4096 would be too large for many of our users.

Also, temporal resolution decreases as the window size increases, which is another compromise that needs to be taken into account.
9/10 questions are answered in the FREQUENTLY ASKED QUESTIONS (FAQ)
steve
Site Admin
 
Posts: 45033
Joined: Sat Dec 01, 2007 11:43 am
Operating System: Linux *buntu

Re: Default window size.

Permanent link to this post Posted by Piotr Grochowski » Tue Jun 27, 2017 5:40 am

Have you clicked the link to listen to these spectrograms? Now listen to the original of The Mine Song and hear how different it is from the 256 version. Too much pitch information is lost.

I made samples with more settings: https://www.youtube.com/watch?v=ji1it6awsN8

Notice how pitchless 256 is. A higher setting would be better.
Last edited by Piotr Grochowski on Tue Jun 27, 2017 9:52 am, edited 1 time in total.

Re: Default window size.

Permanent link to this post Posted by steve » Tue Jun 27, 2017 7:47 am

The spectrogram settings in Audacity have no effect on the sound. They only affect the visual display of the spectrogram track view (http://manual.audacityteam.org/man/spec ... _view.html)

Re: Default window size.

Permanent link to this post Posted by Piotr Grochowski » Tue Jun 27, 2017 9:53 am

I know. But listening to a spectrogram lets you hear which details it shows. If you play the 256 spectrogram in Photosounder (see the video link; normalized, inverted, then gamma set to 3.1:1 for a proper volume scale), you can't hear the pitch of his voice, so this information is missing.

With the 256 default at a 44100 Hz sample rate, the frequency resolution is 44100/256 ≈ 172 Hz. It can't distinguish semitones below about 2900 Hz. In contrast, 4096 has 16 times better frequency resolution and distinguishes semitones down to about 180 Hz. Note that the linear spacing between semitones shrinks at lower frequencies, so the FFT's constant resolution is relatively worse in the bass. And remember that if you need better time resolution, 1024 and 2048 are always available. Spek (a program that shows spectrograms of sounds) went with 2048. Photosounder (a program that allows editing sounds by editing their spectrograms) uses a different method, but it's equivalent to using different window sizes to get half-semitone resolution, or finer frequency resolution once the time resolution reaches 1/100 of a second.
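The arithmetic above can be checked with a short sketch. It assumes that "distinguishing semitones" means the spacing between adjacent FFT bins is no wider than the gap between a note and the semitone above it; the function names are illustrative and are not part of Audacity.

```python
def bin_spacing_hz(sample_rate, window_size):
    """Spacing between adjacent FFT bins, in Hz."""
    return sample_rate / window_size


def semitone_threshold_hz(sample_rate, window_size):
    """Lowest frequency at which one semitone spans at least one bin.

    A semitone above f lies at f * 2**(1/12), so the gap is
    f * (2**(1/12) - 1). Setting that gap equal to the bin spacing
    and solving for f gives the threshold.
    """
    semitone_step = 2 ** (1 / 12) - 1  # about 0.0595
    return bin_spacing_hz(sample_rate, window_size) / semitone_step


if __name__ == "__main__":
    for size in (256, 1024, 2048, 4096):
        spacing = bin_spacing_hz(44100, size)
        threshold = semitone_threshold_hz(44100, size)
        print(f"{size:>5}: bins {spacing:7.2f} Hz apart, "
              f"semitones resolved above about {threshold:7.1f} Hz")
```

Under that assumption, the 256 and 4096 rows come out close to the 2900 Hz and 180 Hz figures quoted above.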

Re: Default window size.

Permanent link to this post Posted by Piotr Grochowski » Tue Jul 25, 2017 6:33 pm

The frequency resolution of 256 - default "is 0".

Image

Good luck telling what sound this is at the 256 default. The frequency scale is logarithmic from 27 to 20000 Hz. If you draw the scale, you can see that it's a rising tone starting at about 260 Hz. To be exact, it is a 12-second sine chirp from 264 Hz to 528 Hz at maximum volume.

And by the way, with the 256 default, spectral leakage stops at 396 Hz, which has a period of exactly 128 samples at a 50688 Hz sample rate.

Don't just make 4096 the default (or 2048, or 1024 if you really want time resolution); also make the logarithmic scale the default. Photosounder even goes as far as showing melodic octaves on the right side (and semitones if you zoom in vertically, with more shown the further you zoom). On a linear scale, the entire top half covers an octave or less. Linear could only be useful at low frequencies, where it is much more compact than logarithmic (relative to higher frequencies). Also, linear is not how ears work. I don't see how constant frequency resolution with FFT, or equally spaced overtones, could give useful information. Ears don't hear overtones as equally spaced.
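The 396 Hz remark can be illustrated with a small sketch. It assumes a plain rectangular window and a naive DFT written out by hand; a real spectrogram applies a window function, which changes the picture, so this only shows why a tone whose period divides the window length lands in a single bin. All names here are illustrative.

```python
import cmath
import math


def sine(freq, sample_rate, n):
    """n samples of a unit-amplitude sine at freq Hz."""
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]


def dft_magnitudes(samples):
    """Magnitudes of DFT bins 0..n/2 (naive O(n^2) transform)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def leakage_fraction(freq, sample_rate, n):
    """Share of spectral energy that falls outside the strongest bin."""
    energy = [m * m for m in dft_magnitudes(sine(freq, sample_rate, n))]
    return 1 - max(energy) / sum(energy)


if __name__ == "__main__":
    # 50688 / 396 = 128 samples per period, so a 256-sample window
    # holds exactly two periods of the 396 Hz tone.
    print("396 Hz:", leakage_fraction(396, 50688, 256))
    # 450 Hz does not fit a whole number of periods into the window.
    print("450 Hz:", leakage_fraction(450, 50688, 256))
```

With these assumptions, the 396 Hz tone leaves essentially nothing outside its peak bin, while the 450 Hz tone spreads a visible share of its energy across neighbouring bins.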

Re: Default window size.

Permanent link to this post Posted by Piotr Grochowski » Wed Aug 16, 2017 8:21 am

Paulstretch a song with a stretch factor of 1 and a time resolution of 0.006. That's what the 256 default sounds like.

Re: Default window size.

Permanent link to this post Posted by steve » Wed Aug 16, 2017 9:23 am

A click on a voice recording:

[Attachment: tracks003.png]

Re: Default window size.

Permanent link to this post Posted by Piotr Grochowski » Wed Aug 16, 2017 10:51 am

You didn't show the time scale.

This time, an actual song:
Image

You probably haven't watched these videos.

Re: Default window size.

Permanent link to this post Posted by steve » Wed Aug 16, 2017 11:49 am

Piotr Grochowski wrote:You didn't show time scale.

Regardless of the absolute time scale, it can clearly be seen that the time resolution with a window size of 256 is much better (16 x better) than with 4096. Similarly, frequency resolution is much better (16 x better) with a window size of 4096 than with a window size of 256. (Are you familiar with the expression "swings and roundabouts"?) There is no "one size fits all", which is why it is a setting that users can change to suit their needs.
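The trade-off can be put into numbers with a minimal sketch. It assumes a 44100 Hz sample rate and treats the window length as the time resolution, ignoring overlap and window shape; the function name is illustrative.

```python
def stft_tradeoff(sample_rate, window_size):
    """Return (window length in ms, bin width in Hz) for one window size."""
    window_ms = 1000 * window_size / sample_rate
    bin_hz = sample_rate / window_size
    return window_ms, bin_hz


if __name__ == "__main__":
    for size in (256, 1024, 4096):
        window_ms, bin_hz = stft_tradeoff(44100, size)
        print(f"{size:>5}: window {window_ms:6.1f} ms, "
              f"bin width {bin_hz:6.1f} Hz")
```

Window length and bin width always multiply to the same constant, so any gain in one is paid for in the other: going from 256 to 4096 makes the bins 16 times narrower and the window 16 times longer (roughly 5.8 ms versus 92.9 ms at 44100 Hz).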

Personally I think that as computers are typically faster than 10 years ago, and it is still not common for people to process multi-hour recordings, a default window size of 1024 would be a better compromise, but it will always be a compromise and will never suit everyone.
