Default window size.


Permanent link to this post Posted by Piotr Grochowski » Sun Jun 25, 2017 8:25 am

In Audacity, the default window size for spectrograms is 256. However, that gives very poor frequency resolution. I suggest raising the default to around 4096.

Listen to 256, 4096 and 32768 spectrograms: https://www.youtube.com/watch?v=ig9GCtJi594
Piotr Grochowski
 
Posts: 23
Joined: Fri Jun 23, 2017 12:45 pm
Operating System: Windows 10

Re: Default window size.

Permanent link to this post Posted by steve » Mon Jun 26, 2017 1:51 pm

The choice of default window size is a compromise between frequency resolution and processing speed. For long tracks with a large window size, calculating the spectrogram can be very slow. I agree that for modern machines, 256 is a bit small as the default, but I think that 4096 would be too large for many of our users.

Also, temporal resolution decreases as the window size increases, which is another compromise that needs to be taken into account.
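To put rough numbers on this compromise, here is a small Python sketch. It assumes a 44100 Hz sample rate, takes the bin spacing as sample rate divided by window size, and takes the time span of one window as window size divided by sample rate. The function name is only illustrative and is not taken from Audacity.

```python
def window_tradeoff(window_size, sample_rate=44100):
    """Return (bin spacing in Hz, window duration in ms) for one FFT window."""
    bin_spacing_hz = sample_rate / window_size
    window_ms = 1000.0 * window_size / sample_rate
    return bin_spacing_hz, window_ms


# Larger windows give finer frequency bins but cover a longer stretch of time.
for n in (256, 1024, 2048, 4096, 32768):
    hz, ms = window_tradeoff(n)
    print(f"{n:>6} samples: {hz:8.2f} Hz per bin, {ms:8.2f} ms per window")
```

Each doubling of the window size halves the bin spacing and doubles the time covered by one window.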
9/10 questions are answered in the FREQUENTLY ASKED QUESTIONS (FAQ)
steve
Senior Forum Staff
 
Posts: 43949
Joined: Sat Dec 01, 2007 11:43 am
Operating System: Linux *buntu

Re: Default window size.

Permanent link to this post Posted by Piotr Grochowski » Tue Jun 27, 2017 5:40 am

Have you clicked the link to listen to these spectrograms? Now listen to the original of The Mine Song and hear how different it is from the 256 version. Too much pitch information is lost.

I made samples of more settings: https://www.youtube.com/watch?v=ji1it6awsN8

Notice how pitchless 256 is. A higher setting would be better.
Last edited by Piotr Grochowski on Tue Jun 27, 2017 9:52 am, edited 1 time in total.

Re: Default window size.

Permanent link to this post Posted by steve » Tue Jun 27, 2017 7:47 am

The spectrogram settings in Audacity have no effect on the sound. They only affect the visual display of the spectrogram track view (http://manual.audacityteam.org/man/spec ... _view.html)

Re: Default window size.

Permanent link to this post Posted by Piotr Grochowski » Tue Jun 27, 2017 9:53 am

I know. But listening to a spectrogram lets you hear which details that spectrogram shows. If you listen to the 256 spectrogram in Photosounder (see the video link; normalized, inverted, then gamma set to 3.1:1 for a proper volume scale), you can't hear the pitch of his voice, so this information is missing.

With the default window size of 256 at 44100 Hz, the frequency resolution is 44100/256 ≈ 172 Hz. It can't distinguish semitones below about 2900 Hz. In contrast, 4096 has 16 times better frequency resolution and distinguishes semitones from about 180 Hz upward. Note that the spacing between semitones in Hz is smaller at lower frequencies, so an FFT, whose resolution in Hz is constant, resolves pitch worse in the bass. And remember that if you need better time resolution, 1024 and 2048 are always available. Spek (a program that shows spectrograms of sounds) went with 2048. Photosounder (a program that allows editing sounds by editing their spectrograms) went with another method, but it is equivalent to using different window sizes to get half-semitone resolution, or more frequency resolution where the time resolution reaches 1/100 of a second.
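The semitone figures can be checked with a few lines of Python. This is only a sketch of the arithmetic. It assumes equal temperament (adjacent semitones differ by a factor of 2^(1/12)) and treats a semitone as "distinguishable" once the gap to the next semitone is at least one FFT bin wide. The function name is made up for this example.

```python
SEMITONE_RATIO = 2 ** (1 / 12)  # frequency ratio between adjacent semitones


def semitone_threshold_hz(window_size, sample_rate=44100):
    """Lowest frequency at which the gap to the next semitone spans one FFT bin."""
    bin_spacing_hz = sample_rate / window_size
    # The gap above frequency f is f * (SEMITONE_RATIO - 1); solve for gap == bin spacing.
    return bin_spacing_hz / (SEMITONE_RATIO - 1)


for n in (256, 1024, 2048, 4096):
    print(f"window {n:>4}: semitones resolved from about {semitone_threshold_hz(n):7.1f} Hz")
```

With these assumptions the threshold comes out near 2900 Hz for 256 and near 180 Hz for 4096, matching the numbers above.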

Re: Default window size.

Permanent link to this post Posted by Piotr Grochowski » Tue Jul 25, 2017 6:33 pm

The frequency resolution of the 256 default "is 0".

Image

Good luck telling what sound this is at the 256 default. The frequency scale is logarithmic from 27 to 20000 Hz. If you draw in the scale, you can see that it is a rising tone starting at about 260 Hz. To be exact, it is a 12-second sine chirp from 264 Hz to 528 Hz at maximum volume.

By the way, the point where the spectral leakage stops at the 256 default is 396 Hz, which has a period of 128 samples at a 50688 Hz sample rate.
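Why 396 Hz is special can be seen by counting how many cycles fit in one window. The sketch below assumes the usual reasoning that leakage is smallest when a whole number of cycles fits the window, so the tone sits exactly on a bin centre. The helper names are invented for this example.

```python
def period_in_samples(freq_hz, sample_rate):
    """Length of one cycle of the tone, measured in samples."""
    return sample_rate / freq_hz


def cycles_per_window(freq_hz, window_size, sample_rate):
    """Number of cycles of the tone that fit in one analysis window."""
    return freq_hz * window_size / sample_rate


# Start, middle and end of the 264 Hz to 528 Hz chirp at 50688 Hz, window size 256.
for f in (264, 396, 528):
    p = period_in_samples(f, 50688)
    c = cycles_per_window(f, 256, 50688)
    print(f"{f} Hz: {p:.2f} samples per cycle, {c:.4f} cycles per window")
```

Only 396 Hz gives a whole number of cycles (exactly two) in a 256-sample window. The start and end frequencies of the chirp fall between bins.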

Please make not only 4096 the default (or 2048, or 1024 if you really want time resolution), but also the logarithmic scale. Photosounder even goes as far as showing musical octaves (and semitones when zoomed in vertically, more of them the further you zoom) on the right side. On a linear scale, the entire top half covers an octave or less. A linear scale could only be useful at low frequencies, where it is much more compact than a logarithmic one (relative to higher frequencies). Also, linear is not how ears work. I don't see how constant frequency resolution with an FFT, or equally spaced overtones, gives useful information. Ears don't hear overtones as equally spaced.
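To illustrate the point about the top half of a linear scale, here is a rough Python sketch that compares how much of the vertical axis one octave takes up on each kind of scale. The axis limits (0 to 20000 Hz linear, 27 to 20000 Hz logarithmic) are assumptions based on the range mentioned earlier in this post, and the function names are made up.

```python
import math


def linear_fraction(f_lo, f_hi, axis_min, axis_max):
    """Share of a linear frequency axis taken up by the band f_lo..f_hi."""
    return (f_hi - f_lo) / (axis_max - axis_min)


def log_fraction(f_lo, f_hi, axis_min, axis_max):
    """Share of a logarithmic frequency axis taken up by the band f_lo..f_hi."""
    return math.log(f_hi / f_lo) / math.log(axis_max / axis_min)


# Two octaves: the chirp range and the top octave of the axis.
for lo, hi in ((264, 528), (10000, 20000)):
    lin = linear_fraction(lo, hi, 0, 20000)
    log = log_fraction(lo, hi, 27, 20000)
    print(f"{lo}-{hi} Hz: {lin:.1%} of linear axis, {log:.1%} of log axis")
```

On the linear axis the top octave fills half the height while the 264 to 528 Hz octave gets just over 1%. On the logarithmic axis every octave gets the same share.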

