Power spectral density over time

Hello,

I am a physiotherapy student in France working on a clinical study on breath sounds before and after mucus drainage techniques.
For this, I have .wav sound data with the breath sounds (normal sounds + abnormal sounds; the latter confirm mucus presence).

I would like to have a graph giving the power density of the signal (vertical axis) over time (horizontal axis). This would enable me to detect inspiratory and expiratory phases.

(If this function does not exist, maybe a shift of all the frequencies to a single frequency on the spectrogram graph? I can’t manage to find this.)

Thanks to anyone who could help,

Aaran

The standard release version of Audacity can produce a visual representation, but not a graph. See: https://manual.audacityteam.org/man/spectrogram_view.html

Hello Steve,

Thanks for your answer. The spectrogram doesn’t correspond to what I need; I need to have the instantaneous density (level of sound) per unit of time.
In the spectrogram it is split up into frequencies; what I would need is the sum of the densities (like an absolute value of the initial sound data, all on a scale above zero).

Would you have a way to do this?

Thanks

Inès

Audacity does not have a built-in way to do that (it’s not very relevant to Audacity’s primary role as an audio editor), but I think it is possible to write a Nyquist script to give the required data.

This is rather complicated to do with Nyquist, so it may take me a while to work up a solution. I’ll get back to you as soon as I can.

Give this Nyquist plug-in a go.
It is fairly slow, and it is limited to mono tracks only. Using a low sample rate will speed up the process a lot.

The output is in the form of a list: “time value” where “time” is in seconds and “value” is the PSD.
The PSD value is normalized such that a 0 dB sine wave produces a PSD of 1.

To produce a graph, you should be able to import the file into a graphing application (such as GnuPlot http://www.gnuplot.info/ or Microsoft Excel).
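As an alternative to a spreadsheet, a short script can read the list back in. This is only a minimal sketch in Python (it is not part of the plug-in): it assumes each line holds a time and a value separated by whitespace, and the helper names `read_psd` and `loudest` are made up for illustration.

```python
def read_psd(lines):
    """Parse 'time value' lines into a list of (seconds, psd) tuples."""
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        rows.append((float(parts[0]), float(parts[1])))
    return rows


def loudest(rows):
    """Return the (seconds, psd) pair with the highest PSD value."""
    return max(rows, key=lambda row: row[1])


if __name__ == "__main__":
    # Three rows in the same layout as the plug-in output.
    sample = ["0.064\t0.00128624", "0.128\t0.00251473", "0.192\t0.016568"]
    print(loudest(read_psd(sample)))
```

An open text file can be passed to `read_psd` directly, since iterating over a file yields its lines.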

The analysis uses windowed FFT (Hann Window) with 50% overlap, so two time / value pairs are produced for each analysis window. The size of the analysis window is set in the GUI and is in samples. The size must be a power of 2.

Note that the first value is a full window, so there is no output for time = 0.
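For anyone curious about the method, here is a rough sketch of the same idea in Python rather than Nyquist. It is not the plug-in's code. The function names are hypothetical, the time stamp is assumed to be the centre of each window, and the normalization (dividing by the Hann window's power gain and by 0.5, the mean square of a full-scale sine) is my assumption about how to make a 0 dB sine come out as 1.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def windowed_power(samples, rate, size=1024):
    """Return (seconds, power) pairs from Hann-windowed FFTs, 50% overlap."""
    if size < 2 or size & (size - 1):
        raise ValueError("window size must be a power of 2")
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * i / size) for i in range(size)]
    win_gain = sum(w * w for w in hann) / size  # power gain of the window
    hop = size // 2  # 50% overlap
    rows = []
    for start in range(0, len(samples) - size + 1, hop):
        frame = [s * w for s, w in zip(samples[start:start + size], hann)]
        spectrum = fft(frame)
        # Parseval: summed bin power / size^2 is the mean square of the frame.
        mean_square = sum(abs(c) ** 2 for c in spectrum) / (size * size)
        # Undo the window gain, then scale so a full-scale sine gives 1.
        power = mean_square / win_gain / 0.5
        rows.append(((start + hop) / rate, power))
    return rows
```

With a sample rate of 8000 Hz and a window of 1024 samples, the hop is 512 samples, so the rows are 0.064 seconds apart.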

This is an example of the output for a 10 second file, sample rate 8000 Hz, window size 1024 samples:

``````
0.064	0.00128624
0.128	0.00251473
0.192	0.016568
0.256	0.0142747
0.32	0.0132586
0.384	0.00681119
0.448	0.0121131
0.512	0.00868481
0.576	0.00578852
0.64	0.00495181
0.704	0.00657584
0.768	0.00441939
0.832	0.00524667
0.896	0.00263004
0.96	0.00255072
1.024	0.00263647
1.088	0.00278444
1.152	0.00295678
1.216	0.00356109
1.28	0.00350264
1.344	0.00261589
1.408	0.00226764
1.472	0.00197015
1.536	0.00288867
1.6	0.00223002
1.664	0.00183656
1.728	0.00204151
1.792	0.00248404
1.856	0.0025184
1.92	0.00227756
1.984	0.00203437
2.048	0.00354586
2.112	0.00364554
2.176	0.00196267
2.24	0.00379757
2.304	0.00586177
2.368	0.00301561
2.432	0.00349984
2.496	0.00773801
2.56	0.00614947
2.624	0.00447345
2.688	0.00899372
2.752	0.00886032
2.816	0.00800526
2.88	0.00888664
2.944	0.00704041
3.008	0.00594997
3.072	0.0034502
3.136	0.00240895
3.2	0.00190482
3.264	0.0021041
3.328	0.00186768
3.392	0.00291826
3.456	0.00233459
3.52	0.00112069
3.584	0.00192513
3.648	0.00292788
3.712	0.00198419
3.776	0.00568655
3.84	0.0135437
3.904	0.00738803
3.968	0.0105214
4.032	0.00715912
4.096	0.0101056
4.16	0.00733338
4.224	0.00614387
4.288	0.00561147
4.352	0.00574663
4.416	0.00431356
4.48	0.00447303
4.544	0.00327666
4.608	0.00380439
4.672	0.00303737
4.736	0.00369902
4.8	0.00449977
4.864	0.00459039
4.928	0.00326087
4.992	0.00230175
5.056	0.00265385
5.12	0.00299299
5.184	0.00268782
5.248	0.0023695
5.312	0.00208152
5.376	0.0031229
5.44	0.00258706
5.504	0.00223606
5.568	0.00207771
5.632	0.00155771
5.696	0.00130329
5.76	0.00106292
5.824	0.00178928
5.888	0.0017907
5.952	0.00210631
6.016	0.00187868
6.08	0.00257582
6.144	0.00225569
6.208	0.00173322
6.272	0.00184288
6.336	0.00174932
6.4	0.00236448
6.464	0.00182442
6.528	0.00119574
6.592	0.00111843
6.656	0.000855777
6.72	0.00120917
6.784	0.00112962
6.848	0.00124522
6.912	0.00126838
6.976	0.000886619
7.04	0.00071727
7.104	0.00121383
7.168	0.00176047
7.232	0.00136977
7.296	0.000848616
7.36	0.0042369
7.424	0.00504058
7.488	0.00308637
7.552	0.00401782
7.616	0.0045622
7.68	0.00299821
7.744	0.00456851
7.808	0.00432524
7.872	0.0041667
7.936	0.00271066
8	0.00363011
8.064	0.00456645
8.128	0.00321707
8.192	0.00470179
8.256	0.00969036
8.32	0.0568791
8.384	0.0339827
8.448	0.028619
8.512	0.0297567
8.576	0.0441635
8.64	0.0255006
8.704	0.017747
8.768	0.014662
8.832	0.0268725
8.896	0.0102266
8.96	0.0106406
9.024	0.00896745
9.088	0.0083277
9.152	0.0127638
9.216	0.0859554
9.28	0.0515675
9.344	0.0448324
9.408	0.025454
9.472	0.0479879
9.536	0.0269636
9.6	0.0178904
9.664	0.0183282
9.728	0.015325
9.792	0.0180833
9.856	0.0103029
9.92	0.00687405
9.984	0.00435782
10.048	7.82864e-06
``````

Installation instruction for Nyquist plug-ins can be found in the manual here: https://manual.audacityteam.org/man/customization.html#plug-ins
and here is the plug-in:
psd.ny (1.87 KB)

I’ve received this reply from Professor Dannenberg from CMU (one of the founders of Audacity):

Hi Steve,

I think your implementation is correct. You could output the
magnitude squared of the bins individually to get a power spectral
density – it’s not clear whether Aaran wanted spectral data or just the
overall power.

If you just want overall power, you can use Nyquist’s RMS function
to get very nearly the same thing without performing an FFT. Just square
the result to get from root-mean-square to mean-square (which is power).
This will be slightly different due to different windowing.

Thinking about the problem, power is going to follow the 1/d^2 law,
not to mention room effects, so I would think controlling for microphone
position would be critical and difficult if you want to compare power
measurements. If you normalize for power and look at spectral changes
maybe due to turbulence, there might be more interesting signal
differences between normal and abnormal.

-Roger

His suggestion of using the square of the RMS function has the benefit that it will probably be a lot faster than my original plug-in, and also a lot easier to code. I’ll be happy to make a plug-in using this technique for you if it is the “overall power” that you are wanting.
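To illustrate the idea (in Python, not Nyquist, and only as a sketch): the signal is cut into blocks, the RMS of each block is taken, and the result is squared to give mean-square power. The function name, the non-overlapping blocks and the block-centre time stamps are assumptions for the example, not how Nyquist's RMS function is implemented.

```python
import math


def mean_square_power(samples, rate, block=64):
    """Return (seconds, mean-square power) pairs, one per block."""
    rows = []
    for start in range(0, len(samples) - block + 1, block):
        chunk = samples[start:start + block]
        rms = math.sqrt(sum(s * s for s in chunk) / block)
        # Squaring the RMS gives the mean square, i.e. the power.
        rows.append(((start + block / 2) / rate, rms ** 2))
    return rows


if __name__ == "__main__":
    # One second of a 100 Hz full-scale sine sampled at 4000 Hz.
    tone = [math.sin(2 * math.pi * 100 * n / 4000) for n in range(4000)]
    for seconds, power in mean_square_power(tone, 4000, block=400)[:3]:
        print(f"{seconds:.3f}\t{power:.6f}")
```

No FFT is involved, which is why this approach should be much faster than the windowed version.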

Perhaps you could provide a bit more background information so that I can design the plug-in to fit your needs better.
Is this part of a medical research project or just a private project?
What are the aims of the project?
What kind of analysis are you intending to perform on the data?
What sampling rate are you wanting for the data?
Are you interested in a specific frequency range? If so, what is that range (in Hz)?
Anything else that may help me to have a clear picture about what you are doing?

Thanks a huge lot Steve,

Sorry for the late answer, the lockdown with my kids has made my agenda pretty full and I have had no time to work in the last few days…

The normal breathing sound signal has regular ups and downs: one bigger one at the end of inspiration and one in the middle of expiration.
The abnormal ones that I am studying are “crackles” or “whistling” (indicating the presence of mucus in the small airways for the first, and in the bigger ones for the second). These can be seen thanks to the Audacity spectrogram. These sounds have clear signal characteristics, and since they can be seen on the spectrogram, that view is perfect for them. (Initially I wanted to remove the heart sound, but high-pass filtering doesn’t remove the higher frequency components of the heart sound, so I just played with the spectrogram parameters and it is now OK.)

In theory I should be using Matlab, but I have no knowledge of coding, and since Audacity already comes close to my needs I chose Audacity. I’m really happy with the spectrogram and time representation, as I can put them one underneath the other.

To be able to count the crackles and whistles, I need to categorize when they happen during the breathing cycle (i.e. during expiration or inspiration), and that could be seen with a 3rd graph.

I think that the “overall power” option would be perfect.
I don’t know if this is enough information; please come back to me if you need more.

Is this part of a medical research project or just a private project? => It is my graduation project (in France, because I’m French), so I would say private as we are allowed no funding.
What are the aims of the project? => The aim is to identify the characteristics of breathing sounds before and after mucus drainage techniques, to quantify how much these characteristics change. The end target is to define an efficiency criterion for the technique that could be used on a day-to-day basis by physiotherapists.
What kind of analysis are you intending to perform on the data? => What I will do is count crackles and whistling, and for the whistling get the mean frequency (with the embedded spectral analysis of Audacity). For the crackles I need to know in which phase they happen (inspiration/expiration), and then I will measure the 2 cycle deflection width.
What sampling rate are you wanting for the data? => The electronic stethoscope has sampled the data at 2000Hz (as per the expert requirements I found in research articles).
Are you interested in a specific frequency range? If so, what is that range (in Hz)? => Since crackles, whistling, heart, and breath sounds have overlapping frequencies, I need to keep all frequencies.
Anything else that may help me to have a clear picture about what you are doing? => I have .jpeg images of the breath sounds but the forum doesn’t allow me to share these.

• normal lung sounds are 100-1000Hz with a drop of energy at 200Hz
• crackles are rapidly dampened wave deflections around 350Hz, 15msec max
• whistles (called “rhonchus”) are sinusoid 100-5000Hz >80msec

I thank you a lot for the time you already spent on helping me, don’t hesitate to request more info from me. I tried to be clear but maybe I wasn’t.

Thanks

Aaran

Hi again Steve,

The sampling rate is 4000Hz, not 2000Hz as I said, sorry.

In the next Audacity release, there is a “multi view” which allows a single track to be displayed in waveform and spectrogram view at the same time. (all being well it is scheduled for around the end of this month). You can see a preview here: https://alphamanual.audacityteam.org/man/Multi-view

I’ll assume that is the case for now, but if required, the plug-in that I have already posted could be adapted to output spectral data - that is, power per frequency band over time. I’d envisage such data as a table in which each row represents a time, and each column represents a frequency band. The number of frequency bands would be half of the “window size”. Thus a window size of 512 would give 256 frequency columns.

It sounds as if precise band filtering would be useful for you. The next version of Audacity has an update for Nyquist, which allows Nyquist to perform very tight band filtering. By “tight”, I mean that the filter slope can be extremely steep. There is however a trade-off, which is that the steeper the slope of a filter (greater frequency precision), the lower its precision in the time domain (“blurring” in the time domain). Nevertheless, the steepness and hence the time resolution may be tweaked to provide an optimal balance for the job in hand.

Note that with the plug-in already provided, the code can be easily modified to give “band limited power spectral density”. That is, rather than summing the power for all frequency bands, it could sum the power of just some of the bands.

Example:
For a sample rate of 4000 Hz, the available frequency range is 0 to 2000 Hz.
If the window size is 512, then there are 256 frequency bands.
Each frequency band has a width of 2000/256 = 7.8125 Hz
If only the highest 128 frequency bands are used in the calculation, then you will get the power spectral density for the frequency range 1000 to 2000 Hz.
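The arithmetic in this example can be checked with a few lines of Python. This is just an illustrative sketch: the function names are invented here and are not part of the plug-in.

```python
import math


def band_width(rate, window):
    """Width in Hz of one frequency band: (rate / 2) / (window / 2)."""
    return (rate / 2) / (window // 2)


def bands_for_range(rate, window, low_hz, high_hz):
    """Indices of the frequency bands lying between low_hz and high_hz."""
    width = band_width(rate, window)
    first = math.ceil(low_hz / width)
    last = min(math.floor(high_hz / width), window // 2)
    return range(first, last)


if __name__ == "__main__":
    print(band_width(4000, 512))  # 2000 / 256 = 7.8125
    bands = bands_for_range(4000, 512, 1000, 2000)
    print(bands.start, bands.stop, len(bands))  # bands 128 to 255: 128 bands
```

Summing the power of only those bands, instead of all 256, would give the band limited figure for 1000 to 2000 Hz.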

So that provides a bandwidth up to 2kHz.

I’ll make the faster “overall power” version of the plug-in, and post it here when I’ve written it.

Given the current pandemic, (which I’m expecting will be with us for quite some time), I expect your work will have great value, so I, and the Audacity team, are happy to help where we can. If any of the code or information that we provide can be useful to any of your colleagues, please make it available to them.

Regards
Steve

One other thing…

I’m guessing that from the above, you want data points at intervals somewhere in the region of 1ms to 5ms?
What would be the preferred format for the output data text file? Time indexed (as in the plug-in already posted)? Space separated power levels?

Here you go:
powerdata.ny (1.01 KB)

Thanks a lot Steve (and Roger also by the way) for your answers and clarification, and for the help with this clinical research. If it pays off it will help more people than myself, esp. the lung PT community.

The aim is to present the data to a jury for my graduation, but also to the 13th French respiratory physio research day (if not cancelled this year), and if the final data is relevant (and it already seems so over the 500 samples I took), I will publish the data, but also the method to get there. It would be great to propose a simple way to get to the lung sounds detection. Today a lot of research is done on this subject but the “sound engineering part” is way too complex to be used on a day-to-day basis by PTs, and the automatic detection of sounds is far from being efficient yet (because of the overlap of the sound characteristics).

And to come back to our subject, I’ve installed the first Nyquist plug-in and I was able to generate the file with the raw data. So I think I will be able to run your new Nyquist plug-in (crossing fingers).

I’m using this method in Audacity today to have the normal representation and the spectrogram one underneath the other:

1. I open the sound file
2. I highlight it and copy it
3. I click in the area underneath
4. I paste => I have the 2 same representations one under the other
5. I change the second one to a spectrogram
=> this way I have both with the same timeline

Ideally I would like to have 3 graphs one under the other (the first 2 I managed with the simple recipe I detailed above):

1. the usual representation => OK done
2. the spectrogram => OK done
3. the overall power => with your help ==> will we be able to import the file you are helping me to get back into Audacity? (with the “import” then “raw data” functions?)

Don’t hesitate to come back to me if I’m being unclear.

Thank you, and also thank Roger for me and the whole Audacity community for your help on this.
Naturally, I will include a special acknowledgment in my written report (that I can send you if you wish), in my oral exam in front of the jury, and also if I publish the work (in English and hopefully in a good journal).

Have a good morning (when you wake up)

Aaran

The plug-ins that I’ve posted do not export the data in a suitable format for importing back into Audacity, but can be modified.
The second plug-in (powerdata.ny) is much simpler code, so easier to modify, so I’ll work with that one for now. What I propose to do is to make the plug-in do two things:

1. Convert the selected audio into a “power” waveform that may look something like the image below.
2. Export the data as a text file that can be used for further analysis (as the current version does).

Hi Steve,

This would be just perfect!! Will the new graph have the same timeline as the waveform? As far as I can see, I will be able to change to the “power” view in the Audio Track Dropdown Menu? So I can copy and paste the sound track as I do with the spectrogram and change the waveform to the “power” view?

Really this is a lot of help!! thanks so much!!

Aaran

Here it is:
powerdata.ny (1.03 KB)
How to use it:

1. Record or import the track to be analyzed
2. Ensure that the track is selected (“Ctrl + A” to select All)
3. Duplicate the track (“Ctrl + D”)
4. Ensure that only the second track is selected (double click on the track)
5. Apply the plug-in

Hi Steve,

This is JUST fabulous!! thanks a lot.
I tested the plug in on one of my recordings, and I have my 3 graphs, perfectly aligned and with ALL the info I need!!!

Just great

Thanks, a “big pin is out of my foot” as we say in literally translated French! It means: a big obstacle has been overcome.

Thank you very much!

Aaran

You’re very welcome.

If you need to include a citation for Audacity in your report, the correct form is shown on our website: https://www.audacityteam.org/about/citations-screenshots-and-permissions/
If you need to credit the plug-in author, my name can be found in the plug-in (open the plug-in in a plain text editor).

I surely will! thanks!

Hi Aaran,
I was very interested in your post. I’m looking to do some power spectral density analysis on a patient with Parkinson’s disease related stridor and in particular using Audacity to do this. I have no coding skills. I wanted to ask how you are progressing with this and if you have any advice for making the recording given the intended analysis. kind regards, Tim