NoiseRemoval.cpp

Who can help me understand this file? I see no names in it besides Dominic Mazzoni. A couple things puzzle me.

Try saying what puzzles you.


Gale

  1. ApplyFreqSmoothing is called in two places. The call from Cleanup is unreachable because mDoProfile is always false at that point. A mistake?

  2. I understand why there is division by 20 calculating mNoiseAttenFactor. I understand why division is by 10 not 20 when computing mSensitivityFactor because it is used in comparison of squared amplitudes. I do not understand why division is by 10 not 20 when computing mOneBlockAttackDecay which affects decay of gain factors that are not squared.

  3. Would it make more sense to take geometric means, not arithmetic means, in ApplyFreqSmoothing, since these are multiplicative factors that are being averaged? But there may be zeroes if isolating noise. On the other hand, what Isolate does is questionable. See point 6.

  4. In this line:

   mOutSampleCount += mWindowSize / 2; // what is this for?  Not used when we are getting the profile?

Isn’t the simple answer, “to make processSamples progress toward termination”?

  5. In these lines:
   // Apply gain to FFT
   for (j = 0; j < (mSpectrumSize-1); j++) {
      mFFTBuffer[j*2  ] = mRealFFTs[out][j] * mGains[out][j];
      mFFTBuffer[j*2+1] = mImagFFTs[out][j] * mGains[out][j];
   }
   // The Fs/2 component is stored as the imaginary part of the DC component
   mFFTBuffer[1] = mRealFFTs[out][mSpectrumSize-1] * mGains[out][mSpectrumSize-1];

I believe the result is that mFFTBuffer[0] and mFFTBuffer[1] always become 0 because the corresponding entries of mRealFFTs are always 0. The squares of the untransformed coefficients are in the first and last entries of mSpectrums[out] and were never copied into mRealFFTs. We have lost the signs of these coefficients by now.

  6. The manual says of Isolate:

Select this option to leave just the noise - useful if you want to hear exactly what the Noise Removal effect is removing.

But the effect is not in fact to compute the difference between what would be returned with Remove, and the original signal. Maybe that is proper, because with phase shifts of passed frequencies, the aural impression would not be correct. But perhaps the better way to compute this would be to compute all gain factors as if for noise removal, then apply one minus gain to the frequencies just before the inverse FFT?

  7. There are two hidden magic numbers in the code, the FFT window size of 2048 and the 50 millisecond minimum signal time that influences the exact noise detection criterion. Might these make sense as advanced controls?
  8. Is FinishTrack wasting time in case the user cancels?
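
Point 3 can be tried out in isolation. The sketch below is not taken from NoiseRemoval.cpp: the function name `smoothGains` and the `minGain` parameter are made up for illustration, and `minGain` is an assumed guard that clamps gains before taking logarithms so that a single zero does not drag the whole geometric mean to zero.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Average gains[j-width .. j+width] (clipped to the array bounds), either
// arithmetically or geometrically. The geometric mean is computed as
// exp(mean(log g)); gains are clamped to `minGain` (hypothetical guard)
// before the logarithm is taken.
std::vector<float> smoothGains(const std::vector<float>& gains, int width,
                               bool geometric, float minGain = 1e-6f) {
    const int n = static_cast<int>(gains.size());
    std::vector<float> out(gains.size());
    for (int j = 0; j < n; ++j) {
        const int lo = std::max(0, j - width);
        const int hi = std::min(n - 1, j + width);
        double sum = 0.0;
        for (int k = lo; k <= hi; ++k)
            sum += geometric ? std::log(std::max(gains[k], minGain)) : gains[k];
        const double mean = sum / (hi - lo + 1);
        out[j] = static_cast<float>(geometric ? std::exp(mean) : mean);
    }
    return out;
}
```

For three neighbouring gains of 1, 0.01 and 1, the arithmetic mean at the middle bin is 0.67 while the geometric mean is about 0.215, so the geometric version lets one strongly attenuated bin pull its neighbours down further.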

What happens if you change it from 10 to 20?


What happens if you change it?


You’re correct, but in NoiseRemoval.h it has a contradictory comment:

   // Variables that only exist during processing
   WaveTrack            *mOutputTrack;
   sampleCount       mInSampleCount;
   sampleCount       mOutSampleCount;

How would you explain that?

Does that work better?


Regarding the FFT window size, a larger window size can give (subjectively) better noise reduction with some material, so yes there is a case for making that an advanced control. On the other hand, many users already find this effect intimidating / too difficult and additional “advanced controls” would make that worse.
(I’ve not tested the 50 millisecond signal time)

> In these lines:
>
>    // Apply gain to FFT
>    for (j = 0; j < (mSpectrumSize-1); j++) {
>       mFFTBuffer[j*2  ] = mRealFFTs[out][j] * mGains[out][j];
>       mFFTBuffer[j*2+1] = mImagFFTs[out][j] * mGains[out][j];
>    }
>    // The Fs/2 component is stored as the imaginary part of the DC component
>    mFFTBuffer[1] = mRealFFTs[out][mSpectrumSize-1] * mGains[out][mSpectrumSize-1];
>
> I believe the result is that mFFTBuffer[0] and mFFTBuffer[1] always become 0 because the corresponding entries of mRealFFTs are always 0.  The squares of the untransformed coefficients are in the first and last entries of mSpectrums[out] and were never copied into mRealFFTs.  We have lost the signs of these coefficients by now.

The real part holds DC and Nyquist respectively; it is the imaginary part (sine) that is always 0.
It is a little strange to put DC and Nyquist into one bin pair though. Is it correctly re-allocated in the IFFT?
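
To make the packed layout concrete, here is a minimal sketch of unpacking and repacking such a buffer. It assumes the layout described in the quoted comment (DC in element 0, Fs/2 in element 1, then real/imaginary pairs in natural order) and ignores any bit-reversed ordering; `unpackSpectrum` and `packSpectrum` are illustrative names, not Audacity functions.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Unpack N floats (N even, N >= 2) into N/2 + 1 complex bins.
// Assumed layout: packed[0] = DC, packed[1] = Fs/2 (both purely real),
// packed[2k], packed[2k+1] = real and imaginary parts of bin k, 1 <= k < N/2.
std::vector<std::complex<float>> unpackSpectrum(const std::vector<float>& packed) {
    const std::size_t half = packed.size() / 2;
    std::vector<std::complex<float>> bins(half + 1);
    bins[0] = std::complex<float>(packed[0], 0.0f);
    bins[half] = std::complex<float>(packed[1], 0.0f);
    for (std::size_t k = 1; k < half; ++k)
        bins[k] = std::complex<float>(packed[2 * k], packed[2 * k + 1]);
    return bins;
}

// Inverse of unpackSpectrum: the real Fs/2 value goes back into the slot
// where the (always zero) imaginary part of DC would otherwise live.
std::vector<float> packSpectrum(const std::vector<std::complex<float>>& bins) {
    const std::size_t half = bins.size() - 1;
    std::vector<float> packed(2 * half);
    packed[0] = bins[0].real();
    packed[1] = bins[half].real();
    for (std::size_t k = 1; k < half; ++k) {
        packed[2 * k] = bins[k].real();
        packed[2 * k + 1] = bins[k].imag();
    }
    return packed;
}
```

The packing works because a real input signal always has purely real DC and Fs/2 coefficients, so N/2 + 1 complex bins fit into exactly N floats.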

If I recall (I’ve not checked the code), the DC component is discarded.

It’s correctly handled, as far as I can see.

Another interesting improvement could be to transfer the smoothing factor from being linear to logarithmic.
In other words, it is currently fixed over a certain frequency range. We could try to translate this into bandwidth (in fractions of an octave for instance).
In this way, the upper frequencies are more smoothed than the lower ones.
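
One way to read that suggestion, as a rough sketch only: average each bin over a band whose edges are a fixed frequency ratio below and above it, so the band spans a constant fraction of an octave. The function name and the `octaves` parameter are assumptions for illustration and nothing like this exists in the current effect.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Smooth gains over a band that is `octaves` wide (on a log-frequency scale)
// around each bin, instead of a fixed number of bins. Bin index is taken as
// proportional to frequency, so the band for bin j runs from j / ratio to
// j * ratio with ratio = 2^(octaves / 2). Requires octaves >= 0.
std::vector<float> smoothByOctaveFraction(const std::vector<float>& gains,
                                          double octaves) {
    const int n = static_cast<int>(gains.size());
    const double ratio = std::pow(2.0, octaves / 2.0);
    std::vector<float> out(gains.size());
    for (int j = 0; j < n; ++j) {
        const int lo = static_cast<int>(std::ceil(j / ratio));
        const int hi = std::min(n - 1, static_cast<int>(std::floor(j * ratio)));
        double sum = 0.0;
        for (int k = lo; k <= hi; ++k) sum += gains[k];
        out[j] = static_cast<float>(sum / (hi - lo + 1));
    }
    return out;
}
```

With a one-octave band, bin 2 is averaged with nothing but itself, while bin 10 is averaged over bins 8 to 14, which is the "upper frequencies are more smoothed" behaviour described above.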

The DC and the Nyquist frequencies are not correctly handled. I knew it from the code but I verified it too on a sample in the debugger.

You must look at more code than what I quoted to see why that is. mRealFFTs (an array of arrays) is initialized to 0’s in StartNewTrack. It is filled in at FillFirstHistoryWindow after the forward FFT fills mFFTBuffer. Look closely at the loop index and you see the error.

   for(i = 1; i < (mSpectrumSize-1); i++) {
      mRealFFTs[0][i] = mFFTBuffer[hFFT->BitReversed[i]  ];
      mImagFFTs[0][i] = mFFTBuffer[hFFT->BitReversed[i]+1];
      mSpectrums[0][i] = mRealFFTs[0][i]*mRealFFTs[0][i] + mImagFFTs[0][i]*mImagFFTs[0][i];
      mGains[0][i] = mNoiseAttenFactor;
   }
   // DC and Fs/2 bins need to be handled specially
   mSpectrums[0][0] = mFFTBuffer[0]*mFFTBuffer[0];
   mSpectrums[0][mSpectrumSize-1] = mFFTBuffer[1]*mFFTBuffer[1];

The “special handling” did not remember the coefficients in the place where they are presumed to be when we set up for inverse FFT! And forgot their signs anyway by only storing squares.
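
A possible correction, sketched outside the Audacity code base: keep the signed DC and Fs/2 coefficients in the real array as well as their squares in the spectrum. The `Window` struct and `captureWindow` function are hypothetical stand-ins for the member arrays above, and the index table is assumed to give the position of each bin's real part, as `hFFT->BitReversed` does in the quoted loop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one history window (mRealFFTs, mImagFFTs, mSpectrums).
struct Window {
    std::vector<float> real, imag, power;
};

// Copy one forward-FFT result into a history window. Requires spectrumSize >= 2.
// Assumptions: fftBuffer[0] holds DC, fftBuffer[1] holds Fs/2, and
// bitReversed[i] is the index of the real part of bin i for the other bins.
Window captureWindow(const std::vector<float>& fftBuffer,
                     const std::vector<int>& bitReversed,
                     std::size_t spectrumSize) {
    Window w;
    w.real.assign(spectrumSize, 0.0f);
    w.imag.assign(spectrumSize, 0.0f);
    w.power.assign(spectrumSize, 0.0f);
    for (std::size_t i = 1; i + 1 < spectrumSize; ++i) {
        w.real[i] = fftBuffer[bitReversed[i]];
        w.imag[i] = fftBuffer[bitReversed[i] + 1];
        w.power[i] = w.real[i] * w.real[i] + w.imag[i] * w.imag[i];
    }
    // DC and Fs/2: store the signed coefficient too, not only its square,
    // so the inverse-FFT setup finds them where it expects them.
    const std::size_t last = spectrumSize - 1;
    w.real[0] = fftBuffer[0];
    w.real[last] = fftBuffer[1];
    w.power[0] = w.real[0] * w.real[0];
    w.power[last] = w.real[last] * w.real[last];
    return w;
}
```

Whether the DC bin should then be passed, attenuated or deliberately zeroed is a separate design question; the sketch only shows how to avoid losing it by accident.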

So I think Steve is right that DC offset is zeroed, but it looks like an error in the code, not by design.

Though if I look at noise removal results in spectrogram, DC appears not to be zero. But that must be because the lobes still pick up other frequencies.

Steve, I recall you wrote some help in the wiki about finding good settings for noise removal, but I forget where that was.

Most of my suggestions (along with the opinions of others) have been rolled into the documentation in the new 2.0.6 (alpha) manual: http://manual.audacityteam.org/man/Noise_Removal

This is the first effect I have studied that follows this general procedure:

Apply FFT to windows of sound. (In this case 2048 samples stepped by 1024, and using a rectangular window function.)

Change frequency domain values.

Do inverse FFT.

Combine overlapping windows into a new signal. (Multiply each by a Hann window and then add.)

What are the advantages and disadvantages of variations in the above?

Why Hann and not just a simple triangular window? You want the sum of the weights of samples from different windows to be always 1. The triangular window would do that.
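
The sum-to-one condition is easy to check numerically. The sketch below uses the periodic forms of both windows (an assumption; the effect's own window code may differ in its endpoint convention) and measures how far two half-overlapped copies deviate from 1.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Periodic Hann window: w[i] = 0.5 * (1 - cos(2*pi*i/n)).
std::vector<double> hannWindow(std::size_t n) {
    const double pi = std::acos(-1.0);
    std::vector<double> w(n);
    for (std::size_t i = 0; i < n; ++i)
        w[i] = 0.5 * (1.0 - std::cos(2.0 * pi * i / n));
    return w;
}

// Periodic triangular window: rises linearly from 0 to 1 at n/2, then falls.
std::vector<double> triangularWindow(std::size_t n) {
    std::vector<double> w(n);
    for (std::size_t i = 0; i < n; ++i)
        w[i] = 1.0 - std::fabs(2.0 * i / n - 1.0);
    return w;
}

// Largest deviation from 1 of w[i] + w[i + n/2], i.e. of the summed weights
// when consecutive windows are stepped by half the window length.
double maxOverlapError(const std::vector<double>& w) {
    const std::size_t half = w.size() / 2;
    double worst = 0.0;
    for (std::size_t i = 0; i < half; ++i)
        worst = std::max(worst, std::fabs(w[i] + w[i + half] - 1.0));
    return worst;
}
```

Both windows pass this test, so the sum-to-one property alone does not decide between them. The difference is in smoothness: the triangular window has corners at its peak and ends, whereas the Hann window's slope is continuous everywhere.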

Is there any advantage in finer stepping than by half a width?

I wonder too what happens with shorter windows. Does that make more distortion in lower frequencies?

If you do this procedure with no change to the frequency domain values, how much change results in the signal? Is it a mathematical identity or not?
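
A small experiment along those lines, as a sketch under stated assumptions: rectangular analysis window, periodic Hann synthesis window, step of half a window, and the FFT/inverse-FFT pair left out because with untouched coefficients it hands back the same frame up to rounding error. `overlapAddIdentity` is an illustrative name.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Overlap-add of unmodified frames: each frame of `windowSize` samples is
// multiplied by a periodic Hann window and added into the output, with
// frames stepped by windowSize / 2.
std::vector<double> overlapAddIdentity(const std::vector<double>& in,
                                       std::size_t windowSize) {
    const double pi = std::acos(-1.0);
    const std::size_t hop = windowSize / 2;
    std::vector<double> out(in.size(), 0.0);
    for (std::size_t start = 0; start + windowSize <= in.size(); start += hop) {
        for (std::size_t i = 0; i < windowSize; ++i) {
            const double w = 0.5 * (1.0 - std::cos(2.0 * pi * i / windowSize));
            out[start + i] += w * in[start + i];
        }
    }
    return out;
}
```

Every sample covered by two windows comes back unchanged apart from floating-point rounding, so in that region the procedure is an identity. Only the first and last half windows, which are covered once, are faded, and an implementation has to pad or otherwise special-case the ends.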

Nyquist gives me all the means to play with these ideas in Lisp.

There’s a proposal on the wiki that I think is worth reading: http://wiki.audacityteam.org/wiki/Proposal_Noise_Removal

I was particularly encouraged by the preliminary tests of “Spectral Subtraction” (the current effect is more like a “spectral gating” type effect). Though I don’t think the current patch is correct - but probably worth looking at. I have used Spectral Subtraction in other software, and for certain types of material it can produce much better results than our current effect (though the current effect may perform better with other types of material, so this would want to be an “option”).
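
For readers unfamiliar with the distinction, the two per-bin gain rules can be contrasted roughly as below. This is a textbook-style sketch, not the wiki patch and not the current Audacity code; both function names are made up, and the subtraction rule shown is the common power-subtraction form.

```cpp
#include <algorithm>
#include <cmath>

// Gating: pass the bin unchanged if its power exceeds the noise estimate,
// otherwise attenuate it by a fixed factor.
float gateGain(float power, float noisePower, float attenuation) {
    return power > noisePower ? 1.0f : attenuation;
}

// Power spectral subtraction: subtract the estimated noise power from the
// bin's power and scale the amplitude to match, never going below zero.
float subtractionGain(float power, float noisePower) {
    if (power <= 0.0f) return 0.0f;
    return std::sqrt(std::max(0.0f, 1.0f - noisePower / power));
}
```

The gate is all-or-nothing per bin, while subtraction varies continuously with how far the bin rises above the noise estimate, which is one reason the two can behave differently on different material.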

Another idea that was put to me recently (not tested at all) was the idea of running multiple passes with different FFT sizes.
Q. How much does the unwanted “artefact” noise vary with FFT size? Would two passes with different FFT sizes result in fewer artefacts for a given amount of noise reduction than one pass?

Are you asking me all of this to quiz me, or because you don’t know either and it’s worth it to find out? :)

In some programs you see something like an “Advanced options V” button which expands the dialog box if you click it and becomes “Advanced options ^”. That’s one way to cater to sophisticated users without intimidating most. Does this sort of thing occur in any Audacity dialogs?

Regarding the “isolate” button. The more I examine what it really does, the less sense it makes! I am browsing revisions of the file, and it appears the entire isolate feature was added in 11013 on March 22, 2011. That also added Sensitivity (which I find nothing wrong with), and changed how attack/decay was handled from something wrong to something less wrong but as I suspect, not quite right yet either.
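
The alternative raised in point 6, computing the gains exactly as for removal and applying one minus the gain just before the inverse FFT, could look something like this. It is a sketch with made-up names, and it assumes every gain lies between 0 and 1.

```cpp
#include <cstddef>
#include <vector>

// Scale FFT coefficients by the removal gain, or by its complement when
// isolating. `gains[j]` is assumed to lie in [0, 1] for every bin.
void applyGains(std::vector<float>& real, std::vector<float>& imag,
                const std::vector<float>& gains, bool isolate) {
    for (std::size_t j = 0; j < gains.size(); ++j) {
        const float g = isolate ? 1.0f - gains[j] : gains[j];
        real[j] *= g;
        imag[j] *= g;
    }
}
```

Because the two gains sum to 1 in every bin and both outputs share the original phases, the removed and isolated results add back to the original frame, which is what the manual's description of Isolate leads one to expect.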

… and it looks like revision 9298 not long before that made some major changes, INTRODUCING a bug in attack/decay, and also introducing the bug in handling the DC bin!

I don’t know the answer off the top of my head and it is rare that Audacity developers read topics on the forum. With your knowledge and experience of C++ you are probably in a better position than most to work out the answer.

If my question was: “I do not understand why division is by 10 not 20 when computing mOneBlockAttackDecay”,
then to try and work out what was going on, I’d probably try changing it to see what happens. Similarly with the other points. In other words, I’m suggesting an approach to the question.
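
For reference while experimenting, the two conversions in question are shown below. The function names are illustrative, not the ones in NoiseRemoval.cpp.

```cpp
#include <cmath>

// Amplitude ratio for a level change in dB: 20 dB is a factor of 10.
double dbToAmplitude(double db) {
    return std::pow(10.0, db / 20.0);
}

// Power (squared amplitude) ratio for the same change: 10 dB is a factor of 10.
double dbToPower(double db) {
    return std::pow(10.0, db / 10.0);
}
```

Since `dbToPower(x)` equals the square of `dbToAmplitude(x)`, using the divide-by-10 form on a factor that multiplies un-squared gains applies twice the intended change in dB. Whether that is what happens with mOneBlockAttackDecay is the open question.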

There has been one instance of something like that in the past, so it is possible using wxWidgets, but I don’t think it is in the current code base. I don’t recall exactly where or when, but I do recall how it was done. In order to “expand” the interface, the window was destroyed, and a new “expanded” window created in its place.

An alternative approach that could be used would be to create a tabbed interface. I think there is an example in the wxWidgets “samples” folder.

That does not surprise me. There’s something a bit buggy in the behaviour of the “isolate” feature that I’ve noticed several times but have not found repeatable steps to reproduce the issue. “Sometimes” it does not switch correctly between remove / isolate.

… major changes by Dominic in 6047 introduced attack/decay and frequency smoothing to begin with, replacing some other frequency smoothing algorithm I haven’t studied yet. Seems other cooks spoiled his broth later…

In here?
https://code.google.com/p/audacity/source/diff?spec=svn9298&r=9298&format=side&path=/sf-cvs/trunk/audacity-src/src/effects/Normalize.cpp

When I browse versions with Tortoise, the previous version in the trunk is 8591. I am still getting acquainted with the version control browsing.