Click Removal: add a UI to preview the nature of the detected clicks (not just "remove" and "residual"), for example a scatter plot of click threshold vs. width vs. loudness, and allow the cut-off to be based on both dimensions in a fuzzy-logic manner
(plus an option for a soft threshold instead of a hard-trim cutoff for clicks that fall just outside the criteria). A rough sketch of what I mean is below.
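To illustrate the two-dimensional fuzzy cutoff and soft threshold, here is a minimal, purely hypothetical Python sketch; the function name, thresholds and scoring are all made up for illustration, not anything Audacity does today:

[code]
import numpy as np

def fuzzy_click_gain(width_ms, peak_db, width_cut_ms=1.5, level_cut_db=12.0, softness=0.5):
    """Hypothetical scoring: combine click width and loudness into one fuzzy
    membership value in [0, 1], then turn it into an attenuation gain.
    width_cut_ms / level_cut_db are the nominal hard-cutoff criteria;
    softness widens the transition band instead of a hard trim."""
    # Normalised distance past each criterion (0 = exactly at the threshold)
    w = (width_ms - width_cut_ms) / (softness * width_cut_ms)
    l = (peak_db - level_cut_db) / (softness * level_cut_db)
    # Fuzzy AND of the two dimensions (product t-norm of two sigmoids)
    membership = 1.0 / (1.0 + np.exp(-w)) * 1.0 / (1.0 + np.exp(-l))
    # Gain applied to the click region: 1.0 leaves it alone, 0.0 removes it
    return 1.0 - membership

# A wide, loud click is strongly attenuated; a borderline one only gently
print(fuzzy_click_gain(width_ms=4.0, peak_db=20.0))   # ~0.24, strongly attenuated
print(fuzzy_click_gain(width_ms=1.4, peak_db=11.0))   # ~0.79, gently attenuated
[/code]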
Hum removal / equalizer: allow removal of drifting, transient hums that change gradually in both frequency and amplitude (e.g. a microphone in non-ideal conditions, moving from location to…). A rough sketch of the kind of tracking notch I mean follows.
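A minimal sketch of the drifting-hum tracking idea, assuming the hum stays inside a known band and ignoring harmonics for brevity (the function name and parameters are hypothetical):

[code]
import numpy as np
from scipy.signal import stft, istft

def remove_drifting_hum(x, fs, f_lo=40.0, f_hi=70.0, bw_hz=6.0, atten_db=30.0):
    """Sketch: per STFT frame, find the strongest component between f_lo and
    f_hi (where the drifting hum is expected) and attenuate a narrow band
    around it, so the notch follows the hum as its frequency and level drift."""
    f, t, Z = stft(x, fs=fs, nperseg=4096)
    gain = 10.0 ** (-atten_db / 20.0)
    band = (f >= f_lo) & (f <= f_hi)
    for i in range(Z.shape[1]):
        mag = np.abs(Z[band, i])
        if mag.size == 0:
            continue
        f_hum = f[band][np.argmax(mag)]           # hum frequency in this frame
        notch = np.abs(f - f_hum) <= bw_hz / 2.0  # bins inside the moving notch
        Z[notch, i] *= gain
    _, y = istft(Z, fs=fs, nperseg=4096)
    return y[: len(x)]
[/code]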
Time/pitch stretching: doing it without making the result sound blurry (similar to the tube-Yoplait commercial effect). Is it the phase information that is being lost (similar to STFT vs. reassigned spectrogram)? http://photosounder.com/download.php edits sound by converting it into an image; does it suffer from a similar quality loss?
(based on http://arss.sourceforge.net/)
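On the phase question: a generic Griffin-Lim sketch (not Photosounder's or ARSS's actual algorithm) shows what happens when only the magnitude spectrogram is kept and the phase has to be re-invented, which I suspect is the source of the blurry quality:

[code]
import numpy as np
from scipy.signal import stft, istft

def griffin_lim(mag, fs, nperseg=1024, n_iter=50):
    """Rebuild audio from a magnitude-only spectrogram by iteratively guessing
    a self-consistent phase (Griffin-Lim). The phase carried by the original
    STFT is gone, and the leftover inconsistency is heard as a smeared,
    'blurry' quality."""
    rng = np.random.default_rng(0)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))   # start with random phase
    for _ in range(n_iter):
        _, x = istft(mag * phase, fs=fs, nperseg=nperseg)
        _, _, Z = stft(x, fs=fs, nperseg=nperseg)
        if Z.shape[1] < mag.shape[1]:                     # keep frame counts aligned
            Z = np.pad(Z, ((0, 0), (0, mag.shape[1] - Z.shape[1])))
        phase = np.exp(1j * np.angle(Z[:, : mag.shape[1]]))
    _, x = istft(mag * phase, fs=fs, nperseg=nperseg)
    return x
[/code]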
Being able to have multiple spectrum windows open at once would be valuable. I do it with a screen capture and display. Desperation method.
DeNoising has the advantage that everybody knows what a half-tone screen looks like. Audio noise tends to be a good deal more free-flow. Your ability to do content recognition in your head goes a good long way. I recognize instantly the sound of a Chevy Nova with a bad tail-pipe, but there’s no way to tell the software what that is so it can be managed or removed.
We famously can’t split a performance apart into individual instruments, voices and sounds. The profile step in noise reduction is the closest—and it depends on you being able to get a terrific, clean profile.
removal of drifting, transient hums that gradually change in both frequency and amplitude
That’s also singing. The minute the target starts moving, it snaps us back to content recognition and performance splitting. This is hum in motion, that’s my voice.
Are you going to write such an algorithm for us? You can read the Manual to see the tradeoffs between using Change Tempo/Change Pitch and Sliding Time Scale/Pitch Shift.
In the next version of Audacity, 2.1.3, when released, Change Tempo and Change Pitch will have a new option to use the SBSMS algorithm that Sliding Time Scale uses. Otherwise they use a different algorithm called SoundTouch. You can research those two algorithms online if you require more information about them.
The fan-chirp spectrogram seems to look good for speech (tilted uncertainty ellipses): http://iie.fing.edu.uy/~pcancela/fcht/
It would help if Noise Removal's spectral gating could make use of that.
Beyond plain reassigned spectrograms there is also ConceFT, which appears to have fewer artifacts.
It would also help if the denoiser could show a heat map of frequency vs. amplitude,
superimposed on a vertical area graph of the noise floor being subtracted,
with an option for a soft threshold instead of a hard cut of the frequency bins that don't make it (a rough sketch of the soft gate is below).
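A minimal sketch of the soft gate vs. hard cut idea, assuming a per-bin noise-floor magnitude has already been measured from a noise-only selection; this is not Audacity's actual Noise Reduction code, and the names and knee values are made up:

[code]
import numpy as np
from scipy.signal import stft, istft

def spectral_gate(x, fs, noise_floor, soft=True, over_db=6.0, knee_db=6.0):
    """Instead of hard-zeroing every bin below the (per-frequency) threshold,
    fade the gain smoothly over a small 'knee' so borderline bins are
    attenuated rather than chopped.
    noise_floor: per-frequency-bin magnitude estimate of the noise."""
    f, t, Z = stft(x, fs=fs, nperseg=2048)
    thresh = noise_floor[:, None] * 10.0 ** (over_db / 20.0)   # gate threshold
    excess_db = 20.0 * np.log10(np.abs(Z) / thresh + 1e-12)    # dB above threshold
    if soft:
        # Linear ramp from 0 to 1 across the knee, instead of a hard cut
        gain = np.clip((excess_db + knee_db) / knee_db, 0.0, 1.0)
    else:
        gain = (excess_db >= 0.0).astype(float)
    _, y = istft(Z * gain, fs=fs, nperseg=2048)
    return y[: len(x)]
[/code]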
Make more of the internal details/processes visible (probably what commercial $oftware would want to hide, but here the code is open).
Allow tuning the noise-floor cut for a better cut/split.
An option for multitaper estimation instead of only a single window instance: https://github.com/melizalab/libtfr (rough sketch after this line).
ConceFT is a multitaper form of synchrosqueezing; synchrosqueezing differs from reassignment in that it still allows reconstructing the signal from the spectrogram.
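The multitaper idea in a few lines, sketched with scipy's DPSS windows rather than libtfr itself (libtfr provides optimized multitaper and reassigned estimates; this is only the basic idea):

[code]
import numpy as np
from scipy.signal import spectrogram
from scipy.signal.windows import dpss

def multitaper_spectrogram(x, fs, nperseg=1024, NW=3.0, K=5):
    """Average the spectrograms obtained with K orthogonal DPSS (Slepian)
    tapers instead of a single window, trading a little resolution for a
    much lower-variance estimate."""
    tapers = dpss(nperseg, NW, Kmax=K)        # K orthogonal tapers, each nperseg long
    S = None
    for w in tapers:
        f, t, Sk = spectrogram(x, fs=fs, window=w, nperseg=nperseg,
                               noverlap=nperseg // 2)
        S = Sk if S is None else S + Sk
    return f, t, S / K
[/code]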
What is the point of posting all these links?
Are you, for example, suggesting that the “Extreme Time Stretch in python” from your first link, or IRCAM’s SuperVP/AudioSculpt on which it is based, is better in some way than the extreme time stretch algorithm used in PaulStretch? If so, then in what way is it better? What are the pros and cons? What is your assessment of paulnasca’s response to why phase is randomised in PaulStretch? Have you considered or compared performance (speed) of this effect? What exactly do you want us to look at and why?
About Paul's response: the Amazing Slow Downer (IRCAM-based) sounds relatively crisp. Also, it's like the argument that blurry photos look nice (a preference problem).
A time-stretching test on text-to-speech (similar to an eye test using text) requires intelligibility, not just pleasantness,
but music that sounds nice is fine too.
For effects configured to use a certain window size expressed as a number of samples,
the output would probably differ considerably depending on the sample rate of the audio track;
the uncertainty ellipse's aspect ratio would also scale with the square of that ratio.
Should the effects take this into account? (A small worked example is below.)
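A small worked example of the dependence, for a window fixed at N samples (the values are only illustrative):

[code]
# For a fixed N-sample window, time and frequency resolution both depend on fs.
N = 2048
for fs in (22050, 44100, 96000):
    dt = N / fs          # time span of one window, in seconds
    df = fs / N          # frequency resolution (bin spacing), in Hz
    aspect = dt / df     # "aspect ratio" of the uncertainty ellipse = N**2 / fs**2
    print(f"fs={fs:6d} Hz  dt={dt*1000:6.1f} ms  df={df:5.1f} Hz  dt/df={aspect:.2e}")
[/code]

Specifying the window length in seconds (and converting to samples internally) would remove this dependence.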
Also, the effects seem to use only a single CPU core; are upgrades to parallel processing on the roadmap?
I lost the code that generated the heatmap (histo spectro.jpg), but:
the vertical axis is audio frequency,
the horizontal axis is spectrogram intensity,
and the heatmap's brightness is the number of spectrogram pixels with those properties
(the visible vertical-line artifacts are due to sampling/binning problems). A rough reconstruction of the idea is below.
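Since the original script is gone, here is a rough reconstruction of the idea (not the original code): for each frequency row of a spectrogram, histogram the per-pixel intensities and plot the counts as a heat map, so the noise floor shows up as a bright ridge at low intensity.

[code]
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import spectrogram

def intensity_histogram_heatmap(x, fs, nperseg=2048, n_bins=200):
    """Heat map: y = frequency, x = spectrogram intensity (dB),
    brightness = number of spectrogram pixels with that (frequency, intensity)."""
    f, t, S = spectrogram(x, fs=fs, nperseg=nperseg)
    S_db = 10.0 * np.log10(S + 1e-20)
    edges = np.linspace(S_db.min(), S_db.max(), n_bins + 1)
    heat = np.array([np.histogram(row, bins=edges)[0] for row in S_db])
    plt.imshow(heat, origin="lower", aspect="auto",
               extent=[edges[0], edges[-1], f[0], f[-1]])
    plt.xlabel("spectrogram intensity (dB)")
    plt.ylabel("frequency (Hz)")
    plt.show()
[/code]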