Pitch shift voice

milasudril · June 29, 2011, 1:54pm

What is the best way to make a low frequency voice? Just pitch-shifting makes my ‘a’ into an ‘å’.

kozikowski · June 29, 2011, 3:04pm

That would be the Announcing Voice Module I announce every April First. The problem is a low voice isn’t just a higher voice pitch shifted down. When you and the announcer say “sss” you both sound exactly the same, but when you say “aaa” there is a huge difference.

Even AutoTune only works over a limited number of notes before it starts making the singer sound weird, so no, there is no good way to do a significant shift without damage.

Or spending a lot of money. I understand there are software packages that can do a credible job, but even they don’t work well over very many notes.

Koz

milasudril · June 29, 2011, 5:14pm

But what is the easiest way to do it reasonably? Try to speak the inverse? A vocoder sounds a bit too mechanical. I want some understandable monster speech.

steve · June 29, 2011, 5:29pm

Record yourself (or someone else) speaking with as low-pitched and monstrous a voice as you/they can, then pitch shift it down by just a couple of semitones.
A bit of voice acting will sound much better than trying to get an effect to do all of the work.
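
As a rough guide to the arithmetic: a shift of n semitones multiplies every frequency by 2^(n/12). A minimal Octave/MATLAB sketch, where the 100 Hz starting pitch is only an assumed example value:

```octave
% Semitone shift to frequency ratio (equal temperament).
semitones = -2;                     % "a couple of semitones" down
ratio = 2 ^ (semitones / 12);       % about 0.891
f_in  = 100;                        % assumed example fundamental, Hz
f_out = f_in * ratio;               % about 89.1 Hz
fprintf('ratio = %.4f, %g Hz -> %.1f Hz\n', ratio, f_in, f_out);
```

By the same formula, a full octave (12 semitones) halves the frequency, which is a far larger shift than a couple of semitones.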

milasudril · June 30, 2011, 1:31pm

I think I will try to record using wrong vowels and let the pitch shifter make them correct. That works for this sentence.

However, this could be quite interesting. The idea is to reuse the fundamental frequency of the original wave, which preserves the intonation:

              +-->[high-pass filter preserving consonants]---------------------------------------------------------------------------------->
              |                                                                                                                               [Mixer]-->[output]
[wave input]--+-->[low-pass filter removing overtones]-->[pitch shifter]-->[pulse train generator (pulse at wave peak)]-->[formant filter]-->
                                                                                                                                  ^
                                                                                                                                  |
[Time for begin of each vowel in the message previously detected]-----------------------------------------------------------------+

I do not have time to test it. But anyone is free to test how it sounds.

kozikowski · July 1, 2011, 4:56am

What is expected to happen? I loaded a voice WAV into Audacity 1.3.12 and put that text into the Nyquist Prompt box.

Nyquist did not return audio.

Koz

milasudril · July 1, 2011, 9:26am

Instead of using a pulse train with a fixed frequency, as a voice synthesizer might, the idea is to get the frequency from the original track. In this solution that is done by tracking the peaks of the sine-like wave that comes out of the low-pass filter, which has removed all overtones. Yes, a DFT could also do that, and it would skip the filter. In any case, we must estimate the fundamental frequency of the voice.
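
The peak-spacing idea can be sketched in Octave/MATLAB roughly like this. It is only an illustration under assumptions: a synthetic 100 Hz sine stands in for the low-passed voice, and the sample rate and variable names are made up for the example.

```octave
% Sketch: estimate the fundamental from the spacing of waveform peaks.
fs = 8000;                          % sample rate in Hz (assumed)
t  = (0:fs-1)' / fs;                % one second of time stamps
x  = sin(2*pi*100*t);               % stand-in for the low-passed voice

% A peak is a sample larger than its left neighbour and at least as
% large as its right neighbour.
mid   = x(2:end-1);
peaks = find(mid > x(1:end-2) & mid >= x(3:end)) + 1;

periods = diff(peaks) / fs;         % seconds between successive peaks
f0      = 1 ./ periods;             % one frequency estimate per cycle, Hz
f0_mean = 1 / mean(periods);        % average over the whole clip
fprintf('mean fundamental: %.1f Hz\n', f0_mean);
```

The per-cycle vector f0 is the part that carries the intonation; on real speech it would also need some smoothing and a check that the peaks are not noise.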

milasudril · July 1, 2011, 9:28am

It really worked

kozikowski · July 1, 2011, 6:04pm

It really worked…

Oh, I totally believe you.

How?

Step By Step. I’m on Audacity 1.3.12 and the voice test I’m using is this…

Pretend I’ve never used Nyquist in my life.

Koz

Trebor · July 1, 2011, 8:35pm

milasudril:

              +-->[high-pass filter preserving consonants]---------------------------------------------------------------------------------->
              |                                                                                                                               [Mixer]-->[output]
[wave input]--+-->[low-pass filter removing overtones]-->[pitch shifter]-->[pulse train generator (pulse at wave peak)]-->[formant filter]-->
                                                                                                                                  ^
                                                                                                                                  |
[Time for begin of each vowel in the message previously detected]-----------------------------------------------------------------+

I do not have time to test it. But anyone is free to test how it sounds.

You’ve written some sort of flow diagram (which may or may not be correct),
but it is not encoded in the Nyquist (Lisp) language: it is not “CODE”, so it isn’t going to do anything when entered into the Nyquist Prompt.

[ Further reading … Formant - Wikipedia ]

kozikowski · July 1, 2011, 8:43pm

I wondered. There doesn’t seem to be enough there to do any work.
Koz

kozikowski · July 1, 2011, 8:45pm

There’s enough concept in the thread to make me interested in pursuing this further, but nobody ever mistook me for a programmer.

Koz

Trebor · July 1, 2011, 9:47pm

The free OliLarkin “auto talent” (autotune) plugin Gale Andrews posted a link to has “formant correction” to preserve intelligibility, but I’m not very impressed with the results …

There may be other similar plugins which do a better job.

Trebor · July 2, 2011, 8:14am

I may have been too harsh about the OliLarkin autotalent plugin: Antares ($$$) Autotune VST doesn’t sound much different …

milasudril · July 2, 2011, 3:20pm

Sorry for the misunderstanding. What worked, for my specific sentence, was speaking the wrong vowels and then pitch shifting. However, the concept of using the fundamental frequency as input to a vocoder or some voice synthesiser may also be fine. I tried the equalizer on a pulse train generated from the fundamental frequency using this MATLAB or Octave code:

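%TODO: Make y band limited.

% Mark the local maxima of source: y holds the peak value at each peak
% position and zero everywhere else.
function y=peaktrack(source)
	y=zeros(length(source),1);
	direction=-1;	% +1 while the wave rises, -1 while it falls
	x_prev=source(1);
	for n=2:length(source)
		direction_prev=direction;
		if(source(n)>=x_prev)
			direction=1;
		else
			direction=-1;
		end
		% A change from rising to falling means sample n-1 was a peak.
		if(direction_prev>direction)
			y(n-1)=x_prev;
		end
		x_prev=source(n);
	end
end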

I speak with a frequency of around 100 Hz, and therefore I used a low-pass filter with a cutoff frequency just above 110 Hz. When I apply the equalizer with the profile corresponding to the current vowel, it sounds like someone pronouncing that vowel while moving up and down in pitch. So using the varying fundamental frequency (which can safely be pitch-shifted to any frequency) instead of a fixed frequency as input to a vocoder could be very interesting. It probably still sounds mechanical, but because we saved the fundamental frequency and therefore know the intonation, it would sound much more realistic than a vocoder with a fixed input frequency. It also preserves the vowels. However, it probably does not work if the person speaks with a frequency near or above the lowest formant frequency.
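
A minimal Octave/MATLAB sketch of that chain (varying fundamental, pitch shift, pulse train, vowel filter) might look like the following. Everything numeric here is an assumption for illustration: the falling 110 to 90 Hz contour is made up, the octave-down factor is arbitrary, and the two resonators with rough 'a'-like formant values merely stand in for the equalizer profile.

```octave
fs = 8000;                          % sample rate in Hz (assumed)
n  = fs;                            % one second of output
f0 = linspace(110, 90, n)';         % made-up intonation contour, Hz
shift = 0.5;                        % pitch factor, 0.5 = one octave down

% Emit a pulse whenever the accumulated phase of the shifted
% fundamental passes a whole number of cycles.
phase  = cumsum(shift * f0 / fs);   % phase in cycles
pulses = [0; diff(floor(phase))];   % 1 at each new cycle, 0 elsewhere

% Two-pole resonators as a crude stand-in for the vowel equalizer.
formants  = [700 1100];             % illustrative formant centres, Hz
bandwidth = 100;                    % illustrative bandwidth, Hz
r = exp(-pi * bandwidth / fs);      % pole radius for that bandwidth
y = zeros(n, 1);
for fc = formants
  a = [1, -2*r*cos(2*pi*fc/fs), r^2];
  y = y + filter(1, a, pulses);     % ring each resonator with the pulses
end
y = y / max(abs(y));                % normalise to the range -1..1
```

Listening to the result (for example with soundsc(y, fs)) should give a buzzy vowel whose pitch glides downward; replacing f0 with a contour tracked from a real recording is the step the post describes.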

milasudril · July 2, 2011, 3:26pm

Exactly. I do not know Nyquist. I know C, C++, and MATLAB or Octave. How can I write Audacity plugins without the proprietary VST? BTW, I also need to check out more details about the vocoder.

kozikowski · July 2, 2011, 5:09pm

I speak with a frequency of around 100 Hz

Which makes you a celebrity. When we shoot sound for movies, we almost always use the 100 Hz roll-off filter to avoid air conditioner rumble and traffic noises. So you can also never appear in a movie, at least not and sound like you do now.

If that’s the case, then you already have a deep, rumbling voice and don’t need the filters and tools.

You need to be careful that the design process isn’t using one-off considerations.

I always wondered what would happen if you designed the voice equivalent of the DBX 120X – that would automatically give you an octave down – and tracks your voice.

Koz

milasudril · July 3, 2011, 6:42pm

100 Hz is probably a normal fundamental frequency for a male voice. I need to go down to 50 Hz to be a monster. A woman is between 200 Hz and 300 Hz, I guess. But the idea is to track the fundamental frequency and the amplitude of the speech.

Trebor · July 3, 2011, 7:18pm

after #1: reduce speed and pitch by 10%, boost around 100 Hz and around 4000 Hz, both by approx 6 dB

after #2: as #1, but amplitude modulated with 50 Hz; put this code in the Nyquist Prompt …

  (mult s (hzosc 50))

sounds a bit like a Dalek (a Dalek is lower, ~30 Hz)
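
For comparison, the same kind of amplitude modulation can be sketched in Octave/MATLAB. A 440 Hz test tone stands in for the voice track here (an assumption for the example). Multiplying by a 50 Hz sine replaces each input frequency f with the pair f-50 and f+50, which is where the metallic character comes from.

```octave
fs = 8000;                          % sample rate in Hz (assumed)
t  = (0:fs-1)' / fs;                % one second of time stamps
x  = sin(2*pi*440*t);               % stand-in for the voice track
y  = x .* sin(2*pi*50*t);           % multiply by a 50 Hz sine

% Product-to-sum identity: the output holds only 390 Hz and 490 Hz.
expected = 0.5 * (cos(2*pi*390*t) - cos(2*pi*490*t));
max_err  = max(abs(y - expected));  % rounding error only
```

Lowering the modulator towards 30 Hz narrows the gap between the two sidebands.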

kozikowski · July 5, 2011, 9:14pm

Is there software to create sub-harmonics? I know there’s no such thing in analog electronics, but in digital…

Put a frequency or narrow range of frequencies in and it will produce octave-down versions.

Koz
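
One simple digital approach is a divide-by-two on the zero crossings. A minimal Octave/MATLAB sketch, assuming a clean single tone as input: real speech would need its fundamental isolated first, and the output here is a bare square wave with no amplitude tracking.

```octave
fs = 8000;                          % sample rate in Hz (assumed)
t  = (0:fs-1)' / fs;
x  = sin(2*pi*200*t + 0.1);         % 200 Hz test tone (assumed input)

% Positive-going zero crossings mark the start of each input cycle.
rising = [false; x(1:end-1) < 0 & x(2:end) >= 0];

% Flip a +1/-1 state at every crossing, like a divide-by-two flip-flop.
state = (-1) .^ cumsum(double(rising));

% Measure the cycle length of the result to confirm it is octave-down.
sub_rising = [false; state(1:end-1) < 0 & state(2:end) > 0];
sub_period = diff(find(sub_rising));     % samples per output cycle
sub_freq   = fs / mean(sub_period);      % 100 Hz for this input
```

Scaling state by a smoothed envelope of x and low-pass filtering it would be the obvious next steps toward something that tracks a voice.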