Resampling introducing LISP

My VoIP company tells me that any recording I use for my IVR must follow this format: Windows WAV, PCM 8kHz 16 bits Mono.
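For anyone who wants to double-check an exported file against that spec before uploading it, here is a minimal sketch using Python's standard `wave` module. The function name `check_ivr_format` is made up for illustration, and the expected values are simply the ones from the provider's requirement above (8000 Hz, 16-bit, mono).

```python
import wave


def check_ivr_format(path):
    """Return a list of mismatches; an empty list means the file fits the spec.

    Assumed spec (from the provider's requirement): PCM WAV, 8000 Hz,
    16-bit (2 bytes per sample), mono (1 channel).
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != 8000:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected 8000 Hz")
        if wav.getsampwidth() != 2:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bits, expected 16 bits")
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels, expected 1 (mono)")
    return problems
```

Calling `check_ivr_format("greeting.wav")` (a hypothetical file name) should return an empty list when the export matches, and a short description of each mismatch otherwise.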

I created a really nice mix to start with, then chose Mix > Format > 16-bit PCM, set Project Sample Rate = 8000 Hz, and then chose Tracks > Resample.

When I listen to the .wav file created, it now sounds like I have a LISP?! :astonished:

Where are you finding the “Mix > Format” menu?
– Bill

In the Project, in the left margin of the track is a box with controls (e.g. X, Mix, Mute, Solo, etc.)…

OK, I see. “Mix” is the name of the track, and that is the track dropdown menu.

When you export a project to a file with a sample rate of 8000 Hz, all frequencies above 4000 Hz will be eliminated. Perhaps this is what you are hearing?

– Bill

So how can I address this issue?

If I use audio files that have a sample rate above 8000 Hz the IVR system wigs out and doesn’t work properly.

In fact, I am pretty sure that is the standard for most telephone providers and VoIP providers.

So how can I get a clean, clear recording of my voice and not get the LISP I ran into today?

Am hoping there are some tricks I could use in Audacity to address this issue?

Otherwise all of my work over the last month to create professional voiceovers for my home and business VoIP lines is ruined?! :frowning:

Select the track of your final mix. Do Effect > Low Pass Filter. Enter 3500 for Frequency, and select 48 dB for Rolloff. Click OK to apply the filter. Your mix will now have almost no energy in frequencies above 4000 Hz. Compare it to your exported file. Is it better?

– Bill
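To illustrate what a low-pass step like the one above does to the signal, here is a rough sketch in plain Python. It is not Audacity's Low Pass Filter: it swaps in a Hamming-windowed sinc FIR filter because that is easy to write without extra libraries, so the rolloff shape will differ from the 48 dB setting. The 3500 Hz cutoff and 44,100 Hz sample rate come from the thread; the tap count, test tones, and all function names are assumptions for illustration.

```python
import math


def lowpass_taps(cutoff_hz, sample_rate, num_taps=201):
    """Hamming-windowed sinc low-pass coefficients, scaled to unity gain at DC."""
    fc = cutoff_hz / sample_rate          # cutoff in cycles per sample
    mid = (num_taps - 1) / 2              # centre tap (num_taps assumed odd)
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def fir_filter(samples, taps):
    """Direct convolution; returns only the fully overlapped output samples."""
    n_taps = len(taps)
    return [
        sum(taps[k] * samples[i - k] for k in range(n_taps))
        for i in range(n_taps - 1, len(samples))
    ]


def tone(freq_hz, sample_rate, count):
    """A unit-amplitude sine wave, used here as a stand-in for real audio."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(count)]


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


SAMPLE_RATE = 44100
taps = lowpass_taps(3500, SAMPLE_RATE)
for freq in (1000, 6000):
    filtered = fir_filter(tone(freq, SAMPLE_RATE, 2000), taps)
    print(f"{freq} Hz tone: RMS after filtering = {rms(filtered):.4f}")
```

The 1000 Hz tone should come through at close to its original level, while the 6000 Hz tone (roughly where a lot of "s" energy sits) should be almost entirely removed, which is the same loss the 8000 Hz export imposes.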

My project is set to…
Sample Rate = 44,100 Hz
Format = 32-bit Float
Mono

I followed your instructions above, and it introduces a “lisp” in my voice that is a little worse than when I render things down to 8000 Hz.

Based on my limited understanding of things, that makes absolutely no sense, unless I had just had a hit of helium?!

What is wrong with my voice?

BTW, I am trying to set up a webserver so I have a place to post my audio files, but not sure how long that will take since everything seems to be difficult for me to implement… :frowning:

In that case, the “lisp” is an inevitable result of removing frequencies above 4000 Hz. Do you have a lot of sibilance in your voice?

– Bill

I am trying to get a webserver set up so I can post the suspect samples online for you or whoever to scrutinize.

In the meantime, you may be on to something…

Maybe it IS all my fault with the way I sound and speak?! :frowning:

When I listen to my voiceover alone and then my voiceover with soundbed, I think for an amateur it sounds really good - and definitely good enough for a silly phone system.

However, the moment I convert things to 8000 Hz the quality goes to hell in certain parts of my message.

It also doesn’t help that I am an allergy sufferer, and with all of this climate change crap - it’s not supposed to be 80 degrees in October in most parts of the U.S., I know that phlegm is a real problem for me. (That is a whole thread in and of itself.)

My ultimate frustration is that I cannot learn 20 years of sound engineering and voiceover techniques in a day or week or month. Yet I still need to fly.

I tackled this IVR thing because it seemed like an “idiot-proof” way to test out my gear and voice skills to get started.

I was feeling very proud of myself yesterday morning when I had put things together, and then I converted things to 8000 Hz and called my phone line and listened in horror?! :astonished:

The fact that your little Audacity trick made things worse makes me wonder if it is all my fault. Even worse, what if I just don’t have the voice to do this stuff?

That would be DEVASTATING to me, especially since most voiceover pros I have heard have said that “you just need to sound like yourself!”

Am now starting to feel like a cripple that had hopes of running in the Olympics and didn’t know any better…

8000 Hz sample rate is a bit too low for voice. As Bill wrote, at this sample rate, audio frequencies are limited to below 4000 Hz. This is approximately equivalent to old style “narrowband” telephone, on which it is difficult to distinguish “f” from “s” sounds.

The frequency range of a human voice generally goes up to around 14 kHz, so for good quality speech, the sample rate needs to be over 28 kHz (double the highest audio frequency). In practice, reasonably good voice reproduction can be achieved with an audio frequency range up to about 7 kHz. Modern “wideband” telephony generally goes up to around 7 kHz. More info here: https://en.wikipedia.org/wiki/Wideband_audio
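The arithmetic behind those numbers is just the factor of two between sample rate and highest representable audio frequency. A small sketch, with function names made up for illustration and the frequency figures taken from this thread:

```python
def nyquist_limit(sample_rate_hz):
    """Highest audio frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def min_sample_rate(highest_audio_hz):
    """Sample rate that must be exceeded to capture audio up to this frequency."""
    return 2 * highest_audio_hz


# Frequency figures as quoted in the thread, not measured values.
for label, top_hz in [
    ("full voice range", 14000),
    ("wideband telephony", 7000),
    ("narrowband telephony", 4000),
]:
    print(f"{label}: audio up to {top_hz} Hz needs a sample rate above {min_sample_rate(top_hz)} Hz")

print(f"An 8000 Hz file keeps nothing above {nyquist_limit(8000):.0f} Hz")
```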

So, in short, with a sample rate of only 8000 Hz, voices will sound quite bad in places, particularly “f”, “s”, “sh”, “t” and “th” sounds, and there’s not really anything you can do about that, other than use a higher sample rate.