Smoothing out computer-generated speech
Posted: Wed Jun 23, 2010 9:17 am
Hi--
I'm working on a project that involves synthesized speech, and I'm producing this speech with Festival. Like a lot of computer-generated speech, the result sounds kind of harsh, with abrupt breaks between different sounds. Is there a method for smoothing this kind of thing out?
An analogy: if you have a photograph with harsh artifacts from a bad digital camera or bad image compression, you can sometimes correct this by softening the picture in your image editor, then resharpening it. Is there a similar process I can apply to a WAV file?
Using SoX, I'm already adding a small reverb and a lowpass filter to the output from Festival, but this doesn't quite solve the problem.
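A chain of this kind might look like the sketch below. The function name smooth_speech, the 20% reverberance, the 4000 Hz cutoff, and the order of the two effects are all illustrative assumptions, not the exact settings used on this project.

```shell
# Sketch only: wraps a SoX effects chain in a helper function.
# smooth_speech is a hypothetical name; the values 20 and 4000 are assumed.
smooth_speech() {
    # $1 = input WAV (e.g. Festival output), $2 = output WAV
    # reverb 20    -> light reverb (reverberance as a percentage)
    # lowpass 4000 -> attenuate content above roughly 4 kHz
    sox "$1" "$2" reverb 20 lowpass 4000
}
```

It would be called as: smooth_speech festival_out.wav smoothed.wav (both file names are placeholders).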
Thanks,
Eric