First, a little background on what I did. I have some audio clips taken from a movie, each edited down to 10 seconds. I want my participants to judge the emotional tone of each clip without relying on the semantics. In past experiments, participants tended to categorise clips with expletives or vulgar words as anger, and those without expletives as disgust. Is there a way to create this sound: just the tone, without the words? I am concerned that replacing the words with other sounds, e.g. an “avatar” or chipmunk voice, would reduce the quality of the tone. The ideal way would be to muffle the words and leave the tone as it is.
I’ve attached a 5-second clip of the original. The numerous expletives are not helping, and I want them either removed or muffled.
Nothing I can do with the Equalizer or Low Pass Filter helps. I can remove almost everything that contributes to intelligibility and you can still understand what he’s screaming. I guess that’s what makes low quality radio (and most cellphone) conversations work.
I think one problem is the cadence. I once saw a violin talking with a human and it was no great stretch to figure out what the violin was “saying.” He did it all with rhythm and pitch. No articulation at all.
Thanks for the replies. I’m guessing Audacity might not do the job. I have Praat and will try to figure it out.
Koz - I tried applying the low pass filter twice at 300 Hz, and I can still hear the words clearly. You mentioned exporting it as MP3 at a low bitrate. How do I do this?
Trebor - Thanks for telling me about Praat. As to the sine-wave speech, I listened to the sample clip and I can still hear the words, but the sound is distorted to such an extent that I imagine the emotional tone would be lost when someone is uttering angry words or sobbing.
And in that one sentence we need to find out which version of Audacity you’re on. You should be on Audacity 1.3 for all the fancy-pants tools. You also need to download and install the “lame” MP3 software.
After you do that: File > Export > MP3 > Options. You’re meant to apply the Low Pass Filter only once, at 12 dB or 24 dB per octave. Then export as a ratty MP3, import it, and export it again. MP3 damage is cumulative.
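For anyone who wants to see what a single 12 dB per octave low-pass at 300 Hz does to a signal, here is a minimal sketch in plain Python. It is not Audacity's own filter code; it assumes a standard second-order Butterworth biquad, and the helper names (`lowpass_12db`, `tone`, `rms`) are made up for illustration.

```python
import math

def lowpass_12db(samples, cutoff_hz, sample_rate):
    """Second-order Butterworth low-pass (12 dB per octave), applied once.

    Assumption: a textbook biquad design, not Audacity's implementation.
    """
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    q = 1.0 / math.sqrt(2.0)            # Butterworth response
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    # Coefficients normalised by a0.
    b0 = (1.0 - cos_w0) / 2.0 / a0
    b1 = (1.0 - cos_w0) / a0
    b2 = b0
    a1 = -2.0 * cos_w0 / a0
    a2 = (1.0 - alpha) / a0

    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def tone(freq_hz, sample_rate, seconds=1.0):
    """Generate a test sine wave (stand-in for real audio)."""
    n = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

if __name__ == "__main__":
    rate = 44100
    for freq in (100, 300, 3000):
        original = tone(freq, rate)
        filtered = lowpass_12db(original, 300, rate)
        print(freq, "Hz: level ratio after filtering =", rms(filtered) / rms(original))
```

A tone well below the cutoff passes almost untouched while one well above it is cut heavily, which is the intended muffling. As this thread shows, though, that alone does not seem to be enough to make speech unintelligible.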
You may be up against some interesting human characteristics. I can tell exactly what the cabbie outside is screaming to his traffic without knowing what the actual words are. Cadence and pitch will do it.
Audacity 1.2 is very old and no longer supported, patched, corrected, or updated. Audacity 1.2 can be unstable on newer computers.
Download and install the latest Audacity 1.3 from here…
You can install both Audacity 1.2 and Audacity 1.3 on the same computer, but only use one at a time. Audacity 1.2 will not open projects made on Audacity 1.3.
If you use MP3 or some of the more modern audio compression formats, get Lame and FFMpeg software from the same web site. Do not use older software or software from other web sites, even though they may have the same names.
I do have Audacity 1.3 installed. Thanks for your instructions. I have attached the clip below, after running it through the process twice. I can still hear the words. I am afraid this method isn’t working out as I hoped.
Trebor - the sine-wave speech link is very helpful. I can only hope that my participants are naive to everything. Unlikely. The change in waveform, which in turn changes the frequency, is not good either. It would remove the emotional tone while trying to remove the words.
Hi, I tried reversing the speech. I gave it to a person to try, and he laughed out loud. He didn’t even notice that there was an angry tone. So, not good.
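For reference, reversing a clip outside Audacity can be sketched with Python's standard `wave` module. This is only an illustration, assuming an uncompressed PCM WAV file; the function names and file paths are placeholders, not anything from this thread.

```python
import wave

def reverse_frames(raw, frame_width):
    """Reverse audio frame by frame, keeping the bytes inside each frame in order."""
    frames = [raw[i:i + frame_width] for i in range(0, len(raw), frame_width)]
    return b"".join(reversed(frames))

def reverse_wav(src_path, dst_path):
    """Write a time-reversed copy of a PCM WAV file (paths are placeholders)."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        raw = src.readframes(src.getnframes())
    # One frame holds one sample for every channel.
    frame_width = params.sampwidth * params.nchannels
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(reverse_frames(raw, frame_width))
```

Reversing whole frames rather than single bytes keeps each sample, and the left/right channel order, intact. Something like `reverse_wav("clip.wav", "clip_reversed.wav")` would be the call, with file names of your own.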
Thanks to Koz, Trebor and Steve for trying to help me. I will just have to figure something out.
A similar technique with audio would be to add a lot of “delay” or “reverb”. I think the problem with this approach will be that when the effect is strong enough to disguise what is being said, the sound will not be recognisable as speech (it’ll sound like an engine in a metal pipe) though the long term spectrum will be almost the same.
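To make the “delay” idea concrete, here is a rough sketch of a single feedback delay line in plain Python. A real reverb is built from many such echoes, so treat this as a simplified stand-in; the function name and the default values are assumptions chosen for illustration only.

```python
def feedback_delay(samples, delay_samples, feedback=0.6, tail_repeats=8):
    """Feedback comb filter: y[n] = x[n] + feedback * y[n - delay_samples].

    The output is extended by delay_samples * tail_repeats so that the
    echoes can ring out after the input ends.
    """
    total = len(samples) + delay_samples * tail_repeats
    out = [0.0] * total
    for n in range(total):
        dry = samples[n] if n < len(samples) else 0.0
        wet = out[n - delay_samples] if n >= delay_samples else 0.0
        out[n] = dry + feedback * wet
    return out
```

Each echo is a quieter copy of the one before it, so a larger `feedback` value smears the speech more heavily. The summed output can be louder than the input, so it may need scaling down before being written back to a file.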
I think the interesting thing that has come out of looking at this question is the huge amount of damage that can be done to speech before it becomes incomprehensible.
How about obtaining some samples from a foreign language film?