Removing background voice (frequency?)

Hello everybody. I have never used this forum before, but I have a big problem and very little sound-editing skill. I have audio of two people speaking, and in the background a child is talking/screaming (I hate that child). To me the child’s voice sounds very different from the people speaking, so I thought removing the high frequencies with the equalizer might help, but I had no luck. Is it possible to do something about this, or at least reduce the noise a bit? I’m trying to attach the file; it’s not in English, but I think you will hear what I mean:

https://wetransfer.com/downloads/5e191cd042225171b6f3c94d242e78cb20171114134059/76cf38

You will need very good editing skills and a lot of time.

This is the spectrogram view of a small section of the file (converted to mono):


The brighter red / white regions indicate high sound intensity at specific frequencies. The vertical scale shows the frequencies. This section shows the child’s voice at around 1.5 to 1.9 seconds. To remove that voice, you need to identify which of the bright regions belong to the child’s voice and which belong to one of the other people. Then you can use the Spectral Edit Multitool to remove (by notch filtering) those bits that belong to the child’s voice.
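To make "notch filtering" a bit more concrete, here is a rough sketch in plain Python. It is not what Audacity does internally. It assumes a synthetic signal where a 300 Hz tone stands in for a wanted voice and a 1000 Hz tone stands in for the unwanted one, and it applies a standard biquad notch (Audio EQ Cookbook form) to one time region only. The sample rate, the Q value, the region boundaries, and all function names are made up for illustration.

```python
import cmath
import math

FS = 8000  # sample rate in Hz (assumed for this synthetic example)

def notch_coefficients(f0, q):
    """Biquad notch (Audio EQ Cookbook form), normalised so a0 == 1."""
    w0 = 2 * math.pi * f0 / FS
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = [1 / a0, -2 * math.cos(w0) / a0, 1 / a0]
    a = [1.0, -2 * math.cos(w0) / a0, (1 - alpha) / a0]
    return b, a

def notch_region(samples, f0, q, start, end):
    """Filter only samples[start:end]; everything else is left untouched."""
    b, a = notch_coefficients(f0, q)
    out = list(samples)
    x1 = x2 = y1 = y2 = 0.0  # filter state starts from silence
    for n in range(start, end):
        x0 = samples[n]
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out[n] = y0
        x2, x1, y2, y1 = x1, x0, y1, y0
    return out

def tone_level(samples, freq):
    """Amplitude of one frequency component (single-bin DFT)."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq * n / FS)
              for n, s in enumerate(samples))
    return 2 * abs(acc) / len(samples)

# Stand-ins: 300 Hz for a wanted voice, 1000 Hz for the unwanted one.
signal = [math.sin(2 * math.pi * 300 * n / FS)
          + math.sin(2 * math.pi * 1000 * n / FS) for n in range(FS)]

# Notch out 1000 Hz between 0.25 s and 0.75 s only.
cleaned = notch_region(signal, f0=1000, q=10, start=2000, end=6000)

middle = slice(3200, 4800)  # well inside the filtered region
print("1000 Hz before:", round(tone_level(signal[middle], 1000), 3))
print("1000 Hz after: ", round(tone_level(cleaned[middle], 1000), 3))
print("300 Hz after:  ", round(tone_level(cleaned[middle], 300), 3))
```

With two pure tones the notch works cleanly. A real voice has many harmonics that move around over time, which is why the manual job means lots of small selections. The sketch also switches the filter on and off abruptly, which on real audio can leave clicks at the region edges.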

In this image I spent about 5 minutes doing that:


and this is the result (before and after):

Sorry to tell you this, but that’s impossible.

Real-world sounds contain many simultaneous frequencies, and the harmonics and overtones overlap across most of the audio spectrum. (It’s what makes a trumpet sound different from a piano even when they are playing the same note. It’s not a “pure” note.) If you select a portion of the recording that has only the child and click Analyze > Plot Spectrum, then do the same for a portion with only the adults, you might see a difference, but you’ll also see a lot of overlap.
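The overlap can be illustrated with a small sketch in plain Python. It does not analyse the posted recording. It assumes two crude synthetic "voices", each a stack of harmonics with 1/k amplitudes, with hypothetical fundamentals of 120 Hz (adult) and 300 Hz (child), and compares how much of each one's energy falls in 500 Hz bands. All names and numbers here are illustrative assumptions.

```python
import cmath
import math

FS = 8000       # sample rate in Hz (assumed)
N = 400         # samples per selection, giving 20 Hz wide DFT bins
BAND_HZ = 500   # width of each comparison band

def harmonic_voice(f0):
    """Crude stand-in for a voice: harmonics of f0 with 1/k amplitudes."""
    count = int((FS / 2 - 1) // f0)  # keep every harmonic below Nyquist
    return [sum(math.sin(2 * math.pi * k * f0 * n / FS) / k
                for k in range(1, count + 1))
            for n in range(N)]

def band_shares(samples):
    """Fraction of total spectral energy falling in each BAND_HZ band."""
    bands = [0.0] * (FS // 2 // BAND_HZ)
    for k in range(1, N // 2):  # naive DFT, one bin at a time
        coeff = sum(s * cmath.exp(-2j * math.pi * k * n / N)
                    for n, s in enumerate(samples))
        bands[(k * FS // N) // BAND_HZ] += abs(coeff) ** 2
    total = sum(bands)
    return [energy / total for energy in bands]

adult = band_shares(harmonic_voice(120))  # hypothetical adult fundamental
child = band_shares(harmonic_voice(300))  # hypothetical child fundamental

# Bands where both voices carry more than 1% of their own energy.
shared = [i for i in range(len(adult)) if adult[i] > 0.01 and child[i] > 0.01]

for i, (a_share, c_share) in enumerate(zip(adult, child)):
    lo = i * BAND_HZ
    print(f"{lo:4d}-{lo + BAND_HZ:4d} Hz  adult {a_share:6.1%}  child {c_share:6.1%}")
print("bands used by both:", shared)
```

Even with fundamentals this far apart, both voices put energy into the same lower bands, so a simple equalizer cut that removes one also damages the other.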

Most modern professional recordings are multi-tracked, with the guitar, vocals, drums, etc. on separate tracks (separate recordings). That way, the various instruments and vocals can be edited, deleted, or replaced separately, and different effects can be applied to different tracks, before mixing. If sounds could be separated after the fact, multitrack (or even stereo) recording wouldn’t be required. And of course, pros are still recording in soundproof studios.

And of course, pros are still recording in soundproof studios.

Unless they can’t. Someone discovered a while ago that you could use a long-reach shotgun microphone for hand-held interviews with great success. After a bout of pooh-poohing the idea, everybody adopted it.

It’s all but impossible to get pictures of this. Who’s going to take the picture?

It takes a bit of getting used to. Because of the intense, narrow field of sound, you have to be right on top of aiming—it’s critical. Sloppy doesn’t do it. Headphones in the field mixer are required.

There was a recent NPR outside broadcast show where the interviewer was a new user and apparently wasn’t wearing headphones. This is what bad aiming sounds like:

“…really LIKED THE IDEA AND FOUND IT VALUABLE.”
“…and how YOU WOULD DO THAT.”
“…n’t thOUGHT ABOUT IT A WHOLE LOT.”

There is a catch, of course. The shotgun that a friend of mine uses for very difficult work comes in just shy of $1000 USD, and that’s not counting the field sound mixer and recorder.

Koz