I have a voice recording with a lot of noise and I’d like to fix it. I don’t need much sound quality, just enough to understand what’s being said. The whole recording is about 50 minutes long. For the first 35 minutes or so the noise is quite loud but I can understand what’s being said, so it’s OK (although it would be nice to improve it). Around the 35-minute mark the noise starts to worsen gradually. Around 40 minutes in it becomes difficult to understand the words, and around 45 minutes the noise is so loud that understanding the words becomes impossible.
This is a recording of two people talking in a relatively quiet room (some traffic noise comes from the street, but it’s not very loud). I made it with the recorder on my cell phone, which of course is not very high quality, but I’ve used it many times for similar recordings in the past and the quality of the results was more than enough. It looks like what created the problem in this case was that I was charging the phone battery while it recorded (I didn’t anticipate that this might cause problems).
The spectrogram of the raw audio revealed lots of noise tones centered at round-number frequencies (5000 Hz, 6200 Hz), which suggests the noise must come from the electricity mains. Using my very limited knowledge of audio editing, I applied the notch filter around one hundred times to remove those tonal noises. This improved the quality somewhat and led to the state I’ve described (before using the notch filters the quality was even worse). I tried the Noise Reduction effect but it didn’t help: it removes as much voice as noise. I don’t know whether that’s because this effect just can’t help here or because I don’t know how to set up the parameters correctly.
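For anyone who wants to see what one of those notch passes does outside the editor, here is a minimal sketch in plain Python. It is not the editor's own Notch Filter code; it uses the widely published "audio EQ cookbook" biquad notch. The 16000 Hz sample rate, the Q of 30, and all function names are assumptions made for illustration.

```python
import math

def notch_coefficients(freq_hz, sample_rate, q=30.0):
    """Cookbook biquad notch centred on freq_hz, normalised so a0 == 1."""
    w0 = 2.0 * math.pi * freq_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a

def biquad(samples, b, a):
    """Direct form I: y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def apply_notches(samples, freqs_hz, sample_rate, q=30.0):
    """Run one notch per listed frequency, one after another."""
    for freq in freqs_hz:
        b, a = notch_coefficients(freq, sample_rate, q)
        samples = biquad(samples, b, a)
    return samples

def tone(freq_hz, sample_rate, n):
    """Unit-amplitude test sine."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

Something like `apply_notches(samples, [5000.0, 6200.0], 16000)` would chain the two tones mentioned above. Each notch only removes a narrow band, which is why it takes so many of them when the noise has dozens of tones, and why it cannot help with broadband hash between the tones.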
You can listen here to a ten-minute sample that covers the segment from the 35-minute mark to the 45-minute mark. In this sample you can hear how the noise worsens as described above. Is there anything I can do to fix this recording?
There’s a reason professional recordings are still made in soundproof studios, with good low-noise equipment and proper microphone positioning, etc…
…difficult to understand the words and around 45 minutes the noise is so loud that understanding the words becomes impossible.
The human brain is often the best “filter”, so there’s probably not a lot you can do digitally.
If you can make out what’s being said, a transcript (or subtitles with a video) is often better than the “cleaned-up” recording.
You’ve already tried regular noise reduction (which works best with constant low-level background noise) and you’ve tried notch filters.
Generally, you can high-pass filter around 100Hz (because any very-low frequencies in a voice recording are just noise). Similarly, anything above 10kHz (or maybe 7kHz or so) can usually be eliminated without affecting intelligibility.
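As a rough illustration of that band-limiting idea, here is a sketch in plain Python that chains a high-pass and a low-pass filter. It is not what the editor's built-in filters do internally; it uses second-order "audio EQ cookbook" biquads with a Butterworth Q. The function names, the 100 Hz and 7000 Hz defaults, and the 44100 Hz sample rate in the usage note are assumptions for the example.

```python
import math

BUTTERWORTH_Q = 1.0 / math.sqrt(2.0)  # maximally flat second-order response

def _biquad(samples, b0, b1, b2, a1, a2):
    """Direct form I biquad with coefficients already divided by a0."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def high_pass(samples, cutoff_hz, sample_rate, q=BUTTERWORTH_Q):
    """Attenuate content below cutoff_hz (12 dB per octave)."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    cos_w0, alpha = math.cos(w0), math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0 = (1.0 + cos_w0) / 2.0 / a0
    return _biquad(samples, b0, -(1.0 + cos_w0) / a0, b0,
                   -2.0 * cos_w0 / a0, (1.0 - alpha) / a0)

def low_pass(samples, cutoff_hz, sample_rate, q=BUTTERWORTH_Q):
    """Attenuate content above cutoff_hz (12 dB per octave)."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    cos_w0, alpha = math.cos(w0), math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0 = (1.0 - cos_w0) / 2.0 / a0
    return _biquad(samples, b0, (1.0 - cos_w0) / a0, b0,
                   -2.0 * cos_w0 / a0, (1.0 - alpha) / a0)

def band_limit(samples, sample_rate, low_cut_hz=100.0, high_cut_hz=7000.0):
    """Keep roughly the speech band: high-pass first, then low-pass."""
    return low_pass(high_pass(samples, low_cut_hz, sample_rate),
                    high_cut_hz, sample_rate)

def sine(freq_hz, sample_rate, n):
    """Unit-amplitude test sine."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

With a 44100 Hz file, `band_limit(samples, 44100)` would cut rumble below about 100 Hz and hiss above about 7 kHz while leaving a 1 kHz tone essentially untouched. The high cut-off has to stay below half the sample rate, so this only helps if the noise actually lives outside the speech band.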
The “main voice frequencies” are around 100-300Hz, so you can try boosting those. And boosting the higher frequencies (maybe above 2kHz) can bring out the “T” & “S” sounds, or you can otherwise just experiment with the Equalizer.
frequencies (5000 Hz, 6200 Hz), which suggests the noise must come from the electricity mains.
Power line frequency is 50 or 60Hz depending on where you live. You can get harmonics, but anything in the kHz range is something else. (You can get high frequency noise from “switching type” DC power supplies and chargers.)
The high-pass filter didn’t produce any noticeable difference. The bandwidth of the recording is 0-8000 Hz, so it looks like low-pass filters wouldn’t do much either.
I fiddled a bit with the equalizer but it didn’t help much. An AM-radio-like curve makes a marginal improvement, but that’s it.
Other than that, I’ve realized that the waveform is riddled with little spikes all over the place. I tried Click Removal, and it helps a bit, but for the most damaged parts it’s still a marginal improvement, and depending on the parameters in some cases it even seems to remove the proper audio.
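To show why the parameters matter so much with spikes like these, here is a minimal sketch of one common de-clicking approach: replace any sample that sits too far from the median of its neighbours. This is not the algorithm behind the Click Removal effect; the function name, the 5-sample window, and the 0.3 threshold are assumptions chosen for illustration.

```python
import math
import statistics

def remove_clicks(samples, window=5, threshold=0.3):
    """Replace samples that deviate from the local median by more than threshold.

    window    -- number of neighbouring samples (odd) used for the median
    threshold -- allowed deviation, in the same units as the samples
    """
    half = window // 2
    out = list(samples)
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        local_median = statistics.median(samples[lo:hi])
        if abs(samples[i] - local_median) > threshold:
            out[i] = local_median  # treat the sample as a click
    return out

def demo():
    """Clean 200 Hz tone at 16 kHz with a few single-sample spikes added."""
    sample_rate = 16000
    clean = [0.5 * math.sin(2.0 * math.pi * 200.0 * i / sample_rate)
             for i in range(1600)]
    noisy = list(clean)
    for idx in (100, 400, 777, 1200):
        noisy[idx] += 0.9
    repaired = remove_clicks(noisy)
    worst = max(abs(r - c) for r, c in zip(repaired, clean))
    return clean, noisy, repaired, worst
```

If the threshold is set too low, ordinary loud consonants start to look like clicks and get flattened, which matches the effect of the repair seeming to remove the proper audio. If the spikes are wider than about half the window, the median itself is contaminated and the repair stops working, which may be what happens in the most damaged parts.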
I was charging the phone battery while it recorded
(You can get high frequency noise from “switching type” DC power supplies and chargers)
Trying to get power systems or chargers to run at wall power frequency (50/60Hz) is a pain in the neck. They’re large, heavy and expensive. Much better to chop the wall power up into tiny chunks, say, 5000 times a second and convert that. Doesn’t have to be “clean,” either. The battery system doesn’t care if the charging power has spikes on it or is slightly unstable.
Many systems use the battery itself as a stabilizing influence, but that influence starts to fall apart as the battery ages.
Am I describing your system?
Can you post a bit of the noise on the forum? That would be better than having us just guess at it.
This is a recording of a session with my therapist.
My “system”, if it can be called that, is a budget cell phone, a Motorola Moto G4 to be precise. It certainly doesn’t have any kind of optimizations meant to improve the quality of recordings (nor does the charger). I think it didn’t even come with a recording app when I bought it.
I posted a ten minute sample in my first post. You can find it here.
That’s rough. The hash changes over the course of the performance, so there’s no such thing as Noise Reduction unless you wanted to take it in two or three minute chunks. Even at that, the last third may be useless. In order to get Noise Reduction to work, you have to find a chunk of the show with noise-only and no voice. I can do that in the first two thirds, but not the last. The trash is just too dense.
However, a note. My money is on radio interference. I can almost make out voices and other content pieces in there. I know The Human Ear can make voices out of anything, but this is not plain, ordinary background hash. I suggest “Lunch with Jenny Filmore” on CBX radio. Was this office near any tall red and white towers?
It is my opinion you can get a reasonable recording out of a complete trash microphone, but you can’t be a New User and you can’t have a challenging environment. All three of those at once is a killer.
Yeah, I’m afraid you can’t really salvage it due to the pile-up of issues (terrible hardware, background noise, radio interference).
I had the very same issue when recording my wife’s Ph.D. thesis defence. We recorded it on a dictaphone and on my phone, but two of the professors were talking loudly to each other next to the former, and the latter was too far away in a corner of the room, so most of the recording on both devices was messed up. We managed to get understandable audio for 90% of the recorded time by picking the best of what we had on both devices, but there’s still 10% that was lost at sea.