Can I improve on what I already do?

PGA · January 23, 2012, 7:05pm

Sorry this is a long post! Firstly, let me set the scene. One of my hobbies is assembling audio-visual sequences (AVs). This involves combining images (usually stills) and sound (any combination of music, voice-over and location sound) into what I hope is a pleasant viewing/listening experience for the audience.

I have been creating AVs since 1985 – initially using a Tascam 34B 4-track open-reel tape recorder and a Tascam Portastudio cassette deck/mixer. I was introduced to Audacity in the summer of 2005 and switched to 100% digital AV work later that year. I currently use v1.3.14 of Audacity on a Windows 7 desktop PC. My location recording is done on a Zoom H4 and I also use that in my study to do voice-over recording. Although I am satisfied with the quality of my soundtracks, since joining the forum a few days ago I have realised that there is more to Audacity than the few features that I use. I am now regularly dipping into the User Manual and the wiki absorbing new knowledge. But I am still left wondering whether I could benefit from any of these other features.

My workflow in Audacity is extremely simple as I still apply the rule that I learned back in 1985 – “you cannot make a bad recording into a good recording, you can only make it into a different bad recording”. I always try to get the best original recording I can. In this respect, I find the Zoom H4 a remarkable little unit! It handles every challenge that I throw at it superbly: whether a bird singing 30 feet up a tree in the open air or our local Brass Band ensemble playing in a hall at less than 15 feet away.

All recordings are done as WAV files at 44100Hz and 16bit. These are brought into the Audacity project as copied in items, so I’m free to relocate them if I want to. The only time I create an MP3 is at final mix-down. Generally my work in Audacity is simple track editing (cutting out the bits I want – or the bits I don’t want) with very short fades-in and –out applied at each end of each piece. Each passage of voice-over goes on its own track; each ambient sound goes on its own track; each piece of music goes on its own track. I then use the Gain control and the Envelope tool (as I deem best choice) to adjust the balance of the various tracks so as not to overpower the voice track.

I will sometimes use the High-pass filter on a bird-song or natural sound recording because I find that having just the higher frequencies helps these to “sit above” the rest of the material. This filter can also help to reduce any slight wind “boom” that was picked up (although that is rare because I use a Rycote Wind-Jammer over the foam baffle on the Zoom’s mics).

As I said above, I’m not unhappy with my results but am wondering if there are features in Audacity that could lead to even better soundtracks. I don’t want “different” I want “better” – if that is possible. However, another rule that I live by is: “If it ain’t broke, don’t fix it!”

I have uploaded the first 90 seconds or so of my most recent soundtrack to Dropbox here: http://dl.dropbox.com/u/15623351/Beamish-ST-Clip.mp3.

I would welcome any comments – good or bad or otherwise.

Thanks in anticipation,
Peter

steve · January 23, 2012, 8:25pm

You seem to have it pretty well nailed, both in theory and practice.
The only (minor) points I can think of are:

There is some sub sonic (very low frequency) content in the audio. While this is too low to be audible, the playback system will probably still try to reproduce those sounds, thus pushing the speakers to places that they don’t really want to go. As the first step after importing it would be useful to remove those sub sonic frequencies. This can be done using the “High Pass” filter effect, though I would recommend using the Equalizer instead. The Equalization effect is an FFT filter so it has the advantage of not causing phase shifts. Here’s an example of a sub sonic filter (note that the sliders need to be adjusted as well as adding the points on the graph to achieve this very sharp cut-off):
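For anyone curious what "filtering in the frequency domain without phase shift" means in practice, here is a minimal sketch in Python. It is not how Audacity's Equalization effect is implemented; it simply zeroes every frequency bin below a cutoff (a crude "brick wall") on a short synthetic signal. The function names, the 400 Hz sample rate, the 5 Hz "rumble" and the 80 Hz "wanted" tone are all assumptions chosen to keep the demo small.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (fine for short demo signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def subsonic_cut(samples, sample_rate, cutoff_hz):
    """Zero every frequency bin below cutoff_hz (illustrative brick-wall cut)."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bins k and n-k hold the positive and negative side of the same frequency.
        freq = min(k, n - k) * sample_rate / n
        if freq < cutoff_hz:
            spectrum[k] = 0
    return idft(spectrum)

# Hypothetical test signal: 5 Hz rumble plus an 80 Hz tone, one second long.
SAMPLE_RATE = 400
N = 400
rumble = [0.5 * math.sin(2 * math.pi * 5 * t / SAMPLE_RATE) for t in range(N)]
tone = [0.5 * math.sin(2 * math.pi * 80 * t / SAMPLE_RATE) for t in range(N)]
mixed = [r + s for r, s in zip(rumble, tone)]

filtered = subsonic_cut(mixed, SAMPLE_RATE, cutoff_hz=20)

# If the filter introduced a phase shift, the result would not line up with the tone.
max_error = max(abs(f - s) for f, s in zip(filtered, tone))
print(f"largest difference from the clean 80 Hz tone: {max_error:.2e}")
```

The filtered result lines up sample for sample with the 80 Hz tone, which is the "no phase shift" property. A real equalizer uses a smoother curve than this hard cut, which is why the curve shape and filter length matter.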

The overall level may be a little low.
When exporting as MP3 it is important to allow some headroom because MP3s frequently have peaks higher than the original audio, however you have about 4 dB of headroom after encoding, so you could afford to push the level up by 2 or 3 dB without any risk of clipping. This will probably make no noticeable difference at all if your “product” is being played only on your equipment as you will be used to adjusting the playback level to suit, but if it is for wider circulation then it’s worth considering that most people are used to audio that has been “maximized” to death, so your recording is going to sound distinctly quiet (even after giving it a couple more dB).
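The headroom arithmetic can be checked with a few lines of Python. This is only a sketch: it assumes "about 4 dB of headroom" means the highest peak sits at roughly -4 dB relative to full scale, and the helper name is made up for illustration.

```python
def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (db / 20)

peak_db = -4.0    # assumed: highest peak after MP3 encoding, relative to full scale
boost_db = 3.0    # the suggested increase

new_peak_db = peak_db + boost_db
new_peak_linear = db_to_gain(new_peak_db)   # 1.0 would be full scale
clips = new_peak_linear > 1.0

print(f"new peak: {new_peak_db:.1f} dB ({new_peak_linear:.3f} of full scale), "
      f"clipping: {clips}")
```

With these numbers the peak ends up about 1 dB below full scale, so there is still a small safety margin.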

Unless you have a specific reason for not doing so, I’d suggest that you try using VBR encoding rather than CBR. It’s possible that your AV equipment may object to VBR, but if you’ve not done so then it’s worth trying. The advantage of VBR is that while 192 kbps CBR will use 192 kb each second, regardless of what the audio is, some “simple” sounds can be encoded as MP3 with equal quality at lower bit rates, whereas some “complex” sounds require a higher bit rate to achieve good quality. VBR automatically changes the bit rate so that more bits are used when they are really needed. The “Standard” preset will probably produce marginally better quality than 192 kbps CBR. The “Extreme” preset does what it says - it’s very difficult to hear the difference between this and the original WAV even with “ideal” listening conditions.

Minor details - very nice work, thank you for sharing.

PGA · January 24, 2012, 9:29am

Hi Steve,

Thanks for the detailed reply. Can I ask: how did you identify that those low frequencies were present if they were inaudible? Were they present in all parts of the sound track or only in some parts? I’d like to try and identify the point at which they become present to see if I can eliminate them at source.

I usually present my AVs using my own equipment: an Acer laptop sending the image signal from its VGA output to a Dell digital projector and sending the audio signal via the Headphones jack to a pair of Bose active loudspeakers. At AV events the sequences are shown using equipment provided by the event organiser. My volume levels are pretty much the same as those used by other AV workers. So I’m happy to leave things as they are in that respect.

I had been using CBR for the MP3 files because that was the default. My philosophy on software settings is that the folks who create our software have spent thousands (if not millions) of man-hours working hard to get it right. Why should I then try to second-guess them? However, I will give VBR a go and then listen carefully to see if I can hear any difference.

Once again, thanks for the advice.

regards,
Peter

PGA · January 24, 2012, 12:08pm

OK, I’ve run some tests with VBR. I exported a selection from the same soundtrack that the previous clip came from (including the clip area and some more sound either side). I created four files: WAV, CBR 192K, VBR Standard and VBR Extreme. I then imported these into the same Audacity project, set the view to fit the four tracks vertically and started playing them back. I then "Solo"ed from track to track listening carefully to try and detect an audible difference. The playback was via a RealTek HD Audio soundcard (volume level at 94% in the Sound Manager) into a pair of Bose Companion 2 PC speakers (volume control set at mid-way point). The listening was via my 61-year old ears at about 3 feet from the speakers (the ears were given a clean bill of health early in 2011). Studying the file sizes I was slightly surprised. Obviously the WAV was the largest at 16,163KB, but the values for the MP3s were a surprise. The CBR was 2,202KB, the VBR-Standard was 2,363KB and the VBR-Extreme was 2,811KB. I think I was expecting a wider spread than that on the MP3s.
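One way to make sense of those file sizes is to turn them into average bit rates. The sketch below does that in Python under three assumptions that are not stated in the post: the WAV is stereo, "KB" means 1024 bytes, and file headers and tags are small enough to ignore.

```python
BYTES_PER_KB = 1024                      # assumption: sizes are reported in units of 1024 bytes
WAV_BYTES_PER_SECOND = 44100 * 2 * 2     # 44100 Hz x 2 bytes (16-bit) x 2 channels (stereo assumed)

sizes_kb = {
    "WAV": 16163,
    "CBR 192k": 2202,
    "VBR Standard": 2363,
    "VBR Extreme": 2811,
}

# The uncompressed WAV size gives the clip length.
duration_s = sizes_kb["WAV"] * BYTES_PER_KB / WAV_BYTES_PER_SECOND

def average_kbps(size_kb, seconds):
    """Average bit rate in kilobits per second for a file of the given size."""
    return size_kb * BYTES_PER_KB * 8 / seconds / 1000

average = {name: average_kbps(kb, duration_s)
           for name, kb in sizes_kb.items() if name != "WAV"}

print(f"clip length: about {duration_s:.1f} s")
for name, kbps in average.items():
    print(f"{name}: about {kbps:.0f} kbps on average")
```

Under those assumptions the CBR file works out very close to 192 kbps, which suggests the assumptions are about right, and the two VBR files average only somewhat more than that. That would explain why the spread in file size is narrower than expected.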

Any comments on my test technique or the outcome?

regards,
Peter

kozikowski · January 24, 2012, 1:04pm

Subsonic sounds like you have can be sneaky. Pull off the front grill of your speakers (if they come off) and watch the largest bass speaker. Play a segment with that problem in good loud volume and I bet you can hear normal sound, but the bass cone will be wildly pulsing in and out and seem to have nothing to do with the show. You will be able to hear that if the problem gets bad enough when the cone bottoms out in its mounting and actually makes slapping or cracking sounds. Extreme versions of sub-sonic sound can permanently damage the speaker.

Koz

PGA · January 24, 2012, 1:12pm

Koz,

I cannot remove the grille on either set of speakers (not the Bose Companion 2s nor the older pair of Bose that I take out “on the road” when presenting my AVs to an audience).

regards,
Peter

waxcylinder · January 24, 2012, 1:28pm

Try selecting some of the audio and then use Analyze > Plot Spectrum

Stretch the window out a bit - change the “Size” box to a much higher detail level and change the “Axis” to “Log Frequency”
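As a rough illustration of what a spectrum plot measures, the Python sketch below estimates the level of a single frequency in a signal. It is not Audacity's Plot Spectrum algorithm (which uses windowed FFTs and averaging, so its dB readings will differ); the function name and the synthetic signal, a quiet 8 Hz rumble under a louder 440 Hz tone, are assumptions for the demo.

```python
import cmath
import math

def level_db_at(samples, sample_rate, freq_hz):
    """Estimate the amplitude (in dB relative to full scale) of one frequency."""
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * t / sample_rate)
              for t, s in enumerate(samples))
    amplitude = 2 * abs(acc) / n
    return 20 * math.log10(max(amplitude, 1e-12))   # floor avoids log of zero

# Hypothetical signal: inaudible 8 Hz rumble at -36 dB plus a 440 Hz tone at about -6 dB.
SAMPLE_RATE = 4000
rumble_amp = 10 ** (-36 / 20)
signal = [rumble_amp * math.sin(2 * math.pi * 8 * t / SAMPLE_RATE)
          + 0.5 * math.sin(2 * math.pi * 440 * t / SAMPLE_RATE)
          for t in range(SAMPLE_RATE)]              # one second of audio

for freq in (8, 100, 440):
    print(f"{freq:>4} Hz: {level_db_at(signal, SAMPLE_RATE, freq):7.1f} dB")
```

The point is that content at 8 Hz shows up clearly in the numbers even though nobody could hear it, which is how sub-sonic material gets spotted.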

WC

waxcylinder · January 24, 2012, 1:33pm

My post above makes me wonder - when I am transferring my LP recordings to digital formats would it be useful to Equalize out the very low frequencies (just in case)?

I don’t want to damage my speakers - but I realize that I played the same LPs on the deck/arm/speakers for years without any apparent damage (and the one time they went back to the QUAD factory for a service the speakers were declared fully healthy - apart from a couple of dents on the grills ), so I’m guessing High Pass Filtering or Eq-ing to remove low frequencies may not be necessary.

I did just run a Frequency Analysis on a recent LP recording and it does show an “interesting” peak from 7-12 Hz of around -36 dB

WC

PGA · January 24, 2012, 1:52pm

You mean like this?

I’m not a sound engineer or an audiophile, so what does this tell me?

regards,
Peter
P.S. You folks are incredibly helpful. I’m impressed!

waxcylinder · January 24, 2012, 2:29pm

Yup that’s exactly what I meant. I’m not technical enough to be able to properly interpret this - but I’m sure that someone will be able to shed some techy insight on the attachment you usefully posted.

WC

PGA · January 24, 2012, 2:31pm

A further update…
I have tried to build an Equalizer curve similar to that recommended by Steve. It looks like this:

It’s almost the same but not quite. Is the difference important?

I then ran that against my WAV clip and re-did the Spectrum Analysis and got this:

When I play the two versions back and “Solo” to and fro between them I can hear that the bass is less “boomy” in the modified version. I had often thought that my sequences sounded a bit “bass heavy” on playback to audiences and had always put it down to room acoustics of the halls I was in. Now I’m beginning to think it wasn’t just the room acoustics.

regards,
Peter

waxcylinder · January 24, 2012, 2:43pm

This is the similar plot of one of my recent LP digitizations:
Peter's LP.JPG
Technical insights and comments greatly welcomed …

Interestingly I note from re-reading my workflow for LP digitization from the manual ( http://manual.audacityteam.org/man/Sample_workflow_for_LP_digitization ) that it recommends:

Remove subsonic rumble and low frequency noise
Use Effect > High Pass Filter… with a setting of 24 dB per octave rolloff, and a cutoff frequency of 20 - 30 Hz to remove unwanted subsonic frequencies which can cause clicks when editing. If your record is warped, this will definitely generate unwanted subsonics, in which case consider a lower cutoff frequency.

This step can probably be omitted given a flat record and high quality turntable, arm and cartridge.
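To get a feel for what "24 dB per octave" with a 20 - 30 Hz cutoff does, here is a small Python sketch. It assumes an ideal 4th-order Butterworth high-pass response, which may not match the filter Audacity actually uses, and picks 25 Hz as the cutoff purely for illustration.

```python
import math

def butterworth_highpass_db(freq_hz, cutoff_hz, order=4):
    """Gain in dB of an ideal Butterworth high-pass filter.

    Each filter order contributes 6 dB per octave, so order 4 gives 24 dB per octave.
    """
    ratio = cutoff_hz / freq_hz
    return -10 * math.log10(1 + ratio ** (2 * order))

CUTOFF_HZ = 25   # assumed value inside the suggested 20 - 30 Hz range

for freq in (6.25, 12.5, 25, 50, 100):
    gain = butterworth_highpass_db(freq, CUTOFF_HZ)
    print(f"{freq:>6} Hz: {gain:7.2f} dB")
```

With these assumptions the filter is about 3 dB down at the cutoff, roughly 24 dB down one octave below it, and leaves 50 Hz and above essentially untouched.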

I’ve always omitted this step as I assumed that I had “flat record and high quality turntable, arm and cartridge” - does the above plot prove me wrong?

Postscript:
Interestingly I just analysed a recording that I recently “borrowed” from YouTube which was obviously uploaded from LP (there was never a CD release of this track - apart from a bootleg Japanese version which was also copied from LP) - and this showed a very similar curve in the LF end of the plot up to 100 Hz or so - is this just a “feature” of vinyl LPs?

WC

waxcylinder · January 24, 2012, 3:22pm

And this is a similar analysis of a recent capture from interweb streaming Audio from an Irish radio station:
Peter's webstream.JPG
I note the sudden drop-off at around 9kHz - so I assume that it is “losing” the HF stuff?

And should I be worried about the LF stuff, especially the peak right at the left end of the plot in the sub 3Hz range?

Are these normally expected “features” of streaming t’interweb audio?

WC

PGA · January 24, 2012, 3:49pm

Waxcylinder,

Do the words “cans”, “worms”, “lids” come to mind?

Peter

waxcylinder · January 24, 2012, 6:00pm

ROFLMAO

You see, even though I’m gettin’ on a bit I’m still keen to learn - and you sparked off a thought that’s been lurking at the back of my mind for a while …

And I’ve started bigger forest fires in my life (to mix metaphors)

WC

PGA · January 24, 2012, 6:32pm

Further investigations using Spectral Analysis of other recordings that I have made, indicate to me that the long flattish profile in the low frequencies, as seen in Audacity1.jpg posted earlier, is an artefact of the Zoom H4 recorder. My H4 is an early model. They quite quickly introduced a Version 2, with improved on-board software, that included a “Lo Cut” filter to allow these low frequencies to be eliminated at recording time. And then, of course, they introduced the H4n with lots more improvements. It looks like I will need to include, into my workflow, the use of Equalizer to eliminate the low-frequencies. All I need to do now is play around a bit more with the shape of the Equalizer curve to try and get a low-frequency cut-off that gives me an acceptable bass. Not sure yet whether what I have is as good as I can get it. Ah well, it’ll keep me off the streets!

Peter

steve · January 24, 2012, 9:28pm

Oh dear, I’ve created a monster.

Regarding bass roll-off, I ask myself, “what are the lowest frequencies that I’m interested in?” then roll off the bass a bit below that.

Clearly we don’t want DC off-set (a 0 Hz bias on the signal) as all that does is to heat up the voice coils on the woofers and reduce the amount of headroom. A small amount of DC offset, or sub-sonic sound will not produce audible sounds (though it can mess up the editing by introducing clicks), but as Koz pointed out, in extreme cases it can destroy the speakers. In less extreme cases it causes the speakers to operate outside of their ideal range and although the sub frequencies themselves are not audible it can have a detrimental effect on bass frequencies within the audible range making the sound “muddy”.
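The effect of DC offset on headroom is easy to show with numbers. The Python sketch below uses a made-up signal (a 50 Hz tone with peak 0.8 and an offset of 0.1, where 1.0 is full scale); the helper names are illustrative only.

```python
import math

def dc_offset(samples):
    """DC offset is simply the average value of the samples."""
    return sum(samples) / len(samples)

def remove_dc(samples):
    """Subtract the average so the waveform is centred on zero."""
    offset = dc_offset(samples)
    return [s - offset for s in samples]

def peak_db(samples):
    """Highest absolute sample value in dB relative to full scale (1.0)."""
    return 20 * math.log10(max(abs(s) for s in samples))

# Hypothetical signal: one second of a 50 Hz tone, peak 0.8, sitting on a 0.1 offset.
SAMPLE_RATE = 1000
biased = [0.1 + 0.8 * math.sin(2 * math.pi * 50 * t / SAMPLE_RATE)
          for t in range(SAMPLE_RATE)]
cleaned = remove_dc(biased)

print(f"measured offset: {dc_offset(biased):.3f}")
print(f"peak with offset:    {peak_db(biased):.2f} dB")
print(f"peak without offset: {peak_db(cleaned):.2f} dB")
```

In this example the offset costs about 1 dB of headroom without adding anything audible.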

One of the benefits of “sealed box” speaker design is that the amount of speaker damping increases rapidly below the design frequency range, so offering some protection to the speakers against extreme excursion. This is one of the reasons that sealed box designs tend to have a “clean and tight” bass response. The main downside is that huge boxes are required for really low bass.

For playing “full range” audio through good quality speakers you will probably “be interested in” frequencies down to at least 40 Hz.
If you look closely at the Eq that I posted, there is virtually no attenuation at 40 Hz (hint: look at the green line rather than the blue line). It then drops sharply to about -14 dB at 30 Hz and below -60 dB at 20 Hz.
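Those three points give an idea of how steep the curve is. The Python sketch below works out the slope in dB per octave between them, treating "virtually no attenuation" as 0 dB and "below -60 dB" as exactly -60 dB, so the second figure is a lower bound. The function name is just for illustration.

```python
import math

def slope_db_per_octave(f_low, db_low, f_high, db_high):
    """Average slope between two points on a frequency response curve."""
    octaves = math.log2(f_high / f_low)
    return (db_high - db_low) / octaves

upper_slope = slope_db_per_octave(30, -14, 40, 0)     # between 30 Hz and 40 Hz
lower_slope = slope_db_per_octave(20, -60, 30, -14)   # between 20 Hz and 30 Hz

print(f"30 to 40 Hz: about {upper_slope:.0f} dB per octave")
print(f"20 to 30 Hz: at least {lower_slope:.0f} dB per octave")
```

Both figures are steeper than a 24 dB per octave high-pass filter, which is what "very sharp cut-off" means here.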

If the audio is designed to be played through a sound system with massive sub woofers (such as a cinema sound system) and the type of audio requires really low bass (for example an action film with lots of big ground shuddering explosions), then this filter would roll off too much low bass.

With your filter (PGA), although the “dots” are in a similar place, you’ll notice that the bass starts to be rolled off from about 50 Hz, and is about 6 dB down at 40 Hz. The green curve is not following the blue line as closely, thus producing a less steep filter that starts rolling off a little earlier. You may lose a little too much low bass with that filter for when you play through high quality AV equipment, though it may be beneficial for the little Bose Companion 2’s giving “cleaner” bass.
To get the green line to follow the plotted points more closely, increase the “filter length” (this will also make the processing time a bit slower).
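A rough rule of thumb explains why a longer filter follows the drawn curve more closely: a filter of N samples at sample rate fs cannot resolve detail much finer than about fs / N Hz. The Python sketch below applies that rule. It is only an approximation (the real figure depends on how the filter is windowed), and the filter lengths in the loop are example values rather than settings taken from this thread.

```python
import math

SAMPLE_RATE = 44100   # matches the 44100 Hz recordings discussed here

def approx_resolution_hz(sample_rate, filter_length):
    """Rule-of-thumb frequency resolution of a filter of the given length."""
    return sample_rate / filter_length

def min_length_for_transition(sample_rate, transition_hz):
    """Rule-of-thumb shortest filter able to resolve a transition this narrow."""
    return math.ceil(sample_rate / transition_hz)

for length in (1001, 4001, 8191):      # example lengths only
    print(f"length {length}: about {approx_resolution_hz(SAMPLE_RATE, length):.1f} Hz")

# Going from 40 Hz (untouched) to 20 Hz (heavily cut) is a 20 Hz wide transition.
print("shortest usable length:", min_length_for_transition(SAMPLE_RATE, 20))
```

By this estimate a filter needs a couple of thousand samples before it can even begin to follow a cut-off squeezed between 20 Hz and 40 Hz, and longer is better.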

Regarding MP3 encoding.
The “recommended” setting for good quality music by the makers of LAME is the VBR “Standard” preset.
The most commonly used “general purpose” setting is 128 kbps CBR. This is not really good enough for high quality sound - it’s more like the lowest quality that you can get away with without the MP3 damage being too noticeable.
Some old MP3 players had problems playing MP3s correctly, and even today it is not uncommon for MP3 players to report the wrong length (time) for VBR MP3 files, though all modern MP3 players should play these files correctly.
The main advantage of 128 kbps CBR is that it is really common and so it is fully supported by all players.

Back to Bass’ics
The amount of sub-sonic frequencies shown by the examples (PGA and waxcylinder) do not look excessive to me.
Here’s the spectrum of a track that I pulled off the Internet that has a serious problem with sub-bass:

The cut off at 9 kHz will be due to the data compression. MP3 and others deliberately cut off the top end when using lower bit rates so that they have more “bits” available for the “more important stuff” that lies in the range where hearing is most sensitive.

MP3 encoding also tends to add a little low frequency noise, though in this particular case it looks like there may be a little DC off-set causing that rise below 5 Hz.
It looks like there has been a purposeful attempt at reducing low bass - between 4 Hz and 50 Hz the plot shows around -48 dB compared with a peak amplitude of around -16 dB in the 80 to 90 Hz region. I’d expect that to produce a good amount of “bottom end” on small computer speakers without over stressing them.

I can’t find the quote now, but PGA also mentioned about “booming” and “rooms”.
Most rooms have a resonant frequency. With sound reinforcement this is a really common problem and I’m amazed by the number of “live sound” engineers that do not “Eq the room” before they start - that is, to adjust the equalization on the PA so that there is some compensation against those frequencies that are accentuated by the room. Such equalisation (where necessary - and it’s often necessary) should be done with the sound system, not by the recording. Better still, it should be done by treating the room, though that can be really expensive at public venues. In the home, arranging the furniture can make a big difference. Heavy curtains and sofas are good for soaking up reflected sounds.
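For a rough idea of where a room's resonances fall, the simplest ones (standing waves between two parallel walls) sit at multiples of the speed of sound divided by twice the distance between the walls. The Python sketch below applies that formula to a made-up hall of 12 m x 8 m x 4 m; the dimensions and the 343 m/s speed of sound (air at about 20 °C) are assumptions, and real rooms have many more resonances than these.

```python
SPEED_OF_SOUND = 343.0   # metres per second, assumed room temperature

def axial_modes_hz(dimension_m, count=3):
    """First few standing-wave frequencies between two parallel surfaces."""
    return [n * SPEED_OF_SOUND / (2 * dimension_m) for n in range(1, count + 1)]

hall = {"length": 12.0, "width": 8.0, "height": 4.0}   # hypothetical small hall

for name, size in hall.items():
    modes = ", ".join(f"{f:.1f}" for f in axial_modes_hz(size))
    print(f"{name} ({size} m): {modes} Hz")
```

For a hall of that size several of these land between roughly 40 and 130 Hz, the range where "boom" is usually heard, which fits the point that the room needs to be dealt with at the venue rather than in the recording.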

PGA · January 24, 2012, 9:50pm

Steve,

Yes, you’ve created a monster - a monster learning curve that I am enjoying travelling along. I’ve adjusted the filter length so that the green line touches the axis at -48dB. When I apply this to some samples of my work, I like the result. They retain some nice bass but have lost the “boom” that they did have. Now I need to try the results on my “show time” pair of Bose speakers.

I take on board your comments about rooms and their acoustics, but when I’m out on the road I have no control over the venues. They are usually smallish halls - community centres, church halls, village halls (capacity no more than 60-100 and often smaller). I suspect we’re fairly close to the “elbow” where extra effort yields little extra improvement. Thanks for all the help from everybody!

regards,
Peter

billw58 · January 24, 2012, 10:14pm

I routinely remove sub-sonic frequencies from my LP captures. I have a peak at about 8 Hz (turntable rumble), so I apply a 15 Hz 24 dB/octave HPF, or use Brian Davies’ DeNoiseLF.

The level of the sub-sonic stuff is not too bad, but I see no reason not to remove it.
Screen Shot 2012-01-24 at 5.07.14 PM.png
Screen Shot 2012-01-24 at 5.08.30 PM.png
Screen Shot 2012-01-24 at 5.09.12 PM.png
– Bill

waxcylinder · January 25, 2012, 10:16am

No, no, no - I agree with PGA that this is a tremendously useful learning experience (I may even add the gems from this to the tutorials if I can find a suitable place for them).

Your insights here are much appreciated Steve.

Well spotted Steve
That is indeed the onboard SoundMax on my desktop PC, the only one of the three soundcards that I used that exhibits any noticeable DC offset (you may remember, Steve, that we discussed it on a recent thread). I have never usually bothered to remove it as it was so slight but now I’m thinking maybe I should. But I don’t get clicks at my edit points.

Phew… that’s a relief then. I wasn’t fancying revising my thousands of archived WAV files derived from LP tracks to process them for Eq rolloff or high-pass filter - nor re-loadin’ ’em all into my iTunes library.

Peter.