recording with sony pcm d50, then direct to cd

Hi folks. i have a bunch of questions.

the goal is to record a live concert – solo singer-acoustic guitarist, live audience, medium-sized room i won’t get to scope out much ahead of time – using a sony pcm d50 stereo recorder, then take it to CD with as little processing as possible. to make it even more interesting, i won’t have had much time to experiment with the recorder before the gig. and i’ll be using its built-in microphones, which ought to be more than sufficient.

the little dab of post will be, i hope, limited to fading in and out the tracks.

my initial thought is to use the little dab of dither sony allows, making a 44.1/16 recording effectively 44.1/20 for a little extra dab of dynamic range. that, plus judicious gain setting, plus the supposedly great limiter, ought to give me a pretty clean recording.

so, now, my questions:

first, anyone here done that kind of recording with this machine? i’m interested especially in placement. my thought is to place it pretty close, as in on the mic stand itself. make sense? what would be better?

second, i had figured on using audacity to cut the thing up before burning it to red book CD. but my sense is that by merely bringing it into audacity i’m already altering the recording to some minuscule extent. the file will be a bog standard .wav. if this is so, might i be better off just recording at 24/96 and letting audacity do all the dithering? again, the goal here is to do as little as possible to the recording in post-production.

third, while the sony “super bit mapping” is supposed to be just great, it also supplies its own dithering – just 20 to 16, but still – and i’m wondering if this could somewhere cause more trouble than it’s worth in audacity, if i record at 44.1 using it.

finally, in that all i want to do is fade the cuts in and out, is there something that would be better than audacity for doing it? maybe a very simple .wav editor of some sort (i’m running kubuntu linux).

thanks for suggestions and advice.


IMHO, the recording conditions in the room, the proximity to the performers, and similar damage are going to be far worse than anything the digital system is going to do to it.

Are you recording this for the benefit of the performers? Audacity always works at 32-bit floating point inside. It’s a similar trick to Photoshop: both use an internal format far better than any format likely to arrive with a show.


Whose mic stand? The singer? Can you get a house feed? Is there even such a thing? Do you know that the performance is amplified or are you guessing?

If it’s amplified, you are at the complete mercy of the house sound system. If it’s not, you’re at the mercy of the room. Have you been in the room? If you clap, does it come back for fifteen minutes?

I’m interested in what the others have to say. I have made perfectly delightful recordings at 48000/16 bit, but I was under tightly controlled conditions where I personally put the soundproofing in for the shoot.


that’s precisely the issue in the room: never been there and won’t be there until sound check. do not want to take a feed off the board, and yes, i’m thinking the performer’s mic stand – the mic isn’t worked closely and will be on a boom, so i have about a meter distance, unless i do something free-standing, which i may well be able to do if that’s too close. i know the significance of positioning; was hoping someone here may have used this recorder in this way – seems the standard recorder test is acoustic guitar, but no one ever specifies where they put the silly recorder relative to the instrument. (and yeah, i know that the room has a lot to do with it, but i’m hoping to minimize room effects.)

so audacity will deal with it in 32 bits even if going in and coming out everything is set to 16? even if default preferences are set to 16? i literally want to do nothing but cut it up and add a quick fade in and a slightly slower fade out.

goal is to see just how good a quality can be gained from doing as little as possible to the recording.

It sounds like you have a pretty good handle on the recording set-up end of things. The picture on page 9 of the manual coincidentally shows the device being used to record a singer-guitarist!
Some suggestions:

  1. Bring your own stand – a mic stand or tripod – so you can position the recorder independently of the stage mics. Bring a good set of headphones, make some recordings during soundcheck from several positions, and choose the one you like best.
  2. Forget the built-in limiter. Record at 24-bit 44.1 kHz and set your recording levels low enough that you will never get overload. Ask the performer to do their loudest bit and set your levels to this, at least 6 dB below max, possibly more. Even if the maximum level you record ends up being -12 dB you’ll still have 22 bits working for you.
  3. Set Audacity’s defaults to 24 bit 44.1 kHz and import the wav files without conversion. Work in 24-bit.
  4. Now is the time to apply some limiting, if needed. At least with Audacity you get to try different settings to find one you like (or decide you don’t like any of them), whereas if you use the Sony’s built-in limiter you’re stuck with it.
  5. When the post production is finished, export to 16 bit 44.1 kHz and let Audacity do the dithering.
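The headroom arithmetic in point 2 can be sketched in a few lines (a rough rule of thumb – each bit of an integer PCM format covers about 6.02 dB; the function name is mine):

```python
import math

def effective_bits(bit_depth, peak_dbfs):
    """Bits still doing useful work when the loudest peak sits below
    full scale: roughly one bit is given up per 6.02 dB of headroom."""
    return bit_depth - abs(peak_dbfs) / (20 * math.log10(2))

print(round(effective_bits(24, -12), 1))  # ~22.0 bits left at -12 dBFS
print(round(effective_bits(16, -12), 1))  # ~14.0 bits left at -12 dBFS
```

So the same -12 dB safety margin that costs a 16-bit recording dearly barely dents a 24-bit one.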

Hope this helps. I’d be interested to know how the project works out.

– Bill

ah. many thanks.

funny – i used to do a lot of this kind of thing, but it was when sound editing involved a splice block and a razorblade! indeed, people could have made money by not investing in the things i mastered – darkroom work, tape editing.

i’ll certainly let you know how it works out.

Oh dear, sounds like we’re from the same era :wink:

I used to occasionally do this sort of thing, recording classical ensembles with coincident mics into a ReVox with dbx.

– Bill

best stuff i ever got of a group – chamber orchestra in this case – was done through the sennheiser binaural mic (on that silly plastic head with the good ears) into a half-track ampex. would be amusing to consider the s/n of those days, even at high tape speed, vs. even really cheapo gear now. though i wish i still had that mic. wasn’t the flattest thing around, but it sure did put things where they belonged spatially. kinda like the original bose speakers.

Me and my Ampex 350 have no idea what any of you people are talking about.

48000/16-bit is the television broadcast standard. The tone level (overall average sound level) is -20dB in the US. I believe it’s -18dB in Europe. That’s 20dB of headroom and a 60dB noise floor. Any singer I ever met can blow 6dB right out of the water on an expressive peak. Even Consumer DV, widely reputed to be inadequate, uses -12dB.

We did have some Audacity installs, on a Mac, I believe, that had troubles with 24-bit sound. I’ve always been really leery about that bit depth.

I would tell you to test your brains out, except you can’t.


the -20 is, i think, pre-emphasis and came to us by law in connection with fm stations. it’s a real pain, too, because it truly screws up s/n. which is why we got compressors, the goal being to make the quiet parts as loud as possible. which then became the standard for how most music sounds (producers thinking that if it’s gonna get smashed, might as well do it in the studio and retain control). for a time i had a dbx compander, the idea being that i could record vinyl to tape with it all smashed flat, then play back with it expanded again and lose all the tape hiss. problem is, there was a lot of processing noise.

dynamic range isn’t nearly as much an issue with digital stuff unless you’re going to put it on commercial fm, in which case it will be smashed against the top no matter what you do.

as to the 16-vs.-24 thing – wonder if that falls victim to the widely supposed exact multiples thing?

a good discussion of compression hell is here:

<<<the -20 is, i think, pre-emphasis and came to us by law in connection with fm stations.>>>

Don’t believe that too much. It’s the digital television standard and it doesn’t have pre-emphasis. And it doesn’t have any trouble making it to my living room in digital form…

I’ve been doing this for almost a year and I went back to the analog TV in the corner (before we turned analog off) in order to record one show while I watched another. It was pretty dreadful and I have terrific television reception. I guess I used to watch this all the time and it was normal.

I think rental CDs and television reception should look the same now. Perfect.

That’s not to say it all actually comes out the same. There are two shows on PBS with distorted audio. Badly distorted. I wrote to the producers wondering if anybody on the production staff actually listened to the air show.


the pre-emphasis may not be necessary in a digital age, but no matter: it’s what broadcasters already have. it is the same as fm pre-emph causing music to get compressed to a fare-thee-well and now becoming the standard. the gov’t might require all teevee broadcasters to go digital, but those broadcasters, especially during a time when revenues are way down, are not going to replace their entire audio chain or retrain their engineers. and in all probability, there’s some requirement, once thought useful in analog broadcasting, that hasn’t been repealed. this is largely speculation (the part about broadcast music, alas, is not), but i bet it’s on or very close to the mark.

which is part of why i want to make the recording referenced at the top of this thread as close to “direct to disc” as possible.

Regarding 24-bit files, you could make a recording of anything on the Sony at 24/44.1, transfer that file to Audacity, bang on it really hard and see if anything breaks.

Regarding recording levels, and assuming Audacity on Ubuntu doesn’t choke on 24-bit files, if you’re concerned that the vocalist could be 12 dB or more louder during performance than during soundcheck, go for -18 dB – you’ll still have 21 bits to work with. Remember that we asked the performer to do their loudest bit during soundcheck to set levels. Once you get the file into Audacity do Effects > Amplify to bring the peaks up to 0.

– Bill

thing is, my goal is to do the absolute minimum post-production possible, with the result being a burn to red book audio CD. i really need in post only to cut it into tracks (and i could do this in the recorder, even after the fact, but i’d just as soon not) and to fade in and out so the cuts won’t be jarring.

my thoughts as to the recording (insofar as i can predict, having never been in the room, though i’ve recorded the artist before) tend toward these:

i am likely to use the limiter, not because limiters are good but because this limiter is good. it constantly grabs a second set of tracks and holds 'em in memory at -12 dB. if the signal being recorded overmodulates, it normalizes the part in memory and uses that instead. of course i plan to set the level so that the limiter will not be necessary, but having it in reserve keeps some sudden loud event from splashing against the ceiling and ruining the cut.
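that dual-path idea can be sketched in a few lines (a conceptual toy based on the description above, not sony’s actual implementation – the function name and the 0.999 normalization target are made up for illustration):

```python
import numpy as np

def dual_path_limit(x, pad_db=12.0):
    """Conceptual dual-path limiter: the hot converter clips at +/-1.0,
    while a copy padded down by pad_db is held in reserve. If the hot
    path ever clipped, normalize the intact reserve copy and use it."""
    pad = 10 ** (-pad_db / 20)            # -12 dB as a linear gain
    hot = np.clip(x, -1.0, 1.0)           # what a single converter records
    reserve = x * pad                     # clean as long as peaks stay under ~+12 dBFS
    if np.max(np.abs(x)) > 1.0:
        return reserve * (0.999 / np.max(np.abs(reserve)))
    return hot

# A peak 6 dB over full scale comes back undistorted, just under 0 dBFS.
t = np.linspace(0, 1, 44100)
out = dual_path_limit(2.0 * np.sin(2 * np.pi * 440 * t))
```

the point being that the overloaded passage comes back as a clean, quieter copy rather than a squared-off wave.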

i’ll probably use the onboard mics in x-y pattern, so relation between voice and guitar will be a placement issue, which is potentially a problem for me in that the audience probably won’t like it if it’s too close and blocks the view of the artist at all. there, i’m stuck with the best i can do.

what mostly puzzles me is the bit depth. this recorder has a facility that does a little dab of shaped dithering such that it effectively records to a 20-bit depth while actually making a 16-bit file. that’s fine, gives me a little headroom. but sony cautions against editing in applications that change the bit depth, citing the potential for sony’s arithmetic arguing with the application’s arithmetic. which is why i’d like, if i could, to keep it 16-bits during post – that problem then doesn’t arise. but i don’t know if audacity will let me do this, even for something as relatively trivial as fades.

(we’ll have a second show the following night, and i’ll probably pull out the stops and record it 24/96, throw it in audacity, and do whatever i think makes it sound better. then i’ll compare the two and, if anything worth noting as to quality results, try to post samples of both someplace.)

to make it even more fun, i’ll be using the recorder to feed the analog audio line in of a video camera, too, real time, just to see how the result compares to trying to track it later. and if that sounds insane, well, i have no defense to offer!

16 bit will provide enough dynamic range for just about any audio recording (if you have your amp turned up loud enough to hear the bottom few bits, then the top few bits will rattle your windows and set off car alarms in the street), but that assumes everything else is perfect. When recording, nothing is ever perfect (find me a singer who can consistently hold a peak level within +/- 6dB).
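The back-of-envelope figure behind that claim (roughly 6 dB per bit; the function name is mine):

```python
import math

def pcm_dynamic_range_db(bits):
    """Ratio of full scale to one LSB step for integer PCM,
    in dB: about 6.02 dB per bit."""
    return 20 * math.log10(2 ** bits)

print(round(pcm_dynamic_range_db(16), 1))  # ~96.3 dB
print(round(pcm_dynamic_range_db(24), 1))  # ~144.5 dB
```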

Ampex was far more forgiving of peak overload – tape saturation, so long as it is not too extreme, will give quite a musical compression – but had a pretty high inherent noise floor without fancy compression/decompression/pre-emphasis/de-emphasis.

For digital recording, 24/32 bit provides a ridiculously high dynamic range (in most cases far greater than the rest of the recording equipment), but means that you can leave a lot of headroom with no fear of running out of bits at the bottom end. For high quality live recording, I prefer to use as little compression (or other processing) as possible during the recording - usually just a hardware peak limiter to prevent an otherwise good take from being wrecked by a spurious peak.

Although a digital format of 24 bit or higher has enormous dynamic range, the dynamic range of the rest of the system still needs to be considered. Different microphones have different sensitivities and should ideally be kept well within their operating range. Too high a signal and they will compress, then distort, then start doing very nasty things (the sound of condenser plates arcing is emotionally disturbing). Too low a signal and you’re running into microphone self-noise. Ideally you would choose your microphone according to what you are recording. In practice and with a limited budget, good compromises can usually be found by careful microphone placement, but there are limits (you can’t get a “close mic’d” drum sound with a shotgun microphone, and you can’t get a clean recording of speech from 50 feet with a dynamic vocal microphone).

Back to the Sony PCM-D50. I’ve not used one, but I do use a much cheaper recorder of a similar type (Zoom H2) and have had excellent results from it. I would expect the Sony (at 3 x the price) to be better.

The Zoom H2 has 2 recording level controls – there is a simple H/M/L sensitivity switch, and a 0-127 level adjustment. The sensitivity switch is the important one. The switch changes the loading on the microphones and adjusts their sensitivity, whereas the level adjustment scales the A/D conversion much in the same way as amplifying in Audacity. An analogy from the world of digital cameras is that the switch is like an optical zoom, whereas the level adjustment is like a digital zoom. While the digital zoom on a camera can make the picture bigger, it does so by looking at fewer pixels (just the ones from the middle of the picture) and spreading them out, so the picture is bigger, but the image quality goes down – just the same as cropping the picture in PhotoShop.
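The zoom analogy can be made concrete: gain applied after quantization just magnifies the rounding error, whereas gain applied before the converter does not. (A toy model only – the H2’s actual signal path may differ.)

```python
import numpy as np

def quantize(x, bits=16):
    """Round onto the grid of a signed integer format (full scale +/-1.0)."""
    step = 2.0 ** (1 - bits)
    return np.round(x / step) * step

rng = np.random.default_rng(0)
quiet = rng.uniform(-0.01, 0.01, 100_000)    # a very quiet source signal

optical = quantize(quiet * 100)              # sensitivity switch: gain before the converter
digital = quantize(quiet) * 100              # level knob as "digital zoom": gain after

err_optical = np.max(np.abs(optical - quiet * 100))
err_digital = np.max(np.abs(digital - quiet * 100))
# the after-the-fact gain magnifies the quantization error about 100x
```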

Using the Zoom H2 fairly close will usually produce much cleaner recordings than recording at distance, as the amount of extraneous noise pollution and room reverb is generally much greater than it sounds to your ears. Human hearing has a remarkable way of “focussing in” on what you are listening to and filtering out unwanted noise - microphones do not do this. I usually have the recording level set to 100 and leave it at that, and use the level switch to get the recording level in a reasonable (not clipping) range for close up recording. If the recording level goes too high even with the microphone set to the least sensitive setting, then I move the microphone back a bit. The Zoom does not have an internal limiter, just an AGC which is thankfully off by default.

For recording speech or vocals I use a fine mesh pop filter (not the foam wind-shield provided) and (ideally) place the microphone about 30cm from the singer’s mouth. The Zoom H2 also has a configurable pickup pattern (90 degrees front, 120 degrees back, omni) to choose from. When recording in an acoustically nice space, the 120 degree or omni-directional are nice to use. The 90 degree (cardioid) is better for isolating the recording source.

The file format settings go all the way from 48kbps MP3 (which will give about 4 days of low quality recording on a 2GB flash card) up to 24/96 WAV which will give about an hour at very high quality (though this requires a good, fast flash card to handle the data rate).

If I am going to be putting the recording onto CD I will record at 24bit 44.1kHz (CDs use 44.1kHz), then transfer the file over to Audacity via USB (the zoom can be connected to a computer so as to appear as an external disk). As the file is 24 bit, I can adjust the volume virtually losslessly, but Audacity will apply dither when it processes the sound (processing is done at 32 bit) so I will convert to 32bit (this conversion is lossless). If there will only be editing and no processing, the format may be left at 24bit.
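The “lossless” part of that conversion holds because IEEE float32 carries a 24-bit significand, so every 24-bit PCM sample code is representable exactly. A quick check (assuming a plain integer-to-float widening):

```python
import numpy as np

# Every 24-bit integer sample code fits exactly in a float32 significand,
# so widening 24-bit PCM to 32-bit float loses nothing.
codes = np.arange(-2**23, 2**23, 997, dtype=np.int64)  # a sweep of 24-bit codes
round_trip = codes.astype(np.float32).astype(np.int64)
print(np.array_equal(codes, round_trip))  # True: the round trip is exact
```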

If there are any significant high peaks, I will probably apply a bit of compression using the SC4 compressor plug-in (rather than Chris’s Dynamic compressor plug-in).

While Chris’s Dynamic Compressor is excellent, it affects virtually the entire dynamic range, and this is usually not what I want. Chris’s Dynamic Compressor is brilliant for levelling out (reducing) dynamics, and so is ideal for making CDs that you want to listen to in noisy environments (such as in a car), but for listening to music I would usually like to be able to hear the dynamics as they are an essential expressive part of the music. All I want to do is to drop the peaks a little so that I can get a good (not too quiet) level on the CD without clipping, and for this the SC4 compressor is better.

After editing, I will Amplify to around -0.3dB and Export as 16bit WAV for burning to CD (allowing Audacity to apply dither). There is a choice of dither settings that can be argued about, but the default setting is good. If you want to be totally obsessive about sound quality, Audacity can be built from the source-code to use “libsamplerate” instead of the default “libresample”. This option should not be used in conjunction with VST plug-in support.
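A minimal sketch of what that final dithered export does, using plain triangular (TPDF) dither – Audacity’s default “shaped” option is more elaborate, and the function name here is made up:

```python
import numpy as np

def export_16bit(x, rng=np.random.default_rng(1)):
    """Reduce float samples (full scale +/-1.0) to 16-bit codes with
    triangular (TPDF) dither: +/-1 LSB of triangular noise added before
    rounding turns correlated quantization distortion into a steady hiss."""
    lsb = 1.0 / 32768
    noise = (rng.uniform(-0.5, 0.5, x.shape) + rng.uniform(-0.5, 0.5, x.shape)) * lsb
    return np.clip(np.round((x + noise) / lsb), -32768, 32767).astype(np.int16)

# A tone at 0.4 LSB vanishes under plain rounding but survives dithering.
t = np.arange(44100)
tiny = (0.4 / 32768) * np.sin(2 * np.pi * 440 * t / 44100)
plain = np.round(tiny * 32768).astype(np.int16)   # no dither: rounds to silence
dithered = export_16bit(tiny)
```

The dithered version is noisier sample by sample, but the tone is still in there, which is the whole point of dithering the bit-depth reduction.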

First off, I agree with what stevethefiddle just wrote, especially recording at 24/44.1, then working in Audacity at 24 bit.
Second, from your description it appears that what the Sony is doing is taking 20 bit samples and converting to 16 bit with dither, resulting in a 16/44.1 file. Personally, I’d avoid that and record at 24/44.1. This kind of implies that the internal converters are only good to 20 bits anyway (which is still plenty). If true that means the bottom 4 bits of a 24 bit recording will be garbage, but who cares at that level, especially if you’ll be converting to 16 bit for the burn to CD. If you do record at 16 bit I can’t understand how further processing would “argue with Sony’s arithmetic”. Fades and level changes do not affect the bit depth anyway. In fact, no effects in Audacity affect the bit depth. It’s only when you export to the final 16/44.1 WAV that the bit depth is changed, with dithering. If you record at 16 bit then no changes to the bit depth will be made. If you record at 24 bit then there’s no concern about Sony’s 20-to-16-bit dithered conversion. I’d avoid doing a 24/96 recording, because you’ll then need to do sample-rate conversion when exporting the WAV files.
Third, your description of how you intend to use the Sony’s built-in limiter is right on – set your recording levels so that you expect the limiter to never be triggered, but it will be there “just in case”.
Fourth, you’re doing an X-Y recording, not close micing the guitar and voice. With the device pointed at the performer what you’ll get is the voice and guitar pretty much up the middle with room ambience all around. As I’m sure you understand, moving the mics closer to the performer means less room ambience and vice versa. Up/down may affect the balance between guitar and voice depending on how close the mics are to the performer.

Something just occurred to me - will there be stage monitors? If so, it would be nice to position the monitors relative to the mics so that the mics pick up as little of the monitors as possible. And the monitors should only be as loud as absolutely necessary, as there will likely be splash from the monitors off the wall behind the performer.

About Audacity and 32-bit floating point – here’s what the manual says: “By default, Audacity uses 32-bit floating-point samples internally while you are working on a project and exports your final mix using 16-bit integers. This gives you somewhat better quality than audio programs that use purely 16-bit or 24-bit audio samples. Audacity’s default sample format can be configured in the Quality Preferences or set individually for each track in the Track Drop-Down Menu.” What it says to me is this. If the Quality Preferences are set to 32-bit float and you import a 16-bit or 24-bit PCM file then Audacity will convert it internally to 32-bit float, do all its processing at 32-bit float, then convert to 16-bit PCM on Export. If the Quality Preferences are set to 24-bit PCM and you import a 24-bit PCM file then Audacity does no conversion on import, does all its processing at 24-bit PCM, then converts to 16-bit PCM on Export.

– Bill

Not quite.
There is no problem with increasing the bit depth as Audacity can increase the bit depth perfectly without the need for any dithering. The issue comes when reducing the bit depth. How should the conversion handle values that lie between bit values when there are fewer bits? This is what dither is used for (there’s a pretty good explanation on Wikipedia).

When you import a file into a project, if the bit depth of the imported audio is higher than the project bit depth, then the higher bit depth will be retained.
If the imported audio has a lesser bit depth than the project, then the bit depth will be increased. This way, maximum fidelity is maintained.
Audacity 1.3.9 can handle multiple tracks with different bit depths.

When you process audio, the processing is done in 32 bit, and the audio is returned to the track at the track bit depth. If the track bit depth is less than 32 bit, Audacity will dither the audio when it reduces it back down to the lesser bit depth. This allows higher quality than processing at the lower bit depth, but not as high as would be achieved if the track was at a higher bit depth.

Let me see if I can think of a good example to illustrate:


  1. Create 3 tracks (it does not matter what the default quality settings are)
  2. From the track drop down menu (click on the track name), set the first track to 16 bit, the second to 24 bit, and the third to 32 bit.
  3. Select all 3 tracks and generate 30 seconds of silence.

At this point you can do a little test - select any of the tracks and call up the “Amplify” dialogue from the effects menu. You will see that it says “New Peak Amplitude: -infinity”. This is because all of the sample values are exactly 0.00000

  4. Select all 3 tracks and apply the “Fade In” effect. Audacity applies the effect.

All of our tracks contained silence, so the result of “fading in silence” should be “silence” – right? NO! :open_mouth:

  5. Repeat the test of calling up the Amplify effect.
    The 32 bit track indicates “New Peak Amplitude: -infinity” (the samples are still all at 0.00000) but the 24 bit track and the 16 bit track show (after 50dB of amplification) a new peak level of -69.4 dB and -20.3 dB respectively if “shaped” dither was used (-88.5 dB and -40.3 dB if triangular dither was used).
    What we are seeing here is the dither noise that is created when the (32 bit) processed sound is converted back to the track bit depth.

The way that we can ensure that dither is only applied once (when the finished recording is Exported to 16 bit) is to convert it to 32 bit before we do any processing.
This is probably the reason that Audacity has a default quality setting of 32 bit.
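The spirit of that experiment can be reproduced outside Audacity. A sketch with plain TPDF dither (the “shaped” option would show a higher peak, as the numbers above indicate):

```python
import numpy as np

# "Processed silence" returned to a 16-bit track picks up the dither noise
# added when the 32-bit float result is reduced back to the track bit depth.
rng = np.random.default_rng(0)
lsb = 1.0 / 32768
n = 30 * 44100                                        # 30 seconds at 44.1 kHz
tpdf = (rng.uniform(-0.5, 0.5, n) + rng.uniform(-0.5, 0.5, n)) * lsb
on_grid = np.round((np.zeros(n) + tpdf) / lsb) * lsb  # back on the 16-bit grid
peak_db = 20 * np.log10(np.max(np.abs(on_grid)))
print(round(peak_db, 1))  # -90.3 dB, i.e. -40.3 dB after 50 dB of amplification
```

The peak lands at exactly one 16-bit LSB, which matches the -40.3 dB figure reported for triangular dither after the 50 dB Amplify test.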

And please remember that 44100 is a compromise sample rate. It’s only pure up to about 17 kHz and plays tricks to get the rest of the way to 20 kHz. That’s why I use 48000 which will take you up to almost 19 kHz with no tricks. Everybody recognizes both. All the test samples on my web site are at 48000.

It’s still true that the environment and microphone and analog provisions are going to contribute far more damage than the digital system.

Someone posted they would not get a sound feed from the house system. Why?

I do have an odd observation. House feeds only work on the US Pacific Coast. I grew up on the east coast and experienced distortion and feedback and damage of all sorts in public presentations and I naturally assumed that was normal.

It’s not. I went to my first public, not particularly special, presentation in California and was stunned how nice the sound was. And it continued that way. Show after show, lecture after lecture, had great sound. Why is that?


Thanks for the explanation. Nice to know these details. Perhaps this could be integrated into the manual under the bit from the manual that I quoted in my post (in the Sample Rates section).

By default, Audacity uses 32-bit floating-point samples internally while you are working on a project and exports your final mix using 16-bit integers. This gives you somewhat better quality than audio programs that use purely 16-bit or 24-bit audio samples. Audacity’s default sample format can be configured in the Quality Preferences or set individually for each track in the Track Drop-Down Menu.

The use of the word “default” leaves open the interpretation that I made, IMO.

– Bill

Just on my way out at the moment, but if you would like to suggest how the wording could be improved so that it is more clear, I’ll get it checked out for accuracy and update the entry in the manual.

Alternatively, you could sign up to help with the manual by contacting Gale (see this page )

I’d combine a modified version of what’s in the manual with your expansion on it, as follows:

Audacity uses 32-bit floating-point samples internally while you are working on a project and exports your final mix using 16-bit integers. Audacity’s default sample format during recording can be configured in the Quality Preferences or set individually for each track in the Track Drop-Down Menu.

When you import a file into a project, if the bit depth of the imported audio is higher than the project bit depth [as set in the Quality Preferences], then the higher bit depth will be retained.
If the imported audio has a lesser bit depth than the project bit depth, then the bit depth of the imported file will be increased to match the project bit depth. This way, maximum fidelity is maintained.
Audacity can handle multiple tracks with different bit depths.

When you process audio, the processing is done in 32-bit floating-point, and the audio is returned to the track at the track bit depth. If the track bit depth is less than 32-bit floating-point, Audacity will dither the audio when it reduces it back down to the lesser bit depth. This allows higher quality than processing at the lower bit depth of the track, but not as high as would be achieved if the track bit depth was 32-bit floating-point.

I think the “during recording” qualifier is needed to differentiate the three things that are happening here (as I understand it so far :slight_smile: ). 1. All internal processing is done at 32-bit floating-point regardless of other settings. 2. Recording is done according to the Quality Preferences setting. 3. When importing, the highest available bit depth is used: either the bit depth of the file being imported, or the Quality Preferences setting.

Have I (finally!) got this right?

– Bill