[Settings for recording] "Audio to buffer" and WASAPI vs MME

Hello,

I have some points to ask:

  1. How can I be sure the setting “Recording: Audio to buffer” has a safe value? I found that when it is too low, the recording does not start (your help page says so too). But I also found that when I start the recording, it can be blocked briefly and then start. That is why I wonder if the recording can be blocked or something similar and start again “as it wants”. And if yes, how can I detect this if I record for 30 mins for example? Is there a blank sound or something similar? Or do I need to listen fully and detect it myself?
  2. Can you explain more precisely how the setting “Recording: Audio to buffer” works when recording? Compared to playback (this may be a silly question), I expected to see the recorded sound about 5 seconds later, as in playback mode (if “Audio to buffer” = 5000 ms), instead of seeing the waveform instantly.
  3. What is the value of the setting “Recording: Audio to buffer” when WASAPI is selected? 10 ms (I read that this is the default value in Windows, but I do not know if that is true), or something else? Related to the first point, will it be possible to raise this value once this setting works for WASAPI?
  4. Quoted from https://msdn.microsoft.com/en-us/windows/hardware/drivers/audio/low-latency-audio#FAQ : “In summary, each application type has different needs regarding audio latency. If an application does not need low latency, then it should not use the new APIs for low latency.”
    → My conclusion: Use MME (in Audacity) instead of WASAPI if you want the “best quality”, is it true?

Moved to the correct Windows board.


Gale

Why is “Audio to buffer” such a concern for you? Are you doing overdubs? If not, I suggest leaving the setting on the default 100 ms. As you noted, the setting has no relationship to how soon the waves are drawn on the screen.

As it says in the 2.1.3 Release Notes, “Audio to buffer” cannot be used to adjust recording latency using WASAPI host.

If you are recording with “Software Playthrough” on, the buffer is at a fixed setting depending on the host choice, and changing “Audio to buffer” will have no effect.

It is impossible to generalise that MME gives better quality than WASAPI. It depends what you are recording and what audio devices you are using and what drivers those devices have. You gave us no information about that. Older devices may quite conceivably work better with MME or Windows DirectSound but on Windows Vista and later those two hosts are still emulated on top of WASAPI, and so go through WASAPI.


Gale

There is an article here you might find useful http://www.soundonsound.com/sos/jan05/articles/pcmusician.htm#1.


Gale

I’m not an expert but I’ll share what I know…

There’s a lot to it. There is a FREE online book about optimizing your computer for audio called Glitch Free. The book is mostly geared toward musicians using a computer for live performance (where latency is important) but there is lots of useful information for recording too.

IMO - The BEST solution if you need low latency (to monitor yourself while recording) is to get an interface with direct-hardware, zero-latency monitoring (where the monitoring signal doesn’t go through the computer). Then you can use a big buffer (long latency) and not worry about it.

Or, if you don’t need to monitor yourself while performing & recording you can go-ahead and use a big buffer. The ONLY downside to a big buffer is latency (delay) and in many situations there is no downside to a few milliseconds of delay.

That is why I wonder if the recording can be blocked or something similar and start again “as it wants”.

Yes, that’s right. A buffer is a “holding tank”. When recording, the audio data stream flows smoothly out of the ADC into the buffer. Then when the multitasking system gets around to it, it reads the buffer and transfers the data to your hard drive (or in some cases, RAM) in a quick burst. If the buffer doesn’t get read in time, you get buffer overflow and data is lost. (There will be a missing piece of audio). The “glitch” may be heard as a click or pop where the waveforms re-join, or it may just “sound bad” and in some cases you may notice that the playing time is short or that the overall tempo is too fast when you play-back the audio.

The operating system is always multitasking, even when you’re only running one application. But, it’s “safer” to run only one application while recording, and to minimize background operations. Note that you can get buffer overflow without high-overall CPU usage… If some application/driver “hogs” the CPU/data bus for a few milliseconds too long, you get a glitch.

And if yes, how can I detect this if I record for 30 mins for example? Is there a blank sound or something similar? Or do I need to listen fully and detect it myself?

Unfortunately, I don’t know of any way of detecting buffer overflow automatically.
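For what it is worth, here is a rough heuristic sketch, not a reliable detector. It assumes the dropped chunk leaves the waveform re-joined at the wrong place, so the sample-to-sample step at that point is much larger than the typical step. The function name `find_jumps`, the threshold factor and the 440 Hz test tone are all made up for illustration. On real music, loud transients will trigger false alarms and a drop that happens to re-join smoothly will be missed.

```python
import math
import statistics

def find_jumps(samples, factor=5.0):
    """Return indices where the step from the previous sample is
    much larger than the median step (hypothetical heuristic)."""
    diffs = [abs(b - a) for a, b in zip(samples, samples[1:])]
    typical = statistics.median(diffs)
    return [i + 1 for i, d in enumerate(diffs) if d > factor * typical]

RATE = 48000
# Half a second of a clean 440 Hz sine test tone.
tone = [math.sin(2 * math.pi * 440 * n / RATE) for n in range(24000)]
# Simulate a buffer overflow: 200 samples go missing, nothing is inserted.
glitched = tone[:12000] + tone[12200:]

print(find_jumps(tone))      # clean tone: no suspicious steps
print(find_jumps(glitched))  # the re-join point shows up as one big step
```

On a smooth test tone this finds the re-join point. For a 30 minute recording of ordinary program material it would only give you candidate positions to listen to.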

It’s not blank sound (silence). Just as an analogy, if we record ABCDEFG and we get buffer overflow, we could get ACEFG… There’s something missing, but no gap or silence.
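The ABCDEFG analogy can be sketched as a toy simulation. This is only an illustration under made-up assumptions: one letter arrives per "tick", the buffer holds `capacity` letters, and the system empties the buffer only at the ticks listed in `drain_ticks`. None of these names come from Audacity or Windows.

```python
from collections import deque

def record(samples, capacity, drain_ticks):
    """Toy capture buffer: one sample arrives per tick; the system
    drains the buffer to 'disk' only at the given ticks."""
    buffer = deque()
    saved = []
    for tick, sample in enumerate(samples):
        if len(buffer) < capacity:
            buffer.append(sample)
        # else: the buffer is full, so this sample is lost (overflow)
        if tick in drain_ticks:
            saved.extend(buffer)
            buffer.clear()
    saved.extend(buffer)  # final flush when recording stops
    return "".join(saved)

on_time = record("ABCDEFG", capacity=2, drain_ticks={1, 3, 5})
late = record("ABCDEFG", capacity=2, drain_ticks={5})
print(on_time)  # buffer read in time: nothing lost
print(late)     # buffer read too late: letters missing, no gap
```

When the buffer is drained late, the saved result is simply shorter. The surviving letters sit right next to each other, which is why the recording plays short rather than containing silence.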

When you play back (or monitor) there is a playback buffer that works the opposite way… It gets filled in quick bursts and the audio data streams out smoothly to the DAC. With playback, the danger is buffer underflow. With buffer underflow, there will be little gaps of silence (or with video the video can freeze for a moment). i.e. AB…C…D…EF…G.
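The playback side can be sketched the same way. Again this is a toy model with invented names (`play`, `refill_ticks`): the DAC takes one letter per tick, and the system tops up the buffer only at the listed ticks. A "." stands for a tick where the buffer was empty.

```python
from collections import deque

def play(samples, capacity, refill_ticks, total_ticks):
    """Toy playback buffer: refilled in bursts at the given ticks,
    emptied one sample per tick by the 'DAC'."""
    pending = deque(samples)
    buffer = deque()
    heard = []
    for tick in range(total_ticks):
        if tick in refill_ticks:
            while pending and len(buffer) < capacity:
                buffer.append(pending.popleft())
        # An empty buffer means the DAC has nothing to play: silence.
        heard.append(buffer.popleft() if buffer else ".")
    return "".join(heard)

smooth = play("ABCDEFG", capacity=2, refill_ticks={0, 2, 4, 6}, total_ticks=7)
starved = play("ABCDEFG", capacity=2, refill_ticks={0, 3, 5, 8}, total_ticks=9)
print(smooth)   # refills arrive in time
print(starved)  # late refills leave gaps, but no letter is lost
```

Unlike the recording case, nothing is lost here. Every letter is still played, just later, with gaps in between.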

My conclusion: Use MME (in Audacity) instead of WASAPI if you want the “best quality”, is it true?

In the real world, most quality issues are on the analog-side.* All of those protocols are usually better than human hearing. (Of course, that assumes no “glitches” or defects/problems.)

WASAPI has an “Exclusive Mode” which in theory can give you “better quality” because it doesn’t allow resampling or the mixing of sounds. Picky-obsessive people who want “bit perfect” audio use WASAPI exclusive mode or ASIO (a non-Microsoft standard that Audacity doesn’t support). But since it’s less flexible, exclusive mode is trickier to set up.

I’d start with WASAPI, which is the “latest and greatest”, and then try the others if there are problems. Here is some information about the various protocols and their history.



* You can degrade sound quality digitally by using a low-resolution format (like 8-bit/8kHz “telephone quality”) or by using highly-compressed lossy compression (like low-quality low-bitrate MP3).

Thank you for your answers and for your links.


This option is off.

Is it a bug in Audacity? (I understand your answer as “this is not a bug”)

Can you (or other people), if possible, give me examples of how the quality can differ between MME and WASAPI? By quality, I mean the “bit perfect” that @DVDdoug mentioned, I think. Please tell me if your definition of quality is different, because maybe I have the wrong definition.

I think so, certainly so for physical inputs.

Are you willing to say what it is you are actually recording, with what device (make and model number)?

There is some information here: “Missing features - Audacity Support”, and further copious information dotted around in Forum topics written by me and others. You have to use WASAPI host in exclusive mode to get even “close” to “bit perfection”.

To cut a long story short, “bit perfection” is impossible in Audacity if you are recording something and don’t even accept upconversion from one bit depth to another (which should typically be lossless). This is because Audacity captures sound in 32-bit float and you don’t have a 32-bit float recording device - almost no-one has.

If you are importing audio, and just cut audio out without changing the sample values, “bit perfection” is possible. If you want to apply filters that change sample values, bit perfection is impossible unless you have a 32-bit audio file that was never upconverted from a lower bit depth and then export it as 32-bit float WAV. If you try to do it with less than a 32-bit audio file and with no bit depth conversion, then you will add harmonic distortion as the price of bit perfection.

So, for practical purposes, use some other tool than Audacity if bit perfection matters to you above all else, regardless of how the music sounds.


Gale

Well, maybe these points will help you:

  • I am focusing on 48000 Hz, 16 bits;
  • sample rate and bit depth are the same for the source and the recording (in Audacity, that is); my sound card can also record at the same sample rate and bit depth, and it is set to these values;
  • the recording is made from my sound card (Realtek) with “stereo mix”, which I think is the English name (I record the sound the computer is playing).

I wanted to know more general information before specifying.

Please see the information already provided. Recording in “bit perfection” is impossible in Audacity if you don’t accept upconversion of bit depth. Only you can make this decision.

And with Realtek stereo mix - why bother with “bit perfection” when stereo mix already has lossy conversions from digital to analogue (to play it) and back to digital (for Audacity to capture it)? At least use Windows WASAPI (loopback), which is entirely digital.

Please note recording from the internet may require copyright holders’ permission. We assume you have already sought such permission where necessary.


Gale

I assume the source file is “perfect”. If I record with the same bit depth and sample rate, is it not “bit perfect”? In that case, even 32-bit float is not “bit perfect”, then?

I did not know this. I discovered “stereo mix” when I ran tests to record the sound that the computer played. I was using MME in Audacity and had no problems, so I never looked for more information than this, until today.


So at this point, WASAPI is a better choice? Even if “Audio to buffer” is impossible to change (for now at least)?

Do you care about upconversion of bit depth, or not? Once again, Audacity records in 32-bit float. Your recording device cannot record in 32-bit float. You said you wanted to set it to 16-bit. So, the audio is upconverted from 16-bit to 32-bit float when you record it. It is no longer “bit perfect” if you follow the usual definition of that to mean there must be no sample rate or bit depth conversions.

Even if Audacity did not force recording in 32-bit float, Windows would do so, unless you use WASAPI Exclusive Mode.

Upconversions of bit depth are not lossy, generally speaking. It is a good thing to upconvert to 32-bit float, which is why Windows and Audacity do it. 32-bit float is the best format for editing and fast processing.
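A quick way to convince yourself that 16-bit to 32-bit float is lossless is to round-trip every possible 16-bit value. This is a minimal sketch that assumes the common convention of scaling by 32768; it is not taken from Audacity's or Windows' actual conversion code.

```python
import struct

def to_float32(sample16):
    """Scale a 16-bit integer sample to [-1.0, 1.0) and store it
    as a real 32-bit float (via struct) to mimic the upconversion."""
    scaled = sample16 / 32768.0
    return struct.unpack("<f", struct.pack("<f", scaled))[0]

def to_int16(sample_f):
    """Convert the float sample back to a 16-bit integer value."""
    return int(round(sample_f * 32768.0))

# Try all 65536 possible 16-bit sample values.
lossless = all(to_int16(to_float32(v)) == v for v in range(-32768, 32768))
print(lossless)
```

It works because a 32-bit float carries 24 bits of precision, more than enough to hold any 16-bit value exactly. The loss only appears when going the other way, after an effect has produced values that no longer fit in 16 bits.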

Yes. Use WASAPI (loopback) - not WASAPI Stereo Mix.

And before you launch Audacity, go into Windows Sound and enable the “Exclusive Mode” boxes for the audio device which is playing the audio. Leave the Default Format settings alone because Windows ignores them if you choose Exclusive Mode.

Also if you know the sample rate of the song file being streamed on the internet, set Audacity Project Rate (bottom left) to that rate.

Once again, Audio to buffer settings do not matter if you are recording computer playback, as long as the buffer is not set too low. In the case of WASAPI, its default buffer settings will be used by Audacity.


Gale

Does “Effects → Amplify” work better with 32-bit float? I think it is the only effect I would use.

Maybe I did not understand something, but why should WASAPI loopback not be affected by “Audio to buffer”?

Amplify ONLY works in 32-bit float, because all Audacity’s internal processing is in 32-bit float. If your Audacity track is 16-bit because you set Default Sample Format in Quality Preferences to 16-bit, then Amplify will upconvert to 32-bit float to apply the effect, then downconvert to 16-bit to return the audio to the track. This downconversion will apply dither noise, unless you turned that off in “High-quality conversion” in Quality Preferences.

If you leave Default Sample Format at 32-bit float (recommended) then there is no downconversion and so no dither noise applied.

In my opinion, the only point of Audio to buffer is to prevent playback and recording glitches. Audio to buffer would matter in theory if you were monitoring a recording using software playthrough, but in fact Audacity selects its own buffer setting when Software Playthrough is on, so the Audio to buffer setting is ignored.

Audio to buffer also makes no practical difference when you are overdubbing. Where the newly recorded track ends up on the Timeline is always determined by the Latency Correction setting in Recording Preferences.

In your case where you are recording computer playback you cannot monitor that with Software Playthrough, because that would cause feedback echoes.

And in the case of WASAPI, the buffer is preset and not configurable. So, if you use WASAPI loopback, just forget about Audio to buffer.

If you use stereo mix, set Audio to buffer high enough not to cause recording or playback glitches. In most cases the default of 100 ms is fine on Windows.
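To put the 100 ms default in perspective, the buffer length can be converted to sample frames. This is just arithmetic; the helper name `buffer_frames` is made up, and the byte figure assumes stereo 32-bit float (4 bytes per sample).

```python
def buffer_frames(buffer_ms, sample_rate_hz):
    """Number of sample frames that fit in a buffer of the given length."""
    return round(buffer_ms * sample_rate_hz / 1000)

frames_48k = buffer_frames(100, 48000)
frames_44k = buffer_frames(100, 44100)
# Assumed layout: 2 channels, 4 bytes per 32-bit float sample.
bytes_48k = frames_48k * 2 * 4

print(frames_48k, frames_44k, bytes_48k)
```

So at 48000 Hz the default holds 4800 frames, which means the system has roughly a tenth of a second to get around to reading the buffer before data is at risk.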


Gale

And if you turn off “dither” in preferences, then you will get “quantization noise” due to rounding down to 16-bit.

And if that leads to the question which sounds worse, the answer is that quantization noise is usually regarded as worse.
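A small sketch of why that is. It assumes plain rounding for the "no dither" case and simple triangular (TPDF) dither for the other; this is a textbook illustration, not Audacity's actual dither code, and the names are invented. The test signal is a constant level of 0.3 of one 16-bit step, i.e. detail smaller than 16-bit can represent directly.

```python
import random

def to_int16(sample_f, dither=False, rng=random):
    """Convert a float sample in [-1.0, 1.0) to a 16-bit integer value,
    optionally adding triangular (TPDF) dither before rounding."""
    scaled = sample_f * 32768.0
    if dither:
        # Difference of two uniform values: triangular noise in (-1, 1) steps.
        scaled += rng.random() - rng.random()
    return max(-32768, min(32767, round(scaled)))

rng = random.Random(0)  # fixed seed so the run is repeatable
quiet = 0.3 / 32768.0   # a level of 0.3 of one 16-bit step

plain = [to_int16(quiet) for _ in range(20000)]
dithered = [to_int16(quiet, dither=True, rng=rng) for _ in range(20000)]
mean_dithered = sum(dithered) / len(dithered)

print(set(plain))               # without dither the detail vanishes
print(round(mean_dithered, 2))  # with dither it survives on average
```

Without dither the low-level detail is simply rounded away, and the error follows the signal, which is heard as distortion. With dither the output is noisier sample by sample, but on average it still carries the original level.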


Gale

I will no longer use “stereo mix” because it is the farthest from “bit perfect”, as you said.

I am going back to 32-bit float for the logical reasons you gave too. I did not realise that the sound is up-converted to 32-bit float in all cases when it goes through Windows (except in “Exclusive Mode”), which is why I always used the same sample rate and bit depth as the source.

I have one last question, I think, about “Exclusive Mode”. If Windows ignores the “Shared Mode” settings, then I can test with VLC, for example, and play a 48000 Hz sound file with “Shared Mode” set to 44100 Hz. Will Audacity record at 48000 Hz (48000 Hz is also set in the project as a precaution)?

Yes the project rate will set the rate Audacity requests from the device.

We don’t make VLC, so we can’t say what it does. It does have an automatic resampler in the advanced audio preferences, which I have heard is quite aggressive, so it is probably worth turning that off. Windows Media Player I suppose leaves the sample rate at 48000 Hz in the case you mention.

To be sure, I would recommend importing the file into Audacity then you have full control.


Gale

I am not sure I understand how the hosts work with “Shared/Exclusive Mode”. I disabled “Exclusive Mode” in Windows Sound. I tested with Windows Media Player and the sound is OK, unlike with VLC. But I disabled “Exclusive Mode”, so I should hear 44100 Hz because “Shared Mode” is set to this sample rate, no? Audacity seems to work like WMP, but:

WASAPI and MME (MME recorded with “stereo mix”) work correctly in any case (playing or recording), but not Windows DirectSound (like VLC).

There is no need to use Windows DirectSound if you don’t want to.

We recommend WASAPI loopback because this is all digital, unlike Stereo Mix.


Gale

Yes, but “Shared Mode” does not affect recording and playing with these two hosts while “Exclusive Mode” is disabled. So “Shared Mode” should be applied if “Exclusive Mode” is totally disabled, no?