Hi. I’m trying to record digital audio from my TV set top box. I’m using an Alesis io2 sound interface connected to the PC via USB. I’m also using a simple converter to go from the optical SPDIF output on the STB to the coax SPDIF input on the Alesis. I’m running Audacity 2.0.5 and Windows 7.
It all works after a fashion, but I’m dissatisfied with the recording level. With the default settings the audio is overloading and the signal is highly clipped. Other forum users have noted the same thing, without resolution.
Could someone please explain this? I had assumed that once the signal was in the digital domain there would be no need for level adjustment. The signal is digitised at its source to the correct level. After that it’s just a load of numbers, which, by definition, must be within the correct range. If the samples are 24 bits, then every possible 24 bit value should be valid and none of them should cause overload. In fact, there should be no need for a recording level control at all. Or am I missing something?
I can achieve a satisfactory recording if I turn the recording level control down to about 5% of maximum. But I’d really like to understand what’s happening. I’m looking for bit perfect audio. I’m worried I may be getting re-encoding and loss of quality.
When I converted my MiniDiscs I experienced this with a few of them - and bearing in mind that these were all home recordings made by me, I was surprised by the level differences and the clipping.
The only solution I found was to record those particular MiniDiscs by taking the analog RCA out from the MiniDisc deck and feeding that to my USB soundcard via its RCA inputs. Not ideal as, like the original poster, I wanted to stay in the digital domain throughout - but needs must …
I’ve heard of that problem before, but I thought it was an issue with the S/PDIF outputs on some MD players being too “hot”, rather than a more general problem that S/PDIF inputs on sound cards were too hot.
It’s digitised at the level it’s received at, clipped or not.
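To illustrate how samples that were valid at the source can still end up clipped, here is a minimal Python sketch. It assumes that some stage between the S/PDIF input and the application applies gain in the digital domain, which is only one possible explanation and is not confirmed anywhere in this thread. The function names and sample values are made up for the illustration.

```python
# Hypothetical illustration: digital gain applied to valid 24-bit samples.
INT24_MAX = 2**23 - 1      # 8388607
INT24_MIN = -(2**23)       # -8388608


def apply_gain_24bit(samples, gain):
    """Scale integer samples by `gain` and saturate to the 24-bit range."""
    out = []
    for s in samples:
        scaled = int(round(s * gain))
        out.append(max(INT24_MIN, min(INT24_MAX, scaled)))
    return out


def count_clipped(samples):
    """Count samples sitting exactly on a 24-bit rail."""
    return sum(1 for s in samples if s in (INT24_MIN, INT24_MAX))


# Every value here is a legal 24-bit sample.
source = [0, 2_000_000, 6_000_000, -6_000_000, 8_000_000]

unity = apply_gain_24bit(source, 1.0)    # passes through unchanged
boosted = apply_gain_24bit(source, 2.0)  # louder samples hit the rails

print(count_clipped(unity), count_clipped(boosted))
```

With unity gain nothing is altered, which is the behaviour you want for a bit-for-bit copy. Any gain above 1.0 pushes the louder samples onto the rails, and turning the level back down afterwards cannot restore them.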
You won’t even begin to get that unless you choose the Windows DirectSound host in Audacity’s Device Toolbar, with “Exclusive Mode” enabled in Windows “Sound”.
And you won’t get 24-bit recording (only zero-padded 16-bit) unless you use the 2.0.6-alpha Nightly Builds and choose Windows WDM-KS or Windows WASAPI host. Note that Windows WDM-KS may not be compatible with your device and may cause a computer crash or Audacity freeze. Ensuring you have correct (non-ASIO) drivers may help if a crash or freeze occurs.
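If you want to check whether a capture is really 24-bit or only zero-padded 16-bit, a rough test is to look at the lowest byte of every sample. The Python sketch below assumes you have the raw sample data as packed 24-bit little-endian PCM (the layout used inside a 24-bit WAV file). The helper names are hypothetical and this is not something Audacity provides.

```python
def low_byte_always_zero(pcm24: bytes) -> bool:
    """Return True if every 24-bit little-endian sample has a zero low byte.

    A True result on real programme material suggests 16-bit audio that
    was padded to 24 bits. Silence or a very short clip proves nothing.
    """
    if len(pcm24) % 3 != 0:
        raise ValueError("length must be a multiple of 3 bytes")
    # Byte 0 of each 3-byte group is the least significant byte.
    return all(b == 0 for b in pcm24[0::3])


def pack24(values):
    """Pack signed integers as 24-bit little-endian samples (test helper)."""
    return b"".join((v & 0xFFFFFF).to_bytes(3, "little") for v in values)


# 16-bit values shifted up by 8 bits: what zero-padding produces.
padded = pack24([s << 8 for s in (1000, -1000, 32767, -32768)])
# Values that use the bottom 8 bits: genuine 24-bit resolution.
genuine = pack24([256001, -123457, 8388607])

print(low_byte_always_zero(padded), low_byte_always_zero(genuine))
```

This only tells you about the data you feed it, so run it on a reasonably long stretch of audio with real content.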
Thanks for the replies. It looks like Gale may have some inside information from the dev team.
As suggested I downloaded the latest alpha and it has an extra option in the first box “WDM-KS”. I’ve no idea what that means, but early results with it are promising. My objective is to make a direct bit-for-bit recording of the incoming SPDIF stream. I’m not sure of the exact format. I’m assuming it’s 48kHz and 24 bit, but I may be wrong.
I set the Alesis io2 device to exclusive mode in Windows Control Panel > Sound. Then I selected WDM-KS and chose the Alesis as recording source. This seems to do what I want. The recording defaults to the correct level without clipping and the input level control is disabled so you can’t change it. I thought I would need to match exactly the sample rate in Audacity and the incoming sample rate, but it seems to work with whatever rate I select. Can you please confirm whether this should give me the bit perfect result I’m seeking?
Incidentally I couldn’t get it to work properly using WASAPI or Windows DirectSound. I still got overloading in both cases with the alpha software.
Thanks for your help. Can you refer me to any other docs that explain how it all works and what WDM-KS does?
It’s a Windows interface between applications (such as Audacity) and the sound card driver, a successor to MME and Windows DirectSound.
WDM-KS bypasses kernel mixing, allowing the audio application to access the driver’s kernel module directly. As a result it has the lowest latency of all the Windows audio interfaces, almost matching the low latencies achieved by ASIO.
Yes it supports 24-bit and 48000 Hz but it is not limited to 48000 Hz.
In fact with WDM-KS the Exclusive Mode boxes should have no effect - WDM-KS by definition gives the application exclusive access to the device.
I think you “should” be able to specify 48000 Hz in Project Rate bottom left of Audacity and have the actual rate bottom right of Audacity be 48000 Hz even if Windows Default Format is set to another rate, but I find that doesn’t always happen even with the same audio input. Audacity’s WDM-KS support is still experimental.
So I suggest you set Audacity Project Rate and Default Format to 48000 Hz if that is the rate you require, and definitely set Alesis to 48000 if it has its own control for that anywhere.
Not quite, because Audacity forces capture to 32-bit (irrespective of Default Sample Format in Quality Preferences). So there will always be an upconversion from 24-bit to 32-bit on capture, though many would regard that upconversion as essentially lossless.
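The reason the upconversion can be regarded as lossless is that a 32-bit float has a 24-bit significand, so every 24-bit integer sample has an exact float representation. The Python sketch below checks this by round-tripping values through a 32-bit float. It assumes the common convention of scaling by 2^23; the exact scaling Audacity uses internally is not stated in this thread, and the function names are illustrative only.

```python
import struct

SCALE = 2**23  # assumed full-scale factor for 24-bit samples


def to_float32(x: float) -> float:
    """Round a Python float to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def int24_to_float32(n: int) -> float:
    """Convert a 24-bit integer sample to a 32-bit float in [-1.0, 1.0)."""
    return to_float32(n / SCALE)


def float32_to_int24(f: float) -> int:
    """Convert a float sample back to a 24-bit integer."""
    return int(round(f * SCALE))


def round_trip_is_exact(values) -> bool:
    """True if every value survives int24 -> float32 -> int24 unchanged."""
    return all(float32_to_int24(int24_to_float32(n)) == n for n in values)


# Both extremes plus a spread of values across the whole 24-bit range.
test_values = [-(2**23), -1, 0, 1, 2**23 - 1] + list(range(-(2**23), 2**23, 4099))

print(round_trip_is_exact(test_values))
```

The round trip only stays exact as long as nothing modifies the samples in between and no dither is added on the way back down.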
If your Default Sample Format is 32-bit float (the default choice) then after the upconversion on capture the only conversion will be downconversion if you export to a 24-bit format (you may want to turn off High-quality Dither in Quality Preferences).
If you change Default Sample Format to 24-bit then after the upconversion on capture there will be downconversion to store the audio in the track as 24-bit. And although there will be no downconversion on export, you should know that Audacity processes audio in 32-bit float irrespective of Default Sample Format or the format of the stored audio. So if you apply filters or any effect that modifies the sample values, there will be upconversion to 32-bit to run the effect then downconversion to 24-bit to return the audio to the track, and you may prefer to leave Default Sample Format at 32-bit.
Note however that on Vista and later, MME, Windows DirectSound (even Exclusive Mode) and WASAPI Shared (non-Exclusive) will do conversion to and from 32-bit float before Audacity even gets the audio. And MME will resample to 44100 Hz on any version of Windows.
Thanks, Gale. I think I now have the result I want using WDM-KS. I don’t mind if Audacity up-converts from 24 to 32 bit. That’s going in the right direction. I hope the WDM-KS feature makes it into production soon.
Some of the points in your post deserve to be better known, e.g. Audacity does all its internal processing in 32-bit float. So you should keep the default quality at 32-bit to avoid unnecessary conversions back and forth.
Also all the stuff about Windows and how it messes with the audio before it gets to Audacity is difficult to understand. Maybe a future article in the Wiki might cover this? Would it have been any easier if I’d used Linux? Or is that another can of worms?
Thanks for pointing out there are two sample rates displayed in Audacity. I now see that default project rate is at the bottom left and actual rate at the bottom right (and another rate on the actual track). The correct value is displayed on the right, even if I select something silly on the left.
A funny thing to report. This is on 64-bit Windows 8.1 Update 1 with Audacity version 2.0.4 (which I kept for the WDM-KS support). My Windows Default Format is set to 192 000 Hz.
If I open “music_a.wav” as posted in another thread on this very forum, the “Project Rate” (bottom left of Audacity) is always shown as “22050 Hz”, but I get different values for the “Actual Rate” (bottom right of Audacity) depending on which “Audio Host” I select.
If I choose MME as “Audio Host” from the toolbar, and the speakers as output, the “Actual Rate” is shown as “22050 Hz”.
Same thing if I choose Windows DirectSound as “Audio Host”.
Now if I choose “WASAPI” as “Audio Host”, the “Actual Rate” is shown as “192 000 Hz”!
And if I choose “Windows WDM-KS” as “Audio Host”, the “Actual Rate” is shown as “44100 Hz”!
So it is difficult to know what “Actual Rate” (bottom right of Audacity) might exactly mean!
Most of the information is on Wiki or in the Manual but it is rather scattered between articles.
Vista introduced a new Windows audio stack and in essence those changes have persisted in Windows 7 and Windows 8.
http://wiki.audacityteam.org/wiki/Windows_7_OS#sample_rates should probably point out that Windows DirectSound Exclusive Mode does not guarantee no intermediate processing on Vista and later because of the Windows conversion to 32-bit float and back. An Audacity developer found out that information fairly recently when looking at Microsoft documentation but it’s not clear if that conversion absolutely always happens.
I know less about Linux audio than Windows, but most Linux distributions use PulseAudio which is a sound server/mixer that sits between applications and the ALSA (sound) API of the Linux kernel. Choosing the “default” input or output in Audacity will route audio through that pulse “middleman” which then has its own rules about handling sample rate and bit depth conversions.
If you choose the (hw) inputs and outputs you get direct access to ALSA and the sound device which you would probably want to do if you were seeking “bit perfection” on Linux.
Yes I see the same results on 64-bit Win 8.1 Update 1 with Default Format at 192000 Hz, whether Exclusive Mode is “On” in Windows Sound for the playback device or not.
The “Actual Rate” when recording is the rate communicated by the sound card to Audacity.
The “Actual Rate” when playing should be the rate communicated by Audacity to the sound card. So it has a different meaning than the recording case.
You also need to look in Audacity at Help > Audio Device Info… for the sample rates that PortAudio detects as supported for the device and API. Irrespective of whether Exclusive Mode is on or off, with WASAPI I only see whatever rate the current Default Format is at, but with WDM-KS I see:
44100
48000
88200
96000
192000
If the Project Rate is set to a rate the device/API does not support, then Audacity should resample to the highest or next highest rate available and display this in Actual Rate. So this seems to work with Project Rate at 22050 Hz - Actual Rate under WASAPI is the current Default Format and Actual Rate under WDM-KS is the next highest (44100 Hz).
At 6000 Hz Project Rate, WASAPI Actual Rate is still the current Default Format and WDM-KS Actual Rate is now 88200 Hz.
At 384000 Project Rate, WASAPI Actual Rate is still the current Default Format and WDM-KS Actual Rate is now 192000 Hz.
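The “next highest, else highest” rule described above can be sketched in a few lines of Python. This is only a simplified model of the rule as stated in this post, not Audacity’s or PortAudio’s actual code, and the function name is hypothetical. It reproduces the WDM-KS results for 22050 Hz and 384000 Hz, but not the 88200 Hz seen at 6000 Hz Project Rate, so the real behaviour evidently involves more than this.

```python
from bisect import bisect_left


def pick_actual_rate(project_rate, supported_rates):
    """Pick the lowest supported rate >= project_rate, else the highest one."""
    rates = sorted(supported_rates)
    if not rates:
        raise ValueError("no supported rates given")
    i = bisect_left(rates, project_rate)
    # i == len(rates) means the project rate is above everything supported.
    return rates[i] if i < len(rates) else rates[-1]


# Rates listed for WDM-KS in Audio Device Info in this post.
WDM_KS_RATES = [44100, 48000, 88200, 96000, 192000]

for project_rate in (22050, 48000, 384000):
    print(project_rate, "->", pick_actual_rate(project_rate, WDM_KS_RATES))
```

A supported Project Rate such as 48000 Hz passes straight through, which is why setting Project Rate to the rate of the incoming stream is the safest choice.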
I am not sure whether WASAPI supported rates in Audio Device Info should ideally expand to other than the Default Format rate if Exclusive Mode is “On”, but currently it doesn’t.
MME and Windows DirectSound behave somewhat differently - between the lowest and highest supported rate of the device under those APIs (8000 Hz and 192000 Hz in my case) you can enter any nonsensical Project Rate like 111111 Hz and that Actual Rate appears on playback. That is probably a bug. If Project Rate is above the highest rate supported by the device then I get the dreaded “Error opening sound device”, except that because Windows DirectSound supports up to 200000 Hz, I can get playback with Project Rate and Actual Rate between 193000 Hz and 200000 Hz. I can also get DirectSound playback above the highest Audacity “Rate to Try” (384000 Hz), resampled to 192000 Hz Actual Rate.
Also note that Audio Device Info supported rates under MME are nothing more than a listing of the Audacity “Rates to Try”, so you should look at the other hosts for a more correct picture.
To add another complication, MME always resamples to 44100 Hz, so will play at that rate whatever rate Audacity communicates to the sound device.
@Gale
Thanks a lot for explaining in such great detail.
As you say, this is full of “complications”.
Only one thing is clear to me now: the “Actual Rate” when recording is the rate communicated by the sound card to Audacity.
However, the “Actual Rate” communicated by Audacity to the sound card when playing seems to be rather unpredictable… But this is not so important: I don’t use Audacity to listen to music; I use Foobar, which seems to give a faithful reading of the “actual rate” of its own output.