Convert 96 kHz interleaved mono audio to 48 kHz stereo with Nyquist

8bit_coder · August 7, 2020, 1:22pm

First off, I’m using Windows 10 Pro 64-bit and Audacity 2.3.3.

I have a strange problem I’d like to be able to fix with a Nyquist script:

I have an HDMI capture card that Windows identifies as 96 kHz mono. However, the capture card is actually delivering 48 kHz stereo audio, interleaved as mono. I’ve tried this program: https://github.com/ToadKing/mono-to-stereo and it fixes the issue perfectly, but I’d like to be able to do this in Audacity with an audio file as well. How would I go about creating a script for that? Thanks.

steve · August 7, 2020, 3:46pm

In Audacity, Nyquist does not act directly on audio streams, or even on audio files. It acts on data passed to it from audio tracks.

Are you able to get audio from your HDMI capture card into Audacity? If so, how do you do that, and what does it create (how does it appear) in the Audacity project?

8bit_coder · August 7, 2020, 4:35pm

Yes, the capture card shows up as a microphone (Digital Audio Interface). The only recording option is the input volume; it’s stuck at mono, 96 kHz, and it records like any other microphone.

steve · August 7, 2020, 4:42pm

How do you know that it is actually 48 kHz stereo interleaved as mono, and not actually 96 kHz mono?
Assuming that you are correct, that’s a fault / bug in the device driver. If it really is capturing 2 channels at 48 kHz, the drivers should enable Windows to identify it as such. Have you checked on the manufacturer’s website to see if working Windows 10 drivers are available?

8bit_coder · August 7, 2020, 5:22pm

I know it’s 48 kHz stereo interleaved as 96 kHz mono because the program that “fixes” the issue says that’s what’s happening. I’m sure that it’s stereo since I’ve tested it with the program and I get full stereo output. Unfortunately, the capture card doesn’t really have drivers, and the ones that Windows installs are all that I can use. That program also only partly solves the issue. It takes the audio, fixes it into stereo, and then outputs it through the computer speakers. The problem is that it writes directly to the WASAPI buffer, and Audacity doesn’t deal with that too well (it just records silence). So the solution would be some kind of Nyquist script that takes the recorded audio and fixes it just like the program does.

8bit_coder · August 7, 2020, 5:54pm

I actually found some info on the card’s audio issue in a Linux ALSA patch note:
https://www.spinics.net/lists/stable-commits/msg164409.html

MacroSilicon MS2109 based HDMI capture cards

These claim 96kHz 1ch in the descriptors, but are actually 48kHz 2ch.
They also need QUIRK_AUDIO_ALIGN_TRANSFER, which makes one wonder if
they pretend to be 96kHz mono as a workaround for stereo being broken
by that...

They also have swapped L-R channels, but that's for userspace to deal
with.

DVDdoug · August 7, 2020, 6:21pm

I’ve never done any Nyquist programming, but I assume it’s possible to copy every other sample into the left/right channels of a new stereo track and then correct the sample rate.

You could also manually correct the WAV file header with a hex editor (Number of channels, Sample rate, Bytes per sample). If the left & right channels are reversed you could fix that in Audacity after fixing the header.
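The header-only fix can also be scripted instead of done in a hex editor. The following is a minimal sketch, assuming the recording is an integer-PCM WAV file (Python's standard `wave` module does not read 32-bit float WAVs on older Python versions); the function name `relabel_as_stereo` and the file paths are hypothetical. It copies the sample data unchanged and only rewrites the channel count and sample rate, which works because stereo WAV data is already stored as alternating left/right samples.

```python
import wave

def relabel_as_stereo(src_path, dst_path, stereo_rate=48000):
    """Copy the sample data unchanged, but describe it as 2 channels."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 1:
            raise ValueError("expected a mono input file")
        width = src.getsampwidth()  # bytes per sample, e.g. 2 for 16-bit
        data = src.readframes(src.getnframes())

    # A stereo frame needs two samples; drop a trailing unpaired sample.
    frame_bytes = 2 * width
    data = data[: len(data) - (len(data) % frame_bytes)]

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(2)
        dst.setsampwidth(width)
        dst.setframerate(stereo_rate)
        dst.writeframes(data)  # the module fills in the header sizes on close
```

This reads the whole file into memory, so it suits short recordings. If the channels come out reversed, they can be swapped in Audacity afterwards.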

Since your handle is ‘8bit_coder’ you can probably handle either of those approaches. Or, maybe you can write a program in the language of your choice to correct the WAV file.

P.S.
If it was me… If I had to do this once in a while I’d manually “hack” the WAV header. If I had to do it every day I’d write a stand-alone program to fix the WAV file (probably in C++). I’m not sure if I’d write a single-purpose program or if I’d make a WAV file editor where I could just type in any values (in decimal). Maybe a utility like that already exists… I’d add an option for swapping the left & right channel data.

I don’t have the skills to write a driver and that might require proprietary information about the hardware.

steve · August 7, 2020, 7:45pm

That sounds possible. Do you have a small test file that I can experiment with?

8bit_coder · August 7, 2020, 8:42pm

Actually, I just realized that I was recording at 48 kHz. Now that I’m recording at 96 kHz, I did a test where I put a kick on the left channel and a snare on the right (the file is attached at the end). In the kick’s low-frequency part (since it’s hard-panned left) you can see that one sample is the left channel and the next is the right channel (which is silent because nothing other than the kick is playing), and vice versa when the snare is played:

And I tried importing the wav file as raw data, with these settings:

and it worked! Kinda. There’s a 50% chance that the channels get split one sample forwards or backwards, causing the channels to get swapped. Granted, this is easy to fix; I just have to make sure not to trim the audio, and to run the conversion first. But that’s where the Nyquist script comes in (DVDdoug, I code primarily in Java, and more on the game-oriented side of things). All the script has to do is take every other sample and put it in one channel of a stereo track. However, I have no idea how to write something like that in Nyquist. The REAL fix for this would be to modify the driver that Windows uses in order to fix the issue at the source, but that’s way too much work just for some stereo audio.
Here is the audio file that has the two hard-panned sounds. (Kick is left, snare is right.)
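To illustrate the "every other sample" split and the one-sample slip that swaps the channels, here is a small sketch in Python. The function names and the sample values are made up for illustration; they are not taken from the attached file. Comparing the energy of the even-indexed and odd-indexed samples during a hard-panned sound shows which position carries which channel.

```python
def deinterleave(samples, first_is_left=True):
    """Split an interleaved sequence into (left, right) lists."""
    even = list(samples[0::2])
    odd = list(samples[1::2])
    # Keep both channels the same length if a trailing sample is unpaired.
    n = min(len(even), len(odd))
    even, odd = even[:n], odd[:n]
    return (even, odd) if first_is_left else (odd, even)

def energy(channel):
    """Sum of squared sample values; zero for a silent channel."""
    return sum(s * s for s in channel)

# Hypothetical excerpt of a left-panned kick: every second sample is silent.
mono = [0, 5, 0, -4, 0, 3]

left, right = deinterleave(mono)
if energy(left) < energy(right):
    # The kick landed in the "right" list, so the pairing is off by one
    # sample (or the device swaps channels): flip the assignment.
    left, right = deinterleave(mono, first_is_left=False)

print(left, right)  # [5, -4, 3] [0, 0, 0]
```

Because trimming an odd number of samples from the start changes which position comes first, the check has to be repeated on each recording rather than assumed.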

steve · August 7, 2020, 9:14pm

Yes, I was thinking that. It looks like the “Start offset” should be 44 bytes (the size of a standard WAV header).

steve · August 7, 2020, 9:35pm

As a proof of concept (this won’t work with long tracks, but it works with the sample that you posted)

1. Import the track normally (it imports as mono, 96000 Hz)
2. Change the track sample rate to 48000 (from the track’s drop-down menu https://manual.audacityteam.org/man/audio_track_dropdown_menu.html)
3. Duplicate the track (Select the track, then “Ctrl + D”)
4. Join the two tracks to create a stereo track (https://manual.audacityteam.org/man/splitting_and_joining_stereo_tracks.html)

The above steps need to be done manually, then we will use Nyquist to do the number crunching.

Apply this code to the track using the Nyquist Prompt (https://manual.audacityteam.org/man/nyquist_prompt.html)

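;version 4
;; Number of samples in the selection.
(setf ln (truncate len))
;; Read the whole selection from the first channel into one array.
(setf input (snd-fetch-array (aref *track* 0) ln ln))
;; Each output channel receives half of the samples.
(setf ln (/ ln 2))
(setf left (make-array ln))
(setf right (make-array ln))
;; Even-indexed samples go to the left channel, odd-indexed to the right.
(dotimes (i ln)
  (setf (aref left i) (aref input (* 2 i)))
  (setf (aref right i) (aref input (1+ (* 2 i)))))

;; Return a stereo sound at 48000 Hz.
(vector
  (snd-from-array 0 48000 left)
  (snd-from-array 0 48000 right))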

Notice that the result is identical to Import RAW with a 44-byte start offset.

8bit_coder · August 7, 2020, 10:24pm

Ah, I see why it can’t be used for long tracks. It’s using an array, and if we go by the 32-bit length that an array can have, we get about 11 hours and 30 minutes that it can process before we run out of memory. I’m going to see if there are more elegant ways to do this (doing all the steps in one command), but it’s going to be difficult since I could find surprisingly little up-to-date documentation and few examples of doing this kind of thing in Nyquist by googling. But so far, the solution does seem to be working as intended. Even if 11 hours is the limit, I’m fine with just trying to get the script to do the rest of the work. So far I’ve found that force-srate can make Nyquist use a different sample rate (although it doesn’t actually CHANGE the track’s sample rate), but steps 3 and 4, taking a mono track and making it stereo, are a bit harder than they sound, since it seems that Nyquist can’t “create” or remove tracks; it can only modify existing ones. I tried making a macro to do those steps, but I didn’t see a command for “make stereo track”, so I couldn’t get much past that.

steve · August 8, 2020, 9:46am

It’ll be less than that because the sound array uses slightly over 18 bytes per sample.
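As a rough worked example of that limit, the arithmetic below uses the 18 bytes per sample quoted above and the 96000 Hz capture rate. The 2 GiB memory budget is purely an assumed figure for illustration; the memory actually available to Nyquist depends on the machine and the Audacity build.

```python
BYTES_PER_SAMPLE = 18       # figure quoted above ("slightly over 18 bytes")
SAMPLE_RATE = 96000         # samples per second in the mono capture
RAM_BUDGET = 2 * 1024**3    # ASSUMPTION: 2 GiB available, for illustration

bytes_per_second = BYTES_PER_SAMPLE * SAMPLE_RATE
max_seconds = RAM_BUDGET / bytes_per_second

print(f"{bytes_per_second / 1e6:.3f} MB per second of audio")
print(f"{max_seconds / 60:.1f} minutes fit in the assumed budget")
```

Under these assumptions the array costs about 1.7 MB per second of audio, so the practical limit is on the order of tens of minutes rather than hours.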

The way that longer selections would be handled in Nyquist is to use a fixed array size, and make repeated calls to snd-fetch-array, which steps through the “sound” collecting consecutive samples. The output would still need to be assembled in RAM unless written directly to disk.
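The same fixed-size-chunk idea can be sketched outside Audacity. The following uses Python rather than Nyquist and is only an illustration under stated assumptions: the input is a 16-bit integer-PCM mono WAV, and the function name `convert_chunked` and its parameters are hypothetical. It reads a fixed number of samples at a time, optionally swaps left and right, and writes each chunk straight to disk, so memory use does not grow with the length of the recording.

```python
import wave
from array import array

def convert_chunked(src_path, dst_path, chunk_samples=65536, swap=False):
    """Stream a 16-bit mono WAV into a stereo WAV at half the sample rate."""
    if chunk_samples % 2:
        raise ValueError("chunk_samples must be even so L/R pairs stay together")
    with wave.open(src_path, "rb") as src, wave.open(dst_path, "wb") as dst:
        if src.getnchannels() != 1 or src.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono input")
        dst.setnchannels(2)
        dst.setsampwidth(2)
        dst.setframerate(src.getframerate() // 2)  # 96000 -> 48000
        while True:
            raw = src.readframes(chunk_samples)  # at most one chunk in memory
            if not raw:
                break
            block = array("h")  # 16-bit items; values are only rearranged
            block.frombytes(raw)
            if len(block) % 2:  # only possible on the final chunk
                block.pop()
            if swap:
                # Exchange the samples of every pair (left <-> right).
                block[0::2], block[1::2] = block[1::2], block[0::2]
            dst.writeframes(block.tobytes())
```

Keeping the chunk size even means a left/right pair is never split across two reads, which avoids the one-sample slip that swaps the channels.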

There is a fast and easy way to do this - use “Import RAW”.

Nyquist documentation:
https://manual.audacityteam.org/man/nyquist.html

https://www.cs.cmu.edu/~rbd/doc/nyquist/
https://www.cs.cmu.edu/~rbd/doc/nyquist/indx.html
https://www.audacity-forum.de/download/edgar/nyquist/nyquist-doc/xlisp/xlisp-index.htm
https://www.audacity-forum.de/download/edgar/nyquist/nyquist-doc/xlisp/xlisp-ref/xlisp-ref-index.htm
http://dept-info.labri.fr/~idurand/enseignement/lst-info/PFS/Common/Strandh-Tutorial/indentation.html