Trouble with large WAV files

tcmullet · April 25, 2025, 9:04pm

I am recording a lot of 24-bit 96 kHz stereo files with Audacity these days. I recently captured the longest one so far, 3 hours 9 minutes. I knew it would be over the 2 GB limit that I used to butt heads against in CoolEdit96. But I had believed that’s no longer a limit, at least with Audacity.

The file is a bit over 6 GB. I captured it with Audacity 2.3.0 as that’s what’s currently on the capture PC. But I do all my editing on a much faster PC that had 3.7.1, which I just upgraded to 3.7.3. All three versions truncate the file at 1 hour 5 min when opening or importing it.

Why isn’t the whole thing loading in? And I have to ask this… I believe all the audio is there, so HOW can I somehow save the data, even if I have to break it up into pieces? I REALLY need to save these 3 hours of audio, as I have lost the source from which it was captured. I’ve tried simply playing the file with my default media player, SMPlayer. It starts to play, but stops at 1:05. MediaInfo does show it as “Wave: 6.11 GiB, 3 h 9 min” and “PCM”.

If I had known this was going to be a problem, there were steps I could have taken to prevent it, but I had no reason to believe I couldn’t re-import a WAV of ANY size. (This is 2025 and we have way more than FAT16 partitions.) I’m on Win 10.

I recently learned that the recording process stores the data in many little pieces that get strung together when you export it to a WAV file. This would explain why, even if there’s a limit on the size of WAV file that can be processed, that wouldn’t stop Audacity from RECORDING the original capture and exporting it to a 6 GB WAV file. I’ve lost Audacity’s intermediate “many little pieces” as all this was done many days ago.

DVDdoug · April 25, 2025, 9:25pm

The WAV limit is 4GB due to a 32-bit “size” field in the header. Usually, when you go over, the counter rolls over to zero (losing the most significant bit) and starts counting again. That’s why you get a “random” length.
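As a rough check of that roll-over explanation, here is a minimal Python sketch. It assumes the 32-bit size field simply wraps modulo 2^32, and it uses the rounded “6.11 GiB” figure from MediaInfo as the byte count, so the result is only approximate. The function name is made up for illustration.

```python
# 24-bit (3 bytes) x 2 channels x 96,000 samples per second
BYTES_PER_SECOND = 3 * 2 * 96_000  # 576,000 bytes of audio per second


def apparent_duration_seconds(actual_bytes, bytes_per_second=BYTES_PER_SECOND):
    """Duration a reader would see if it trusts a wrapped 32-bit size field."""
    declared_bytes = actual_bytes % 2**32  # assumption: the field wraps around
    return declared_bytes / bytes_per_second


approx_size = int(6.11 * 2**30)  # rounded size reported by MediaInfo
secs = apparent_duration_seconds(approx_size)
print(f"{int(secs // 3600)} h {int(secs % 3600 // 60)} min")
```

With those numbers the wrapped size works out to a little over an hour, which lines up with the file stopping at 1:05.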

If you import it as raw data you should be able to get the whole thing. Then maybe export it as FLAC. There are other formats like RF64 (the 64-bit extension of BWF) or W64 that don’t have the limits but I’ve never used them.
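If the raw import inside Audacity is too slow or awkward, the same idea can be sketched outside Audacity in Python: skip the header, treat the rest as raw PCM, and write it back out as several WAV files that each stay well under 4 GB. This is only a sketch under assumptions: the function name and file paths are hypothetical, the audio is assumed to start right after a 44-byte header (typical for a plain PCM WAV, but not guaranteed), and the format is assumed to be 24-bit, stereo, 96 kHz as reported for this file. Any metadata stored after the audio would end up as a short burst of noise at the end of the last piece.

```python
import wave


def split_raw_pcm_to_wavs(src_path, out_prefix, data_offset=44,
                          channels=2, sample_width=3, sample_rate=96_000,
                          frames_per_piece=96_000 * 3600, chunk_frames=65_536):
    """Copy PCM data from src_path into numbered WAV files; return their paths."""
    frame_size = channels * sample_width  # bytes per multi-channel sample frame
    written = []
    with open(src_path, "rb") as src:
        src.seek(data_offset)  # skip the (assumed) header
        at_end = False
        while not at_end:
            out = None
            remaining = frames_per_piece
            while remaining > 0:
                data = src.read(min(chunk_frames, remaining) * frame_size)
                # Drop a trailing partial frame, if any.
                data = data[:len(data) - len(data) % frame_size]
                if not data:
                    at_end = True
                    break
                if out is None:
                    path = f"{out_prefix}_{len(written) + 1:02d}.wav"
                    out = wave.open(path, "wb")
                    out.setnchannels(channels)
                    out.setsampwidth(sample_width)
                    out.setframerate(sample_rate)
                    written.append(path)
                out.writeframesraw(data)
                remaining -= len(data) // frame_size
            if out is not None:
                out.close()  # close() rewrites the header with the real length
    return written
```

A call such as `split_raw_pcm_to_wavs("capture.wav", "capture_part")` (hypothetical names) would produce one-hour pieces of roughly 2 GB each at this sample format, which ordinary WAV readers should open in full.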

…I never understood that original 2GB limit. I think that was the original Microsoft spec. 2GB is the maximum you can “count to” with a 32-bit signed integer (with the MSB used for the sign) but a negative file size doesn’t make any sense.

tcmullet · April 25, 2025, 11:13pm

I’ve tried Import Raw data, both on my big file and also on a tiny one-minute file. It’s totally failing. I enter 24-bit signed, little endian, 96k and stereo. After a very long time opening (I think this 3.7.3 opens much more slowly than 2.3.0 did; a separate problem), it came up with 3 hours of garbage, not random, but a very low volume distorted version of the spoken word in the program. I discovered the “Detect” option. When I executed that, the result was mostly WRONG. Instead of the 24-bit signed PCM, it says “Unsigned 8-bit PCM”. Instead of “2 channels (Stereo)”, it says “1 Channel (Mono)”. (It does get 96k correctly.) If I open/import the small file as audio, it IS signed 24-bit, stereo, 96k. Your “detector” is defective. If I let it import with 8-bit mono, it’s garbage. If I use the correct settings as reflected in MediaInfo, it gives the very low volume distorted version.

I’m confident that if anyone else creates a tiny 24-bit 96 kHz stereo little-endian WAV file as a sample, importing it as raw data will totally fail as it has for me.

DVDdoug · April 26, 2025, 12:09am

The offset should be 44 bytes. That’s where the header ends and the audio begins.

If that doesn’t work, try 45 or 46. A 24-bit sample is 3 bytes (8 bits per byte), so one of those three offsets should get the 24-bit samples back in order. (Although if the offset is fouled-up the left & right channels, which alternate, could be reversed.)
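To see why the offset matters so much for 24-bit audio, here is a small self-contained Python sketch. The sample values and the all-zero 44-byte stand-in header are made up for illustration, and `decode_s24le` is a hypothetical helper, not anything from Audacity. Because 44 is not a multiple of 3, decoding from offset 0 slices every sample across the wrong byte boundaries.

```python
def decode_s24le(raw, offset=0):
    """Decode signed 24-bit little-endian samples starting at byte `offset`."""
    end = offset + (len(raw) - offset) // 3 * 3  # ignore a trailing partial sample
    return [int.from_bytes(raw[i:i + 3], "little", signed=True)
            for i in range(offset, end, 3)]


samples = [1000, -1000, 2000, -2000]  # made-up sample values
audio = b"".join(s.to_bytes(3, "little", signed=True) for s in samples)
raw = bytes(44) + audio  # 44 zero bytes stand in for the header

print(decode_s24le(raw, 44) == samples)  # aligned with the start of the audio
print(decode_s24le(raw, 0)[-4:])         # misaligned: values no longer match
```

Offsets 44, 45 and 46 cover all three possible byte alignments, which is why trying each of them in turn should find the one that decodes cleanly.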

It should be little-endian (the wrong endianness would also scramble the byte order) but in my quick experiment “default” also worked.

Thankfully, you know most of the format details…

I don’t know how it works but it’s kind-of stupid that it doesn’t read the file header if there is one… MediaInfo did it. They are probably assuming if you have a WAV file you don’t need to import RAW so it’s trying to analyze the data… Maybe it’s confused by reading the header as audio data and running the detection algorithm on it. But I suspect detection is “imperfect” anyway because the actual audio data is just a sequence of numbers.
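For what it’s worth, reading the header by hand is not much work. The following Python sketch walks the RIFF chunks to find the format fields and the byte offset where the audio starts, instead of assuming 44. It is a sketch only: the function name is hypothetical, it handles just the basic 16-byte PCM format fields, and for a file over 4 GB the size it reports is whatever wrapped value the header contains, so only the offset and format should be trusted.

```python
import struct


def find_wav_data(path):
    """Return format info and the byte offset of the audio in a RIFF/WAVE file."""
    with open(path, "rb") as f:
        riff, _riff_size, wave_id = struct.unpack("<4sI4s", f.read(12))
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        info = {}
        while True:
            header = f.read(8)
            if len(header) < 8:
                raise ValueError("no data chunk found")
            chunk_id, chunk_size = struct.unpack("<4sI", header)
            if chunk_id == b"data":
                info["data_offset"] = f.tell()          # audio starts here
                info["declared_data_size"] = chunk_size  # may have wrapped
                return info
            if chunk_id == b"fmt ":
                body = f.read(chunk_size)
                tag, channels, rate, _byte_rate, block_align, bits = \
                    struct.unpack("<HHIIHH", body[:16])
                info.update(format_tag=tag, channels=channels,
                            sample_rate=rate, block_align=block_align,
                            bits_per_sample=bits)
            else:
                f.seek(chunk_size, 1)  # skip chunks we do not need
            if chunk_size % 2:
                f.seek(1, 1)  # chunks are padded to an even length
```

The returned `data_offset`, channel count, bit depth and sample rate are the values one would type into the raw import dialog.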

P.S.
I consider you lucky that it saved the whole 6GB. But I wish it would have given you a warning!
