Filenames with Non-ASCII characters cannot be read from Virtual Drive (the virtual drive I use is pCloud).
Example filename: tést.mp3
The files/filenames are created on macOS and can be read by Audacity if the files are stored locally. But if I try to import the same file from my Virtual Drive, it cannot be read.
Filenames with ASCII characters can be read fine from Virtual Drive.
Error from Audacity: "Importing tést.mp3… Elapsed time:" (the timer runs indefinitely, so I press Cancel), followed by "Error Importing - Operation was canceled by the user".
_
I have contacted pCloud support about this. They are aware of the problem and are trying to fix it… however, this issue might be software dependent.
My conversation with pCloud ended with them saying “We will contact Audacity about this”.
This bug report is to put the issue on the radar if pCloud fails to make contact.
_
I am interested to know if other Virtual Drives on macOS have the same problem.
To reproduce: rename an .mp3 file to tést.mp3, store the file on the Virtual Drive, then drag it into Audacity to import.
I am available to help with more information. I work with Non-ASCII characters all the time and recently switched to macOS, so this has a significant impact on my workflow.
This problem may be because of the weird way that macOS handles multi-byte characters.
I took a quick look at pCloud’s technical documentation, and it says that it uses UTF-8 encoding for file names. I would have expected this to work fine with Audacity as Audacity also uses UTF-8, and modern Macs also use UTF-8 by default.
However, macOS uses UTF-8 with Normalization Form D (NFD) by default for file names in its file system. This means characters like é are decomposed into the base character e (U+0065) plus a combining acute accent (U+0301). Most other systems either use Normalization Form C (NFC), where é is stored as a single, precomposed character, U+00E9 (Windows), or are normalization agnostic (Linux / Unix).
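To make the NFC/NFD difference concrete, here is a minimal Python sketch using only the standard `unicodedata` module. It is just an illustration of the two forms, not anything taken from Audacity or pCloud; the helper name `utf8_hex` is made up for this example.

```python
import unicodedata

# "tést.mp3" written with a precomposed é (U+00E9)
name = "t\u00e9st.mp3"

nfc = unicodedata.normalize("NFC", name)  # é stays as one code point
nfd = unicodedata.normalize("NFD", name)  # é becomes e + U+0301

def utf8_hex(text):
    """Return the UTF-8 bytes of text as space-separated hex."""
    return " ".join(f"{b:02x}" for b in text.encode("utf-8"))

print(nfc == nfd)               # False: the strings look identical but differ
print(len(nfc), utf8_hex(nfc))  # 8 74 c3 a9 73 74 2e 6d 70 33
print(len(nfd), utf8_hex(nfd))  # 9 74 65 cc 81 73 74 2e 6d 70 33
```

Both strings display as tést.mp3, but a byte-for-byte comparison of the names fails, which is the kind of mismatch described above.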
If pCloud is running on Linux or Windows, then it’s possible that macOS is converting the file names from one normalization form to another. The names would look identical on screen, but the underlying bytes would differ.
What happens if you write a file to a local drive (using accented characters), copy it over to pCloud, then copy it back to a local drive? Does it survive the round trip and still work in Audacity?
I thought the same about UTF-8 - in theory, all systems align… but I didn’t know about Normalization Forms. Sounds likely to be a problem.
To confirm that, is it possible to increase the verbosity of the debug tool? I’m trying to find out exactly what Audacity reads (the UTF-8 values) for a working local file and for a non-working virtual drive file.
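Independently of Audacity's own logging, the names can be inspected from outside with a short Python script. This is only a sketch: it shows what the operating system reports for each directory entry, which is not necessarily what Audacity sees internally. The function names `classify` and `inspect_names` are made up for this example.

```python
import os
import unicodedata

def classify(name):
    """Report which Unicode normalization form a file name is in."""
    is_nfc = unicodedata.normalize("NFC", name) == name
    is_nfd = unicodedata.normalize("NFD", name) == name
    if is_nfc and is_nfd:
        return "same in both"  # e.g. plain ASCII names
    if is_nfc:
        return "NFC"
    if is_nfd:
        return "NFD"
    return "mixed"

def inspect_names(directory):
    """Return (name, form, UTF-8 bytes as hex) for each entry in directory."""
    rows = []
    for name in sorted(os.listdir(directory)):
        utf8_hex = " ".join(f"{b:02x}" for b in os.fsencode(name))
        rows.append((name, classify(name), utf8_hex))
    return rows

if __name__ == "__main__":
    # Run once on a local folder and once on the virtual drive folder,
    # then compare the output for the same file.
    for row in inspect_names("."):
        print(row)
```

If the same file shows up as NFD in the local folder and NFC on the virtual drive (or the other way round), that would support the normalization theory.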
_
What happens if you write a file to a local drive (using accented characters), copy it over to pCloud, then copy it back to a local drive? Does it survive the round trip and still work in Audacity?
The file survives the round trip. I only have the issue when importing directly from pCloud.
_
Overall, I’m growing pessimistic. I’m aware of the criticism of pCloud on Mac. If anybody has another brand of Virtual Drive attached, I am really interested to know if the behaviour is the same.
Opening projects directly from a “web drive” (such as pCloud, Google Drive, OneDrive, iCloud, …) is not recommended and is likely to cause problems. This is because an Audacity project (.AUP3) is an SQLite database, and Audacity uses “secure transaction mode” to ensure reliable updating of the database as you work. Web drives rarely support secure database transactions.
The recommended approach to using web drives with Audacity is to only use them to store closed projects. When you want to work on a project, move it to an ordinary drive first.
Not my specialist area, but I think probably not. As far as I can see, Audacity is handling character encoding correctly.
My best guess is that there isn’t actually a bug, but rather that “correct” encoding on one end of the virtual drive is different from “correct” on the other end - like two people having a phone conversation but misunderstanding each other because they have different dialects. When files are copied in the normal way, the virtual drive software handles the necessary conversions, but when Audacity accesses files directly on the virtual drive, the man in the middle (the virtual drive manager) is bypassed, so Audacity sees untranslated file names (in the wrong normalization form).
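One rough way to test this guess from outside Audacity is to ask the file system for the same file under both normalization forms and see which lookups succeed. The sketch below is an assumption-laden probe rather than a diagnosis: `probe_forms` is a made-up helper, and only the last path component is normalized.

```python
import os
import unicodedata

def probe_forms(path):
    """Check whether a file is reachable under the NFC and NFD spellings of its name."""
    directory, name = os.path.split(path)
    result = {}
    for form in ("NFC", "NFD"):
        candidate = os.path.join(directory, unicodedata.normalize(form, name))
        result[form] = os.path.exists(candidate)
    return result

if __name__ == "__main__":
    # Replace with the real path, e.g. the file on the virtual drive.
    print(probe_forms("t\u00e9st.mp3"))
```

On a normalization-insensitive volume both lookups should succeed. If only one form is found on the virtual drive, any program that asks for the other form will fail to open the file, which would be consistent with the behaviour reported here.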