From: Kevin K. <kev...@gm...> - 2023-02-06 04:19:20
On Fri, Feb 3, 2023 at 4:39 PM Poor Yorick <org...@po...> wrote:

> On 2023-02-03 18:19, Donald G Porter via Tcl-Core wrote:
> > On 1/27/23 10:36, apnmbx-public--- via Tcl-Core wrote:
> >>
> >
> > Both 2) and 3) may impose constraints and demand revision to the
> > Tcl_Filesystem interface and its Tcl_FSMatchInDirectoryProc slot. The
> > encoding to be used to interpret the bytes of a filename might better
> > be an attribute of a Tcl_Filesystem or of a mount point rather than an
> > application-wide (and not thread-stable?) notion of a system encoding
> > pulled in through a side channel.
>
> Even this won't solve the problem. Posix filesystems don't maintain a
> known encoding as part of their configuration. An ext4 filesystem
> mounted at root may have filenames encoded in utf-8, and then another
> ext4 filesystem mounted somewhere else might have filenames encoded in
> another encoding. No matter what encoding Tcl attributes to this
> combined set of files, it's going to be wrong at some point.

The only known solution to that problem is to perform a temporary
[encoding system iso8859-1] prior to [open] (or any other activity
manipulating a path name), and then construct all path names by
concatenating results of [encoding convertto] before and after the
offending mount point - of course, reverting [encoding system] as soon
as the path name is sent to the OS. That will have the effect of
treating the path names as sequences of bytes and pushing encoding
management onto the user of the filesystem.

It's truly nasty - but I don't have any better ideas unless we start
having virtual filesystems mirroring the Posix mount points - which I
suppose would be doable, but I'm not sure it's worth the effort for this
one bizarre case. (Which, by the way, is just about the only legitimate
use I've found for changing [encoding system].)

--
73 de ke9tv/2, Kevin
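
For readers who want to see the workaround spelled out, here is a rough sketch of the steps described above, assuming Tcl 8.6 or later (for [try]). The proc name openMixedEncodingPath and its argument layout are hypothetical, invented for this illustration; only [encoding system], [encoding convertto], [open] and [try] are real Tcl commands. Because [encoding system] is application-wide, this is not safe if other threads touch the filesystem while the encoding is switched.

```tcl
# Hypothetical helper (not part of Tcl): open a file whose path crosses a
# mount point where file names use a different encoding from the rest of
# the tree.
#
# segments is a flat list of {encodingName pathPiece} pairs, for example
#   {utf-8 /srv/archive/ koi8-r report.txt}
# (the path and the encodings in that example are made up).
proc openMixedEncodingPath {segments {mode r}} {
    set rawPath ""
    foreach {enc piece} $segments {
        # [encoding convertto] returns the encoded bytes as a string with
        # one character (0-255) per byte, so the pieces concatenate into
        # a single byte sequence.
        append rawPath [encoding convertto $enc $piece]
    }

    # With iso8859-1 as the system encoding, each character 0-255 is
    # handed to the OS as the byte of the same value, i.e. unchanged.
    set saved [encoding system]
    encoding system iso8859-1
    try {
        return [open $rawPath $mode]
    } finally {
        # Revert as soon as the path name has been sent to the OS,
        # whether or not [open] succeeded.
        encoding system $saved
    }
}
```

The caller has to know which encoding applies on each side of the mount point, which is exactly the "pushing encoding management onto the user" cost mentioned above.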