From: <apn...@ya...> - 2024-07-04 17:30:53
The consensus in prior discussions, which I had reluctantly accepted, was to keep the Tcl 9 status quo of hardcoding the code page to UTF-8 via the manifest. This was despite the fact that I did not hear of any concrete issues with reverting and keeping the code page detection the same as in Tcl 8, while I had listed several issues with hardcoding it to UTF-8. The rationale behind moving to UTF-8 was that the world is moving to it, so we should too, compatibility with Unix, etc.

Having looked into it more deeply, I will just list a couple of additional factors I have discovered.

While the primary argument folks seem to have made is that the world is moving to UTF-8, I suspect this has come from a Unix perspective. In particular, the *only* executables that hardcoded the code page to UTF-8 via the manifest were tclsh90 and wish90 when I searched (strings+grep) across 3 Win10/11 systems, including the entire Windows system directory, multiple browsers, Office, Visual Studio, Python, Ruby and a host of smaller applications. The only one that even had an activeCodePage manifest entry was Firefox, and even there it was set to Legacy and not UTF-8. This makes me nervous. Is Tcl supposed to push the Windows world to UTF-8? Does it make sense for Tcl to be the odd man out, and to exactly what benefit?

[Note: applications like Notepad do support UTF-8 and other encodings, but that is through application-configurable choice and encoding guessing, not through a manifest entry. That should be the case with Tcl applications as well. The push to UTF-8 should come from Tcl applications, not Tcl itself.]

Also concerning: as the SDK documentation states, even the GDI subsystem does not support this manifest entry. I don't know to what extent that affects Tcl or Tk, where wide-character APIs are used for the most part, but the fact that a major Windows subsystem does not support this per-process code page setting seemingly gives the lie to the claim that UTF-8 improves compatibility with other applications (on Windows).
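For readers who have not seen it, the manifest entry under discussion generally takes the shape below. This is only a sketch following the form given in the Windows SDK documentation, not a copy of the manifest actually shipped with tclsh90 or wish90; the enclosing assembly element and identity are omitted. The value shown is the UTF-8 setting; Legacy is the value I found in Firefox.

```xml
<!-- Fragment of an application manifest (sketch, enclosing <assembly> omitted) -->
<application>
  <windowsSettings>
    <!-- "UTF-8" forces the process ANSI code page to UTF-8;
         "Legacy" keeps the system-configured code page -->
    <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
  </windowsSettings>
</application>
```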

I only raise this in case some minds might be changed. Otherwise I don't plan to pursue this further, as reverting anything should require a high bar in terms of consensus.

/Ashok

-----Original Message-----
From: Harald Oehlmann <har...@el...>
Sent: Thursday, July 4, 2024 6:02 PM
To: tcl...@li...
Subject: Re: [TCLCORE] Propose reverting [encoding system] on Windows

On 04.07.2024 at 14:17, Jan Nijtmans wrote:
> On Sun, 30 Jun 2024 at 23:18, Jan Nijtmans <jan...@gm...> wrote:
>> It shouldn't be too difficult to get the old code-page from the
>> registry, it can be found in the following registry key:
>> HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage\ACP
>
> This is how to retrieve the code page it had in Tcl 8.x:
>
> $ tclsh90
> % encoding system
> utf-8
> % package require registry
> 1.3.7
> % registry get HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Control\\Nls\\CodePage ACP
> 1252
>
> Hope this helps,
> Jan Nijtmans

Thanks, Jan.
As you mentioned, a recipe should be added to the migration notes.
Here it is, with some additional tests and a fallback to iso8859-1:

proc winNativeCodepageGet {} {
    # Read the system ANSI code page from the registry; fall back to
    # iso8859-1 if the registry package or the key is unavailable.
    if {[catch {
        package require registry
        set cp [registry get HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Control\\Nls\\CodePage ACP]
    }]} {return "iso8859-1"}
    # Code page 65001 is UTF-8.
    if {$cp eq "65001"} {return "utf-8"}
    # Other code pages map to Tcl encoding names of the form cpNNNN.
    set cp "cp$cp"
    if {$cp in [encoding names]} {return $cp}
    # Tcl has no encoding for this code page.
    return "iso8859-1"
}

Take care,
Harald
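
As an illustration of where such a recipe might be used in migrated code, here is a minimal sketch that reads a text file written in the legacy code page. The helper name readLegacyFile and its arguments are hypothetical and not part of Tcl or of the recipe above; the caller passes in the encoding name.

```tcl
# Hypothetical helper (not part of Tcl): read a whole text file using an
# explicitly named encoding instead of relying on [encoding system].
proc readLegacyFile {path encodingName} {
    set chan [open $path r]
    try {
        # Decode the file contents with the caller-supplied encoding.
        fconfigure $chan -encoding $encodingName
        return [read $chan]
    } finally {
        # Close the channel whether or not the read succeeded.
        close $chan
    }
}
```

On Windows, a call might look like `readLegacyFile notes.txt [winNativeCodepageGet]`, where notes.txt is a placeholder file name. Passing the encoding explicitly keeps the choice in the application rather than in the process-wide setting.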