From: <apn...@ya...> - 2025-04-15 19:02:06
Jan,

I noticed the commit you made in tclWinTest.c for testing TIP 716. Thanks for the test, but its construction is broken with respect to the intent of TIP 716.

The fundamental purpose of TIP 716 is to ensure that DLLs (which may know nothing about Tcl) that use ANSI APIs continue to work as before. Using an ANSI API involves first calling the Win32 WideCharToMultiByte function to encode the string as per the user's code page, obtained through GetACP. The encoded string is then passed to the ANSI API. The important point I've tried to emphasize in the TIP is that if you force GetACP to return utf-8, (a) the DLL may not be coded to handle a variable-length encoding of up to 4 bytes per character (a lot of legacy ANSI-based code assumes DBCS or MBCS with a maximum length of 2), and (b) the application at the other end of the data transfer (like the database server in the DB2 case) expects the real user's code page setting and fails when it sees utf-8 instead.

In your modifications to tclWinTest.c, you are encoding the file name with Tcl_UtfToExternalDString using Tcl's system encoding (utf-8) but passing it to an ANSI API that expects the user's code page encoding. This will simply not work. It should use the Unicode APIs, or encode using WideCharToMultiByte(GetACP(), ...). The whole point of TIP 716 is that your changes should fail, as you should be encoding using the result of GetACP(), not [encoding system]!

The rules are (as per TIP 716):

1. Use the Unicode Windows API as Tcl/Tk already do. Then these issues do not arise.
2. If you have to use ANSI APIs, use Windows encoding settings, not Tcl encoding settings.

To summarize the whole situation again: in Tcl 8.x, and in 9.0 on non-Windows platforms, [encoding system] is derived from user settings (GetACP on Windows), making the two consistent both with each other and with other applications. In 9.0 on new Windows platforms, in effect GetACP and [encoding system] are both hardcoded to utf-8 because of the manifest.
They are consistent with each other, but only within that process. They are not consistent with other applications and not compatible with some DLLs, leading to the problems described in the TIP. This consistency means that your changes to tclWinTest.c would be OK for 9.0 (but only because the file system APIs, unlike GDI, support UTF-8!).

In TIP 716, the explicit choice is made to (a) hardcode [encoding system] to UTF-8 for compatibility with 9.0, but (b) not hardcode GetACP, thereby making it inconsistent with [encoding system]. This inconsistency is forced because reverting to the 8.x / 9.0 non-Windows method would mean incompatibility with 9.0 on Windows in terms of the default encoding for channels. The inconsistency does, however, eliminate some of the issues listed in TIP 716, at the cost of having to follow the rules mentioned above. Note that Tcl/Tk already uses the Unicode Win32 APIs.

This whole headache has arisen because of the implicit decision in Tcl 9 to split Tcl's system encoding setting from the Windows user's encoding setting, but it's too late to turn back the clock now. Either we stick with 9.0 and the reported issues, or we move to TIP 716 and stick to Unicode APIs in Tcl (which we already do anyway).

/Ashok

From: apnmbx-public--- via Tcl-Core <tcl...@li...>
Sent: Monday, April 14, 2025 10:22 AM
To: tcl...@li...
Subject: [TCLCORE] TIP 716 ready for comments

<https://core.tcl-lang.org/tips/doc/trunk/tip/716.md>
TIP 716: New command "encoding user", remove UTF-8 manifest setting on Windows is ready for comments. It proposes reversion of a change made in 9.0 to the tclsh and wish Windows manifests while keeping compatibility with 9.0.{0,1}.

Apologies for my usual verbosity, but when I brought this up on the mailing list prior to the previous release, the comments wandered into why UTF-8 should be the default. I've tried to better explain that this is not the issue.
I will point out that the original change to the manifest, which made UTF-8 the default on Win 10 1903+ and Win 11, should have been TIP'ed, as it overrides user settings and is a change in the behavior of a public API and command. It's water under the bridge now that 9.0 has shipped, so the TIP maintains the status quo and only changes the implementation. It also adds a new "encoding user" command and an -encoding option to exec as a workaround for compatibility issues introduced by forcing a UTF-8 default.

Note that the implementation in the tip-716 branch is mostly complete but not ready for review. I am only looking for comments on the proposal before proceeding further with tests and docs.

/Ashok
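
As a rough illustration of the mismatch described in the reply at the top of this message, the sketch below shows what happens when a name encoded as UTF-8 is read by a consumer that expects the user's code page. It is not Tcl or Win32 code: Python's codecs stand in for WideCharToMultiByte, cp1252 and cp932 are assumed example code pages, and the file name and the helper ansi_api_view are hypothetical.

```python
# Illustration only: Python codecs stand in for the Win32 conversion calls.
# Assumptions: the user's code page is 1252; the file name is made up.

name = "café.txt"

# Bytes produced when encoding with a UTF-8 system encoding.
utf8_bytes = name.encode("utf-8")

# Bytes an ANSI API expects when the user's code page is 1252.
acp_bytes = name.encode("cp1252")


def ansi_api_view(raw: bytes, code_page: str = "cp1252") -> str:
    """Decode bytes the way a consumer bound to `code_page` would (hypothetical helper)."""
    return raw.decode(code_page)


# Encoded with the user's code page: the consumer sees the intended name.
print(ansi_api_view(acp_bytes))   # café.txt

# Encoded as UTF-8 but read as cp1252: the name is corrupted.
print(ansi_api_view(utf8_bytes))  # cafÃ©.txt

# Point (a): legacy code sized for at most 2 bytes per character
# (cp932 is a DBCS code page) meets 3- and 4-byte UTF-8 sequences.
dbcs_len = len("日".encode("cp932"))      # 2
utf8_len_bmp = len("日".encode("utf-8"))  # 3
utf8_len_emoji = len("😀".encode("utf-8"))  # 4
print(dbcs_len, utf8_len_bmp, utf8_len_emoji)
```

The same reasoning applies to any consumer that assumes the user's code page, which is why the rules above say to use either the Unicode APIs or the Windows encoding settings.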