From: <apn...@ya...> - 2025-04-14 04:52:25

TIP 716: New command "encoding user", remove UTF-8 manifest setting on Windows
<https://core.tcl-lang.org/tips/doc/trunk/tip/716.md> is ready for comments. It proposes reverting a change made in 9.0 to the tclsh and wish Windows manifests while keeping compatibility with 9.0.{0,1}.

Apologies for my usual verbosity, but when I brought this up on the mailing list prior to the previous release, the comments wandered into why UTF-8 should be the default. I've tried to explain better that this is not the issue.

I will point out that the original change to the manifest, which made UTF-8 the default on Win 10 1903+ and Win 11, should have been TIP'ed, as it overrides user settings and is a change in behavior of a public API and command. It's water under the bridge now that 9.0 has shipped, so the TIP maintains the status quo and only changes the implementation. It also adds a new encoding user command and an -encoding option to exec as a workaround for compatibility issues introduced by forcing a UTF-8 default.

Note the implementation in the tip-716 branch is mostly complete but not ready for review. I am only looking for comments on the proposal before proceeding further with tests and docs.

/Ashok

From: Harald O. <har...@el...> - 2025-04-14 11:53:39

Ashok,

thanks for the great initiative. You always save us from premature decisions, like Size_t and ptrDiff_t.

I am personally trapped in two ways:
- My own printing DLLs use the 8-bit API and no longer print German umlauts (äöüÄÖÜ).
- There is no way in Tcl 9 at the script level to find what the system encoding was in Tcl 8.6. So sourcing files with "source -encoding native" is not possible, because "native" is not known at the script level. You mention a new command "[encoding user]" in an example. I suppose this will solve the issue and "encoding user" will return what "encoding system" returns in 8.6.

About the TIP:
- GREAT !!!!
- Please describe "encoding user".
- On Windows, environment variables are less common. As a consequence, using an environment variable for the default of "exec -encoding" is, at least, strange on Windows. This is a minor point. IMHO it is also a security risk that an application may not work as expected due to external influence.

Thanks for ALL !
Harald

From: <apn...@ya...> - 2025-04-14 15:05:31

Harald,

Could you try the tip-716 branch to see if it fixes your print DLL umlaut issue?

Also, the encoding user command is documented: "Correspondingly, a new command encoding user will be added on all platforms and will return the result of Tcl_GetEncodingNameForUser." I suppose I should add the syntax synopsis for both the C API and the command.

I am not particularly tied to the use of environment variables, but note that Tcl does use several, even on Windows.

/Ashok

From: Harald O. <har...@el...> - 2025-04-14 20:09:23

Ashok,

I can confirm that "encoding user" works as expected. The print DLL test will take some days.

Another term that may confuse non-Windows users: "Win32 API" is used for both 32-bit (x86) and 64-bit (x64, ARM64) Windows.

Thanks for all,
Harald

From: <apn...@ya...> - 2025-04-15 19:02:06

Jan,

I noticed the commit you made in tclWinTest.c for testing TIP 716. Thanks for the test, but its construction is broken with respect to the intent of TIP 716.

The fundamental purpose of TIP 716 is to ensure that DLLs (which may know nothing about Tcl) that use ANSI APIs continue to work as before. Using an ANSI API involves first calling the Win32 WideCharToMultiByte function to encode the string as per the user's code page, obtained through GetACP. The encoded string is then passed to the ANSI API. The important point I've tried to emphasize in the TIP is that if you force GetACP to return utf-8,

(a) the DLL may not be coded to handle a variable-length encoding of up to 4 bytes (a lot of legacy ANSI-based code assumes DBCS or MBCS with a maximum length of 2), and
(b) the application at the other end of the data transfer (like the database server in the DB2 case) expects the real user's code page setting and fails when it sees utf-8 instead.

In your modifications to tclWinTest.c, you are encoding the file name with Tcl_UtfToExternalDString using Tcl's system encoding (utf-8) but passing it to an ANSI API that expects the user's code page encoding. This will simply not work. It should use Unicode APIs, or encode using WideCharToMultiByte(GetACP(), ...). The whole point of TIP 716 is that your changes should fail, as you should be encoding using the result of GetACP(), not [encoding system]!

The rules are (as per TIP 716):

1. Use the Unicode Windows API, as Tcl/Tk already do. Then these issues do not arise.
2. If you have to use ANSI APIs, use Windows encoding settings, not Tcl encoding settings.

To summarize the whole situation again:

In Tcl 8.x, and in 9.0 on non-Windows platforms, [encoding system] is derived from the user's settings (GetACP on Windows), making the two consistent both with each other and with other applications.

In 9.0 on new Windows platforms, GetACP and [encoding system] are in effect both hardcoded to utf-8 because of the manifest. They are consistent with each other, but only within that process, not with other applications, and they are not compatible with some DLLs, leading to the problems described in the TIP. This consistency means that your changes to tclWinTest.c would be OK for 9.0 (but only because the file system APIs, unlike GDI, support UTF-8!).

In TIP 716, the explicit choice is made to hardcode [encoding system] to UTF-8 for compatibility with 9.0 but not to hardcode GetACP, thereby making it inconsistent with [encoding system]. This inconsistency is forced because reverting to the 8.x / 9.0 non-Windows method would mean incompatibility with 9.0 on Windows in terms of the default encoding for channels. The inconsistency does, however, eliminate some of the issues listed in 716, at the cost of having to follow the rules mentioned above. Note that Tcl/Tk already uses Unicode Win32 APIs.

This whole headache has arisen because of the implicit decision in Tcl 9 to split Tcl's system encoding setting from the Windows user's encoding setting, but it's too late to turn back the clock now. Either we stick with 9.0 and the reported issues, or move to TIP 716 and stick to Unicode APIs in Tcl (which we already do anyway).

/Ashok

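Point (a) in Ashok's message to Jan (legacy ANSI code often assumes at most 2 bytes per character, while UTF-8 may need up to 4) can be illustrated with a small portable sketch. It uses no Tcl or Win32 calls, and the helper names utf8_encoded_len, utf8_bytes_needed and legacy_dbcs_budget are hypothetical, chosen only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes UTF-8 needs for one Unicode code point; 0 if the value is not
 * a valid scalar value (surrogate or beyond U+10FFFF). */
static size_t utf8_encoded_len(uint32_t cp)
{
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    if (cp < 0x10000) return 3;
    if (cp <= 0x10FFFF) return 4;
    return 0;
}

/* Total UTF-8 bytes for an array of code points. Invalid values are
 * counted as 3 bytes, the size of the replacement character U+FFFD. */
static size_t utf8_bytes_needed(const uint32_t *cps, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        size_t len = utf8_encoded_len(cps[i]);
        total += (len != 0) ? len : 3;
    }
    return total;
}

/* Buffer size a routine written for DBCS code pages would typically
 * reserve: at most two bytes per character (assumption of this sketch). */
static size_t legacy_dbcs_budget(size_t nchars)
{
    return 2 * nchars;
}
```

For three CJK characters such as U+65E5 U+672C U+8A9E, utf8_bytes_needed reports 9 bytes while legacy_dbcs_budget allows only 6, which is the kind of mismatch a DLL written for a double-byte code page cannot anticipate once GetACP reports UTF-8.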
From: Jan N. <jan...@gm...> - 2025-04-15 21:14:24

On Tue 15 Apr 2025 at 21:02, apnmbx-public--- via Tcl-Core <tcl...@li...> wrote:
> I noticed the commit you made in tclWinTest.c for testing TIP 716. Thanks for the test but the test construction is broken with respect to the intent of TIP 716.

I'm still in an "experimenting" state... Feel free to use the outcome of those experiments to clarify the TIP #716 text and/or modify the tcl9.0/doc documentation.

Hope this helps,
Jan Nijtmans

From: Harald O. <har...@el...> - 2025-04-15 19:46:36

On 14.04.2025 at 16:54, apn...@ya... wrote:
> Harald,
>
> Could you try the tip-716 branch to see if it fixes your print dll umlaut issue?

Unfortunately, my DLL with the A-suffix API prints:

    1AÄÖÜäöü

for 8.6 and

    1AÃ„Ã–ÃœÃ¤Ã¶Ã¼

for 9.0 and the tip-716 branch.

8.6 was built with MS VC 6, 32-bit.
9.0 was built with VS 2022, 32-bit.

Sorry,
Harald

From: <apn...@ya...> - 2025-04-16 02:18:24

Harald,

It *appears* from the output that you are passing UTF-8 data to the DLL, which is expecting cp1252 or similar. For example,

    % set x 1AÄÖÜäöü
    1AÄÖÜäöü
    % set utf [encoding convertto utf-8 $x]
    1AÃÃÃÃ¤Ã¶Ã¼
    % encoding convertfrom cp1252 $utf
    1AÃ„Ã–ÃœÃ¤Ã¶Ã¼

which is the output you are seeing. My *guess* (because it works on Tcl 8) is that you are using one of the Tcl encoding routines, which default to [encoding system] (utf-8), and passing that string to the DLL. Note this has not changed in TIP 716, in order to maintain 9.0 compatibility.

Can you elaborate on how you are passing data? I would like to resolve as many of these encoding issues on Windows as we can, whether they relate to the manifest/716 or not, before the next patchlevel.

/Ashok

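The effect Ashok describes (UTF-8 bytes interpreted by a consumer that expects cp1252) can be reproduced without Tcl or Windows. The following is a minimal, portable sketch; the function names put_utf8 and misread_as_cp1252 are hypothetical and are not part of any Tcl or Win32 API.

```c
#include <stddef.h>

/* cp1252 code points for bytes 0x80..0x9F (0 marks an undefined byte).
 * Bytes 0x00..0x7F and 0xA0..0xFF map to the same-numbered code point. */
static const unsigned short cp1252_high[32] = {
    0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
    0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178
};

/* Write one BMP code point as UTF-8; returns the number of bytes written. */
static size_t put_utf8(unsigned cp, char *out)
{
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (char)(0xE0 | (cp >> 12));
    out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Treat every byte of the NUL-terminated input as a cp1252 character,
 * as an ANSI consumer on a cp1252 system would, and write the result
 * as NUL-terminated UTF-8. "out" must hold 3 * strlen(bytes) + 1 bytes. */
static void misread_as_cp1252(const char *bytes, char *out)
{
    size_t o = 0;
    for (const unsigned char *p = (const unsigned char *)bytes; *p; p++) {
        unsigned cp = *p;
        if (cp >= 0x80 && cp <= 0x9F) {
            cp = cp1252_high[cp - 0x80];
            if (cp == 0) {
                cp = 0xFFFD; /* byte has no cp1252 meaning */
            }
        }
        o += put_utf8(cp, out + o);
    }
    out[o] = '\0';
}
```

Feeding it the UTF-8 bytes of 1AÄÖÜäöü yields 1AÃ„Ã–ÃœÃ¤Ã¶Ã¼: each two-byte UTF-8 sequence is read as two separate cp1252 characters.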
From: Harald O. <har...@el...> - 2025-04-16 09:21:20

Ashok,

you were totally correct. Here is the relevant code snippet:

    pStr = Tcl_GetStringFromObj(objv[PositionSPar], &lStr);
    Tcl_UtfToExternalDString(NULL, pStr, lStr, &sPar1);
    TextOut(pdlg.hDC, X0, Y0, Tcl_DStringValue(&sPar1), Tcl_DStringLength(&sPar1));

So TextOutA gets utf-8 in Tcl 9 and tip-716, and cp12xx in Tcl 8.6.

I am surprised that the manifest apparently has no influence on TextOutA. Normally, it should work with 9.0, as both the data and its interpretation are utf-8. I don't want to dig too deep here.

TIP 716 is great, and "encoding user" alone is already very important.

Thanks for all,
Harald

From: <apn...@ya...> - 2025-04-16 16:36:23

Harald wrote:
> I am surprised that the manifest apparently has no influence on TextOutA.

That's right. TextOut is part of Windows GDI, which Microsoft explicitly documents as not supporting UTF-8 *even if the code page is set to UTF-8 via the manifest*. There is a "beta" (Microsoft's word, not mine) mode you can set in the registry to enable this, but it has been in beta since *2019*, fwiw! And of course it will affect all applications and users on the system.

9.0 and TIP 716 behave the same way because the latter preserves 9.0's behaviour of [encoding system] always being utf-8 (modulo platform).

It is unfortunate that extensions have to be modified (needless pain, imo), but at least you can fix it since you have the source. The really hard nut is when you do not own the source and have no means of modifying it, as happened in the DB2 TPC case.

/Ashok

From: Jan N. <jan...@gm...> - 2025-04-16 11:06:06

On Wed 16 Apr 2025 at 11:21, Harald Oehlmann wrote:
> Here is the relevant code snippet:
>
> pStr = Tcl_GetStringFromObj(objv[PositionSPar],&lStr);
> Tcl_UtfToExternalDString( NULL, pStr, lStr, &sPar1);
> TextOut(pdlg.hDC, X0, Y0, Tcl_DStringValue( &sPar1 ), Tcl_DStringLength( &sPar1 ))

My recommendation would be to do this:

    #if (TCL_MAJOR_VERSION < 9) /* In case of Tcl 8.6 */
    #   define Tcl_UtfToWCharDString Tcl_UtfToUniCharDString
    #endif

    pStr = Tcl_GetStringFromObj(objv[PositionSPar], &lStr);
    Tcl_DStringInit(&sPar1);
    Tcl_UtfToWCharDString(pStr, lStr, &sPar1);
    /* The DString length is in bytes; TextOutW wants a count of WCHARs. */
    TextOutW(pdlg.hDC, X0, Y0, (const WCHAR *)Tcl_DStringValue(&sPar1),
            Tcl_DStringLength(&sPar1) / sizeof(WCHAR));

That's what Tcl_UtfToWCharDString was meant for, and it works with any encoding, with or without TIP #716.

Hope this helps,
Jan Nijtmans

From: Harald O. <har...@el...> - 2025-04-16 11:35:29

Jan,

thanks for the hint. I have now replaced it with the idiom introduced in Tcl 8.0 (or at least Tcl 8.4):

    Tcl_WinUtfToTChar(pStr, lStr, &sPar1);

IMHO this looks nicer, as no #if is required. I suppose there is an #if under the hood, but it is not visible to me.

But I feel I am again getting something completely wrong. That is all soooo complicated... Why are there now two commands where there was one before, and a requirement for an #if where there was none before?

The migration guide:
https://core.tcl-lang.org/tcl/wiki?name=Migrating+C+extensions+to+Tcl+9&p
does not say anything about this...

Thanks for all,
Harald

From: Jan N. <jan...@gm...> - 2025-04-16 12:03:52

On Wed 16 Apr 2025 at 13:35, Harald Oehlmann wrote:
> thanks for the hint. I have now replaced it with the idiom introduced in
> Tcl 8.0 (at least Tcl 8.4):
>
> Tcl_WinUtfToTChar( pStr, lStr, &sPar1);

That's fine too. Tcl_WinUtfToTChar is platform-dependent (Windows-only), and works fine.

The function Tcl_WinTCharToUtf() (the other way round) has the strange property that the "length" parameter must be specified in bytes, not in WCHARs. That's one source of confusion. Another is that Tcl_WinUtfToTChar and Tcl_WinTCharToUtf don't accept -1 as the length parameter, which should mean "until the terminating NUL". Tcl_UtfToWCharDString() is much more useful than Tcl_WinTCharToUtf().

> The migration:
> https://core.tcl-lang.org/tcl/wiki?name=Migrating+C+extensions+to+Tcl+9&p
> does not tell anything on this...

So, that should be extended. All is described here:
<https://core.tcl-lang.org/tips/doc/trunk/tip/548.md>

Hope this helps,
Jan Nijtmans

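The bytes-versus-WCHARs confusion Jan mentions comes down to simple arithmetic, sketched below in portable C. Nothing here calls Tcl; wchar16 is a stand-in for the 16-bit Windows WCHAR, and the helper names are hypothetical. How a particular Tcl function interprets its length argument is as described in the thread, not verified by this sketch.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t wchar16; /* stand-in for Windows WCHAR (UTF-16 code unit) */

/* Count code units up to the terminating NUL, which is what a length
 * of -1 conventionally asks an API to do on the caller's behalf. */
static size_t utf16_unit_count(const wchar16 *s)
{
    size_t n = 0;
    while (s[n] != 0) {
        n++;
    }
    return n;
}

/* Length in bytes, for an API whose length argument is a byte count. */
static size_t utf16_byte_length(const wchar16 *s)
{
    return utf16_unit_count(s) * sizeof(wchar16);
}

/* Convert a byte length (for example a DString length) back to a count
 * of code units, as needed by wide Win32 calls such as TextOutW. */
static size_t bytes_to_units(size_t nbytes)
{
    return nbytes / sizeof(wchar16);
}
```

A string holding A, Ä and one surrogate pair has 4 code units and therefore 8 bytes; passing 4 where a byte count is expected would silently truncate the text to its first two code units.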
From: Harald O. <har...@el...> - 2025-04-16 14:17:02

On 16.04.2025 at 14:03, Jan Nijtmans wrote:
> On Wed 16 Apr 2025 at 13:35, Harald Oehlmann wrote:
>> thanks for the hint. I have now replaced it with the idiom introduced in
>> Tcl 8.0 (at least Tcl 8.4):
>>
>> Tcl_WinUtfToTChar( pStr, lStr, &sPar1);
>
> That's fine too. Tcl_WinUtfToTChar is platform-dependent
> (Windows-only), and works fine.

But it is deprecated. The great advantage of Tcl_UtfToWCharDString is that it appends to the DString and does not replace it.

>> The migration:
>> https://core.tcl-lang.org/tcl/wiki?name=Migrating+C+extensions+to+Tcl+9&p
>> does not tell anything on this...
>
> So, that should be extended. All is described here:
> <https://core.tcl-lang.org/tips/doc/trunk/tip/548.md>

The man page is here:
https://www.tcl-lang.org/man/tcl9.0/TclLib/Utf.html

I think there should be examples and use cases there. This page does not mention "Tcl_WinUtfToTChar". That function is documented on the encoding.3 man page in 8.6. In Tcl 9.0, it is not documented at all. It is probably OK not to mention deprecated functions.

I may fix all this one day. In documentation, I am not as bad as in coding...

Thanks for all,
Harald

From: Ashok N. <apn...@ya...> - 2025-04-21 05:18:47

I plan a CFV on TIP 716 at the end of this month. Before then, I'm particularly interested in any opinions on the questions listed in the TIP regarding the TCL_EXEC_ENCODING environment variable. Of course, that is only relevant for those in favor of the rest of the TIP!

Thanks,
Ashok

From: Jan N. <jan...@gm...> - 2025-04-21 13:37:05

On Mon 21 Apr 2025 at 07:19, Ashok Nadkarni via Tcl-Core wrote:
> I plan a CFV on TIP 716 at the end of this month.

Harald, could you try the TIP #716 branch on the Windows environments you have?

Thanks!
Jan Nijtmans

From: Harald O. <har...@el...> - 2025-04-22 08:04:04

On 21.04.2025 at 15:36, Jan Nijtmans wrote:
> On Mon 21 Apr 2025 at 07:19, Ashok Nadkarni via Tcl-Core wrote:
>> I plan a CFV on TIP 716 at the end of this month.
>
> Harald, could you try the TIP #716 branch on the Windows environments you have?

Jan,

thanks! I have tested it already. It had no influence on GDI, as Ashok explained.

The long-awaited "encoding user" command works and allows compatibility with 8.6 via:

    source -encoding [encoding user] $file

which is a big step forward.

I think the removal of utf-8 from the manifest is a good thing. This should also happen for "wish", which could be stated explicitly in the TIP.

Thanks for all,
Harald

From: Jan N. <jan...@gm...> - 2025-04-22 09:14:59

On Tue 22 Apr 2025 at 10:04, Harald Oehlmann wrote:
> Jan,
> thanks! I have tested it already. It had no influence on GDI, as Ashok explained.
> The long-awaited "encoding user" command works and allows compatibility with 8.6 via:
> source -encoding [encoding user] $file
> which is a big step forward.
> I think the removal of utf-8 from the manifest is a good thing.
> This should also happen for "wish", which could be stated explicitly in the TIP.

Yes, I am aware that you tested it already. But I'm asking you to retest, because "trunk" has been merged into the TIP branch. This affects some test cases, as you'll see.

B.T.W.: I'm all in favour of a new "encoding user" subcommand, but the other parts of TIP #716 have some side effects which are not mentioned in the TIP text. I can explain, but it's better that you see it with your own eyes first.

Thanks!
Jan Nijtmans

From: Jan N. <jan...@gm...> - 2025-04-25 07:38:00
|
Op di 15 apr 2025 om 21:02 schreef apnmbx-public--- via Tcl-Core <tcl...@li...>:
> The rules are (as per TIP 716):
>
> Use the Unicode Windows API as Tcl/Tk already do. Then these issues do not arise.
> If you have to use ANSI API’s, use Windows encoding settings, not Tcl encoding settings.

So, let's establish that TIP #716 changes the rules!

I did some investigation into which extensions will be broken by TIP #716. But first, let's remember the rules as they were in Tcl 8.x.

1) Use the Unicode Windows API as Tcl/Tk already do. Then these issues do not arise:
    Tcl_WinUtfToTChar(str, strlen(str), &ds);
    _wfopen((WCHAR *)Tcl_DStringValue(&ds), L"r")

2) If you have to use ANSI API’s, use the following:
    Tcl_DStringInit(&ds);
    Tcl_UtfToExternalDString(NULL, str, -1, &ds);
    fopen(Tcl_DStringValue(&ds), "r")

3) If you only care about ASCII characters, just do:
    fopen(str, "r")

In Tcl 9.0.0/9.0.1, all 3 rules start working correctly with UTF-8 (except when using the GDI API, but there is no known extension using that). Whatever choice is made, 1), 2) or 3), all work fine with Tcl 9.0.0/9.0.1. If TIP #716 is accepted, 2) and 3) won't work correctly with characters outside ASCII any more. Which extensions will suffer from that? I found the following:

    blt/src/bltWinImage.c (line 1194)
    expect/exp_main_sub.c (line 813)
    pikchr/pikchr.c (line 8136)
    tkimg/sgi/sgi.c (line 1512)
    VecTcl/WavReader/generic/wavreader.c (line 88)

So, as long as BLT images, pikchr, sgi or waveform filenames use ASCII characters only, there is nothing to worry about. When running in Tcl 9.0.0 or 9.0.1, there is nothing to worry about either. But with TIP #716, the rules change, and 2) or 3) can no longer handle characters outside ASCII.

Hope this helps,
Jan Nijtmans |
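A small illustration of the failure mode described for 2) and 3) above, assuming the process ANSI code page is cp1252 rather than UTF-8. An ANSI API that receives UTF-8 bytes then reads every byte as a separate cp1252 character, so a non-ASCII file name turns into a different name. The helper name misread_as_cp1252 is hypothetical (it is not part of Tcl or Windows), and the mapping of bytes 0x80-0x9F to '?' is a simplification of the real cp1252 table. This is only a portable sketch of the byte-level effect, not how Windows implements the conversion.

```c
#include <stddef.h>

/*
 * Hypothetical helper: reinterpret each byte of `in` as one cp1252
 * character and write the result to `out` as UTF-8.
 * Returns the number of bytes written, excluding the terminating NUL.
 * Simplification: bytes 0x80..0x9F (which cp1252 maps to assorted
 * punctuation) are replaced by '?'. Bytes 0xA0..0xFF map to the code
 * points U+00A0..U+00FF, exactly as in cp1252.
 */
static size_t misread_as_cp1252(const char *in, char *out, size_t outsz)
{
    size_t n = 0;

    if (outsz == 0) {
        return 0;
    }
    for (const unsigned char *p = (const unsigned char *)in; *p != 0; p++) {
        unsigned int cp = *p;

        if (cp >= 0x80 && cp <= 0x9F) {
            cp = '?';
        }
        if (cp < 0x80) {
            /* ASCII: one byte, unchanged. */
            if (n + 1 >= outsz) {
                break;
            }
            out[n++] = (char)cp;
        } else {
            /* U+00A0..U+00FF: two-byte UTF-8 sequence. */
            if (n + 2 >= outsz) {
                break;
            }
            out[n++] = (char)(0xC0 | (cp >> 6));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

For example, passing the UTF-8 bytes of "ä.txt" (0xC3 0xA4 followed by ".txt") yields the UTF-8 text for "Ã¤.txt", while a pure ASCII name such as "plain.txt" comes back unchanged. That matches the statement that ASCII-only file names are unaffected.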
From: da S. P. J <pet...@fl...> - 2025-04-25 14:41:24
|
I will be so happy to see the backside of Tcl_DStringToExternal and its hyena cousins. ☺ |
From: Harald O. <har...@el...> - 2025-04-25 07:47:27
Attachments:
OpenPGP_signature.asc
|
Am 25.04.2025 um 09:37 schrieb Jan Nijtmans: > Op di 15 apr 2025 om 21:02 schreef apnmbx-public--- via Tcl-Core > <tcl...@li...>: >> The rules are (as per TIP 716): >> >> Use the Unicode Windows API as Tcl/Tk already do. Then these issues do not arise. >> If you have to use ANSI API’s, use Windows encoding settings, not Tcl encoding settings. > > So, let's establish that TIP #716 changes the rules! > > I did some investigation which extensions will be broken by TIP #716. > But - first - > let's remember the rules how they were in Tcl 8.x. > > 1) Use the Unicode Windows API as Tcl/Tk already do. Then these issues > do not arise: > Tcl_WinUtfToTChar(str, strlen(str), &ds); > _wfopen((WCHAR *)Tcl_DStringValue(ds), L"r") > > 2) If you have to use ANSI API’s, use the following: > Tcl_DStringInit(&ds); > Tcl_DStringToExtenal(NULL, str, -1, &ds); > fopen(Tcl_DStringValue(ds), "r") > > 3) If you only care about ASCII characters, just do: > fopen(str, "r") > > In Tcl 9.0.0/9.0.3, all 3 rules start working correctly with UTF-8 > (except when using > the GDI API but there is no known extension using that). Whatever > choice is made, 1), 2) or 3), all work fine with Tcl 9.0.0/9.0.1. If > TIP #716 is > accepted, 2) and 3) won't work correctly with characters outside ASCII any more. > Which extensions will suffer from that? I found the following: > > blt/src/bltWinImage.c (line 1194) > expect/exp_main_sub.c (line 813) > pikchr/pikchr.c (line 8136) > tkimg/sgi/sgi.c (line 1512) > VecTcl/WavReader/generic/wavreader.c (line 88) > > So, as long as BLT images, pikchr, sgi or waveform filenames > use ASCII characters only, nothing to worry about. When running > in Tcl 9.0.0 or 9.0.1, nothing to worry about either. But with > TIP #716, the rules change, and 2) or 3) can no longer handle > characters outside ASCII. > > Hope this helps, > Jan Nijtmans Dear Jan, thanks for the action. I will test later or next week. 
The example given by Ashok was a driver for the IBM database, which was loaded as a DLL into Tcl. This third-party driver stopped working because of the manifest change. Nothing Tcl-related is involved here. The only issue is that any loaded DLL is influenced by the manifest. This is only fixable by removing the UTF-8 manifest setting. There is no problem with this, as Tcl does not rely on it, and no compatible extension should rely on it. It is a workaround, not a feature. Thanks for all, Harald |
From: Jan N. <jan...@gm...> - 2025-04-25 09:46:43
|
Op vr 25 apr 2025 om 09:47 schreef Harald Oehlmann: > Dear Jan, > thanks for the action. I will test later or next week. > The example given by Ashok was a driver to the IBM data base which was > loaded as a dll into TCL. This 3rd party driver did not work any more > due to the manifest change. My guess is that the API used by this IBM DLL is expecting cp1252, but Tcl is providing/retrieving utf-8-encoded data for it. Do you know where the Tcl wrapper-code for this IBM dll is? I think there's nothing wrong with the IBM dll, It's just that Tcl should respect the IBM API. Is the code here? https://github.com/memmertoIBM/db2tcl/ Regards, Jan Nijtmans |
From: Harald O. <har...@el...> - 2025-04-25 09:57:34
Attachments:
OpenPGP_signature.asc
|
Am 25.04.2025 um 11:46 schrieb Jan Nijtmans: > Op vr 25 apr 2025 om 09:47 schreef Harald Oehlmann: >> Dear Jan, >> thanks for the action. I will test later or next week. >> The example given by Ashok was a driver to the IBM data base which was >> loaded as a dll into TCL. This 3rd party driver did not work any more >> due to the manifest change. > > My guess is that the API used by this IBM DLL is expecting > cp1252, but Tcl is providing/retrieving utf-8-encoded data > for it. Do you know where the Tcl wrapper-code for this > IBM dll is? I think there's nothing wrong with the IBM dll, > It's just that Tcl should respect the IBM API. > > Is the code here? > https://github.com/memmertoIBM/db2tcl/ > > Regards, > Jan Nijtmans To my knowledge, the DLL no longer works internally, as the interface DLL to the DB depends on it. The issue is not related to Tcl, except that the manifest value stops it from working. To my knowledge, the use case concerned has no direct Tcl interface, in the sense that no data is transferred from Tcl to the database or vice versa. That is why it took so long to find the issue. Harald |
From: Ashok N. <apn...@ya...> - 2025-04-25 13:58:31
Attachments:
db2tcl.c
db2tclcmds.c
|
Jan,

Sorry, I could not respond earlier as I have been traveling all week.

The DB2 error occurs at initialization of the DB connection, on the call

    conn->rc = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &henv);

conn->rc is the result code (-1 indicating failure); henv would normally contain the returned handle. Any attempt to retrieve additional error information with SQLGetDiagRec fails. I am not sure why, but it could be because the error retrieval mechanisms also cannot handle a UTF-8 code page.

So, to elaborate on Harald's correct interpretation of my original post, the failure mode does not involve any Tcl code at all. No data is passed, so no encoding mismatch is involved. Your conjecture about Tcl passing it UTF-8 encoded data is off base. It never gets far enough to even attempt that (not even as far as SQLConnect).

Your comment that Tcl should respect the IBM API is spot on. Except Tcl 9 does not! It is expected that the user's real code page will be retrieved with GetACP(), and Tcl 9's manifest hijacks that API to force a UTF-8 return with no recourse or workaround available. How would you suggest the above basic and fundamental API call be written to work around the issue, even if the extension writer is willing?

To remove Tcl from the equation, also see this issue reported with R 4.2 and DB2 drivers - RODBC odbcConnect fails for ibm db2 connection in Prairie Trillium · Issue #10509 · rstudio/rstudio<https://github.com/rstudio/rstudio/issues/10509>. Guess what R 4.2 and Tcl 9 have in common? The UTF-8 manifest. The motivation for R 4.2 using the manifest is spelled out in the link I included in the TIP. It is completely inapplicable to Tcl for reasons I also outlined in the TIP.

I have not looked at the link you included, but the failing extension code demonstrating the error is attached.

/Ashok |
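As a rough sketch of the GetACP() point above: a DLL loaded into a process can only ask Windows for the process-wide ANSI code page, and with the UTF-8 manifest setting that call reports 65001 (CP_UTF8) regardless of the user's configured code page. The helper names code_page_is_utf8 and process_acp_is_utf8 are hypothetical and used only for illustration. Only GetACP() and the value 65001 come from the Windows API, and the Windows-specific part is compiled only when _WIN32 is defined.

```c
#include <stdbool.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Same numeric value as CP_UTF8 in the Windows SDK headers. */
enum { CODE_PAGE_UTF8 = 65001 };

/* Hypothetical helper: true if `cp` identifies the UTF-8 code page. */
static bool code_page_is_utf8(unsigned int cp)
{
    return cp == CODE_PAGE_UTF8;
}

#ifdef _WIN32
/*
 * Hypothetical helper: true if the process ANSI code page is UTF-8.
 * GetACP() reports the process-wide value, so a loaded DLL sees the
 * result of the host executable's manifest, not the user's setting.
 */
static bool process_acp_is_utf8(void)
{
    return code_page_is_utf8(GetACP());
}
#endif
```

A driver can detect the situation this way, but it cannot recover the user's original code page through GetACP(), which is the "no recourse" problem described in the message.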
From: Ashok N. <apn...@ya...> - 2025-04-25 14:06:40
|
And other posts on the same (DB2 drivers not supporting UTF-8) - SQLAllocHandle of the driver on SQL_HANDLE_ENV failed when connecting to IBM-DB2 database - MATLAB Answers - MATLAB Central<https://www.mathworks.com/matlabcentral/answers/1786300-sqlallochandle-of-the-driver-on-sql_handle_env-failed-when-connecting-to-ibm-db2-database> and windows - DB2 service does not start - Stack Overflow<https://stackoverflow.com/questions/60296057/db2-service-does-not-start> It would really help folks making up their mind on TIP 716 (including me!) if you could spell out your motivation for adding the manifest entry to Tcl. Perhaps there is an overriding benefit to it but without a TIP or any kind of discussion prior to its introduction in Tcl 9, there is no way for anyone to know. /Ashok |