From: Harald O. <har...@el...> - 2025-04-25 17:38:17
Attachments:
OpenPGP_signature.asc
|
On 16.04.2025 at 14:03, Jan Nijtmans wrote:
> On Wed, 16 Apr 2025 at 13:35, Harald Oehlmann wrote:
>> The migration:
>> https://core.tcl-lang.org/tcl/wiki?name=Migrating+C+extensions+to+Tcl+9&p
>> does not tell anything on this...
>
> So, that should be extended. All is described here:
> <https://core.tcl-lang.org/tips/doc/trunk/tip/548.md>

The migration page now also covers the change of the MS-Windows system encoding:
https://core.tcl-lang.org/tcl/info/e16612341c1416c5
and the deprecated Tcl_Winxxx functions:
https://core.tcl-lang.org/tcl/info/d6ff4a080d8f9e87

I tried to use complete examples. The Tcl_Winxxx functions were never fully documented. I hope this works.

About TIP 548:
https://core.tcl-lang.org/tips/doc/trunk/tip/548.md
I am hesitant to document more. Basically, I don't understand the text, or it is incomplete. The type "wchar_t" is 16 bit on Windows, but may have another size on other platforms. But this is a platform issue, not a Tcl issue.

I am also not happy with the term "utf-8". What is meant here is not the UTF-8 encoding, but the Tcl variant that replaces the 0 code point. Or is this not the case, e.g. the functions may not handle 0 bytes? Also, nothing is written about possible surrogates for the 16-bit functions.

More questions than answers, as usual...

Thanks for all,
Harald

P.S. I have retried my test with the updated branch. This does not make any difference (and IMHO should not). The result is OK, and now we have migration text ;-).
|
From: Jan N. <jan...@gm...> - 2025-04-25 22:13:17
|
On Fri, 25 Apr 2025 at 16:06, Ashok Nadkarni <apn...@ya...> wrote:
>
> And other posts on the same (DB2 drivers not supporting UTF-8)
> - SQLAllocHandle of the driver on SQL_HANDLE_ENV failed when connecting to IBM-DB2 database - MATLAB Answers - MATLAB Central
> - windows - DB2 service does not start - Stack Overflow

Those posts tell me something. First, compare the story with this Tcl ticket:
<https://core.tcl-lang.org/tcl/info/0b9332722a>
Tcl had the same problem 5 years ago. It got the ACP value, prepended "cp" to it and then used the result as an encoding name. There is no "cp65001" encoding; it should be handled as "utf-8".

My guess is that the IBM DLL does the same thing. No surprise the initialisation fails. They really should fix this bug, it shouldn't be difficult. Has it already been reported to them? The DB2 DLL simply doesn't work in any UTF-8 environment.

Does that make sense?

Jan Nijtmans
|
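The "cp65001" problem described in this message can be sketched in a few lines of portable C. This is only an illustration of the mapping, not the code from the Tcl ticket and not what the DB2 driver does (that remains a guess in the thread). The helper name CodePageToEncodingName is hypothetical; the only fact relied on is that 65001 is the Windows code page identifier for UTF-8.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a Tcl or Windows API): turn a code page
 * number, such as the value GetACP() returns on Windows, into an
 * encoding name. Naively prepending "cp" yields "cp65001", which is
 * not a known encoding name, so UTF-8 gets a special case.
 */
static const char *
CodePageToEncodingName(unsigned codePage, char *buf, size_t bufSize)
{
    assert(buf != NULL && bufSize >= 16);
    if (codePage == 65001) {
        /* 65001 is the Windows code page identifier for UTF-8. */
        snprintf(buf, bufSize, "utf-8");
    } else {
        /* Legacy code pages, e.g. 1252 -> "cp1252". */
        snprintf(buf, bufSize, "cp%u", codePage);
    }
    return buf;
}
```

With a 32-byte buffer, code page 1252 maps to "cp1252" and 65001 maps to "utf-8"; without the special case the second call would produce "cp65001".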
From: Harald O. <har...@el...> - 2025-04-28 05:46:18
Attachments:
OpenPGP_signature.asc
|
On 26.04.2025 at 00:12, Jan Nijtmans wrote:
> On Fri, 25 Apr 2025 at 16:06, Ashok Nadkarni <apn...@ya...> wrote:
>>
>> And other posts on the same (DB2 drivers not supporting UTF-8)
>> - SQLAllocHandle of the driver on SQL_HANDLE_ENV failed when connecting to IBM-DB2 database - MATLAB Answers - MATLAB Central
>> - windows - DB2 service does not start - Stack Overflow
>
> Those posts tell me something. First, compare the story with this Tcl ticket:
> <https://core.tcl-lang.org/tcl/info/0b9332722a>
> Tcl had the same problem 5 years ago. It got the ACP value, prepended it
> with "cp" and then used it as an encoding name. There is no "cp65001"
> encoding, it should be handled as "utf-8".
>
> My guess is that the IBM dll does the same thing. No surprise
> the initialisation fails. They really should fix this bug, it shouldn't
> be difficult. Is it already reported to them? The DB2 dll
> simply doesn't work in any utf-8 environment.
>
> Does that make sense?
>
> Jan Nijtmans

Dear Jan,

indeed, it might be the same issue as in the ticket, who knows.

The magic manifest key is a partial workaround for programs using the 8-bit Windows API. It was introduced in Windows 10. It only affects the ANSI/char/MBCS Windows API, which has been superseded since 1993 by the wide Windows API.

What this workaround does:
- it switches some (not all) ANSI API functions from the local encoding (cp1252 in Western Europe) to UTF-8 as char encoding
- it switches the reported native code page
- as the manifest is in the exe, it applies to all loaded DLLs

Unfortunately, this breaks a lot of legacy DLLs. They are not fixed, because the source code or the maintainer is not available. In addition, from their point of view there is no reason to fix anything: the DLLs work.

The workaround above is only useful (if at all) when a small application with only WinAPI dependencies needs to be ported easily. It is not useful for Tcl, as it breaks legacy DLLs. It has no useful application here, as Tcl uses the wide Windows API.

Ashok is a bit annoyed that this was changed without a TIP. I do not share his opinion here. The intention was clear, and when I first read about it, I was also in favor: Unicode? Yes, we want it. Now we are wiser and we should revert this change. Ashok goes the formal way with a TIP, which is noble. IMHO, it could just be reverted as an invalid bugfix.

I really appreciate all your work. But putting energy in here is IMHO lost and would be better invested in the three great tickets by Christian ;-).

Thanks for all and take care,
Harald
|
From: Jan N. <jan...@gm...> - 2025-04-28 07:16:17
|
On Fri, 25 Apr 2025 at 19:37, Harald Oehlmann wrote:
> P.S. I have retried my test with the updated branch. This does not make
> any difference (and should IMHO not). The result is ok and now, we have
> migration text ;-).

1) So, did you see the following test failures?
2) Does TIP #716 tell anything about this?

Hope I have your attention now ;-)
Jan Nijtmans

P.S.: GitHub Actions shows the same failure:
<https://github.com/tcltk/tcl/actions/runs/14635843756/job/41066664724>

==== cmdAH-16.2 Tcl_FileObjCmd: readable FAILED
==== Contents of test case:
file readable $gorpfile
---- Test setup failed:
D:\a\tcl\tcl\win\górp.file: no such file or directory
---- errorInfo(setup): D:\a\tcl\tcl\win\górp.file: no such file or directory
    while executing
"testchmod 0o444 $gorpfile"
    ("uplevel" body line 1)
    invoked from within
"uplevel 1 $setup"
---- errorCode(setup): POSIX ENOENT {no such file or directory}
==== cmdAH-16.2 FAILED

==== cmdAH-17.2 Tcl_FileObjCmd: writable FAILED
==== Contents of test case:
file writable $gorpfile
---- Test setup failed:
D:\a\tcl\tcl\win\górp.file: no such file or directory
---- errorInfo(setup): D:\a\tcl\tcl\win\górp.file: no such file or directory
    while executing
"testchmod 0o555 $gorpfile"
    ("uplevel" body line 1)
    invoked from within
"uplevel 1 $setup"
---- errorCode(setup): POSIX ENOENT {no such file or directory}
==== cmdAH-17.2 FAILED

==== cmdAH-17.3 Tcl_FileObjCmd: writable FAILED
==== Contents of test case:
file writable $gorpfile
---- Test setup failed:
D:\a\tcl\tcl\win\górp.file: no such file or directory
---- errorInfo(setup): D:\a\tcl\tcl\win\górp.file: no such file or directory
    while executing
"testchmod 0o222 $gorpfile"
    ("uplevel" body line 1)
    invoked from within
"uplevel 1 $setup"
---- errorCode(setup): POSIX ENOENT {no such file or directory}
==== cmdAH-17.3 FAILED

cmdIL.test
|
From: Harald O. <har...@el...> - 2025-04-28 09:44:03
Attachments:
OpenPGP_signature.asc
|
Jan,

thank you for the information.
That is great.
If we need it, we need it.

Thanks,
Harald
|
From: Jan N. <jan...@gm...> - 2025-04-28 11:23:11
|
On Mon, 28 Apr 2025 at 11:43, Harald Oehlmann wrote:
> That is great.
> If we need it, we need it.

Well, I would like the DB2 problem to be fixed too.

TIP #716 describes a new function Tcl_GetEncodingNameForUser() which could be of help fixing the testcases. So, I tried it:
<https://core.tcl-lang.org/tcl/info/8a040c000aa41a0e>
It's on the tip of the "tip-716" branch now. This demonstrates that whatever function we use in testcases, the function needs to be exported through the stub table.

Thinking further, a function returning the encoding _name_ in itself is not so useful. It would be more useful to have a function returning the Tcl_Encoding itself. If we want the name only, we can always use Tcl_GetEncodingName(). That's my inspiration for TIP #718. ;-) I think "encoding name" in itself is useful.

Hope this helps,
Jan Nijtmans
|
From: Ashok N. <apn...@ya...> - 2025-04-28 12:34:30
|
Jan,

You always have our attention but I am not sure about the converse! I already commented on your test cases in Re: [TCLCORE] TIP 716 ready for comments | Tcl <https://sourceforge.net/p/tcl/mailman/message/59173666/> and how they needed to be written. Your response was that you were still experimenting. In any case, I have fixed your failing chmod tests.

TL;DR: use wide character APIs to avoid these exact issues.

In particular, from the TIP (emphasized in the TIP itself):

    For compatibility reasons, the Tcl_GetEncodingNameForUser function will not be public via stubs in 9.0.2 but will be public in 9.1.

Further attempts to clarify my preference for 716, as much as I have to hold my nose when proposing it ...

There are four transfer points at which encodings matter:

    Tcl core <-> Tcl extensions <-> third party DLLs <-> external applications

In Tcl 8.6, all was hunky dory because all of the above components agreed on the encodings:

    [encoding system] == Tcl.GetACP() == OtherProcess.GetACP() (i.e. the user's encoding in the registry)

Both Tcl 9 and TIP 716 break this in slightly different ways.

In Tcl 9, [encoding system] == Tcl.GetACP() == utf-8 (within the Tcl process only!) but Tcl.GetACP() != OtherProcess.GetACP(). This causes two forms of breakage. First, because Tcl.GetACP() != OtherProcess.GetACP(), transfers to other applications break (e.g. piping to Windows command line programs, database servers etc.). Second, because Tcl.GetACP() == utf-8 applies to shared libraries, this breaks shared libraries that cannot handle UTF-8, e.g. GDI etc. (By shared libraries, I don't mean Tcl extensions but DLLs linked to Tcl or extensions that know nothing about Tcl.)

TIP 716 chooses a different mode of breakage! [encoding system] == utf-8 (bad idea imo but required for Tcl 9 compatibility w.r.t. default channel encodings), BUT [encoding system] != Tcl.GetACP() and instead Tcl.GetACP() == OtherProcess.GetACP().

Neither Tcl 9.0 nor 716 is optimal, unlike 8.6. Why then do I prefer 716? Because with 716 the breaking points can be fixed within Tcl and Tcl extension code by switching to wide character APIs or using Tcl encoding APIs. It's all code within the control of the extension writer. My changes to your test cases above fall into this category.

On the other hand, Tcl 9.0 behavior has the potential for issues like DB2, where the extension writer has *no recourse* other than his own tclsh build with the manifest removed. That in turn immediately makes the "modified" tclsh incompatible with the "stock" tclsh (files written on the same system by one will be unreadable by the other), which is not a desirable thing.

I would seriously consider withdrawing 716 if you can figure out how to fix the DB2 kind of issue.

/Ashok

_______________________________________________
Tcl-Core mailing list
Tcl...@li...
https://lists.sourceforge.net/lists/listinfo/tcl-core
|
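The "use wide character APIs" advice comes down to converting Tcl's UTF-8 bytes into UTF-16 code units before calling a ...W function, so the active code page never enters the picture. Below is a minimal, portable sketch of that conversion step. The name Utf8ToUtf16 is hypothetical, and the routine is deliberately simplified: it does not reject overlong forms or handle Tcl's internal encoding of the 0 code point. Real extension code would presumably use Tcl's own conversion routines (TIP 548, linked earlier in the thread, covers the wchar_t functions) rather than a hand-written loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a NUL-terminated UTF-8 string into
 * UTF-16 code units. Returns the number of units written (without
 * the terminating 0), or (size_t)-1 on malformed input or when the
 * destination buffer is too small.
 */
static size_t
Utf8ToUtf16(const char *src, uint16_t *dst, size_t dstCap)
{
    const unsigned char *p = (const unsigned char *)src;
    size_t n = 0;

    assert(src != NULL && dst != NULL && dstCap > 0);
    while (*p) {
        uint32_t cp;
        int extra;

        /* Decode the lead byte and the number of continuation bytes. */
        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return (size_t)-1;
        p++;
        while (extra-- > 0) {
            /* A NUL here also fails the test, so no read past the end. */
            if ((*p & 0xC0) != 0x80) return (size_t)-1;
            cp = (cp << 6) | (uint32_t)(*p++ & 0x3F);
        }
        if (cp > 0x10FFFF) return (size_t)-1;

        if (cp >= 0x10000) {
            /* Outside the BMP: emit a surrogate pair. */
            if (n + 2 >= dstCap) return (size_t)-1;
            cp -= 0x10000;
            dst[n++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {
            if (n + 1 >= dstCap) return (size_t)-1;
            dst[n++] = (uint16_t)cp;
        }
    }
    dst[n] = 0;
    return n;
}
```

For the file name in the failing tests, the five UTF-8 bytes of "górp" become four UTF-16 units, with the second one equal to 0x00F3. That result is the same on every system, which is why the wide API path does not depend on GetACP().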
From: Jan N. <jan...@gm...> - 2025-04-28 13:09:14
|
On Mon, 28 Apr 2025 at 14:34, Ashok Nadkarni wrote:
> TL;DR use wide character API's to avoid these exact issues.

So, why isn't there a "Compatibility" section in the TIP describing that, starting with Tcl 9.0.2, the Windows ANSI API cannot be used any more in the same way as in Tcl 8.6/9.0.0/9.0.1?

This should also be put in the Tcl 9.0.2 release notes, if this TIP is accepted.

If the majority is OK with this breakage, I won't stand in the way.

Jan Nijtmans
|
From: Ashok N. <apn...@ya...> - 2025-04-30 05:27:13
|
I have no objection to a Compatibility section as such, but note that the brokenness of the ANSI APIs is not new to 9.0.2.

The following idiom, which I presume you are referring to,

    Tcl_UtfToExternalDString(NULL,...,&ds);
    CreateFileA(Tcl_DStringValue(&ds)...);

can produce garbage file names as far back as 8.0. I don't know if it makes sense to add a compatibility note that says in effect "Use of ANSI APIs continues to be broken in 9.0.2 ...".

As far as release notes are concerned, of course an item will be added, as it is a user-visible change. Manpages also need to be updated. There is currently nothing useful in the manpages about [encoding system] or the differences between platforms, or even between Windows versions, so that is something I intend to add irrespective of whether 716 passes or not.

The CFV will come after all of the above is done and we have discussed the whole topic in the next online meeting.

Thanks
/Ashok
|
From: Jan N. <jan...@gm...> - 2025-04-30 07:44:32
|
On Wed, 30 Apr 2025 at 07:26, Ashok Nadkarni wrote:
> The following idiom I presume you are referring to,
>
>     Tcl_UtfToExternalDString(NULL,...,&ds);
>     CreateFileA(Tcl_DStringValue(&ds)...);
>
> can produce garbage file names as far back as 8.0. I don't know if it makes sense to add a compatibility note that says in effect "Use of ANSI API's continues to be broken in 9.0.2 ...".

Yes, this is what I'm referring to. In 8.x it works fine, as long as the filename is restricted to CP1252. In 9.0.0/9.0.1 it works fine as long as the filename is restricted to UTF-8 ;-). With TIP #716 it works fine as long as the filename is restricted to ASCII (which is even more broken than 8.x ...)

> CFV will be after all the above is done and we discuss the whole topic in the next online meet.

Still, because of the DB2 failure (and maybe more DLLs; the TIP says "many" but I am not aware of any other) we have to do something about it, so the best approach is to accept TIP #716 and handle the damage later.

Hope this helps,
Jan Nijtmans
|
From: Jan N. <jan...@gm...> - 2025-04-30 10:43:53
|
On Wed, 30 Apr 2025 at 09:43, Jan Nijtmans wrote:
> > The following idiom I presume you are referring to,
> >
> >     Tcl_UtfToExternalDString(NULL,...,&ds);
> >     CreateFileA(Tcl_DStringValue(&ds)...);

For completeness, another idiom I'm referring to:

    const char *str = Tcl_GetStringFromObj(....)
    CreateFileA(str, ...)

Surprisingly many extensions use that. In 8.x it works fine, as long as the filename is restricted to ASCII. In 9.0.0/9.0.1 it works fine as long as the filename is restricted to UTF-8 ;-). With TIP #716 it works fine as long as the filename is restricted to ASCII (which is the same brokenness as 8.x ...)

It's up to you how to incorporate this information in the TIP text, or - if you want - leave it out.

Some examples:
https://sqlite.org/src/file?name=src/tclsqlite.c&ci=ba7d5bad32ad6aac&ln=2604
https://github.com/auriocus/VecTcl/blame/master/WavReader/generic/wavreader.c (line 88)

Regards,
Jan Nijtmans
|
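The "restricted to ASCII" condition in this message is easy to state in code: a byte string is safe to hand to an ANSI function under any code page only if every byte is below 0x80. The following sketch shows such a guard. The name IsPureAscii is hypothetical and nothing like it is proposed in the thread; it only makes the condition concrete. The fix the thread actually recommends is the wide API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical guard: returns 1 if the NUL-terminated string contains
 * only 7-bit ASCII bytes, 0 otherwise. UTF-8, CP1252 and ASCII agree
 * on these bytes, so such a name means the same under every ANSI code
 * page discussed here.
 */
static int
IsPureAscii(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s >= 0x80) {
            return 0;   /* non-ASCII byte: meaning depends on code page */
        }
    }
    return 1;
}
```

"gorp.file" passes the check, while the UTF-8 bytes of "górp.file" from the failing tests do not, because the "ó" is stored as the two bytes 0xC3 0xB3.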
From: Ashok N. <apn...@ya...> - 2025-05-01 15:06:06
|
Jan,

First, thanks for the considerable time you have spent on poking at 716. It is appreciated.

I will summarize the discussion and the points you brought up in the TIP. In any case, I do not plan on a CFV until the options for proceeding are discussed in the next meeting.

Regarding your comment about 9.0.0/9.0.1: because it does not hold for all Windows platforms supported by 9.0, nor for all Windows APIs, I would hesitate to say it works fine.

/Ashok
|
From: Jan N. <jan...@gm...> - 2025-05-05 09:51:35
|
On Mon, 14 Apr 2025 at 06:52, Ashok wrote:
>
> TIP 716: New command "encoding user", remove UTF-8 manifest setting on Windows is ready for comments. It proposes reversion of a change made in 9.0 to tclsh and wish Windows manifests while keeping compatibility with 9.0.{0,1}.

Let me share two more reservations on the TIP text:

1) Two times in the TIP, "MingW msvcrt" builds are mentioned as behaving differently from ucrt builds. Quoting:

    "The key difference with respect to the current implementation issue that this does not impact extensions that call GetACP solving the first issue listed above, or using MingW msvcrt builds."

    "An example is components built with MingW64 gcc using the msvcrt runtime"

This is not correct. I did test that (see ticket https://core.tcl-lang.org/tcl/tktview/8ffd8cabd1), which was done using a MingW msvcrt build. The GetACP() function behaves exactly the same. There is an issue with mixing different runtimes, but that's related to using stdin/stdout: different runtimes each have their own FILE implementation. So, opening a FILE in an extension using UCRT and writing to it from another extension using MSVCRT is expected to crash.

My suggestion: remove it from the TIP text (or provide a testcase proving your point, I will be happy to try it in my environment).

2) There is no use case for exporting Tcl_GetEncodingNameForUser(). I tried to use it, but the way it is now it's a bad idea. I am thinking about a separate TIP for a more useful approach. Stay tuned.

The reason I suspect that you didn't do any tests in the MinGW msvcrt environment is that compilation in this environment failed:
<https://github.com/tcltk/tcl/actions/runs/14725688360>
I corrected that here, as a free service ;-)
<https://core.tcl-lang.org/tcl/info/479fc6ad0dff8cdd>

Hope this helps,
Jan Nijtmans
|
From: Ashok N. <apn...@ya...> - 2025-05-05 17:21:48
|
Jan,

Let me first respond to your second reservation regarding Tcl_GetEncodingNameForUser(). Since your objection merely stated "it's a bad idea" without elaborating as to why, I have to make some guesses as to what exactly you object to.

Based on your TIP 718, which suggests a new function TclWinGetUserEncoding(), I assume you would prefer the function to return a Tcl_Encoding as opposed to a character string for the encoding name. I did think about that too, but there are two reasons I went with the current function signature returning the encoding name as a string.

The first is that it is consistent with the existing API Tcl_GetEncodingNameFromEnvironment(), which also returns a string, not a Tcl_Encoding. There is something to be said for consistency when two APIs have similar functionality.

Secondly, while a Tcl_Encoding may be more convenient for Tcl_UtfTo* and friends, it is less so for channel operations. Tcl_SetChannelOption takes encoding names, not Tcl_Encoding, as the values for the -encoding option, so you can say

    Tcl_SetChannelOption(NULL, chan, "-encoding", Tcl_GetEncodingNameForUser(&ds))

(or something similar)

Given that Tcl_Encoding and the encoding name can be converted in both directions, it is six of one and half-a-dozen of the other, one being easier for one set of Tcl functions and the other for another set. I went with the encoding name because it is the more fundamental operation and consistent with Tcl_GetEncodingNameFromEnvironment(). Having said that, I have no objection to providing both.

Of course, the above does not apply if your comment about Tcl_GetEncoding... being a bad API was based on some other factor I have not considered. Can't really tell unless you are more specific.

Now, with regard to the mingw msvcrt comment...

It seems to me from your comment about GetACP() in MingW that you think I was suggesting the manifest would not work in MingW. That is not so. GetACP() will work the same way and return utf-8 in the presence of a manifest, but the problem is that MSVCRT does not support UTF-8.

I do not know if you read the link that I had referenced in the TIP:
https://www.msys2.org/docs/environments/#msvcrt-vs-ucrt

To quote from there: "It doesn't support the UTF-8 locale" ("It" being MSVCRT)

When the official release document says MSVCRT does not support UTF-8, I take that at face value. That is why I did not bother to test it. Presuming the tests you reference are the ones you added for TIP 716, they indicate that UTF-8 works for one particular configuration of mingw, for one version of gcc, for two specific functions (fopen and chmod, I think) and for one specific Western European character. You are free to presume that this means msvcrt has no issues with UTF-8. I, however, prefer to go with the official statement and not presume that the success of one small test case is proof. So I will keep that reference in the TIP. I can however add your comment that, despite the official MSYS2 position, you believe MSVCRT works fine based on the tests you did.

As an aside, I do not understand why you are raising the issue of passing FILE* between different C runtimes. When did I ever raise that issue, and what does it have to do with the current TIP 716 discussion? Don't we have enough confusion without adding completely irrelevant considerations?

/Ashok
|
From: Jan N. <jan...@gm...> - 2025-05-05 21:39:27
|
On Mon, 5 May 2025 at 19:22, Ashok Nadkarni wrote:
> Now, with regard to the mingw msvcrt comment...
>
> It seems to me from your comment about GetACP() in MingW that you think I was suggesting the manifest would not work in MingW. That is not so. GetACP() will work the same way and return utf-8 in the presence of a manifest but the problem is MSVCRT does not support UTF-8.
>
> I do not know if you read the link that I had referenced in the TIP https://www.msys2.org/docs/environments/#msvcrt-vs-ucrt
>
> To quote from there - It doesn't support the UTF-8 locale ("It" being MSVCRT)
>
> When the official release document says MSVCRT does not support UTF-8, I take that at face value.

Let me clarify then. See:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170

With msvcrt, you cannot do a setlocale("en_US.UTF8"); with UCRT it works. Tcl doesn't support that either. The only setlocale() call is here:
<https://core.tcl-lang.org/tcl/file?name=win/tclAppInit.c&ci=ead995eddf5fff98&ln=111>

MSVCRT supports the "utf-8" _encoding_, not the "utf-8" _locale_; that's a different thing.

Hope this clarifies enough.

Happy hacking,
Jan Nijtmans
|
From: Ashok N. <apn...@ya...> - 2025-05-06 04:47:40
|
No, I'm afraid your last post does not clarify at all. The link you gave pertains to UCRT, not MSVCRT which, unless I read with my blinders on, is not mentioned anywhere on that page. Adding UCRT to the discussion, first with the passing of FILE* between runtimes, and now with the discussion of setlocale(), only serves to obfuscate further.

Further, given that the link you gave has no discussion of MSVCRT, what was the source of your statement <MSVCRT supports the "utf-8" _encoding_, not the "utf-8" _locale_,>? If you can provide the basis for your conclusion, that would really help.

/Ashok
|
From: Ashok N. <apn...@ya...> - 2025-05-06 07:47:46
|
Jan,

It occurs to me that 9.1 will stop supporting Windows releases prior to Win 10. Given that, we can simply drop support for MinGW/MSVCRT as all Windows systems will have UCRT installed. If that is agreeable, we can drop the discussion around MSVCRT and focus on other points of contention in 716/718.

/Ashok

________________________________
From: Ashok Nadkarni via Tcl-Core
Sent: Tuesday, May 6, 2025 10:17 AM
To: Jan Nijtmans
Cc: Tcl Core List
Subject: Re: [TCLCORE] TIP 716 ready for comments

No, I'm afraid your post below does not clarify at all. That link you gave pertains to UCRT, not MSVCRT which, unless I read with my blinders on, is not mentioned anywhere on that page. Adding UCRT to the discussion, first with passing of FILE* between runtimes, and now to discussion of setlocale(), only serves to obfuscate further.

Further, given the link you gave has no discussion of MSVCRT, what was the source of your statement <MSVCRT supports the "utf-8" _encoding_, not the "utf-8" _locale_,> ? If you can provide the basis for your conclusion, that would really help.

/Ashok

________________________________
From: Jan Nijtmans
Sent: Tuesday, May 6, 2025 3:08 AM
To: Ashok Nadkarni
Cc: Tcl Core List
Subject: Re: [TCLCORE] TIP 716 ready for comments

On Mon, 5 May 2025 at 19:22, Ashok Nadkarni wrote:
> Now, with regard to the mingw msvcrt comment...
>
> It seems to me from your comment about GetACP() in MingW that you think I was suggesting the manifest would not work in MingW. That is not so. GetACP() will work the same way and return utf-8 in the presence of a manifest but the problem is MSVCRT does not support UTF-8.
>
> I do not know if you read the link that I had referenced in the TIP https://www.msys2.org/docs/environments/#msvcrt-vs-ucrt
>
> To quote from there - It doesn't support the UTF-8 locale ("It" being MSVCRT)
>
> When the official release document says MSVCRT does not support UTF-8, I take that at face value.

Let me clarify then. See:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170

With msvcrt, you cannot do a setlocale("en_US.UTF8"), with UCRT it works; Tcl doesn't support that either. The only setlocale() call is here:
<https://core.tcl-lang.org/tcl/file?name=win/tclAppInit.c&ci=ead995eddf5fff98&ln=111>

MSVCRT supports the "utf-8" _encoding_, not the "utf-8" _locale_, that's a different thing

Hope this clarifies enough.

Happy hacking,
Jan Nijtmans |
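A side note on the GetACP() point quoted above: with a UTF-8 manifest the active code page is 65001, and, as mentioned earlier in this thread, simply prepending "cp" to that number gives "cp65001", a name for which no encoding exists. The mapping can be sketched roughly as follows. This is only an illustration: encoding_name_for_codepage() is a hypothetical helper, not Tcl's actual function, and it takes a plain integer instead of calling GetACP() so that it compiles on any platform.

```c
#include <stdio.h>

/* 65001 is the Windows code page identifier for UTF-8 (CP_UTF8). */
#define UTF8_CODEPAGE 65001u

/*
 * Hypothetical helper (an assumption for illustration, not Tcl source):
 * turn a Windows code page number into an encoding name. UTF-8 is
 * special-cased; every other code page becomes "cp<number>", written
 * into the caller-supplied buffer.
 */
static const char *encoding_name_for_codepage(unsigned cp, char *buf, size_t buflen)
{
    if (cp == UTF8_CODEPAGE) {
        return "utf-8";
    }
    snprintf(buf, buflen, "cp%u", cp);
    return buf;
}
```

On Windows the first argument would typically be the value returned by GetACP(); whether a given "cp<number>" encoding is actually available is a separate question that this sketch does not answer.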
From: Jan N. <jan...@gm...> - 2025-05-06 09:57:27
|
On Tue, 6 May 2025 at 06:47, Ashok Nadkarni wrote:
> Further, given the link you gave has no discussion of MSVCRT, what was the source of your statement <MSVCRT supports the "utf-8" _encoding_, not the "utf-8" _locale_,> ? If you can provide the basis for your conclusion, that would really help.

Start by reading the "UTF-8 support" chapter. Quoting:

"Starting in Windows 10 version 1803 (10.0.17134.0), the Universal C Runtime supports using a UTF-8 code page. The change means that char strings passed to C runtime functions can expect strings in the UTF-8 encoding. To enable UTF-8 mode, use ".UTF8" as the code page when using setlocale"

Then realize that the only thing Tcl does with setlocale() is set the locale to "C". For the C runtime that means everything is just bytes; no interpretation of the bytes is done. That's the mode Tcl uses. Microsoft only implemented the ".UTF8" locale in the UCRT, starting with Windows 10 version 1803, not in any earlier runtimes. Since Tcl doesn't use this mode anyway, who cares! The C runtime only knows about bytes then.

Hope this helps,
Jan Nijtmans |
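The difference between the "C" locale and the ".UTF8" locale described above can be probed with a few lines of standard C. The sketch below is not taken from the Tcl sources; try_locale() is a hypothetical helper name. It only reports whether the C runtime it is linked against accepts a given locale name.

```c
#include <locale.h>
#include <stdio.h>

/*
 * Hypothetical probe (an assumption for illustration): ask the C runtime
 * to switch LC_ALL to the named locale. Returns 1 and copies the
 * runtime's name for the locale into out on success; returns 0 and
 * leaves out empty if the runtime rejects the name.
 */
static int try_locale(const char *name, char *out, size_t outlen)
{
    const char *res = setlocale(LC_ALL, name);

    if (outlen > 0) {
        out[0] = '\0';
    }
    if (res == NULL) {
        return 0;   /* locale name not supported by this runtime */
    }
    /* Copy now: a later setlocale() call may overwrite the result. */
    snprintf(out, outlen, "%s", res);
    return 1;
}
```

Calling try_locale("C", ...) should succeed with any conforming runtime, since the "C" locale is required by the C standard. Calling try_locale(".UTF8", ...) is the interesting case: going by the Microsoft text quoted above, it is expected to succeed only with a sufficiently recent UCRT, and the outcome with other runtimes is exactly what is being argued about here. A caller that wants Tcl's byte-oriented behaviour would set the locale back to "C" afterwards.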
From: Ashok N. <apn...@ya...> - 2025-05-06 13:36:16
|
Jan, I specifically said the issue is about MSVCRT, not UCRT and you respond again with UCRT docs from the same page! :shrug: |
From: Kevin K. <kev...@gm...> - 2025-05-06 18:29:15
|
I do worry a bit about Tcl extensions whose purpose it is to wrap third-party code that may be linked with a different C runtime. While I know - or rather can relearn when I must - how to manage different runtimes within the same process, I've certainly seen a lot of third-party libraries that demand that the whole process use the same runtime (usually MSVCRT) because otherwise memory corruption results - many programmers are unaware of the possibility that malloc() and free() may be coming from different runtimes if the respective callers were built in different DLLs. Would we want to continue supporting MSVCRT(D) and statically linked LIBC's for that reason?

--
73 de ke9tv/2, Kevin |
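The malloc()/free() hazard mentioned above is commonly avoided by having the module that allocates a block also export the function that releases it, so both calls resolve to the same C runtime no matter how the caller was built. A minimal sketch of that convention follows; mylib_strdup() and mylib_free() are hypothetical names chosen for illustration and do not belong to any real library.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical library API (names are assumptions). Both functions are
 * compiled into the same module, so malloc() and free() below always
 * come from the same C runtime.
 */

/* Return a heap copy of src, or NULL if allocation fails. */
char *mylib_strdup(const char *src)
{
    size_t n = strlen(src) + 1;   /* include the terminating NUL */
    char *copy = malloc(n);

    if (copy != NULL) {
        memcpy(copy, src, n);
    }
    return copy;
}

/* Release memory obtained from mylib_strdup(). Callers use this
 * instead of calling free() from their own runtime. */
void mylib_free(void *p)
{
    free(p);
}
```

A caller built against a different runtime would then write mylib_free(s) rather than free(s). Tcl's own Tcl_Alloc()/Tcl_Free() pair follows the same idea. This does not help with third-party libraries that hand out raw malloc()'ed pointers and expect the caller to free() them, which is the case raised above.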
From: Ashok N. <apn...@ya...> - 2025-05-07 16:44:33
|
Kevin, I'm afraid I don't really know the answer to your concern about the impact of dropping MSVCRT support. My impression is that most devs are migrating to UCRT but that is only an impression, not hard evidence. /Ashok |