From: Jan N. <jan...@gm...> - 2025-03-20 15:51:22
|
This is a CFV warning for TIP #626 for Tcl 9.1+: Command arguments > 2^31 elements <https://core.tcl-lang.org/tips/doc/trunk/tip/626.md> If you think this is a bad idea, speak up now. If not, I'll start the vote in a few days. Regards, Jan Nijtmans |
From: Harald O. <har...@el...> - 2025-03-21 09:17:05
Attachments:
OpenPGP_signature.asc
|
Jan, I think this is a good idea. For me, cutting old stuff is always good. It hurts once, but does not bite 10 times later. I also think the decision not to release Tcl/Tk 8.7 was a good one. Now we really migrate to 9.0 and have a better and cleaner environment. Sorry for anybody struggling, but 9.0 is really better, especially for Tk. Take care, Harald |
From: Dipl. I. S. G. B. <se...@us...> - 2025-03-21 11:43:17
|
Unfortunately, this is exactly not old stuff. It looks rather like the elimination of a self-induced issue (introduced by TIP 627). I have never understood the necessity of using 64-bit types wherever possible. In the case of string/list length it may indeed be justified (although one could also provide new types like bigstring or biglist for that), but for commands it is not only questionable (who really needs 2 billion arguments to a command?), it also costs extra memory (per command), requires a new layer to keep it backward compatible (with the performance cost involved), etc.

In my opinion TIP 627 was a mistake (together with TIP 626 and co.), let alone other things like Command::refCount or Command::cmdEpoch. Although I would not expect more than 2 billion references, that is at least imaginable (for refCount); cmdEpoch may overflow without any issue, yet it is also 64-bit now.

There is only one case where a command/proc could really get 2 billion arguments: the expansion of a really large list. But in that case they would not be named arguments; they would end up in something like the `args` parameter, which is also a list object. So why not treat it as a special case and make some "transparent" expansion possible instead? That would be a much nicer enhancement, without losing backward compatibility (building crutches for an extra layer to support both 32-bit and 64-bit argument counts), increasing memory usage, etc.

The extra memory consumption (from using larger types en masse) also matters: memory is cheap nowadays, but CPU cache is not, and the extra memory definitely causes more cache washouts, invalidations, etc. The problem is that such en masse type replacement penalizes all the regular cases too, not only those where it may be needed (I guess not even 1%; we are probably speaking about fractions of a percent).
On second thought, one might see that all this is not just something like "640K ought to be enough for anybody", but a set of reasonable and legitimate arguments against 64-bit types everywhere at all costs. It is a pity that nobody seems to care, and that the votes of 3 people are enough for such grave modifications. Regards, Serg. |
From: Harald O. <har...@el...> - 2025-03-21 12:23:48
Attachments:
OpenPGP_signature.asc
|
Dear Sergey, thank you for the comment. I appreciate your valuable opinion and try to get this in. Our main issue is that many people do not know the consequences at voting time. You are *the* expert on timing and memory usage. Maybe we can bring the discussion to the level of real numbers: * what is the impact on time and memory usage? I don't think Jan will react to this post without concrete numbers... It is often hard to get wizard-level opinions from you, Christians, Marc etc., and they are so valuable. Thanks for all, Harald |
From: Eric B. <eri...@gm...> - 2025-03-21 13:15:19
|
Hi, My voice doesn't count, but I totally agree with Sergey. What's the point of creating such a command? Also, just try this, which is less than 2**31:

    # Prepare list of parameters
    set L {}
    for {set i 0} {$i < 2**31-1} {incr i} {
        lappend L a$i
    }

I didn't write the rest of the code; I had to close the application before it finished... But imagine byte-compiling the proc and placing the parameters in the LVT... Eric |
From: Jan N. <jan...@gm...> - 2025-03-21 13:17:31
|
On Fri, 21 Mar 2025 at 13:24, Harald Oehlmann wrote:
> You are *the* expert on timing and memory usage.
> Maybe, we may get the discussion to a level of real numbers:
>
> * what is the impact in time and memory usage?

If a benchmark would help to convince people, please share benchmark results here. I didn't do that, because TIP #626 only reduces lines of code. For example, have a look at these lines in the TclEvalEx() function (which is a heavily used function!), and the similar lines in TclNRExecuteByteCode():
<https://core.tcl-lang.org/tcl/artifact/4e2f21469d278886?ln=5408-5414>
<https://core.tcl-lang.org/tcl/artifact/250a9712ef70bfb3?ln=2827-2832>
Everywhere we turn a list into a command, we have to check against the maximum command size and produce an error if the command is too large. In the TIP #626 implementation, this error handling is all gone. Hope this helps, Jan Nijtmans |
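Editor's sketch of the kind of check Jan describes, for readers who do not want to open the linked artifacts. This is not the Tcl source: `SketchSize` and `CheckWordLimit` are hypothetical names, and only the error text is taken from the transcript quoted later in this thread. The sketch assumes a 64-bit list length has to be rejected whenever it does not fit an `int` word count.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a 64-bit Tcl_Size. */
typedef long long SketchSize;

/*
 * Returns 0 if a list of listLen elements can become a command whose
 * word count is stored in an int. Otherwise writes an error message
 * into msg and returns -1.
 */
int
CheckWordLimit(SketchSize listLen, char *msg, size_t msgSize)
{
    assert(listLen >= 0);
    if (listLen > (SketchSize)INT_MAX) {
        snprintf(msg, msgSize,
                "Number of words (%lld) in command exceeds limit %d.",
                listLen, INT_MAX);
        return -1;
    }
    return 0;
}
```

With a 64-bit argument count throughout, a check of this shape has nothing left to reject, which is the reduction in code Jan refers to.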
From: Jan N. <jan...@gm...> - 2025-03-21 13:40:55
|
On Fri, 21 Mar 2025 at 14:16, Jan Nijtmans wrote:
> In the TIP#626 implementation,
> this error-handling is all gone.

Thanks for reminding me: <https://core.tcl-lang.org/tcl/info/124360b3fda00f0f> Since TclCommandWordLimitError() isn't used anywhere any more, it is now removed completely. Then about the cost: only when extensions create their own command (which they usually do when they are initialized) does it take an additional allocation. And when the function is called, there is a small wrapper function involved (which just passes the parameters through). In my judgement, that is a much smaller cost than the maintenance cost of using TclCommandWordLimitError() everywhere a list element count becomes a command argument count. I'm sure there are cases in Tcl 9.0 where such a check would be needed, but where it is missing now. Hope this helps, Jan Nijtmans |
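Editor's sketch of the two costs Jan names: one extra allocation when a legacy command is registered, and one extra call per invocation. All names here (`SketchSize`, `LegacyProc`, `WideProc`, `WrapperInfo`, `WrapLegacy`, `WrapperProc`, `CountArgs`) are hypothetical and do not mirror the real Tcl internals. The range check in the wrapper is an assumption added for safety; Jan describes the real wrapper as simply passing the parameters through.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

typedef long long SketchSize;   /* hypothetical stand-in for Tcl_Size */

/* Legacy-style callback: the argument count is an int. */
typedef int (LegacyProc)(void *clientData, int objc, void *const objv[]);
/* New-style callback: the argument count is 64-bit. */
typedef int (WideProc)(void *clientData, SketchSize objc, void *const objv[]);

/* Extra record, allocated once when a legacy command is registered. */
typedef struct {
    LegacyProc *proc;
    void *clientData;
} WrapperInfo;

WrapperInfo *
WrapLegacy(LegacyProc *proc, void *clientData)
{
    WrapperInfo *info = malloc(sizeof *info);
    if (info != NULL) {
        info->proc = proc;
        info->clientData = clientData;
    }
    return info;
}

/* Has the WideProc signature and forwards to the legacy callback. */
int
WrapperProc(void *clientData, SketchSize objc, void *const objv[])
{
    WrapperInfo *info = clientData;
    assert(info != NULL);
    if (objc > INT_MAX) {
        return -1;              /* assumption: legacy code cannot take this */
    }
    return info->proc(info->clientData, (int)objc, objv);
}

/* Example legacy command: reports how many words it received. */
int
CountArgs(void *clientData, int objc, void *const objv[])
{
    (void)clientData;
    (void)objv;
    return objc;
}
```

A command written directly against the wide signature needs neither the record nor the forwarding call, which is why converting an extension removes the overhead.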
From: Dipl. I. S. G. B. <se...@us...> - 2025-03-21 19:05:26
|
Well, as already said, it is a very questionable approach anyway, and even generating lists with 2**31 elements in Tcl is very doubtful (even repeating the same element 2**31 times needs 16 GB of memory just for creation, and a further 16 GB for the duplication when the proc is evaluated, so 32 GB in the end, without doing anything with the list). Not to mention the memory consumption, evaluation times and so on in a real case (not in an "obscure" case like the lrepeat example below). Again, a pure Tcl list with 2**31 real elements (even just numbers, let alone strings etc.) would need so much memory that using it makes no sense, whether directly or expanded to command arguments, let alone the implicit duplication (CoW, CoE, etc.) if the object is shared. And all this to avoid something like this?

    % llength [set l [lrepeat [expr {2**31}] _]]
    2147483648
    % proc test args {llength $args}; test {*}$l
    Number of words (2147483649) in command exceeds limit 2147483647.

The impact on memory usage (from replacing int32 with int64) is not interesting per se (as I said, memory is cheap), but it must be considered in conjunction with CPU cache size/granularity/coherence, as a possible cause of performance deterioration. The performance issue is obvious without measuring anything:

* the extra layer (cmdWrapperProc) already makes it slower directly. How could it be otherwise, if one more function gets involved in every invocation of any command (invokeStk instructions and co.)?
* the few extra bytes in Tcl's internal structures would not show up directly in simple measurements, but they surely play a role, because the CPU cache is not endless and every single byte (added en masse) can cause more washouts, cache invalidations, etc. And while this is not so grave in the case of L3, in the case of L1 or L2 we are speaking about a factor of 100x (0.33 ns L1 access vs. 33.33 ns memory access)
* the measurement would be complex but, as already mentioned, totally unneeded, because the result is basically obvious.
However, if you really want it, here is a construct that is artificial, but can illustrate the regression:

    timerate -calibrate {}
    proc test k {
        upvar cmd cmd body body
        if {![info exists cmd($k)]} {
            # just a lot of procs with dummy bodies (if called with 0):
            set pbody [string repeat "if {\$a} {" 100][string repeat "}" 100]
            set i 0; while {$i<=$k} {
                proc [set cmd($i) test$i] {a args} $pbody
                incr i
            }
            # reorder them additionally to the array hash buckets, arbitrarily but deterministically
            # (so they are "random" in the same way and can be used in different shells):
            set i 0; set s $k; set seed 16807; while {$i<=$k/2} {
                set seed [expr {($i + $seed / (1 + $s<<3) + 2**($i % 64) * 3**($seed % 39) % 0x17A311E02F4623 + 1) % $k}]
                set s [expr {$seed % $k}]
                set p [lindex $cmd($i)]; set cmd($i) $cmd($s); set cmd($s) $p
                incr i
            }
        }
        # body calling all the procs at once (in a specific "random" order, targeting different mem-blocks):
        if {![info exists body($k)]} {
            set body($k) {}
            set i 0; while {$i<=$k} {
                append body($k) "$cmd($i) 0 0 0 0 0; "; incr i
            }
            timerate $body($k) 100; # warm-up this body
        }
        timerate $body($k) 5000; # measure 5 seconds
    }

green (+ lines) - 9.1 without TIPs 627/626 & no 64-bit counters in Command- and other structs (Tcl_Size <-> int),
red (- lines) - 9.1 with TIPs 627/626 & 64-bit counters

    % test 10000
    + 7488.61 µs/# 667 # 133.54 #/sec 4994.904 net-ms
    - 9769.15 µs/# 511 # 102.36 #/sec 4992.038 net-ms
    % test 1000
    + 362.284 µs/# 13800 # 2760.3 #/sec 4999.517 net-ms
    - 415.089 µs/# 12044 # 2409.1 #/sec 4999.330 net-ms
    % test 100
    + 29.7158 µs/# 168078 # 33652.2 #/sec 4994.565 net-ms
    - 31.1066 µs/# 160574 # 32147.6 #/sec 4994.904 net-ms

Although the last one (yes, it already begins at 100 procs) is not so conspicuous anymore, because it seems to fit the CPU cache size and therefore simply has fewer cache washouts, I think it more or less illustrates only the extra cost of the new intermediate layer (cmdWrapperProc and co.); it may also be just fluctuation or measurement uncertainty (however, it looks stable). The jump between 1000 and 10000, in contrast, is obviously non-linear (not 10x), so CPU cache washouts are definitely wreaking havoc here, but more aggressively for the red player.

All this is single-threaded; you would notice much more regression multi-threaded or under parasitic load, because L3 (and on some platforms also L2) is shared between multiple cores and can therefore cause more cache invalidations. The result will of course depend on CPU cache size etc. One would probably see almost the same picture comparing with 8.7 (where Tcl_Size is int), but that is not quite a correct test, since 9.x is slightly different (it got a few more features etc.) and, for instance, its invocation and TEBC behaviour may differ. But the tendency is noticeable.

Sure, any change, even one that causes a performance regression, may be justified, but I don't understand how that could apply here: even in big-data research or data-science work, I would never try to hold such large things as plain Tcl objects in memory, let alone process them there (without indices, a DB, special objects or handling). My objections concern the whole complex of such 64-bit "enhancements" and not the particular case of TIP #626, which imho only targets the consequences of TIP #627 and other similar changes. Anyway, an application is and remains 64-bit-capable without the need to address more than 2 billion elements inside every single internal structure. Sure, it must be able to handle more than 2 GB/4 GB of memory as a whole process, but not for a single list, command or string. As already said, that is a very doubtful requirement; it does not affect the use of the 64-bit architecture in any way, nor can it justify the effort of the change, the deprecation of int usage everywhere, or the loss of backward compatibility of (pieces of) code. Hope this helps, Serg. |
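Editor's sketch of the arithmetic behind Sergey's "16 GB, then 32 GB" figure. It assumes the list stores one 8-byte pointer per element and counts only that pointer array, not the element objects themselves; `PointerArrayBytes` and `BytesToGiB` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for the element-pointer array alone. */
uint64_t
PointerArrayBytes(uint64_t elements, uint64_t pointerSize)
{
    /* Guard against overflow of the multiplication below. */
    assert(pointerSize > 0 && elements <= UINT64_MAX / pointerSize);
    return elements * pointerSize;
}

/* Whole GiB contained in a byte count (1 GiB = 2^30 bytes). */
uint64_t
BytesToGiB(uint64_t bytes)
{
    return bytes >> 30;
}
```

For 2^31 elements and 8-byte pointers this gives 2^34 bytes, that is 16 GiB; a second copy of the array for the argument vector doubles it to 32 GiB.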
From: Jan N. <jan...@gm...> - 2025-03-21 21:18:07
|
On Fri, 21 Mar 2025 at 20:05, Dipl. Ing. Sergey G. Brester via Tcl-Core <tcl...@li...> wrote:
> green - 9.1 without TIPs 627/626 & no 64-bit counters in Command- and other structs (Tcl_Size <-> int),
> red - 9.1 with TIPs 627/626 & 64-bit counters

My conclusion (but you have the right to differ in your opinion): since TIP #626 doesn't change any struct members, any change in timing you see here can be attributed to your additional Tcl_Size <-> int changes, not to the TIP #626 implementation. What will increase is the stack usage, because any function with a Tcl_Size objc argument will use 4 bytes more on the stack than one with an int objc argument. That will only be noticed in deeply recursive function calls, which is not what you are doing in this example. Neither does Tcl do that in the new NR implementation. Thanks! Jan Nijtmans |
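Editor's sketch of the "4 bytes more" figure. It uses `ptrdiff_t` as a stand-in for a pointer-sized `Tcl_Size` and only compares type widths; how much stack a real call frame uses depends on the ABI (register passing, alignment), so treat the result as an upper-bound style estimate. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Extra bytes per count parameter when widening int to ptrdiff_t. */
size_t
ExtraBytesPerCount(void)
{
    assert(sizeof(ptrdiff_t) >= sizeof(int));
    return sizeof(ptrdiff_t) - sizeof(int);
}

/* Estimated extra stack for a call chain of the given depth. */
size_t
ExtraStackBytes(size_t depth)
{
    return depth * ExtraBytesPerCount();
}
```

On typical 64-bit platforms `int` is 4 bytes and `ptrdiff_t` is 8, so the difference is 4 bytes per parameter, and even 1000 nested calls add only about 4 KB under this estimate.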
From: Christian W. <Chr...@t-...> - 2025-03-21 22:37:45
|
Pathetic opinion of someone who's not in charge: In a Bird's-eye view (or U-boat perspective) the number of parameters to a procedure ideally should be equal to the number of elements a list can contain. Pure symmetry. Aesthetics. And it may be infinite at least in theory of course, disregarding any contemporary machinery and algorithms. I.e. "argc" or "objc" shall be a bignum (I'm kidding). Just another two hundredths of your favorite currency unit, Christian |
From: <apn...@ya...> - 2025-03-23 17:29:10
|
Jan, Having had only a cursory look so far, I only have a few preliminary comments and questions.

The implementation touches a lot of files and although most are nominal size changes, it will take some time to review. Please wait at least a couple of weeks, if not more, before calling for a vote.

I have the same reservations as Sergey and Eric but I also see the symmetry with list lengths that Christian W. cited so I am ambivalent at the moment.

I did not see any test cases. I am of the opinion that for any TIP, not just 626, a reasonable test suite should be in place before a CFV. This is particularly important when so many files and APIs are touched by the TIP. The existing test suite does not suffice because it does not cover the enhancement the TIP proposes. Early 9.0 betas shipped without any tests for 64-bit support and as might be expected, practically no string commands worked for large data. That is not acceptable imo. Accordingly, for 626, tests should be added as well, probably in bigdata.test, testing both byte-compiled and uncompiled forms as they exercise different code paths.

Does the TIP only address the generic framework for passing arguments or are all core commands expected to also work? I ask because just my second attempt to exercise the functionality failed (I am not sure how to practically generate a large number of arguments other than use of {*}):

    % set s [string cat {*}[lrepeat 0x100000000 x]]; string length $s
    can't read "s": no such variable

A strange error. Maybe just an isolated bug but regardless, test cases that exercise at least basic commands are needed to ensure there is not a more general fundamental issue. This is on Stefan's monster 128GB machine so I don't think memory is an issue.
/Ashok |
From: Jan N. <jan...@gm...> - 2025-03-23 18:54:20
|
On Sun, 23 Mar 2025 at 18:28, <apn...@ya...> wrote:
> Jan,
> Having had only a cursory look so far, I only have a few preliminary comments and questions.
>
> The implementation touches a lot of files and although most are nominal size changes, it will take some time to review. Please wait at least a couple of weeks, if not more, before calling for a vote.

That's OK. You will see the pattern soon.

> I did not see any test cases.

Since ALL of the commands are modified to use Tcl_Size objc, all current test cases test the new behavior. All current extensions (like thread or Itcl) still use the "int objc" functions; that is how you can verify that the original functions still work OK too.

> Does the TIP only address the generic framework for passing arguments or are all core commands expected to also work? I ask because just my second attempt to exercise the functionality failed (I am not sure how to practically generate a large number of arguments other than use of {*}):
>
> % set s [string cat {*}[lrepeat 0x100000000 x]]; string length $s
>
> can't read "s": no such variable

You already wrote a test case for that, see: <https://core.tcl-lang.org/tcl/info/eb734aab392c0b31> This test case should now be modified to no longer give an error. All core commands and extensions are expected to work as they do now. The only exception I am aware of is "nsf", since it pokes into the internal structures. Hope this helps, Jan Nijtmans |
From: <apn...@ya...> - 2025-03-24 01:33:40
|
Jan, Sorry, I understood neither of the points you made.

> Since ALL of the commands are modified to use Tcl_Size objc, all current testcases test the new behavior.

This means nothing. The new behavior defined by the TIP requires commands to accept > 2**31 arguments. Nothing in the test suite verifies this. I'll point out again that the exact same logic could be applied to the 32-bit -> 64-bit transition for 9.0: "All structures now use Tcl_Size so all current testcases test...". And this thinking was probably why tests were not added there either. And yet in reality most commands did not work with lengths > 2**31 even in beta 1 until the implementation was specifically tested in bigdata.test. If you were in fact right about the current test suite being adequate, why would the simple test I did fail?

> You already wrote a testcase for that, see:.

The test you mention verifies that 9.0 raises an error instead of crashing when argument lengths are exceeded. It does not verify correct operation of commands. I am not sure I see the relevance to the "no such variable" failure example I gave. |
From: Gustaf N. (sslmail) <ne...@wu...> - 2025-03-24 09:52:23
|
> If you think this is a bad idea, speak up now. If not,
> I'll start the vote in a few days.

Since you asked for comments…

- “bad idea” is a strong word, but I do have some reservations
- I appreciate the symmetry in the design, and I believe that allowing a large number of elements is a step in the right direction.
- however, I am not fully happy with the implementation:
  * having to use the wrapper in the standard case does not look right to me
  * The wrapper introduces a slight performance overhead for each invocation, even if it is likely negligible in most cases.
  * The wrapper can break extensions that require a tight coupling with the Tcl core. Currently, NSF is not wrapper-aware and still uses the legacy interfaces for NSF methods, even when compiled with Tcl9. While this issue can be fixed, it involves some work and adds complexity to support both types of interfaces.

In my opinion, a better approach might have been to move directly to a 64-bit objc interface for Tcl9 to eliminate the need for a wrapper altogether. Perhaps the decision was influenced by the incremental nature of the tips, or there might be other reasons I’m not aware of. But maybe, it is not too late. All the best -g |
From: Jan N. <jan...@gm...> - 2025-03-24 10:07:43
|
On Mon, 24 Mar 2025 at 10:52, Gustaf Neumann wrote:
> In my opinion, a better approach might have been to move directly to a 64-bit objc interface for Tcl9 to eliminate the need for a wrapper altogether.
> Perhaps the decision was influenced by the incremental nature of the tips, or there might be other reasons I’m not aware of. But maybe, it is not too late.

The main reason was that, doing this, ALL extensions would have to be modified to use "Tcl_Size objc". I tried that with extensions like Itcl and Itk, and encountered crashes, which were only fixed a few weeks ago. I didn't want to delay Tcl 9.0 for this.

> * The wrapper introduces a slight performance overhead for each invocation, even if it is likely negligible in most cases.

That's why the TIP proposes to deprecate all "int argc" and "int objc" functions, in favor of the Tcl_Size objc ones. The wrapper can then be phased out, and removed in Tcl 10 ;-) The maintenance win, no more conversions from Tcl_Size <-> int, is IMHO worth much more than the cost of this tiny wrapper. I already converted Tk, Itcl and Itk (in a separate branch) to use the new routines, so the cost of the wrapper then becomes 0 (apart from a little code bloat). All extensions can do the same. So, I agree that it would have been better to move directly to a 64-bit objc interface, but that would have caused a delay in Tcl 9.0. Hope this helps, Jan Nijtmans |
From: Rolf A. <tcl...@po...> - 2025-03-24 10:25:53
|
Count me in the camp with mixed feelings.

Jan Nijtmans writes:
> On Sun 23 Mar 2025 at 18:28, <apnmbx-public-/E15...@pu...> wrote:
> [...]
>> I did not see any test cases.

Ashok is certainly right that this needs test cases.

> Since ALL of the commands are modified to use Tcl_Size objc, all
> current testcases test the new behavior. All current extensions
> (like thread or Itcl) still use the "int objc" functions, that's
> how you can verify that the original functions still work OK too.
>
>> Does the TIP only address the generic framework for passing
>> arguments or are all core commands expected to also work? I ask
>> because just my second attempt to exercise the functionality failed
>> (I am not sure how to practically generate a large number of
>> arguments other than use of {*}):
>>
>> % set s [string cat {*}[lrepeat 0x100000000 x]]; string length $s
>>
>> can't read "s": no such variable

I confirm what Ashok reports here; I get the same. If I haven't missed something while watching top during the test run, you'll need around 60 GByte of free memory to reproduce this yourself.

> You already wrote a testcase for that, see:
> <https://core.tcl-lang.org/tcl/info/eb734aab392c0b31>
> This testcase should be modified now to no longer give an error.
>
> All core commands and extensions are expected to work as they do now.

Given the example above, this doesn't seem to be fully true even for the core commands. The first step would be to ensure that the core commands in fact work as expected. Once that is done, the next step should be a closer look at what this does to the extension ecosphere.

> The only exception I am aware of is "nsf", since it pokes into the
> internal structures.
>
> Hope this helps,
> Jan Nijtmans

rolf |
From: Rolf A. <tcl...@po...> - 2025-03-24 12:35:25
|
Rolf Ade writes:
> Jan Nijtmans writes:
>> On Sun 23 Mar 2025 at 18:28, <apnmbx-public-/E15...@pu...> wrote:
>> [...]
>>> I did not see any test cases.
>
> Ashok is certainly right that this needs test cases.
>
>>>
>>> % set s [string cat {*}[lrepeat 0x100000000 x]]; string length $s
>>>
>>> can't read "s": no such variable
>
> I confirm what Ashok reports here; I get the same. If I haven't missed
> something while watching top during the test run, you'll need around
> 60 GByte of free memory to reproduce this yourself.

Digging a bit, this gets stranger. If run interactively in a tip-626 branch tclsh, I see what Ashok reports. If run as a script, it works as expected. Perhaps this observation triggers some idea about the root cause.

rolf |
From: Donald G P. <don...@ni...> - 2025-03-24 13:46:05
|
On 3/20/25 11:50, Jan Nijtmans wrote:
> This is a CFV warning for TIP #626. for Tcl 9.1+:
> Command arguments > 2^31 elements
> <https://core.tcl-lang.org/tips/doc/trunk/tip/626.md>
>
> If you think this is a bad idea, speak up now. If not,
> I'll start the vote in a few days.

I regret that I did not notice this unfinished business before the release of Tcl 9.

Tcl has long defined the Tcl_ObjCmdProc type for extensions to define their own command procedures. TIP 626 adds a new Tcl_ObjCmdProc2 type which is nearly the same, but accepts a Tcl_Size number of command words, instead of an int number. The same thing, but bigger. But different enough to need some disruption to achieve the transition. Several Tcl users see the disruption, and don't see the value provided by it, thinking they have no interest in their extension commands being able to process such large numbers of arguments.

Consider this alternative conception. We create a new command procedure type, but not just "the same but bigger". Instead, the new type is "more generalized and powerful".

The connection between commands and lists has always been important in Tcl. Starting with the introduction of the {*} syntax, Tcl lists became fundamental to the very definition of the language. Yet, Tcl has no public feature in its C interface to represent lists. Whenever Tcl lists must pass as arguments in C, we either choose to pass a (Tcl_Obj *) and then check that the value is a usable list, or we pass an array of Tcl_Obj*.

Imagine we add a public Tcl_List type, so that a single argument can be passed in C that represents a Tcl list. Imagine its functions and operations are kept abstract enough that it can accommodate lists larger than INT_MAX elements. Imagine further that the abstract interface doesn't force data structures of a single array, so that the innovations already present in [lseq] and its internal supports can also be passed freely through Tcl interfaces.
Many alternative supporting internals can be conceived.

With that innovation, it is far more natural to offer to extensions a new command procedure format that lets their commands directly pull the parsed words of an executing command from an abstract list argument, so that extension operations can gain any efficiencies better internal list representations offer. Consider

    typedef int (Tcl_CmdProcList) (void *clientData, Tcl_Interp *interp, Tcl_List list);

Now this is a new public interface offering all extensions improved access to efficiencies of operation current and future, and the expansion to Tcl_Size number of words in an executing command also comes in almost as an afterthought.

This new mechanism for extension-defined commands rests more comfortably alongside Tcl_ObjCmdProc, so I see less urgency with deprecating that older mechanism out of existence. People happy with its limitations could just continue being happy with its limitations, with no need to change anything. An extension can do nothing, or embrace the new capabilities fully, or write very trivial wrappers around their existing Tcl_ObjCmdProcs.

Taking this approach, we can immediately fully test these interfaces, even on systems without large memory, by testing with the existing [lseq] machinery which represents very large lists without using very large amounts of memory. This approach should also mesh well with (and hopefully accelerate) the innovations brought in with [lseq] and develop tools we will need anyway to improve efficiencies of non-shimmering access to list operations on extension-provided Tcl_ObjTypes. Since this new mechanism would be the only way to create an extension command accepting more than INT_MAX arguments, that would accelerate making use of the new Tcl_List interfaces generally.

All I currently offer are ideas, not implementations. But it's another way to go. At least in concept I like it better, but I recognize that at some point working code beats dreamy concept.
-- | Don Porter Applied and Computational Mathematics Division | | don...@ni... Information Technology Laboratory | | http://math.nist.gov/~DPorter/ NIST | |______________________________________________________________________| |
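To make the abstract-list idea above more concrete, here is a rough, self-contained sketch of an interface where a consumer sees only "length" and "index" operations, so an arithmetic sequence of billions of elements costs no more memory than a three-element array. Everything in it (AbstractList, ListOps, SeqRep, ArrayRep, LastElement) is hypothetical and invented for illustration; no Tcl_List type exists in Tcl, elements are plain integers rather than Tcl_Obj pointers, and this is not how [lseq] is implemented internally.

```c
#include <stddef.h>

typedef long long Size;  /* stand-in for a wide element count */

typedef struct AbstractList AbstractList;

/* Operations every list representation must provide. */
typedef struct {
    Size (*length)(const AbstractList *list);
    long long (*index)(const AbstractList *list, Size i);
} ListOps;

struct AbstractList {
    const ListOps *ops;  /* how to interpret rep */
    const void *rep;     /* representation-specific data */
};

/* Representation 1: arithmetic sequence, O(1) memory for any length. */
typedef struct { long long start, step; Size count; } SeqRep;

Size SeqLength(const AbstractList *list)
{
    return ((const SeqRep *)list->rep)->count;
}

long long SeqIndex(const AbstractList *list, Size i)
{
    const SeqRep *seq = (const SeqRep *)list->rep;
    return seq->start + seq->step * i;
}

const ListOps seqOps = { SeqLength, SeqIndex };

/* Representation 2: ordinary array of elements. */
typedef struct { const long long *elems; Size count; } ArrayRep;

Size ArrayLength(const AbstractList *list)
{
    return ((const ArrayRep *)list->rep)->count;
}

long long ArrayIndex(const AbstractList *list, Size i)
{
    return ((const ArrayRep *)list->rep)->elems[i];
}

const ListOps arrayOps = { ArrayLength, ArrayIndex };

/* A consumer that only uses the abstract interface.
 * Returns 0 for an empty list. */
long long LastElement(const AbstractList *list)
{
    Size n = list->ops->length(list);
    return n > 0 ? list->ops->index(list, n - 1) : 0;
}
```

With such a design, LastElement can be exercised on a five-billion-element sequence without allocating storage for it, which is the testing advantage the message describes.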
From: <apn...@ya...> - 2025-03-29 03:31:53
|
I like the concept of a public "Tcl_List" type. However, given it requires significantly more changes to make use of it, it feels more like a Tcl 10 thing (whenever that comes to be!)

/Ashok |
From: Jan N. <jan...@gm...> - 2025-03-27 07:30:23
|
On Mon 24 Mar 2025 at 13:35, Rolf Ade <tcl...@po...> wrote:
> > I confirm what Ashok reports here; I get the same. If I haven't missed
> > something while watching top during the test run, you'll need around
> > 60 GByte of free memory to reproduce this yourself.
>
> Digging a bit, this gets stranger. If run interactively in a tip-626
> branch tclsh, I see what Ashok reports.

The root cause is probably that somewhere in the code 'int' is still used instead of Tcl_Size. I have now used the C4244 warning in Visual Studio to find such places, and found some. Can you retry with the latest "tip-626" branch?

I'm getting (on a Windows machine with 16 GB of memory):

    >tclsh91s.exe
    % set s [string cat {*}[lrepeat 0x100000000 x]]; string length $s
    unable to alloc 67108864040 bytes

    >tclsh91s.exe
    % set s [list abc {*}[lrepeat 0x100000000 x]]; lrange $s 0 1
    list construction failed: unable to alloc 34359738416 bytes
    % set s [lrepeat 0x100000000 x]; lrange $s 0 1
    x x

My machine doesn't have enough memory for such experiments. Note that the last example doesn't crash the interpreter, but cleans up and gives an error message. That should be done in more places ....

Thanks!
Jan Nijtmans |
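The kind of bug suspected above (a wide count passing through an 'int' somewhere) can be illustrated with a small sketch. It assumes a 32-bit int, as on mainstream platforms; Size, Low32 and FitsInInt are hypothetical names for illustration, not Tcl functions. Low32 shows which part of a count would survive being stored in a 32-bit field, and FitsInInt is the sort of guard that avoids silent truncation.

```c
#include <limits.h>
#include <stdint.h>

typedef long long Size;  /* stand-in for a wide element count */

/* The low 32 bits of a count: all that is left if the value is
 * squeezed into a 32-bit field. Computed on unsigned values so the
 * result is well defined. */
uint32_t Low32(Size n)
{
    return (uint32_t)((unsigned long long)n & 0xFFFFFFFFu);
}

/* Checked narrowing: nonzero only if n can be stored in an int
 * without losing information. */
int FitsInInt(Size n)
{
    return n >= INT_MIN && n <= INT_MAX;
}
```

For the count 0x100000000 used in the thread, the low 32 bits are all zero, so code that narrows the count would behave as if it had been given no elements at all, which is consistent with a result silently going missing rather than an error being raised.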
From: Rolf A. <tcl...@po...> - 2025-03-27 18:43:17
|
Jan Nijtmans writes:
> On Mon 24 Mar 2025 at 13:35, Rolf Ade <tcl...@pu...> wrote:
>> > I confirm what Ashok reports here; I get the same. If I haven't missed
>> > something while watching top during the test run, you'll need around
>> > 60 GByte of free memory to reproduce this yourself.
>>
>> Digging a bit, this gets stranger. If run interactively in a tip-626
>> branch tclsh, I see what Ashok reports.
>
> The root cause is probably that somewhere in the code 'int'
> is still used instead of Tcl_Size. I have now used the C4244 warning
> in Visual Studio to find such places, and found some. Can
> you retry with the latest "tip-626" branch?

    ~/tcltk/bin/tip-626-default/bin/tclsh9.1
    % set s [string cat {*}[lrepeat 0x100000000 x]]; string length $s
    4294967296

Yes, this seems to work now.

rolf |
From: Donald G P. <don...@ni...> - 2025-04-02 16:32:56
|
On 3/20/25 11:50, Jan Nijtmans wrote:
> This is a CFV warning for TIP #626. for Tcl 9.1+:
> Command arguments > 2^31 elements
> <https://core.tcl-lang.org/tips/doc/trunk/tip/626.md>
>
> If you think this is a bad idea, speak up now. If not,
> I'll start the vote in a few days.

Looking at this again, I notice the call to deprecate Tcl_CreateCommand.

I think that's a bad idea. I think the ability to define string-based extension and application commands should live forever.

I like Tcl continuing to support extension via the simplest C language tools imaginable, writing what is very close to a main() routine. The continuing interoperability with the oldest textbooks is also appealing to me.

-- | Don Porter Applied and Computational Mathematics Division | | don...@ni... Information Technology Laboratory | | http://math.nist.gov/~DPorter/ NIST | |______________________________________________________________________| |
From: Jan N. <jan...@gm...> - 2025-04-04 08:26:51
|
On Wed 2 Apr 2025 at 18:33, Donald G Porter wrote:
> Looking at this again, I notice the call to deprecate Tcl_CreateCommand.
>
> I think that's a bad idea. I think the ability to define string-based extension and application commands should live forever.
>
> I like Tcl continuing to support extension via the simplest C language tools imaginable, writing what is very close to a main() routine. The continuing interoperability with the oldest textbooks is also appealing to me.

Understood. I don't have much of a problem keeping that; it's just a tiny bit of wrapper functionality. The TIP and implementation have now been modified.

@Ashok, how is your review going?

Thanks!
Jan Nijtmans |
From: <apn...@ya...> - 2025-04-04 10:21:11
|
Jan,

I haven't been reviewing it since you have been continuing with your commits. Since I can only afford the time for one full review, I wanted to wait till you are done. *If* you feel the implementation and test suite is more or less complete, I will review over the weekend. Let me know accordingly.

/Ashok |
From: Jan N. <jan...@gm...> - 2025-04-04 20:38:06
|
On Fri 4 Apr 2025 at 12:21, apnmbx-public wrote:
> I haven't been reviewing it since you have been continuing with your
> commits. Since I can only afford the time for one full review, I wanted to
> wait till you are done. *If* you feel the implementation and test suite is
> more or less complete, I will review over the weekend. Let me know
> accordingly

Then, go ahead. I'm sure some "bigdata" testcases need to be adapted: they no longer give an error message but will now work as expected. I don't have a machine with enough memory to do this. And those testcases cannot run in GitHub CI anyway.

Jan Nijtmans |