From: Colin M. <col...@ya...> - 2025-11-06 09:37:06
Hi Eric, thanks for your support.
For the boolean check I think it's more consistent to disallow
alphabetic forms entirely and only accept numeric zero or non-zero,
which is what boolean operations and functions return anyway. An
alphabetic string should always be treated as a variable reference, no
exceptions to worry about.
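A minimal sketch of that rule, assuming a hypothetical helper named classifyToken (it is not part of the prototype). Only tokens that start with a digit or a dot and parse as numbers count as numeric; every alphabetic word, including "true", "t" or "y", is classified as a variable reference:

```tcl
# Hypothetical helper, for illustration only.
# Returns "number", "variable" or "other" for a single token.
proc classifyToken {token} {
    # Numeric literal: must start with a digit or a dot, so that
    # alphabetic forms such as "true", "Inf" or "y" are never numbers.
    if {[regexp {^[0-9.]} $token] && [string is double -strict $token]} {
        return number
    }
    # Anything word-like (optionally namespace-qualified) is a variable.
    if {[regexp {^[[:alpha:]_][[:alnum:]_:]*$} $token]} {
        return variable
    }
    return other
}
```

With this, a truth test only ever has to compare a numeric value against zero.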
Similarly, I don't like the idea of foo(bar) sometimes being treated
as a function call and sometimes as an array reference, depending on
what definitions have been made elsewhere. One should be able to tell
what kind of thing it is just from the expression code, without
searching for other definitions. It is still possible to include an
array reference by writing $foo(bar), so I would always treat foo(bar)
as a function call, and fail if the function has not been defined.
Sacrificing consistency for minor convenience is the slippery slope that
leads to Perl. :-)
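As a rough sketch of the "always a function call" rule, assuming a hypothetical helper named resolveFunction and an error message I made up for illustration, the lookup either finds a command in ::tcl::mathfunc or fails, with no fallback to an array read:

```tcl
# Hypothetical helper, for illustration only.
# Resolve NAME as a math function or raise an error; never guess "array".
proc resolveFunction {name} {
    set fun [namespace which -command ::tcl::mathfunc::$name]
    if {$fun eq {}} {
        error "Calc: unknown function '$name'"
    }
    return $fun
}
```

For example, resolveFunction log10 returns ::tcl::mathfunc::log10, while resolveFunction rr raises an error even if an array named rr exists.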
You are welcome to post your modified version anywhere you like.
Personally I still want to try a C implementation, but that will take me
a few weeks.
Best regards,
Colin.
On 05/11/2025 22:22, EricT wrote:
> Hi Colin,
>
> I've successfully modified your amazing code to handle arrays. In doing so, I also found two other issues: one with your Boolean check, the other with your function name check, both caused by [string is] behavior.
>
> - Boolean check: `$token eq "false" || $token eq "true"` (was `[string is boolean $token]` - treated variables named 'f', 'n', 't', 'y', etc. as boolean false, no, true, yes, ...)
>
> - Function check: `[regexp {^[[:alpha:]]} $token]` (was `[string is alpha $token]` - broke log10, atan2)
>
>
> here's the code for arrays:
>
>     # Function call or array reference?
>     set nexttok [lindex $::tokens $::tokpos]
>     if {$nexttok eq "(" && [regexp {^[[:alpha:]]} $token]} {
>         set fun [namespace which tcl::mathfunc::$token]
>         if {$fun ne {}} {
>             # It's a function
>             incr ::tokpos
>             set opcodes "push $fun; "
>             append opcodes [parseFuncArgs]
>             return $opcodes
>         } else {
>             # Not a function, assume array reference
>             incr ::tokpos
>             set opcodes "push $token; "
>             # Parse the index expression - leaves VALUE on stack
>             append opcodes [parse 0]
>             # Expect closing paren
>             set closing [lindex $::tokens $::tokpos]
>             if {$closing ne ")"} {
>                 error "Calc: expected ')' but found '$closing'"
>             }
>             # Stack now has: [arrayname, indexvalue]
>             incr ::tokpos
>             append opcodes "loadArrayStk; "
>             return $opcodes
>         }
>     }
>
>
> In addition, there have indeed been some changes in the bytecode: land and lor are no longer supported in 9.0, although they work in 8.6.
>
> I had an AI generate some 117 test cases, which all pass on 8.6 and 111 on 9.x (the land/lor not being tested in 9.x).
>
> Colin, with your permission, I can post the code as a new file, with all the test cases, say in a repository on GitHub.
>
> I think a new TIP is worth considering: one that promotes assemble to a supported form, with a compile-and-handle approach to avoid the time spent parsing the ASCII bytecode text. I think that this would be great for your = command, but also quite useful for others who might want to create their own little languages.
>
> By doing it this way, it remains pure Tcl, and avoids all the problems with different systems and hardware that a binary extension would create. In the end, I believe your code can achieve performance parity with expr. Not only does it remove half the [expr {...}] baggage, but all the $'s too! So much easier on these old eyes.
>
> Regards,
>
> Eric
>
>
> On Tue, Nov 4, 2025 at 1:06 PM EricT <tw...@gm...> wrote:
>
> Hi Colin,
>
> Hmmm, why can't you do bareword on $a(b) as a(b)? You just need to
> do an uplevel to see if a is a variable; if not, it would have to
> be a function. True?
>
> % tcl::unsupported::disassemble script {set a [expr {$b($c)}] }
> snip
> Command 2: "expr {$b($c)}..."
> (2) push1 1 # "b"
> (4) push1 2 # "c"
> (6) loadStk
> (7) loadArrayStk
> (8) tryCvtToNumeric
> (9) storeStk
> (10) done
>
> This doesn't look too much different from what you are producing.
>
> I think what's really needed here is a TIP that would open up the
> bytecode a bit so you don't need to use an unsupported command.
> And then maybe even have a new command to take the string byte
> code you are now producing and return a handle to a cached version
> that was probably equivalent to the existing bytecode. Then your
> cache array would be
>
> set cache($exp) $handle
>
> Instead of it having to parse the text, it could be as fast as
> bytecode. You'd likely be just as fast as expr, and safe as well,
> since you can't pass a string command in where the bareword is
> required:
>
> % set x {[pwd]}
> [pwd]
> % = sqrt(x)
> exp= |sqrt(x)| code= |push ::tcl::mathfunc::sqrt; push x; loadStk;
> invokeStk 2; | ifexist: 0
> expected floating-point number but got "[pwd]"
>
> I think you really have something here, perhaps this is the best
> answer yet to slay the expr dragon!
>
> Regards,
>
> Eric
>
>
> On Tue, Nov 4, 2025 at 6:52 AM Colin Macleod via Tcl-Core
> <tcl...@li...> wrote:
>
> Hi Eric,
>
> That's very neat!
>
> Yes, a pure Tcl version could go into TclLib. I still think it
> may be worth trying a C implementation though. The
> work-around that's needed for array references [= 2* $a(b)]
> would defeat the caching, so it would be good to speed up the
> parsing if possible. Also I think your caching may be
> equivalent to doing byte-compilation, in which case it may
> make sense to use the framework which already exists for that.
>
> Colin.
>
> On 04/11/2025 01:18, EricT wrote:
>> that is:
>>
>> if {[info exist ::cache($exp)]} {
>> tailcall ::tcl::unsupported::assemble $::cache($exp)
>> }
>>
>> (hate gmail!)
>>
>>
>> On Mon, Nov 3, 2025 at 5:17 PM EricT <tw...@gm...> wrote:
>>
>> and silly of me, it should be:
>> if {[info exist ::cache($exp)]} {
>> tailcall ::tcl::unsupported::assemble $::cache($exp)
>> }
>>
>>
>> On Mon, Nov 3, 2025 at 4:50 PM EricT <tw...@gm...>
>> wrote:
>>
>> With a debug line back in plus the tailcall:
>>
>> proc = args {
>>     set exp [join $args]
>>     if { [info exist ::cache($exp)] } {
>>         return [tailcall ::tcl::unsupported::assemble $::cache($exp)]
>>     }
>>     set tokens [tokenise $exp]
>>     deb1 "TOKENS = '$tokens'"
>>     set code [compile $tokens]
>>     deb1 "GENERATED CODE:\n$code\n"
>>     puts "exp= |$exp| code= |$code| ifexist: [info exist ::cache($exp)]"
>>     set ::cache($exp) $code
>>     uplevel [list ::tcl::unsupported::assemble $code]
>> }
>>
>> % set a 5
>> 5
>> % set b 10
>> 10
>> % = a + b
>> exp= |a + b| code= |push a; loadStk; push b; loadStk;
>> add; | ifexist: 0
>> 15
>> % = a + b
>> 15
>>
>> % time {= a + b} 1000
>> 1.73 microseconds per iteration
>>
>>
>> Faster still!
>>
>> I thought the uplevel was needed to be able to get
>> the local variables, seems not.
>>
>> % proc foo arg {set a 5; set b 10; set c [= a+b+arg]}
>> % foo 5
>> exp= |a+b+arg| code= |push a; loadStk; push b;
>> loadStk; add; push arg; loadStk; add; | ifexist: 0
>> 20
>> % foo 5
>> 20
>>
>> % proc foo arg {global xxx; set a 5; set b 10; set c
>> [= a+b+arg+xxx]}
>>
>> % set xxx 100
>> 100
>> % foo 200
>> 315
>> % time {foo 200} 10000
>> 2.1775 microseconds per iteration
>>
>> % parray cache
>> cache(a + b) = push a; loadStk; push b;
>> loadStk; add;
>> cache(a+b+arg) = push a; loadStk; push b;
>> loadStk; add; push arg; loadStk; add;
>> cache(a+b+arg+xxx) = push a; loadStk; push b;
>> loadStk; add; push arg; loadStk; add; push xxx;
>> loadStk; add;
>>
>>
>> Very Impressive, great job Colin! Great catch Don!
>>
>> Eric
>>
>>
>>
>>
>> On Mon, Nov 3, 2025 at 4:22 PM Donald Porter via
>> Tcl-Core <tcl...@li...> wrote:
>>
>> Check what effect replacing [uplevel] with
>> [tailcall] has.
>>
>>> On Nov 3, 2025, at 7:13 PM, EricT
>>> <tw...@gm...> wrote:
>>>
>>> Subject: Your bytecode expression evaluator -
>>> impressive results with caching!
>>>
>>> Hey Colin:
>>>
>>> I took a look at your bytecode-based expression
>>> evaluator and was intrigued by the approach. I
>>> made a small modification to add caching and the
>>> results are really impressive. Here's what I
>>> changed:
>>>
>>> proc = args {
>>>     set exp [join $args]
>>>     if {[info exist ::cache($exp)]} {
>>>         return [uplevel [list ::tcl::unsupported::assemble $::cache($exp)]]
>>>     }
>>>     set tokens [tokenise $exp]
>>>     deb1 "TOKENS = '$tokens'"
>>>     set code [compile $tokens]
>>>     deb1 "GENERATED CODE:\n$code\n"
>>>     set ::cache($exp) $code
>>>     uplevel [list ::tcl::unsupported::assemble $code]
>>> }
>>>
>>> The cache is just a simple array lookup - one
>>> line to store, one line to retrieve. But the
>>> performance impact is huge:
>>>
>>> Performance Tests
>>>
>>> Without caching
>>> % time {= 1 + 2} 1000
>>> 24.937 microseconds per iteration
>>>
>>> With caching
>>> % time {= 1 + 2} 1000
>>> 1.8 microseconds per iteration
>>>
>>> That's a 13x speedup! The tokenize and parse
>>> steps were eating about 92% of the execution time.
>>>
>>> The Real Magic: Bare Variables + Caching
>>>
>>> What really impressed me is how well your bare
>>> variable feature synergizes with caching:
>>>
>>> % set a 5
>>> 5
>>> % set b 6
>>> 6
>>> % = a + b
>>> 11
>>> % time {= a + b} 1000
>>> 2.079 microseconds per iteration
>>>
>>> Now change the variable values
>>> % set a 10
>>> 10
>>> % = a + b
>>> 16
>>> % time {= a + b} 1000
>>> 2.188 microseconds per iteration
>>>
>>> The cache entry stays valid even when the
>>> variable values change! Why? Because the
>>> bytecode stores variable names, not values:
>>>
>>> push a; loadStk; push b; loadStk; add;
>>>
>>> The loadStk instruction does runtime lookup, so:
>>> - Cache key is stable: "a + b"
>>> - Works for any values of a and b
>>> - One cache entry handles all value combinations
>>>
>>> Compare this with what happens if we use $-substitution:
>>>
>>> = $a + $b # With a=5, b=6 becomes "5 + 6"
>>> = $a + $b # With a=10, b=6 becomes "10 + 6" -
>>> different cache key!
>>>
>>> Every value change would create a new cache
>>> entry or worse, a cache miss.
>>>
>>> Comparison to Other Approaches
>>>
>>> Tcl's expr: about 0.40 microseconds
>>> Direct C evaluator: about 0.53 microseconds
>>> Your cached approach: about 1.80 microseconds
>>> Your uncached approach: about 24.9 microseconds
>>>
>>> With caching, you're only 3-4x slower than a
>>> direct C evaluator.
>>>
>>>
>>> My Assessment
>>>
>>> Your design is excellent. The bare variable
>>> feature isn't just syntax sugar - it's essential
>>> for good cache performance. The synergy between:
>>>
>>> 1. Bare variables leading to stable cache keys
>>> 2. Runtime lookup keeping cache hot
>>> 3. Simple caching providing dramatic speedup
>>>
>>> makes this really elegant.
>>>
>>> My recommendation: Keep it in Tcl! The
>>> implementation is clean, performance is
>>> excellent (1.8 microseconds is plenty fast), and
>>> converting to C would add significant complexity
>>> for minimal gain (maybe getting to about 1.0
>>> microseconds).
>>>
>>> The Tcl prototype with caching is actually the
>>> right solution here. Sometimes the prototype IS
>>> the product!
>>>
>>> Excellent work on this. The bytecode approach
>>> really shines with caching enabled.
>>>
>>> On Sun, Nov 2, 2025 at 10:14 AM Colin Macleod
>>> via Tcl-Core <tcl...@li...> wrote:
>>>
>>> Hi again,
>>>
>>> I've now made a slightly more serious
>>> prototype, see
>>> https://cmacleod.me.uk/tcl/expr_ng
>>>
>>> This is a modified version of the prototype
>>> I wrote for TIP 676. It's still in Tcl, but
>>> doesn't use `expr`. It tokenises and parses
>>> the input, then generates TAL bytecode and
>>> uses ::tcl::unsupported::assemble to run
>>> that. A few examples:
>>>
>>> (bin) 100 % set a [= 3.0/4]
>>> 0.75
>>> (bin) 101 % set b [= sin(a*10)]
>>> 0.9379999767747389
>>> (bin) 102 % set c [= (b-a)*100]
>>> 18.79999767747389
>>> (bin) 103 % namespace eval nn {set d [=
>>> 10**3]}
>>> 1000
>>> (bin) 104 % set e [= a?nn::d:b]
>>> 1000
>>> (bin) 105 % = {3 + [pwd]}
>>> Calc: expected start of expression but
>>> found '[pwd]'
>>> (bin) 106 % = {3 + $q}
>>> Calc: expected start of expression but
>>> found '$q'
>>> (bin) 107 % = sin (12)
>>> -0.5365729180004349
>>>
>>> (bin) 108 % array set rr {one 1 two 2
>>> three 3}
>>> (bin) 110 % = a * rr(two)
>>> Calc: expected operator but found '('
>>> (bin) 111 % = a * $rr(two)
>>> 1.5
>>>
>>> - You can use $ to get an array value
>>> substituted before the `=` code sees the
>>> expression.
>>>
>>> (bin) 112 % string repeat ! [= nn::d / 15]
>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
>>>
>>> Colin.
>>>
>>> On 02/11/2025 09:04, Donal Fellows wrote:
>>>> Doing the job properly would definitely
>>>> involve changing the expression parser,
>>>> with my suggested fix being to turn all
>>>> bare words not otherwise recognised as
>>>> constants or in positions that look like
>>>> function calls (it's a parser with some
>>>> lookahead) into simple variable reads (NB:
>>>> C resolves such ambiguities within itself
>>>> differently, but that's one of the nastiest
>>>> parts of the language). We would need to
>>>> retain $ support for resolving ambiguity
>>>> (e.g., array reads vs function calls; you
>>>> can't safely inspect the interpreter to
>>>> resolve it at the time of compiling the
>>>> expression due to traces and unknown
>>>> handlers) as well as compatibility, but
>>>> that's doable as it is a change only in
>>>> cases that are currently errors.
>>>>
>>>> Adding assignment is quite a bit trickier,
>>>> as that needs a new major syntax class to
>>>> describe the left side of the assignment. I
>>>> suggest omitting that from consideration at
>>>> this stage.
>>>>
>>>> Donal.
>>>>
>>>> -------- Original message --------
>>>> From: Colin Macleod via Tcl-Core
>>>> <tcl...@li...>
>>>> <mailto:tcl...@li...>
>>>> Date: 02/11/2025 08:13 (GMT+00:00)
>>>> To: Pietro Cerutti <ga...@ga...>
>>>> <mailto:ga...@ga...>
>>>> Cc: tcl...@li...,
>>>> av...@lo...
>>>> Subject: Re: [TCLCORE] Fwd: TIP 672
>>>> Implementation Complete - Ready for
>>>> Sponsorship
>>>>
>>>> Indeed, this toy implementation doesn't
>>>> handle that:
>>>>
>>>> % = sin (12)
>>>> can't read "sin": no such variable
>>>>
>>>> I'm not sure that's serious, but it could
>>>> be fixed in a C implementation.
>>>>
>>> _______________________________________________
>>> Tcl-Core mailing list
>>> Tcl...@li...
>>> https://lists.sourceforge.net/lists/listinfo/tcl-core