|
From: Florent M. <flo...@gm...> - 2025-10-23 05:56:16
|
Hello Eric,
Some questions / Tests
What are the results of these commands :
1.
% set script [subst -noc {
set x $(1+1)
}]
2.
% set a 2
% set script [subst -nov {
set x $(2+$a)
}]
3.
% set x $(3+$(3+3))
4.
% set x [expr {4+$(4+4)}]
Cordially
FM
On Wed, 22 Oct 2025, 14:07, EricT <tw...@gm...> wrote:
Hi everyone,
I wanted to update you on the status of TIP 672 (expression
substitution syntax).
The prototype implementation is now complete and available at:
https://github.com/rocketship88/tcl-tip-672-prototype
The implementation has been parameterized to support both discussed
approaches with a simple compile-time flag:
- Mode 1: $(expr) syntax
- Mode 2: $=(expr) or $^(expr) - configurable character
Key features:
- Comprehensive test suite (82 tests covering edge cases)
- Minimal code changes (~100 lines)
- Clean integration with existing parser
- Full Tcl test suite passes (only 3 expected differences)
There is one known issue: a minor memory leak in the synthetic
command strings that will need attention during integration.
At this point, I need a sponsor to upload the code to a TIP branch
on Fossil so the core team can review and decide whether to adopt
the TIP and, if so, which syntax to use. I'm available for any
collaborative work, questions, or modifications needed.
I'm grateful for all the feedback and support from the community. I
look forward to the next steps in the process.
Best regards,
Eric
_______________________________________________
Tcl-Core mailing list
Tcl...@li...
https://lists.sourceforge.net/lists/listinfo/tcl-core
|
|
From: Jan N. <jan...@gm...> - 2025-10-23 07:44:11
|
On Thu, 23 Oct 2025 at 07:56, Florent Merlet wrote:
> What are the results of these commands:
% set script [subst -noc {
set x $(1+1)
}]
set x 2
% set a 2
2
% set script [subst -nov {
set x $(2+$a)
}]
set x 4
% set x $(3+$(3+3))
invalid character "$"
in expression "3+$(3+3)"
% set x [expr {4+$(4+4)}]
invalid character "$"
in expression "4+$(4+4)"
%
I think all of those are expected. The first one does
"expr" substitution; the option "-noc" doesn't change
anything about that (-noexpr would!). The second one
also does expr substitution; the variable $a is handled
by the expr handling itself. So that's expected too.
The last two are also as I expect, since $( substitution
should not be done within expressions themselves.
This test was done with the version in the fossil "tip-672"
branch. This branch is based on an earlier version from Eric.
(@eric, could you base later changes on this version?
You use non-standard indentation, which makes merging
in later versions rather cumbersome.)
One thing already discussed was to adapt the "subst"
command for the new substitution. The tip-672 branch
adds new -expr and -noexpr options, so the new proposed
substitution can be switched on or off at will. I think
that should be added to the TIP text.
For now, I'll wait for the result of the discussion.
Florent Merlet wrote:
> Is it to make people to go far away from Tcl ?
Such accusations don't help anything in this
discussion. That's all I'm going to say about it.
Hope this helps,
Jan Nijtmans
|
|
From: Florent M. <flo...@gm...> - 2025-10-23 08:32:39
|
Hello Jan,
So we can agree that the implementation is not ready.
In fact, the expression is modelled like a variable, and so it can't nest.
Fundamentally, the problem is that the « $ » sign has always been
tied to the concept of a variable, whereas an expression is tied to
the concept of a command.
« Variables » have characteristics that differ from those of « commands »:
« commands » can nest, whereas « variables » can't.
These distinct characteristics were translated directly into the program
flow because, to some degree, the flow of the program has to mirror the
characteristics of the concept.
In ParseTokens, the « $ » branch doesn't need to handle any nesting.
Only the « [ » branch needs to handle nesting.
In fact, the proposal lands in the wrong branch of ParseTokens: the
« variable » branch, which tests for the presence of a « $ » char.
The right branch for such an implementation is the « command »
branch, which tests for the presence of a « [ » char, because a command
can nest.
Conclusion : There is no real choice : the expr shorthand proposal must
begin with a « [ »; otherwise the price paid in the source code will be
too high, because of the radical difference between the « variable » and
« command » concepts. In the end, you might have to recreate the flow
for commands inside the flow for variables.
That's why I proposed this shorthand instead: « [( ... )] »
It would be detected in the command branch of ParseTokens, so that:
1. subst -noc can inhibit it: no need to change anything in the current
logic
(an open question remains: how the -nocommands and -noexpression flags
will interact; we may want -nocommand -withexpression after all)
2. We just have to check
    } else if (*src == '[' && src[1] == '(') {
        if (noSubstExpr) {
            tokenPtr->type = TCL_TOKEN_TEXT;
            tokenPtr->size = 2;
            parsePtr->numTokens++;
            src += 2;
            numBytes -= 2;
            continue;
        }
        Tcl_Parse *nestedPtr;
        src += 2;
        numBytes -= 2;
        int length = tclParseExpression(src); // to be created : opnode *opnode
        nestedPtr = (Tcl_Parse *)TclStackAlloc(parsePtr->interp,
                sizeof(Tcl_Parse));
        if (Tcl_ParseExpr(interp, src, -1, nestedPtr) != TCL_OK) {...}
        ...etc
Overall, I think this would be a simpler change, since we are not mixing carrots and potatoes.
Honestly, is the following code so bad visually?
% set script [subst -noc {
set x [(1+1)]
}]
% set a 2
% set script [subst -nov {
set x [(2+$a)]
}]
% set x [( 3 + [(3+3)] )]
% set x [expr {4+[(4+4)]}]
|
|
From: Donal F. <don...@ma...> - 2025-10-23 11:11:58
|
> Conclusion : There is no real choice
That's an incorrect conclusion. At least one of the priors you use in reaching it is invalid.
The core of the proposal is that the sequence $( starts a new syntax category (with what follows being an expression up to the point where there's a matching parenthesis, found using an appropriate recursive matching algorithm, on a similar model to how command substitutions work). This new syntactic entity will be called (almost certainly) an expression substitution. That it starts with $ is good; existing code that inserts backslashes before dollar signs to make things subst-safe will not be surprised. Normally we'd now recommend that such approaches switch to using regsub -command instead, due to the way subst works, but that's a longer-term change-over.
That the C code inside the Tcl parsing engine will need to change to accommodate a new syntax category isn't a deep insight. Of course it will need to change. Such change is what is being proposed!
Right now, the real discussion is what the sequence of literal characters in a Tcl script to introduce the new syntax should be. The options in play are:
1. Do nothing. Stick with [expr {expression...}]. This option is always open to us.
2. Use $(expression...)
3. Use $((expression...))
4. Use [={expression...}]
5. Use [= expression...]
6. Use [(expression...)]
7. Use something else (I've probably missed an option or two from this painting of the bikeshed)
Of the options above, 2 has real compatibility issues (i.e., we know it will clash with existing code in Tcllib) and 5 probably has semantic issues (because it was proposed as an actual command; anything that introduces uncontrolled double substitution is a no-no at this point). Option 3 has been assessed for practical compatibility, and should be fine (except for a few places in JimTcl, written by someone already in this conversation). Option 4 manages to be ugly, but is probably OK in terms of amount of in-the-wild usage. Option 6 hasn't been assessed yet.
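For anyone wanting a first, crude feel for how often option 2 would collide with existing scripts, a scan along the following lines could be a starting point. This is only a sketch under loud assumptions: the function name is hypothetical, and it counts every unescaped "$(" in a buffer, including those inside braces and comments, so it over-counts. It is not the method used for the assessments mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: count occurrences of "$(" in a script buffer
 * that are not protected by a preceding backslash. Today such a
 * sequence reads an element of the array whose name is empty.
 */
static size_t
CountDollarParen(const char *script)
{
    assert(script != NULL);
    size_t count = 0;
    for (size_t i = 0; script[i] != '\0'; i++) {
        if (script[i] == '\\' && script[i + 1] != '\0') {
            i++;                /* skip the backslash-escaped character */
        } else if (script[i] == '$' && script[i + 1] == '(') {
            count++;            /* candidate clash with $(expression...) */
        }
    }
    return count;
}
```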
For all options that introduce a new syntactic substitution class, a change to subst is needed to work with it (Jan's outlined this adequately) and there should be consideration whether the other mini-languages inside Tcl should change (we have several).
Donal.
|
|
From: Florent M. <flo...@gm...> - 2025-10-23 14:45:22
|
On 23/10/2025 at 13:11, Donal Fellows wrote:
> > Conclusion : There is no real choice
>
> That's an incorrect conclusion. At least one of the priors you use in
> reaching it is invalid.
> The core of the proposal is that the sequence $( starts a
> new syntax category (with what follows being an expression to the
> point where there's a matching parenthesis using an appropriate
> recursive matching algorithm, with a similar model to how command
> substitutions work). This new syntactic entity will be called (almost
> certainly) an expression substitution. That it starts with $ is
> good; existing code that inserts backslashes before dollar signs to
> make things subst-safe will not be surprised; normally we'd
> recommend such approaches instead now switch to using regsub
> -command, due to the way subst works, but that's a longer-term
> change-over.
>
> That the C code inside the Tcl parsing engine will need to change to
> accommodate a new syntax category isn't a deep insight. Of course it
> will need to change. Such change is what is being proposed!
But there is change, and then there is change.
- In terms of quantity: which syntax will lead to the smallest change
in the C code?
- In terms of quality: which syntax will give the cleanest C code for
the reader?
About the « $(...) » proposal:
In terms of quality, detecting the expression substitution in a
procedure named Tcl_ParseVarName is highly illogical (not clean).
In terms of quantity:
- subst is defeated (variables and math expressions get confused), which
will require corrections that are merely a consequence of the illogical
design above (more code)
- the inability of this new syntax to nest will require corrections
in ParseExpr (more code)
- the need to track down the memory leak will require more changes too (more code)
About the « [( ... )] » proposal:
In terms of quality, treating a math expression as a command is
logical (cleaner).
In terms of quantity:
- subst won't be defeated: no commands, no expressions. No change is
needed, though new features remain possible, of course (less code)
- The syntax can nest easily, since it will fall into the SCRIPT case of
the ParseExpr procedure, which will call Tcl_ParseCommand and then
ParseTokens (less code)
> Right now, the real discussion is what the sequence of literal
> characters in a Tcl script to introduce the new syntax should be. The
> options in play are:
>
> 1. Do nothing. Stick with [expr {expression...}]. This option is
>    always open to us.
> 2. Use $(expression...)
> 3. Use $((expression...))
> 4. Use [={expression...}]
> 5. Use [= expression...]
> 6. Use [(expression...)]
> 7. Use something else (I've probably missed an option or two from
>    this painting of the bikeshed)
>
The best way would be to implement each of these proposals and
compare them in quantity as well as in quality.
I would really like to implement the « [( ... )] » variant; sadly, I
have neither the time, nor the infrastructure, nor the
expertise to do so.
I am absolutely sure the result would, in the end, be simpler than the other
alternatives (since an expression is a kind of command...)
Is anyone interested in giving it a try?
Florent
|
|
From: da S. P. J <pet...@fl...> - 2025-10-23 15:17:49
|
5 is basically equivalent to 1, it’s just a convention.
|
|
From: EricT <tw...@gm...> - 2025-10-23 18:17:31
|
Jan:
You mentioned the non-standard indenting - that's a tab/space mixing issue.
I have ongoing trouble with Tcl's tab size 8 + indent 4 standard.
The problem: with 8-char tabs, odd indentation levels require mixing tabs
and spaces, which breaks when editors differ. Adding an outer if
statement can mess up the spacing.
I'd recommend Tcl adopt tab size 4 + indent 4, which eliminates the issue -
whether your editor inserts spaces or tabs, it looks the same.
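To spell out the arithmetic, here is a small sketch that splits an indentation depth into tabs and spaces. The struct and function names are hypothetical, chosen only for this illustration.

```c
#include <assert.h>

/* Leading whitespace for one line, as a count of tabs and spaces. */
typedef struct {
    int tabs;
    int spaces;
} Leading;

/*
 * Illustrative helper (hypothetical name): split an indentation depth
 * into whole tabs plus leftover spaces for a given tab width and
 * indent step.
 */
static Leading
LeadingFor(int level, int tabWidth, int indentStep)
{
    assert(level >= 0 && tabWidth > 0 && indentStep > 0);
    int columns = level * indentStep;
    Leading lead = { columns / tabWidth, columns % tabWidth };
    return lead;
}
```

With tab size 8 and indent 4, level 3 is 12 columns, i.e. one tab plus four spaces, which is the mix described above. With tab size 4 and indent 4, every level is a whole number of tabs and the leftover is always zero.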
For my sections: would converting to spaces-only work for the community? I
hesitate to change the entire file.
On testing:
I converted my test file from $() to $(()) syntax and ran it against mode 1:
82 of 84 passed. The two failures were empty-expression tests that expected
different error messages (the empty expressions become empty
sub-expressions). I fixed them with a flexible pattern:
string match "*empty*expression*" $msg
Now they work with either syntax. This is something to keep in mind if
$(()) were chosen and, years later, relaxed to just $() once the $(index)
empty-array syntax had been deprecated.
Regarding nested $():
$(... $(...)...) fails because expr (or the compiler) rejects the nested $
in the expression.
But $(... [expr {...}]...) works fine. There's no reason to nest anyway -
removing the inner $ gives the same result more efficiently.
Eric
On Thu, Oct 23, 2025 at 8:18 AM da Silva, Peter J <
pet...@fl...> wrote:
> 5 is basically equivalent to 1, it’s just a convention.
>
>
>
> *From: *Donal Fellows <don...@ma...>
> *Date: *Thursday, October 23, 2025 at 06:15
> *To: *Florent Merlet <flo...@gm...>,
> tcl...@li... <tcl...@li...>
> *Subject: *Re: [TCLCORE] Fwd: TIP 672 Implementation Complete - Ready for
> Sponsorship
>
> Conclusion : There is no real choice That's an incorrect conclusion.
> At least one of the priors you use in reaching it is invalid. The core of
> the proposal is that the sequence $( start a new syntax category (with what
> follows being an expression
>
>
>
> Conclusion : There is no real choice
>
>
>
> That's an incorrect conclusion. At least one of the priors you use in
> reaching it is invalid.
>
>
>
> The core of the proposal is that the sequence *$(* start a *new* syntax
> category (with what follows being an expression to the point where there's
> a *matching *parenthesis using an appropriate recursive matching
> algorithm, with a similar model to how command substitutions work). This
> new syntactic entity will be called (almost certainly) an *expression
> substitution*. That it starts with *$* is good; existing code that
> inserts backslashes before dollar signs to make things *subst*-safe will
> not be surprised; normally we'd recommend such approaches instead now
> switch to using *regsub -command*, due to the way *subst *works, but
> that's a longer-term change-over.
>
>
>
> That the C code inside the Tcl parsing engine will need to change to
> accommodate a new syntax category isn't a deep insight. Of course it will
> need to change. Such change is what is being proposed!
>
>
>
> Right now, the real discussion is what the sequence of literal characters
> in a Tcl script to introduce the new syntax should be. The options in play
> are:
>
>
>
> 1. Do nothing. Stick with *[expr {*expression...*}]*. This option is *always
> open* to us.
>
>
> 2. Use *$(*expression...*)*
>
>
> 3. Use *$((*expression...*))*
>
>
> 4. Use *[={*expression...*}]*
>
>
> 5. Use *[=* expression...*]*
>
>
> 6. Use *[(*expression...*)]*
>
>
> 7. Use something else (I've probably missed an option or two from this
> painting of the bikeshed)
>
>
>
> Of the options above, 2 has *real *compatibility issues (i.e., we *know *it
> will clash with existing code in Tcllib) and 5 probably has semantic issues
> (because it was proposed as an actual command; anything that introduces
> uncontrolled double substitution is a no-no at this point). Option 3 has
> been assessed for practical compatibility, and should be fine (except for a
> few places in JimTcl, written by someone *already in this conversation*).
> Option 4 manages to be ugly, but is probably OK in terms of amount of
> in-the-wild usage. Option 6 hasn't been assessed yet.
>
>
>
> For all options that introduce a new syntactic substitution class, a
> change to *subst *is needed to work with it (Jan's outlined this
> adequately) and there should be consideration whether the other
> mini-languages inside Tcl should change (we have several).
>
>
>
> Donal.
>
>
> ------------------------------
>
> *From:* Florent Merlet <flo...@gm...>
> *Sent:* Thursday, October 23, 2025 09:32
> *To:* tcl...@li... <tcl...@li...>
> *Subject:* Re: [TCLCORE] Fwd: TIP 672 Implementation Complete - Ready for
> Sponsorship
>
>
>
> Hello Jan, So we can admit that the implementation is not ready. In fact,
> expression is modelised like a variable and it can't nest. Fondamentally,
> the problem is that the « $ » sign allways has been related to the concept
> of variable, whereas
>
> Hello Jan,
>
> So we can admit that the implementation is not ready.
>
> In fact, the expression is modelled like a variable, and it can't nest.
>
> Fundamentally, the problem is that the « $ » sign has always been related
> to the concept of variable, whereas an expression is related to the concept
> of command.
>
> « Variables » have characteristics that differ from those of « commands ».
>
> « Commands » can nest, whereas « variables » don't.
>
> These distinct characteristics were translated directly into the program
> flow because, to a certain degree, there must be an analogy between the
> flow of the program and the characteristics of the concept.
>
> In Parse_token, the « $ » branch doesn't need to handle any nesting. Only
> the « [ » branch needs to handle nesting.
>
> In fact, the proposal falls into the wrong branch of « parse_Token », the
> « variable » branch, which tests for the presence of a « $ » char.
>
> The right branch for such an implementation is the « command » branch, which
> tests for the presence of a « [ » char, because a command can nest.
>
> Conclusion: There is no real choice: the expr shorthand proposal must
> begin with a « [ »; if not, the price paid by the source code will be too
> big, because of the radical difference between the « variable » and
> « command » concepts. In the end, you might have to recreate the flow for
> commands in the flow for variables.
>
> That's why, instead, I proposed this shorthand: « [( ... )] »
>
> It will be detected in the command branch of parse token, so that:
>
> 1. subst -noc can inhibit it: no need to change anything in the actual
> logic
>
> (an open question remains: how the -nocommands and -noexpression flags will
> interfere; we may want -nocommand -withexpression after all)
>
> 2. We just have to check
>
> } else if (*src == '[' && src[1] =='(') {
>
> if (noSubstExpr) {
> tokenPtr->type = TCL_TOKEN_TEXT;
> tokenPtr->size = 2;
> parsePtr->numTokens++;
> src+=2;
> numBytes-=2;
> continue;
> }
>
> Tcl_Parse *nestedPtr; src+=2; numBytes-=2;
>
> int length = tclParseExpression(src); // to be created :
>
> opnode *opnode
>
> nestedPtr = (Tcl_Parse *)TclStackAlloc(parsePtr->interp,
> sizeof(Tcl_Parse));
>
> if (Tcl_ParseExpr(interp, src, -1, nestedPtr) != TCL_OK) {...}
>
> ...etc
>
> I think, globally, this would be a simpler change, since we are not mixing carrots and potatoes.
>
> Honestly, is the following code so bad visually?
>
> % set script [subst -noc {
>
> set x [(1+1)]
>
> }]
>
> % set a 2
>
> % set script [subst -nov {
>
> set x [(2+$a)]
>
> }]
>
> % set x [( 3 + [(3+3)] )]
>
> % set x [expr {4+[(4+4)]}]
>
>
>
>
>
>
>
> On 23/10/2025 at 09:43, Jan Nijtmans wrote:
>
> On Thu, 23 Oct 2025 at 07:56, Florent Merlet wrote:
>
> What are the results of these commands:
>
> % set script [subst -noc {
>
> set x $(1+1)
>
> }]
>
>
>
> set x 2
>
>
>
> % set a 2
>
> 2
>
> % set script [subst -nov {
>
> set x $(2+$a)
>
> }]
>
>
>
> set x 4
>
>
>
> % set x $(3+$(3+3))
>
> invalid character "$"
>
> in expression "3+$(3+3)"
>
> % set x [expr {4+$(4+4)}]
>
> invalid character "$"
>
> in expression "4+$(4+4)"
>
> %
>
>
>
> I think all of those are expected. The first answer does
>
> "expr" substitution, the option "-noc" doesn't change
>
> anything on that (-noexpr would!). The second one
>
> also does expr substitution, the variable $a is handled
>
> by the expr handling itself. So that's expected too.
>
>
>
> The latter 2 are also as I expect, since $( substitution
>
> should not be done within expressions itself.
>
>
>
> This test is done with the version in the fossil "tip-672"
>
> branch. This branch is based on an earlier version from Eric.
>
> (@eric, could you base later changes on this version?
>
> you use a non-standard indenting, that makes merging
>
> in later versions rather cumbersome)
>
>
>
> One thing already discussed was to adapt the "subst"
>
> command for the new substitution. The tip-672 branch
>
> adds new -expr and -noexpr options, so the new proposed
>
> substitution can be switched on or off at will. I think
>
> that should be added to the TIP text.
>
>
>
> For now, I'll wait for the result of the discussion.
>
>
>
> Florent Merlet wrote:
>
> Is it to make people to go far away from Tcl ?
>
> Such accusations don't help anything in this
>
> discussion. That's all I'm going to say about it.
>
> Hope this helps,
>
> Jan Nijtmans
>
>
>
>
>
> _______________________________________________
>
> Tcl-Core mailing list
>
> Tcl...@li...
>
> https://lists.sourceforge.net/lists/listinfo/tcl-core
>
> --
>
>
> _______________________________________________
> Tcl-Core mailing list
> Tcl...@li...
> https://lists.sourceforge.net/lists/listinfo/tcl-core
>
|
|
From: EricT <tw...@gm...> - 2025-10-23 19:41:09
|
Actually, the only efficiency gain is with constant folding, the compiler
just can't combine the two:
% tcl::unsupported::disassemble script {set a $(1 + ([expr {2-1}]))}
ByteCode 0x23090eb7460, refCt 1, epoch 22, interp 0x230898e0830 (epoch 22)
Source "set a $(1 + ([expr {2-1}]))"
Cmds 3, src 27, inst 18, litObjs 3, aux 0, stkDepth 3, code/src 0.00
Commands 3:
1: pc 0-16, src 0-26 2: pc 5-15, src 6529-6553
3: pc 10-14, src 6541-6550
Command 1: "set a $(1 + ([expr {2-1}]))"
(0) push 0 # "a"
Command 2: "expr {1 + ([expr {2-1}])}..."
(5) push 1 # "1"
Command 3: "expr {2-1}..."
(10) push 2 # "1"
(15) add
(16) storeStk
(17) done
% tcl::unsupported::disassemble script {set a $(1 + (2-1))}
ByteCode 0x23090eb6c60, refCt 1, epoch 22, interp 0x230898e0830 (epoch 22)
Source "set a $(1 + (2-1))"
Cmds 2, src 18, inst 12, litObjs 2, aux 0, stkDepth 2, code/src 0.00
Commands 2:
1: pc 0-10, src 0-17 2: pc 5-9, src 3334001-3334016
Command 1: "set a $(1 + (2-1))"
(0) push 0 # "a"
Command 2: "expr {1 + (2-1)}..."
(5) push 1 # "2"
(10) storeStk
(11) done
E
On Thu, Oct 23, 2025 at 11:17 AM EricT <tw...@gm...> wrote:
> Jan:
>
> You mentioned the non-standard indenting - that's a tab/space mixing
> issue. I have ongoing trouble with Tcl's tab size 8 + indent 4 standard.
>
> The problem: with 8-char tabs, odd indentation levels require mixing tabs
> and spaces, which breaks when editors differ. Adding an outer if
> statement can mess up the spacing.
>
> I'd recommend Tcl adopt tab size 4 + indent 4, which eliminates the issue
> - whether your editor inserts spaces or tabs, it looks the same.
>
> For my sections: would converting to spaces-only work for the community? I
> hesitate to change the entire file.
>
> On testing:
>
> I converted my test file from $() to $(()) syntax and ran it against mode
> 1 - 82 of 84 passed. The two failures were empty expression tests expecting
> different error messages (become sub-expressions). I fixed them with a
> flexible pattern:
>
> string match "*empty*expression*" $msg
>
> Now they work with either syntax. This is something to keep in mind if
> $(()) were chosen, and years later relaxed to just $() after the $(index)
> empty array was deprecated.
>
> Regarding nested $():
>
> $(... $(...)...) fails because expr (or the compiler) rejects the nested $
> in the expression.
>
> But $(... [expr {...}]...) works fine. There's no reason to nest anyway -
> removing the inner $ gives the same result more efficiently.
>
> Eric
>
>
> On Thu, Oct 23, 2025 at 8:18 AM da Silva, Peter J <
> pet...@fl...> wrote:
>
>> 5 is basically equivalent to 1, it’s just a convention.
>>
>>
>>
>> *From: *Donal Fellows <don...@ma...>
>> *Date: *Thursday, October 23, 2025 at 06:15
>> *To: *Florent Merlet <flo...@gm...>,
>> tcl...@li... <tcl...@li...>
>> *Subject: *Re: [TCLCORE] Fwd: TIP 672 Implementation Complete - Ready
>> for Sponsorship
>>
>>
>> Conclusion : There is no real choice
>>
>>
>>
>> That's an incorrect conclusion. At least one of the priors you use in
>> reaching it is invalid.
>>
>>
>>
>> The core of the proposal is that the sequence *$(* starts a *new* syntax
>> category (with what follows being an expression to the point where there's
>> a *matching *parenthesis using an appropriate recursive matching
>> algorithm, with a similar model to how command substitutions work). This
>> new syntactic entity will be called (almost certainly) an *expression
>> substitution*. That it starts with *$* is good; existing code that
>> inserts backslashes before dollar signs to make things *subst*-safe will
>> not be surprised; normally we'd recommend such approaches instead now
>> switch to using *regsub -command*, due to the way *subst *works, but
>> that's a longer-term change-over.
>>
>>
>>
>> That the C code inside the Tcl parsing engine will need to change to
>> accommodate a new syntax category isn't a deep insight. Of course it will
>> need to change. Such change is what is being proposed!
>>
>>
>>
>> Right now, the real discussion is what the sequence of literal characters
>> in a Tcl script to introduce the new syntax should be. The options in play
>> are:
>>
>>
>>
>> 1. Do nothing. Stick with *[expr {*expression...*}]*. This option is *always
>> open* to us.
>>
>>
>> 2. Use *$(*expression...*)*
>>
>>
>> 3. Use *$((*expression...*))*
>>
>>
>> 4. Use *[={*expression...*}]*
>>
>>
>> 5. Use *[=* expression...*]*
>>
>>
>> 6. Use *[(*expression...*)]*
>>
>>
>> 7. Use something else (I've probably missed an option or two from
>> this painting of the bikeshed)
>>
>>
>>
>> Of the options above, 2 has *real *compatibility issues (i.e., we *know *it
>> will clash with existing code in Tcllib) and 5 probably has semantic issues
>> (because it was proposed as an actual command; anything that introduces
>> uncontrolled double substitution is a no-no at this point). Option 3 has
>> been assessed for practical compatibility, and should be fine (except for a
>> few places in JimTcl, written by someone *already in this conversation*).
>> Option 4 manages to be ugly, but is probably OK in terms of amount of
>> in-the-wild usage. Option 6 hasn't been assessed yet.
>>
>>
>>
>> For all options that introduce a new syntactic substitution class, a
>> change to *subst *is needed to work with it (Jan's outlined this
>> adequately) and there should be consideration whether the other
>> mini-languages inside Tcl should change (we have several).
>>
>>
>>
>> Donal.
>>
>>
>
|
|
From: Jan N. <jan...@gm...> - 2025-10-24 10:21:55
|
On Thu, 23 Oct 2025 at 20:17, EricT <tw...@gm...> wrote:
> For my sections: would converting to spaces-only work for the community? I hesitate to change the entire file.
No problem. Three things are - in my opinion - important to be done
1) The $() is only acceptable if there is a knob to switch it off. I
implemented this knob now (interp exprsubst). If another change
is made than $(), we can always decide to remove this.
2) Integrating the expr parsing in Tcl_ParseVarName() is not a good
idea: It's a public function, which behavior would change. Therefore
I refactored it into a new internal function TclParseExprSubst().
3) The "subst" command (and the Tcl_SubstObj() function) need
new options/flags to handle the new "expression" substitution,
separate to "variable" substitution and "command" substitution.
That solves the discussion, whether this substitution belongs
to "variable" or "command" substitution. I think it is neither ;-)
All of this is available now in the tip-672 branch
https://core.tcl-lang.org/tcl/timeline?r=tip-672
Eric, if you make further changes, can you please take the
files in this branch as your starting point? It's quite a big
effort to merge the two versions, if I take over your changes,
but you never take over mine.
Please review!
Thanks!
Jan Nijtmans
|
|
From: Florent M. <flo...@gm...> - 2025-10-24 15:07:10
|
Hi Jan,
I've been thinking about the subject this morning. I'm trying to implement
the alternative syntax [( ... )].
My head has been spinning from eval to parse and from parse to subst...
Anyway, I have one question about the procedure Tcl_ParseExpr. This procedure
seems to be used nowhere. How do we use it?
The idea is to parse an expression, then to substitute it to get a value. I
don't want to go through a TCL_TOKEN_COMMAND token, but through a
TCL_TOKEN_SUBEXPR token.
But going from file to file, I can't find any procedure that uses it to get
a value from it.
Thanks for your help.
|
|
From: EricT <tw...@gm...> - 2025-10-24 20:46:33
|
Hi Jan,
Thank you for the extensive refactoring work! I've downloaded the latest
from the tip-672 branch and can see you picked up my latest changes,
including the 3-way configuration. I haven't made any new changes since
then, so you should be working with my most recent code.
I'll review your refactoring and provide feedback. The creation of
TclParseExprSubst() as an internal function and the interp exprsubst knob
make good sense. I appreciate you taking this on!
Eric
|
|
From: Andreas L. <av...@lo...> - 2025-10-30 01:38:24
|
Jan Nijtmans <jan...@gm...> wrote:
> 3) The "subst" command (and the Tcl_SubstObj() function) need
> new options/flags to handle the new "expression" substitution,
> separate to "variable" substitution and "command" substitution.
> That solves the discussion, whether this substitution belongs
> to "variable" or "command" substitution. I think it is neither ;-)
This is very dangerous, imho:
If some code contains: ... [subst -nocommands $userString]
then the subst shall never ever cause any command to be called,
no matter how it appears in the $userString.
E.g. if $userString is {foo bar [exec snafu] } then the command "snafu"
just should not be executed... ("snafu" might just as well be "rm -rf /")
Now, if expr-subst gets considered a "different" thing, then code like
the one above might end up executing snafu, if $userString is just slightly
modified to {foo bar $(([exec snafu]))}, which definitely is a severe
breach of "subst"'s promises. Existing code shouldn't be forced to add a
new option to all "subst -nocommands" invocations to not become vulnerable.
PS: some more opinions on this topic:
- I like empty-named arrays and use them quite often, so I'd
consider $(...) redefinition for expr out of way until maybe Tcl 10.
- $((...)) is fine by me, and I'm proud of having (a while back) started
a discussion that led to disallowing certain characters in literal array
indices. My motivation back then was paving the way for assignment (even
to array elements) within expr - without ambiguity with builtin-functions.
- [(...)] looks good to me, too - I think the compatibility issues are
rather theoretic. If brackets are necessary, to allow subst -nocommands
to do its job correctly, then this would be my preference.
- [= ...] with a modified expr-language using barewords as variable
names would probably solve many (if not most) of the cases, where
[expr {...}] really "hurts" for its verbosity.
|
|
From: EricT <tw...@gm...> - 2025-10-30 07:40:47
|
Hi, Andreas
Jan has already implemented the changes for subst, they're in the tip-672
branch. I don't use the command much so I can't say for certain that it
solves the problem.
That was a nice summary of the current choices. You left out the one that I
actually like the best, the $=(...) option. It only breaks very rare cases
where someone wants a literal $= and can easily modify it to \$=
anyway, and likely already has by not realizing it wasn't needed.
But there's one little twist I've been playing with, that can drop the
outer ()'s as long as you don't need any whitespace outside ()'s, and you
can always add them if you do:
% set X 2; set Y 625 ; lrange [lseq 0 to 50] $=$X*2-1 end-$=int( sqrt( $Y ) )
3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Sorta like index expressions with regards to whitespace. Not too much of a
change.
|
|
From: Andreas L. <av...@lo...> - 2025-10-30 14:43:21
|
EricT <tw...@gm...> wrote:
> You left out the one that I actually like the best, the $=(...) option.
Well, I don't share that preference.
Short answer is: it looks too "perlish" for my taste.
Even $((...)) and [(...)] do scratch a bit on the border line
towards "perlishness", but if I see ... $=$X*2-1 ...
(with or without parentheses) I immediately wonder what equation
or assignment this would be supposed to be. The "=" just doesn't
fit at all to the proposed use - imho. And I also can't think of
any other character that would fit better.
> % set X 2; set Y 625 ; lrange [lseq 0 to 50] $=$X*2-1 end-$=int( sqrt( $Y ) )
I wish, that the [= X*2-1] approach could be incorporated directly
into commands taking indices, and that they'd define values for "end"
and other anchors. That would eliminate the need for "=" and allow:
lrange ... X*2-1 (end-1)/2
lrange ... [getSomeIndex] (start+end)/2 (assuming new anchor "start")
Only thing left to define is, how to separate anchornames from local
variables and variable/array names from function-names like min,max.
|
|
From: EricT <tw...@gm...> - 2025-10-30 16:02:44
|
Andreas:
Others have commented that $= makes one think, oh, this is going to
replace the $ with the expression on the right side. (See comments by
Rich on comp.lang.tcl)
The problem I have with expr is twofold: the security hole, and that it
is hard on the eyes, with too much text surrounding the expression. That's
why I like the idea of getting rid of the extra ()'s when they're not
needed.
Well, the prototype has served its purpose. But it's not the
solution. Here's why.
Subject: Array subscript limitation in TIP 672 prototype - architectural issue
I've discovered a fundamental limitation in the TIP 672 prototype
implementation that affects array subscripts. I want to explain the
issue clearly and acknowledge Florent's observation about this.
The Problem:
The synthetic string approach (transforming $(expr) into [expr {expr}]
at parse time) works well for simple command arguments, but breaks for
array subscripts like:
set x($(1+2)) 1 # Crashes during compilation
Root Cause - The Pointer Arithmetic Problem:
Tcl's parser is buffer-based - all tokens must point into the same
contiguous source buffer. The compiler and other systems use pointer
arithmetic between tokens to calculate sizes and positions.
This has caused two problems with the synthetic string approach:
Problem 1: Line Number Tracking
Early in development, I encountered issues with TclLogCommandInfo()
which scans from the script start to the command location counting
newlines. The original code was:
for (p = script; p != command; p++) {
    if (*p == '\n') {
        iPtr->errorLine++;
    }
}
When command pointed to the synthetic buffer and script pointed to the
original source, p would never equal command (different memory
regions), so the loop would scan forever through garbage memory until
it crashed.
I patched this by adding a null terminator check and pointer ordering
(mentioned in the comments for this routine):
// Only scan if command appears to be within script's memory region
if (command >= script) {
    for (p = script; p != command && *p != '\0'; p++) {
        if (*p == '\n') {
            iPtr->errorLine++;
        }
    }
}
// If command < script, something is wrong - just use line 1
This prevented crashes by stopping at the null byte, but meant we
never computed accurate line numbers for errors in synthetic strings -
the scan would stop early when hitting the end of the original source,
producing incorrect line counts.
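A stricter guard than the patch above would check both ends of the script buffer before scanning. The following is only a minimal sketch under stated assumptions: CountErrorLine and its scriptLen parameter are hypothetical names that do not exist in Tcl, and the real TclLogCommandInfo() signature is not reproduced here. The pointers are compared as integers because relational comparison of pointers into different objects is undefined in C.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of Tcl): compute the 1-based line on which
 * `command` starts, but only when `command` lies inside the script buffer
 * [script, script + scriptLen]. Returns 1 on success and 0 when the pointer
 * belongs to some other buffer (for example a synthetic string).
 */
static int
CountErrorLine(const char *script, size_t scriptLen,
               const char *command, int *linePtr)
{
    uintptr_t lo = (uintptr_t)script;
    uintptr_t hi = lo + scriptLen;
    uintptr_t c = (uintptr_t)command;
    const char *p;
    int line = 1;

    /* Reject pointers outside the buffer instead of scanning blindly. */
    if (c < lo || c > hi) {
        return 0;
    }
    /* Safe: the loop is now guaranteed to reach `command`. */
    for (p = script; p != command; p++) {
        if (*p == '\n') {
            line++;
        }
    }
    *linePtr = line;
    return 1;
}
```

With a check like this the caller can fall back to a default line number when the command text is synthetic, rather than reporting a count that merely stopped at the first null byte.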
Problem 2: Array Subscript Compilation (Florent's Discovery)
Florent quickly identified that array subscripts would be problematic.
Investigation confirms this is the same underlying issue - the
compiler does pointer arithmetic between adjacent tokens:
// From TclPushVarName in tclCompile.c:
name = varTokenPtr[1].start;
nameLen = p - varTokenPtr[1].start;
elName = p + 1;
remainingLen = (varTokenPtr[2].start - p) - 1;
elNameLen = (varTokenPtr[n].start-p) + varTokenPtr[n].size - 1;
For set x($(1+2)) 1, the tokens are (for example during debug):
Token 3: TEXT "x(" -> points to ORIGINAL source (0x281f7eb0184)
Token 4: COMMAND "[expr]" -> points to SYNTHETIC buffer (0x281f7ec3070)
Token 5: TEXT ")" -> points to ORIGINAL source (0x281f7eb018d)
When the compiler calculates varTokenPtr[2].start - p (synthetic
address minus original address), it produces garbage values - I've
seen 2+ million byte sizes - leading to crashes in memcpy.
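To make the failure mode concrete, here is a small sketch of the kind of check that would detect the mixed-buffer case before any subtraction happens. SketchToken, TokenInBuffer and GapBetween are illustrative names only; they are a simplified stand-in for Tcl_Token and are not taken from tclCompile.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a parse token: a pointer/length pair. */
typedef struct {
    const char *start;
    size_t size;
} SketchToken;

/* Returns 1 when the whole token lies inside [buf, buf + bufLen]. */
static int
TokenInBuffer(const char *buf, size_t bufLen, const SketchToken *tok)
{
    uintptr_t lo = (uintptr_t)buf;
    uintptr_t hi = lo + bufLen;
    uintptr_t s = (uintptr_t)tok->start;

    return s >= lo && s <= hi && tok->size <= (size_t)(hi - s);
}

/*
 * Computes the number of bytes between the end of token `a` and the start
 * of token `b`. Refuses (returns 0) when either token points outside the
 * source buffer, which is exactly the synthetic-string situation.
 */
static int
GapBetween(const char *buf, size_t bufLen,
           const SketchToken *a, const SketchToken *b, size_t *gapPtr)
{
    if (!TokenInBuffer(buf, bufLen, a) || !TokenInBuffer(buf, bufLen, b)) {
        return 0;
    }
    /* Both tokens share the buffer, so pointer arithmetic is meaningful. */
    if (b->start < a->start + a->size) {
        return 0;
    }
    *gapPtr = (size_t)(b->start - (a->start + a->size));
    return 1;
}
```

For set x($(1+2)) 1, the gap between the "x(" token and the ")" token is the six bytes of $(1+2), while a token pointing into a separately allocated "[expr {1+2}]" string is rejected instead of producing a multi-megabyte length.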
The Architectural Challenge:
I don't know if pointer arithmetic between tokens can be eliminated
throughout Tcl's codebase - it may be a fundamental assumption. This
is likely why Florent's approach of a new token type is a more sound
approach:
- Creates a proper token type handled by new code
- New code paths don't make assumptions about token address relationships
- Works with Tcl's architecture rather than trying to work around it
The Pattern:
Both issues stem from the same root: trying to do stream-based macro
expansion (synthetic string replacement) in a buffer-based parser that
assumes all tokens share a common address space for pointer
arithmetic.
Why This Matters:
In a stream-based parser (like lex/yacc), you can expand macros by
pushing characters back onto the input stream. Tcl's buffer-based
architecture doesn't allow this - tokens are pointer/length pairs into
a fixed buffer, and pointer arithmetic assumes all tokens share the
same base address. Back in 2018 this limitation led to the synthetic
approach. And our implementation of (...) didn't permit use as
subscripts, so this never was tested until now. Thanks Florent.
The Prototype's Value:
The TIP 672 prototype successfully:
- Proved expression shorthand is implementable
- Moved past 20 years of syntax bikeshedding
- Generated concrete discussion and alternatives
- Demonstrated JimTcl compatibility is achievable
- Has 251 passing test cases for the parsing logic
However, even with this extensive testing, array subscripts fell
through the cracks. This reveals how the prototype focused on proving
the concept for simple command arguments, where the synthetic string
approach works, but didn't expose the architectural incompatibility
with contexts requiring token pointer arithmetic.
Going Forward:
The synthetic string approach may be fundamentally incompatible with
contexts requiring token pointer arithmetic. This suggests either:
1. Modifying the expression parser (Florent's approach) to handle the
syntax natively
2. Finding a way to keep all tokens in the original buffer (unclear how)
3. Some other architectural approach I haven't thought of
I want to be transparent about this limitation. The prototype served
its purpose in proving viability, but production implementation needs
deeper integration with Tcl's parser architecture.
Thoughts?
Best regards,
Eric
On Thu, Oct 30, 2025 at 7:43 AM Andreas Leitgeb <av...@lo...> wrote:
> EricT <tw...@gm...> wrote:
> > You left out the one that I actually like the best, the $=(...) option.
>
> Well, I don't share that preference.
> Short answer is: it looks too "perlish" for my taste.
>
> Even $((...)) and [(...)] do scratch a bit on the border line
> towards "perlishness", but if I see ... $=$X*2-1 ...
> (with or without parentheses) I immediately wonder what equation
> or assignment this would be supposed to be. The "=" just doesn't
> fit at all to the proposed use - imho. And I also can't think of
> any other character that would fit better.
>
> > % set X 2; set Y 625 ; lrange [lseq 0 to 50] $=$X*2-1 end-$=int( sqrt( $Y ) )
>
> I wish, that the [= X*2-1] approach could be incorporated directly
> into commands taking indices, and that they'd define values for "end"
> and other anchors. That would eliminate the need for "=" and allow:
> lrange ... X*2-1 (end-1)/2
> lrange ... [getSomeIndex] (start+end)/2 (assuming new anchor "start")
> Only thing left to define is, how to separate anchornames from local
> variables and variable/array names from function-names like min,max.
>
|
|
From: Florent M. <flo...@gm...> - 2025-10-31 08:50:41
|
Hi Eric,
I had only noticed that there was a problem, while you have explained
it, and very clearly, imho.
Your explanation points to a whole category of coding logic: "pointer
arithmetic on the parsed buffer".
How many times is this logic used? In which files? For what purposes?
As you showed, we now know that it is used at least in "Line Number
Tracking" and "array index compilation". This could be investigated
further. Maybe those occurrences are rare enough to be resolved?
I must admit, I was interested in the "synthetic string" approach. It's
a kind of "subst" during the parsing phase of compilation; it makes me
think of a "macro". Maybe it could be useful sometimes.
So, let me just float an idea: could a "TCL_TOKEN_SYNTHETIC" token
be created, to account for the gap between the parsed buffer and the
synthetic one?
--------------------------------------------------------------
/* Sketch only; TCL_TOKEN_SYNTHETIC is the proposed new token type. */
Tcl_Token *bridgeIn;    /* opening bridge */
/* Where the synthetic string begins in the parsed buffer. */
bridgeIn->start = parseBufferLocation;
/* Bridge: how far to go forward over the gap. */
bridgeIn->size = (syntheticStringLocation - parseBufferLocation);
bridgeIn->type = TCL_TOKEN_SYNTHETIC;
bridgeIn->numComponents = 2;
Tcl_Token *commandToken;
/* The location of the synthetic buffer. */
commandToken->start = syntheticStringLocation;
/* The length of the synthetic buffer. */
commandToken->size = syntheticStringLength;
commandToken->type = TCL_TOKEN_COMMAND;
commandToken->numComponents = 0;
Tcl_Token *bridgeOut;   /* closing bridge */
/* Where the synthetic string ends in the parsed buffer. */
bridgeOut->start = parseBufferLocation;
/* Bridge: how far to go backward over the gap. */
bridgeOut->size = (parseBufferLocation - syntheticStringLocation);
bridgeOut->type = TCL_TOKEN_SYNTHETIC;
bridgeOut->numComponents = 0;
------------------------------------------
Of course, I doubt this can resolve all issues, since I imagine that the
parse information isn't kept forever. The remaining issues would
have to be resolved in the line number tracking function...
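For the line number side, one possible direction is to count newlines up to a byte offset in the original script, so that no pointer from a foreign buffer is ever compared with the script pointer. This is only a sketch under the assumption that a synthetic token could record the offset at which it originated; line_of_offset is a hypothetical name and not an existing Tcl function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: 1-based line number of byte `offset` within
 * `script`.  The offset is clamped to scriptLen, so the scan can never
 * leave the script buffer, whatever the caller passes in.
 */
int line_of_offset(const char *script, size_t scriptLen, size_t offset)
{
    int line = 1;
    size_t i;

    assert(script != NULL);
    if (offset > scriptLen) {
        offset = scriptLen;
    }
    /* Each newline before the offset starts a new line. */
    for (i = 0; i < offset; i++) {
        if (script[i] == '\n') {
            line++;
        }
    }
    return line;
}
```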
So, as you said, the simplest road is to use a new TCL_TOKEN.
Myself, I used TCL_TOKEN_SUB_EXPR in parse_Expr, because I wanted the
syntax to be nestable, and since this token seems not to be used
anywhere anymore (supposedly a historical leftover of Tcl's evolution).
I may be wrong on this assumption.
I even tried (quickly) to use it in Parse_Token, instead of using a
command token that points to the synthetic string. But I couldn't get it
working. As I had no time to investigate why, I reverted my code
back to the synthetic solution.
Since you discovered that the synthetic string is not a viable solution,
and explained clearly why, I think I will have to persist with this
aborted attempt: I will detect the ranges of "expression substitution
syntax" in Parse_Token, but mark these ranges as TCL_TOKEN_SUB_EXPR;
then I will have to modify all the parts of the compiler concerned so
that they evaluate it as an expression, until it works. But I will have
to study this compiler deeply (maybe even the execution part), which is
not possible yet.
Anyway, thank you for the great insights you gave into Tcl internals.
They will help a lot.
Best regards
Florent
|
|
From: da S. P. J <pet...@fl...> - 2025-10-31 14:30:35
|
> Even $((...)) and [(...)] do scratch a bit on the border line towards
> "perlishness", but if I see ... $=$X*2-1 ...
I have been convinced by this discussion the whole thing is galloping
perlism. :(
|
|
From: Donal F. <don...@ma...> - 2025-10-31 15:17:39
|
Someone said (and I've lost track of who; thanks, Outlook!):
> lrange ... X*2-1 (end-1)/2
Please don't. That makes producing nice bytecode really awkward. That we
did something like that with lseq is bad enough; the bytecode engine
has to recursively call itself to handle that and that's just terrible.
(Fortunately, it's not used so much.)
Donal.
|
|
From: EricT <tw...@gm...> - 2025-10-31 22:25:43
|
Hi Florent,
Thanks for thinking through the bridge token approach - it's clever!
However, I think you're right that the path forward is
TCL_TOKEN_SUB_EXPR with new compiler code that doesn't assume buffer
continuity.
The challenge with bridge tokens is that every piece of code doing
pointer arithmetic would need to be taught to recognize and skip over
them - that could be dozens of places throughout the codebase. The "new
token type with new code" approach is cleaner because the new code
paths don't inherit the old assumptions.
I look forward to seeing what you discover as you explore the
TCL_TOKEN_SUB_EXPR approach. Understanding the compiler deeply will be
valuable for getting this right.
Regarding the lseq naked expression issue - I was involved in the TIP
process that led to lseq, though Brian Griffin did the implementation.
The naked expression feature allows things like lseq 1 to $x*2-1 (note:
still needs the $ for variable substitution). I'm not certain if that's
what they were referring to, but I understand the bytecode compiler
concerns about commands invoking expr recursively.
This is actually one reason I proposed $(...) in the first place - to
provide expression substitution at the language level so that commands
like lseq wouldn't need to handle naked expressions themselves. Let the
parser/compiler handle it once, properly, rather than having individual
commands invoke expr recursively.
Additionally, the naked expression feature in lseq has other issues
that may lead to it being deprecated, which further argues for a
language-level solution.
Best regards,
Eric
|
|
From: Jan N. <jan...@gm...> - 2025-10-30 10:53:37
|
On Thu 30 Oct 2025 at 02:38, Andreas Leitgeb wrote:
> If some code contains: ... [subst -nocommands $userString]
> then the subst shall never ever cause any command to be called,
> no matter how it appears in the $userString.
Are you sure?:
% set userstring {$a([puts stdout "Command executed"])}
$a([puts stdout "Command executed"])
% subst -nocommand $userstring
Command executed
can't read "a()": no such variable
%
From the documentation:
Note that the substitution of one kind can include substitution of
other kinds. For example, even when the -novariables option
is specified, command substitution is performed without restriction.
This means that any variable substitution necessary to complete the
command substitution will still take place. Likewise, any command
substitution necessary to complete a variable substitution will
take place, even when -nocommands is specified.
What we can do is remove the "-noexpression" option, only
providing the "-expression" option. So, make it off by default.
Hope this helps,
Jan Nijtmans
|
|
From: Andreas L. <av...@lo...> - 2025-10-30 14:15:49
|
Jan Nijtmans <jan...@gm...> opened my eyes:
> On Thu 30 Oct 2025 at 02:38, Andreas Leitgeb wrote:
> > If some code contains: ... [subst -nocommands $userString]
> > then the subst shall never ever cause any command to be called,
> > no matter how it appears in the $userString.
> Are you sure?:
> % set userstring {$a([puts stdout "Command executed"])}
> $a([puts stdout "Command executed"])
> % subst -nocommand $userstring
> Command executed
> can't read "a()": no such variable
> %
Damn.
> From the documentation:
> Note that the substitution of one kind can include substitution of
> other kinds. For example, even when the -novariables option
> is specified, command substitution is performed without restriction.
> This means that any variable substitution necessary to complete the
> command substitution will still take place. Likewise, any command
> substitution necessary to complete a variable substitution will
> take place, even when -nocommands is specified.
I wasn't aware of that... gotta check my codebase for such
occurrences and see how I could rewrite them to fix this
wormhole.
I know I used subst recently with both -nocommands and -novars - to
just let it do backslash-treatment. Does this also have a pathway to
command-subst, or will it have one, if expr turns into a "separate"
thing?
> What we can do is remove the "-noexpression" option, only
> providing the "-expression" option. So, make it off by default.
As I learned, excluding certain substitutions turns out to be a broken
design in the first place. Requiring "-expression" to be explicitly
turned on (rather than off) would probably at least save the current
semantics of [subst -nocommands -novariables $userString], so it causes
least pain among the expr-sensitive alternatives offered so far.
Of course, it is highly confusing to future users of subst, that
some types need excluding, while others need including, so it isn't
really "good design", either.
My lesson for myself is: avoid [subst] altogether.
I might miss the next discussion about the next "new thing", and
suddenly subst, even with all previously known -no* options,
turns essentially into eval...
Last question: if we get syntactic sugar for expr, is it
technically really not possible to still treat the new syntax
as if it were [expr {...}] for [subst] ?
|
|
From: Colin M. <col...@ya...> - 2025-11-01 16:49:40
|
On 30/10/2025 01:18, Andreas Leitgeb wrote:
> - [= ...] with a modified expr-language using barewords as variable
> names would probably solve many (if not most) of the cases, where
> [expr {...}] really "hurts" for its verbosity.
Here is a pure Tcl implementation of that:
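proc = args {
    # Join all arguments into a single expression string.
    set ex [join $args]
    # Prefix each bareword (optionally namespace-qualified) with $, unless
    # it is immediately followed by "(" and so looks like a function call.
    set exp [regsub -all {(::)?[[:alpha:]]([[:alnum:]_]|::)*([^[:alnum:]_(]|$)} $ex {$&}]
    # Evaluate the rewritten expression in the caller's scope.
    uplevel expr $exp
}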
This just finds all the possible variable names in the expression,
prefixes them with $ and then runs expr on the result. The regex is a
little complicated because:
* The variable name could include :: at the beginning or in the middle
for namespacing.
* We disallow single colon : within the name because that creates
ambiguity with the ternary operator ?: .
* We also disallow array references because we can't distinguish them
from mathfunc calls.
The fact that this can be implemented in pure Tcl shows that it doesn't
require any change to the dodekalogue rules, and so there is no
consequence for subst either.
Of course a real implementation should be done in C. It could then also
be possible to implement the extension Kevin Kenny suggested of adding
`=` as an assignment operator as in C within an expression, and perhaps
even chaining multiple assignments with `,` .
I saw the `expr` question is on the agenda for the call on Monday. I
don't usually join these calls but I would join it for this issue except
that I have an urgent dentist's appointment at that time. Perhaps I
should create a new TIP for this proposal?
Colin.
|
|
From: Colin M. <col...@ya...> - 2025-11-01 17:25:07
|
One point I forgot - a serious C implementation of this should probably
disallow any further $ or [] substitution after substituting the bare
variable names. This could also have a secondary use-case as a safe way
to run user-specified calculations.
|
|
From: Pietro C. <ga...@ga...> - 2025-11-02 08:05:08
|
How does it handle a space between a mathfunc and its parenthesized argument list?
% expr {sin (12)}
-0.5365729180004349
--
Pietro Cerutti
I've pledged to give 10% of income to effective charities and invite you to join me.
https://givingwhatwecan.org
Sent from a small device - please excuse brevity and typos.
|
|
From: Colin M. <col...@ya...> - 2025-11-02 08:13:29
|
Indeed, this toy implementation doesn't handle that:
% = sin (12)
can't read "sin": no such variable
I'm not sure that's serious, but it could be fixed in a C implementation.
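As a rough idea of how the bareword rewriting might look in C while tolerating blanks before the parenthesis, here is a minimal sketch. It is not a proposed implementation: prefix_barewords and is_name_char are hypothetical names, and the sketch ignores word operators such as eq and ne, quoted strings and braces. It follows the toy version in treating name( as a function call and never as an array reference.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

static int is_name_char(unsigned char c) { return isalnum(c) || c == '_'; }

/*
 * Copy expr to out, prefixing barewords with '$'.  A name followed by
 * optional blanks and '(' is treated as a function call and left alone.
 * Numbers and existing $var references are copied verbatim.
 * Returns the length written, or (size_t)-1 when out is too small.
 */
size_t prefix_barewords(const char *expr, char *out, size_t cap)
{
    size_t i = 0, used = 0;

    assert(expr != NULL && out != NULL);
    if (cap == 0) return (size_t)-1;
    while (expr[i] != '\0') {
        size_t start = i, look;
        int is_name = 0, is_call = 0;
        unsigned char c = (unsigned char)expr[i];

        if (isdigit(c) || c == '$') {
            /* Number such as 1e3 or 0x1F, or an existing $var / $::ns::var. */
            i++;
            while (is_name_char((unsigned char)expr[i]) || expr[i] == '.'
                    || (c == '$' && expr[i] == ':')) i++;
        } else if (isalpha(c) || (c == ':' && expr[i + 1] == ':'
                && isalpha((unsigned char)expr[i + 2]))) {
            /* Bareword, optionally starting with or containing "::". */
            is_name = 1;
            if (c == ':') i += 2;
            while (is_name_char((unsigned char)expr[i])
                    || (expr[i] == ':' && expr[i + 1] == ':')) {
                i += (expr[i] == ':') ? 2 : 1;
            }
            /* Skip blanks to see whether a '(' follows. */
            for (look = i; expr[look] == ' ' || expr[look] == '\t'; look++) {}
            is_call = (expr[look] == '(');
        } else {
            i++;    /* operator, blank, parenthesis, single colon, ... */
        }
        /* Room for an optional '$', the copied run and the final NUL. */
        if ((i - start) + 2 > cap - used) return (size_t)-1;
        if (is_name && !is_call) out[used++] = '$';
        memcpy(out + used, expr + start, i - start);
        used += i - start;
    }
    out[used] = '\0';
    return used;
}
```

With this sketch, "sin (12)" is left unchanged while "x*2-1" becomes "$x*2-1".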
Colin.
|