From: <apn...@ya...> - 2025-11-17 17:17:29
TIP 615 is really two parts.
The first part is the introduction of the [string is index] command. This
seems straightforward and may be useful for validation purposes. However, I
would like clarity on what it returns for index forms that can be used with
commands like lindex. For example,
% string is index {0 0 0}
0
Given that {0 0 0} is a valid index for lindex (and other commands), is this
intentional or a bug? If this is by design, the command cannot be used to
check whether an index is valid to pass to lindex, lsearch etc. If this is a
bug, and the command should return 1, then the command cannot be used to
check whether an index is valid for lreplace and similar. Either way, this
very much limits the usefulness of the command, and some clarity on the use
case would be beneficial.
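To make the two notions of "valid index" concrete, here is a minimal sketch
using only existing Tcl commands. The proc names isSimpleIndex and isIndexPath
are hypothetical and not part of TIP 615; the sketch assumes that lrange
rejects a malformed index even when the list is empty.

```tcl
# Hypothetical helper: true if idx is a single index as accepted by
# lrange, lreplace and similar. Assumes lrange validates index syntax
# even for an empty list.
proc isSimpleIndex {idx} {
    expr {![catch {lrange {} $idx $idx}]}
}

# Hypothetical helper: true if path is a list of single indices, the
# form lindex accepts for addressing nested lists.
proc isIndexPath {path} {
    # Reject values that are not well-formed lists.
    if {[catch {llength $path}]} {
        return 0
    }
    foreach idx $path {
        if {![isSimpleIndex $idx]} {
            return 0
        }
    }
    return 1
}
```

With these definitions {0 0 0} is an index path but not a simple index, so a
single boolean answer cannot serve both families of commands.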
The second part of the TIP proposes the use of empty strings as indices.
This gives me some heartburn from both conceptual and practical
perspectives.
Leaving aside my personal dislike of attaching semantics to the empty
string, special "tokens" should at least have the same semantics wherever
they are used. (As an aside, Tcl already violates this in its use of "end"
where it means the position of the last unit in some commands and the
position *after* the last unit in others.) The empty string semantics in TIP
615 are even worse in that they depend not only on the command but on the
context as well, in particular both the position of the index argument AND
the value of *other* arguments. This leads to behavior that is inconsistent
and non-intuitive.
Consider the commands
set s abcd
string range $s $index $index
string replace $s $index $index X
The natural expectation is that at most one character will be returned or
replaced. Instead, we have
% set index {}
% string range $s $index $index
abcd
% string replace $s $index $index X
X
The problem is that the empty string in the first position means something
different from the empty string in the second position. What exacerbates
this is that the interpretation of the empty string in the second position
depends on the value in the first position. Further examples:
% string range $s 0 end
abcd
% string range $s -1 end
abcd
% string range $s 0 $index
a
% string range $s -1 $index
abcd
Note how "end" is interpreted the same way in the first two. $index (the
empty string), on the other hand, is interpreted differently depending on
whether a different argument is 0 or -1.
This is not only unintuitive, it also makes it harder to reason about
programs. Even the simplest of statements "string range $s $index $index" is
no longer simple to reason about in terms of effects.
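The "at most one character" expectation can be written as a checkable
invariant. The following is a minimal sketch against current Tcl behavior
(without TIP 615); the proc name singleIndexViolations is hypothetical.

```tcl
# Hypothetical helper: return the indices i for which
# [string range $s $i $i] yields more than one character.
proc singleIndexViolations {s indices} {
    set bad {}
    foreach i $indices {
        if {[string length [string range $s $i $i]] > 1} {
            lappend bad $i
        }
    }
    return $bad
}

# Integer and end-relative forms, including out-of-range ones.
set violations [singleIndexViolations abcd {-1 0 1 3 4 end end-1 end+1}]
```

For these index forms the list of violations should be empty. With the TIP
615 behavior shown above, the empty string index would break the invariant.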
Further, it also leads to incompatibilities with Tcl 9.
Tcl 9:
% lindex {a b c} {}
a b c
TIP 615:
% lindex {a b c} {}
(empty string)
Left unfixed, this means 9.1 is incompatible with 9.0 in this (and probably
other) cases. On the other hand, if you fix it, the lindex command treats
the empty string index differently from lrange et al., which would be
horrible.
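As an illustration of why the 9.0 behavior matters, here is a minimal sketch
of code that passes a computed index path to lindex, where an empty path
means "the whole value". The proc name getAt and the sample data are
hypothetical.

```tcl
# Hypothetical accessor: fetch the element of a nested list addressed by
# an index path. In Tcl 9.0 an empty path returns the value unchanged.
proc getAt {tree path} {
    return [lindex $tree $path]
}

set tree {a {b c} d}
set whole [getAt $tree {}]      ;# the whole list in Tcl 9.0
set inner [getAt $tree {1 0}]   ;# first element of the sublist {b c}
```

Code of this shape relies on the empty index list being the identity, so
changing its meaning in lindex would silently alter the result.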
I also expect the syntax to be consistently usable in all index contexts.
Permitting it to be used for some indices and not others creates complexity
for users. Contrast the use of the empty string with that of the "end" index
in TIP 615:
% lsearch -index [list end] {{a 1} {b 0} {c 2}} 0
1
% lsearch -index [list {}] {{a 1} {b 0} {c 2}} 0
Empty string cannot be compiled as index
Perhaps the above is a fixable bug, not a conceptual failing. But otherwise,
consider the difficulty in using constructs like [list $index] as an
argument to lsearch where it is not known whether $index can be empty.
I have not looked at the effects on the byte code compiler if any. One thing
that would need to be looked at is the encoding of the empty string within
bytecode. As it currently stands, the encoding of indices already covers the
entire range from -INT_MAX to INT_MAX, with negative numbers encoding the
end-N index values. There is no room for another value representing the
empty string. And there is certainly no current byte code mechanism to
encode an index whose meaning depends on the value of another argument. I
haven't looked at the
implementation though because of the above objections. Maybe commands with
empty string indices are not being compiled.
Finally, as for the benefits related to Tk compatibility, this feels like
the tail wagging the dog. Missteps in index semantics in Tk (my *uninformed*
opinion because I do not understand the need there either) need not carry
back to Tcl.
/Ashok