The attached patch to tclTest.c and reg.test shows
how Tcl_RegExpExecObj returns incorrect matches
in a great many cases when the 'TCL_REG_CAN_MATCH'
flag is used.
This flag is supposed to make regexp return (if it
can't find a complete match) the first possible
position at which a match could occur. However, it
returns positions at which it is quite obvious no
match could ever occur!
For example:
    set pat {.x}
    set line "asd asd"
    # can match the last char, if followed by x
    set res [testregexp -xflags -- c $pat $line resvar]
    lappend res $resvar
This should return '0 6' to say there is no match (0) but
a match could possibly occur at position 6 (with '.'
matching the final 'd', if an 'x' were to be appended
to the string).
Unfortunately it returns '0 0', which is quite wrong.
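
To make the expected answer concrete, here is a minimal sketch of the semantics described above. It is not Tcl's regexp engine. It handles only patterns built from literal characters and '.', and the names prefix_matches and can_match_start are hypothetical helpers invented for this illustration. Returning -1 for a complete match and the text length when a match would have to start entirely in appended text are conventions of this sketch only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a Tcl API): does text[0..n) match pat[0..n),
 * where '.' in pat matches any single character? */
static int prefix_matches(const char *pat, const char *text, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        if (pat[k] != '.' && pat[k] != text[k]) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical model of the "can match" position for patterns made only
 * of literals and '.'. Returns -1 if a complete match already exists;
 * otherwise returns the first index at which a match could begin if more
 * text were appended (the text length if only appended text could match). */
long can_match_start(const char *pat, const char *text)
{
    size_t plen = strlen(pat);
    size_t tlen = strlen(text);

    /* A complete match anywhere takes precedence. */
    for (size_t i = 0; i + plen <= tlen; i++) {
        if (prefix_matches(pat, text + i, plen)) {
            return -1;
        }
    }

    /* Otherwise look for the earliest suffix of the text that matches
     * a proper prefix of the pattern. */
    for (size_t i = 0; i <= tlen; i++) {
        size_t remaining = tlen - i;
        if (remaining < plen && prefix_matches(pat, text + i, remaining)) {
            return (long) i;
        }
    }
    return (long) tlen;
}
```

With this model, can_match_start(".x", "asd asd") gives 6, which corresponds to the second element of the '0 6' result expected from the test above.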
Logged In: YES
user_id=32170
Attaching patch to reg.test
Note: the patch has Windows EOLs.
Logged In: YES
user_id=32170
Attaching patch to tclTest.c
Logged In: YES
user_id=79902
Surely if you know where the end of the string is, you know
that the match area can't go past it! Yes?
Logged In: YES
user_id=32170
No: the whole point of the TCL_REG_CAN_MATCH flag is
to tell you where a match *might* be possible, if the
string given were a bit longer.
Logged In: YES
user_id=32170
Tcl now contains some 'knownBug' tests for this
problem (reg.test). Since TIP 113 has now passed, and
this bug can easily be exhibited through it without the
need to write C code, I'm upping the bug priority.
Logged In: YES
user_id=32170
One more subtlety here. A full implementation of this flag
ought to provide two pieces of information:
(i) if there was no match, where a match could occur if
there was more text
(ii) even if there was a match, whether the match could be
longer (and possibly start earlier in the text) if there was
more text
The current implementation handles (i) in a buggy way, and
doesn't handle (ii) at all.