#4754 "switch -regexp -indexvar" gives invalid range

Milestone: obsolete: 8.5.9
Status: closed-fixed
Priority: 5
Updated: 2012-05-17
Created: 2010-11-10
Private: No

TIP #75 states that the new -indexvar option, used together with the -regexp option of switch, should behave like the -indices option of the regexp command: it should return the range of the overall match and the ranges of the submatches for any subexpressions.
The man page also points to "regexp -indices", but says that the index variable contains range(s) running from the first matching character to the character after the last matching character.

So TIP #75 is not implemented as specified, and the behavior of "switch -regexp -indexvar" is not comparable to "regexp -indices".

An example:

% switch -regexp -indexvar i -matchvar m "abcdef" {
    ^abc {
        puts "matchvar = '$m'"
        puts "indexvar = [list $i]"
        puts "string range = '[string range "abcdef" {*}[lindex $i 0]]'"
    }
}
matchvar = 'abc'
indexvar = {{0 3}}
string range = 'abcd'
% string range "abcdef" {*}[lindex [regexp -indices -inline {^abc} "abcdef"] 0]
abc

The two commands should behave consistently, so the -indexvar option of switch should be corrected!
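
The requested consistency can be phrased as a direct comparison. The following is a minimal sketch, not part of the original report: `compareRanges` is a hypothetical helper name, and the pattern with one subexpression is chosen only for illustration. It collects the ranges from both commands for the same pattern and string so that they can be inspected side by side. On an affected version the end indices from switch are expected to be one larger than those from regexp.

```tcl
# Hypothetical helper (not part of Tcl): returns a two-element list holding
# the ranges from [regexp -indices -inline] and from
# [switch -regexp -indexvar] for the same pattern and string.
proc compareRanges {pattern string} {
    set fromRegexp [regexp -indices -inline -- $pattern $string]
    set fromSwitch {}
    # The pattern/body pairs are passed as a single list argument.
    switch -regexp -indexvar idx -- $string [list $pattern {
        set fromSwitch $idx
    }]
    return [list $fromRegexp $fromSwitch]
}

lassign [compareRanges {^a(b)c} "abcdef"] r s
puts "regexp -indices:  $r"
puts "switch -indexvar: $s"
```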

Discussion

  • Martin Lemburg

    Martin Lemburg - 2010-11-10

    This bug affects 8.6beta, too!

     
  • Martin Lemburg

    Martin Lemburg - 2010-11-10

    In tclCmdMZ.c:
    line 276+277 (Tcl_RegexpObjCmd):
    match = Tcl_RegExpExecObj(interp, regExpr, objPtr, offset, numMatchesSaved, eflags);

    line 3740+3741 (Tcl_SwitchObjCmd):
    int matched = Tcl_RegExpExecObj(interp, regExpr, stringObj, 0, numMatchesSaved, 0);

    The first call to Tcl_RegExpExecObj passes an "offset", which is used later on in ...

    line 341-343:
    if (end >= offset) {
        end--;
    }

    ... to correct the "end" index so that it points to the last character of the match.

    This is missing in Tcl_SwitchObjCmd!

    Since my time is limited I was not yet able to think about the offset ... :(
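
The adjustment described above can be mirrored at the script level as a stopgap. This is a minimal sketch under stated assumptions: `toInclusive` is a hypothetical helper, it assumes the ranges come from an affected interpreter (end index one past the match), and it uses an offset of 0 because Tcl_SwitchObjCmd passes 0 to Tcl_RegExpExecObj in the call quoted above.

```tcl
# Hypothetical helper (not part of Tcl): converts ranges whose end index
# points one past the match into ranges whose end index points at the
# last matching character, as [regexp -indices] reports them.
proc toInclusive {ranges} {
    set result {}
    foreach range $ranges {
        lassign $range start end
        # Script-level counterpart of "if (end >= offset) end--;" with
        # offset 0; a range of {-1 -1} is left untouched.
        if {$end >= 0} {
            incr end -1
        }
        lappend result [list $start $end]
    }
    return $result
}

# {{0 3}} is the value reported in the example from the ticket description.
set ranges {{0 3}}
puts [string range "abcdef" {*}[lindex [toInclusive $ranges] 0]]  ;# prints abc
```

Applying this helper on an interpreter that already reports inclusive indices would shorten every range by one character, so it only makes sense on affected versions.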

     
  • Alexandre Ferrieux

    Sure, the only problem is that the code is consistent with the documentation. Bug locked in :(
    Ready to TIP ?

     
  • Martin Lemburg

    Martin Lemburg - 2010-11-11

    IMHO, software that is implemented wrongly, or not as specified, is buggy; otherwise the specification must be changed.

    And there should really be no need to TIP a change that makes "switch -regexp -indexvar" behave as specified!

     
  • Alexandre Ferrieux

    At the end of the day, what counts as the specification is the manpage, not the unwritten intention behind the TIP. Here the manpage says:

    ... will be a two-element list specifying the index of
    the start and index of the first character after the end of
    the overall substring of the input string

    Hence, any application that works today and uses the -indexvar option with its current, documented semantics will suddenly fail if the change is applied. That is an API change, and hence it needs a TIP.

     
  • Andreas Leitgeb

    Andreas Leitgeb - 2010-11-11

    The current documentation is itself contradictory: it both states the (non-TIP75-conformant) behaviour and refers to regexp as behaving alike.

    Since the current behaviour is at odds not only with regexp but also with everything else in Tcl that involves ranges, it should be seen as a bug in both the implementation and the documentation, especially as keeping it is likely to create confusion and do more harm than changing this relatively new feature.

     
  • Donal K. Fellows

    The intent was clearly to mirror [regexp -indices]. That it does not is a bug.

     
  • Martin Lemburg

    Martin Lemburg - 2010-11-11

    What is more important: the described but wrongly implemented intention (regexp -indices behavior), or a man page that correctly describes the wrong behavior? The man page should, or rather must, describe it, because otherwise the man page itself would be "buggy"!

    No. No behavior is right merely because it is documented; it is right only if the path to that behavior is well documented and the final behavior matches the intentions, which may change along the way!

    Here the intentions were clear from the start, but are not met at the end!

    And it is the specification that justifies the behavior, not the man page, which only describes it!

    So even a bug fix could cause incompatibilities!

     
  • Donal K. Fellows

    Sure, it could cause incompatibilities to fix a bug but it was nonetheless a bug. Now it's a fixed bug. :-) Note from the ChangeLog:

    ***POTENTIAL INCOMPATIBILITY***
    Uses of [switch -regexp -indexvar] that previously compensated for the
    wrong offsets (by subtracting 1 from the end indices) now do not need
    to do so as the value is correct.
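
A script that has to run on both fixed and unfixed interpreters could probe the behavior instead of hard-coding the compensation. The following is a minimal sketch, not taken from the ChangeLog: `indexvarIsInclusive` is a hypothetical helper that matches a one-character string and checks whether the reported end index is 0 (inclusive) or 1 (one past the match).

```tcl
# Hypothetical helper (not part of Tcl): returns 1 if this interpreter's
# [switch -regexp -indexvar] reports the index of the last matching
# character, and 0 if it reports the index of the character after it.
proc indexvarIsInclusive {} {
    switch -regexp -indexvar idx -- "a" {
        a {
            # Overall match of "a" in "a": {0 0} if inclusive, {0 1} if not.
            return [expr {[lindex $idx 0 1] == 0}]
        }
    }
    return 0
}

switch -regexp -indexvar idx -- "abcdef" {
    ^abc {
        lassign [lindex $idx 0] start end
        # Compensate only where the old offsets are still in effect.
        if {![indexvarIsInclusive]} {
            incr end -1
        }
        puts [string range "abcdef" $start $end]
    }
}
```

Where the indices are only needed to extract the matched text, using -matchvar instead avoids the question altogether.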

     
  • Donal K. Fellows

    • status: open --> closed-fixed