#2522 matching variables are not being set correctly

obsolete: 8.4.4
closed-invalid
5
2003-11-10
2003-11-09
Anonymous
No

Tcl Version: 8.4.4
OS Platform: Linux, Red Hat 7.3
Problem Behavior: matching variables are not being set
when there is a + at the end of the input string

% regexp -all {([0-9|A-Z]*)} {31MAR+} match var
2
% puts $var

% regexp -all {([0-9|A-Z]*)} {31MAR} match var
1
% puts $var
31MAR

Discussion

  • miguel sofer

    miguel sofer - 2003-11-09
    • assigned_to: nobody --> dgp
    • status: open --> pending-invalid
     
  • miguel sofer

    miguel sofer - 2003-11-09

    Logged In: YES
    user_id=148712

    There is no bug here?

    Look at what is happening if you return both matches inline:
    % regexp -inline -all {([0-9|A-Z]*)} {31MAR+}
    31MAR 31MAR {} {}
    % regexp -indices -inline -all {([0-9|A-Z]*)} {31MAR+}
    {0 4} {0 4} {5 4} {5 4}

    This is probably not what the programmer wanted, but it is
    what he requested in the regexp: a sequence of 'ZERO or more
    digits, "|" or caps', and return the matching sequence. Two
    such sequences are found here: "31MAR" (wanted) and a zero
    length one.

    The pattern seems badly specified to me; a proper definition
    of zero-length matches is difficult at best. The correct
    thing would have been to request a sequence of ONE or more
    such symbols (+ instead of *):
    % regexp -inline -all {([0-9|A-Z]+)} {31MAR+}
    31MAR 31MAR

    BTW, I can imagine that the wanted regexp is actually
    {([0-9A-Z]+)} - ie, that the symbol "|" should not match -
    but this is second-guessing the submitter.

    Assigning to another maintainer for confirmation of this
    dismissal.
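
    To illustrate why $var comes back empty in the report: the regexp
    documentation says that when -all is combined with match variables,
    the variables contain information for the last match only. The
    following is a minimal sketch reusing the patterns from this ticket;
    the variable names (text, starVar, plusVar and so on) are chosen
    here purely for illustration.

```tcl
set text {31MAR+}

# Zero-or-more (*): the final match is the empty string, and with -all
# the match variables keep only that final match.
set starCount [regexp -all {([0-9|A-Z]*)} $text starMatch starVar]
puts "star: count=$starCount var=<$starVar>"

# One-or-more (+): empty matches are impossible, so the only (and
# therefore last) match is the real token.
set plusCount [regexp -all {([0-9|A-Z]+)} $text plusMatch plusVar]
puts "plus: count=$plusCount var=<$plusVar>"
```

    If every match is needed rather than just the last one, -inline
    (as used above) returns them all as a list.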

     
  • Don Porter

    Don Porter - 2003-11-10
    • milestone: --> obsolete: 8.4.4
    • status: pending-invalid --> closed-invalid
     
  • Don Porter

    Don Porter - 2003-11-10

    Logged In: YES
    user_id=80530

    The pattern matches the empty string.

    The default "greedy" scheme of matching sucks up the whole
    string as the first match, though. Then the trailing empty
    string is the only one left that can match.

    Consider:
    % regexp -all -inline {[0-9|A-Z]*} 31MAR+
    31MAR {}

    I see no bug here.

     
  • Donal K. Fellows

    Logged In: YES
    user_id=79902

    Also consider this:
    % regexp -all -inline {[0-9]+|[A-Z]+} {31MAR+}
    31 MAR
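
    As a possible way to use that last form in a script, the list
    returned by -inline can be walked with foreach. This is only a
    sketch; the names tokens and tok are illustrative and not part of
    the original report.

```tcl
# Collect every run of digits or capitals; neither alternative can
# match the empty string, so no empty entries are produced.
set tokens [regexp -all -inline {[0-9]+|[A-Z]+} {31MAR+}]

foreach tok $tokens {
    puts "token: $tok"
}
```

    Because the pattern has no capturing parentheses, each list element
    is one whole match rather than a match/submatch pair.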