#1255 string match oddity when - is the last char

Milestone: obsolete: 8.3
Status: open-later
Priority: 3
Updated: 2004-10-04
Created: 2000-10-26
Creator: Anonymous
Private: No

OriginalBugID: 4438 Bug
Version: 8.3
SubmitDate: '2000-03-21'
LastModified: '2000-04-03'
Severity: MED
Status: Assigned
Submitter: techsupp
ChangedBy: hobbs
OS: Windows 95
OSVersion: OSR2
FixedDate: '2000-10-25'
ClosedDate: '2000-10-25'

Name:
Keith Lea

ReproducibleScript:
% string match {[a-z0-9_/-]} \\
1
% string match {[a-z0-9_/]} \\
0

It's accidentally interpreting the "/-]" as "/-]]", taking the last
] as both the end of the range and the end of the bracket term.
-- 04/03/2000 hobbs
This needs to be fixed in Tcl_String(Case)Match in tclUtil.c,
but wait until 8.4 just in case someone was counting on the
previous perverse behavior.
-- 04/03/2000 hobbs
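
The misparse described above can be probed with a short script. This is only a sketch: the proc name `probeClass` and the candidate list are hypothetical, and what the first call prints depends on whether the interpreter still has the behaviour reported in this ticket. The idea is that '/' is 0x2F and ']' is 0x5D, so if "/-]" is read as a range, characters between them (uppercase letters, the backslash) would be accepted.

```tcl
# Hypothetical helper: return the subset of $chars that a one-character
# bracket pattern accepts according to [string match].
proc probeClass {pattern chars} {
    set hits {}
    foreach c $chars {
        if {[string match $pattern $c]} {
            lappend hits $c
        }
    }
    return $hits
}

# Candidates include characters that lie between '/' (0x2F) and ']' (0x5D).
set candidates [list a 5 _ / A Z \\ ~]

# Pattern from the report, with the trailing hyphen.
puts [probeClass {[a-z0-9_/-]} $candidates]

# Same set without the hyphen, for comparison.
puts [probeClass {[a-z0-9_/]} $candidates]
```

If the first line lists more characters than the second plus a hyphen would explain, the trailing "-]" is being treated as a range end.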

Discussion

  • Donal K. Fellows

    Hmm. The problem seems to be that the first pattern is actually malformed by the rules of [string match], but there is no way to indicate this. I suppose the correct way of dealing with this is to decide that we were not really matching a range after all, but that's not very good at all. Either that, or we state that a malformed pattern matches nothing at all.

    Hmm. On successful matching of a range, should we really back up a character at the unexpected end of string, or should we fail at that point?

     
  • Donal K. Fellows

    Improved detection of bug:
    % string match \[a-] ]
    1
    % string match \[a-]x ]x
    0

     
  • Donal K. Fellows

    • priority: 5 --> 6
     
  • Donal K. Fellows

    • labels: 104238 --> 18. Commands M-Z
     
  • Donal K. Fellows

    • assigned_to: nobody --> dkf
     
  • Donal K. Fellows

    Apparently, the way [string match] handles syntactically invalid patterns is by failing to match anything at all. It's not entirely clear to me that this is an optimal strategy...

     
  • Donal K. Fellows

    Logged In: YES
    user_id=79902

    This behaviour won't be changed before 9.0

    Any fixes that *are* done must be applied to code in both tclUtil.c and tclUtf.c

     
  • Donal K. Fellows

    • priority: 6 --> 3
    • status: open --> open-later
     
  • Simon Bachmann

    Simon Bachmann - 2003-11-23

    Logged In: YES
    user_id=915599

    I encountered a problem with `string match' too. I suppose
    it is, ultimately, the same bug: it is possible to match a [ in
    a character set by quoting it with \. This should work with ]
    too, but it doesn't!

    % string match {[\[]} {[}
    1
    % string match {[\]]} {]}
    0

     
  • Donal K. Fellows

    Logged In: YES
    user_id=79902

    Backslashes aren't special inside the square-bracket term,
    so the first match doesn't match what you are expecting and
    the second match isn't looking for what you think it is:

    % set str \\
    \
    % string match {[\[]} $str
    1
    % set str {\]}
    \]
    % string match {[\]]} $str
    1

    The glob-matching engine used by [string match] isn't very
    smart... :^( Consider using regular expressions instead.
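
The suggestion to use regular expressions could look like the sketch below. In Tcl's regular expression syntax a hyphen placed last in a bracket expression is a literal, so the set from the original report can be written as it stands. The proc name `isPathChar` is hypothetical; the `^` and `$` anchors are needed because `regexp` otherwise matches anywhere in the string.

```tcl
# Hypothetical helper: true if $ch is exactly one character drawn from
# a-z, 0-9, '_', '/' or '-'.
proc isPathChar {ch} {
    # "--" ends switch parsing so that no argument is taken for an option.
    return [regexp -- {^[a-z0-9_/-]$} $ch]
}

puts [isPathChar -]     ;# hyphen is a member of the set
puts [isPathChar \\]    ;# backslash is not
```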

     
  • Donal K. Fellows

    • summary: string match doesn't work as documented when - is the last c --> string match oddity when - is the last char
     
  • Matthias Kraft

    Matthias Kraft - 2007-06-14

    Logged In: YES
    user_id=330806
    Originator: NO

    Hi!

    I add another example to the list:

    % string match {*[.-]*} "2.1-Beta"
    0
    % string match {*[-.]*} "2.1-Beta"
    1
    % string match {*[.-]*} "Beta-2.1"
    1
    % string match {*[-.]*} "Beta-2.1"
    1

    If this is not going to be fixed soon, I suggest at least documenting this behavior. It could be enough to just add the hyphen to the list of characters that need to be escaped...

    kind regards
    -- Matthias Kraft
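
The examples above suggest one workaround: keep the hyphen at the front of the bracket term, where it is not followed by a range end. A minimal sketch of that idea follows; the proc name `literalClass` is hypothetical, and it only handles literal characters (no ranges, no ']').

```tcl
# Hypothetical helper: build a [string match] bracket term from a list of
# literal characters, moving any hyphen to the front so that it cannot be
# read as a range operator.
proc literalClass {chars} {
    set hyphen ""
    set rest ""
    foreach c $chars {
        if {$c eq "-"} {
            set hyphen "-"
        } else {
            append rest $c
        }
    }
    return "\[$hyphen$rest\]"
}

set cls [literalClass {. -}]   ;# builds the term [-.]
puts [string match "*${cls}*" "2.1-Beta"]
puts [string match "*${cls}*" "Beta-2.1"]
```

With the hyphen first, both version strings from the report are matched the same way regardless of where the hyphen occurs in them.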

     
  • Matthias Kraft

    Matthias Kraft - 2007-06-14

    Logged In: YES
    user_id=330806
    Originator: NO

    I forgot to mention that the example below is taken from a recent Tcl interpreter (8.4.14)...

     
  • Donal K. Fellows

    Logged In: YES
    user_id=79902
    Originator: NO

    Comments below indicate "wait until 9.0", this being a strategy that was mainly from Jeff.

     
  • gustafn

    gustafn - 2010-11-18

    I was hit by apparently the same problem.

    % string match {[-.]} -
    1
    % string match {[.-]} -
    0
    # even worse
    % string match {[.-]} A
    1

    This is certainly unexpected behavior and should be at least documented.

     
