#399 word delimiters ignored

Milestone: release
Status: closed-fixed
Owner: Eddy De Greef
Labels: Program (402)
Priority: 5
Updated: 2004-08-20
Created: 2004-08-13
Creator: Victor Shoup
Private: No

I have overridden the default word delimiters for LaTeX
(which, by the way, should be changed).
This worked fine in v5.1.1, but not in v5.4.
In v5.4, while incremental search respects my word delimiters,
ordinary search and search-and-replace do not.

For example, I added the character "_" to the word delimiter list.
When I do a search for the pattern "<x>" in the text "x_1",
it does not find "x".

Discussion

  • Eddy De Greef
    2004-08-13

    Logged In: YES
    user_id=73597

    I think the problem only exists for the '_' character. It's
    not a general delimiter problem.

    The meaning of '<' and '>' was slightly changed in 5.4 to be
    more conventional, but as a side effect '_' is now always
    considered to be a word character, so there can be no right
    word boundary in front of it.

    In fact, other regular expressions (\B, \w, and \W) will
    treat '_' (incorrectly) as a word character too in your
    case, even in v5.1.1, I think.

    So we should fix this and treat '_' only as a word character
    when it is not defined as a delimiter. Since this is not a
    regression (except for the '<' and '>' case), I presume this
    will have to wait till v5.5 has been released because we're
    currently in release preparation mode.
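
A minimal Python sketch of the rule proposed above: '_' counts as a word character only when it is not listed as a delimiter. This is not NEdit's code; `is_word_char`, `find_whole_word`, and the `delimiters` string are hypothetical names used for illustration, and `find_whole_word` only approximates a "<word>" search for a literal word.

```python
def is_word_char(ch, delimiters):
    """Assumed rule: a user-defined delimiter is never a word character."""
    if ch in delimiters:
        return False
    return ch.isalnum() or ch == "_"


def find_whole_word(text, word, delimiters):
    """Return the start offsets where `word` has a word boundary on both sides."""
    hits = []
    start = text.find(word)
    while start != -1:
        end = start + len(word)
        # A boundary exists at the text edge or next to a non-word character.
        left_ok = start == 0 or not is_word_char(text[start - 1], delimiters)
        right_ok = end == len(text) or not is_word_char(text[end], delimiters)
        if left_ok and right_ok:
            hits.append(start)
        start = text.find(word, start + 1)
    return hits
```

With an empty delimiter list, searching for "x" in "x_1" finds nothing, which mirrors the reported 5.4 behaviour; once "_" is in the list, the match at offset 0 is found.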

     
  • Eddy De Greef
    2004-08-13

    • labels: --> Program
    • milestone: --> release
    • assigned_to: nobody --> edg
     
  • Victor Shoup
    2004-08-13

    Logged In: YES
    user_id=1103421

    I have found nedit to be really great for editing latex files, and the ability
    to make "_" a word delimiter is extremely useful for changing
    variable names using "<...>", since "_" is the subscript character in latex.
    If you can fix this in upcoming versions, I'll continue to be a happy
    nedit user.
    For now, I'll have to continue using v5.1.1, since I find this feature
    indispensable.

     
  • Victor Shoup
    2004-08-13

    Logged In: YES
    user_id=1103421

    Just one other comment (and then I'll be quiet).
    In v5.1.1, double clicking to select a word does the right thing with "_",
    while \w does not. I never noticed that before.

    I'm not sure why edg does not classify this as a regression...
    it seems like that is a subjective call...as far as I'm concerned,
    "<...>", as well as double clicking to select a word, worked fine before,
    but now they are broken (in addition to \w, which was apparently
    always broken).

    I'm also wondering why nedit does not have a variable specifying
    which characters are word characters, rather than which characters
    are delimiter characters...the former would seem more natural
    (that's what vim does, for example).

     
  • Eddy De Greef
    2004-08-13

    Logged In: YES
    user_id=73597

    > I'm not sure why edg does not classify this as a regression...

    I was referring to the fact that the regular expression
    engine always treats an underscore as word character (for
    \w, \W, \B since the beginning and for < and > since v5.4; <
    and > only worked by accident before v5.4).

    But I suppose you are right. Even if < and > worked by
    accident, it's still a regression from a user's point of
    view (although not a 5.5 regression and I'm not sure whether
    I'm "allowed" to fix older regressions right now).

    I'd rather fix all cases at once, though, instead of only
    this one now and the \w, \W, and \B cases later on.

    > I'm also wondering why nedit does not have a variable specifying
    > which characters are word characters...

    Good question. It may have been a poor decision, but it
    would be hard to change this now or in the future.
    NEdit implicitly defines 3 classes of characters, actually:
    word characters (alphanumeric, locale-dependent +
    underscore), delimiters, and the "rest". So word characters
    are not the inverse of delimiters.
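
The three implicit classes can be sketched in Python as follows. This is only an illustration under stated assumptions: `classify` and the sample delimiter strings are hypothetical, `str.isalnum` is Unicode-based rather than locale-dependent, and the sketch lets a user-defined delimiter take precedence over the word-character test, which is the behaviour requested in this report and not necessarily what the regex engine does.

```python
def classify(ch, delimiters):
    """Place one character into one of three classes: delimiter, word, or rest."""
    if ch in delimiters:
        # Assumption: an explicit delimiter wins over the word-character test.
        return "delimiter"
    if ch.isalnum() or ch == "_":
        return "word"
    # Neither a word character nor a delimiter.
    return "rest"
```

For example, with the hypothetical delimiter string ".,_" the character "@" falls into neither the word class nor the delimiter class, which is why the word characters cannot be derived simply by inverting the delimiter list.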

     
  • Joerg Fischer
    2004-08-13

    Logged In: YES
    user_id=918104

    I don't understand why you stick to 5.1.1. Version 5.4 or
    the upcoming 5.5 are much more comfortable to use.
    (And contain fewer bugs.)

    Ok, there is a problem with regex and "_", and Eddy will
    fix it (probably not for 5.5, since the problem was in
    there before).

    But if all you want to do is to change variable names,
    you can still do so by replacing
    "foo(?:>|(?=_))" with "baa", or am I missing anything?

    (It is only about regex search; literal search and selections
    do work, don't they?)
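
The workaround above can be tried out in Python as a rough analogue. Python's `re` module is not NEdit's regex engine: it has no `>` anchor, so `\b` stands in for it here, and like the engine described in this thread it always treats '_' as a word character. The `rename` helper is a hypothetical name.

```python
import re

# "foo" followed either by a word boundary or by an underscore
# (the lookahead checks for '_' without consuming it).
PATTERN = re.compile(r"foo(?:\b|(?=_))")


def rename(text):
    """Replace every matching occurrence of "foo" with "baa"."""
    return PATTERN.sub("baa", text)
```

Like the original pattern, this one has no left-hand anchor, so it would also match "foo" at the end of a longer word; adding a leading boundary anchor would tighten it.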

     
  • Nathan Gray
    2004-08-18

    Logged In: YES
    user_id=121553

    Eddy,

    I think you should try to fix this for 5.5. The new < and > behavior was
    introduced for 5.4 so it's fair game, but you should probably also fix the
    rest to maintain (or perhaps introduce) consistency.

     
  • Eddy De Greef
    2004-08-18

    Logged In: YES
    user_id=73597

    I checked the code and fixing everything to introduce
    consistency is definitely not something that I want to do
    for 5.5 any more. It requires changes all over the place.
    I'll restrict myself to patching the word boundary implementations
    only for now, to solve the submitter's problem (this
    requires only minor changes).
    In the longer term, we may have to make some more
    fundamental changes to the regex language to allow more
    flexibility in defining character classes etc., but that
    will require more thought and time.

     
  • Eddy De Greef
    2004-08-20

    Logged In: YES
    user_id=73597

    I've committed a fix to CVS.
    After some thought and some off-line discussion with Victor,
    I found a solution that almost gives back the 5.1.1
    behaviour without breaking the fix for the problem that we
    wanted to solve in 5.4, and which required only minor changes.
    I think this solves Victor's problem and restores the
    consistency that got lost in 5.4, without any side-effects.

     
  • Eddy De Greef
    2004-08-20

    • status: open --> closed-fixed