#399 word delimiters ignored

Milestone: release

Status: closed-fixed

Owner: Eddy De Greef

Labels: Program (402)

Priority: 5

Updated: 2004-08-20

Created: 2004-08-13

Creator: Victor Shoup

Private: No

I have overridden the default word delimiters for latex
(which, by the way, should be changed).
This worked fine in v5.1.1, but not in v5.4.
In v5.4, while incremental search respects my word delimiters,
ordinary search and search-and-replace do not.

For example, I added the character "_" to the word delimiter list.
When I do a search for the pattern "<x>" in the text "x_1",
it does not find "x".

Discussion

Eddy De Greef - 2004-08-13

Logged In: YES
user_id=73597

I think the problem only exists for the '_' character. It's
not a general delimiter problem.

The meaning of '<' and '>' was slightly changed in 5.4 to be
more conventional, but as a side effect '_' is now always
considered to be a word character, so there can be no right
word boundary in front of it.

In fact, other regular expressions (\B, \w, and \W) will
treat '_' (incorrectly) as a word character too in your
case, even in v5.1.1, I think.

So we should fix this and treat '_' only as a word character
when it is not defined as a delimiter. Since this is not a
regression (except for the '<' and '>' case), I presume this
will have to wait till v5.5 has been released because we're
currently in release preparation mode.
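As an illustration of the proposed rule (treat '_' as a word character only when it is not defined as a delimiter), here is a minimal Python sketch. It is not NEdit's actual code: the function names `is_word_char` and `find_whole_word` are hypothetical, and the delimiter list is passed in as a plain string.

```python
def is_word_char(ch: str, delimiters: str) -> bool:
    """Assumed rule: alphanumerics and '_' are word characters,
    unless the user has listed the character as a delimiter."""
    if ch in delimiters:
        return False
    return ch.isalnum() or ch == "_"


def find_whole_word(text: str, word: str, delimiters: str) -> int:
    """Emulate a search for <word>: return the index of the first
    occurrence that has a word boundary on both sides, or -1."""
    start = text.find(word)
    while start != -1:
        end = start + len(word)
        # Left boundary: start of text, or previous char is not a word char.
        left_ok = start == 0 or not is_word_char(text[start - 1], delimiters)
        # Right boundary: end of text, or next char is not a word char.
        right_ok = end == len(text) or not is_word_char(text[end], delimiters)
        if left_ok and right_ok:
            return start
        start = text.find(word, start + 1)
    return -1
```

With "_" in the delimiter string, searching "x_1" for "x" succeeds. With an empty delimiter string, "_" counts as a word character and the same search fails, which matches the behaviour reported above for v5.4.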

Eddy De Greef - 2004-08-13

labels: --> Program

milestone: --> release

assigned_to: nobody --> edg

Victor Shoup - 2004-08-13

Logged In: YES
user_id=1103421

I have found nedit to be really great for editing latex files, and the ability
to make "_" a word delimiter is extremely useful for changing
variable names using "<...>", since "_" is the subscript character in latex.
If you can fix this in upcoming versions, I'll continue to be a happy
nedit user.
For now, I'll have to continue using v5.1.1, since I find this feature
indispensable.

Victor Shoup - 2004-08-13

Logged In: YES
user_id=1103421

Just one other comment (and then I'll be quiet).
In v5.1.1, double clicking to select a word does the right thing with "_",
while \w does not. I never noticed that before.

I'm not sure why edg does not classify this as a regression...
it seems like that is a subjective call...as far as I'm concerned,
"<...>", as well as double clicking to select a word, worked fine before,
but now they are broken (in addition to \w, which was apparently
always broken).

I'm also wondering why nedit does not have a variable specifying
which characters are word characters, rather than which characters
are delimiter characters...the former would seem more natural
(that's what vim does, for example).

Eddy De Greef - 2004-08-13

Logged In: YES
user_id=73597

> I'm not sure why edg does not classify this as a regression...

I was referring to the fact that the regular expression
engine always treats an underscore as word character (for
\w, \W, \B since the beginning and for < and > since v5.4; <
and > only worked by accident before v5.4).

But I suppose you are right. Even if < and > worked by
accident, it's still a regression from a user's point of
view (although not a 5.5 regression and I'm not sure whether
I'm "allowed" to fix older regressions right now).

I'd rather fix all cases at once, though, instead of only
this one now and the \w, \W, and \B cases later on.

> I'm also wondering why nedit does not have a variable specifying
> which characters are word characters...

Good question. It may have been a poor decision, but it
would be hard to change this now or in the future.
NEdit implicitly defines 3 classes of characters, actually:
word characters (alphanumeric, locale-dependent +
underscore), delimiters, and the "rest". So word characters
are not the inverse of delimiters.
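To make the three implicit classes concrete, here is a small Python sketch. It is only an illustration under stated assumptions: the names `CharClass` and `classify_char` are hypothetical, the delimiter string is an arbitrary example rather than NEdit's default, and letting the delimiter list take precedence over the word-character test is an assumption, not the engine's documented behaviour.

```python
from enum import Enum


class CharClass(Enum):
    WORD = "word"
    DELIMITER = "delimiter"
    OTHER = "other"


def classify_char(ch: str, delimiters: str) -> CharClass:
    """Classify one character into the three implicit classes.

    Assumption: a character listed as a delimiter is a delimiter,
    even if it would otherwise count as a word character.
    """
    if ch in delimiters:
        return CharClass.DELIMITER
    if ch.isalnum() or ch == "_":
        return CharClass.WORD
    # Neither a word character nor a listed delimiter: the "rest".
    return CharClass.OTHER
```

With a delimiter string of ".,;", the character "$" falls into the "rest" class: it is not a delimiter, yet it is not a word character either, so a "word characters" variable could not simply be derived by inverting the delimiter list.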

Joerg Fischer - 2004-08-13

Logged In: YES
user_id=918104

I don't understand why you stick to 5.1.1. Version 5.4 or
the upcoming 5.5 are much more comfortable to use.
(And contain fewer bugs.)

Ok, there is a problem with regex and "_", and Eddy will
fix it (probably not for 5.5, since the problem was in
there before).

But if all you want to do is to change variable names,
you can still do so by replacing
"foo(?:>|(?=_))" with "baa", or am I missing anything?

(It is only about regex search; literal search and selections
do work, don't they?)
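The same workaround idea can be tried out in Python's `re` module, which also always treats "_" as a word character. This is only an analogy: Python uses `\b` where NEdit uses `>`, and the helper name `rename_variable` is hypothetical.

```python
import re


def rename_variable(text: str, old: str, new: str) -> str:
    """Replace `old` where it ends at a word boundary OR is directly
    followed by '_' (checked with a lookahead, so '_' is not consumed)."""
    pattern = re.escape(old) + r"(?:\b|(?=_))"
    # A function replacement avoids backslash processing in `new`.
    return re.sub(pattern, lambda match: new, text)


sample = "foo_1 + foo + foobar"
plain = re.sub(r"foo\b", "baa", sample)      # misses foo_1
fixed = rename_variable(sample, "foo", "baa")
```

Here `plain` is "foo_1 + baa + foobar" and `fixed` is "baa_1 + baa + foobar". Like the pattern above, the sketch checks only the right-hand side of the name, so a left word boundary would have to be added to avoid matching inside longer names.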

Nathan Gray - 2004-08-18

Logged In: YES
user_id=121553

Eddy,

I think you should try to fix this for 5.5. The new < and > behavior was
introduced for 5.4 so it's fair game, but you should probably also fix the
rest to maintain (or perhaps introduce) consistency.

Eddy De Greef - 2004-08-18

Logged In: YES
user_id=73597

I checked the code and fixing everything to introduce
consistency is definitely not something that I want to do
for 5.5 any more. It requires changes all over the place.
I'll restrict myself to patching the word boundary implementations
only for now, to solve the submitter's problem (this
requires only minor changes).
In the longer term, we may have to make some more
fundamental changes to the regex language to allow more
flexibility in defining character classes etc., but that
will require more thought and time.

Eddy De Greef - 2004-08-20

Logged In: YES
user_id=73597

I've committed a fix to CVS.
After some thought and some off-line discussion with Victor,
I found a solution that almost gives back the 5.1.1
behaviour without breaking the fix for the problem that we
wanted to solve in 5.4, and which required only minor changes.
I think this solves Victor's problem and restores the
consistency that got lost in 5.4, without any side-effects.

Eddy De Greef - 2004-08-20

status: open --> closed-fixed