#48 For the paranoid: look behind

Milestone: development
Status: closed
Labels: Program (79)
Priority: 5
Updated: 2002-07-09
Created: 2002-03-15
Private: No

This patch extends the regular expression engine
with look-behind matching (both positive and negative)
and adapts all code using REs to take into account
look-behind matching (most notably, the highlight
engine).

There is one limitation: matches must have a _bounded_
length, meaning that the *, +, and {n,} quantifiers are
not allowed in look-behind expressions. Note that most
other implementations (such as Perl, I believe) only
support _fixed_-size look-behind.
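
As an illustration of the difference between _fixed_ and _bounded_ look-behind, here is a minimal sketch using Python's `re` module as a stand-in. It is not the NEdit engine and is not part of the patch; the helper name `compiles` is made up for this example. Python accepts only fixed-width look-behind bodies, so it rejects a bounded but variable-width body, which is the kind of expression this patch is meant to allow.

```python
import re

def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Fixed width (always 3 characters): accepted by Python.
fixed = r"(?<=foo)bar"
# Bounded but variable width (2 to 4 characters): rejected by Python,
# although a bounded length is what the patch description permits.
bounded = r"(?<=a{2,4})b"
# Unbounded width: rejected by Python, and excluded by the patch as well.
unbounded = r"(?<=a+)b"

for pat in (fixed, bounded, unbounded):
    print(pat, compiles(pat))
```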

Drawback: there may be a slight performance decrease
in the regex compilation phase (3% or so).
(But don't worry, I have another patch in the pipeline
to crank up syntax highlighting performance considerably.)

I haven't tested it thoroughly yet; I will do so in
the future. But if anyone already wants to play with
it, they can. Feedback on performance, bug reports,
etc. is welcome.

PS: No, this is not for 5.3 anymore :-) The patch
is made against the main trunk.

Discussion

  • Eddy De Greef

    Eddy De Greef - 2002-03-22

    Logged In: YES
    user_id=73597

    I've uploaded a revised version (V2):

    - It fixes a bug in _both_ look-ahead and look-behind
    (I copied the bug), when the pattern contains a
    branch (behaviour was unpredictable).
    (I don't plan to fix the look-ahead bug in 5.3,
    because it has been there since the beginning, and
    fixing it _could_ break some highlighting patterns
    that happen to rely on the faulty behaviour).

    - It makes the greedy/lazy behaviour of the look-behind
    patterns more consistent (the previous version used a
    questionable trick to prevent matching failures due
    to too greedy matching).
    This introduces an additional limitation: look-behind
    patterns should not end with patterns that need
    look-ahead information (this includes word-boundary
    matching etc.). That will hardly be a problem in
    practice, since it's unlikely that anyone will ever
    use it and, if necessary, the look-ahead pattern
    can simply be placed after the look-behind pattern
    instead of inside it.
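
The workaround mentioned above (moving the look-ahead out of the look-behind) can be sketched as follows. This again uses Python's `re` module purely as a stand-in, not the NEdit engine, and the sample text and variable names are invented for the example. The point is only that a look-ahead at the end of a look-behind body tests the same position as a look-ahead placed directly after the look-behind.

```python
import re

text = "foobar foobaz"

# Look-ahead nested at the end of the look-behind body.
inside = re.compile(r"(?<=foo(?=bar))")
# Same condition with the look-ahead moved after the look-behind.
after = re.compile(r"(?<=foo)(?=bar)")

# Both patterns are zero-width, so compare the positions they match at.
inside_hits = [m.start() for m in inside.finditer(text)]
after_hits = [m.start() for m in after.finditer(text)]
print(inside_hits, after_hits)
```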

     
  • Eddy De Greef

    Eddy De Greef - 2002-07-05

    This new version fixes a problem with executing
    replace-and-find repeatedly: look-behind patterns were not
    taken into account.
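
The comment does not say how the problem arose, so the following is only a sketch of one way look-behind context can get lost when a search resumes in the middle of the text. It uses Python's `re` module as a stand-in, not NEdit's code, and `resume_at` is a hypothetical name. Searching a slice hides the preceding characters from the look-behind, whereas passing a start position keeps them visible.

```python
import re

pattern = re.compile(r"(?<=foo)bar")
text = "foobar"
resume_at = 3  # hypothetical position where a repeated search resumes

# Searching a slice drops the text before resume_at, so the look-behind
# has nothing to inspect and the search fails.
sliced = pattern.search(text[resume_at:])
# Passing the start position keeps the whole string visible to the
# look-behind, so the match at resume_at is found.
with_context = pattern.search(text, resume_at)

print(sliced, with_context)
```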

    I intend to commit this version early next week, if nobody
    objects.

     
  • Eddy De Greef

    Eddy De Greef - 2002-07-09

    Committed and closed.

     
  • Eddy De Greef

    Eddy De Greef - 2002-07-09
    • status: open --> closed
     
