#20 Prevent backtracking for trivial cases

open
nobody
None
5
2006-04-08
2006-04-08
crisp
No

Some example using PCRE 6.6:

re> /([a-y]+|z)+#/gD
------------------------------------------------------------------
0 50 Bra 0
3 37 Bra 1
6 [a-y]+
40 5 Alt
43 z
45 42 KetRmax
48 #
50 50 Ket
53 End
------------------------------------------------------------------
Capturing subpattern count = 1
Partial matching not supported
No options
No first char
Need char = '#'
data> foo# bar# abcdefghijklmnopqrst bla#
0: foo#
1: foo
0: bar#
1: bar
0: bla#
1: bla
data> foo# bar# abcdefghijklmnopqrstu bla#
0: foo#
1: foo
0: bar#
1: bar
Error -8

As you can see the backtracking and the amount of
variations that are being tried eventually results in
PCRE_ERROR_MATCHLIMIT being raised.
In my opinion this should not be necessary if at first
a selection is made to determine which execution-paths
will lead to nowhere instead of just trying them all.
Such decision could be made if it becomes obvious that
there's a big chance we need to do some heavy
backtracking. The cost to exclude certain posibilities
imho outweighs the penalty of the backtracking or even
- as in this case - failure to complete the matching.

Discussion


Log in to post a comment.