#534 PCRE optional regexp

TIP Implementation
open
43. Regexp (5)
6
2007-12-30
2007-12-30
No

Attached is a diff that adds a configure --with-pcre option, plus -type classic|pcre and -binary options to [regexp]. The options are accepted in either build but are only functional when Tcl is built with --with-pcre.

Use --with-pcre=/path/to/pcre to point at the PCRE installation (or have it installed in a "default" location).

Initial testing shows that PCRE is significantly faster than the classic Spencer engine in all cases.

Discussion

  • Jeffrey Hobbs

    Jeffrey Hobbs - 2007-12-30

    Logged In: YES
    user_id=72656
    Originator: YES

    Updated version that adds:
    interp regexp {} ?classic|pcre?

    So set the default engine with [interp regexp {} pcre]. I've also added support in Tcl_RegExpExecObj to recognize compiled PCREs so that the compile case works. It currently assumes -binary operation by default.

    In the lmbench grep.tcl code, you need to add:

    # [interp regexp] only exists in a patched build, so probe for it first
    if {![catch {interp regexp {}}]} {
        puts stderr "PCRE regexp"
        interp regexp {} pcre
    } else {
        puts stderr "TCL regexp"
    }

    and then it will work as before, just faster.
    File Added: pcre.diff.gz

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2007-12-31

    Updated version that has cleaner integration. The conversion of RE compile flags is done at the caching of the object.

    This version includes fully correct handling in [regsub] (you'll find that it is mostly transparent to Tcl_RegsubObjCmd), with the whole Tcl_GetRegExpFromObj/Tcl_RegExpExecObj/Tcl_RegExpGetInfo path of execution handled 100% transparently for classic or PCRE REs.

    The translation of flags needs to be better reconciled between Spencer's flag meanings and PCRE's (like TCL_REG_NLSTOP, TCL_REG_NLMATCH == ??? in PCRE).
    File Added: pcre.diff2.gz
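
To make the flag-reconciliation question concrete, here is a minimal sketch of one possible mapping. It is not taken from the patch. It assumes the constant values below mirror tcl.h and pcre.h (PCRE 1.x), that the newline-anchoring flag is the one tcl.h calls TCL_REG_NLANCH, and the function name SketchTclFlagsToPcre is hypothetical. Because the two engines have opposite defaults for "." and newline, the absence of TCL_REG_NLSTOP is what turns PCRE_DOTALL on. The mapping is only approximate: PCRE negated bracket expressions still match newline, which TCL_REG_NLSTOP forbids.

```c
/* Assumed values, copied here so the sketch is self-contained.
 * Real code should include tcl.h and pcre.h instead. */
#define TCL_REG_NOCASE      000010  /* case-insensitive matching */
#define TCL_REG_EXPANDED    000040  /* whitespace and comments allowed in RE */
#define TCL_REG_NLSTOP      000100  /* "." and [^...] never match newline */
#define TCL_REG_NLANCH      000200  /* ^ and $ also match at newlines */

#define PCRE_CASELESS       0x00000001
#define PCRE_MULTILINE      0x00000002
#define PCRE_DOTALL         0x00000004
#define PCRE_EXTENDED       0x00000008
#define PCRE_DOLLAR_ENDONLY 0x00000020

/* Hypothetical helper: translate Spencer-style compile flags to PCRE ones. */
static int SketchTclFlagsToPcre(int tclFlags)
{
    int pcreFlags = 0;

    if (tclFlags & TCL_REG_NOCASE) {
        pcreFlags |= PCRE_CASELESS;
    }
    if (tclFlags & TCL_REG_EXPANDED) {
        pcreFlags |= PCRE_EXTENDED;
    }
    if (tclFlags & TCL_REG_NLANCH) {
        /* Line-sensitive anchors. */
        pcreFlags |= PCRE_MULTILINE;
    } else {
        /* Spencer's $ matches only at the very end of the string. */
        pcreFlags |= PCRE_DOLLAR_ENDONLY;
    }
    if (!(tclFlags & TCL_REG_NLSTOP)) {
        /* Spencer's "." matches newline unless NLSTOP is given. */
        pcreFlags |= PCRE_DOTALL;
    }
    return pcreFlags;
}
```

Under this sketch, TCL_REG_NLSTOP combined with TCL_REG_NLANCH comes out as PCRE_MULTILINE alone, while a pattern compiled with no flags comes out as PCRE_DOTALL | PCRE_DOLLAR_ENDONLY.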

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2007-12-31

    updated

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2007-12-31

    New patch that calms some tests and fixes a crash condition in [lsearch -regexp]. Note that any call that uses Tcl_GetRegExpFromObj with a NULL interp (as lsearch -regexp does) can't check the [interp regexp {} pcre] state.

    In this version, you can set the environment variable TCL_REGEXP_PCRE to have PCRE enabled by default in Tcl interps.
    File Added: pcre.diff3.gz
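
The interaction between the per-interp default, the NULL-interp case and the environment variable can be sketched roughly as follows. This illustrates the behaviour described in the comment and is not code from the patch. The types and functions (SketchInterp, SketchInitInterp, SketchPickEngine) are hypothetical stand-ins, and two details are assumptions: that the variable merely has to be set, whatever its value, and that a NULL interp falls back to the classic engine.

```c
#include <stddef.h>
#include <stdlib.h>

typedef enum { ENGINE_CLASSIC, ENGINE_PCRE } SketchEngine;

/* Hypothetical stand-in for per-interpreter state; not Tcl's real Interp. */
typedef struct {
    SketchEngine defaultEngine;
} SketchInterp;

/* At interp creation, seed the default engine from the environment.
 * Assumption: any value of TCL_REGEXP_PCRE, even empty, enables PCRE. */
static void SketchInitInterp(SketchInterp *interp)
{
    interp->defaultEngine =
        (getenv("TCL_REGEXP_PCRE") != NULL) ? ENGINE_PCRE : ENGINE_CLASSIC;
}

/* Pick the engine for a compile request. With a NULL interp there is no
 * per-interp setting to consult, so this sketch falls back to classic. */
static SketchEngine SketchPickEngine(const SketchInterp *interp)
{
    if (interp == NULL) {
        return ENGINE_CLASSIC;
    }
    return interp->defaultEngine;
}
```

A caller that passes a real interp sees whatever [interp regexp {} pcre] or the environment selected. A caller that passes NULL always gets the fallback, which is the limitation noted above for lsearch -regexp.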

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2008-01-02

    less leaky, not so creaky!

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2008-01-02

    Updated version that doesn't leak the study'd pcre info, corrects more tests and is generally better, so just use it.
    File Added: pcre.diff4.gz

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2008-01-22

    Updated to have --enable-pcre=yes|no|default. If default is used, then PCRE will be the default engine. --with-pcre still exists to point to a non-standard location.

    Fixed a -indices issue, and updated the test suite. The remaining test issues mostly represent differences in line anchor styles.
    File Added: pcre-20080121.diff.gz

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2008-01-22