
#312 Function to compile a pattern

closed-invalid
43. Regexp (4)
1
2003-10-29
2003-10-29
No

The Java class
"http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html"
provides a method "compile".
Can a similar function be offered for Tcl?
What do you think about making the
function "Tcl_RegExpCompile"
(http://tcl.tk/man/tcl8.4/TclLib/RegExp.htm) callable
at the script level?

I would like to avoid the overhead of pattern
preprocessing for the
commands "http://tcl.tk/man/tcl8.4/TclCmd/regexp.htm"
and "http://tcl.tk/man/tcl8.4/TclCmd/regsub.htm" in
loops.
Could we gain performance if the additional memory
used by the "compile caching" technique were saved?

Discussion

  • Jeffrey Hobbs

    Jeffrey Hobbs - 2003-10-29

    Logged In: YES
    user_id=72656

    Tcl already does this if you keep the RE in an object (even a
    static object as part of a compiled body of code). IOW, just
    set the RE to a var, and use that var just as an RE.
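
A minimal sketch of the approach described above, assuming a hypothetical pattern and input list chosen only for illustration: the pattern is stored in one variable and that variable is passed to regexp on every iteration.

```tcl
# Hypothetical pattern and input data, used only for illustration.
set wordRE {^[a-z]+$}
set inputs {alpha Beta gamma 123 delta}

set matches {}
foreach item $inputs {
    # $wordRE refers to the same Tcl value on every iteration, so a
    # compiled form cached with that value can be reused.
    if {[regexp $wordRE $item]} {
        lappend matches $item
    }
}
puts $matches
```

Only the all-lowercase items (alpha, gamma, delta) end up in the result list.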

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2003-10-29
    • status: open --> closed-invalid
     
  • Markus Elfring

    Markus Elfring - 2003-10-29
    • assigned_to: nobody --> hobbs
    • status: closed-invalid --> open-later
     
  • Markus Elfring

    Markus Elfring - 2003-10-29

    Logged In: YES
    user_id=572001

    I assumed that "compile caching" is performed already.
    But how much processing does Tcl need for it? This effort
    might not be needed for constant patterns, which should be
    compiled only once before the matching is done in a loop.
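
One rough way to explore this question is Tcl's built-in time command. The sketch below is only an assumption about how such a measurement could be set up: the subject string and the iteration count are arbitrary, and the second variant also pays for the format call. The numbers depend on the machine and the Tcl version, and other caching inside Tcl may hide most of the difference.

```tcl
set RE {the.*complex.*regexp}
set text "the quick and complex brown regexp"

# Variant 1: the pattern always comes from the same variable.
set fromVar [time {regexp $RE $text} 10000]

# Variant 2: the pattern string is rebuilt on every iteration, so each
# call starts from a fresh string value (plus the cost of format).
set rebuilt [time {
    regexp [format %s.*%s.*%s the complex regexp] $text
} 10000]

puts "pattern from variable: $fromVar"
puts "pattern rebuilt:       $rebuilt"
```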

     
  • Donal K. Fellows

    • assigned_to: hobbs --> nobody
    • status: open-later --> closed-invalid
     
  • Donal K. Fellows

    Logged In: YES
    user_id=79902

    Fleshing out with an example:

    set RE {the.*complex.*regexp}
    # Match once against the empty string; this forces compilation
    regexp $RE {}
    # Now use $RE instead of the literal and you'll use the
    # pre-compiled RE.

    In fact, there are a few other places which do caching as
    well. But IMHO the biggest advantage of the above technique
    is that you can also give logical names to your REs (e.g.
    $LinkTagMatcher instead of {<a[^>]*?HREF="([^"]+)"}) and this
    is a big win in maintenance and code-simplicity terms.
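
The naming idea can be sketched as follows. The variable name LinkTagMatcher and its pattern come from the comment above; the sample HTML string and the links variable are hypothetical.

```tcl
# A named pattern, taken from the example in the comment above.
set LinkTagMatcher {<a[^>]*?HREF="([^"]+)"}

# Match once against the empty string so that compilation happens
# before the pattern is used on real data.
regexp $LinkTagMatcher {}

# Hypothetical input, used only for illustration.
set html {<a HREF="http://tcl.tk/">Tcl</a> <a class="x" HREF="http://mini.net/tcl/regexp">wiki</a>}

set links {}
# -all -inline returns a flat list: whole match, then capture group 1,
# repeated for every match.
foreach {whole url} [regexp -all -inline $LinkTagMatcher $html] {
    lappend links $url
}
puts $links
```

The call site reads as "apply the link-tag matcher" and the pattern itself is defined in one place only.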

     
  • Markus Elfring

    Markus Elfring - 2003-10-29
    • assigned_to: nobody --> dkf
     
  • Markus Elfring

    Markus Elfring - 2003-10-29

    Logged In: YES
    user_id=572001

    I'm sorry - I did not know that Tcl variables can be marked
    so that an instance contains a compiled regular expression.
    Would you like to publish this behaviour on the
    pages "http://mini.net/tcl/regexp"
    and "http://mini.net/tcl/2244"?

     
  • Donal K. Fellows

    • priority: 5 --> 1
     
  • Donal K. Fellows

    Logged In: YES
    user_id=79902

    You do it, Markus. You're obviously much keener on doing
    this than I am, especially as virtually everyone's already
    reaping the benefits of precompilation anyway, one way or
    another.

     
  • Markus Elfring

    Markus Elfring - 2003-10-29

    Logged In: YES
    user_id=572001

    I have updated these wikis.

     