From: Edi W. <ed...@ag...> - 2004-05-27 23:38:22
|
On Thu, 27 May 2004 14:29:17 +0200, Bruno Haible <br...@cl...> wrote: > Edi Weitz wrote: >> Last time I checked it was comparable in speed with CLISP's regex >> engine. > > Do you mean the speed when run in a Lisp implementation that > compiles to native code, or when run in clisp? I meant CL-PPCRE when run in CLISP. Of course it's much faster when compiled to native code. > Doing character-at-a-time processing in byte-compiled clisp is quite > slow; that's why we are looking for an implementation written in > C/C++. While this is generally true the situation is not as black-and-white as you seem to imply. First, regular expressions are not only about character-at-a-time processing. There are several cases where CL-PPCRE is actually (a lot) faster than CLISP's regex engine. (See end of mail for some benchmarks.) But apart from that I'd say that it is usually just fast enough[TM] (unless you're throwing certain regular expressions at enormous strings several times). Moreover: 1. It has more features (look-aheads, look-behinds, stand-alone expressions, ...) than the current engine. 2. It has a syntax (the one from Perl) most users will be familiar with. 3. It has an alternative S-expression syntax for regular expressions. These are of course much easier to manipulate programmatically from Lisp. 4. Because it's written in Lisp it'll use the same character encoding that CLISP uses independently of external settings like your locale. 5. If you get an error you get an error Lisp can handle. Disasters like this one can't happen: [1]> (regexp:regexp-compile "(a|(bc)){0,0}?xyz" :extended t) *** - handle_fault error2 ! address = 0x14 not in [0x20248000,0x203d5a30) ! SIGSEGV cannot be cured. Fault address = 0x14. Segmentation fault 6. It's written in Lisp (did I mention that already?) - for marketing reasons it might be not too bad an idea if the regex engine used by a Lisp implementation was also written in Lisp... :) Anyway, you decide... Cheers, Edi. Regarding the speed of CL-PPCRE I did some simple benchmarks based on <http://weitz.de/cl-ppcre/#bench>. The code I wrote can be found at <http://miles.agharta.de/bench.lisp>. This is on a Debian sid system with CLISP (2.33) from Debian. It should be noted that CL-PPCRE has been profiled with and optimized for CMUCL only. (With the help of Duane Rettig from Franz I've done some preliminary work to optimize it for AllegroCL as well but that's not part of the official distribution yet.) I'm pretty sure there are ways to tweak it for CLISP if someone who knows CLISP better than I does it. I'll gladly accept patches. edi@bird:~$ uname -a Linux bird 2.6.6 #1 Wed May 26 11:10:22 CEST 2004 i686 GNU/Linux edi@bird:~$ echo $LC_CTYPE en_US.UTF-8 edi@bird:~$ clisp WARNING: *FOREIGN-ENCODING*: reset to ASCII i i i i i i i ooooo o ooooooo ooooo ooooo I I I I I I I 8 8 8 8 8 o 8 8 I \ `+' / I 8 8 8 8 8 8 \ `-+-' / 8 8 8 ooooo 8oooo `-__|__-' 8 8 8 8 8 | 8 o 8 8 o 8 8 ------+------ ooooo 8oooooo ooo8ooo ooooo 8 Copyright (c) Bruno Haible, Michael Stoll 1992, 1993 Copyright (c) Bruno Haible, Marcus Daniels 1994-1997 Copyright (c) Bruno Haible, Pierpaolo Bernardi, Sam Steingold 1998 Copyright (c) Bruno Haible, Sam Steingold 1999-2000 Copyright (c) Sam Steingold, Bruno Haible 2001-2004 ;; Loading file /home/edi/.clisprc ... ;; Loaded file /home/edi/.clisprc [1]> (lisp-implementation-version) "2.33 (2004-03-17) (built 3289141980) (memory 3294470374)" [2]> (require :cl-ppcre) ;; Loading file /usr/share/common-lisp/systems/cl-ppcre.asd ... ;; Loaded file /usr/share/common-lisp/systems/cl-ppcre.asd ;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/packages.fas ... ;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/packages.fas ;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/specials.fas ... ;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/specials.fas ;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/util.fas ... ;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/util.fas ;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/errors.fas ... ;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/errors.fas ;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/lexer.fas ... ;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/lexer.fas ;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/parser.fas ... ;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/parser.fas ;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/regex-class.fas ... ;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/regex-class.fas ;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/convert.fas ... ;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/convert.fas ;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/optimize.fas ... ;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/optimize.fas ;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/closures.fas ... ;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/closures.fas ;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/repetition-closures.fas ... ;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/repetition-closures.fas ;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/scanner.fas ... ;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/scanner.fas ;; Loading file /usr/lib/common-lisp/clisp/cl-ppcre/api.fas ... ;; Loaded file /usr/lib/common-lisp/clisp/cl-ppcre/api.fas 0 errors, 0 warnings T [3]> (load (compile-file "/tmp/bench.lisp")) Compiling file /tmp/bench.lisp ... Wrote file /tmp/bench.fas 0 errors, 0 warnings ;; Loading file /tmp/bench.fas ... ;; Loaded file /tmp/bench.fas T [4]> (test) PPCRE wins by a factor of 31.2 PPCRE wins by a factor of 257.9 PPCRE wins by a factor of 2534.2 PPCRE wins by a factor of 20842.4 PPCRE wins by a factor of 3.6 PPCRE wins by a factor of 3.8 PPCRE wins by a factor of 3.9 PPCRE wins by a factor of 4.0 PPCRE wins by a factor of 9.5 PPCRE wins by a factor of 84.5 PPCRE wins by a factor of 846.4 PPCRE wins by a factor of 6337.0 CLISP wins by a factor of 1.3 CLISP wins by a factor of 1.3 CLISP wins by a factor of 1.3 CLISP wins by a factor of 1.2 CLISP wins by a factor of 2.1 CLISP wins by a factor of 2.3 CLISP wins by a factor of 2.3 CLISP wins by a factor of 2.0 PPCRE wins by a factor of 3.0 PPCRE wins by a factor of 3.2 PPCRE wins by a factor of 3.3 PPCRE wins by a factor of 3.3 PPCRE wins by a factor of 1.2 PPCRE wins by a factor of 1.2 PPCRE wins by a factor of 1.2 PPCRE wins by a factor of 1.2 CLISP wins by a factor of 2.3 CLISP wins by a factor of 2.6 CLISP wins by a factor of 2.7 CLISP wins by a factor of 2.4 PPCRE wins by a factor of 3.0 PPCRE wins by a factor of 3.1 PPCRE wins by a factor of 3.3 PPCRE wins by a factor of 3.3 PPCRE wins by a factor of 1.3 PPCRE wins by a factor of 1.5 PPCRE wins by a factor of 1.5 PPCRE wins by a factor of 1.5 CLISP wins by a factor of 1.9 CLISP wins by a factor of 1.9 CLISP wins by a factor of 1.8 PPCRE wins by a factor of 48.6 PPCRE wins by a factor of 617.9 PPCRE wins by a factor of 6168.3 CLISP wins by a factor of 14.6 CLISP wins by a factor of 20.1 CLISP wins by a factor of 21.3 PPCRE wins by a factor of 1.2 PPCRE wins by a factor of 1.8 PPCRE wins by a factor of 1.8 CLISP wins by a factor of 14.6 CLISP wins by a factor of 21.3 CLISP wins by a factor of 21.9 PPCRE wins by a factor of 1.3 PPCRE wins by a factor of 1.8 PPCRE wins by a factor of 1.9 PPCRE wins by a factor of 1.6 PPCRE wins by a factor of 2.0 PPCRE wins by a factor of 2.1 NIL [5]> |