From: Michel T. <ta...@lp...> - 2015-10-19 08:31:27
On 18/10/2015 23:10, Robert Dodier wrote:
>> In share/stringproc there is an alternative regex parser (portable regex
>> parser by Dorai Sitaram) with an interface at Maxima level.
>>
>> It works nicely but it appears to be quite slow.
> Thanks for the reminder, I had forgotten about that. I suspect that
> sregex is strictly more powerful than nregex; I'd be surprised if nregex
> were any faster, but that is not a key point. Also it's helpful that
> there is a Maxima interface for sregex. Given all that, I'm in favor of
> moving sregex into src and nuking nregex. There are a few calls to
> nregex in src, but those would be easily replaced by sregex, I believe.
>

Well, I have looked a little at the programs nregex and pregexp, and at
the way nregex is used in Maxima. Obviously pregexp is a complete regex
parser, covering more or less the Perl regexp syntax, while nregex is a
very basic regex parser covering the standard newer (extended) regex
syntax as in grep -E, except that the modern stuff such as [:alnum:] is
not supported. Of course only 256-character alphabets are supported.
There are some extensions as in Emacs regexps, like \w matching word
characters, \b matching word boundaries, etc. (and their opposites \W,
\B, etc.). The repetition patterns {n,m} are not supported; I am not
even sure the alternation patterns patt1|patt2 are.

But thanks to all these shortcomings nregex is very small and can use a
clever trick to speed up character class matching. Basically, when it
encounters [...] it builds a bit string of length 256 with ones at each
matching position (with special cases for \w etc.), inverted in the case
of [^...], and matching a character against this can be very fast. By
contrast, pregexp does straightforward comparisons. Hence one may expect
a considerable speed difference between them. Given the way regexps are
used in Maxima (in cl-info.lisp for finding stuff and in commac.lisp for
stripping trailing zeroes from a string), the speed difference could
cause problems.

-- 
Michel Talon
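
To make the character-class trick described in the message concrete, here is a minimal sketch in Python. It is not nregex's actual code (nregex is written in Lisp); the names compile_class and matches are hypothetical, and the sketch assumes an 8-bit alphabet and handles only literal characters, a-z style ranges and a leading ^, with no \w special cases.

```python
ALPHABET_SIZE = 256  # assumption: 8-bit characters only, as in the message


def compile_class(spec: str) -> int:
    """Turn the inside of [...] into a 256-bit mask stored in a Python int.

    Hypothetical helper: bit n is set when character code n is in the class.
    """
    negate = spec.startswith("^")
    if negate:
        spec = spec[1:]
    mask = 0
    i = 0
    while i < len(spec):
        lo = ord(spec[i])
        # A dash with a character on both sides denotes a range such as a-z.
        if i + 2 < len(spec) and spec[i + 1] == "-":
            hi = ord(spec[i + 2])
            i += 3
        else:
            hi = lo
            i += 1
        if hi >= ALPHABET_SIZE:
            raise ValueError("character outside the 256-character alphabet")
        for code in range(lo, hi + 1):
            mask |= 1 << code
    if negate:
        # [^...] is the same bit string with every bit flipped.
        mask ^= (1 << ALPHABET_SIZE) - 1
    return mask


def matches(mask: int, ch: str) -> bool:
    """Test one character with a shift and a bit test, no per-item comparison."""
    code = ord(ch)
    return code < ALPHABET_SIZE and (mask >> code) & 1 == 1


word_chars = compile_class("a-zA-Z0-9_")
not_digit = compile_class("^0-9")
print(matches(word_chars, "q"), matches(word_chars, "-"))
print(matches(not_digit, "x"), matches(not_digit, "5"))
```

The cost of building the mask is paid once when the pattern is compiled; each later test is a single bit lookup whatever the size of the class, whereas comparing against every item of the class takes time proportional to the number of items.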