From: Volker v. N. <vol...@gm...> - 2015-10-19 11:44:40
Thank you, Michel. This is very helpful information.

Volker van Nek

2015-10-19 10:31 GMT+02:00 Michel Talon <ta...@lp...>:

> On 18/10/2015 23:10, Robert Dodier wrote:
> >> In share/stringproc there is an alternative regex parser (portable regex
> >> parser by Dorai Sitaram) with an interface at Maxima level.
> >>
> >> It works nicely but it appears to be quite slow.
> > Thanks for the reminder, I had forgotten about that. I suspect that
> > sregex is strictly more powerful than nregex; I'd be surprised if nregex
> > were any faster, but that is not a key point. Also it's helpful that
> > there is a Maxima interface for sregex. Given all that, I'm in favor of
> > moving sregex into src and nuking nregex. There are a few calls to
> > nregex in src, but those would be easily replaced by sregex, I believe.
> >
>
> Well, I have looked a little at the programs nregex and pregexp, and at
> the way nregex is used in Maxima. Obviously pregexp is a complete regex
> parser, covering more or less the Perl regexp syntax, while nregex is
> a very basic regex parser covering the standard new regex syntax as in
> grep -E, except that modern stuff such as [:alnum:] is not supported. Of
> course, only 256-character alphabets are supported. There are some
> extensions as in Emacs regexps, like \w matching words, \b matching
> boundaries, etc. (and their opposites \W, \B, etc.). The repetition
> patterns {n,m} are not supported; I am not even sure the alternative
> patterns patt1|patt2 are. But thanks to all these shortcomings nregex
> is very small and can use a clever trick to speed up character-class
> matching. Basically, when encountering [...] it builds a bitstring of
> length 256 with ones at each matching position (with special cases for
> \w etc.), inverted for [^...], and this can be matched against very
> quickly. By contrast, pregexp does straightforward comparisons. Hence
> one may expect a considerable speed difference between them.
> Given the way regexps are used in Maxima (in cl-info.lisp for finding
> stuff and in commac.lisp for stripping trailing zeroes from a string),
> the speed difference could cause problems.
>
> --
> Michel Talon
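
To make the bitstring trick described for nregex concrete, here is a minimal sketch in Python. It is not the nregex code (which is Lisp); the function names build_class_mask and in_class are hypothetical, and the handling of ranges and of a leading ^ is a simplifying assumption. Escapes such as \w are left out. It only shows the idea: precompute a 256-bit mask for a [...] class once, then test each character with a single shift and mask.

```python
def build_class_mask(spec: str) -> int:
    """Build a 256-bit mask for the body of a bracket expression.

    `spec` is the text between the brackets, e.g. "a-c0-9" or "^0-9".
    Bit n of the result is 1 when the character with code n matches.
    (Hypothetical helper; escapes such as \\w are not handled here.)
    """
    negate = spec.startswith("^")
    if negate:
        spec = spec[1:]

    mask = 0
    i = 0
    while i < len(spec):
        lo = ord(spec[i])
        # "x-y" is a range only when '-' is followed by an end character;
        # otherwise the current character is taken literally.
        if i + 2 < len(spec) and spec[i + 1] == "-":
            hi = ord(spec[i + 2])
            i += 3
        else:
            hi = lo
            i += 1
        if hi > 255 or lo > hi:
            raise ValueError("only ascending ranges within codes 0..255 are supported")
        for code in range(lo, hi + 1):
            mask |= 1 << code

    if negate:
        # [^...] : flip all 256 bits.
        mask ^= (1 << 256) - 1
    return mask


def in_class(mask: int, ch: str) -> bool:
    """Test one character against a precomputed mask: one shift, one AND."""
    code = ord(ch)
    return code < 256 and bool((mask >> code) & 1)


digits = build_class_mask("0-9")
not_digits = build_class_mask("^0-9")
print(in_class(digits, "5"), in_class(not_digits, "5"))
```

The cost of building the mask is paid once per pattern; each later membership test is independent of how many ranges the class contains, which is the source of the speed-up the message attributes to nregex.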