From: Leo B. <l_b...@us...> - 2015-10-19 14:57:08
One further observation: sregex is written in Scheme and uses recursion
extensively. When I looked at sregex a few years ago as a possible
replacement for nregex, I recall that the Lisp version ran very slowly
and hit numerous stack overflows when run against Maxima's info files.

Leo

Volker van Nek <vol...@gm...> writes:

> Thank you, Michel.
>
> This is very helpful information.
>
> Volker van Nek
>
> 2015-10-19 10:31 GMT+02:00 Michel Talon <ta...@lp...>:
>
>> On 18/10/2015 23:10, Robert Dodier wrote:
>> >> In share/stringproc there is an alternative regex parser (portable
>> >> regex parser by Dorai Sitaram) with an interface at Maxima level.
>> >>
>> >> It works nicely but it appears to be quite slow.
>> >
>> > Thanks for the reminder, I had forgotten about that. I suspect that
>> > sregex is strictly more powerful than nregex; I'd be surprised if
>> > nregex were any faster, but that is not a key point. Also it's
>> > helpful that there is a Maxima interface for sregex. Given all that,
>> > I'm in favor of moving sregex into src and nuking nregex. There are
>> > a few calls to nregex in src, but those would be easily replaced by
>> > sregex, I believe.
>>
>> Well, I have looked a little at the programs nregex and pregexp, and
>> at the way nregex is used in Maxima. Obviously pregexp is a complete
>> regex parser, covering more or less the Perl regexp syntax, while
>> nregex is a very basic regex parser covering the newer standard regex
>> syntax, as in grep -E, except that modern features such as [:alnum:]
>> are not supported. Of course, only 256-character alphabets are
>> supported. There are some extensions as in Emacs regexps, like \w
>> matching words, \b matching boundaries, etc. (and their opposites \W,
>> \B, etc.). The repetition patterns {n,m} are not supported, and I am
>> not even sure that the alternation patterns patt1|patt2 are.
>>
>> But thanks to all these shortcomings, nregex is very small and can use
>> a clever trick to speed up character-class matching. Basically, when
>> it encounters [...], it builds a bitstring of length 256 with ones at
>> each matching position (with special cases for \w etc.), inverted if
>> one has [^...], and matching a character against this can be very
>> fast. In contrast, pregexp does straightforward comparisons. Hence one
>> may expect a considerable speed difference between them. Given the way
>> regexps are used in Maxima (in cl-info.lisp for finding items and in
>> commac.lisp for stripping trailing zeroes from a string), the speed
>> difference could cause problems.
>>
>> --
>> Michel Talon
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> Maxima-discuss mailing list
>> Max...@li...
>> https://lists.sourceforge.net/lists/listinfo/maxima-discuss
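
For readers who want to see the character-class trick Michel describes, here is a minimal sketch in Python. It is not nregex's actual code (nregex is Lisp), and the names compile_class and matches are made up for illustration. It assumes a plain bracket body containing only literal characters, ranges like a-z, and a leading ^ for negation; escapes such as \w are not handled. The 256-bit table is held in a single Python integer.

```python
# Illustrative sketch only: hypothetical helper names, not nregex's API.
ALL_BYTES = (1 << 256) - 1  # a 256-bit table with every bit set


def compile_class(spec: str) -> int:
    """Compile the inside of a bracket expression, e.g. "a-z0-9_" or "^0-9",
    into a 256-bit integer whose bit i is 1 when character code i matches."""
    negate = spec.startswith("^")
    if negate:
        spec = spec[1:]
    bits = 0
    i = 0
    while i < len(spec):
        lo = ord(spec[i])
        # A range such as "a-z" needs a character on both sides of the hyphen.
        if i + 2 < len(spec) and spec[i + 1] == "-":
            hi = ord(spec[i + 2])
            i += 3
        else:
            hi = lo
            i += 1
        if lo > 255 or hi > 255:
            raise ValueError("only a 256-character alphabet is supported")
        for code in range(lo, hi + 1):
            bits |= 1 << code
    if negate:
        bits ^= ALL_BYTES  # [^...] inverts every position in the table
    return bits


def matches(bits: int, ch: str) -> bool:
    """Test one character against a compiled class with a shift and a mask."""
    code = ord(ch)
    return code < 256 and (bits >> code) & 1 == 1


digits = compile_class("0-9")
not_digits = compile_class("^0-9")
```

The table is built once, when the pattern is compiled. After that, each match test is one shift and one mask regardless of how many ranges the class contains, whereas a matcher that compares the character against each range in turn does work proportional to the size of the class.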