Re: [Maxima-discuss] regex packages, was: Reading data from header

On 19.10.2015 at 16:56, Leo Butler wrote:
> 
> One further observation: sregex is written in Scheme and uses recursion
> extensively. When I looked at sregex a few years ago as a possible
> replacement for nregex, I recall that the Lisp version ran very slowly
> and had numerous stack overflows when run against Maxima's info files.

The regex parser share/stringproc/pregexp.lisp by Dorai Sitaram has been
fully revised and is now completely written in Lisp.

https://github.com/ds26gte/pregexp/

Next weekend I plan to replace the current pregexp.lisp and to write
documentation for the interface functions in sregex.lisp.
Hopefully the new version will perform better.

Volker

> 
> Leo
> 
> Volker van Nek <vol...@gm...> writes:
> 
>> Thank you, Michel.
>>
>> This is very helpful information.
>>
>> Volker van Nek
>>
>> 2015-10-19 10:31 GMT+02:00 Michel Talon <ta...@lp...>:
>>
>>> On 18/10/2015 23:10, Robert Dodier wrote:
>>>>>> In share/stringproc there is an alternative regex parser (portable regex
>>>>>> parser by Dorai Sitaram) with an interface at Maxima level.
>>>>>>
>>>>>> It works nicely but it appears to be quite slow.
>>>> Thanks for the reminder, I had forgotten about that. I suspect that
>>>> sregex is strictly more powerful than nregex; I'd be surprised if nregex
>>>> were any faster, but that is not a key point. Also it's helpful that
>>>> there is a Maxima interface for sregex. Given all that, I'm in favor of
>>>> moving sregex into src and nuking nregex. There are a few calls to
>>>> nregex in src, but those would be easily replaced by sregex, I believe.
>>>>
>>>
>>> Well, I have looked a little at the programs nregex and pregexp, and
>>> at the way nregex is used in Maxima. Clearly pregexp is a complete
>>> regex parser, covering more or less the Perl regexp syntax, while
>>> nregex is a very basic regex parser covering the standard extended
>>> regex syntax, as in grep -E, except that POSIX classes such as
>>> [:alnum:] are not supported. Of course only 256-character alphabets
>>> are supported. There are some extensions as in Emacs regexps, like \w
>>> matching word characters, \b matching word boundaries, etc. (and their
>>> opposites \W, \B, etc.). The repetition patterns {n,m} are not
>>> supported, and I am not even sure that alternation patt1|patt2 is.
>>> But thanks to all these shortcomings nregex is very small and can use
>>> a clever trick to speed up character-class matching. Basically, when
>>> it encounters [...] it builds a bit string of length 256 with ones at
>>> each matching position (with special cases for \w etc.), inverted for
>>> [^...], and testing a character against it can be very fast. By
>>> contrast, pregexp does straightforward comparisons. Hence one may
>>> expect a considerable speed difference between them. Given the way
>>> regexps are used in Maxima (in cl-info.lisp for finding stuff and in
>>> commac.lisp for stripping trailing zeroes from a string), the speed
>>> difference could cause problems.
>>>
>>>
>>> --
>>> Michel Talon
>>>
>>>
>>>
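
The character-class trick Michel describes can be sketched roughly as
follows. This is a minimal illustration in Python, not nregex's actual
Lisp code: the names compile_class and class_matches are hypothetical,
the special cases for \w and friends are left out, and a Python integer
stands in for the 256-element bit string.

```python
def compile_class(body: str) -> int:
    """Build a 256-bit mask from the inside of a bracket expression.

    `body` is the text between the brackets, e.g. "a-c0-9" or "^a-c".
    Bit n of the result is 1 when the character with code n matches.
    (Hypothetical helper; escapes such as \\w are not handled.)
    """
    negate = body.startswith("^")
    if negate:
        body = body[1:]
    mask = 0
    i = 0
    while i < len(body):
        lo = ord(body[i])
        # "x-y" is a range; a "-" at the very end is taken literally.
        if i + 2 < len(body) and body[i + 1] == "-":
            hi = ord(body[i + 2])
            i += 3
        else:
            hi = lo
            i += 1
        for code in range(lo, min(hi, 255) + 1):
            mask |= 1 << code
    if negate:
        mask ^= (1 << 256) - 1  # flip all 256 bits for [^...]
    return mask


def class_matches(mask: int, ch: str) -> bool:
    """Test one character with a single shift-and-mask operation."""
    code = ord(ch)
    return code < 256 and bool((mask >> code) & 1)


digits_or_abc = compile_class("a-c0-9")
not_abc = compile_class("^a-c")
```

The mask is built once when the pattern is compiled, so each later test
costs one index computation and one bit lookup regardless of how many
characters or ranges the class lists, whereas comparing against each
listed item in turn grows with the size of the class.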
>>> ------------------------------------------------------------------------------
>>> _______________________________________________
>>> Maxima-discuss mailing list
>>> Max...@li...
>>> https://lists.sourceforge.net/lists/listinfo/maxima-discuss
>>>
