From: Volker v. N. <vol...@gm...> - 2015-10-27 14:38:58

I can't do better. Compiling pregexp with sbcl I get a factor of about 10,
i.e. nregex is 10 times faster in this situation.

It turns out that there is a significant speed and space advantage for nregex.

On the other hand, as Michel Talon has pointed out earlier in this thread,
nregex is very limited. '|' is treated as an ordinary character and in case
of e.g. "w{3}" nregex-compile loops endlessly.

I would like to propose the following.

We leave nregex in src and pregexp in share/stringproc and I will modify the
Maxima level functions in share/stringproc in a way that the user can decide
which scanner he or she wants to use.

E.g. regex_match(regex, string) with pregexp as the default,
regex_match(regex, string, 'pregexp) and regex_match(regex, string, 'nregex).

pregexp also allows setting start and end positions. These might be passed in
as optional arguments.

The documentation then should emphasize the advantages and limitations of
nregex.

Volker

2015-10-26 21:46 GMT+01:00 Leo Butler <l_b...@us...>:
>
> Volker, I doubt you will get such good results with gcl. Nregex is 31
> times faster:
>
> pregex + gcl 2.6.12
>
> (%i5) unicode_add(["GREEK","LATIN","APL","NON-SPACING"]);
> Evaluation took 32.3900 seconds (34.2200 elapsed)
> (%o5) [[GREEK, 553], [LATIN, 1478], [APL, 73], [NON-SPACING, 0]]
>
> nregex + gcl 2.6.12
>
> (%i4) unicode_add(["GREEK","LATIN","APL","NON-SPACING"]);
> Evaluation took 1.0400 seconds (1.1200 elapsed)
> (%o4) [[GREEK, 553], [LATIN, 1478], [APL, 73], [NON-SPACING, 0]]
>
> Leo
>
> Volker van Nek <vol...@gm...> writes:
>
> > I have coded and tested pregexp-replacements for functions using nregex
> > in src/commac.lisp and src/cl-info.lisp.
> >
> > This time I used sbcl.
> > (compile_file doesn't work with gcl 2.6.12 in my gcl-Maxima build)
> >
> > I compiled pregexp.lisp and the replacements by compile_file.
> > I.e. I had simply overwritten some functions in an already built Maxima.
> > (For replacing nregex.lisp by pregexp.lisp in maxima.system I would need
> > assistance.)
> >
> >
> > cl-info:
> > regex-sanitize, find-regex-matches
> >
> > As far as I can see, find-regex-matches is used by
> > ??, ? and describe. I checked describe:
> >
> > thru 100 do describe(cint);
> > 7.7080 seconds using 627.061 MB. <- with nregex
> > 6.8160 seconds using 2378.905 MB. <- with pregexp
> >
> > thru 100 do describe(subvarp);
> > 7.6560 seconds using 631.239 MB. <- with nregex
> > 7.9520 seconds using 2382.587 MB. <- with pregexp
> >
> > thru 100 do describe(foo);
> > 5.4640 seconds using 439.326 MB. <- with nregex
> > 5.0360 seconds using 1591.035 MB. <- with pregexp
> >
> > -> same timing results but 4 times more space for pregexp
> >
> >
> > commac:
> > strip-float-zeros, $parse_timedate
> >
> > I don't expect $parse_timedate to be crucial,
> > I just checked strip-float-zeros:
> >
> > The test I performed was essentially printing floats from 1.0 to n
> > redirected into a string_output_stream.
> > This test calls two regex matching functions:
> > trailing-zeros-regex-f-0 (no match) and
> > trailing-zeros-regex-f-1 (match)
> >
> > test(n) :=
> >   block([old_io, redirection],
> >     old_io : ?\*standard\-output\*,
> >     redirection : make_string_output_stream(),
> >     ?\*standard\-output\* : redirection,
> >     for i:1.0 thru n do print(i),
> >     ?\*standard\-output\* : old_io,
> >     close(redirection) )$
> >
> > test(100000)$
> > 5.3080 seconds using 514.245 MB. <- with nregex
> > test(100000)$
> > 9.4680 seconds using 1810.400 MB. <- with pregexp
> >
> > -> double time and again 4 times more space for pregexp
> >
> >
> > share/contrib/unicodedata/unicodedata.lisp
> > uses nregex but I did not look very closely here.
> > At first sight a replacement seems to be possible.
> > Maybe Leo Butler wants to reorganize his code by himself if we want to
> > proceed in that direction.
> >
> > The question is: Do we want to proceed in that direction?
> >
> > Volker
> >
> >
> > On 19.10.2015 at 18:54, Robert Dodier wrote:
> >> On 2015-10-19, Volker van Nek <vol...@gm...> wrote:
> >>
> >>> Thereafter I will see how I can replace calls to nregex in src. Maybe I
> >>> need help at this point to identify and understand all calls to nregex.
> >>
> >> There are only a few -- grepping the source code, I see they are:
> >>
> >> * src/commac.lisp: strip trailing zeros from floats; parse time/date
> >> * src/cl-info.lisp: search for help topic
> >> * share/contrib/unicodedata/unicodedata.lisp: not sure
> >>
> >> If you can take a look at this stuff, that will be terrific. Thanks a
> >> lot.
> >>
> >> best,
> >>
> >> Robert Dodier
> >>
> >> ------------------------------------------------------------------------------
> >> _______________________________________________
> >> Maxima-discuss mailing list
> >> Max...@li...
> >> https://lists.sourceforge.net/lists/listinfo/maxima-discuss
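
The proposal at the top of this message (one matching function, the scanner chosen by an optional argument, pregexp as the default, optional start and end positions) can be sketched roughly as follows. This is only an illustration of the dispatch idea, written in Python rather than Lisp or Maxima so that it is easy to run. The names scan_full, scan_limited and SCANNERS are hypothetical, both scanners are stand-ins built on Python's re module, and neither reproduces the real pregexp or nregex. Rejecting '|' and '{' in the limited scanner is an assumption of this sketch, chosen so that the limitations reported above surface as an error instead of a wrong match or an endless loop.

```python
import re
from typing import Callable, Dict, Optional

# A scanner takes (pattern, text, start, end) and returns the matched text or None.
Scanner = Callable[[str, str, int, Optional[int]], Optional[str]]


def scan_full(pattern: str, text: str, start: int, end: Optional[int]) -> Optional[str]:
    """Stand-in for the full-featured scanner (the role pregexp plays in the proposal)."""
    stop = len(text) if end is None else end
    found = re.compile(pattern).search(text, start, stop)
    return found.group(0) if found else None


def scan_limited(pattern: str, text: str, start: int, end: Optional[int]) -> Optional[str]:
    """Stand-in for the fast but limited scanner (the role nregex plays).

    Assumption of this sketch: patterns using '|' or '{n}' are rejected up front.
    """
    if "|" in pattern or "{" in pattern:
        raise ValueError(f"construct not supported by the limited scanner: {pattern!r}")
    found = re.search(pattern, text[start:end])
    return found.group(0) if found else None


# Hypothetical registry mapping the user-visible scanner name to its implementation.
SCANNERS: Dict[str, Scanner] = {"pregexp": scan_full, "nregex": scan_limited}


def regex_match(pattern: str, text: str, engine: str = "pregexp",
                start: int = 0, end: Optional[int] = None) -> Optional[str]:
    """Match with the selected scanner; the full-featured one is the default."""
    if engine not in SCANNERS:
        raise ValueError(f"unknown scanner: {engine!r}")
    return SCANNERS[engine](pattern, text, start, end)
```

With this shape, regex_match("a|b", "xxb") uses the default scanner, while regex_match("b+", "abbbc", engine="nregex") opts into the limited one. Keeping the scanners behind one table means the documentation can describe the trade-off in a single place.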