From: Volker v. N. <vol...@gm...> - 2015-10-18 17:42:18
In share/stringproc there is an alternative regex parser (the portable regex parser by Dorai Sitaram) with an interface at Maxima level. It works nicely, but it appears to be quite slow. The header in sregex.lisp contains the Maxima functions and simple examples. Meanwhile, the manual by Dorai Sitaram has moved to http://ds26gte.github.io/pregexp/index.html

In the following I use Robert's regex. The list returned by regex_match contains the wanted capturing groups.

(%i1) load(sregex);
(%o1) /usr/local/share/maxima/5.37post/share/stringproc/sregex.lisp
(%i2) stringdisp: true$
(%i3) input: openr("data.txt");
(%o3) #<input stream data.txt>
(%i4) line: readline(input);
(%o4) "/* RGBident=000000 */"
(%i5) regex: "/\\* *([^ ]+) *[:=] *([^ ]+) *\\*/"$
(%i6) match: regex_match(regex, line);
(%o6) ["/* RGBident=000000 */", "RGBident", "000000"]
(%i7) regex_match(regex, "other stuff");
(%o7) false
(%i8) close(input);
(%o8) true

When parsing RGBident with ibase:16, the parser wants hex numbers that start with a letter to be zero-prefixed, e.g. for "/* RGBident=ff9933 */".

(%i9) match[3]: "ff9933";
(%o9) "ff9933"
(%i10) ibase:16.$
(%i11) eval_string(sconcat(match[2],":","0",match[3]));
(%o11) 16750899
(%i12) RGBident;
(%o12) 16750899
(%i13) obase:16.$
(%i0E) RGBident;
(%o0E) 0FF9933

Volker van Nek

On 18.10.2015 at 06:28, Robert Dodier wrote:
> On Fri, Oct 16, 2015 at 10:02 AM, André Fettouhi <a.f...@gm...> wrote:
>
>> Thanks for the suggestion. I'll take a crack at it. I would like to hear
>> about the regular expression package you mentioned.
>
> Andre, I've taken the liberty of cc'ing the mailing list on this reply.
> The regex functions are in the package :maxima-nregex, and
> the source code is src/nregex.lisp. It can handle character
> classes, starting and ending anchors, the quantifiers * + and ?,
> and capturing groups; there might or might not be other features.
> This is obviously much more limited than, say, Perl or Python,
> but it is enough for the task at hand.
>
> Create a matching function like this:
>
> :lisp (setq foo (coerce (maxima-nregex::regex-compile "...") 'function))
>
> where "..." is the regex pattern -- something like
> "/\\* *([^ ]+) *[:=] *([^ ]+) *\\*/"
> in this case.
>
> Call FOO to look for a match:
>
> :lisp (funcall foo "/* something=whatever */")
>
> which returns T for a hit or NIL for a miss. If T, then look at
>
> :lisp maxima-nregex::*regex-groupings*
>
> to see how many groups were captured and
>
> :lisp maxima-nregex::*regex-groups*
>
> to see the groups. Group 0 is the whole thing. In each group
> e.g. (m n), the first character in the group is the m'th character
> and the first character beyond the end is the n'th character,
> so there are n - m characters in the group.
>
> Here's an initial attempt to parse header lines. Note that
> I omitted the code to actually parse strings into numbers --
> you can call parse_string(s) for floats and decimal integers
> but you'll want block([ibase:16], parse_string(s)) for hex.
> Also, note that the regex cannot capture a right-hand side
> which contains a space -- you'll have to adjust the regex
> if you want to capture spaces. Also I assume you can get
> the list of header lines via readline.
>
> (%i1) display2d : false $
> (%i2) load ("parse_header.lisp");
> (%o2) "parse_header.lisp"
> (%i3) l : ["/* foo=123 */", "/* baz=blurf */", "/* quux: sdsdkkds.dfk */"] $
> (%i4) parse_header (l);
> (%o4) ["foo" = "123","baz" = "blurf","quux" = "sdsdkkds.dfk"]
>
> Hope this helps,
>
> Robert Dodier
>
> PS.
> ;; copyright 2015 by Robert Dodier
> ;; I release this work under terms of the GNU General Public License
>
> (defvar foo)
> (setq foo (coerce
>            (maxima-nregex::regex-compile
>             "/\\* *([^ ]+) *[:=] *([^ ]+) *\\*/")
>            'function))
>
> ;; parse each element of LINES to a Maxima expression "foo" = "bar";
> ;; assume LINES is a Maxima list, and return a Maxima list.
>
> (defun $parse_header (lines)
>   (cons '(mlist)
>         (mapcar #'(lambda (l)
>                     (if (funcall foo l)
>                         (list '(mequal)
>                               (group-substring l 1)
>                               (group-substring l 2))))
>                 (rest lines))))
>
> (defun group-substring (l i)
>   (let ((group (aref maxima-nregex::*regex-groups* i)))
>     (subseq l (first group) (second group))))
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Maxima-discuss mailing list
> Max...@li...
> https://lists.sourceforge.net/lists/listinfo/maxima-discuss
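
A note on the group offsets described in the quoted message: each group is a pair (m n) that can be passed straight to SUBSEQ, which is what group-substring does. The following is only a minimal sketch in plain Common Lisp. The offsets are written out by hand for the sample line "/* foo=123 */" and are not produced by running maxima-nregex, and the names *sample-line*, *sample-groups* and sample-group-substring are hypothetical, chosen for illustration.

```lisp
;; Hand-written offsets for illustration only; they are NOT the output
;; of maxima-nregex, just what the (m n) convention implies for this line.
(defvar *sample-line* "/* foo=123 */")

;; Group 0 is the whole match, groups 1 and 2 are the two captures.
(defvar *sample-groups*
  (vector '(0 13) '(3 6) '(7 10)))

;; Same shape as group-substring in the quoted code, but reading from
;; the hypothetical *sample-groups* vector instead of *regex-groups*.
(defun sample-group-substring (line i)
  (let ((group (aref *sample-groups* i)))
    (subseq line (first group) (second group))))

(sample-group-substring *sample-line* 1)  ; => "foo"
(sample-group-substring *sample-line* 2)  ; => "123"
```

Each group holds n - m characters, so group 1 above has 6 - 3 = 3 characters ("foo"), and group 0 spans all 13 characters of the line.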