Re: [Pyparsing] Parsing/lexing Tcl style {} strings
From: Ian B. <ia...@co...> - 2007-10-23 19:01:38
Paul McGuire wrote:
> First off, I would try something simple, using the nestedExpr helper method
> that was just released with version 1.4.8.  Here is an extended version of
> your example, showing how closing braces are handled within quoted strings
> and comments:
>
> tclCode = """\
> proc keep_trying {
>     # here is a closing brace } in a comment
>     do_forever {
>         stuff
>         "here is a closing brace } in a quoted string"
>     }
>     more stuff
> }"""
>
> from pyparsing import nestedExpr, Literal, Word, alphas, alphanums, \
>     pythonStyleComment
>
> # define mini BNF for a Tcl proc
> tclBlock = nestedExpr("{","}")
> tclProc = Literal("proc") + \
>           Word(alphas,alphanums+"_")("name") + \
>           tclBlock("body")
>
> # ignore contents of comments
> tclStyleComment = pythonStyleComment
> tclProc.ignore( tclStyleComment )
>
> # parse the input Tcl code
> parsedTcl = tclProc.parseString(tclCode)
> print parsedTcl.dump()
>
> --------
> prints:
>
> ['proc', 'keep_trying', ['do_forever', ['stuff', '"here is a closing brace }
> in a quoted string"'], 'more', 'stuff']]
> - body: ['do_forever', ['stuff', '"here is a closing brace } in a quoted
> string"'], 'more', 'stuff']
> - name: keep_trying
>
>
> This actually does more than you asked for, is that necessarily a problem?

I'm not sure.  If {} means a block of code, this isn't really a problem.
If {} is just a form of string quoting, then it is a bit of a problem,
since the nested result will be hard to use in a string context.  E.g.,
in Tcl it's not uncommon to use {some string} just like you might use
"some string".

Potentially, using the position of the parser, it would be possible to
parse the string, then reconstruct the text that was parsed and attach
the parse tree to that string (or probably to a str subclass).  Handling
things like "}" or #} would be very difficult without actually parsing
the string; I'm not actually sure how Tcl does that.

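To make the str-subclass idea a bit more concrete, here is a rough
sketch in plain Python rather than pyparsing.  The names BraceString and
parse_braced are made up for illustration.  It only counts braces and
skips backslash-escaped characters; quotes and comments get no special
treatment, so it sidesteps the hard cases mentioned above.

```python
class BraceString(str):
    """A str holding the raw text between braces, plus the nested pieces."""

    def __new__(cls, text, tree):
        obj = str.__new__(cls, text)
        obj.tree = tree  # list of plain str chunks and nested BraceStrings
        return obj


def parse_braced(source, start=0):
    """Parse one {...} group that begins at source[start].

    Returns (value, end): value is a BraceString of the text between the
    outer braces, end is the index just past the closing brace.
    Hypothetical helper: braces inside quotes or comments still count.
    """
    if source[start] != "{":
        raise ValueError("expected '{' at position %d" % start)
    tree = []
    pos = start + 1
    chunk_start = pos  # where the current run of plain text began
    while pos < len(source):
        ch = source[pos]
        if ch == "\\":
            pos += 2  # skip the backslash and the character it escapes
        elif ch == "{":
            if pos > chunk_start:
                tree.append(source[chunk_start:pos])
            inner, pos = parse_braced(source, pos)
            tree.append(inner)
            chunk_start = pos
        elif ch == "}":
            if pos > chunk_start:
                tree.append(source[chunk_start:pos])
            # slice the original text by position to rebuild the string
            return BraceString(source[start + 1:pos], tree), pos + 1
        else:
            pos += 1
    raise ValueError("unbalanced braces starting at position %d" % start)
```

With something like this, parse_braced("{some string}")[0] can be passed
around as an ordinary string, while a command that wants a block can
look at its .tree attribute instead.
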
"proc" isn't something Tcl detects as a special verb, and in my use case
(Twill) I would want the same behavior, as it is easy and desirable to
add new commands that work on blocks.  But I think that's simple enough
-- it just requires putting tclBlock alongside the other argument
options (which are just a quoted string or a plain string).

If I'm not careful about constraining my ambition here I'm just going to
end up reimplementing Tcl directly in PyParsing ;)  It sounds fun, which
is why it is dangerous.

-- 
Ian Bicking : ia...@co... : http://blog.ianbicking.org