Re: [Pyparsing] ParseResults as AST?
Brought to you by:
ptmcg
From: Paul M. <pt...@au...> - 2011-09-29 06:10:59
|
Aubrey - If all you want is an AST, then you should be able to construct this using Group's and parseActions. For instance, let's say you want a parser for an assignment statement, where the lvalue can be a variable and the rvalue can be simple expression, with terms that can be integers, variables, or function calls. """ Simple BNF: assignment :: lvalue '=' rvalue lvalue :: variableRef rvalue :: term [ op term ] term :: fnCall | variable | integer fnCall :: fnName '(' [rvalue[,rvalue]*] ') variable :: [a-zA-Z_][a-zA-Z0-9_]* integer :: \d+ op :: '+' | '-' """ from pyparsing import * LPAR,RPAR,EQ = map(Suppress, "()=") op = oneOf("+ -") integer = Word(nums) identifier = Word(alphas+'_', alphanums+'_') variableRef = identifier.copy() fnName = identifier.copy() rvalue = Forward() arglist = delimitedList(rvalue) fnCall = fnName + Group(LPAR + Optional(arglist, default=[]) + RPAR) rvalue << (fnCall | variableRef | integer) assignment = variableRef + EQ + Group(rvalue) print assignment.parseString("y = sin(60)-180").asList() Will print: ['y', ['sin', ['60']]] Knowing that this was an assignment statement, you could then use assignment unpacking to pull out the left and right hand sides of the assignment: lhs, rhs = assignment.parseString("...").asList() With some simple parse actions, you can decorate these structures similar to what you show in your AST example: assignment.setParseAction(lambda t : ['assign'] + t.asList()) fnCall.setParseAction(lambda t : ['function-call'] + t.asList()) print assignment.parseString("y = sin(60)-180").asList() will print: ['assign', 'y', ['function-call', 'sin', ['60']]] But now I have to ask, what do you plan to do with this AST? I suspect that the next step is to walk the structure, and perform actions depending on the decorating labels. Instead, I would suggest you use the parse actions to construct representational objects. At parse time, you already know what the substructures are going to be ("a function call will have a name and a potentially empty argument list" for instance), why not use a parse action to construct an object that gives a representation for this? class ASTNode(object): def __init__(self, tokens): self.tokens = tokens self.assignFields() def __str__(self): return self.__class__.__name__ + ':' + str(self.__dict__) __repr__ = __str__ class Assignment(ASTNode): def assignFields(self): self.lhs, self.rhs = self.tokens del self.tokens class FunctionCall(ASTNode): def assignFields(self): self.fnName, self.args = self.tokens del self.tokens fnCall.setParseAction(FunctionCall) assignment = (variableRef + EQ + rvalue).setParseAction(Assignment) print assignment.parseString("y = sin(60)-180")[0] Will print: Assignment:{'rhs': FunctionCall:{'fnName': 'sin', 'args': (['60'], {})}, 'lhs': 'y'} Now we have bypassed the intermediate AST step, and gone straight to an object representation. These classes can be extended to include an eval or exec method, which could be directly called, instead of retracing steps already taken during parse time - why test args with isdigit, for instance? At parse time, your grammar already defines when an integer is being encountered - just add a parse action that does the string-to-integer conversion. I think the SimpleBool.py example on the pyparsing wiki gives a more complete treatment of this concept. In the article I wrote for Python magazine, I showed how you could even pickle these representational objects, and then unpickle and execute them in varying VM's (GUI, console, server) without reparsing the original code. If you just want an AST, use Groups to add structure, and parse actions to decorate the groups with your AST structure labels. I think though that the representational objects are more powerful. -- Paul -----Original Message----- From: Aubrey Barnard [mailto:ba...@cs...] Sent: Wednesday, September 28, 2011 7:35 PM To: pyp...@li... Subject: [Pyparsing] ParseResults as AST? Pyparsing users, I am having trouble figuring out how to make an abstract syntax tree (AST) from a ParseResults object. I do not understand the data model that is the ParseResults object as it relates to holding the information of an AST. (I have spent a whole day on this so far, so it is time to ask for help.) I am writing a Prolog parser, so I really do need an AST as opposed to some flattened representation like ParseResults. Order matters. Type matters. Depth matters. Sample Input ------------ This is only Datalog with atoms, integers, and variables. ''' % Facts r1(a). r1(b). r1(c). r2(a, 1). r2(a, 2). r2(d, 3). r2(e, 4). % Clauses r3(A, B) :- r1(A), r2(A, B). ''' Desired Output -------------- I would like to get something back from parsing like the following: ('datalog', [ ('comment', '% Facts'), ('fact', [('name', 'r1'), ('body', [('atom', 'a')])]), ('fact', <result>), ... ('clause', [('head', <result>), ('body', [<result>...])]), ] ) where <result> stands for the parse result from deeper recursion. The structure is just lists of pairs. Each pair is the name (type) of the parsed object and its content, the result. Parsing ------- I believe I have my grammar implemented correctly in pyparsing as I get the expected AST back as XML by calling ParseResults.asXML(). (Except that I have an extra top-level element, "<datalog><datalog>...</datalog></datalog>", but that is easy to workaround.) Tried ----- I have tried to understand the ParseResults object. Clearly the AST information exists as ParseResults.asXML() gives me an AST in XML. However, some of the information used in asXML() is not part of the public API. I have tried to use parse actions to construct my own AST but I couldn't figure this out either (in part because the product of parsing is still a ParseResults object). Options ------- * Use asXML() and reparse the XML. Not performant. Too much code. Misses the point. * Write asAST() into pyparsing.py as a near clone of asXML(). This is my current favorite option. (I'm on a deadline.) * Use parse actions effectively. * Use a different parsing library (I have also tried pyPEG.) * Other? Any suggestions and/or help would be very much appreciated. Sincerely, Aubrey ---------------------------------------------------------------------------- -- All the data continuously generated in your IT infrastructure contains a definitive record of customers, application performance, security threats, fraudulent activity and more. Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense. http://p.sf.net/sfu/splunk-d2dcopy1 _______________________________________________ Pyparsing-users mailing list Pyp...@li... https://lists.sourceforge.net/lists/listinfo/pyparsing-users |