[Pyparsing] tools
From: spir <den...@fr...> - 2008-11-12 15:35:08
Hello,

If ever there is someone alive on the list these days...

I had some trouble understanding the output structure of parsing results. It is probably very different from what I was more or less unconsciously expecting. Below are some tools I use to get more usable results, according to my very personal taste. Some of you may find them useful, or know better ways to obtain similar results. Comments welcome.

listAll() returns a flat list built from nested results, with compound results and the results they contain listed in sequence. It avoids the need to walk recursively through the nested structure when an action has to be performed on every single result. By default, listAll returns (type, content) tuples; with noType=True it returns the bare results. The type is given by typ(): either the name set at pattern definition (with ('type') or setResultsName('type')), or the 'real' type of the result.

I use listAll e.g. for instantiating objects whose types are given by the result's type (through a mapping) and whose init data are taken from the content of the results. For instance:

    for item in listAll(calc.parseString(text)):
        Type = typ(item[0])
        data = item[1]
        symbols.append(Type(data))

Well, in fact, I don't really use it like that anymore, because:
* I now have listAll (and the other funcs below) as ParseResults methods.
* I created a custom result type that natively holds type and content as fields, so that parseString now returns objects of that kind for all named results (through a dedicated parse action).

pickLeaves() is very similar to listAll, except that it skips all compound results and jumps inside them instead, at any level, to retain only the 'terminal' ones: hence its name. It gives a very nice low-level overview of the results. If these build a complete representation of the source, then pickLeaves gives it back at the lowest *relevant* level, as defined by the various grouping patterns in the grammar.

treeView() builds a (python-like) hierarchical picture of the results. Compact and clear. Both are mainly intended for testing.
Both also return types (as given by typ()) by default, in addition to the content.

showSeq() can be used to properly format the screen output of pickLeaves. It also applies to listAll.

denis

#=============================
def typ(result):
    # named result: its name (None for an unnamed ParseResults)
    try:
        #print "type --- %s:%s" %(result.getName(),result)
        return result.getName()
    # plain value (str, int, float...): its 'real' type
    except AttributeError:
        return "<%s>" %result.__class__.__name__

def listAll(tree, noType=False):
    seq = []
    for part in tree:
        isValue = not(isinstance(part,ParseResults))    # str, int, float...
        isSimple = (not isValue) and (len(part) == 1)   # unique item inside
        # case value result : add value to seq
        if isValue:
            if noType: seq.append(part)
            else: seq.append((typ(part),part))
        # case simple result : add content to seq
        elif isSimple:
            if noType: seq.append(part[0])
            else: seq.append((typ(part),part[0]))
        # case compound result: add it, then recursively explore it
        else:
            if noType: seq.append(part)
            else: seq.append((typ(part),part))
            seq.extend(listAll(part, noType))
    return seq

# o u t p u t   f u n c s
def showSeq(seq):
    if len(seq) == 0:
        return ''
    # define if seq holds types, or not
    noType = not isinstance(seq[0],tuple)
    # build return text (str() so that non-string values can be joined too)
    text = str(seq[0]) if noType else "%s:%s" %(seq[0][0],str(seq[0][1]))
    for item in seq[1:]:
        if noType: text += " , %s" %str(item)
        else: text += " , %s:%s" %(item[0],str(item[1]))
    # add [...]
    return "[%s]" %text

def pickLeaves(tree, noType=False):
    seq = []
    for part in tree:
        isValue = not(isinstance(part,ParseResults))    # str, int, float...
        isSimple = (not isValue) and (len(part) == 1)   # unique item inside
        # case value result : add value to seq
        if isValue:
            if noType: seq.append(part)
            else: seq.append((typ(part),part))
        # case simple result : add content to seq
        elif isSimple:
            if noType: seq.append(part[0])
            else: seq.append((typ(part),part[0]))
        # case compound result: recursively explore nested result
        else:
            seq.extend(pickLeaves(part, noType))
    return seq

def treeView(results, level=0, skipAnonymous=False, defaultType=None, TAB='\t'):
    NL = '\n'
    texte = ''
    for result in results:
        # plain values have no getName(); unnamed ParseResults return None
        try:
            name = result.getName()
        except AttributeError:
            name = None
        # case named result
        if name is not None:
            texte += level*TAB + name + ': ' + str(result) + NL
        # case anonymous result
        elif not skipAnonymous:
            if defaultType: typeName = defaultType
            else: typeName = "<%s>" %(result.__class__.__name__)
            texte += level*TAB + typeName + ': ' + str(result) + NL
        # case compound result: walk through recursive nesting,
        # passing the display options down to the nested levels
        if isinstance(result,ParseResults) and len(result) > 1:
            texte += treeView(result, level+1, skipAnonymous, defaultType, TAB)
    return texte

# =================================
# examples
# =================================

# === g r a m m a r
from pyparsing import *     # !!!
class Grammar(object):
    integer = Word(nums).setParseAction(lambda i: int(i[0]))
    point = Literal('.')
    decimal = Combine(integer + point + integer).setParseAction(lambda x: float(x[0]))
    num = Group(decimal | integer)("num")
    plus = Literal('+')("plus")
    op = Group(num + plus + num)("op")
    calc = OneOrMore(op)
calc = Grammar.calc

# === i n p u t   t e x t
text = "1+2 3.0+4 5.0+6.0"

# === standard results
results = calc.parseString(text)
print("=== standard results :")
print(results)

# === show leaves
print("=== lowest-level flat sequence :")
leaves = pickLeaves(results)
print(showSeq(leaves))

# === show treeView
print("=== tree view :")
print(treeView(results))

# =================================

# === g r a m m a r
class Grammar(object):
    # tokens
    add = Literal('+')
    mult = Literal('*')
    l_paren = Literal('(')
    r_paren = Literal(')')
    num = Group(Word(nums).setParseAction(lambda i: int(i[0])))("num")
    # symbols
    mult_op = Group(num + mult + num)("mult_op")
    add_op = Group((mult_op|num) + add + (mult_op|num))("add_op")
    #group = Group(l_paren + in_op + r_paren)("group")
    operation = (add_op|mult_op)
    calc = OneOrMore(operation)
calc = Grammar.calc

# === i n p u t   t e x t
text = " 1+2*3 4*5+6"

# === standard results
results = calc.parseString(text)
print("")
print("=== standard results :")
print(results)

# === show leaves
print("=== lowest-level flat sequence :")
leaves = pickLeaves(results)
print(showSeq(leaves))

# === show treeView
print("=== tree view :")
print(treeView(results))
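
For readers without pyparsing at hand, the difference between listAll() and pickLeaves() can be sketched on plain nested lists. This is only an illustration under stated assumptions: Python lists stand in for ParseResults, the names list_all and pick_leaves are hypothetical (they are not part of pyparsing), and type tags are left out, which corresponds to the noType=True case.

```python
# Sketch only: nested Python lists stand in for pyparsing ParseResults.
# list_all and pick_leaves are hypothetical names that mirror
# listAll(noType=True) and pickLeaves(noType=True) from the post.

def list_all(tree):
    """Flat list of every result: each compound result, then its contents."""
    seq = []
    for part in tree:
        if not isinstance(part, list):   # plain value: keep it
            seq.append(part)
        elif len(part) == 1:             # simple result: keep its single item
            seq.append(part[0])
        else:                            # compound: keep it, then descend
            seq.append(part)
            seq.extend(list_all(part))
    return seq

def pick_leaves(tree):
    """Flat list of terminal results only: compound results are skipped."""
    seq = []
    for part in tree:
        if not isinstance(part, list):   # plain value: keep it
            seq.append(part)
        elif len(part) == 1:             # simple result: keep its single item
            seq.append(part[0])
        else:                            # compound: descend without keeping it
            seq.extend(pick_leaves(part))
    return seq

# Assumed shape of the results for "1+2 3.0+4": each op holds num, '+', num.
results = [[[1], '+', [2]], [[3.0], '+', [4]]]

print(list_all(results))     # 8 items: each op, followed by its 3 leaves
print(pick_leaves(results))  # [1, '+', 2, 3.0, '+', 4]
```

The only difference between the two functions is the compound branch: list_all records the compound result before descending into it, while pick_leaves descends without recording it.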