Re: [Pyparsing] parsing a string that represents a tree
Brought to you by:
ptmcg
From: Mark T. <met...@gm...> - 2011-09-08 16:05:19
|
"gyro funch" <gyr...@gm...> wrote in message news:4E6...@gm...... > Hi, > > I have a set of strings that look similar to this: > > N1-(N2,N3-(N4,N5),N6-(N7,N8-(N9,N10),N11)) > > Each string essentially represents a tree structure, where the 'N#' > represent nodes and the '-' are connections or edges. Entries in > parentheses represent branches from the previous node. > > I would like to parse the string and then enumerate all of the paths > through the tree, starting at the root and out to each leaf. > > In the case of the above string, the desired output would be as follows: > > N1-N2 > N1-N3-N4 > N1-N3-N5 > N1-N6-N7 > N1-N6-N11 > N1-N6-N8-N9 > N1-N6-N8-N10 > > > I would greatly appreciate hints on how to use Pyparsing to help > achieve this functionality. Use a recursive grammar: from pyparsing import * name = Word('N',nums) node = Forward() nodeList = delimitedList(node) node << Group(name + Group(Optional(Suppress('-(') + nodeList + Suppress(')')))) s = "N1-(N2,N3-(N4,N5),N6-(N7,N8-(N9,N10),N11))" tree = node.parseString(s) print tree This generates nodes in the format "[name,[child,...]]": [['N1', [['N2', []], ['N3', [['N4', []], ['N5', []]]], ['N6', [['N7', []], ['N8', [['N9', []], ['N10', []]]], ['N11', []]]]]]] Here's a function to display it in the format you require: def traverse(node): if node[1]: for n in node[1]: for item in traverse(n): yield '{}-{}'.format(node[0],item) else: yield node[0] for item in traverse(tree[0]): print item Output: N1-N2 N1-N3-N4 N1-N3-N5 N1-N6-N7 N1-N6-N8-N9 N1-N6-N8-N10 N1-N6-N11 -Mark |