Re: [Pyparsing] parsing a string that represents a tree
Brought to you by:
ptmcg
From: gyro f. <gyr...@gm...> - 2011-09-08 17:08:17
|
On 2011-09-08 9:49 AM, Mark Tolonen wrote: > > "gyro funch" <gyr...@gm...> wrote in message > news:4E6...@gm...... >> Hi, >> >> I have a set of strings that look similar to this: >> >> N1-(N2,N3-(N4,N5),N6-(N7,N8-(N9,N10),N11)) >> >> Each string essentially represents a tree structure, where the 'N#' >> represent nodes and the '-' are connections or edges. Entries in >> parentheses represent branches from the previous node. >> >> I would like to parse the string and then enumerate all of the paths >> through the tree, starting at the root and out to each leaf. >> >> In the case of the above string, the desired output would be as follows: >> >> N1-N2 >> N1-N3-N4 >> N1-N3-N5 >> N1-N6-N7 >> N1-N6-N11 >> N1-N6-N8-N9 >> N1-N6-N8-N10 >> >> >> I would greatly appreciate hints on how to use Pyparsing to help >> achieve this functionality. > > Use a recursive grammar: > > from pyparsing import * > name = Word('N',nums) > node = Forward() > nodeList = delimitedList(node) > node << Group(name + Group(Optional(Suppress('-(') + nodeList + > Suppress(')')))) > > s = "N1-(N2,N3-(N4,N5),N6-(N7,N8-(N9,N10),N11))" > > tree = node.parseString(s) > print tree > > This generates nodes in the format "[name,[child,...]]": > > [['N1', [['N2', []], ['N3', [['N4', []], ['N5', []]]], ['N6', [['N7', []], > ['N8', [['N9', []], ['N10', []]]], ['N11', []]]]]]] > > Here's a function to display it in the format you require: > > def traverse(node): > if node[1]: > for n in node[1]: > for item in traverse(n): > yield '{}-{}'.format(node[0],item) > else: > yield node[0] > > for item in traverse(tree[0]): > print item > > Output: > > N1-N2 > N1-N3-N4 > N1-N3-N5 > N1-N6-N7 > N1-N6-N8-N9 > N1-N6-N8-N10 > N1-N6-N11 > > -Mark > > Thanks, Mark! That was *extremely* helpful. -gyro |