[Pyparsing] Problem with Or Longest Match (I think)...
From: Shane M. <sha...@gm...> - 2016-07-09 09:45:35
I'm developing a parser for the Graphviz DOT language and am having problems with the STMT expression in the grammar fragment below. In this simplified grammar a STMT can be either a SUBGRAPH or a NODE_STMT. An example SUBGRAPH expression is

    subgraph cluster01 { n003 ; n004 ; }

which, as you can see, is a composite statement. My problem is that while the SUBGRAPH expression happily accepts this test example, the STMT expression does not, even though it is defined as:

    STMT = SUBGRAPH("SUBGRAPH") ^ NODE_STMT("NODE")

The test code prints:

    Testing subgraph statements
    Match SUBGRAPH at loc 0(1,1)
    Match STMT_LIST at loc 20(1,21)
    Matched STMT_LIST -> ['n003', 'n004']
    Matched SUBGRAPH -> [['subgraph', 'cluster01'], ['n003', 'n004']]
    ([(['subgraph', 'cluster01'], {'SUBGRAPHNAME': [('cluster01', 1)]}), (['n003', 'n004'], {'NODE': [('n003', 0), ('n004', 1)], 'STMT': [('n003', 0), ('n004', 1)]})], {})
    Match STMT at loc 0(1,1)
    Matched STMT -> ['subgraph']
    Problem Test Sample: LINE= 1 COL= 10
    subgraph cluster01 { n003 ; n004 ; }
    ERROR: Expected end of text (at char 9), (line:1, col:10)

My "belief" is that the STMT expression should preferentially match the SUBGRAPH expression rather than the NODE_STMT expression, since it is the longer match, but clearly it does not. What am I missing?
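
The expectation stated above, that `^` should pick whichever alternative consumes the most input, can be sketched in plain Python. This is only an illustration of longest-match alternation under simplifying assumptions, not pyparsing's implementation: the helpers `match_node`, `match_subgraph` and `longest_match` are hypothetical names, and the regular expressions are rough stand-ins for NODE_STMT and SUBGRAPH that do not handle nested subgraphs.

```python
import re

# Rough stand-ins for the two alternatives (assumptions for illustration only).
ID_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
# Flat subgraph only: keyword, name, then a brace pair with no nested braces.
SUBGRAPH_RE = re.compile(
    r"subgraph\s+[A-Za-z][A-Za-z0-9_]*\s*\{[^{}]*\}", re.IGNORECASE
)


def match_node(text, loc=0):
    """Return the end offset of an identifier starting at loc, or None."""
    m = ID_RE.match(text, loc)
    return m.end() if m else None


def match_subgraph(text, loc=0):
    """Return the end offset of a flat subgraph starting at loc, or None."""
    m = SUBGRAPH_RE.match(text, loc)
    return m.end() if m else None


def longest_match(alternatives, text, loc=0):
    """Try every alternative at loc and keep the one consuming the most input."""
    best = None
    for name, matcher in alternatives:
        end = matcher(text, loc)
        if end is not None and (best is None or end > best[1]):
            best = (name, end)
    return best


sample = "subgraph cluster01 { n003 ; n004 ; }"
alternatives = [("NODE_STMT", match_node), ("SUBGRAPH", match_subgraph)]
result = longest_match(alternatives, sample)
print(result)
```

In this sketch the identifier alternative stops after the word `subgraph` (offset 8), while the subgraph alternative runs to the end of the sample, so longest-match selection would be expected to choose SUBGRAPH. The reported output shows STMT returning only `['subgraph']`, which is what the identifier alternative alone would produce.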
BTW - StackOverflow points available: http://stackoverflow.com/questions/38258218/suspected-pyparsing-longest-match-error

Thanks :-)

Grammar below:

    import pprint
    from pyparsing import (Forward, Group, Keyword, Literal, Optional,
                           ParseException, StringEnd, Word, ZeroOrMore,
                           alphanums, alphas)

    # Punctuation, suppressed from the parse results
    LCURL = Literal("{").suppress()
    RCURL = Literal("}").suppress()
    STMTSEP = Literal(";").suppress()

    ID = Word(alphas, alphanums + "_")
    SUBGRAPH_KW = Keyword("subgraph", caseless=True)

    # SUBGRAPH is recursive (via STMT_LIST), so it is declared first and defined later
    SUBGRAPH = Forward("SUBGRAPH")
    NODE_ID = ID("NODE_ID")
    NODE_STMT = NODE_ID("NODE")
    STMT = SUBGRAPH("SUBGRAPH") ^ NODE_STMT("NODE")
    STMT_LIST = ZeroOrMore(STMT("STMT") + Optional(STMTSEP))
    SUBGRAPH << Group(SUBGRAPH_KW + ID("SUBGRAPHNAME")) + Group(LCURL + STMT_LIST + RCURL)

    ######################################################
    SUBGRAPH.setName("SUBGRAPH")
    STMT.setName("STMT")
    STMT_LIST.setName("STMT_LIST")
    NODE_STMT.setName("NODE_STMT")
    ID.setName("ID")
    ######################################################

    print("Testing subgraph statements")
    test_ids = [
        '''subgraph cluster01 { n003 ; n004 ; }'''
    ]

    ################
    FRAG_1 = STMT + StringEnd()
    ################

    NODE_STMT.setDebug(True)
    SUBGRAPH.setDebug(True)
    ID.setDebug(True)
    STMT.setDebug(True)
    STMT_LIST.setDebug(True)

    for test in test_ids:
        try:
            result = FRAG_1.parseString(test)
            pprint.pprint(result)
        except ParseException as e:
            # Report the failure position and mark the column with a caret
            print("Problem Test Sample: LINE= %s COL= %s" % (e.lineno, e.col))
            print(e.line)
            print(" " * (e.col - 1) + "^")
            print("ERROR: %s" % str(e))