[Pyparsing] Conditional Suppress?
From: June K. <jun...@gm...> - 2008-02-15 02:36:21
Hello,

It's been a long while since I last used pyparsing (Hi Paul). I have good memories of pyparsing. Recently I started using it again, to parse a subset of C and PL/I syntax.

First, please look at the following code:

    from pyparsing import *
    from pprint import pprint

    t="""
    a=a+1;
    if (a>b) {
      //if (1==2) gogogo();
      if (c>d) a=a+1;
      else b=b+1;
    }
    else {
      c=c+1;
      if (3==4) printf("abc");
      else if (4==5) dothis();
      else dothat();
      if (a>b) {go();go();come();}
      if (d>e)
        if (c>g)
          if(a>q) a=a+1;
    }
    """

    ifst=Forward()
    stmt=Forward()
    cmpdstmt=Group(Literal('{').suppress()+
                   ZeroOrMore(stmt)+
                   Literal('}').suppress())
    stmt << (Literal(';').suppress() |
             cmpdstmt |
             ifst |
             (NotAny('}')+SkipTo(';',include=True)).suppress())
    ifst << Group(Keyword("if")+nestedExpr("(",")").suppress()+stmt+\
                  Optional(Keyword("else")+stmt))

    p=ifst.ignore(cStyleComment).setDebug(False)
    pprint (list(p.scanString(t))[0][0].asList(),width=2,indent=2)
    ###########################################

What I am trying to do is get the minimal tree structure of the if statements. I am not interested in anything other than if statements. The result is:

    [ [ 'if',
        [ [ 'if', 'else']],
        'else',
        [ [ 'if', 'else', [ 'if', 'else']],
          [ 'if', [ ]],
          [ 'if', [ 'if', [ 'if']]]]]]

I am fairly satisfied with the result, but would like the empty list removed. That is, I want the {go();go();come();} part to be absent from the parse results entirely (not even as an empty list). However, I can't suppress cmpdstmt altogether, since some compound statements may contain 'if'. I played with setParseAction but couldn't get what I wanted.
I also tried branching the grammar so that compound statements containing no if statement (non-if cmpdstmt) are handled separately, as follows:

    ifst=Forward()
    stmt=Forward()
    cmpdstmt=Forward()
    stmt << (Literal(';').suppress() |
             cmpdstmt |
             ifst |
             (NotAny('}')+SkipTo(';',include=True)).suppress())
    restcmpdstmt=Group(Literal('{').suppress()+
                       ZeroOrMore(stmt)+
                       Literal('}').suppress())
    nonifstmt=Forward()
    nonifcmpdstmt=Group(Literal('{').suppress()+
                        ZeroOrMore(nonifstmt)+
                        Literal('}').suppress())
    # a statement that doesn't include if-statement
    nonifstmt << (Literal(';').suppress() |
                  nonifcmpdstmt |
                  (~oneOf('} if')+SkipTo(';',include=True)).suppress())
    cmpdstmt << (nonifcmpdstmt.suppress() | restcmpdstmt)
    ifst << Group(Keyword("if")+nestedExpr("(",")").suppress()+stmt+\
                  Optional(Keyword("else")+stmt))

It returns the expected result:

    [ [ 'if',
        [ [ 'if', 'else']],
        'else',
        [ [ 'if', 'else', [ 'if', 'else']],
          [ 'if'],
          [ 'if', [ 'if', [ 'if']]]]]]

Does the code look right? The problem, though, is that the branched grammar is too complex to write, read, and maintain. What alternatives or improvements do you recommend? (Maybe post-processing the parse results after the parsing has finished?)

June Kim
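
For reference, the post-processing idea raised at the end of the message could look roughly like the sketch below. It is only an illustration, not a pyparsing feature: prune_empty is a hypothetical helper name, and it assumes the result has already been converted to plain nested lists with asList(), as in the first listing. The input is the output of the first grammar, copied from above.

```python
def prune_empty(tree):
    """Return a copy of a nested list with empty sublists removed.

    Hypothetical helper (not part of pyparsing). Children are pruned
    first, so a list that becomes empty after pruning is dropped too.
    """
    pruned = []
    for item in tree:
        if isinstance(item, list):
            child = prune_empty(item)
            if child:  # keep only sublists that still contain something
                pruned.append(child)
        else:
            pruned.append(item)
    return pruned


# Output of the first (simple) grammar, as shown in the message.
first_result = [['if',
                 [['if', 'else']],
                 'else',
                 [['if', 'else', ['if', 'else']],
                  ['if', []],
                  ['if', ['if', ['if']]]]]]

print(prune_empty(first_result))
```

Applied to the first grammar's output, this yields the same nested list that the branched grammar returns, so the simple grammar could be kept and the cleanup done in one pass afterwards. The trade-off is that the pruning works on plain lists rather than on the ParseResults object itself.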