## [Pyparsing] Conditional Suppress?

[Pyparsing] Conditional Suppress? From: June Kim - 2008-02-15 02:36:21
```
Hello,

It's been a long while since I last used pyparsing (Hi Paul). I have
good memories of pyparsing. Recently, I'm using it again, for parsing
a subset of C and PL/I syntax.

First, look at the following code please:

from pyparsing import *
from pprint import pprint

t="""
a=a+1;
if (a>b) { //if (1==2) gogogo();
    if (c>d) a=a+1;
    else b=b+1;
} else {
    c=c+1;
    if (3==4)
        printf("abc");
    else if (4==5)
        dothis();
    else
        dothat();
    if (a>b) {go();go();come();}
    if (d>e) if (c>g) if(a>q) a=a+1;
}
"""

ifst=Forward()
stmt=Forward()

cmpdstmt=Group(Literal('{').suppress()+
               ZeroOrMore(stmt)+
               Literal('}').suppress())

stmt << (Literal(';').suppress() |
         cmpdstmt |
         ifst |
         (NotAny('}')+SkipTo(';',include=True)).suppress())

ifst << Group(Keyword("if")+nestedExpr("(",")").suppress()+stmt+\
              Optional(Keyword("else")+stmt))

p=ifst.ignore(cStyleComment).setDebug(False)
pprint (list(p.scanString(t))[0][0].asList(),width=2,indent=2)

###########################################

What I am trying to do is get the minimal tree structure of if
statements. I am not interested in anything other than if statements.

The result is:

[ [ 'if',
    [ [ 'if',
        'else']],
    'else',
    [ [ 'if',
        'else',
        [ 'if',
          'else']],
      [ 'if',
        [ ]],
      [ 'if',
        [ 'if',
          [ 'if']]]]]]

I am fairly satisfied with the result, but would like the blank list
removed. That is, I want the {go();go();come();} part not present (not
even as an empty list) in the parseResult. However, I can't totally
suppress the cmpdstmt since some of them might include 'if'.

I played with setParseAction but couldn't get what I wanted. I also
branched a few new grammars for treating non-if-cmpdstmt as follows:

ifst=Forward()
stmt=Forward()
cmpdstmt=Forward()


stmt << (Literal(';').suppress() |
         cmpdstmt |
         ifst |
         (NotAny('}')+SkipTo(';',include=True)).suppress())

restcmpdstmt=Group(Literal('{').suppress()+
                   ZeroOrMore(stmt)+
                   Literal('}').suppress())

nonifstmt=Forward()
nonifcmpdstmt=Group(Literal('{').suppress()+
                    ZeroOrMore(nonifstmt)+
                    Literal('}').suppress())

# a statement that doesn't include if-statement
nonifstmt << (Literal(';').suppress() |
              nonifcmpdstmt |
              (~oneOf('} if')+SkipTo(';',include=True)).suppress())

cmpdstmt << (nonifcmpdstmt.suppress() | restcmpdstmt)

ifst << Group(Keyword("if")+nestedExpr("(",")").suppress()+stmt+\
              Optional(Keyword("else")+stmt))


It returns the expected result.

[ [ 'if',
    [ [ 'if',
        'else']],
    'else',
    [ [ 'if',
        'else',
        [ 'if',
          'else']],
      [ 'if'],
      [ 'if',
        [ 'if',
          [ 'if']]]]]]

Does the code look right?

Yet, the problem is that the branched grammar version is too complex
to write, read, and maintain.

What alternatives or improvements do you recommend? (maybe
post-processing the parseResult after the parsing's finished?)


June Kim
```

Re: [Pyparsing] Conditional Suppress? From: June Kim - 2008-02-15 03:26:49
```
Well, a much simpler solution just occurred to me:

class EmptySuppress(Suppress):
    #suppress only empty tokens
    def postParse( self, instring, loc, tokenlist ):
        if len(tokenlist[0]):
            return tokenlist
        return []

cmpdstmt=EmptySuppress(Group(Literal('{').suppress()+
                             ZeroOrMore(stmt)+
                             Literal('}').suppress()))


2008/2/15, June Kim :
> Hello,
>
> It's been a long while since I last used pyparsing (Hi Paul). I have
> good memories of pyparsing. Recently, I'm using it again, for parsing
> a subset of C and PL/I syntax.
>
> First, look at the following code please:
>
> from pyparsing import *
> from pprint import pprint
>
> t="""
> a=a+1;
> if (a>b) { //if (1==2) gogogo();
>     if (c>d) a=a+1;
>     else b=b+1;
> } else {
>     c=c+1;
>     if (3==4)
>         printf("abc");
>     else if (4==5)
>         dothis();
>     else
>         dothat();
>     if (a>b) {go();go();come();}
>     if (d>e) if (c>g) if(a>q) a=a+1;
> }
> """
>
> ifst=Forward()
> stmt=Forward()
>
> cmpdstmt=Group(Literal('{').suppress()+
>                ZeroOrMore(stmt)+
>                Literal('}').suppress())
>
> stmt << (Literal(';').suppress() |
>          cmpdstmt |
>          ifst |
>          (NotAny('}')+SkipTo(';',include=True)).suppress())
>
> ifst << Group(Keyword("if")+nestedExpr("(",")").suppress()+stmt+\
>               Optional(Keyword("else")+stmt))
>
> p=ifst.ignore(cStyleComment).setDebug(False)
> pprint (list(p.scanString(t))[0][0].asList(),width=2,indent=2)
>
> ###########################################
>
> What I am trying to do is get the minimal tree structure of if
> statements. I am not interested in anything other than if statements.
>
> The result is:
>
> [ [ 'if',
>     [ [ 'if',
>         'else']],
>     'else',
>     [ [ 'if',
>         'else',
>         [ 'if',
>           'else']],
>       [ 'if',
>         [ ]],
>       [ 'if',
>         [ 'if',
>           [ 'if']]]]]]
>
> I am fairly satisfied with the result, but would like the blank list
> removed. That is, I want the {go();go();come();} part not present (not
> even as an empty list) in the parseResult. However, I can't totally
> suppress the cmpdstmt since some of them might include 'if'.
>
> I played with setParseAction but couldn't get what I wanted. I also
> branched a few new grammars for treating non-if-cmpdstmt as follows:
>
> ifst=Forward()
> stmt=Forward()
> cmpdstmt=Forward()
>
>
> stmt << (Literal(';').suppress() |
>          cmpdstmt |
>          ifst |
>          (NotAny('}')+SkipTo(';',include=True)).suppress())
>
> restcmpdstmt=Group(Literal('{').suppress()+
>                    ZeroOrMore(stmt)+
>                    Literal('}').suppress())
>
> nonifstmt=Forward()
> nonifcmpdstmt=Group(Literal('{').suppress()+
>                     ZeroOrMore(nonifstmt)+
>                     Literal('}').suppress())
>
> # a statement that doesn't include if-statement
> nonifstmt << (Literal(';').suppress() |
>               nonifcmpdstmt |
>               (~oneOf('} if')+SkipTo(';',include=True)).suppress())
>
> cmpdstmt << (nonifcmpdstmt.suppress() | restcmpdstmt)
>
> ifst << Group(Keyword("if")+nestedExpr("(",")").suppress()+stmt+\
>               Optional(Keyword("else")+stmt))
>
>
> It returns the expected result.
>
> [ [ 'if',
>     [ [ 'if',
>         'else']],
>     'else',
>     [ [ 'if',
>         'else',
>         [ 'if',
>           'else']],
>       [ 'if'],
>       [ 'if',
>         [ 'if',
>           [ 'if']]]]]]
>
> Does the code look right?
>
> Yet, the problem is that the branched grammar version is too complex
> to write, read, and maintain.
>
> What alternatives or improvements do you recommend? (maybe
> post-processing the parseResult after the parsing's finished?)
>
>
> June Kim
>
```
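For comparison, the post-processing idea raised at the end of the original question can be sketched in plain Python, with no change to the grammar. This is only an illustration, not something proposed in the thread: the helper name `prune_empty` is hypothetical, and it assumes the parse result has already been converted to nested lists with `asList()`. The `raw` value is the first result shown in the question, typed in by hand.

```python
def prune_empty(tree):
    """Return a copy of a nested list with empty sub-lists removed.

    A sub-list that becomes empty after its own children are pruned
    is dropped as well, so nested blocks without any 'if' disappear.
    """
    pruned = []
    for item in tree:
        if isinstance(item, list):
            item = prune_empty(item)
            if not item:
                continue  # skip groups that hold nothing of interest
        pruned.append(item)
    return pruned


# First result from the question, including the unwanted empty list.
raw = [['if', [['if', 'else']], 'else',
        [['if', 'else', ['if', 'else']],
         ['if', []],
         ['if', ['if', ['if']]]]]]

print(prune_empty(raw))
```

In the original script this would be applied to `list(p.scanString(t))[0][0].asList()`. The trade-off is that the grammar stays as simple as the first version, at the cost of a second pass over the result.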