[Pyparsing] Variable comment char
From: Russell D. <Rus...@as...> - 2011-09-26 02:28:38
I need to parse a rather evil format that has the following line:

    [Comment Char] |_char

which would change the comment char to '|'. The comment char can be any non-alphanumeric printable other than [ or ], and it can be changed at any time and as many times as desired. The non-comment, non-bracket characters are legal characters.

Initially, I figured I'd hold onto the ParserElement objects in self and modify them in callbacks. However, it appears they get copied, so modifying them does nothing. So I made a Token class that holds an internal reference to something I can modify. This works for the comment char, but then I have the issue of parsing tokens that contain characters that could be comment characters. If I include all the non-alphanumeric printables in that word, then comments don't get ignored.

What's the best way to do this?

    class DynamicCharHolder():
        def __init__(self, char):
            self.match = char

        def getMatch(self):
            return self.match

        def setMatch(self, char):
            self.match = char

    class DynamicChar(Token):
        def __init__(self, holder):
            super(DynamicChar, self).__init__()
            self.holder = holder
            self.name = "DynamicChar"
            self.mayReturnEmpty = False
            self.mayIndexError = False

        def parseImpl(self, instring, loc, doActions=True ):
            if (instring[loc] == self.holder.getMatch()):
                print "match '" + instring[loc] + "' at " + str(loc)
                return loc + 1, instring[loc]
            exc = self.myException
            exc.loc = loc
            exc.pstr = instring
            raise exc

    class IBISParser:
        def setCommentDelim(self, tokens):
            self.holder.setMatch(tokens[0])
            print "New comment delim " + tokens[0]

        def __init__(self, text):
            self.holder = DynamicCharHolder("|")
            comment_delim = DynamicChar(self.holder)
            comment = comment_delim + restOfLine
            [...]
            ibis_file.ignore(comment)
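
For illustration only, the comment-char rule described above can be sketched as a line-by-line pre-pass in plain Python, independent of pyparsing. This is not the approach in the code above and not a pyparsing API. The names `strip_comments` and `LEGAL_COMMENT_CHARS` are hypothetical, and the shape of the keyword line is assumed from the single example given. The keyword line itself is left untouched, which is also an assumption.

```python
import re
import string

# Assumed shape of the keyword line, based on the example "[Comment Char] |_char".
_COMMENT_CHAR_LINE = re.compile(r"^\s*\[Comment Char\]\s*(\S)_char", re.IGNORECASE)

# Legal comment chars as described: non-alphanumeric printables other than [ and ].
LEGAL_COMMENT_CHARS = set(string.punctuation) - set("[]")


def strip_comments(text, comment_char="|"):
    """Return text with comments removed, tracking comment-char changes per line."""
    out = []
    for line in text.splitlines():
        m = _COMMENT_CHAR_LINE.match(line)
        if m and m.group(1) in LEGAL_COMMENT_CHARS:
            # Switch the active comment char; keep the keyword line verbatim.
            comment_char = m.group(1)
            out.append(line)
            continue
        # Drop everything from the active comment char to the end of the line.
        out.append(line.split(comment_char, 1)[0].rstrip())
    return "\n".join(out)


if __name__ == "__main__":
    sample = "[Model] a | first\n[Comment Char] #_char\nx | y # note\n"
    print(strip_comments(sample))
```

With comments already removed, the grammar that follows would no longer need a dynamic comment token, so the word definition could include every non-alphanumeric printable without hiding comments.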