I need to parse a rather evil format that has the following line:
[Comment Char] |_char
This line would change the comment char to '|'. The comment char can be
any non-alphanumeric printable character other than [ or ], and it can be
changed at any time and as many times as desired. All characters other
than the comment char and the brackets are legal characters.

Initially, I figured I'd hold onto the ParserElement objects in self and
modify them in callbacks. However, it appears they get copied, so
modifying them does nothing. So I made a Token subclass that holds an
internal reference to something I can modify. This works for the
comment char, but now I have the problem of parsing tokens that
contain characters that could be comment characters. If I include all
non-alphanumeric printables in that word, then comments don't get
ignored. What's the best way to do this?

from pyparsing import Token, restOfLine

class DynamicCharHolder():
    # Mutable holder so the comment char can change after the grammar is built.
    def __init__(self, char):
        self.match = char

    def getMatch(self):
        return self.match

    def setMatch(self, char):
        self.match = char

class DynamicChar(Token):
    # Matches whatever single character the holder currently contains.
    def __init__(self, holder):
        super(DynamicChar, self).__init__()
        self.holder = holder
        self.name = "DynamicChar"
        self.mayReturnEmpty = False
        self.mayIndexError = False

    def parseImpl(self, instring, loc, doActions=True):
        # Bounds check first: instring[loc] would raise IndexError at end of input.
        if loc < len(instring) and instring[loc] == self.holder.getMatch():
            print("match '" + instring[loc] + "' at " + str(loc))
            return loc + 1, instring[loc]
        exc = self.myException
        exc.loc = loc
        exc.pstr = instring
        raise exc

class IBISParser:
    def setCommentDelim(self, tokens):
        self.holder.setMatch(tokens[0])
        print("New comment delim " + tokens[0])

    def __init__(self, text):
        self.holder = DynamicCharHolder("|")
        comment_delim = DynamicChar(self.holder)
        comment = comment_delim + restOfLine
        [...]
        ibis_file.ignore(comment)
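
For reference, the comment-char rule described above can be sketched in plain Python, independent of pyparsing. This is not the grammar-based approach attempted in the code, just a line-by-line pre-pass that tracks the current comment char and strips comments before any parsing happens. The names `strip_comments`, `ALLOWED_COMMENT_CHARS` and `COMMENT_CHAR_LINE` are hypothetical, `string.punctuation` is used as an approximation of "non-alphanumeric printable", and it is an assumption that the directive always has the form `[Comment Char] X_char` and that comments run to the end of the line.

```python
import re
import string

# Approximation of the allowed comment chars: ASCII punctuation minus brackets.
ALLOWED_COMMENT_CHARS = set(string.punctuation) - {"[", "]"}

# Assumed directive shape: "[Comment Char] X_char", capturing X.
COMMENT_CHAR_LINE = re.compile(r"^\s*\[Comment Char\]\s+(\S)_char\b")


def strip_comments(text, comment_char="|"):
    """Return the lines of text with comments removed.

    The current comment char starts as comment_char and is updated
    whenever a [Comment Char] directive line is seen.
    """
    result = []
    for line in text.splitlines():
        match = COMMENT_CHAR_LINE.match(line)
        if match and match.group(1) in ALLOWED_COMMENT_CHARS:
            # Check for the directive BEFORE stripping comments; otherwise a
            # line like "[Comment Char] |_char" would be cut at the old '|'.
            comment_char = match.group(1)
            head = line[:match.end()]
            rest = line[match.end():].split(comment_char, 1)[0]
            result.append((head + rest).rstrip())
        else:
            # Everything from the current comment char onward is a comment.
            result.append(line.split(comment_char, 1)[0].rstrip())
    return result
```

With a pre-pass like this, a character such as '|' becomes an ordinary token character as soon as another comment char is in effect, and the stripped lines could then be handed to a grammar that no longer has to ignore comments itself.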