Let's say I have a file in the following format:
----
test1
begin
some lines
of
printable text
end
test2
begin
and some more text
end
----
My grammar for this format is:
----
name = Word(alphanums)
begin_ = LineStart() + Literal( "begin" )
end_ = LineStart() + Literal( "end" )
body = ZeroOrMore(Word(printables))
block = Group(name + begin_ + body + end_)
grammar = OneOrMore(block)
----
When I parse the example file with this grammar, I get an exception:
----
Traceback (most recent call last):
  File "<stdin>", line 19, in ?
  File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 657, in parseFile
    return self.parseString(file_contents)
  File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 503, in parseString
    loc, tokens = self.parse( instring.expandtabs(), 0 )
  File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 452, in parse
    loc,tokens = self.parseImpl( instring, loc, doActions )
  File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 1466, in parseImpl
    loc, tokens = self.expr.parse( instring, loc, doActions )
  File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 452, in parse
    loc,tokens = self.parseImpl( instring, loc, doActions )
  File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 1321, in parseImpl
    return self.expr.parse( instring, loc, doActions )
  File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 452, in parse
    loc,tokens = self.parseImpl( instring, loc, doActions )
  File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 1169, in parseImpl
    loc, exprtokens = e.parse( instring, loc, doActions )
  File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 452, in parse
    loc,tokens = self.parseImpl( instring, loc, doActions )
  File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 1003, in parseImpl
    raise exc
pyparsing.ParseException: Expected start of line (105), (11,1)
----
Could someone explain the reason for this exception to me?
-Klaas
In order to satisfy your grammar, the first token has to be a 'Word(alphanumeric)'. I'm not sure what _isn't_ a Word(alphanumeric). Maybe you have a punctuation mark in there somewhere?
-tl
Tom,
Thanks for your reply. However, I don't understand what you're saying. The first token (name) already is a Word(alphanums), right? In case you're referring to the body = ZeroOrMore(Word(printables)) token, there can be non-alphanumeric characters (_ , . ; < > etc.) between begin and end.
What I'm trying to do here is encapsulate an unknown grammar between a begin-end pair. My grammar could be a problem for the parser because it doesn't 'know' whether the first 'end' or the second 'end' pairs up with the first begin: the first 'end' can also be read as part of 'body'. If this is the problem, then I don't know how to solve it.
Cheers,
Klaas
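The suspicion in the post above can be checked without pyparsing. What follows is a minimal plain-Python sketch, not pyparsing's actual code: `greedy_body` is a hypothetical helper that imitates how ZeroOrMore(Word(printables)) behaves, on the assumption that it accepts every whitespace-delimited run of printable characters it is offered.

```python
SAMPLE = """test1
begin
some lines
of
printable text
end
test2
begin
and some more text
end
"""

def greedy_body(tokens, start):
    """Imitate ZeroOrMore(Word(printables)): accept every token on offer."""
    pos = start
    while pos < len(tokens):  # any run of printable characters qualifies
        pos += 1
    return pos

tokens = SAMPLE.split()
# tokens[0] is the block name and tokens[1] is 'begin'; the body starts at 2.
stop = greedy_body(tokens, 2)
swallowed = tokens[2:stop]

print(swallowed.count("end"))  # how many 'end' markers ended up in the body
print(stop == len(tokens))     # True means nothing is left for end_ to match
```

In this sketch the body takes both 'end' markers and runs to the end of the input, so the closing end_ has nothing left to match. If the (11,1) in the exception is read as line 11, column 1, that appears consistent: it is the position just past the last line of the ten-line sample.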
Klaas -
This is exactly why this grammar won't match the closing 'end': it gets eaten up as part of the body.
Assuming that 'end' never appears at the start of a line inside the body of any block, you could modify your grammar as follows:
----
name = Word(alphanums)
begin_ = LineStart() + Literal( "begin" )
end_ = LineStart() + Literal( "end" )
body = ZeroOrMore(~end_ + Word(printables))
block = Group(name + begin_ + body + end_)
----
Hope this works out for you - thanks for trying out pyparsing!
-- Paul McGuire
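The idea behind `~end_ + Word(printables)` in the reply above is a negative lookahead: before taking another body token, first check that the terminator does not match at that point. The sketch below shows the same idea in plain Python, using only the standard library. It is an illustration under stated assumptions, not pyparsing's implementation: `parse_blocks` is a hypothetical helper, the "first token on its line" flag stands in for LineStart(), and whole tokens are compared, whereas Literal("end") would also match a longer word that merely starts with those letters.

```python
SAMPLE = """test1
begin
some lines
of
printable text
end
test2
begin
and some more text
end
"""

def parse_blocks(text):
    """Return [(name, [body tokens])] for name/begin/.../end blocks."""
    # Tag each token with whether it is the first token on its line.
    tokens = []
    for line in text.splitlines():
        for i, word in enumerate(line.split()):
            tokens.append((word, i == 0))

    blocks, pos = [], 0
    while pos < len(tokens):
        name = tokens[pos][0]
        if pos + 1 >= len(tokens) or tokens[pos + 1] != ("begin", True):
            raise ValueError("expected 'begin' on its own line after %r" % name)
        pos += 2
        body = []
        # The guard: test for the terminator before accepting a body token.
        while pos < len(tokens) and tokens[pos] != ("end", True):
            body.append(tokens[pos][0])
            pos += 1
        if pos == len(tokens):
            raise ValueError("missing 'end' for block %r" % name)
        pos += 1  # consume the closing 'end'
        blocks.append((name, body))
    return blocks

blocks = parse_blocks(SAMPLE)
for name, body in blocks:
    print(name, body)
```

Because the guard only fires for an 'end' that starts a line, a body line such as "the end" is still collected as ordinary text. This matches the assumption stated in the reply: a line inside a body that begins with 'end' would close the block early.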