Python parsing module / Discussion / Help/Open Discussion: parse question

Klaas Hofstra - 2005-03-16

Let's say I have a file with the following format:
----
test1
begin
    some lines
    of
    printable text
end

test2
begin
    and some more text
end
----

My grammar for this format is:
----
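from pyparsing import Group, LineStart, Literal, OneOrMore, Word, ZeroOrMore, alphanums, printables

name = Word(alphanums)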
begin_ = LineStart() + Literal( "begin" )
end_ = LineStart() + Literal( "end" )
body = ZeroOrMore(Word(printables))
block = Group(name + begin_ + body + end_)

grammar = OneOrMore(block)
----

When I parse the example file with this grammar, I get an exception:
----
Traceback (most recent call last):
File "<stdin>", line 19, in ?
File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 657, in parseFile
    return self.parseString(file_contents)
File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 503, in parseString
    loc, tokens = self.parse( instring.expandtabs(), 0 )
File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 452, in parse
    loc,tokens = self.parseImpl( instring, loc, doActions )
File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 1466, in parseImpl
    loc, tokens = self.expr.parse( instring, loc, doActions )
File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 452, in parse
    loc,tokens = self.parseImpl( instring, loc, doActions )
File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 1321, in parseImpl
    return self.expr.parse( instring, loc, doActions )
File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 452, in parse
    loc,tokens = self.parseImpl( instring, loc, doActions )
File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 1169, in parseImpl
    loc, exprtokens = e.parse( instring, loc, doActions )
File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 452, in parse
    loc,tokens = self.parseImpl( instring, loc, doActions )
File "/usr/lib64/python2.3/site-packages/pyparsing.py", line 1003, in parseImpl
    raise exc
pyparsing.ParseException: Expected start of line (105), (11,1)
----

Could someone explain the reason for this exception to me?

-Klaas

- Tom Lynch - 2005-03-16
  
  To satisfy your grammar, the first token has to be a 'Word(alphanums)'. I'm not sure what in your input _isn't_ a Word(alphanums). Maybe you have a punctuation mark in there somewhere?
  
  -tl
  
  - Klaas Hofstra - 2005-03-17
    
    Tom,
    
    Thanks for your reply. However, I don't understand what you're saying. The first token (name) already is a Word(alphanums), right? In case you're referring to the body = ZeroOrMore(Word(printables)) token, there can be non-alphanumeric characters (_ , . ; < > etc.) between begin and end.
    
    What I'm trying to do here is to encapsulate unknown grammar between a begin-end pair. My grammar could be a problem for the parser because it doesn't 'know' whether the first 'end' or the second 'end' pairs up with the first 'begin': the first 'end' can also be seen as part of 'body'. If this is the problem, then I don't know how to solve it.
    
    Cheers,
    Klaas
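
The suspicion above can be illustrated without pyparsing at all. The following is a minimal plain-Python model, not pyparsing's actual implementation: `greedy_body` is a hypothetical helper that mimics a repetition accepting any run of printable characters, and the sample text is the file from the original question. It ignores the LineStart() part of the grammar and only shows the greedy consumption.

```python
SAMPLE = """\
test1
begin
    some lines
    of
    printable text
end

test2
begin
    and some more text
end
"""


def greedy_body(tokens, start):
    """Model of ZeroOrMore(Word(printables)): accept every token, never stop early."""
    pos = start
    while pos < len(tokens):
        # Any whitespace-delimited token is made of printable characters,
        # so 'end', 'test2' and 'begin' are all accepted as body text.
        pos += 1
    return tokens[start:pos], pos


tokens = SAMPLE.split()
body, pos = greedy_body(tokens, 2)  # skip the leading 'test1' and 'begin'

# After the body there is nothing left for the closing 'end' to match.
closing_found = pos < len(tokens) and tokens[pos] == "end"

print(body)
print(closing_found)
```

In this model the body swallows both 'end' tokens and the whole second block, so the closing 'end' can only be looked for at the end of the input, which is consistent with the exception being raised near the end of the file.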
    
    - Paul McGuire - 2005-03-17
      
      Klaas -
      
      This is exactly why this grammar won't match the closing 'end' - it gets eaten up as part of the body.
      
      Assuming that 'end' never appears at the start of a line inside any body, you could modify your grammar as follows:
      
      name = Word(alphanums)
      begin_ = LineStart() + Literal( "begin" )
      end_ = LineStart() + Literal( "end" )
      body = ZeroOrMore(~end_ + Word(printables))
      block = Group(name + begin_ + body + end_)
      
      Hope this works out for you - thanks for trying out pyparsing!
      
      -- Paul McGuire
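
A self-contained sketch of the negative-lookahead idea (`~end_`) from the reply above, run against the sample file from the question. One deliberate difference, which is an assumption of this sketch and not part of the suggested grammar: `begin_` and `end_` are written with `Keyword` instead of `LineStart() + Literal(...)`, because the exact behaviour of `LineStart` has varied between pyparsing releases. As a consequence, any standalone word 'end' closes a block here, not only one at the start of a line. The camelCase method names (`parseString`, `asList`) follow the traceback in the question.

```python
from pyparsing import (
    Group, Keyword, OneOrMore, Word, ZeroOrMore, alphanums, printables,
)

SAMPLE = """\
test1
begin
    some lines
    of
    printable text
end

test2
begin
    and some more text
end
"""

name = Word(alphanums)
begin_ = Keyword("begin")  # assumption: Keyword replaces LineStart() + Literal
end_ = Keyword("end")

# ~end_ is a negative lookahead: a body word is accepted only if the
# closing keyword does NOT match at the current position.
body = ZeroOrMore(~end_ + Word(printables))
block = Group(name + begin_ + body + end_)
grammar = OneOrMore(block)

blocks = grammar.parseString(SAMPLE).asList()
for parsed in blocks:
    print(parsed)
```

Each parsed block is a list that starts with the block name and 'begin' and finishes with 'end'; the words in between are the body. Without the `~end_` lookahead the same grammar raises a ParseException, because the body consumes the closing 'end'.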
      