Unparsed part of a string

2005-06-08
2013-05-14
  • linuxconvert2
    2005-06-08

    Hi, I am using pyparsing with great pleasure.

    Having called
        MyPattern.parseString("nnn nnn")
    How do I know which part of "nnn nnn" has not been parsed? Sorry, but I could not find this in the documentation.

    Best regards, Mathias

     
    • Paul McGuire
      2005-06-08

      Mathias -

      Welcome to pyparsing!

      This is a fairly frequently asked question, so I should probably add some notes to the "How to Use" docs.

      Don't forget that parseString returns a ParseResults object, which can be accessed like a list, a dictionary, or an object with attributes, depending on how you've built your grammar.  The simplest method is as a list.
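
A minimal sketch of the three access styles mentioned above, assuming a results name is attached with setResultsName (the name "first" is only illustrative and not part of the original grammar):

```python
from pyparsing import Word

# Attach a results name so the token can also be read by key or attribute.
# "first" is a hypothetical name chosen for this example.
n_word = Word("n").setResultsName("first")
result = n_word.parseString("nnn nnn")

as_list = result.asList()   # list-style access
by_key = result["first"]    # dictionary-style access
by_attr = result.first      # attribute-style access
print(as_list, by_key, by_attr)
```

Without a results name, only the list-style access applies.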

      Ok, so let's assume that you are parsing strings composed of words made up of the letter 'n' (this is a pretty boring grammar, but you picked it!).  Here's one grammar you might use:

      from pyparsing import Word, OneOrMore, StringEnd

      nWord = Word('n')
      MyPattern = nWord
      print(MyPattern.parseString("nnn nnn"))

      This will show the results as:
      ['nnn']
      so clearly we haven't parsed the entire string.  A more thorough grammar could be:
      MyPattern = OneOrMore(nWord)

      So that
      print(MyPattern.parseString("nnn nnn"))
      ['nnn', 'nnn']
      now shows us more complete information.
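
If the goal is to see exactly where matching stopped, one possible approach (a sketch, not necessarily what the reply intends) is scanString, which yields each match together with its start and end offsets. The remainder of the input can then be sliced off by hand:

```python
from pyparsing import Word

text = "nnn nnn"
n_word = Word("n")

# scanString yields (tokens, start, end) for each match; take only the first.
tokens, start, end = next(n_word.scanString(text))

# Everything after the end offset was not consumed by that first match.
unparsed = text[end:]
print(tokens.asList(), repr(unparsed))
```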

      But the most certain way to ensure that we have or haven't parsed everything is to define our grammar in a way that says "when the grammar is completed, I should be at the end of the input string".  Use the StringEnd() class for this, as in:

      MyPattern = OneOrMore(nWord) + StringEnd()
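
As a rough illustration of this grammar, the sketch below parses one input made up entirely of 'n' words and one with trailing text. The second input string "nnn xyz" is an assumption added for the example:

```python
from pyparsing import OneOrMore, ParseException, StringEnd, Word

n_word = Word("n")
my_pattern = OneOrMore(n_word) + StringEnd()

# The whole input consists of 'n' words, so the parse succeeds.
full = my_pattern.parseString("nnn nnn").asList()

# Trailing text that is not an 'n' word makes StringEnd() fail.
try:
    my_pattern.parseString("nnn xyz")
    rejected = False
except ParseException:
    rejected = True

print(full, rejected)
```

StringEnd() contributes no token of its own, so the successful result contains only the matched words.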

      If we had defined our original grammar using StringEnd(), we would have gotten this:

      MyPattern = nWord + StringEnd()
      print(MyPattern.parseString("nnn nnn"))
      Traceback (most recent call last):
        File "<stdin>", line 1, in ?
        File "c:\python24\Lib\site-packages\pyparsing.py", line 606, in parseString
          loc, tokens = self.parse( instring.expandtabs(), 0 )
        File "c:\python24\Lib\site-packages\pyparsing.py", line 547, in parse
          loc,tokens = self.parseImpl( instring, loc, doActions )
        File "c:\python24\Lib\site-packages\pyparsing.py", line 1357, in parseImpl
          loc, exprtokens = e.parse( instring, loc, doActions )
        File "c:\python24\Lib\site-packages\pyparsing.py", line 547, in parse
          loc,tokens = self.parseImpl( instring, loc, doActions )
        File "c:\python24\Lib\site-packages\pyparsing.py", line 1245, in parseImpl
          raise exc
      pyparsing.ParseException: Expected end of text (at char 4), (line:1, col:5)

      Now we have actually forced a ParseException to be raised.  If you catch this exception, you can extract from it the line and column where the failure occurred (although these are sometimes a bit misleading, as pyparsing isn't always good at picking the location of parse exceptions when they occur inside optional or repetitive expressions).
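
A minimal sketch of catching the exception and reading its location, using the same grammar and input as the traceback above. The variable names are illustrative, and the caveat about misleading locations still applies to more complex grammars:

```python
from pyparsing import ParseException, StringEnd, Word

text = "nnn nnn"
my_pattern = Word("n") + StringEnd()

try:
    my_pattern.parseString(text)
    remainder = ""
except ParseException as err:
    # loc is a 0-based character offset; lineno and col are 1-based.
    err_loc, err_line, err_col = err.loc, err.lineno, err.col
    # Slice from the failure offset to get the text that was not parsed.
    remainder = text[err_loc:]
    print("stopped at char %d (line %d, col %d)" % (err_loc, err_line, err_col))
    print("unparsed:", repr(remainder))
```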

      Hope that helps, and thanks again for using pyparsing!

      -- Paul

       
    • linuxconvert2
      2005-06-09

      Hi Paul,

      Thanks for your clear and supportive answer. StringEnd() solves my problem; so for me everything is clear.

      Best regards, Mathias