
pyparsing seems to treat numbers differently than chars?

Dan Strohl
2016-02-12
2016-05-31
  • Dan Strohl

    Dan Strohl - 2016-02-12

    (using: pyparsing 2.1.0 (also tried 2.0.7) and Python 3.5.0)

    While playing around with pyparsing in preparation for using it, I started running into some weird cases where it seemed to work incorrectly. After trying some different approaches, I found that, at least for me, using digits in Word() seems to behave differently than using letters. For example:



    The example that started it: I was trying to parse numbers from 10 to 59.

    Python 3.5.0 (v3.5.0:374f501f4567, Sep 13 2015, 02:16:59) [MSC v.1900 32 bit (Intel)] on win32
    >>> import pyparsing
    >>> from pyparsing import *
    >>> x1 = Word('1234567890', max=1)
    >>> x2 = Word('12345', max=1)
    >>> n = x2 + x1 + StringEnd()
    >>> n.parseString('12')
    Traceback (most recent call last):
      File "<input>", line 1, in <module>
      File "C:\Users\dstrohl\Documents\PyCharm Projects\VEnvs\is_email\lib\site-packages\pyparsing.py", line 1129, in parseString
        raise exc
      File "C:\Users\dstrohl\Documents\PyCharm Projects\VEnvs\is_email\lib\site-packages\pyparsing.py", line 1119, in parseString
        loc, tokens = self._parse( instring, 0 )
      File "C:\Users\dstrohl\Documents\PyCharm Projects\VEnvs\is_email\lib\site-packages\pyparsing.py", line 993, in _parseNoCache
        loc,tokens = self.parseImpl( instring, preloc, doActions )
      File "C:\Users\dstrohl\Documents\PyCharm Projects\VEnvs\is_email\lib\site-packages\pyparsing.py", line 2390, in parseImpl
        loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
      File "C:\Users\dstrohl\Documents\PyCharm Projects\VEnvs\is_email\lib\site-packages\pyparsing.py", line 997, in _parseNoCache
        loc,tokens = self.parseImpl( instring, preloc, doActions )
      File "C:\Users\dstrohl\Documents\PyCharm Projects\VEnvs\is_email\lib\site-packages\pyparsing.py", line 1807, in parseImpl
        raise ParseException(instring, loc, self.errmsg, self)
    pyparsing.ParseException: Expected W:(1234...) (at char 1), (line:1, col:2)
    


    So then I tried it with text from another example, which worked:

    >>> a1 = Word('abcde', max=1)
    >>> a2 = Word('fghij', max=1)
    >>> b = a1 + a2
    >>> b.parseString('af')
    (['a', 'f'], {})
    


    And tried combining text and numbers, which worked:

    >>> b1 = Word('abcde12345')
    >>> b2 = Word('jklmn67890')
    >>> c = b1 + b2
    >>> c.parseString('18')
    (['1', '8'], {})
    


    So then I tried the "nums" constant (maybe I was structuring my strings wrong?).
    (OK, I know that the below would work better as d = Word(nums, max=2), but I was trying to duplicate my problem <grin>)

    >>> d = Word(nums, max=1) + Word(nums, max=1)
    >>> d.parseString('12')
    Traceback (most recent call last):
      File "<input>", line 1, in <module>
      File "C:\Users\dstrohl\Documents\PyCharm Projects\VEnvs\is_email\lib\site-packages\pyparsing.py", line 1129, in parseString
        raise exc
      File "C:\Users\dstrohl\Documents\PyCharm Projects\VEnvs\is_email\lib\site-packages\pyparsing.py", line 1119, in parseString
        loc, tokens = self._parse( instring, 0 )
      File "C:\Users\dstrohl\Documents\PyCharm Projects\VEnvs\is_email\lib\site-packages\pyparsing.py", line 993, in _parseNoCache
        loc,tokens = self.parseImpl( instring, preloc, doActions )
      File "C:\Users\dstrohl\Documents\PyCharm Projects\VEnvs\is_email\lib\site-packages\pyparsing.py", line 2390, in parseImpl
        loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
      File "C:\Users\dstrohl\Documents\PyCharm Projects\VEnvs\is_email\lib\site-packages\pyparsing.py", line 997, in _parseNoCache
        loc,tokens = self.parseImpl( instring, preloc, doActions )
      File "C:\Users\dstrohl\Documents\PyCharm Projects\VEnvs\is_email\lib\site-packages\pyparsing.py", line 1807, in parseImpl
        raise ParseException(instring, loc, self.errmsg, self)
    pyparsing.ParseException: Expected W:(0123...) (at char 1), (line:1, col:2)
    


    So... any thoughts? Am I doing something wrong here?

     

    Last edit: Dan Strohl 2016-02-12
  • Dan Strohl

    Dan Strohl - 2016-02-12

    OK, I don't think it is a bug; I am doing something wrong... I just tried some more things:

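    from pyparsing import Word

    w = Word('0123456789', max=1)
    x = Word('12345', max=1)
    
    y = x+w
    
    # '1' is in x but '6' is not, so x stops after one character
    z = y.parseString('16')
    print(z)
    
    d = Word('0123456789', max=1)
    v = Word('12345', max=1)
    
    print(d.parseString('1'))
    print(v.parseString('2'))
    j = d+v
    t = x+w
    
    print(j.parseString('12'))  # first call that fails for me (ParseException)
    print(t.parseString('12'))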
    

    It worked for everything up to print(j.parseString('12')).

    In playing around some more, I think that "y" worked because in "16", the "1" is in x but the "6" is not, so it moved on... For "12", the "1" and "2" are both in x, but the max=1 is there, so it errored out....

    My expectation was that it would try x, error out, then try y, not try x and error out without trying y... so... I am sure I am missing something fundamental here. Can you help?

     

    Last edit: Dan Strohl 2016-02-12
    • Dan Strohl

      Dan Strohl - 2016-02-12

      aaannnddddd.... when I replace the "max=1" with "exact=1", it seems to work... so... bug or id(10).t error?

       
  • Paul McGuire

    Paul McGuire - 2016-05-31

    This seems the logical behavior to me: "max" means "no more than", and in parsing "12", you had "more than", so "fail". Whereas "exact" implies that "more than this is not necessarily bad, just parse this exact amount." (Sorry for the delay in responding, I don't visit this Discussion board very often.) Also, each expression in the grammar has to pass or fail on its own; it doesn't look ahead to the next expression to see if there should be some partial match in order to leave a valid bit for the next expression in the grammar. You can implement this kind of lookahead using NotAny and FollowedBy expressions, but it has to be explicit.
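
The difference between "max" and "exact" described above can be sketched with a small toy model in plain Python. This is not pyparsing's code: match_word, parse_sequence and NoMatch are hypothetical names made up for illustration, and the model only mirrors the behavior reported in this thread (pyparsing 2.1.0).

```python
from functools import partial

# Toy model only; match_word, parse_sequence and NoMatch are hypothetical
# helpers, not part of pyparsing.


class NoMatch(Exception):
    """Raised when a toy matcher fails at a given position."""


def match_word(chars, text, loc, max=0, exact=0):
    """Match a run of characters drawn from `chars`, starting at `loc`.

    max=N   -> the run in the input may be no longer than N, otherwise fail.
    exact=N -> take exactly N characters and leave the rest for the next matcher.
    Returns (matched_text, new_loc).
    """
    limit = exact or max or len(text)
    end = loc
    while end < len(text) and end - loc < limit and text[end] in chars:
        end += 1
    if end == loc or (exact and end - loc < exact):
        raise NoMatch("expected one of %r at char %d" % (chars, loc))
    # The "max" rule: another matching character right after the limit means
    # the word in the input is too long, so this matcher fails on its own.
    if max and end < len(text) and text[end] in chars:
        raise NoMatch("more than %d of %r at char %d" % (max, chars, end))
    return text[loc:end], end


def parse_sequence(matchers, text):
    """Run each matcher in turn; no matcher looks ahead for the next one."""
    loc, tokens = 0, []
    for matcher in matchers:
        token, loc = matcher(text, loc)
        tokens.append(token)
    return tokens


tens_max = partial(match_word, "12345", max=1)
ones_max = partial(match_word, "0123456789", max=1)
tens_exact = partial(match_word, "12345", exact=1)
ones_exact = partial(match_word, "0123456789", exact=1)

print(parse_sequence([tens_max, ones_max], "16"))      # '6' is not in "12345"
print(parse_sequence([tens_exact, ones_exact], "12"))  # exact=1 takes one char
try:
    parse_sequence([tens_max, ones_max], "12")         # '2' is also in "12345"
except NoMatch as err:
    print("failed:", err)
```

In this model the first matcher with max=1 rejects "12" by itself, before the second matcher is ever tried, which is the "pass or fail on its own" point. With exact=1 the first matcher takes a single character and hands the rest to the next one.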

     
