
#108 weird Or expression behavior

Milestone: v1.0 (example)
Status: closed
Owner: nobody
Labels: None
Priority: 5
Updated: 2018-03-31
Created: 2018-03-30
Private: No

I do not understand why the second expression here below

from pyparsing import Group, Or, Suppress

ab = ["a", "b"]
C = Group("c" + Suppress("d"))

# works
TOT = Or([Or(ab), C])
TOT.parseString("a")

# fails
TOT = Or(ab + [C])
TOT.parseString("a")

fails to build with the following traceback:

Traceback (most recent call last):
  File ".../scrap.py", line 38, in <module>
    simple_bug()
  File ".../scrap.py", line 34, in simple_bug
    TOT = Or(ab + [C])
  File "python-path\lib\site-packages\pyparsing.py", line 3439, in __init__
    self.mayReturnEmpty = any(e.mayReturnEmpty for e in self.exprs)
  File "python-path\lib\site-packages\pyparsing.py", line 3439, in <genexpr>
    self.mayReturnEmpty = any(e.mayReturnEmpty for e in self.exprs)
AttributeError: 'str' object has no attribute 'mayReturnEmpty'

Discussion

  • Paul McGuire

    Paul McGuire - 2018-03-31

    Pyparsing tries to be helpful when creating expressions with lists, to intuit what the developer's intent is. When passing a list of strings to Or (or And, Each, or any other ParseExpression subclass), it will automatically convert them to a list of Literals. This is similar to the auto-conversion to Literal when adding a string to an expression, as in something like

    negative_int = Combine('-' + Word(nums))
    

    Pyparsing will auto-convert the '-' string to a Literal, and then use the '+' to create an And. If not using the operators, this would look like:

    negative_int = Combine(And([Literal('-'), Word(nums)]))
    

    Similarly, when you call Or(ab), and ab is a list of strings, pyparsing will convert the 'a' and 'b' strings to Literal('a') and Literal('b'), giving you the same as if you had called Or([Literal('a'), Literal('b')]).

    But in your second case, you are passing a list containing 2 strings and an expression. At this point, pyparsing's helpfulness gives up. It assumes that you have provided this weird list for a special reason, so the strings do not get auto-converted to Literals. That is why you get that exception, when the given strings do not have the attributes expected of a ParserElement.
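
    The all-or-nothing rule described above can be modelled in a few lines of plain Python. This is only a simplified sketch of the behavior as explained in this reply, not pyparsing's actual source: ToyLiteral, toy_convert and may_return_empty are hypothetical names chosen for illustration.

```python
# Simplified model of the list handling described above.
# NOT pyparsing's source: ToyLiteral and toy_convert are hypothetical names.

class ToyLiteral:
    """Stand-in for a parser element that wraps a literal string."""
    may_return_empty = False

    def __init__(self, text):
        self.text = text


def toy_convert(exprs):
    """Wrap the items in ToyLiteral only when every item is a string."""
    exprs = list(exprs)
    if all(isinstance(e, str) for e in exprs):
        return [ToyLiteral(e) for e in exprs]
    # Mixed list: left untouched, so any plain strings stay plain strings.
    return exprs


all_strings = toy_convert(["a", "b"])
mixed = toy_convert(["a", "b", ToyLiteral("c")])

all_strings_ok = all(isinstance(e, ToyLiteral) for e in all_strings)
leftover_strings = [e for e in mixed if isinstance(e, str)]

# Reading a parser attribute from a leftover string fails in the same way
# as the AttributeError in the traceback.
try:
    any(e.may_return_empty for e in mixed)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```

    In this model the all-string list is fully wrapped, while the mixed list keeps its two raw strings, and the first attribute lookup on one of them raises AttributeError.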

    The solution is to move away from your explicit style of calling the Or, And, MatchFirst, etc. classes and use the operator definitions created by pyparsing.

    Then your expression would be clearer as well:

    ab = Literal('a') ^ Literal('b')
    

    or if you prefer:

    a = Literal('a')
    b = Literal('b')
    ab = a ^ b
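
    Applied to the expression from the original question, the operator style might look like the sketch below. The TOT_list variant, which wraps each string in Literal explicitly before building the list, is an alternative that is not part of this reply; it is shown only as one way to keep the list-based call.

```python
from pyparsing import Group, Literal, Or, Suppress

# Operator style: '^' builds an Or from two parser elements.
ab = Literal("a") ^ Literal("b")
C = Group("c" + Suppress("d"))
TOT = ab ^ C

# Alternative (an assumption, not from the reply): keep the explicit Or(...)
# call, but wrap every string in Literal so the list holds only parser elements.
TOT_list = Or([Literal(s) for s in ["a", "b"]] + [C])

result_a = TOT.parseString("a").asList()
result_cd = TOT.parseString("cd").asList()
result_list_a = TOT_list.parseString("a").asList()
```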
    

    You just need to be careful not to do something like this by accident:

    ab = "a" ^ "b"
    

    At which point Python raises a TypeError, since the ^ operator is not defined for strings.

    This auto-convert-to-Literal behavior does cause problems sometimes, if you want the And of literals 'a' and 'b', and write:

    ab = "a" + "b"
    

    when you mean And([Literal('a'), Literal('b')]). Since neither term of the '+' operator is a ParserElement, there is no auto-conversion to Literal. But since the '+' operator is defined for Python strings, what you will get instead is, of course, the Python string "ab".
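
    Both pitfalls can be checked without pyparsing at all, because neither operand is a parser element and only Python's own string operators are involved. A minimal check (the variable names are illustrative only):

```python
# Neither operand is a pyparsing object, so Python's built-in string
# operators decide what happens.

concatenated = "a" + "b"  # plain string concatenation, not an And expression

try:
    "a" ^ "b"  # str defines no ^ operator
    xor_error = None
except TypeError as exc:
    xor_error = exc
```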

    Instead you must make one or both of the terms a ParserElement:

    ab = Literal("a") + "b"
    

    or:

    a = Literal('a')
    b = Literal('b')
    ab = a + b
    
     
    • Sebastien de Menten

      thank you Paul for your very detailed answer!

       
  • Paul McGuire

    Paul McGuire - 2018-03-31
    • status: open --> closed
     
