_trim_arity hides user exceptions

2016-01-17
2016-01-18
  • Michael Cohen

    Michael Cohen - 2016-01-17

    I received the following strange error:

       value = self._parseNoCache( instring, loc, doActions, callPreParse )
      File "/home/scudette/Dev/local/lib/python2.7/site-packages/pyparsing.py", line 1022, in _parseNoCache
        tokens = fn( instring, tokensStart, retTokens )
      File "/home/scudette/Dev/local/lib/python2.7/site-packages/pyparsing.py", line 770, in wrapper
        ret = func(*args[limit[0]:])
    TypeError: _make_attribute() takes exactly 2 arguments (1 given)
    

    Then I looked at the pyparsing source and was a little nauseated to see the _trim_arity function:

    'decorator to trim function calls to match the arity of the target'
    def _trim_arity(func, maxargs=2):
        if func in singleArgBuiltins:
            return lambda s,l,t: func(t)
        limit = [0]
        foundArity = [False]
        def wrapper(*args):
            while 1:
                try:
                    ret = func(*args[limit[0]:])
                    foundArity[0] = True
                    return ret
                except TypeError:
                    if limit[0] <= maxargs and not foundArity[0]:
                        limit[0] += 1
                        continue
                    raise
        return wrapper
    

    This code is suboptimal because:
    1) It assumes that a TypeError means the function was called with the wrong number of parameters. In fact, if a TypeError propagates from within the function, the wrapper just tries calling it again with fewer args, leading to a very confusing error message for the user (especially if the function is a lambda, which doesn't even have a name).

    2) It does this on every single call, which is unnecessary since the function prototype is not going to change at runtime!
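
The first point can be reproduced outside pyparsing. The following is a minimal sketch, assuming Python 3 and a simplified copy of the wrapper quoted above (the `singleArgBuiltins` shortcut is left out, and `old_trim_arity` and `add_one` are names made up for this illustration):

```python
def old_trim_arity(func, maxargs=2):
    """Simplified copy of the retry-on-TypeError wrapper quoted above."""
    limit = [0]
    found_arity = [False]

    def wrapper(*args):
        while True:
            try:
                ret = func(*args[limit[0]:])
                found_arity[0] = True
                return ret
            except TypeError:
                # Any TypeError is treated as "wrong number of arguments".
                if limit[0] <= maxargs and not found_arity[0]:
                    limit[0] += 1
                    continue
                raise
    return wrapper


# Hypothetical parse action with a genuine bug: it adds a str and an int.
add_one = old_trim_arity(lambda t: t[0] + 1)

try:
    add_one("aaa", 0, ["aaa"])
except TypeError as exc:
    masked_message = str(exc)

# The message complains about a missing argument, not about str + int.
print(masked_message)
```

The call that finally escapes is `func()` with no arguments, so the reported error describes the wrapper's last retry rather than the bug in the callback.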

    It would be much better to replace it with the following code:

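    import inspect

    def _trim_arity(func, maxargs=2):
        # Inspect the callback's argument list once, when the wrapper is created.
        func_args = inspect.getargspec(func).args
        # Ignore the instance argument of methods.
        if func_args[0] == "self":
            func_args.pop(0)
    
        # Return a fixed adapter for the (s, l, t) call made by the parser.
        if len(func_args) == 1:
            return lambda s, l, t: func(t)
        elif len(func_args) == 2:
            return lambda s, l, t: func(l, t)
        elif len(func_args) == 3:
            return func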
    

    This looks at the function's argument list at definition time and dispatches the correct wrapper for each case. It has minimal impact at runtime and, more importantly, does not disturb the Python traceback, so the user can see the important error messages about their callbacks:

      File "/home/scudette/Dev/local/lib/python2.7/site-packages/pyparsing.py", line 1022, in _parseNoCache
        tokens = fn( instring, tokensStart, retTokens )
      File "/home/scudette/rekall/tools/layout_expert/layout_expert/lib/parsers.py", line 17, in <lambda>
        return lambda s, l, t: func(t)
      File "/home/scudette/rekall/tools/layout_expert/layout_expert/parser/parser.py", line 430, in _make_attribute
        *expression))
    TypeError: type object argument after * must be a sequence, not CNumber
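
For readers on Python 3, where `inspect.getargspec` is no longer available in recent versions, the same definition-time idea might look roughly like the sketch below. It is an illustration only: `dispatch_by_arity` is a hypothetical name, it uses `inspect.signature` instead of `getargspec`, and it deliberately ignores `*args` callbacks and builtins that expose no signature.

```python
import inspect


def dispatch_by_arity(func):
    """Pick one fixed adapter for func, based on its positional parameter count."""
    params = inspect.signature(func).parameters.values()
    positional = [p for p in params
                  if p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)]
    count = len(positional)
    if count == 0:
        return lambda s, l, t: func()
    if count == 1:
        return lambda s, l, t: func(t)
    if count == 2:
        return lambda s, l, t: func(l, t)
    if count == 3:
        return func
    raise TypeError("parse action must take 0 to 3 positional arguments")


# Hypothetical one-argument callback; the adapter is chosen once, up front.
upper = dispatch_by_arity(lambda t: t[0].upper())
result = upper("ab", 0, ["ab"])
```

Because no exception handling is involved, a TypeError raised inside the callback reaches the caller unchanged.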
    
     
    • Paul McGuire

      Paul McGuire - 2016-01-18

      Michael -

      Thanks for posting your note - sorry to hear that Pyparsing's code was
      nausea-inducing.

      This approach to _trim_arity was actually provided by Raymond Hettinger,
      long-time Python luminary and author of a number of modules in the Python
      standard library (including my favorite, itertools). Raymond is nothing if
      not diligent about avoiding unnecessary processing or overhead in his code.
      In fact, when I first saw this, I had a reaction similar to yours. Not
      the nausea part, but the part about thinking that this checking would be
      done in every call to the parse action. However, since the argument limit is
      saved outside the wrapper function, the repetitive argument count testing
      only occurs on the first call to the parse action - once the correct number
      of arguments is determined, subsequent calls use that number from then on.
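
A small experiment can make the caching behaviour described above visible. This is only a sketch, assuming Python 3 and a simplified copy of the wrapper; `trim_arity_copy` and `CountingAction` are hypothetical names, the latter a helper that records how often the wrapper actually attempts a call:

```python
def trim_arity_copy(func, maxargs=2):
    """Simplified copy of the wrapper under discussion (builtin shortcut omitted)."""
    limit = [0]            # lives outside wrapper, so it persists between calls
    found_arity = [False]

    def wrapper(*args):
        while True:
            try:
                ret = func(*args[limit[0]:])
                found_arity[0] = True
                return ret
            except TypeError:
                if limit[0] <= maxargs and not found_arity[0]:
                    limit[0] += 1
                    continue
                raise
    return wrapper


class CountingAction:
    """Hypothetical helper: counts call attempts, then delegates to target."""
    def __init__(self, target):
        self.target = target
        self.attempts = 0

    def __call__(self, *args):
        self.attempts += 1
        return self.target(*args)


counter = CountingAction(lambda t: t[0].upper())
action = trim_arity_copy(counter)

action("abc", 0, ["abc"])
attempts_after_first = counter.attempts    # tried with 3, 2, then 1 argument

action("abc", 0, ["abc"])
attempts_after_second = counter.attempts   # a single further attempt
```

The first call probes three argument counts; the second call goes straight to the one that worked.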

      (My previous version of _trim_arity also used various introspection features
      to extract the arguments from the provided function, but this logic was
      quite fragile. There are a number of edge cases, beyond just the "skip over
      self if it is the first argument" one that you found, and the introspection
      calls had some incompatibilities between Py2 and Py3. My unit tests include
      several of these edge cases, and your straightforward proposed patch using
      inspect actually fails to pass them.)
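
To illustrate the kind of edge case meant here, the sketch below counts arguments by name in the style of the proposed patch. The callbacks are examples invented for this illustration, not the actual unit tests, and `inspect.getfullargspec` is used as a Python 3 stand-in for `getargspec`. The `names and` guard is also an addition; without it the first case would raise an IndexError.

```python
import inspect


def naive_arg_count(func):
    """Count parameters by name, in the style of the proposed patch."""
    names = list(inspect.getfullargspec(func).args)
    if names and names[0] == "self":
        names.pop(0)
    return len(names)


def variadic(*args):
    # Accepts (s, l, t), (l, t) or (t,), yet has no named parameters.
    return args[-1]


def with_default(s, l, t, debug=False):
    # Callable with three arguments, but four names are reported.
    return t


class Actions:
    def convert(this, t):
        # Bound method whose receiver is not called "self".
        return t


variadic_count = naive_arg_count(variadic)
default_count = naive_arg_count(with_default)
method_count = naive_arg_count(Actions().convert)
```

None of the three counts matches the number of arguments the parser should pass: the variadic function reports zero, the function with a default reports four, and the bound method reports two although it takes one.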

      However, inspired by your email and several other recent postings, I took
      another run at making _trim_arity able to differentiate between TypeErrors
      raised during arity testing and those real TypeErrors raised within the
      body of the parse action. I think I now have a working version, having
      tried this with my own test case:

      Word('a').setParseAction(lambda t: t[0]+1).parseString('aaa')

      This parse action raises a TypeError because it tries to add a string and an
      int. With the latest updates to _trim_arity, I now get the correct
      exception message:

      TypeError: cannot concatenate 'str' and 'int' objects

      Instead of the previous (and misleading)

      <lambda>() takes exactly 1 argument (0 given)
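
One way such a distinction could be made is sketched below. This is not necessarily the version checked into SVN; it is a minimal illustration that assumes Python 3 on CPython, and `trim_arity_sketch` is a hypothetical name. It relies on the fact that an arity mismatch is raised at the call site, before the callback's body starts, so the caught traceback holds only the wrapper's own frame. Callbacks implemented in C have no Python frame, so this heuristic would still misclassify errors raised inside them.

```python
def trim_arity_sketch(func, maxargs=2):
    limit = 0
    found_arity = False

    def wrapper(*args):
        nonlocal limit, found_arity
        while True:
            try:
                ret = func(*args[limit:])
                found_arity = True
                return ret
            except TypeError as exc:
                # The traceback starts at this frame; a further entry means
                # the error was raised after func's body had begun to run.
                raised_inside_func = exc.__traceback__.tb_next is not None
                if found_arity or raised_inside_func or limit > maxargs:
                    raise
                limit += 1
    return wrapper


# Same buggy action as in the test case above: str + int inside the lambda.
buggy = trim_arity_sketch(lambda t: t[0] + 1)
try:
    buggy("aaa", 0, ["aaa"])
except TypeError as exc:
    reported = str(exc)   # describes the str/int problem, not an arity problem
```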

      I've checked this version into the SourceForge SVN repository, and it will
      be included in the next Pyparsing release. You can extract it for yourself
      if you like and try it out.

      Thanks again for your post,

      -- Paul

