
#42 Extension methods are not pickleable in Python 3

open
nobody
None
7
2019-06-18
2019-05-27
Saim Raza
No

Functions defined at global level in an extension module are not pickleable. In pure Python modules, module-level functions are pickled by name reference. The following example is taken from the Demo directory that ships with the source code. I am working on a Linux machine and writing extensions for Python 3.7.

In [1]: import simple

In [2]: simple
Out[2]: <module 'simple' from '<pycxx_dir>/obj/simple.so'>

In [3]: pickle.dumps(simple.func)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-2284e569486f> in <module>
----> 1 pickle.dumps(simple.func)

TypeError: can't pickle PyCapsule objects

This is because simple.func.__reduce__() returns a tuple that includes PyCapsule objects, which are not pickleable.

In [1]: simple.func.__reduce__()
Out[1]:
(<function getattr>,
 ((<capsule object NULL at 0x7f8851d77cf0>,
   <capsule object NULL at 0x7f8851d77450>),
  'func'))

Also, the name of the function defined in the extension module is wrong.

In [2]: simple.func
Out[2]: <function simple.tuple.func>
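
Both symptoms are consistent with how CPython treats built-in functions: when `__self__` is a module, `__reduce__` returns just the function name and `__qualname__` is the bare name; when `__self__` is any other object, `__reduce__` returns a `getattr` call on that object and `__qualname__` is prefixed with the object's type name. The sketch below illustrates this with standard-library objects only. `math.sqrt` and `tuple.count` are stand-ins chosen for illustration; they are not part of the PyCXX demo.

```python
import math

# Module-level built-in: __self__ is the module, so pickle can store it by name.
print(type(math.sqrt.__self__).__name__)  # module
module_reduce = math.sqrt.__reduce__()
module_qualname = math.sqrt.__qualname__
print(module_reduce, module_qualname)

# Built-in bound to a non-module object (here a tuple): __reduce__ asks pickle
# to serialize the bound object itself, and the qualname gains a type prefix.
items = (1, 2, 3)
bound = items.count
bound_reduce = bound.__reduce__()
bound_qualname = bound.__qualname__
print(bound_reduce[0] is getattr, bound_reduce[1], bound_qualname)
```

If the function object is created with a tuple as its self argument, the second pattern applies, which would explain both the `getattr` tuple in `__reduce__()` and the `simple.tuple.func` name.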

Discussion

  • Barry Alan Scott

    I can repro what you see.

    However I'm not clear that pickle support is usual for C++ extension functions.
    I did try with the PyQt5 code and cannot pickle its functions either.

    In the case of class instances the problem gets harder. Without help from the
    objects themselves it is not possible to know what state to pickle, or even whether pickling makes sense; think of an object that holds a database connection.

    The docs do say that not all objects can be pickled, and that the code is not pickled at all.

    If you implement the __getstate__ and __setstate__ methods on a class you may have more success. I have not done that experiment myself.

    >>> from PyQt5 import QtCore
    >>> import pickle
    >>> pickle.dumps(x.majorVersion)
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: can't pickle QVersionNumber objects
    >>> x.majorVersion
    <built-in method majorVersion of QVersionNumber object at 0x10e1d8748>
    >>> x.majorVersion()
    0

     

    Last edit: Barry Alan Scott 2019-05-29
  • Saim Raza

    Saim Raza - 2019-05-30

    The problem is that I should be able to use a function as a part of dumping an instance. My use case is to call the function from Python to create an object after doing computations in C++. Thus I need the function to be pickleable.

    To answer whether pickle support is usual for C++ extension functions: extension methods defined as described in the CPython documentation are indeed pickleable. See doc for more details.

    Further, simple.func should remain a function of the simple module. Pycxx wrongly injects it as a member of tuple type.

     
  • Barry Alan Scott

    Specifically which part of the docs did you want to draw to my attention?

    It sounds to me like you want to pickle the return value from the function, not the function?

    For your returned object to be picklable it seems that you need to implement
    the pickle protocol as documented in https://docs.python.org/3/library/pickle.html

    I suspect that I need to implement METH_CLASS in pycxx to allow the unpickling to work.

    I am baffled by the "tuple"; it's not apparent in the PyCXX code. It will take some debugging to track down where it comes from.

    (I tried reading the code of _sre to see how it does pickling, but I have not found where its
    __reduce__ code is implemented yet.)

     
  • Saim Raza

    Saim Raza - 2019-05-30

    I will try to illustrate my use case below.

    C++: Say we have an extension called simple.

    // We define a class named ClassInExtension and a function named func in simple extension module.
    
    PyObject* func() {
        // Do some heavy computations.
        // Create and return an instance of ClassInExtension.
    }
    

    Suppose func can't be added as the __reduce__ method of ClassInExtension.

    Python:

    from simple import func
    import pickle
    
    class Foo(object):
        def __init__(self, func, x):
            self.x = func(x)
    
        def __reduce__(self):
            return self.__class__, (func, self.x)
    
    foo = Foo(func, 123)
    pickle.dumps(foo) # Will dump (__main__.Foo, (<function simple.func(x)>, 123))
    
    # To see what is dumped on pickle.dumps(foo), we can execute foo.__reduce__().
    

    pickle.dumps(foo) is what breaks with PyCXX + Python 3. This works fine with PyCXX + Python 2, though.

    You asked: "Specifically which part of the docs did you want to draw to my attention?"

    In the doc, we define spam.system as a method in the extension named spam. Pickling of this function works fine in both Python 2 and 3.

    I tried debugging the PyCXX code. The reason is that ExtensionModule::initialize switched from PyCObject_FromVoidPtr in Python 2 to PyCapsule_New in Python 3 when initializing functions.

    Python 2:

    Tuple args( 2 );
    args[0] = Object( self, true );
    args[1] = Object( PyCObject_FromVoidPtr( method_def, do_not_dealloc ), true );
    
    assert( m_module != NULL );
    PyObject *func = PyCFunction_NewEx
                        (
                        &method_def->ext_meth_def,
                        new_reference_to( args ),
                        m_module
                        );
    

    Python 3:

    Tuple args( 2 );
    args[0] = Object( self, true );
    args[1] = Object( PyCapsule_New( method_def, NULL, NULL ), true );
    
    assert( m_module != NULL );
    PyObject *func = PyCFunction_NewEx
                        (
                        &method_def->ext_meth_def,
                        new_reference_to( args ),
                        m_module
                        );
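
To see why a capsule inside that args tuple blocks pickling, here is a minimal sketch that does not need PyCXX at all. It assumes CPython, where `datetime.datetime_CAPI` happens to be a PyCapsule (an implementation detail, used here only as a convenient stand-in for the capsule that wraps `method_def`). The `pickles` helper is hypothetical and exists only for this illustration.

```python
import datetime
import pickle

# Assumption: on CPython this attribute is a PyCapsule; elsewhere it may be absent.
capsule = getattr(datetime, "datetime_CAPI", None)

def pickles(obj):
    """Return True if obj can be serialized with pickle, False otherwise."""
    try:
        pickle.dumps(obj)
    except (TypeError, pickle.PicklingError):
        return False
    return True

# A tuple of plain objects pickles without trouble.
print(pickles(("self", "method_def")))

# A tuple shaped like (self, capsule) fails as soon as pickle reaches the capsule.
if capsule is not None:
    print(type(capsule).__name__, pickles(("self", capsule)))
```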
    
     

    Last edit: Saim Raza 2019-05-30
  • Barry Alan Scott

    The use of PyCapsule has been required since Python 3.1; that is not the problem in itself.

    So if I understand it pickle will pickle the reduce function into the pickle stream so
    that the unpickling works.

    Do you have an example that fully works to pickle and unpickle in Python 2?
    Can you share the code with me to debug with?

    This may be a hard enough problem that I will not have enough free time to fix it quickly.

    In the meantime, can you solve your pickling problem by sidestepping this part of pickling?
    You could have a function on your object that converts the object into simple Python objects
    and pickle that. Then when you unpickle, pass that back to an object constructor?
    This would be the implementations of __getstate__ and __setstate__ that you
    need anyway.
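
A minimal pure-Python sketch of that suggestion follows. The `Connection` class, its fields, and the host name are hypothetical stand-ins for an extension object that holds something unpicklable (a lock is used here); an object implemented with PyCXX would need the equivalent methods written in C++.

```python
import pickle
import threading

class Connection:
    """Hypothetical object holding state that cannot be pickled directly."""

    def __init__(self, host, port):
        self.host = host
        self.port = port
        self._lock = threading.Lock()  # locks are not picklable

    def __getstate__(self):
        # Reduce the object to simple Python values only.
        return {"host": self.host, "port": self.port}

    def __setstate__(self, state):
        # Rebuild the full object, recreating the unpicklable part.
        self.host = state["host"]
        self.port = state["port"]
        self._lock = threading.Lock()

original = Connection("db.example.invalid", 5432)
restored = pickle.loads(pickle.dumps(original))
print(restored.host, restored.port)
```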

     
  • Barry Alan Scott

    I had some inspiration. See Demo/Python3/pickle_test.py and the changes to Demo/Python3/simple.cxx. Added in commit R414.

    Run with:

    $ ./build-unlimited-api.sh python3.7
    $ PYTHONPATH=obj python3.7 Demo/Python3/pickle_test.py

    Seems I only needed to change the c'tor and add __reduce__.

     
  • Saim Raza

    Saim Raza - 2019-06-10

    Tests in pickle_test.py only test that a class defined in an extension is pickleable. However, the issue here is that any method defined as part of the simple module is unpickleable.

    The following tests should pass for this bug to be resolved:

    assert pickle.loads(pickle.dumps(simple.func)) is simple.func # test keyword method is pickleable
    assert pickle.loads(pickle.dumps(simple.old_style_class)) is simple.old_style_class # test varargs method is pickleable
    
    # Also add test for noargs method
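
For comparison, the same identity check can be run against C-implemented functions from the standard library, which is the behaviour being asked for here. This is only a sketch: `round_trips_by_reference` is a hypothetical helper, and `math.sqrt`, `math.hypot`, and `time.time` are stand-ins for the `simple` functions, which are not imported.

```python
import math
import pickle
import time

def round_trips_by_reference(func):
    """True if every pickle protocol returns the very same function object."""
    for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
        if pickle.loads(pickle.dumps(func, protocol)) is not func:
            return False
    return True

# Stand-ins for simple.func and friends; replace with the extension functions to test PyCXX.
stand_ins = [math.sqrt, math.hypot, time.time]
results = {f.__name__: round_trips_by_reference(f) for f in stand_ins}
print(results)
```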
    
     
  • Barry Alan Scott

    I do not understand why pickling the function is required for the use case of pickling objects,
    which I have shown can be made to work with no changes.

    Until I understand why it's a real-world problem I do not plan to work towards a fix.

     
  • Saim Raza

    Saim Raza - 2019-06-17

    The following example will try to demonstrate a real world use case. I have a class MyComputations which delegates heavy computation to C++. The class saves the functions it has to call, and we execute those functions (defined in the extension module) by calling its method do_computations.

    I have used the simple extension module which comes as a part of pycxx source distribution to make things simple.

    import simple # extension module defined using pycxx
    
    class MyComputations(object):
        def __init__(self, func1, func2):
            self.func1 = func1
            self.func2 = func2
    
        def do_computations(self, *args, **kwargs):
            return [self.func1(*args), self.func2(*args, **kwargs)]
    
        def __reduce__(self):
            return MyComputations, (self.func1, self.func2)
    
    mycomp = MyComputations(simple.mod_func_varargs, simple.mod_func_keyword)
    mycomp.do_computations(1, 2, 3, a=4, b=5)
    

    OUTPUT:

    In [29]: mycomp.do_computations(1, 2, 3, a=4, b=5)
    mod_func_varargs Called with 3 normal arguments.
    mod_func_varargs Called with 3 normal arguments.
    and with 2 keyword arguments:
        a
        b
    arg 1 is not a new_style_class
    Out[29]: [None, None]
    

    Now calling __reduce__ on MyComputations will return a tuple containing references to the functions passed to the class. Python's native pickling mechanism will try to pickle these functions and is able to do so successfully.

    In [30]: mycomp.__reduce__()
    Out[30]:
    (__main__.MyComputations,
     (<function simple.mod_func_varargs>, <function simple.mod_func_keyword>))
    

    We can see that pickling of mycomp object works.

    In [31]: pickle.loads(pickle.dumps(mycomp))
    Out[31]: <__main__.MyComputations at 0x7faddedc87d0>
    

    This is what is actually broken in Python 3 with PyCXX.

    Note: We can work around this problem by pickling only the names of the functions passed to the MyComputations class and importing them from the simple module by name. The following implementation avoids this limitation (bug?) of PyCXX in Python 3.

    import pickle
    import simple
    
    class MyComputations(object):
        def __init__(self, func1_name, func2_name):
            self.func1_name = func1_name
            self.func2_name = func2_name
            self.func1 = getattr(simple, func1_name)
            self.func2 = getattr(simple, func2_name)
    
        def do_computations(self, *args, **kwargs):
            return [self.func1(*args), self.func2(*args, **kwargs)]
    
        def __reduce__(self):
            return MyComputations, (self.func1_name, self.func2_name)
    
    mycomp = MyComputations("mod_func_varargs", "mod_func_keyword")
    mycomp.do_computations(1, 2, 3, a=4, b=5)
    mycomp.__reduce__()
    pickle.loads(pickle.dumps(mycomp)) # Works in both Python 2 and 3
    

    This is indeed the gist of what I have done with my code. However, this is a hack. I should be able to treat functions as functions, as native Python supports.

    Please let me know if you need more clarification.
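
The name-based workaround above can also be packaged as a small reusable wrapper, so that calling code keeps passing callables around rather than strings. The sketch below is one possible shape, not part of PyCXX: `FunctionRef` is a hypothetical name, and `math.sqrt` stands in for a function from the `simple` module.

```python
import importlib
import math
import pickle

class FunctionRef:
    """Callable wrapper that pickles as (module name, attribute name)."""

    def __init__(self, module_name, attr_name):
        self.module_name = module_name
        self.attr_name = attr_name
        # Resolve the real function once, at construction or unpickling time.
        self._func = getattr(importlib.import_module(module_name), attr_name)

    def __call__(self, *args, **kwargs):
        return self._func(*args, **kwargs)

    def __reduce__(self):
        # Only the two names go into the pickle stream, never the function.
        return FunctionRef, (self.module_name, self.attr_name)

ref = FunctionRef("math", "sqrt")  # stand-in for ("simple", "mod_func_varargs")
clone = pickle.loads(pickle.dumps(ref))
print(clone(9.0))
```

The trade-off is the same as in the workaround: the wrapper is a different object from the function it wraps, so identity checks such as `clone is math.sqrt` will not hold.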

     
  • Barry Alan Scott

    Thanks for taking the time to explain why this matters.

    I suspect this is going to take a while for me to understand what has to be changed to make this work.

    You should stay with your workaround in the meantime.

     
