Several modifications, please comment!

2009-11-15
2013-05-02
  • Daniel Filonik
    2009-11-15

    First, thank you for a great Python module, I believe it has a lot of potential! I was never quite happy with pickle because of its cryptic format. The XML schema you use is so much nicer to read.

    That said, I've had some thoughts about how to make Pyxser even better. I am working on an application that exposes several classes through boost::python. My goal was to be able to use Pyxser for serialization of these exposed classes. Before I go into more detail, a short disclaimer: the changes I propose are geared towards accomplishing that specific goal, so there may be problems that I didn't think of. I would very much appreciate comments.

     
  • Daniel Filonik
    2009-11-15

    First, a couple of smaller things when it comes to searching in modules:

    Requiring an "\__all\__" field in a Python module in order for Pyxser to search it seems unnecessary to me, and it introduces some problems later on. For that reason I suggest changing __pyxser_SearchModuleType__ so that this:

    objKeys = PyObject_GetAttrString(mod, pyxser_attr_all);
    dict = PyObject_GetAttrString(mod, pyxser_attr_dict);

    Becomes this:

    dict = PyObject_GetAttrString(mod, pyxser_attr_dict);
    objKeys = PyDict_Keys(dict);
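
    The effect of this change can be illustrated at the Python level. The following is only a sketch, not pyxser code: the module `example_mod` and its contents are made up for the example. It shows that a module without "\__all\__" still exposes every name through the keys of its "\__dict\__".

```python
import types

# Hypothetical module built in memory; it deliberately defines no __all__.
mod = types.ModuleType("example_mod")
exec("class Point(object):\n    pass\n\ndef helper():\n    return 1\n", mod.__dict__)

# A lookup that depends on __all__ finds nothing for such a module ...
has_all = hasattr(mod, "__all__")

# ... while the keys of __dict__ are always available.
names = sorted(k for k in mod.__dict__ if not k.startswith("__"))

# Only classes are candidates when searching a module for a type by name.
classes = [k for k in names if isinstance(mod.__dict__[k], type)]
```

    Note that "\__dict\__" also contains private and imported names, so a search based on it sees more than one based on "\__all\__" would.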

    Further, I believe that in __pyxser_SearchObjectInMain__ the following statement:

        Py_XDECREF(item);

    Should be removed, because it leads to symbols in the main module being deleted.

     
  • Daniel Filonik
    2009-11-16

    Now, some of the more interesting changes that are necessary to be able to serialize objects of classes exposed through boost::python. First of all, the problem is that the members of such classes are not stored in "\__dict\__". That makes it impossible to use the current way of getting class members in __pyxser_SerializeXml__:

        lstItems = PyObject_GetAttrString(o, pyxser_attr_dict);

    Instead, to provide as much flexibility as possible, I suggest making this part configurable by allowing the user to provide a function that returns a dictionary of all members that should be serialized, together with their values. This change requires the most modifications to the existing code. First, a new field is added to __pythonSerializationArguments__:

        typedef struct pythonSerializationArguments_ {
            PyObject **item;
            PyObject **ck;
            PyObject **o;
            xmlDocPtr *docptr;
            xmlNodePtr *rootNode;
            xmlNodePtr *currentNode;
            PyListObject **dupSrcItems;
            char *enc;
            char *name;
            int *depth;
            int *depthcnt;
            PyObject **selector; // This will hold the function that selects members
        } PythonSerializationArguments;

    Now, it is necessary to change __pyxserxml__, __pyxserxmlc14n__, __pyxserxmlc14nstrict__, __u\_pyxserxml__, __u\_pyxserxmlc14n__ and __u\_pyxserxmlc14nstrict__ to allow the user to pass this additional argument to them. Most notably:

        PyObject *input = (PyObject *)NULL;
        PyListObject *dupItems = (PyListObject *)NULL;
        PyObject *selector = (PyObject *)NULL;
        /* … */
        static char *kwlist[] = {"obj", "enc", "depth", "selector", NULL};
        /* … */
        ok = PyArg_ParseTupleAndKeywords(args, keywds, "Os|iO", kwlist,
                                         &input, &in_enc, &py_depth, &selector);
        /* … */
        sargs.o = &input;
        sargs.docptr = &docXml;
        sargs.rootNode = &rootNode;
        sargs.currentNode = &rootNode;
        sargs.dupSrcItems = &dupItems;
        sargs.enc = py_enc;
        sargs.depth = &py_depth;
        sargs.depthcnt = &py_depth_cnt;
        sargs.selector = &selector;

        serXml = pyxser_SerializeXml(&sargs);

        Py_XDECREF(input);
        Py_XDECREF(dupItems);
        Py_XDECREF(selector);

    Finally, we can replace the aforementioned piece of code in __pyxser_SerializeXml__ with:

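        PyObject *selector = *args->selector;
        if (selector != NULL) {
            /* Call the user-supplied selector with the object as its only argument. */
            arglist = Py_BuildValue("(O)", o);
            lstItems = PyObject_CallObject(selector, arglist);
            Py_XDECREF(arglist); /* release the temporary argument tuple */
        } else {
            /* No selector given: keep the original __dict__ lookup. */
            lstItems = PyObject_GetAttrString(o, pyxser_attr_dict);
        }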

    Here is an example of how this facility can be used from Python in order to get the members of objects whose classes are exposed through boost::python:

        import inspect
        import pyxser
        from boostpython import Object

        obj = Object()

        # Select only the public, non-callable members of the object.
        selector = lambda o: dict((name, value) for name, value in inspect.getmembers(o)
                                  if not (name.startswith('__') or callable(value)))

        serialized = pyxser.serialize(obj = obj, enc = "utf-8", depth = 0, selector = selector)

    With these changes it is possible to serialize objects of classes exposed by boost::python.
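
    For readers without a boost::python module at hand, here is a minimal stand-alone sketch of what such a selector returns. It is an illustration under assumptions, not pyxser code: `Sensor` is a hypothetical class that uses `__slots__` as a stand-in for an extension class without "\__dict\__", and `members_for` is a made-up helper that mirrors the selector branch in plain Python.

```python
import inspect


class Sensor(object):
    """Stand-in for an extension class whose members are not in __dict__."""
    __slots__ = ("name", "reading")

    def __init__(self):
        self.name = "probe"
        self.reading = 21.5

    def reset(self):
        self.reading = 0.0


def select_members(o):
    """Return the public, non-callable members of o as a dict."""
    return dict(
        (name, value)
        for name, value in inspect.getmembers(o)
        if not (name.startswith("__") or callable(value))
    )


def members_for(o, selector=None):
    """Use the selector if one is given, else fall back to __dict__."""
    if selector is not None:
        return selector(o)
    return dict(o.__dict__)


sensor = Sensor()
members = select_members(sensor)
```

    Calling `members_for(sensor)` without a selector raises an AttributeError here, because the object has no "\__dict\__" at all, which is the same situation the selector argument is meant to solve.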

     
  • Daniel Filonik
    2009-11-16

    While serialize works now, unserialize needs some additional work. This change is the one I am least happy about, because it imposes a requirement on the classes being serialized: they need to be default constructible. This is something that I wanted to avoid, but I could not find an easy way around it.

    The problem is that __pyxser\_UnserializeXml__ and __pyxser\_UnserializeElement__ currently rely on:

        PyObject *ndict = (PyObject *)NULL;
        /* … */
        ndict = PyDict_New();
        unser = PyInstance_NewRaw(ct, ndict);
        Py_XDECREF(ndict);

    To create instances without invoking a constructor. This however doesn't seem to work with instances of classes exposed through boost::python. The only way I could get this to work is by using the following instead:

        unser = PyObject_CallFunctionObjArgs(ct, NULL);

    With this change, unserialization should work as well. I'm looking forward to hearing some comments and I would love to see some of my suggestions making their way into Pyxser if they are deemed valuable.
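
    The trade-off between the two approaches can be sketched in plain Python. This is only a rough analogue, not the C code: `cls.__new__(cls)` stands in for creating an instance without a constructor call, `cls()` stands in for the default-constructor call, and the classes `Account` and `Counter` as well as both helper functions are hypothetical.

```python
class Account(object):
    """Hypothetical class whose constructor requires an argument."""

    def __init__(self, owner):
        self.owner = owner


class Counter(object):
    """Hypothetical default-constructible class."""

    def __init__(self):
        self.count = 0


def restore_raw(cls, state):
    """Create an instance without running __init__, then restore its state."""
    obj = cls.__new__(cls)
    obj.__dict__.update(state)
    return obj


def restore_via_constructor(cls, state):
    """Call the default constructor, then restore attributes one by one."""
    obj = cls()
    for name, value in state.items():
        setattr(obj, name, value)
    return obj


account = restore_raw(Account, {"owner": "alice"})
counter = restore_via_constructor(Counter, {"count": 3})

try:
    restore_via_constructor(Account, {"owner": "bob"})
    constructor_failed = False
except TypeError:
    # Account cannot be built without arguments.
    constructor_failed = True
```

    The constructor-based path works for any class that can be called without arguments, but it fails for classes like `Account`, which is exactly the requirement described above.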

     
  • Thanks for your comments. I was expecting most users to use the mailing lists, so I need to read this forum more regularly. I think that your feedback is a good contribution.

    Implementing those changes will take a few days, and I will reply to each comment about pyxser separately.

    Best regards,
    DMW

     
  • Thanks for your feedback. Most of those changes are implemented in the **pyxser-1.3r** release, along with some bug fixes.

     
  • Daniel Filonik
    2009-12-02

    I'm glad that I could contribute. I've had a look at 1.3r and it seems that **pyxser\_UnserializeXml** still relies on PyInstance\_NewRaw(). If you choose to use the PyObject\_CallFunctionObjArgs() approach, you should make sure to replace that code as well. Be aware of the downside of relying on a default constructor being present, though. Perhaps you could try using PyInstance\_NewRaw() as a fallback…
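
    At the Python level, the fallback idea might look roughly like the sketch below. It is an assumption-laden illustration rather than pyxser's implementation: `instantiate`, `NeedsArgs` and `NoArgs` are hypothetical names, and `cls.__new__(cls)` is used as a stand-in for creating an instance without calling its constructor.

```python
def instantiate(cls):
    """Try the default constructor first; fall back to a raw instance."""
    try:
        return cls()
    except TypeError:
        # No usable zero-argument constructor: skip __init__ entirely.
        return cls.__new__(cls)


class NeedsArgs(object):
    """Hypothetical class that cannot be constructed without arguments."""

    def __init__(self, value):
        self.value = value


class NoArgs(object):
    """Hypothetical default-constructible class."""

    def __init__(self):
        self.value = "default"


raw = instantiate(NeedsArgs)
built = instantiate(NoArgs)
```

    One caveat of this simple version is that a TypeError raised inside a working constructor would also trigger the fallback, so a real implementation may want a narrower check.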

     
  • OK, I've released pyxser-1.3r-p1, which tries **PyObject\_CallFunctionObjArgs()** first on all calls and falls back on **PyInstance\_NewRaw()**. I hope this will be useful for you and most users.

    Best regards