From: Alan K. <sql...@xh...> - 2004-10-15 20:39:34
Greetings all,

I'm getting to grips with SQLObject at the moment. I like it very much; it is an excellent and pythonic bridge between the object and relational worlds.

I'm trying to pickle SQLObjects, so that I can store them in structured documents. As far as I can see, pickles only need to contain the name of the SQLObject class, and the id which uniquely identifies the instance.

From what I understand, the id attributes of SQLObjects uniquely identify the object instance in the SQLObject's class/table. This should mean that I could use these ids in references in data stored outside SQLObject, and know that I will get the correct instance when I recreate the SQLObject, through the use of ClassName.get(id).

The same should hold true when I store ids in pickles. If I store and retrieve SQLObjects through the pickle protocol, i.e. by defining the __getstate__ and __setstate__ methods like so

    class MoSQLObject(SQLObject):

        def __getstate__(self):
            return self.id

        def __setstate__(self, id):
            return self.__class__.get(id)

then I should get the same SQLObject, i.e. an object with the same id, back from the unpickling process. But I'm finding that that isn't working. The object that I get back from the unpickling process seems to be the right SQLObject, but it's missing the id attribute.

The following code illustrates the problem.
#source --------
import pickle
import sqlobject

__connection__ = sqlobject.connectionForURI('sqlite:/data/database.db')

class PickleableSQLObject(sqlobject.SQLObject):

    def __getstate__(self):
        return self.id

    def __setstate__(self, id):
        return self.__class__.get(id)

class MoSQLObject(PickleableSQLObject):

    name = sqlobject.StringCol()

if __name__ == "__main__":
    MoSQLObject.createTable(ifNotExists=True)
    obj = MoSQLObject(name='alan')
    print "Object is %s" % str(obj)
    pickled = pickle.dumps(obj)
    print "Pickled object is %d bytes" % len(pickled)
    unpickled = pickle.loads(pickled)
    print "Unpickled object is %s" % str(unpickled)
#source --------

The above code outputs the following

#output --------
Object is <MoSQLObject 1 name='alan'>
Pickled object is 91 bytes
Traceback (most recent call last):
  File "test_sqlobject_pickle.py", line 25, in ?
    print "Unpickled object is %s" % str(unpickled)
  File "C:\Python23\Lib\site-packages\sqlobject\main.py", line 1072, in __repr__
    return '<%s %r %s>' \
AttributeError: 'MoSQLObject' object has no attribute 'id'
#output --------

So it seems the id attribute has gone missing. I have verified that it is there before the SQLObject is returned from __setstate__, but has disappeared by the time it gets returned by pickle.loads().

Which leads me to think one of two things

1. There is some form of clash between the semantics of 'id' attributes in the SQLObject and pickle modules.

2. I have failed to grok SQLObject, and am not using it correctly. Particularly, I may have missed some essential constraints or rules relating to object identity?

I'd be grateful if anyone could shed any light.

TIA,

Alan.
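A likely explanation for the missing id, offered as a sketch rather than something confirmed in this thread: when unpickling, pickle creates a fresh, empty instance itself, calls __setstate__ on it, and ignores whatever __setstate__ returns. The object handed back by pickle.loads() is therefore that empty instance, not the one returned by get(). One way around this is __reduce__, which lets a class name a callable whose return value becomes the unpickled object. The Python 3 code below uses hypothetical stand-ins (Record, StateRecord, ReduceRecord, load_record) in place of real SQLObject classes, so it only illustrates the pickle behaviour.

```python
import pickle


class Record:
    """Hypothetical stand-in for an SQLObject class (not the real API)."""

    _cache = {}  # (class name, id) -> instance, mimicking one object per row

    def __init__(self, id, name):
        self.id = id
        self.name = name
        Record._cache[(type(self).__name__, id)] = self

    @classmethod
    def get(cls, id):
        # Stand-in for ClassName.get(id): return the already loaded instance.
        return Record._cache[(cls.__name__, id)]


class StateRecord(Record):
    """Same approach as in the message above."""

    def __getstate__(self):
        return self.id

    def __setstate__(self, id):
        # pickle calls this on a blank instance it created itself and
        # discards the return value, so the blank instance is what comes back.
        return self.__class__.get(id)


def load_record(cls, id):
    # Module-level helper so that pickle can store it by name.
    return cls.get(id)


class ReduceRecord(Record):
    """Alternative: name the callable that rebuilds the object."""

    def __reduce__(self):
        # On load, pickle calls load_record(ReduceRecord, id) and returns
        # whatever that call returns.
        return (load_record, (self.__class__, self.id))


broken = pickle.loads(pickle.dumps(StateRecord(1, "alan")))
print("StateRecord copy has an id:", hasattr(broken, "id"))

original = ReduceRecord(1, "alan")
restored = pickle.loads(pickle.dumps(original))
print("ReduceRecord copy is the original:", restored is original)
```

With these stand-ins, the StateRecord copy comes back without an id, which matches the AttributeError in the traceback, while the ReduceRecord copy is the very instance that was pickled. Whether __reduce__ sits comfortably with SQLObject's own machinery is an open question that this sketch does not answer.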
From: Brian R. <br...@se...> - 2004-10-15 21:00:45
On Oct 15, 2004, at 3:39 PM, Alan Kennedy wrote:

> I'm trying to pickle SQLObjects, so that I can store them in
> structured documents. As far as I can see, pickles only need to
> contain the name of the SQLObject class, and the id which uniquely
> identifies the instance.

I thought SQLObject does not support pickling because it uses pickle for persistence:

http://sourceforge.net/mailarchive/message.php?msg_id=7578480

This may have changed, but this was the last I heard.
From: Alan K. <sql...@xh...> - 2004-10-15 22:48:54
[Alan Kennedy]
>> I'm trying to pickle SQLObjects, so that I can store them in
>> structured documents. As far as I can see, pickles only need to
>> contain the name of the SQLObject class, and the id which uniquely
>> identifies the instance.

[Brian Ray]
> I thought SQLObject does not support pickling because it uses pickle
> for persistence:
>
> http://sourceforge.net/mailarchive/message.php?msg_id=7578480

Thanks for pointing out that reference Brian, I hadn't seen it before.

Looking at Ian's message at that url, it looks like he's talking about support for pickling the entire SQLObject, which I would imagine would have complex consequences when it comes to connections, etc, because object identity seems to be related to the connection that the object came from.

However, I'm using the pickle protocol to do something different, which is to pickle a *reference* to an SQLObject, so that when the reference is unpickled, the original object can be retrieved. So it's much less complex than pickling the entire object.

On reading further about the pickle protocol, I see that it does have support for pickling references to external objects, i.e. objects that are going to be serialised by reference. This is done through the persistent_id and persistent_load methods of Pickler and Unpickler objects.

http://docs.python.org/lib/node69.html

I've implemented that (code below), and it works just fine, although the mechanism is a little messy. For example, I don't like having to map an SQLObject class definition from a string, which could get messy if there are multiple modules and imports involved. But that's the fault of the pickle protocol, not SQLObject.

Problem for me is that I'm actually trying to do this with David Mertz's gnosis.xml.pickle, which also uses the pickle protocol, i.e. __getstate__, etc. But unfortunately gnosis.xml.pickle doesn't seem to support the persistent extensions to the pickle protocol. Maybe I'll have to hack it (gnosis.xml.pickle).
Seems like a lot of hard work for such a simple thing :-(

Anyway, here is code that successfully pickles and unpickles *references* to SQLObjects, should anyone be interested.

#source ---------------------
import pickle
import sqlobject
import StringIO

__connection__ = sqlobject.connectionForURI('sqlite:/data/database.db')

class SQLObjectPickler(pickle.Pickler):

    def persistent_id(self, obj):
        if isinstance(obj, sqlobject.SQLObject):
            return "%s:%d" % (obj.__class__.__name__, obj.id)
        return None

class SQLObjectUnpickler(pickle.Unpickler):

    def persistent_load(self, obj_str):
        klass_name, obj_id_str = obj_str.split(':')
        klass = eval(klass_name)
        return klass.get(int(obj_id_str))

class MoSQLObject(sqlobject.SQLObject):

    name = sqlobject.StringCol()

if __name__ == "__main__":
    MoSQLObject.createTable(ifNotExists=True)
    obj = MoSQLObject(name='alan')
    buffer = StringIO.StringIO()
    pickler = SQLObjectPickler(buffer, 0)
    print "Object is %s" % str(obj)
    pickler.dump(obj)
    pickled = buffer.getvalue()
    print "Pickled object is %d bytes" % len(pickled)
    unpickler = SQLObjectUnpickler(StringIO.StringIO(pickled))
    unpickled = unpickler.load()
    print "Unpickled object is %s" % str(unpickled)
#source ---------------------

which outputs

#output ---------------------
Object is <MoSQLObject 1 name='alan'>
Pickled object is 16 bytes
Unpickled object is <MoSQLObject 1 name='alan'>
#output ---------------------

Regards,

Alan.
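The string-to-class lookup that feels messy above can be made explicit with a registry instead of eval(). What follows is a minimal Python 3 sketch under stated assumptions: Row is a hypothetical stand-in for an SQLObject class, and REGISTRY, RefPickler, RefUnpickler, dumps and loads are illustrative names, none of them part of SQLObject or pickle. It also pickles a small document that merely contains a reference, which is closer to the "structured documents" use case.

```python
import io
import pickle


class Row:
    """Hypothetical stand-in for an SQLObject class (not the real API)."""

    _rows = {}  # id -> instance, mimicking one object per table row

    def __init__(self, id, name):
        self.id = id
        self.name = name
        Row._rows[id] = self

    @classmethod
    def get(cls, id):
        # Stand-in for ClassName.get(id).
        return Row._rows[id]


# Explicit name -> class mapping; replaces eval(klass_name).
REGISTRY = {"Row": Row}


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Store registered objects by reference only: (class name, id).
        if isinstance(obj, Row):
            return (type(obj).__name__, obj.id)
        return None  # everything else is pickled normally


class RefUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        class_name, obj_id = pid
        try:
            klass = REGISTRY[class_name]
        except KeyError:
            raise pickle.UnpicklingError("unknown class %r" % (class_name,))
        return klass.get(obj_id)


def dumps(obj):
    buffer = io.BytesIO()
    RefPickler(buffer).dump(obj)
    return buffer.getvalue()


def loads(data):
    return RefUnpickler(io.BytesIO(data)).load()


alan = Row(1, "alan")
doc = {"title": "notes", "author": alan}
data = dumps(doc)
restored = loads(data)
print(restored["author"] is alan)
```

The tuple form of the persistent id relies on a binary pickle protocol, which is the default in Python 3; with protocol 0, as used in the code above, the id has to be a string. An unknown class name raises an error here instead of evaluating arbitrary text from the pickle.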
From: Ian B. <ia...@co...> - 2004-10-18 02:36:46
Alan Kennedy wrote:
> [Alan Kennedy]
> >> I'm trying to pickle SQLObjects, so that I can store them in
> >> structured documents. As far as I can see, pickles only need to
> >> contain the name of the SQLObject class, and the id which uniquely
> >> identifies the instance.
>
> [Brian Ray]
> > I thought SQLObject does not support pickling because it uses pickle
> > for persistence:
> >
> > http://sourceforge.net/mailarchive/message.php?msg_id=7578480
>
> Thanks for pointing out that reference Brian, I hadn't seen it before.
>
> Looking at Ian's message at that url, it looks like he's talking about
> support for pickling the entire SQLObject, which I would imagine would
> have complex consequences when it comes to connections, etc, because
> object identity seems to be related to the connection that the object
> came from.

I was just thinking about a reference (i.e., no column data). The reference might or might not include the connection; both instances have valid use cases. E.g., the connection may often be configurable, and if you change the configuration and load the pickle you may want to load it from the newly configured connection, not the old connection. OTOH, if you are using multiple databases, you may want the connection included in the reference.

__setstate__ would require refactoring SQLObject a bit, so that it could set up all the special variables that SQLObject creates in __init__. That would be doable.

Another option would be to implement __new__, and potentially include some magic keyword argument that would cause a .get() to be called. Of course, __init__ gets *re*-called if __new__ returns an instance of the class. Frankly __new__ is a stupid, stupid method. Which makes me think __setstate__ may be the better way to go.

Oh, wait, no... that's no good either. Because sometimes you'll be returning an existing instance when unpickling, if that row has already been loaded up. Blah.

And what if you want to load the objects into a transaction?
Then you'd want to say, "unpickle to this connection". Again, rather annoying, since there's no way to pass information from the pickler into the class, except through a global. At that point, a pickle subclass starts to seem better, where this configuration information could be stored in the pickle subclass.

Blah. Well, there's my indecisive opinion.

--
Ian Bicking / ia...@co... / http://blog.ianbicking.org
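The "unpickle to this connection" idea can be sketched with an Unpickler subclass that carries the connection itself, so no global is needed. This is a minimal Python 3 illustration under stated assumptions: FakeConnection and Person are hypothetical stand-ins rather than SQLObject classes, the get(id, connection) signature shown belongs to the stand-in only, and the reference deliberately leaves the connection out, which is just one of the two use cases described above.

```python
import io
import pickle


class FakeConnection:
    """Hypothetical stand-in for a database connection or transaction."""

    def __init__(self, label, rows):
        self.label = label
        self.rows = rows  # maps id -> name


class Person:
    """Hypothetical stand-in for an SQLObject class (not the real API)."""

    def __init__(self, id, name, connection):
        self.id = id
        self.name = name
        self.connection = connection

    @classmethod
    def get(cls, id, connection):
        # Stand-in for fetching a row through a specific connection.
        return cls(id, connection.rows[id], connection)


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # The reference holds only the class name and id, not the connection.
        if isinstance(obj, Person):
            return ("Person", obj.id)
        return None


class ConnectionUnpickler(pickle.Unpickler):
    def __init__(self, file, connection):
        super().__init__(file)
        # Configuration lives on the unpickler instead of in a global.
        self.connection = connection

    def persistent_load(self, pid):
        class_name, obj_id = pid
        if class_name != "Person":
            raise pickle.UnpicklingError("unknown class %r" % (class_name,))
        return Person.get(obj_id, connection=self.connection)


old_conn = FakeConnection("old", {1: "alan"})
new_conn = FakeConnection("new", {1: "alan"})

buffer = io.BytesIO()
RefPickler(buffer).dump(Person.get(1, old_conn))

# Unpickle against a different connection than the one used when pickling.
restored = ConnectionUnpickler(io.BytesIO(buffer.getvalue()), new_conn).load()
print(restored.name, restored.connection.label)
```

Including the connection in the reference, for the multiple-database case, would mean adding some connection identifier to the tuple returned by persistent_id and resolving it in persistent_load; how such an identifier would map onto real SQLObject connections is left open here.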