From: Stefan S. <se...@sy...> - 2006-03-31 06:55:11
|
Hello, I'm trying to process a string into a docutils.nodes.document and then save it, as part of an object, to a file as a Python pickle. Doing this, I get the error message 'exceptions.TypeError: can't pickle file objects'. Since adding this document is about the only change I made, I believe the file object being complained about stems from docutils. Where could that be? I generated the document via publish_string(comment.text, writer=writer, reader=reader), where reader is (essentially) a docutils.readers.standalone.Reader and writer a docutils.writers.Writer. The text to be parsed is a Python string, so I'm not sure why any file objects are involved. Any help would be highly appreciated! Thanks, Stefan |
From: Stefan S. <se...@sy...> - 2006-03-31 19:20:16
|
Following up on my own mail: Stefan Seefeld wrote: > publish_string(comment.text, writer=writer, reader=reader) where > reader is (essentially) a docutils.readers.standalone.Reader and > writer a docutils.writers.Writer. > > The text to be parsed is a python string, and so I'm not sure > why any file objects are involved. It appears that adding a 'settings_overrides' keyword parameter containing an empty ('') warning_stream does the trick. However, that only let me get a tiny bit further, as now the complaint is about function objects. So I generalize the question a bit: is it possible to serialize documents (I notice document has an 'asdom()' method...)? To add a bit of context: I'm trying to add support for ReST markup to the synopsis tool (http://synopsis.fresco.org). Synopsis can parse and process source code with embedded comments in a variety of languages. It lets me define a 'processor pipeline' for maximum flexibility. Each processor may dump the data to a file and read it back in. Right now I'm able to parse and format embedded ReST markup only if I run the pipeline in a single process, i.e. without intermediate storage. Trying to store the internal data results in an error message telling me that the data can't be pickled. Any ideas? Thanks, Stefan |
From: David G. <go...@py...> - 2006-03-31 20:27:41
|
Martin Blais came across the same problem while developing Nabu (http://furius.ca/nabu), which led to the introduction of the docutils.core.publish_doctree and .publish_from_doctree methods. Looking into his code a tiny bit, I see he does this before pickling: doctree.reporter = None This removes the warning stream reference. > However, that only let me get a tiny bit further, > as now the complaint is about function objects. *Which* function objects? > So I generalize the question a bit: is it possible to serialize > documents (I notice document has a 'asdom()' method...) ? Of course! That's what Docutils does; it serializes documents in different ways. The asdom method will help to serialize the document tree as Docutils-native XML (see docutils.writers.docutils_xml), but there's a lot of useful information that's in the doctree but not in the XML. If your real question is "is it possible to pickle document trees?", the answer is yes. I don't know all the details, because I've never needed to do it, but it works in Nabu. I suggest you look at Martin's Nabu code. I recall from past experience with pickling that the protocol includes a call to __getstate__ when pickling and __setstate__ when unpickling. Perhaps these should be implemented on document trees to make picklers' lives easier. Patches are welcome! -- David Goodger <http://python.net/~goodger> |
From: Stefan S. <se...@sy...> - 2006-04-01 00:10:42
|
David Goodger wrote: > Martin Blais came across the same problem while developing Nabu > (http://furius.ca/nabu), which led to the introduction of the > docutils.core.publish_doctree and .publish_from_doctree methods. > Looking into his code a tiny bit, I see he does this before pickling: > > doctree.reporter = None > > This removes the warning stream reference. Thanks! I don't yet understand the API very well; does removing the reporter attribute have any impact on the doctree's future use, such as after I unpickle it? >>However, that only let me get a tiny bit further, >>as now the complaint is about function objects. > > > *Which* function objects? If only I knew! :-) The error message generated from the pickle call doesn't say where exactly the error occurs, only that I can't pickle function objects. >>So I generalize the question a bit: is it possible to serialize >>documents (I notice document has a 'asdom()' method...) ? > > > Of course! That's what Docutils does; it serializes documents in > different ways. The asdom method will help to serialize the document > tree as Docutils-native XML (see docutils.writers.docutils_xml), but > there's a lot of useful information that's in the doctree but not in > the XML. I imagine. And I fear that mapping to a DOM and back again is a bit expensive compared with Python's native cPickle protocol. As I mentioned, synopsis can serialize its internal representation (mostly an AST with annotations from comments in the code) so it can operate like a compiler, i.e. with a 'parse', 'link', 'format', etc. mode. Thus keeping the serialization reasonably cheap is somewhat important. :-) > If your real question is "is it possible to pickle document trees?", > the answer is yes. I don't know all the details, because I've never > needed to do it, but it works in Nabu. I suggest you look at Martin's > Nabu code. > > I recall from past experience with pickling that the protocol includes > a call to __getstate__ when pickling and __setstate__ when unpickling. > Perhaps these should be implemented on document trees to make > picklers' lives easier. Patches are welcome! OK, I may look into that once I'm more familiar with it. Meanwhile, though, I'd like to figure out whether there is any workaround I could use, since I guess that synopsis' next release will happen before docutils' next release. ;-) Thanks, Stefan |
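On the point that the pickle error does not say where it occurs: one way to narrow it down, sketched here with standard-library calls only, is to try pickling each instance attribute separately and collect the ones that fail. The names find_unpicklable and FakeDocument are hypothetical and exist only for this illustration; FakeDocument is a stand-in, not a docutils class.

```python
import pickle


def find_unpicklable(obj):
    """Return {attribute name: error text} for attributes that fail to pickle."""
    failures = {}
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception as exc:  # pickle raises several different exception types
            failures[name] = "%s: %s" % (type(exc).__name__, exc)
    return failures


class FakeDocument:
    """Hypothetical stand-in for a document tree (not a docutils class)."""

    def __init__(self):
        self.title = "demo"
        self.children = ["paragraph"]
        # A function created inline cannot be pickled by reference.
        self.reporter = lambda message: None


bad = find_unpicklable(FakeDocument())
for name, error in sorted(bad.items()):
    print(name, "->", error)
```

This only inspects the top level of the object; an attribute that fails can be passed to the same helper again to descend further.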
From: David G. <go...@py...> - 2006-04-01 00:54:28
|
[Stefan Seefeld] > Thanks ! I don't yet understand the API very well; does removing the > reporter attribute have any impact on the doctree's future use, such as > after I unpickle it ? I believe the doctree will need a new reporter when it's unpickled, but I'm not sure. The Nabu code should tell. Hopefully Martin Blais will chime in; I'm copying him on this. -- David Goodger <http://python.net/~goodger> |
From: Felix W. <Fel...@gm...> - 2006-04-01 21:50:12
|
Stefan Seefeld wrote: > does removing the reporter attribute have any impact on the doctree's > future use, such as after I unpickle it ? It does, but I think that docutils.core.publish_from_doctree will take care of setting the reporter object up again. -- For private mail please ensure that the header contains 'Felix Wiemann'. "the number of contributors [...] is strongly and inversely correlated with the number of hoops each project makes a contributing user go through." -- ESR |
From: David G. <go...@py...> - 2006-04-01 23:29:27
|
> Stefan Seefeld wrote: >> does removing the reporter attribute have any impact on the doctree's >> future use, such as after I unpickle it ? [Felix Wiemann] > I think that docutils.core.publish_from_doctree will take > care of setting the reporter object up again. It does not, and it should not. ISTM that the correct way to handle this is to add __getstate__ and __setstate__ to the doctree. -- David Goodger <http://python.net/~goodger> |
From: Stefan S. <se...@sy...> - 2006-04-01 19:38:22
|
David Goodger wrote: > Martin Blais came across the same problem while developing Nabu > (http://furius.ca/nabu), which led to the introduction of the > docutils.core.publish_doctree and .publish_from_doctree methods. > Looking into his code a tiny bit, I see he does this before pickling: > > doctree.reporter = None > > This removes the warning stream reference. > > >>However, that only let me get a tiny bit further, >>as now the complaint is about function objects. > > > *Which* function objects? With some debugging I found it was document.note_transform_message attached to the reporter. Resetting the reporter to None as per your suggestion fixed this. However, there was another pickling error due to the fact that somewhere a 'language' module was stored, which obviously can't be pickled either. Resetting the document.transformer to None solved that. > If your real question is "is it possible to pickle document trees?", > the answer is yes. I don't know all the details, because I've never > needed to do it, but it works in Nabu. I suggest you look at Martin's > Nabu code. > > I recall from past experience with pickling that the protocol includes > a call to __getstate__ when pickling and __setstate__ when unpickling. > Perhaps these should be implemented on document trees to make > picklers' lives easier. Patches are welcome! I'm actually not sure this is the right thing to do. Of course I could define __getstate__ in a way that just ignores the reporter and transformer attributes, but I couldn't recover them in the __setstate__, making pickle * unpickle a non-identity operation. But before I can suggest anything else I have to ask: why is the document class not simply a data object (the 'Model' in the Model-View-Controller paradigm) ? If it was, pickling would be trivial as all the 'active' objects would be kept outside. Regards, Stefan |
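To make the __getstate__/__setstate__ idea concrete, here is a minimal sketch using a stand-in class. Document below is hypothetical and is not docutils.nodes.document; the file object and module it stores merely imitate the kinds of unpicklable references described in this message. It also shows the concern raised above: the restored object comes back without its reporter and transformer, so the round trip is not an identity.

```python
import pickle
import sys


class Document:
    """Hypothetical stand-in, not docutils.nodes.document."""

    def __init__(self, text):
        self.text = text
        self.reporter = sys.stderr   # a file object: cannot be pickled
        self.transformer = sys       # a module: cannot be pickled either

    def __getstate__(self):
        # Copy the instance dict and blank out the unpicklable references.
        state = self.__dict__.copy()
        state["reporter"] = None
        state["transformer"] = None
        return state

    def __setstate__(self, state):
        # The 'active' objects are not restored here; the caller has to
        # attach fresh ones after unpickling if it needs them.
        self.__dict__.update(state)


original = Document("Hello")
restored = pickle.loads(pickle.dumps(original))
print(restored.text, restored.reporter, restored.transformer)
```

The original object is left untouched because __getstate__ works on a copy of the instance dictionary.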
From: Felix W. <Fel...@gm...> - 2006-05-04 18:45:38
|
David Goodger wrote: > Felix Wiemann wrote: > >> I think that docutils.core.publish_from_doctree will take >> care of setting the reporter object up again. > > It does not, and it should not. Well, it does -- from the docstring:: Also, new document.transformer and document.reporter objects are generated. ISTM that there's nothing wrong with this as there may be new settings (like warning_stream) that influence the behavior of the reporter. -- For private mail please ensure that the header contains 'Felix Wiemann'. "the number of contributors [...] is strongly and inversely correlated with the number of hoops each project makes a contributing user go through." -- ESR |
From: David G. <go...@py...> - 2006-05-04 18:56:40
|
[Felix Wiemann, on 1 April] >>> I think that docutils.core.publish_from_doctree will take care of >>> setting the reporter object up again. [David Goodger, on 1 April] > > It does not, and it should not. [Felix Wiemann, on 4 May] > Well, it does -- from the docstring:: If you're going to revisit a dead argument from over a month ago, please have the courtesy of reading the *entire* thread before replying. :-) On April 2 I wrote: ... In fact, after a reexamination of the DocTree Reader that Martin Blais wrote for Docutils (docutils.readers.doctree), I rediscovered that it *does* create a new Reporter and Transformer for processing the doctree. I had forgotten. Apologies for any confusion. -- David Goodger <http://python.net/~goodger> |
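The round trip discussed in this thread can be sketched as follows. This assumes docutils is installed and uses only docutils.core.publish_doctree and docutils.core.publish_from_doctree, both named above; the function name roundtrip and the 'output_encoding': 'unicode' override (used to get a text string back) are choices made for this illustration. The reporter and transformer are reset explicitly, as suggested earlier in the thread, and publish_from_doctree is relied on to create new ones.

```python
import pickle


def roundtrip(rst_text):
    """Parse reST, pickle and unpickle the doctree, then render it as pseudo-XML."""
    # Imported lazily so this sketch can be loaded without docutils installed.
    from docutils.core import publish_doctree, publish_from_doctree

    doctree = publish_doctree(rst_text)

    # Drop the references to 'active' objects before pickling.
    doctree.reporter = None
    doctree.transformer = None

    restored = pickle.loads(pickle.dumps(doctree))

    # Rendering the restored tree; the default writer produces pseudo-XML.
    return publish_from_doctree(
        restored, settings_overrides={"output_encoding": "unicode"})
```

A call such as roundtrip("Title\n=====\n\nSome paragraph text.\n") should return the pseudo-XML rendering of the restored tree.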