From: <gr...@us...> - 2002-05-21 08:43:06
|
i am slowly getting a grasp of writer visit_Text is called on leaves i guess ? i recognized a new function supports in the writer: means the parser could ask the writer if it supports a certain construct and if not might use some other ? the NodeVisitor has all visit_ departure_ procedures defined. this way a writer might not know that he missed something which makes it hard to support everything if you donot know. cheers -- BINGO: This left unindentionally unblank --- Engelbert Gruber -------+ SSG Fintl,Gruber,Lassnig / A6410 Telfs Untermarkt 9 / Tel. ++43-5262-64727 ----+ |
From: David G. <go...@us...> - 2002-05-22 04:25:58
|
Engelbert Gruber wrote: > visit_Text is called on leaves i guess ? Yes, ``visit_Text`` is called whenever a ``nodes.Text`` instance is encountered. ``nodes.Text`` objects are terminal nodes (leaves) containing text only; no child nodes or attributes. > i recognized a new function supports in the writer: > means the parser could ask the writer if it supports a certain > construct and if not might use some other ? No, writers should support all elements defined in docutils.nodes. There's no communication between parser and writer at parse time. The fully parsed document instance may contain "pending" elements, which are a form of delayed communication, and that's what the "supports" method can be used for. ``docutils.Component.supports()`` (defined in docutils/__init__.py) is used by transforms to ask the component (reader or writer) controlling the transform if that component supports a certain input context or output format. Specifically, it's used by the "meta" directive, which uses the ``docutils.transforms.components.Filter`` transform; only writers supporting HTML will include the meta tag, others will discard it. (See the docstring of ``docutils.transforms.components.Filter`` for a detailed explanation.) I've updated the docstring. > the NodeVisitor has all visit_ departure_ procedures > defined. this way a writer might not know that he missed something > which makes it hard to support everything if you donot know. This aspect is useful for sparse traversals, such as those done by transforms. It's not so useful for writers, though, as you say. I've removed the definitions of ``visit_...`` and ``depart_...`` methods from NodeVisitor, and added a new subclass, SparseNodeVisitor. You'll want your PDFTranslator to continue subclassing NodeVisitor. 
-- David Goodger <go...@us...> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ |
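A minimal sketch of the distinction drawn above, using stand-in classes rather than the real ``docutils.nodes`` API. A strict visitor defines no per-node methods, so a missing handler surfaces as an error; a sparse visitor supplies do-nothing handlers, so a traversal can implement only what it cares about. The names ``Node``, ``StrictVisitor``, ``SparseVisitor`` and ``NODE_NAMES`` are assumptions made for illustration.

```python
# Stand-in classes for illustration only; these are NOT the docutils classes.

class Node:
    def __init__(self, *children):
        self.children = list(children)

    def walk(self, visitor):
        # Look up visit_<ClassName>; fall back to unknown_visit if absent.
        handler = getattr(visitor, "visit_" + type(self).__name__,
                          visitor.unknown_visit)
        handler(self)
        for child in self.children:
            child.walk(visitor)


class paragraph(Node):
    pass


class emphasis(Node):
    pass


class Text(Node):
    """Leaf node: holds text only, no children."""

    def __init__(self, data):
        super().__init__()
        self.data = data


NODE_NAMES = ["paragraph", "emphasis", "Text"]


class StrictVisitor:
    """Defines no visit_ methods, so gaps are reported loudly."""

    def unknown_visit(self, node):
        raise NotImplementedError(
            "no visit_%s method" % type(node).__name__)


class SparseVisitor(StrictVisitor):
    """Gets a do-nothing visit_ method for every known node type."""


def _ignore(self, node):
    pass


for _name in NODE_NAMES:
    setattr(SparseVisitor, "visit_" + _name, _ignore)


class TextCollector(SparseVisitor):
    """Sparse traversal: only Text leaves are of interest."""

    def __init__(self):
        self.chunks = []

    def visit_Text(self, node):
        self.chunks.append(node.data)


class IncompleteWriter(StrictVisitor):
    """A writer that forgot to handle emphasis."""

    def visit_paragraph(self, node):
        pass

    def visit_Text(self, node):
        pass


tree = paragraph(Text("Hello "), emphasis(Text("world")))
collector = TextCollector()
tree.walk(collector)
print("".join(collector.chunks))
```

Walking the same tree with ``IncompleteWriter`` raises ``NotImplementedError`` at the ``emphasis`` node, which is the explicit feedback a writer author wants.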
From: <gr...@us...> - 2002-05-22 06:00:55
|
On Wed, 22 May 2002, David Goodger wrote:
> > i recognized a new function supports in the writer:
> > means the parser could ask the writer if it supports a certain
> > construct and if not might use some other ?
>
> No, writers should support all elements defined in docutils.nodes.

everything derived from Element ?
how should i know :-)
does anyone know the automatic way to get the list ?
not important now, but for completeness checks it might be valid.

should but might not do always, as new elements are added
not all writers will support immediately and we should have a way
to get documents out anyway.
(your modification of NodeVisitor should enable fallback elements in
writers).

> There's no communication between parser and writer at parse time. The
> fully parsed document instance may contain "pending" elements, which
> are a form of delayed communication, and that's what the "supports"
> method can be used for.
>
> ``docutils.Component.supports()`` (defined in docutils/__init__.py) is
> used by transforms to ask the component (reader or writer) controlling
> the transform if that component supports a certain input context or
> output format. Specifically, it's used by the "meta" directive, which
> uses the ``docutils.transforms.components.Filter`` transform; only
> writers supporting HTML will include the meta tag, others will discard
> it. (See the docstring of ``docutils.transforms.components.Filter``
> for a detailed explanation.)

should this be the meta information in a pdf file and maybe a
comment in latex or pdf metainformation in pdflatex ?

-- BINGO: synergy end to end --- Engelbert Gruber -------+ SSG Fintl,Gruber,Lassnig / A6410 Telfs Untermarkt 9 / Tel. ++43-5262-64727 ----+ |
From: David G. <go...@us...> - 2002-05-23 04:24:37
|
Engelbert Gruber wrote:
>>> i recognized a new function supports in the writer:
>>> means the parser could ask the writer if it supports a certain
>>> construct and if not might use some other ?
>>
>> No, writers should support all elements defined in docutils.nodes.
>
> everything derived from Element ?

Actually, everything derived from Node, except Element and TextElement (classes in docutils.nodes).

> how should i know :-)

The docs should tell you, and if they don't, that's a bug. In this case, it's clearly a bug. So you find out by asking questions, which you've just done; thanks. This prods me into updating the internal docs, which I've now done. See the changes to docutils/writers/__init__.py.

Please continue to ask questions; it helps me pinpoint where documentation is lacking (or perhaps more precisely, where the general lack of documentation is most strongly and immediately felt). Also, you (and others) are using the code in ways I haven't anticipated, so you'll find places where the code needs tweaking, reworking, or expansion. That's great for the project; keep those questions and bug reports coming!

> does anyone know the automatic way to get the list ?

``docutils.nodes.node_class_names`` is a list of all concrete node classes. Here's some code to derive the list::

    from docutils import nodes
    from types import ClassType

    node_class_names = []
    for x in dir(nodes):
        c = getattr(nodes, x)
        if type(c) is ClassType and issubclass(c, nodes.Node) \
               and len(c.__bases__) > 1:
            node_class_names.append(x)

I've added a test to confirm that the stored list stays up to date (in test/test_nodes.py).

> should but might not do always, as new elements are added
> not all writers will support immediately and we should have a way
> to get documents out anyway.

If a new node type is added, all writers should be updated accordingly (if a writer is not updated, it's a bug).
Once a writer has been accepted into core Docutils, it's the responsibility of whoever adds the new node type (probably me) to update all the writers. I'd be happy to help with writers in the sandbox too.

> (your modification of NodeVisitor should enable fallback elements in
> writers).

``Node.walk`` & ``Node.walkabout`` call ``NodeVisitor.unknown_visit`` & ``.unknown_departure`` when unknown node types are encountered. These can be overridden if you want to get fancy, but I don't see the point. Better to get explicit feedback (exception tracebacks) if your code isn't complete. That's what testing is for.

>> only writers supporting HTML will include the meta tag, others
>> will discard it. (See the docstring of
>> ``docutils.transforms.components.Filter`` for a detailed
>> explanation.)
>
> should this be the meta information in a pdf file and maybe a
> comment in latex or pdf metainformation in pdflatex ?

Could be. I don't know much about PDF or LaTeX. The "meta" directive was intended for HTML, so I wouldn't be surprised if it doesn't match the PDF idea of metadata.

-- David Goodger <go...@us...> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ |
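The derivation loop in this message relies on ``types.ClassType``, which exists only in Python 2. A rough Python 3 equivalent is sketched below against a stand-in class hierarchy rather than the real ``docutils.nodes`` module; with a real module one would pass ``vars(module)`` as the namespace. ``concrete_node_names`` and ``missing_handlers`` are hypothetical helpers, the second being one possible shape for the completeness check asked about earlier in the thread.

```python
import inspect


# Stand-in hierarchy for illustration; not the real docutils.nodes classes.
class Node:
    pass


class Element(Node):
    pass


class TextElement(Element):
    pass


class Body:
    """Category mixin."""


class Inline:
    """Category mixin."""


class paragraph(Body, TextElement):
    pass


class emphasis(Inline, TextElement):
    pass


def concrete_node_names(namespace):
    """Names of Node subclasses that have more than one base class.

    Same test as the original loop: abstract bases such as Element have a
    single base, while concrete nodes combine a category mixin with a base.
    """
    return sorted(
        name for name, obj in namespace.items()
        if inspect.isclass(obj)
        and issubclass(obj, Node)
        and len(obj.__bases__) > 1
    )


def missing_handlers(visitor_class, node_names):
    """Node names for which visitor_class lacks a visit_ method."""
    return [name for name in node_names
            if not hasattr(visitor_class, "visit_" + name)]


class DraftWriter:
    """A writer under development: only paragraphs are handled so far."""

    def visit_paragraph(self, node):
        pass


candidates = [Node, Element, TextElement, Body, Inline, paragraph, emphasis]
namespace = {cls.__name__: cls for cls in candidates}

names = concrete_node_names(namespace)
print(names)
print(missing_handlers(DraftWriter, names))
```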
From: <gr...@us...> - 2002-05-24 16:01:29
|
On Thu, 23 May 2002, David Goodger wrote:
> >> goodger
> >> No, writers should support all elements defined in docutils.nodes.
> >
> > gruber
> > Writers should but might not do always, as new elements are added
> > not all writers will support immediately and we should have a way
> > to get documents out anyway.
> goodger
> If a new node type is added, all writers should be updated accordingly
> (if a writer is not updated, it's a bug). Once a writer has been
> accepted into core Docutils, it's the responsibility of whoever adds
> the new node type (probably me) to update all the writers. I'd be

who might not know anything about the writer and i want to keep
the writer making output, maybe ugly but never lose content.
i.e. even if i do not know what it is then things will be
written as standard text.

-- --- Engelbert Gruber -------+ SSG Fintl,Gruber,Lassnig / A6410 Telfs Untermarkt 9 / Tel. ++43-5262-64727 ----+ |
From: Tony J I. (Tibs) <to...@ls...> - 2002-05-27 09:03:39
|
David Goodger wrote:
> No, writers should support all elements defined in
> docutils.nodes.

That is clearly the desirable aim, and certainly should be true initially.

> Engelbert Gruber wrote:
> Writers should but might not do always, as new elements are added
> not all writers will support immediately and we should have a way
> to get documents out anyway.

Skipping the issue of whether (how) all writers should be updated if docutils.nodes grows new nodes (an occurrence we expect to be rare once docutils is fully out in the world), there is a related issue, which may be more profitable to pursue (i.e., if we solve it, then Engelbert Gruber's concerns should also be solved).

Let us imagine that we have a product-specific tool (for instance, to pick a random example(!) pysource, or perhaps a Wiki tool), which generates a DPS tree, and wishes to use a Writer to output the results.

The tree-construction part of the tool has two basic choices:

1. work entirely with the existing DPS nodes, to render what
   it wants to do (hint: I think this is the right approach)

2. possibly create new node types specific to its application
   area (which I believe David has advocated for pysource, at
   least, in the past).

*If* (2) is the "proper" approach, then we automatically have the issues that Engelbert Gruber is concerned about - what does a Writer do when it encounters nodes it does not recognise?

It seems to me that the counter-argument that anyone who invents such nodes must amend any Writers they "care about" is not a sufficient answer - I have two "obvious" counter-examples:

i. the author of the Reader phase may not have the time or
   ability (or permission, even) to alter the Writer.

ii. given how simple it is to write XML out from docutils (in
    fact, the capability is already provided), and also, to
    read it back in (not provided, but trivial to do), there
    is no particular need for Reader and Writer to be in the
    same tool.
I can already think of applications for this (I've been handling transfer formats too long, I guess...) On the other hand, since all of David's current nodes "declare" (by inheritance) what sort of entity they are, it should be possible (note my hands waving vaguely in the air) to make sensible "fall back" code for any future nodes, whether they are added "officially" or not. This is a bit of a pain for me, because it makes my case against option (2) above less strong, but I believe it helps Engelbert Gruber's case. (Briefly, my reason for liking option (1) is that with the single addition of a new "style" attribute to all nodes, I can get 80% of what I want, and with a new node called "group", I get 100% - where "group" translates, in HTML, to <DIV> or <SPAN> as appropriate, and "style" translates, in HTML, to "class" - and if those become standard Writer components, then I don't need any amendment in Writers to be able to output what pysource produces.) Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ "How fleeting are all human passions compared with the massive continuity of ducks." - Dorothy L. Sayers, "Gaudy Night" My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) |
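The "fall back" idea above can be sketched as a visitor that walks a node's class hierarchy until it finds a handler. This only illustrates the proposal; it is not how Docutils dispatches, and every class name below is a stand-in invented for the example.

```python
# Stand-in classes for illustration; not the docutils.nodes hierarchy.
class Node:
    def __init__(self, text=""):
        self.text = text


class Inline(Node):
    """Category: inline-level nodes."""


class Body(Node):
    """Category: body-level nodes."""


class emphasis(Inline):
    pass


class fancy_new_inline(Inline):
    """A node type added after the writer was written."""


class FallbackWriter:
    def __init__(self):
        self.out = []

    def dispatch(self, node):
        # Try the most specific class first, then each ancestor in turn.
        for cls in type(node).__mro__:
            handler = getattr(self, "visit_" + cls.__name__, None)
            if handler is not None:
                return handler(node)
        raise NotImplementedError(type(node).__name__)

    def visit_emphasis(self, node):
        self.out.append("<em>%s</em>" % node.text)

    def visit_Inline(self, node):
        # Generic rendering for any inline node without its own handler.
        self.out.append(node.text)


writer = FallbackWriter()
writer.dispatch(emphasis("known"))
writer.dispatch(fancy_new_inline("unknown"))
print(writer.out)
```

A node whose categories have no handler at all (``Body`` in this sketch) still raises ``NotImplementedError``, so the fallback degrades output without hiding every gap.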
From: David G. <go...@us...> - 2002-05-30 01:38:39
|
Tony J Ibbs (Tibs) wrote:
> Let us imagine that we have a product-specific tool (for instance, to
> pick a random example(!) pysource, or perhaps a Wiki tool), which
> generates a DPS tree, and wishes to use a Writer to output
> the results.
>
> The tree-construction part of the tool has two basic choices:
>
> 1. work entirely with the existing DPS nodes, to render what
>    it wants to do (hint: I think this is the right approach)
>
> 2. possibly create new node types specific to its application
>    area (which I believe David has advocated for pysource, at
>    least, in the past).
>
> *If* (2) is the "proper" approach,

What I've advocated in the past (or meant to, anyhow) is a blend of #1 and #2. I'll describe the implementation.

Introducing custom node types is OK, as long as:

(a) a transform resolves them to standard Docutils nodes before they
    reach the Writer proper, or

(b) the custom node is explicitly supported by certain Writers, and is
    wrapped in a filtered ``pending`` node.

In the case of PySource, an example of (a), I would expect to have a transform reduce any custom nodes to standard node structures; tables may suit the current code. Then a standard HTML writer, perhaps in conjunction with a specialized stylesheet, could produce HTML very different from that produced from standalone reStructuredText. (Not having worked through the code yet, I wouldn't be surprised if this isn't enough. That's OK; we'll fix it in time.)

The HTML <meta> tag is an example of (b); the *only* example, currently. The ``.. meta::`` directive creates a ``pending`` node, which contains knowledge that the embedded ``meta`` node can only be handled by HTML-compatible writers. The ``pending`` node is resolved by the ``transforms.components.Filter`` transform, which checks that the calling writer supports HTML; if it doesn't, the ``meta`` node is removed from the document.

The Writer itself works entirely with existing Docutils nodes, and any nodes specific to its format.
Readers know about input contexts (PySource, PEP, standalone file, etc.), but Writers are intentionally ignorant of context. > then we automatically have the issues that Engelbert Gruber is > concerned about - what does a Writer do when it encounters nodes it > does not recognise? It raises a "NotImplementedError" exception. It is an error for a Writer to encounter an unknown node. It might not be the Writer's fault though. > It seems to me that the counter-argument that anyone who invents > such nodes must amend any Writers they "care about" is not a > sufficient answer "Care about" doesn't enter into it. The requirements are simple: all Writers must handle all standard Docutils nodes, and any non-standard nodes not explicitly supported by certain Writers must be transformed into standard nodes or removed. Whenever new standard nodes are introduced *all* Writers *must* be updated. > I have two "obvious" counter-examples: > > i. the author of the Reader phase may not have the time or > ability (or permission, even) to alter the Writer. That's why the API has to be well-defined and components have to be decoupled. We want the Writers to be as independent of the Readers as possible. > ii. given how simple it is to write XML out from docutils (in > fact, the capability is already provided), and also, to > read it back in (not provided, but trivial to do), there > is no particular need for Reader and Writer to be in the > same tool. Except that there's a lot of internal data that doesn't get stored with the XML, and will need to be recreated by the Writer-equivalent. The ``nodes.document`` object (the root of a Docutils document tree) stores a lot of details. The consumer of the XML would have to be quite sophisticated (like a web browser, which can resolve links). It's quite possible that there would be some data loss; I couldn't say without auditing the code (it's a tad hairy). 
> On the other hand, since all of David's current nodes "declare" (by > inheritance) what sort of entity they are, This is meant for transforms to use to identify nodes. The transforms in ``transforms.frontmatter`` skip nodes descended from ``nodes.PreBibliographic`` (title, comment, system_message, etc.). The ``transforms.parts.Contents`` transform searches for nodes that are instances of ``nodes.section``, including *subclasses* (which opens the door for custom sections). > it should be possible (note my hands waving vaguely in the air) to > make sensible "fall back" code for any future nodes, whether they > are added "officially" or not. Except for interim, under development code, I don't think this is a good idea. > (Briefly, my reason for liking option (1) is that with the single > addition of a new "style" attribute to all nodes, I can get 80% of > what I want ... and "style" translates, in HTML, to "class" There's already a "class" attribute on all nodes, which remains on the HTML generated from the node. For example, see how the ``topic`` node is handled in ``transforms.parts.Contents``. > and with a new node called "group", I get 100% - where "group" > translates, in HTML, to <DIV> or <SPAN> as appropriate I think there's a danger in an overly generic node like "group", which is why I'm resisting. If you look (once again) at http://docutils.sf.net/spec/pysource.dtd, you'll see my first cut at a structure for representing the custom Python-source-related nodes needed (probably out of date). Look at the "Additional Structural Elements" section. Each of the ``..._section`` elements have a different structure, composed of custom child elements. How are you going to represent all of those with a single "group"? Especially when different views of the data (different styles) will probably be required? But since I haven't gone through the pysource code, my arguments may not hold water. My gut says "group" is an evil generalization. 
Perhaps my head just needs to see it in action to override my gut. -- David Goodger <go...@us...> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ |
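The ``pending``/``Filter`` mechanism described in this message can be sketched roughly as follows. The classes are simplified stand-ins, not the real ``docutils.transforms.components.Filter`` or ``pending`` node; ``resolve_pending`` is a hypothetical helper name, and document content is reduced to plain strings to keep the example short.

```python
# Stand-in classes for illustration; the real machinery lives in docutils.

class Component:
    """Reader or writer; lists the formats it can handle."""

    supported = ()

    def supports(self, fmt):
        return fmt in self.supported


class HTMLWriter(Component):
    supported = ("html",)


class PDFWriter(Component):
    supported = ("pdf",)


class Pending:
    """Placeholder that carries a node plus the format it requires."""

    def __init__(self, node, fmt):
        self.node = node
        self.fmt = fmt


def resolve_pending(children, writer):
    """Replace each Pending with its node if supported, else drop it."""
    resolved = []
    for child in children:
        if isinstance(child, Pending):
            if writer.supports(child.fmt):
                resolved.append(child.node)
        else:
            resolved.append(child)
    return resolved


document = [Pending("meta: keywords=docutils", "html"), "paragraph: Hello"]
print(resolve_pending(document, HTMLWriter()))
print(resolve_pending(document, PDFWriter()))
```

Either way, the writer only ever sees nodes it has declared support for, which is what keeps it ignorant of where the document came from.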
From: Tony J I. (Tibs) <to...@ls...> - 2002-05-30 09:08:11
|
David Goodger wrote a detailed critique of my wanderings upon a theme from Engelbert Gruber - thanks, David, that makes it make more sense to me. A couple of specific points to reply to: > In the case of PySource, an example of (a), I would expect to have a > transform reduce any custom nodes to standard node structures; tables > may suit the current code. Then a standard HTML writer, perhaps in > conjunction with a specialized stylesheet, could produce HTML very > different from that produced from standalone reStructuredText. (Not > having worked though the code yet, I wouldn't be surprised if this > isn't enough. That's OK; we'll fix it in time.) Ah. The way pysource works is to build an internal structure representing the Python code, insofar as it cares (i.e., it doesn't try to represent stuff it's not interested in), and then transform *this* into DPS nodes. This means that (within pysource) I have no need of "custom" nodes - the DPS datastructures are purely being built for consumption by a Writer (it's obviously a little bit more complex than that, because docstrings become DPS fragment trees within my "python" datastructure, but that makes sense in terms of treating individual docstrings as entire documents for the purposes of footnotes, headers, etc.). > The Writer itself works entirely with existing Docutils nodes, and any > nodes specific to its format. Readers know about input contexts > (PySource, PEP, standalone file, etc.), but Writers are intentionally > ignorant of context. If I understand that correctly, that is exactly what I am aiming for. > > (Briefly, my reason for liking option (1) is that with the single > > addition of a new "style" attribute to all nodes, I can get 80% of > > what I want ... and "style" translates, in HTML, to "class" > > There's already a "class" attribute on all nodes, which remains on the > HTML generated from the node. For example, see how the ``topic`` node > is handled in ``transforms.parts.Contents``. 
>
> > and with a new node called "group", I get 100% - where "group"
> > translates, in HTML, to <DIV> or <SPAN> as appropriate
>
> I think there's a danger in an overly generic node like "group", which
> is why I'm resisting. If you look (once again) at
> http://docutils.sf.net/spec/pysource.dtd, you'll see my first cut at a
> structure for representing the custom Python-source-related nodes
> needed (probably out of date).
...deletia...
> But since I haven't gone through the pysource code, my arguments may
> not hold water. My gut says "group" is an evil generalization.
> Perhaps my head just needs to see it in action to override my gut.

As I indicated above, I don't store Python structure "as such" in the DPS node tree (what *is* the current correct term for these entities, by the way, now that DPS has gone away?) - I just store the document that "talks about" the Python structure. Since the pysource.dtd is talking about DPS nodes, I'm not particularly interested (after all, they're not nodes a Writer is required to recognise).

The reason I want "group" is that one of the things I want to be able to do is to group together, visually, for instance, a section heading and some text thereafter (the layout is wrong, to get the idea across in ASCII art)::

    Class Fred                    <-- a "section heading"
    fullname: jimbob.clancy.Fred  <-- a paragraph or somesuch
    subclasses: BigClass          <-- ditto

but not the text that follows *that* - which might, for instance, be a docstring. Now, I may be able to do that with CSS just by use of the "class" values, but I had thought not - whereas (as I understand it?) this is exactly the sort of thing that <DIV> (or is it <SPAN>?) is aimed at in HTML.

And I'm pretty sure I could do the same sort of thing in TeX (it's really the TeX concept of "group" - i.e., ``{\something .. }`` - that I'm after, I think!).
It may be that my aims are evil, but they are certainly simply and easily met by provision of the equivalent of a "group" node (which may, of course, be translated into a table in a non-CSS environment). Anyway, I'm away all next week (school half-term holiday, and off to the sun), and maybe will be able to think about this more after that. Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ "How fleeting are all human passions compared with the massive continuity of ducks." - Dorothy L. Sayers, "Gaudy Night" My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) |
From: David G. <go...@us...> - 2002-05-31 01:18:03
|
Tony J Ibbs (Tibs) wrote (about "DPS nodes"):
> (what *is* the current correct term for these entities, by the way,
> now that DPS has gone away?)

I've been calling them "Docutils nodes". They come from the "docutils.nodes" module after all.

[David]
> > In the case of PySource, an example of (a), I would expect to have
> > a transform reduce any custom nodes to standard node structures...

[Tibs]
> Ah. The way pysource works is to build an internal structure
> representing the Python code, insofar as it cares (i.e., it doesn't
> try to represent stuff it's not interested in), and then transform
> *this* into DPS nodes.

That's fine. I'd just be wary of converting to standard Docutils nodes too early, because then you're locking in one style. This comes back to a discussion we had some time ago, about "stylist" components...

> This means that (within pysource) I have no need of "custom" nodes -
> the DPS datastructures are purely being built for consumption by a
> Writer

I think that a set of custom nodes in conjunction with some "stylist" transforms would work well. The PySource Reader wouldn't have to make any decisions as to style; just produce a logical tree, parsed & linked. Stylist transforms would understand the custom nodes; one of them would turn the custom nodes into standard Docutils nodes.

The point is that by decoupling the context-understanding part of the Reader from the layout-generating part(s), the whole becomes much more flexible. I'm not a big fan of the PyDoc-style layout, and PySource is similar. I'd like to be able to experiment with radically alternate layouts without having to touch the AST-grokking part of the code. To do that, I need access to the parsed data at an early stage, before it's been altered too much (at all) stylistically.

PySource could have a "--stylist" option specifying which stylist transform is desired.
In between the Reader proper and the transform, custom nodes would be fine, and probably preferable to standard Docutils nodes. > (it's obviously a little bit more complex than that, because > docstrings become DPS fragment trees within my "python" > datastructure, but that makes sense in terms of treating individual > docstrings as entire documents for the purposes of footnotes, > headers, etc.). This is where a "merge" transform would come in handy. See http://docutils.sf.net/spec/notes.html#reference-merging. To get all the details right there's lots of work to do! > As I indicated above, I don't store Python structure "as such" in > the DPS node tree - I just store the document that "talks about" the > Python structure. They're close parallels though, aren't they? By storing the 'document that "talks about" the Python structure' too soon, you lose the flexibility to render that document in multiple ways. > Since the pysource.dtd is talking about DPS nodes, I'm not > particularly interested (after all, they're not nodes a Writer is > required to recognise). But if the "internal structure representing the Python code" was made up of Docutils nodes (custom & standard), you could make use of the full Docutils transform machinery, without having to roll your own. I need to work out and explain the whole inner workings better. I *know* that if you understood the Docutils model better, you'd realize the benefits of the approach I'm advocating. Not enough hours in the day, and not enough days in the weekend. Gradually we'll get there; this is part of the process. 
> The reason I want "group" is that one of the things I want to be
> able to do is to group together, visually, for instance, a section
> heading and some text thereafter (the layout is wrong, to get the
> idea across in ASCII art)::
>
>     Class Fred                    <-- a "section heading"
>     fullname: jimbob.clancy.Fred  <-- a paragraph or somesuch
>     subclasses: BigClass          <-- ditto

But "group" has no intrinsic meaning; it's too generic. I'd rather see a whole bunch of specialized "groups", each specific to its task. The PySource DTD is a list of such groups. Incorporating your ideas (good ones, too), I've revised http://docutils.sf.net/spec/pysource.dtd quite a bit. Here's the new "class_section" and "fullname" element specs::

    <!ELEMENT class_section
        (class, inheritance_list?, fullname?, subclasses?,
         %structure.model;)>
    <!ELEMENT fullname
        (package | module | class | method | function)+>

So a class_section begins with a class name, then an inheritance list, then a full name, and a list of subclasses. A "fullname" is a list of references, rendered as a dotted name. "inheritance_list" and "subclasses" are lists of class references. In HTML, each of the references could be clicked on, providing navigation.

If the internal document tree used such nodes, it would be easy to transform them into a table-based colorful structure like now, or a more austere linear style, or others I can't even imagine now. It looks to me like you want to render "fullname" and "subclasses" as a field list; a stylist could do so.

Saying that you "want to be able to ... group together, visually" something is a dead giveaway. PySource is a Reader, and should not be concerned with style *at all*. Writers shouldn't be concerned with input context either, so the Python-specific stuff has to be gone by the time the document gets that far. An intermediate stylist transform is the answer.

If the piece of code that does the styling is kept small and modular, it would be much easier for people to roll their own styles.
The "barrier to entry" is too high now; extracting the stylist code would lower it considerably. You must resist the urge to mix content and style. That way lies the dark side. If once you start down the dark path, forever will it dominate your destiny, consume you it will. Keeping them separate is more work initially, but pays off enormously in the end. We're already starting to see the payoff with Docutils, with PDF and MoinMoin Writers under development. > but not the text that follows *that* - which might, for instance, be a > docstring. Now, I may be able to do that with CSS just by use of the > "class" values, but I had thought not CSS is limited. CSS1, which is well-supported in many of today's browsers, can't do much more than decorate; it can't transform structures the way you want. CSS2 can do more, but it's not supported well yet. > And I'm pretty sure I could do the same sort of thing in TeX (it's > really the TeX concept of "group" - i.e., ``{\something .. }`` - > that I'm after, I think!). Docutils nodes (described by the Docutils DTD) are exactly analogous to TeX structural markup. > It may be that my aims are evil, but they are certainly simply and > easily met by provision of the equivalent of a "group" node (which may, > of course, be translated into a table in a non-CSS environment). I want to give you a *plethora* of group nodes; one just won't cut it! Enjoy the sun! -- David Goodger <go...@us...> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ |
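A rough sketch of the stylist idea: one custom ``class_section`` node, two interchangeable stylists that reduce it to standard nodes. Nodes are modelled as plain ``(name, payload)`` tuples, which is a simplification and not the Docutils node API; the stylist functions and the ``STYLISTS`` registry (standing in for whatever a "--stylist" option might consult) are hypothetical.

```python
# Custom and standard nodes are modelled as plain (name, payload) tuples;
# this is a simplification, not the docutils node API.

def class_section(name, fullname, subclasses):
    """Build a hypothetical custom node as a Reader might emit it."""
    return ("class_section",
            {"class": name, "fullname": fullname, "subclasses": subclasses})


def field_list_stylist(node):
    """Render fullname and subclasses as a field list."""
    _, data = node
    return [
        ("title", "Class " + data["class"]),
        ("field_list", [
            ("fullname", ".".join(data["fullname"])),
            ("subclasses", ", ".join(data["subclasses"])),
        ]),
    ]


def linear_stylist(node):
    """A more austere style: one paragraph per fact."""
    _, data = node
    return [
        ("title", data["class"]),
        ("paragraph", "Full name: " + ".".join(data["fullname"])),
        ("paragraph", "Subclasses: " + ", ".join(data["subclasses"])),
    ]


# Hypothetical registry keyed by the name a user would select.
STYLISTS = {"fieldlist": field_list_stylist, "linear": linear_stylist}

# Node names a Writer is expected to understand in this toy model.
STANDARD = {"title", "paragraph", "field_list"}

custom = class_section("Fred", ["jimbob", "clancy", "Fred"], ["BigClass"])
for choice in ("fieldlist", "linear"):
    standard_nodes = STYLISTS[choice](custom)
    # After styling, nothing Python-specific is left for the Writer to see.
    assert all(name in STANDARD for name, _ in standard_nodes)
    print(choice, standard_nodes)
```

The Reader output (``custom``) is identical in both runs; only the stylist changes, which is the decoupling argued for here.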
From: Tony J I. (Tibs) <to...@ls...> - 2002-05-31 09:28:58
|
David Goodger wrote:
> I'd just be wary of converting to standard Docutils
> nodes too early, because then you're locking in one style.
> This comes back to a discussion we had some time ago, about
> "stylist" components...

and went on to explain how this could be used in a Pysource context to allow the "initial" docutils nodes tree to be more closely related to the Python data, so that it could be transformed into differing *specific* docutils node trees. Sort of an "aha!" moment for me, in fact.

(and, of course, my summary is a lot less cogent than what he said)

Thanks, David - that makes a lot of sense to me.

I think that what you describe is indeed the correct way to go - something like (excuse the imprecision)::

    Python source --> ||reader|| --> internal representation

    internal rep --> ||transform|| --> abstract docutils tree

    abstract tree --> ||stylists|| --> specific docutils tree

where the "abstract" tree contains Python specific nodes.

I'll look more at all of this when I get back from holiday - some interesting restructuring of pysource needs to be done (and some learning of what docutils is now capable of), but you're clearly right that it will lead to a better tool.

> Enjoy the sun!

Thanks

Tibs

-- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ "I'm a little monster, short and stout Here's my horns and here's my snout When you come a calling, hear me shout I will ROAR and chase you out" My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) |
From: David G. <go...@us...> - 2002-06-01 01:29:17
|
Tony J Ibbs (Tibs) wrote:
> Thanks, David - that makes a lot of sense to me.

I'm very glad (and relieved) it did! I'll migrate parts of the discussion into the docs in time.

> I think that what you describe is indeed the correct way to go -
> something like (excuse the imprecision)::
>
>     Python source --> ||reader|| --> internal representation
>
>     internal rep --> ||transform|| --> abstract docutils tree
>
>     abstract tree --> ||stylists|| --> specific docutils tree
>
> where the "abstract" tree contains Python specific nodes.

I'd put it more like this (expanding a bit)::

    Python source
      --> ||Reader (internal representation)||
      --> custom Docutils tree
      --> ||stylist transform||
      --> standard Docutils tree
      --> ||other transforms||
      --> standard Docutils tree
      --> ||Writer||
      --> final data format (HTML, PDF, etc.)

In other words, the Reader's "internal representation" is entirely internal to the reader, and can be anything at all. The PySource Reader will emit a Python-specific custom Docutils tree. The stylist (transform) will convert this to a standard Docutils tree. And so on.

Different words expressing the same thing, I suspect.

So, how was the sun?

-- David Goodger <go...@us...> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ |
From: Tony J I. (Tibs) <to...@ls...> - 2002-06-10 08:23:34
|
David Goodger wrote:
> I'd put it more like this (expanding a bit)::
>
>     Python source
>         --> ||Reader (internal representation)||
>         --> custom Docutils tree
>         --> ||stylist transform||
>         --> standard Docutils tree
>         --> ||other transforms||
>         --> standard Docutils tree
>         --> ||Writer||
>         --> final data format (HTML, PDF, etc.)
>
> In other words, the Reader's "internal representation" is entirely
> internal to the reader, and can be anything at all.

OK - that makes sense, because then all the complex "gubbins" is
inside the Reader (which I have as multiple components, of course, but
that's my business/problem).

> The PySource Reader will emit a Python-specific custom Docutils tree.
> The stylist (transform) will convert this to a standard Docutils tree.

Thus making it easier for other people who want to amend stuff that
doesn't depend on the innards of the Reader to ignore it entirely, and
just play with adding stylists, etc - yes, a good paradigm to work to.

> Different words expressing the same thing, I suspect.

Essentially, but the *detail* of the words we're working towards leads
towards a better implementation strategy.

> So, how was the sun?

Very nice - we enjoyed ourselves. Lots of work stuff to catch up on,
though. Broadband connection at home due tomorrow - let's cross
fingers it works...

Tibs

--
Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)
|
From: David G. <go...@us...> - 2002-05-30 01:31:36
|
Sorry I've taken so long to reply; been busy.

Engelbert Gruber wrote:
> who might not know anything about the writer and i want to keep
> the writer making output, maybe ugly but never lose content.
> i.e. even if i do not know what it is then things will be
> written as standard text.

As I replied last time,

    ``Node.walk`` & ``Node.walkabout`` call
    ``NodeVisitor.unknown_visit`` & ``.unknown_departure`` when
    unknown node types are encountered.

You can implement the ``unknown_visit``/``unknown_departure`` methods
of your ``NodeVisitor`` subclass to do whatever you like (such as call
``node.astext()``). If you choose to do this, I'd recommend issuing a
warning (a Python warning is OK, doesn't have to be a system_message)
saying "an unknown node (name) has been encountered", otherwise it
will be difficult to track down the remaining unimplemented node
types.

This is only an interim solution though. It's OK during development,
but the writer won't be complete until *all* node types are handled in
some way. At that point, the unknown_visit/unknown_departure methods
should be removed, so that later bugs *will* be noticed.

--
David Goodger <go...@us...>
Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
  (includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/
|
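The interim fallback recommended in the message above might look
roughly like the sketch below. To keep it runnable without Docutils
installed, ``Node``, ``paragraph``, ``sidebar`` and ``DraftTranslator``
are hypothetical stand-ins, and the ``walk`` method only imitates the
dispatch described in the message (look up ``visit_<class name>``,
otherwise call ``unknown_visit``); it is not the real ``Node.walk``.
Only the ``warnings`` calls are real library API.

```python
import warnings


class Node:
    """Hypothetical stand-in for a Docutils node (not the real class)."""

    def __init__(self, text=""):
        self.text = text

    def astext(self):
        return self.text

    def walk(self, visitor):
        # Imitates the dispatch described above: use visit_<name> if the
        # visitor defines it, otherwise fall back to unknown_visit.
        name = "visit_" + self.__class__.__name__
        method = getattr(visitor, name, visitor.unknown_visit)
        method(self)


class paragraph(Node):
    pass


class sidebar(Node):
    """Pretend the writer has not implemented this node type yet."""


class DraftTranslator:
    """A writer still under development (sandbox stage)."""

    def __init__(self):
        self.body = []

    def visit_paragraph(self, node):
        self.body.append(node.astext() + "\n")

    def unknown_visit(self, node):
        # Interim only: never lose content, but say so loudly.
        warnings.warn(
            "an unknown node (%s) has been encountered"
            % node.__class__.__name__
        )
        self.body.append(node.astext())


translator = DraftTranslator()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    paragraph("Known text.").walk(translator)
    sidebar("Unknown text.").walk(translator)

for warning in caught:
    print(warning.message)
```

The warning names the node class, so the list of node types still to
be implemented can be collected from the warnings during test runs.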
From: <gr...@us...> - 2002-05-31 06:01:09
|
On Wed, 29 May 2002, David Goodger wrote:
> Sorry I've taken so long to reply; been busy.

i am busy too, as you might have noticed by my sandbox.

> As I replied last time,
>
>     ``Node.walk`` & ``Node.walkabout`` call
>     ``NodeVisitor.unknown_visit`` & ``.unknown_departure`` when
>     unknown node types are encountered.
>
> You can implement the ``unknown_visit``/``unknown_departure`` methods
> of your ``NodeVisitor`` subclass to do whatever you like (such as call
> ``node.astext()``). If you choose to do this, I'd recommend issuing a
> warning (a Python warning is OK, doesn't have to be a system_message)
> saying "an unknown node (name) has been encountered", otherwise it
> will be difficult to track down the remaining unimplemented node
> types.
>
> This is only an interim solution though. It's OK during development,
> but the writer won't be complete until *all* node types are handled in
> some way. At that point, the unknown_visit/unknown_departure methods
> should be removed, so that later bugs *will* be noticed.
>

A Python warning means what to you?

And this is not an interim solution: as a user i want a document
output and this has to contain every text, so i would leave the
unknown_visit/departure in the writer. it does no harm if everything
is known, but it gives at least ill-formatted output if something is
unknown.

--
BINGO: Red Flag
--- Engelbert Gruber -------+
SSG Fintl,Gruber,Lassnig /
A6410 Telfs Untermarkt 9 /
Tel. ++43-5262-64727 ----+
|
From: David G. <go...@us...> - 2002-06-01 01:33:03
|
> On Wed, 29 May 2002, David Goodger wrote:
> > As I replied last time,
> >
> >     ``Node.walk`` & ``Node.walkabout`` call
> >     ``NodeVisitor.unknown_visit`` & ``.unknown_departure`` when
> >     unknown node types are encountered.
> >
> > You can implement the ``unknown_visit``/``unknown_departure``
> > methods of your ``NodeVisitor`` subclass to do whatever you like
> > (such as call ``node.astext()``). If you choose to do this, I'd
> > recommend issuing a warning (a Python warning is OK, doesn't have
> > to be a system_message) saying "an unknown node (name) has been
> > encountered", otherwise it will be difficult to track down the
> > remaining unimplemented node types.
> >
> > This is only an interim solution though. It's OK during
> > development, but the writer won't be complete until *all* node
> > types are handled in some way. At that point, the
> > unknown_visit/unknown_departure methods should be removed, so that
> > later bugs *will* be noticed.

Engelbert Gruber wrote:
> A Python warning means what to you ?

The Python ``warnings`` module::

    import warnings
    warnings.warn("This is a warning.")

> And this is not an interim solution, as a user i want a document
> output and this has to contain every text, so i would leave the
> unknown_visit/departure in the writer. it does no harm if everything
> is known, but it gives at least ill-formatted output if something is
> unknown.

Leaving the unknown_visit/departure methods in the Writer *once it's
complete* is *not* an option. It *will* do harm. I'll spell it out as
explicitly and completely as I can.

During Writer development (i.e., while it's still in the sandbox),
having unknown_visit/departure methods is acceptable; you want to test
the code without getting exception tracebacks from node types you
haven't had time to implement yet. The goal of the development is a
**complete** Writer, one that handles *all* standard Docutils nodes.
At this point, the Writer can be moved into the Docutils distribution.
Until the Writer is complete, it *will not* be moved into the
distribution.

Once Docutils is mature, new node types will be rare. The ones that
*are* introduced will mostly be esoteric (all the basic ones are
already there) and therefore rarely used and perhaps not easily
noticed in the output. When a new node class *is* introduced, *all*
Writers *must* be updated to support it. However...

Let's imagine that a Writer is missed by accident and not updated with
support for the new node. If that Writer contains catch-all
unknown_visit/departure methods, it would give *no sign* that support
for the new node type is missing. The Writer would produce broken
output silently (i.e., without some kind of explicit indication that
"I don't know what an XYZ node is!"), which is *not* acceptable. Such
a Writer could languish for a long time, producing broken output which
users may not notice, because of the esoteric nature of the new node,
or because presumably, the unknown_visit/departure methods would
produce some *approximation* of the correct output, like plain text.
Approximation is not good enough. Rather, the Writer should raise an
exception if it hasn't been updated properly.

Leaving the catch-all unknown_visit/departure methods in a Writer is
dangerous. It's similar to having try/except blocks that don't
explicitly specify exception classes (they'll catch *anything*, which
is usually not what you want).

But as I said before, in the interim (sandbox), anything goes.

--
David Goodger <go...@us...>
Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
  (includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/
|
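To make the argument above concrete, here is a small sketch contrasting
a Writer that keeps a catch-all with one that does not. All names
(``Node``, ``paragraph``, ``new_fangled``, ``StrictTranslator``,
``LenientTranslator``, ``dispatch``) are hypothetical stand-ins, not
Docutils API, and raising ``NotImplementedError`` for an unknown node
is an assumption of this sketch about how a strict visitor would
behave.

```python
class Node:
    """Hypothetical stand-in for a Docutils node (not the real class)."""

    def __init__(self, text=""):
        self.text = text

    def astext(self):
        return self.text


class paragraph(Node):
    pass


class new_fangled(Node):
    """A node type introduced after both writers were written."""


class StrictTranslator:
    """No catch-all: an unsupported node type raises an exception."""

    def __init__(self):
        self.body = []

    def dispatch(self, node):
        name = node.__class__.__name__
        method = getattr(self, "visit_" + name, None)
        if method is None:
            raise NotImplementedError(
                "%s visiting unknown node type: %s"
                % (self.__class__.__name__, name)
            )
        method(node)

    def visit_paragraph(self, node):
        self.body.append(node.astext())


class LenientTranslator(StrictTranslator):
    """Keeps a catch-all: unknown nodes are silently approximated."""

    def dispatch(self, node):
        name = "visit_" + node.__class__.__name__
        getattr(self, name, self.unknown_visit)(node)

    def unknown_visit(self, node):
        self.body.append(node.astext())  # plain-text approximation


nodes_in_document = [paragraph("Plain text."), new_fangled("Esoteric text.")]

lenient = LenientTranslator()
for node in nodes_in_document:
    lenient.dispatch(node)  # no sign that new_fangled is unsupported

strict = StrictTranslator()
error_message = None
try:
    for node in nodes_in_document:
        strict.dispatch(node)
except NotImplementedError as exc:
    error_message = str(exc)  # the missing support is noticed at once

print(lenient.body)
print(error_message)
```

The lenient writer finishes without complaint and its output merely
looks plausible; the strict one fails on the first document that uses
the new node, which is the behaviour argued for above.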