From: Dethe E. <de...@ma...> - 2002-09-21 06:34:09
On Friday, September 20, 2002, at 07:55 PM, David Goodger wrote:

> Dethe Elza wrote:
>> Also, if we parse directly to a DOM it makes reST more flexible and
>> easier to port, since a DOM binding exists for most languages. Many
>> DOMs use either SAX or Expat to build the DOM itself, my idea would
>> be to replace the low-level parser with reST.
>
> I don't see how that would improve flexibility. The parser can
> already build a real DOM tree; just call ``document.asdom()``. What
> benefits would a DOM approach provide? I'm not being defensive; I'd
> really like to know. If there is a benefit that outweighs the cost,
> it should be explored.

I didn't know about asdom(); I'll have to explore that to see how
expensive it is and which DOM implementation it uses.

>> Building up a true XML DOM internally has several advantages. More
>> potential developers would be familiar with the API than are
>> currently comfortable with the reST internals.
>
> I don't think the document tree is the bottleneck. Rather, I think
> it's the complexity of the parser. Unfortunately, parsing
> reStructuredText *is* complex, because it has to grok two-dimensional
> patterns that humans understand implicitly. It's the curse of
> user-friendliness. ;-)

Yes, that's certainly true, and one of the things I really like about
reST is the effort it takes to make the document author's life easier.
I still think that the internals could be simplified and that this
would encourage more participation in the project. I've seen some
complaints about the complexity of reST in toto, which I think could be
addressed by modularizing reST, but that's another issue. I agree that
the parser is the most complex component of reST, so if the project
focuses its own code on the parser and reuses architecture from the
Python libraries for the rest of reST, it may be easier to grok for a
programmer coming to it fresh.

>> Writers could be written in XSLT without knowing anything about reST
>> besides its DTD.
>
> This can already be done: just use ``document.asdom()`` then run that
> through the XSLT engine. The reason we don't go that route is because
> there is no XSLT engine in core Python. If PyXML is ever incorporated
> into the core, we can re-examine that decision.

I thought a version of PyXML was part of the core now, but not 4Suite.
Besides, Optik is not part of core Python, but it drastically
simplifies the reST code, so it's included in Docutils.

>> And my *other* project of converting existing HTML and DocBook
>> documents into reST for maintenance would be that much easier!
>
> I don't follow this at all. Can you elaborate?

Sorry, that wasn't very clear. I want to think of the reST DOM as its
canonical form, so I can transform XHTML and DocBook to reST via XSLT.
Ideally I also want a writer to create reST from the reST DOM.

>> Even further off-topic, the docs mention that reST has constructs
>> which are missing from DocBook. What are they?
>
> There are plenty. Off the top of my head: field lists, option lists,
> decorations (headers & footers), doctest blocks, line blocks,
> transitions. None of these are difficult to render or approximate
> using regular DocBook elements, it's just that there's no one-to-one
> correspondence. Even in elements where there *is* a strong
> correspondence, some are not completely compatible, such as definition
> lists. It is the goal of http://docutils.sf.net/spec/doctree.html to
> document all of this; any assistance would be gratefully accepted and
> much appreciated.

Thanks, that's a good start. DocBook has added support for describing
EBNF in documentation, as well as including modules for MathML and SVG.
It is essentially a superset of XmlSpec, which is the *other* widely
used XML documentation format (at least in the W3C). I just had a wild
idea that instead of inventing a new XML DTD for internal structure,
reST could use DocBook (or a subset of it) for its DOM representation.
Like I said, a wild idea.

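Coming back to the writer idea for a moment: here is a rough sketch of
what "a writer to create reST from the reST DOM" might look like. It is
not Docutils code. The standard-library xml.dom.minidom stands in for
whatever DOM asdom() hands back, the sample XML is hand-written (only
the element names document, section, title, paragraph and transition
are taken from the doctree), and text_of, to_rest and dom_to_rest are
hypothetical names made up for illustration.

```python
from xml.dom import minidom

# Hand-written stand-in for a serialized doctree. The element names follow
# the Docutils doctree; the content itself is invented for this sketch.
SAMPLE = (
    "<document><section><title>Intro</title>"
    "<paragraph>First paragraph.</paragraph>"
    "<transition/>"
    "<paragraph>Second paragraph.</paragraph>"
    "</section></document>"
)


def text_of(node):
    """Concatenate all text nodes at or below ``node``."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(text_of(child) for child in node.childNodes)


def to_rest(node):
    """Hypothetical toy writer: return a list of reST blocks for ``node``."""
    blocks = []
    for child in node.childNodes:
        if child.nodeType != child.ELEMENT_NODE:
            continue
        if child.tagName == "title":
            title = text_of(child)
            # Section title: the text with an underline of equal length.
            blocks.append(title + "\n" + "=" * len(title))
        elif child.tagName == "paragraph":
            blocks.append(text_of(child))
        elif child.tagName == "transition":
            # A transition is a line of repeated punctuation characters.
            blocks.append("-" * 10)
        else:
            # Containers such as <section>: recurse into the children.
            blocks.extend(to_rest(child))
    return blocks


def dom_to_rest(xml_text):
    """Parse ``xml_text`` into a DOM and render it as reST text."""
    dom = minidom.parseString(xml_text)
    # Blocks are separated by one blank line, as reST requires.
    return "\n\n".join(to_rest(dom.documentElement)) + "\n"


if __name__ == "__main__":
    print(dom_to_rest(SAMPLE), end="")
```

The point is only that such a writer needs nothing but the DOM API and
the element names. Nested sections, inline markup, lists and so on are
all ignored here, and a real writer would have to handle them.
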

> The Docutils document model was designed by me (with much input, of
> course), as it makes sense to me. I've had some experience with
> various models, including DocBook and TEI, and I've designed several
> DTDs before. Every document designer has different sensibilities, so
> differences and incompatibilities are inevitable. For example, I know
> of no DTD that has the equivalent of a "transition" element, although
> they're quite common in novels and articles.

<hr /> doesn't qualify?

Thanks for the eloquent feedback.

--Dethe