|
From: Aahz <aa...@py...> - 2002-08-03 00:38:51
|
Okay, so there's simple docs on writing documents in reST, but I'm
having difficulty figuring out how to write tools for processing reST
without understanding the whole system, particularly given that I'm not
exactly an XML geek.

My primary interest -- I think -- is in a Writer class, because I'm
trying to create OpenOffice documents.
--
Aahz (aa...@py...) <*> http://www.pythoncraft.com/
Project Vote Smart: http://www.vote-smart.org/
|
From: David G. <go...@us...> - 2002-08-03 01:21:39
|
Aahz wrote:
> Okay, so there's simple docs on writing documents in reST, but I'm
> having difficulty figuring out how to write tools for processing reST
> without understanding the whole system, particularly given that I'm not
> exactly an XML geek.

A "How a Writer works & how to write one" document is on the to-do list,
but that's all so far.

> My primary interest -- I think -- is in a Writer class, because I'm
> trying to create OpenOffice documents.

OpenOffice docs are XML; internal Docutils document trees are
equivalent to DOM trees, which is XML. Guess what? You're becoming
an XML geek. ;-)

A Writer walks the internal document tree, and translates each node
or group of nodes into the target (OpenOffice) structure. To write
a Writer, you have to understand both the Docutils document tree and
the target format. I believe the OpenOffice document structure has
some documentation (how good, I don't know). The Docutils document
tree is documented in spec/docutils.dtd, which is current but skeletal
(it's only a gross structure description; it says nothing about
semantics or internal node attributes), and in spec/doctree.txt, which
is incomplete. To get something more complete and fleshed-out, ask
questions. I'll be happy to answer, and those answers will go into
the docs. I'll try to work on the docs too, but it will go faster
with prompting from you.

Start by examining the most complete Writer module we have,
docutils/writers/html4css1.py. Take the output from processing a
document with tools/docutils-xml.py, and compare to the output from
tools/html.py. The HTMLTranslator class in html4css1.py traverses
the tree given by the docutils-xml.py output, which is equivalent
to the internal document tree, using a Visitor pattern. Every node
in the document tree triggers a "visit_node" method on entry, and a
"depart_node" method on exit. These methods build a target format
document one piece at a time.

Next, ask questions. Iterate. Continual incremental improvement.
--
David Goodger <go...@us...> Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
(includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/
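A minimal sketch of the visit/depart traversal described in the message above. It does not use the real Docutils classes: `Node`, `accept`, and `ToyHTMLTranslator` are hypothetical stand-ins, and the only thing taken from the thread is the idea that each node triggers a "visit_..." method on entry and a "depart_..." method on exit, with the methods building the output one piece at a time.

```python
class Node:
    """Toy document-tree node (hypothetical; not a docutils.nodes class)."""

    def __init__(self, name, children=(), text=""):
        self.name = name
        self.children = list(children)
        self.text = text

    def accept(self, visitor):
        """Call visit_<name> on entry, recurse into children, then depart_<name>."""
        getattr(visitor, "visit_" + self.name)(self)
        for child in self.children:
            child.accept(visitor)
        getattr(visitor, "depart_" + self.name)(self)


class ToyHTMLTranslator:
    """Collects output fragments in self.body as the tree is walked."""

    def __init__(self):
        self.body = []

    def visit_document(self, node):
        self.body.append("<html><body>")

    def depart_document(self, node):
        self.body.append("</body></html>")

    def visit_paragraph(self, node):
        self.body.append("<p>")

    def depart_paragraph(self, node):
        self.body.append("</p>")

    def visit_text(self, node):
        self.body.append(node.text)

    def depart_text(self, node):
        pass  # nothing to close for plain text


tree = Node("document", [Node("paragraph", [Node("text", text="Hello")])])
translator = ToyHTMLTranslator()
tree.accept(translator)
print("".join(translator.body))
```

A Writer for another target format would, under the same assumptions, keep the traversal and swap in a translator whose visit/depart methods emit that format's markup instead of HTML.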
|
From: Aahz <aa...@py...> - 2002-08-04 00:15:38
|
On Fri, Aug 02, 2002, David Goodger wrote:
> Aahz wrote:
>>
>> My primary interest -- I think -- is in a Writer class, because I'm
>> trying to create OpenOffice documents.
>
> OpenOffice docs are XML; internal Docutils document trees are
> equivalent to DOM trees, which is XML. Guess what? You're becoming
> an XML geek. ;-)

<sour look> Yes, I knew that already. Fortunately, I think I mostly
don't need to actually understand what I'm doing.

> A Writer walks the internal document tree, and translates each node
> or group of nodes into the target (OpenOffice) structure. To write
> a Writer, you have to understand both the Docutils document tree and
> the target format. I believe the OpenOffice document structure has
> some documentation (how good, I don't know). The Docutils document
> tree is documented in spec/docutils.dtd, which is current but skeletal
> (it's only a gross structure description; it says nothing about
> semantics or internal node attributes), and in spec/doctree.txt, which
> is incomplete. To get something more complete and fleshed-out, ask
> questions. I'll be happy to answer, and those answers will go into
> the docs. I'll try to work on the docs too, but it will go faster
> with prompting from you.

I was a bit confused about the relationship between write() and
translate(); I think the latter might be better named "visit()". I'm
less using the OpenOffice docs than I am writing OpenOffice documents
and examining the XML output.

> Start by examining the most complete Writer module we have,
> docutils/writers/html4css1.py. Take the output from processing a
> document with tools/docutils-xml.py, and compare to the output from
> tools/html.py. The HTMLTranslator class in html4css1.py traverses
> the tree given by the docutils-xml.py output, which is equivalent
> to the internal document tree, using a Visitor pattern. Every node
> in the document tree triggers a "visit_node" method on entry, and a
> "depart_node" method on exit. These methods build a target format
> document one piece at a time.

Okay, got that. It's not that different from sgmllib.

Now, how do I add a new directive? (For index tags.)
--
Aahz (aa...@py...) <*> http://www.pythoncraft.com/
Project Vote Smart: http://www.vote-smart.org/
|
From: David G. <go...@us...> - 2002-08-04 15:27:36
|
Aahz wrote:
> I was a bit confused about the relationship between write() and
> translate(); I think the latter might be better named "visit()".

I think of "visit()" as an abstract tree traversal operation: the
mechanics. "translate()" is what the "visit" is actually accomplishing:
semantics. Perhaps "transform()" would have been better, but it was
already taken. ;-)
--
David Goodger <go...@us...> Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
(includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/
|
From: David G. <go...@us...> - 2002-08-04 15:44:24
|
Aahz wrote:
> Now, how do I add a new directive? (For index tags.)

Take a look at docutils/parsers/rst/directives/images.py for relevant
examples. If you could show us some concrete examples of what you want, it
would be easier to advise.

Were I writing a book with an index, I guess I'd need two different kinds of
index tags: inline/implicit and out-of-line/explicit. For example::

    In this `paragraph`:index:, several words are being
    `marked`:index: inline as implicit `index`:index:
    entries.

    .. index:: markup
    .. index:: syntax

    The explicit index directives above would refer to
    this paragraph.

The words "paragraph", "marked", and "index" would become index entries
pointing at the words in the first paragraph. The index entries appear
verbatim in the text. (Don't worry about the ugly ":index:" part; if
indexing is the only application of interpreted text in your documents, it
can be implicit and omitted.) The two directives provide manual indexing,
where the index entry words ("markup" and "syntax") do not appear in the
main text. We could combine the two directives into one::

    .. index:: markup; syntax

Semicolons instead of commas because commas could *be* part of the index
entry, like::

    .. index:: van Rossum, Guido

Sometimes index entries have multiple levels. Given::

    .. index:: statement syntax: expression statements

In the index, combined with other entries, it might look like this::

    statement syntax
        expression statements ..... 56
        assignment ................ 57
        simple statements ......... 58
        compound statements ....... 60
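A rough sketch of how the proposed directive argument could be split into entries. The syntax is only a proposal in this thread and nothing is implemented, so the function name `parse_index_argument` and its return shape (a list of tuples, one tuple of levels per entry) are assumptions made for illustration. Semicolons separate entries, colons separate levels, and commas are left alone because they may be part of an entry.

```python
def parse_index_argument(argument):
    """Split the text after ".. index::" into entries (hypothetical helper).

    Returns a list of tuples; each tuple holds the levels of one entry,
    outermost level first.
    """
    entries = []
    for raw_entry in argument.split(";"):
        raw_entry = raw_entry.strip()
        if not raw_entry:
            continue  # ignore empty pieces such as a trailing semicolon
        levels = tuple(level.strip() for level in raw_entry.split(":"))
        entries.append(levels)
    return entries


print(parse_index_argument("markup; syntax"))
print(parse_index_argument("van Rossum, Guido"))
print(parse_index_argument("statement syntax: expression statements"))
```

Under these assumptions an entry that itself contains a colon or semicolon cannot be expressed; an escaping rule would be needed, which the thread does not discuss.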
How does all that sound? These are just my initial ideas. Nothing's been
implemented yet, so tell us your requirements to get them incorporated.
Last time around, you mentioned "see / see also" index entries; I'm
unfamiliar with the semantics of these. I guess I'll be reading more of the
DocBook docs, a great reference for this type of thing (although the DocBook
implementation is not always ideal).
--
David Goodger <go...@us...> Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
(includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/
|
|
From: Aahz <aa...@py...> - 2002-08-04 20:26:03
|
On Sun, Aug 04, 2002, David Goodger wrote:
> Aahz wrote:
>>
>> Now, how do I add a new directive? (For index tags.)
>
> Take a look at docutils/parsers/rst/directives/images.py for relevant
> examples. If you could show us some concrete examples of what you want, it
> would be easier to advise.

After a quick skim of images.py: So, in other words, in order to add a
new directive, I'm going to have to understand how the state machine
works?

> In the index, combined with other entries, it might look like this::
>
>     statement syntax
>         expression statements ..... 56
>         assignment ................ 57
>         simple statements ......... 58
>         compound statements ....... 60

Note that currently I don't care about reST generating the actual index;
I'm only interested in emitting index tags for OpenOffice. That goes
double because OpenOffice will be converted to Word and thence to Frame
before the index gets generated.

(Yes, I'm ignoring your questions about requirements for now; I'm more
interested in hacking up something which gives minimal results.)
--
Aahz (aa...@py...) <*> http://www.pythoncraft.com/
Project Vote Smart: http://www.vote-smart.org/
|
From: David G. <go...@us...> - 2002-08-04 21:06:43
|
Aahz wrote:
> After a quick skim of images.py: So, in other words, in order to add a
> new directive, I'm going to have to understand how the state machine
> works?

I wouldn't go that far. You'll need some knowledge of the inner workings,
yes; there's no free lunch. The directive API is summarized in
docutils/parsers/rst/directives/__init__.py, except for details about the
return value, which I've just added:

    Directive functions return a tuple of two values:

    - a list of nodes which will be inserted into the document tree at
      the point where the directive was encountered (can be an empty
      list), and

    - a boolean, true iff the directive block finished on a blank line.

You'll need some services from the state machine, to get the line(s)
following the directive, usually an indented block, and to determine the
"blank_finish" state. The "image" directive does everything you'd need
yours to do, I think, so it should be possible to hack something up.

To use the directive, it has to be registered in the directives/__init__.py
module. You'll also need a new node class to instantiate, perhaps
"index_target" or "index_term". Node classes usually live in
docutils/nodes.py, but directives can make their own if they're specialized;
see directives/html.py, class "meta" (inside class "MetaBody").
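To make the return-value contract above concrete, here is a toy directive function. It is a sketch only: the signature `index_directive(argument, next_line)` is hypothetical and is not the Docutils directive API (which passes in the state machine), and the `index_term` class is a stand-in for the new node class suggested above. The only part taken from the message is the shape of the result: a list of nodes plus a boolean that is true when the directive block finished on a blank line.

```python
class index_term:
    """Hypothetical node class; the thread suggests "index_target" or "index_term"."""

    def __init__(self, term):
        self.term = term


def index_directive(argument, next_line):
    """Toy directive function (assumed signature, not the Docutils API).

    argument  -- text after ".. index::", e.g. "markup; syntax"
    next_line -- the source line that follows the directive block; in
                 Docutils the state machine would supply this information
    """
    terms = [part.strip() for part in argument.split(";") if part.strip()]
    nodes = [index_term(term) for term in terms]  # may be an empty list
    blank_finish = not next_line.strip()  # true iff the block ended on a blank line
    return nodes, blank_finish


nodes, blank_finish = index_directive("markup; syntax", "")
print([node.term for node in nodes], blank_finish)
```

In the real system the directive would also have to be registered and would obtain its argument and the following lines from the state machine, as described above; those steps are left out here because their exact interfaces are not given in the thread.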
> Note that currently I don't care about reST generating the actual index;
> I'm only interested in emitting index tags for OpenOffice.

Understood. I was just giving an example to illustrate semantics.

> (Yes, I'm ignoring your questions about requirements for now; I'm more
> interested in hacking up something which gives minimal results.)

If you get stuck, let us know what you need, and you may get some help.
Until then, happy hacking!
--
David Goodger <go...@us...> Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
(includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/
|