[Docstring-checkins] CVS: dps/spec dps-notes.txt,1.22,1.23
From: David G. <go...@us...> - 2002-02-06 03:11:44
Update of /cvsroot/docstring/dps/spec
In directory usw-pr-cvs1:/tmp/cvs-serv12722/dps/spec

Modified Files:
    dps-notes.txt
Log Message:
updated

Index: dps-notes.txt
===================================================================
RCS file: /cvsroot/docstring/dps/spec/dps-notes.txt,v
retrieving revision 1.22
retrieving revision 1.23
diff -C2 -d -r1.22 -r1.23
*** dps-notes.txt 2002/01/30 04:56:54 1.22
--- dps-notes.txt 2002/02/06 03:11:41 1.23
***************
*** 47,51 ****
    Doc-SIG post 'Suggestions for reST "modes"' as a base.

! - Write modules for common transforms. See Transforms_ below.

  - Ask Python-dev for opinions (GvR for a pronouncement) on special
--- 47,52 ----
    Doc-SIG post 'Suggestions for reST "modes"' as a base.

! - Write modules for common transforms. See `Unimplemented Transforms`_
!   below.

  - Ask Python-dev for opinions (GvR for a pronouncement) on special
***************
*** 61,64 ****
--- 62,68 ----
  - Apply the `coding conventions`_ as given below.

+ - Merge reStructuredText into DPS and rename it to "docutils".
+   (SourceForge project registered & waiting.)
+

  Coding Conventions
***************
*** 86,91 ****


! Transforms
! ==========

  Footnote Gathering
--- 90,95 ----


! Unimplemented Transforms
! ========================

  Footnote Gathering
***************
*** 111,115 ****

  - duplicate reference and/or substitution names that need to be made
!   unique; and
  - duplicate footnote numbers that need to be renumbered.

--- 115,119 ----

  - duplicate reference and/or substitution names that need to be made
!   unique; and/or
  - duplicate footnote numbers that need to be renumbered.

***************
*** 119,122 ****
--- 123,143 ----


+ Document Splitting
+ ------------------
+
+ If the processed document is written to multiple files (possibly in a
+ directory tree), it will need to be split up. References will have to
+ be adjusted.
+
+ (HTML only? See Deployment_ below.)
+
+
+ Navigation
+ ----------
+
+ If a document is split up, each segment will need navigation links:
+ parent, children (small TOC), previous (preorder), next (preorder).
+
+
  Table of Contents
  -----------------
***************
*** 124,130 ****
  This runs over the entire tree, and locates <section> elements. It
  produces a <contents> subtree, which can be inserted at the
! appropriate place, with links to the <section>s. It needs to make sure
! that the links it uses are *real*, so ideally it will use the
! "implicit" link for a section when it exists, and it will have to
  invent one when the implicit link isn't there (presumably because the
  section is the twelfth "Introduction" in the document...).
--- 145,151 ----
  This runs over the entire tree, and locates <section> elements. It
  produces a <contents> subtree, which can be inserted at the
! appropriate place, with links to the <section> elements. It needs to
! make sure that the links it uses are *real*, so ideally it will use
! the "implicit" link for a section when it exists, and it will have to
  invent one when the implicit link isn't there (presumably because the
  section is the twelfth "Introduction" in the document...).
***************
*** 171,205 ****


! Modes and Styles
! ================

! The Python docstring mode model that's evolving in my mind goes
! something like this:

! 1. Extract the docstring/namespace tree from the module(s) and/or
     package(s).

  2. Run the parser on each docstring in turn, producing a forest of
     trees (internal data structure as per nodes.py).

! 3. Run various transformations on the individual docstring trees.
!    Examples: resolving cross-references; resolving hyperlinks;
!    footnote auto-numbering; first field list -> bibliographic
!    elements.

! 4. Join the docstring trees together into a single tree, running more
!    transformations (such as creating various sections like "Module
!    Attributes", "Functions", "Classes", "Class Attributes", etc.; see
!    the DPS spec/ppdi.dtd).

! 5. Pass the resulting unified tree to the output formatter.

  I've had trouble reconciling the roles of input parser and output
! formatter with the idea of "modes". Does the mode govern the
! tranformation of the input, the output, or both? Perhaps the mode
! should be split into two.

  For example, say the source of our input is a Python module. Our
! "input mode" should be "Python Docstring Mode". It discovers (from
  ``__docformat__``) that the input parser is "reStructuredText". If we
  want HTML, we'll specify the "HTML" output formatter. But there's a
--- 192,234 ----


! Python Source Reader
! ====================

! The Python Source Reader ("PySource") model that's evolving in my mind
! goes something like this:

! 1. Extract the docstring/namespace [#]_ tree from the module(s) and/or
     package(s).

+    .. [#] See `Docstring Extractor`_ above.
+
  2. Run the parser on each docstring in turn, producing a forest of
     trees (internal data structure as per nodes.py).

! 3. Join the docstring trees together into a single tree, running
!    transforms:

!    - merge hyperlinks
!    - merge namespaces
!    - create various sections like "Module Attributes", "Functions",
!      "Classes", "Class Attributes", etc.; see the DPS spec/ppdi.dtd
!    - convert the above special sections to ordinary DPS nodes

! 4. Run transforms on the combined doctree. Examples: resolving
!    cross-references/hyperlinks (including interpreted text on Python
!    identifiers); footnote auto-numbering; first field list ->
!    bibliographic elements.

+    (Or should step 4's transforms come before step 3?)
+
+ 5. Pass the resulting unified tree to the writer/builder.
+
  I've had trouble reconciling the roles of input parser and output
! writer with the idea of modes ("readers" or "directors"). Does the
! mode govern the tranformation of the input, the output, or both?
! Perhaps the mode should be split into two.

  For example, say the source of our input is a Python module. Our
! "input mode" should be the "Python Source Reader". It discovers (from
  ``__docformat__``) that the input parser is "reStructuredText". If we
  want HTML, we'll specify the "HTML" output formatter. But there's a
***************
*** 210,266 ****
  the input mode? Or can/should they be independent?

! I envision interaction between the input parser, an "input mode"
! (would control steps 1, 2, & 3), a "transformation style" (would
! control step 4), and the output formatter. The same intermediate data
! format would be used between each of these, gaining detail as it
! progresses.

- This requires thought.

! Tony's contribution:

!     OK - my model is not dissimilar, but goes like:
!
!     1. Parse the Python module(s) [remembering we may have a package]
!        This locates the docstrings, amongst other things.
!
!     2. Trim the tree to lose stuff we didn't need (!).
!
!     3. Parse the docstrings (this might, instead, be done at the time
!        that each docstring is "discovered").
!
!     4. Integrate the docstring into the tree - this *may* be as simple
!        as having "thing.docstring = <docstring instance>"
!
!     5. Perform internal resolutions on the docstring (footnotes, etc.)
!
!     6. Perform intra-module/package resolutions on the docstring
!        (so this is when we work out that `Fred` in *this* docstring
!        refers to class Fred over here in the datastructure).
!
!     7. Format.

!     ...

!     A mode needs to:
!
!     1. Provide plugins for parsing - this *may* go so far as to
!        subsume the DPS functionality into a new program, as I'm doing
!        for Python. In this case the "plugin" for parsing may be
!        virtual - I just need to ferret around in the docstring looking
!        for things that are already there, perhaps.
!
!     2. Provide plugins for formatting - again, these may subsume a
!        DPS parser process. In the Python case, I clearly want to *use*
!        the normal HTML parser for HTML output, but with extra support
!        "around it" for the Python specific infrastructure.


  Visitors
  ========

! To nodes.py, add ``Node.walkabout()``, ``Visitor.walkabout()``,
! ``Visitor.leave_*()``, and ``GenericVisitor.default_leave()`` methods
! to catch elements on the way out? Here's ``Node.walkabout()``::

      def walkabout(self, visitor, ancestry=()):
--- 239,439 ----
  the input mode? Or can/should they be independent?

! I envision interaction between the input parser, an "input mode" , and
! the output formatter. The same intermediate data format would be used
! between each of these, being transformed as it progresses.


! Docutils Project Model
! ======================

! Here's the latest project model::

!              1,3,5                                6,8
!            +--------+                          +--------+
!            | READER | =======================> | WRITER |
!            +--------+ (purely presentational)  +--------+
!             //     \                            /      \
!            //       \                          /        \
!       2   //      4  \                     7  /       9  \
!     +--------+   +------------+   +------------+   +--------------+
!     | PARSER |...| reader     |   | writer     |...| deployment   |
!     +--------+   | transforms |   | transforms |   |              |
!                  |            |   |            |   | - one file   |
!                  | - docinfo  |   | - styling  |   | - many files |
!                  | - titles   |   | - writer-  |   | - objects in |
!                  | - linking  |   |   specific |   |   memory     |
!                  | - lookups  |   | - etc.     |   +--------------+
!                  | - reader-  |   +------------+
!                  |   specific |
!                  | - parser-  |
!                  |   specific |
!                  | - layout   |
!                  | - etc.     |
!                  +------------+

! The numbers indicate the path a document would take through the code.
! Double-width lines between reader & parser and between reader &
! writer, indicating that data sent along these paths should be standard
! (pure & unextended) DPS doc trees. Single-width lines signify that
! internal tree extensions are OK (but must be supported internally at
! both ends), and may in fact be totally unrelated to the DPS doc tree
! structure. I've added "reader-specific" and "layout" transforms to the
! list of transforms. BTW, these transforms are not necessarily all in
! one directory; it's a nebulous grouping (it's hard to draw ASCII
! clouds).
!
!
! Issues
! ------
!
! - Naming. Use "director"/"builder" instead of "reader"/"writer"? Then
!   "deployment" could be replaced by "writer".
!
! - Transforms. How to specify which transforms (and in what order)
!   apply to each combination of reader, parser/syntax, writer, and
!   deployment? Even if we restrict ourselves to one parser, there will
!   eventually be a multitude of readers, writers, and deployment
!   options.

+   Or are readers & writers independent? Then we have reader/parser and
+   writer/deployment combinations to consider.
+
+
+ Components
+ ----------
+
+ Parsers
+ ```````
+
+ Responsibilities: Given raw input text and an empty doctree, populate
+ the doctree by parsing the input text.
+
+
+ Readers
+ ```````
+
+ ("Readers" may be renamed to "Directors".)
+
+ Most Readers will have to be told what parser to use. So far (see the
+ list of examples below), only the Python Source Reader (PySource) will
+ be able to determine the syntax on its own.
+
+ Responsibilities:
+
+ - Do raw input on the source.
+ - Pass the raw text to the parser, along with a fresh doctree.
+ - Combine and collate doctrees if necessary.
+ - Run transforms over the doctree(s).
+
+ Examples:
+
+ - Standalone/Raw/Plain: Just read a text file and process it. The
+   reader needs to be told which parser to use. Parser-specific
+   readers?
+ - Python Source: See `Python Source Reader`_ above.
+ - Email: RFC-822 headers, quoted excerpts, signatures, MIME parts.
+ - PEP: RFC-822 headers, "PEP xxxx" and "RFC xxxx" conversion to
+   URIs. Either interpret PEPs' indented sections or convert existing
+   PEPs to reStructuredText.
+ - Wiki: Global reference lookups of "wiki links" incorporated into
+   transforms. (CamelCase only or unrestricted?) Lazy indentation?
+ - Web Page: As standalone, but recognize meta fields as meta tags.
+ - FAQ: Structured "question & answer(s)" constructs.
+ - Compound document: Merge chapters into a book. Master TOC file?
+
+
+ Transforms
+ ``````````
+
+ Responsibilities:
+
+ - Modify a doctree in-place.
+
+ Examples:
+
+ - Already implemented: DocInfo, DocTitle (in frontmatter.py);
+   Hyperlinks, Footnotes, Substitutions (in references.py).
+ - Unimplemented: See `Unimplemented Transforms`_ above.
+
+
+ Writers
+ ```````
+
+ ("Writers" may be renamed to "Builders".)
+
+ Responsibilities:
+
+ - Transform doctree into specific output formats.
+ - Transform references into format-native forms.
+
+ Examples:
+
+ - XML: Various forms, such as DocBook. Also, raw doctree XML.
+ - HTML
+ - TeX
+ - Plain text
+ - reStructuredText?
+
+
+ Deployment
+ ``````````
+
+ ("Deployment" may be renamed to "Writers" or "Publishers", and current
+ writer/deployment [renamed to builder/writer] components may change
+ places. After renaming, the model would look like this::
+
+              1,3,5                                6,8
+            +--------+                          +---------+  formerly
+            | READER | =======================> | BUILDER |  writer
+            +--------+ (purely presentational)  +---------+
+             //     \                            /      \
+            //       \                          /        \
+       2   //      4  \                     7  /       9  \
+     +--------+   +------------+   +------------+   +--------+
+     | PARSER |...| reader     |   | writer     |...| WRITER |
+     +--------+   | transforms |   | transforms |   +--------+
+                  | ...        |   | ...        |   formerly
+                                                    deployment
+
+ After renaming *and* rearrangement, the model would look like this::
+
+              1,3,5                                6,8
+            +--------+                          +--------+  formerly
+            | READER | =======================> | WRITER |  deployment
+            +--------+ (purely presentational)  +--------+
+             //     \                            /      \
+            //       \                          /        \
+       2   //      4  \                     7  /       9  \
+     +--------+   +------------+   +------------+   +---------+
+     | PARSER |...| reader     |   | writer     |...| BUILDER |
+     +--------+   | transforms |   | transforms |   +---------+
+                  | ...        |   | ...        |   formerly writer
+
+ We'll wait and see which arrangement works out best. Is it better for
+ the writer/builder to control the deployment/writer, or vice versa? Or
+ should they be equals?
+
+ Looking at the list of writers, it seems that only HTML would require
+ anything other than monolithic output. Perhaps merge the "deployment"
+ into the "writer"?)
+
+ Responsibilities:
+
+ - Do raw output to the destination.
+ - Transform references per incarnation.
+
+ Examples:

+ - Single file
+ - Multiple files & directories
+ - Objects in memory
+
+
  Visitors
  ========

! To nodes.py, add ``Node.walkabout()``, ``Visitor.leave_*()``, and
! ``GenericVisitor.default_leave()`` methods to catch elements on the
! way out? Here's ``Node.walkabout()``::

      def walkabout(self, visitor, ancestry=()):
***************
*** 286,294 ****
          method(self, ancestry)

- Here's ``Visitor.walkabout()``::
-
-     def walkabout(self):
-         self.doctree.walkabout(self)
-

  Mixing Automatic and Manual Footnote Numbering
--- 459,462 ----
***************
*** 301,307 ****
  numbering; it would cause numbering and referencing conflicts.

! Would such mixing inevitably cause conflicts? We could probably work around
! potential conflicts with a decent algorithm. Should we? Requires thought.
! Opinions?

  [Tony]
--- 469,475 ----
  numbering; it would cause numbering and referencing conflicts.

! Would such mixing inevitably cause conflicts? We could probably work
! around potential conflicts with a decent algorithm. Should we?
! Requires thought. Opinions?

  [Tony]
***************
*** 309,314 ****
  was in the category of "don't, in practice, care" so far as I was
  concerned. This is the same category I put the forbidding of nested
! inline markup - quite clearly one *can* do it, but equally clearly it's
! a pain to implement, and not a terribly great gain, all things
  considered.

--- 477,482 ----
  was in the category of "don't, in practice, care" so far as I was
  concerned. This is the same category I put the forbidding of nested
! inline markup - quite clearly one *can* do it, but equally clearly
! it's a pain to implement, and not a terribly great gain, all things
  considered.

***************
*** 316,329 ****
  had some experience of people *using* reST in the wild".

! Thus, given there are lots of other things to do, I would tend to leave
! it as-is (especially if you are able to *warn* people about it if they
! do it by mistake).

  To my mind, being able to do ``[#thing]_`` probably give people enough
! precision over footnotes whils still allowing autonumbering - the *only*
! potential problem is when referring to a footnote in a different
! document (and that, again, is something I would leave fallow for the
! moment, although we know I tend to want to use roles as annotation for
! that sort of thing).


--- 484,497 ----
  had some experience of people *using* reST in the wild".

! Thus, given there are lots of other things to do, I would tend to
! leave it as-is (especially if you are able to *warn* people about it
! if they do it by mistake).

  To my mind, being able to do ``[#thing]_`` probably give people enough
! precision over footnotes whils still allowing autonumbering - the
! *only* potential problem is when referring to a footnote in a
! different document (and that, again, is something I would leave fallow
! for the moment, although we know I tend to want to use roles as
! annotation for that sort of thing).

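
For readers following the project model in the notes above, here is a rough, self-contained Python sketch of the numbered path (reader, parser, reader transforms, writer) together with a walkabout-style traversal that calls a method on the way into and on the way out of each node. Every name in it (``Node``, ``parse``, ``doc_title``, ``HTMLWriter``, ``read``, ``publish``) is hypothetical and chosen for illustration only; none of it is the DPS or Docutils API, the ``ancestry`` argument from the notes is left out, and the deployment step is omitted.

```python
"""Toy model of the reader -> parser -> transforms -> writer path.

All names are hypothetical illustrations, not the DPS/Docutils API.
"""


class Node:
    """Minimal doctree node: a tag name, optional text, and children."""

    def __init__(self, tag, text="", children=None):
        self.tag = tag
        self.text = text
        self.children = list(children or [])

    def walkabout(self, visitor):
        """Call visit_<tag> on the way in and leave_<tag> on the way out."""
        getattr(visitor, "visit_" + self.tag, visitor.default_visit)(self)
        for child in self.children:
            child.walkabout(visitor)
        getattr(visitor, "leave_" + self.tag, visitor.default_leave)(self)


def parse(text, doctree):
    """Parser: populate an empty doctree from raw input text."""
    for block in text.split("\n\n"):
        block = " ".join(block.split())  # normalise internal whitespace
        if block:
            doctree.children.append(Node("paragraph", block))


def doc_title(doctree):
    """Transform: modify the doctree in place (first paragraph -> title)."""
    if doctree.children and doctree.children[0].tag == "paragraph":
        doctree.children[0].tag = "title"


class HTMLWriter:
    """Writer: turn the doctree into one output format via a visitor.

    No HTML escaping is done; this is only a sketch.
    """

    tags = {"document": "body", "title": "h1", "paragraph": "p"}

    def __init__(self):
        self.out = []

    def default_visit(self, node):
        # Opening tag and text are emitted on the way in.
        self.out.append("<%s>%s" % (self.tags[node.tag], node.text))

    def default_leave(self, node):
        # The closing tag is emitted on the way out.
        self.out.append("</%s>" % self.tags[node.tag])

    def write(self, doctree):
        doctree.walkabout(self)
        return "".join(self.out)


def read(text, parser=parse, transforms=(doc_title,)):
    """Reader: pass raw text and a fresh doctree to the parser, then
    run the transforms over the result."""
    doctree = Node("document")
    parser(text, doctree)
    for transform in transforms:
        transform(doctree)
    return doctree


def publish(text):
    """Run the whole path: reader (with parser and transforms), then writer."""
    return HTMLWriter().write(read(text))
```

Calling ``publish("Title\n\nFirst para.")`` walks the tree once; the closing tags come from ``default_leave``, which is the kind of "on the way out" hook the Visitors section asks about. Passing the parser and the transform list into ``read`` is one possible answer to the "Issues" question of who selects them, not a recommendation.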