[Docstring-checkins] CVS: dps/spec dps-notes.txt,1.22,1.23

Update of /cvsroot/docstring/dps/spec
In directory usw-pr-cvs1:/tmp/cvs-serv12722/dps/spec

Modified Files:
	dps-notes.txt 
Log Message:
updated

Index: dps-notes.txt
===================================================================
RCS file: /cvsroot/docstring/dps/spec/dps-notes.txt,v
retrieving revision 1.22
retrieving revision 1.23
diff -C2 -d -r1.22 -r1.23
*** dps-notes.txt	2002/01/30 04:56:54	1.22
--- dps-notes.txt	2002/02/06 03:11:41	1.23
***************
*** 47,51 ****
    Doc-SIG post 'Suggestions for reST "modes"' as a base.

! - Write modules for common transforms. See Transforms_ below.

  - Ask Python-dev for opinions (GvR for a pronouncement) on special
--- 47,52 ----
    Doc-SIG post 'Suggestions for reST "modes"' as a base.

! - Write modules for common transforms. See `Unimplemented Transforms`_
!   below.

  - Ask Python-dev for opinions (GvR for a pronouncement) on special
***************
*** 61,64 ****
--- 62,68 ----
    - Apply the `coding conventions`_ as given below.

+ - Merge reStructuredText into DPS and rename it to "docutils".
+   (SourceForge project registered & waiting.)
+ 

  Coding Conventions
***************
*** 86,91 ****

! Transforms
! ==========

  Footnote Gathering
--- 90,95 ----

! Unimplemented Transforms
! ========================

  Footnote Gathering
***************
*** 111,115 ****

  - duplicate reference and/or substitution names that need to be made
!   unique; and
  - duplicate footnote numbers that need to be renumbered.

--- 115,119 ----

  - duplicate reference and/or substitution names that need to be made
!   unique; and/or
  - duplicate footnote numbers that need to be renumbered.

***************
*** 119,122 ****
--- 123,143 ----

+ Document Splitting
+ ------------------
+ 
+ If the processed document is written to multiple files (possibly in a
+ directory tree), it will need to be split up. References will have to
+ be adjusted.
+ 
+ (HTML only? See Deployment_ below.)
+ 
+ 
+ Navigation
+ ----------
+ 
+ If a document is split up, each segment will need navigation links:
+ parent, children (small TOC), previous (preorder), next (preorder).
+ 
+ 
  Table of Contents
  -----------------
***************
*** 124,130 ****
  This runs over the entire tree, and locates <section> elements. It
  produces a <contents> subtree, which can be inserted at the
! appropriate place, with links to the <section>s. It needs to make sure
! that the links it uses are *real*, so ideally it will use the
! "implicit" link for a section when it exists, and it will have to
  invent one when the implicit link isn't there (presumably because the
  section is the twelfth "Introduction" in the document...).
--- 145,151 ----
  This runs over the entire tree, and locates <section> elements. It
  produces a <contents> subtree, which can be inserted at the
! appropriate place, with links to the <section> elements. It needs to
! make sure that the links it uses are *real*, so ideally it will use
! the "implicit" link for a section when it exists, and it will have to
  invent one when the implicit link isn't there (presumably because the
  section is the twelfth "Introduction" in the document...).
***************
*** 171,205 ****

! Modes and Styles
! ================

! The Python docstring mode model that's evolving in my mind goes
! something like this:

! 1. Extract the docstring/namespace tree from the module(s) and/or
     package(s).

  2. Run the parser on each docstring in turn, producing a forest of
     trees (internal data structure as per nodes.py).

! 3. Run various transformations on the individual docstring trees.
!    Examples: resolving cross-references; resolving hyperlinks;
!    footnote auto-numbering; first field list -> bibliographic
!    elements.

! 4. Join the docstring trees together into a single tree, running more
!    transformations (such as creating various sections like "Module
!    Attributes", "Functions", "Classes", "Class Attributes", etc.; see
!    the DPS spec/ppdi.dtd).

! 5. Pass the resulting unified tree to the output formatter.

  I've had trouble reconciling the roles of input parser and output
! formatter with the idea of "modes". Does the mode govern the
! transformation of the input, the output, or both? Perhaps the mode
! should be split into two.

  For example, say the source of our input is a Python module. Our
! "input mode" should be "Python Docstring Mode". It discovers (from
  ``__docformat__``) that the input parser is "reStructuredText". If we
  want HTML, we'll specify the "HTML" output formatter. But there's a
--- 192,234 ----

! Python Source Reader
! ====================

! The Python Source Reader ("PySource") model that's evolving in my mind
! goes something like this:

! 1. Extract the docstring/namespace [#]_ tree from the module(s) and/or
     package(s).

+    .. [#] See `Docstring Extractor`_ above.
+ 
  2. Run the parser on each docstring in turn, producing a forest of
     trees (internal data structure as per nodes.py).

! 3. Join the docstring trees together into a single tree, running
!    transforms:

!    - merge hyperlinks
!    - merge namespaces
!    - create various sections like "Module Attributes", "Functions",
!      "Classes", "Class Attributes", etc.; see the DPS spec/ppdi.dtd
!    - convert the above special sections to ordinary DPS nodes

! 4. Run transforms on the combined doctree.  Examples: resolving
!    cross-references/hyperlinks (including interpreted text on Python
!    identifiers); footnote auto-numbering; first field list ->
!    bibliographic elements.

+    (Or should step 4's transforms come before step 3?)
+ 
+ 5. Pass the resulting unified tree to the writer/builder.
+ 
  I've had trouble reconciling the roles of input parser and output
! writer with the idea of modes ("readers" or "directors"). Does the
! mode govern the transformation of the input, the output, or both?
! Perhaps the mode should be split into two.

  For example, say the source of our input is a Python module. Our
! "input mode" should be the "Python Source Reader". It discovers (from
  ``__docformat__``) that the input parser is "reStructuredText". If we
  want HTML, we'll specify the "HTML" output formatter. But there's a
***************
*** 210,266 ****
  the input mode? Or can/should they be independent?

! I envision interaction between the input parser, an "input mode"
! (would control steps 1, 2, & 3), a "transformation style" (would
! control step 4), and the output formatter. The same intermediate data
! format would be used between each of these, gaining detail as it
! progresses.

- This requires thought.

! Tony's contribution:

!     OK - my model is not dissimilar, but goes like:
!     
!     1. Parse the Python module(s) [remembering we may have a package]
!        This locates the docstrings, amongst other things.
!     
!     2. Trim the tree to lose stuff we didn't need (!).
!     
!     3. Parse the docstrings (this might, instead, be done at the time
!        that each docstring is "discovered").
!     
!     4. Integrate the docstring into the tree - this *may* be as simple
!        as having "thing.docstring = <docstring instance>"
!     
!     5. Perform internal resolutions on the docstring (footnotes, etc.)
!     
!     6. Perform intra-module/package resolutions on the docstring
!        (so this is when we work out that `Fred` in *this* docstring
!        refers to class Fred over here in the datastructure).
!     
!     7. Format.

!     ...

!     A mode needs to:
!     
!     1. Provide plugins for parsing - this *may* go so far as to
!        subsume the DPS functionality into a new program, as I'm doing
!        for Python. In this case the "plugin" for parsing may be
!        virtual - I just need to ferret around in the docstring looking
!        for things that are already there, perhaps.
!     
!     2. Provide plugins for formatting - again, these may subsume a
!        DPS parser process. In the Python case, I clearly want to *use*
!        the normal HTML parser for HTML output, but with extra support
!        "around it" for the Python specific infrastructure.

  Visitors
  ========

! To nodes.py, add ``Node.walkabout()``, ``Visitor.walkabout()``,
! ``Visitor.leave_*()``, and ``GenericVisitor.default_leave()`` methods
! to catch elements on the way out? Here's ``Node.walkabout()``::

      def walkabout(self, visitor, ancestry=()):
--- 239,439 ----
  the input mode? Or can/should they be independent?

! I envision interaction between the input parser, an "input mode", and
! the output formatter. The same intermediate data format would be used
! between each of these, being transformed as it progresses.

! Docutils Project Model
! ======================

! Here's the latest project model::

!            1,3,5                               6,8
!            +--------+                          +--------+
!            | READER | =======================> | WRITER |
!            +--------+ (purely presentational)  +--------+
!             //    \                              /    \
!            //      \                            /      \
!     2     //     4  \               7          /     9  \
!     +--------+   +------------+     +------------+   +--------------+
!     | PARSER |...| reader     |     | writer     |...| deployment   |
!     +--------+   | transforms |     | transforms |   |              |
!                  |            |     |            |   | - one file   |
!                  | - docinfo  |     | - styling  |   | - many files |
!                  | - titles   |     | - writer-  |   | - objects in |
!                  | - linking  |     |   specific |   |   memory     |
!                  | - lookups  |     | - etc.     |   +--------------+
!                  | - reader-  |     +------------+
!                  |   specific |
!                  | - parser-  |
!                  |   specific |
!                  | - layout   |
!                  | - etc.     |
!                  +------------+

! The numbers indicate the path a document would take through the code.
! Double-width lines between reader & parser and between reader &
! writer indicate that data sent along these paths should be standard
! (pure & unextended) DPS doc trees. Single-width lines signify that
! internal tree extensions are OK (but must be supported internally at
! both ends), and may in fact be totally unrelated to the DPS doc tree
! structure. I've added "reader-specific" and "layout" transforms to the
! list of transforms. BTW, these transforms are not necessarily all in
! one directory; it's a nebulous grouping (it's hard to draw ASCII
! clouds).
! 
! 
! Issues
! ------
! 
! - Naming. Use "director"/"builder" instead of "reader"/"writer"? Then
!   "deployment" could be replaced by "writer".
! 
! - Transforms. How to specify which transforms (and in what order)
!   apply to each combination of reader, parser/syntax, writer, and
!   deployment?  Even if we restrict ourselves to one parser, there will
!   eventually be a multitude of readers, writers, and deployment
!   options.

+   Or are readers & writers independent? Then we have reader/parser and
+   writer/deployment combinations to consider.
+ 
+ 
+ Components
+ ----------
+ 
+ Parsers
+ ```````
+ 
+ Responsibilities: Given raw input text and an empty doctree, populate
+ the doctree by parsing the input text.
+ 
+ 
+ Readers
+ ```````
+ 
+ ("Readers" may be renamed to "Directors".)
+ 
+ Most Readers will have to be told what parser to use. So far (see the
+ list of examples below), only the Python Source Reader (PySource) will
+ be able to determine the syntax on its own.
+ 
+ Responsibilities:
+ 
+ - Do raw input on the source.
+ - Pass the raw text to the parser, along with a fresh doctree.
+ - Combine and collate doctrees if necessary.
+ - Run transforms over the doctree(s).
+ 
+ Examples:
+ 
+ - Standalone/Raw/Plain: Just read a text file and process it. The
+   reader needs to be told which parser to use. Parser-specific
+   readers?
+ - Python Source: See `Python Source Reader`_ above.
+ - Email: RFC-822 headers, quoted excerpts, signatures, MIME parts.
+ - PEP: RFC-822 headers, "PEP xxxx" and "RFC xxxx" conversion to
+   URIs. Either interpret PEPs' indented sections or convert existing
+   PEPs to reStructuredText.
+ - Wiki: Global reference lookups of "wiki links" incorporated into
+   transforms. (CamelCase only or unrestricted?) Lazy indentation?
+ - Web Page: As standalone, but recognize meta fields as meta tags.
+ - FAQ: Structured "question & answer(s)" constructs.
+ - Compound document: Merge chapters into a book. Master TOC file?
+ 
+ 
+ Transforms
+ ``````````
+ 
+ Responsibilities:
+ 
+ - Modify a doctree in-place.
+ 
+ Examples:
+ 
+ - Already implemented: DocInfo, DocTitle (in frontmatter.py);
+   Hyperlinks, Footnotes, Substitutions (in references.py).
+ - Unimplemented: See `Unimplemented Transforms`_ above.
+ 
+ 
+ Writers
+ ```````
+ 
+ ("Writers" may be renamed to "Builders".)
+ 
+ Responsibilities:
+ 
+ - Transform doctree into specific output formats.
+ - Transform references into format-native forms.
+ 
+ Examples:
+ 
+ - XML: Various forms, such as DocBook. Also, raw doctree XML.
+ - HTML
+ - TeX
+ - Plain text
+ - reStructuredText?
+ 
+ 
+ Deployment
+ ``````````
+ 
+ ("Deployment" may be renamed to "Writers" or "Publishers", and current
+ writer/deployment [renamed to builder/writer] components may change
+ places. After renaming, the model would look like this::
+ 
+            1,3,5                               6,8
+            +--------+                          +---------+ formerly
+            | READER | =======================> | BUILDER | writer
+            +--------+ (purely presentational)  +---------+
+             //    \                              /    \
+            //      \                            /      \
+     2     //     4  \               7          /     9  \
+     +--------+   +------------+     +------------+   +--------+
+     | PARSER |...| reader     |     | writer     |...| WRITER |
+     +--------+   | transforms |     | transforms |   +--------+
+                  | ...        |     | ...        |    formerly
+                                                       deployment
+ 
+ After renaming *and* rearrangement, the model would look like this::
+ 
+            1,3,5                               6,8
+            +--------+                          +--------+ formerly
+            | READER | =======================> | WRITER | deployment
+            +--------+ (purely presentational)  +--------+
+             //    \                              /    \
+            //      \                            /      \
+     2     //     4  \               7          /     9  \
+     +--------+   +------------+     +------------+   +---------+
+     | PARSER |...| reader     |     | writer     |...| BUILDER |
+     +--------+   | transforms |     | transforms |   +---------+
+                  | ...        |     | ...        |    formerly writer
+ 
+ We'll wait and see which arrangement works out best. Is it better for
+ the writer/builder to control the deployment/writer, or vice versa? Or
+ should they be equals?
+ 
+ Looking at the list of writers, it seems that only HTML would require
+ anything other than monolithic output. Perhaps merge the "deployment"
+ into the "writer"?)
+ 
+ Responsibilities:
+ 
+ - Do raw output to the destination.
+ - Transform references per incarnation.
+ 
+ Examples:

+ - Single file
+ - Multiple files & directories
+ - Objects in memory
+ 
+ 
  Visitors
  ========

! To nodes.py, add ``Node.walkabout()``, ``Visitor.leave_*()``, and
! ``GenericVisitor.default_leave()`` methods to catch elements on the
! way out? Here's ``Node.walkabout()``::

      def walkabout(self, visitor, ancestry=()):
***************
*** 286,294 ****
          method(self, ancestry)

- Here's ``Visitor.walkabout()``::
- 
-     def walkabout(self):
-         self.doctree.walkabout(self)
- 

  Mixing Automatic and Manual Footnote Numbering
--- 459,462 ----
***************
*** 301,307 ****
      numbering; it would cause numbering and referencing conflicts.

! Would such mixing inevitably cause conflicts? We could probably work around
! potential conflicts with a decent algorithm. Should we? Requires thought.
! Opinions?

  [Tony]
--- 469,475 ----
      numbering; it would cause numbering and referencing conflicts.

! Would such mixing inevitably cause conflicts? We could probably work
! around potential conflicts with a decent algorithm. Should we?
! Requires thought.  Opinions?

  [Tony]
***************
*** 309,314 ****
  was in the category of "don't, in practice, care" so far as I was
  concerned. This is the same category I put the forbidding of nested
! inline markup - quite clearly one *can* do it, but equally clearly it's
! a pain to implement, and not a terribly great gain, all things
  considered.

--- 477,482 ----
  was in the category of "don't, in practice, care" so far as I was
  concerned. This is the same category I put the forbidding of nested
! inline markup - quite clearly one *can* do it, but equally clearly
! it's a pain to implement, and not a terribly great gain, all things
  considered.

***************
*** 316,329 ****
  had some experience of people *using* reST in the wild".

! Thus, given there are lots of other things to do, I would tend to leave
! it as-is (especially if you are able to *warn* people about it if they
! do it by mistake).

  To my mind, being able to do ``[#thing]_`` probably gives people enough
! precision over footnotes whilst still allowing autonumbering - the *only*
! potential problem is when referring to a footnote in a different
! document (and that, again, is something I would leave fallow for the
! moment, although we know I tend to want to use roles as annotation for
! that sort of thing).

--- 484,497 ----
  had some experience of people *using* reST in the wild".

! Thus, given there are lots of other things to do, I would tend to
! leave it as-is (especially if you are able to *warn* people about it
! if they do it by mistake).

  To my mind, being able to do ``[#thing]_`` probably gives people enough
! precision over footnotes whilst still allowing autonumbering - the
! *only* potential problem is when referring to a footnote in a
! different document (and that, again, is something I would leave fallow
! for the moment, although we know I tend to want to use roles as
! annotation for that sort of thing).
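
Editorial sketch (not part of this check-in): the component
responsibilities listed under "Docutils Project Model" in the diff above
(parser, reader, transforms, writer, deployment) could be wired together
roughly as follows. Every class and function name here is hypothetical
and chosen for illustration only; none of them is taken from the DPS or
reStructuredText code, and the "parser" is a toy that splits on blank
lines rather than a real reStructuredText parser.

```python
import html


class Doctree:
    """Stand-in for a DPS doc tree: just a list of paragraph strings."""

    def __init__(self):
        self.children = []


class BlankLineParser:
    """Parser: given raw text and an empty doctree, populate the doctree."""

    def parse(self, text, doctree):
        for block in text.split("\n\n"):
            block = " ".join(block.split())  # normalize whitespace
            if block:
                doctree.children.append(block)


def strip_comments(doctree):
    """Transform: modify the doctree in place (drop '..' comment blocks)."""
    doctree.children[:] = [c for c in doctree.children
                           if not c.startswith("..")]


class StandaloneReader:
    """Reader: pass raw text and a fresh doctree to the parser,
    then run the reader transforms over the result."""

    def __init__(self, parser, transforms):
        self.parser = parser
        self.transforms = transforms

    def read(self, source_text):
        doctree = Doctree()
        self.parser.parse(source_text, doctree)
        for transform in self.transforms:
            transform(doctree)
        return doctree


class HTMLWriter:
    """Writer: transform the doctree into one specific output format."""

    def write(self, doctree):
        return "\n".join("<p>%s</p>" % html.escape(c)
                         for c in doctree.children)


class InMemoryDeployment:
    """Deployment: do the raw output; here, keep it as an object in memory."""

    def __init__(self):
        self.output = None

    def deploy(self, rendered):
        self.output = rendered


def publish(source_text, reader, writer, deployment):
    """Follow the numbered path in the model: reader, writer, deployment."""
    doctree = reader.read(source_text)
    deployment.deploy(writer.write(doctree))
    return deployment
```

In this arrangement the writer controls nothing about deployment; the
two are equals driven by ``publish()``. That is only one of the three
options the notes leave open (writer controls deployment, deployment
controls writer, or equals).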