[Docstring-checkins] CVS: dps/spec pep-0258.txt,1.6,1.7
From: David G. <go...@us...> - 2002-03-28 04:36:56
Update of /cvsroot/docstring/dps/spec In directory usw-pr-cvs1:/tmp/cvs-serv14008/dps/spec Modified Files: pep-0258.txt Log Message: Overhauled; changed title. Index: pep-0258.txt =================================================================== RCS file: /cvsroot/docstring/dps/spec/pep-0258.txt,v retrieving revision 1.6 retrieving revision 1.7 diff -C2 -d -r1.6 -r1.7 *** pep-0258.txt 15 Feb 2002 22:53:17 -0000 1.6 --- pep-0258.txt 28 Mar 2002 04:36:53 -0000 1.7 *************** *** 1,7 **** PEP: 258 ! Title: DPS Generic Implementation Details Version: $Revision$ Last-Modified: $Date$ ! Author: dgo...@bi... (David Goodger) Discussions-To: do...@py... Status: Draft --- 1,7 ---- PEP: 258 ! Title: Docutils Design Specification Version: $Revision$ Last-Modified: $Date$ ! Author: go...@us... (David Goodger) Discussions-To: do...@py... Status: Draft *************** *** 14,23 **** Abstract ! This PEP documents generic implementation details for a Python ! Docstring Processing System (DPS). The rationale and high-level ! concepts of the DPS are documented in PEP 256, "Docstring ! Processing System Framework" [1]. No changes to the core Python language are required by this PEP. --- 14,27 ---- Abstract ! This PEP documents design issues and implementation details for ! Docutils, a Python Docstring Processing System (DPS). The ! rationale and high-level concepts of a DPS are documented in PEP ! 256, "Docstring Processing System Framework" [1]. No changes to the core Python language are required by this PEP. + Its deliverables consist of a package for the standard library and + its documentation. + + @@@ s/dps/Docutils/ from here down *************** *** 29,39 **** 1. What to examine: ! a) If the '__all__' variable is present in the module being ! documented, only identifiers listed in '__all__' are examined for docstrings. ! b) In the absense of '__all__', all identifiers are examined, ! except those whose names are private (names begin with '_' ! 
but don't begin and end with '__'). c) 1a and 1b can be overridden by a parameter or command-line --- 33,43 ---- 1. What to examine: ! a) If the "__all__" variable is present in the module being ! documented, only identifiers listed in "__all__" are examined for docstrings. ! b) In the absense of "__all__", all identifiers are examined, ! except those whose names are private (names begin with "_" ! but don't begin and end with "__"). c) 1a and 1b can be overridden by a parameter or command-line *************** *** 58,75 **** concatenated. See "Additional Docstrings" below. 3. How: ! Whenever possible, Python modules should be parsed by the ! docstring processing system, not imported. There are security ! reasons for not importing untrusted code. Also, docstrings are ! to be recognized in places where the bytecode compiler ignores ! string literal expressions (2b and 2c above), meaning importing ! the module will lose these docstrings. Of course, standard ! Python parsing tools such as the 'parser' library module may ! be used. When the Python source code for a module is not available (i.e. only the .pyc file exists) or for C extension modules, to ! access docstrings the module must be imported. Since attribute docstrings and additional docstrings are ignored --- 62,83 ---- concatenated. See "Additional Docstrings" below. + d) @@@ 2.2-style "properties" with attribute docstrings? + 3. How: ! Whenever possible, Python modules should be parsed by Docutils, ! not imported. There are security reasons for not importing ! untrusted code. Information from the source is lost when using ! introspection to examine an imported module, such as comments ! and the order of definitions. Also, docstrings are to be ! recognized in places where the bytecode compiler ignores string ! literal expressions (2b and 2c above), meaning importing the ! module will lose these docstrings. Of course, standard Python ! parsing tools such as the "parser" library module may be used. 
When the Python source code for a module is not available (i.e. only the .pyc file exists) or for C extension modules, to ! access docstrings the module can only be imported, and any ! limitations must be lived with. Since attribute docstrings and additional docstrings are ignored *************** *** 79,82 **** --- 87,91 ---- module may take a slight performance hit. + Attribute Docstrings -------------------- *************** *** 98,102 **** b) At the top level of a class definition: a class attribute. ! c) At the top level of the '__init__' method definition of a class: an instance attribute. --- 107,111 ---- b) At the top level of a class definition: a class attribute. ! c) At the top level of the "__init__" method definition of a class: an instance attribute. *************** *** 116,121 **** b) For context 1c above, the target must be of the form ! 'self.attrib', where 'self' matches the '__init__' method's ! first parameter (the instance parameter) and 'attrib' is a simple indentifier as in 3a. --- 125,130 ---- b) For context 1c above, the target must be of the form ! "self.attrib", where "self" matches the "__init__" method's ! first parameter (the instance parameter) and "attrib" is a simple indentifier as in 3a. *************** *** 125,129 **** Examples:: ! g = 'module attribute (global variable)' """This is g's docstring.""" --- 134,138 ---- Examples:: ! g = 'module attribute (module-global variable)' """This is g's docstring.""" *************** *** 137,147 **** """This is self.i's docstring.""" Additional Docstrings --------------------- Many programmers would like to make extensive use of docstrings for API documentation. However, docstrings do take up space in the running program, so some of these programmers are reluctant to ! 'bloat up' their code. Also, not all API documentation is applicable to interactive environments, where __doc__ would be displayed. 
--- 146,160 ---- """This is self.i's docstring.""" + Additional Docstrings --------------------- + (This idea was adapted from PEP 216, Docstring Format [3], by + Moshe Zadka.) + Many programmers would like to make extensive use of docstrings for API documentation. However, docstrings do take up space in the running program, so some of these programmers are reluctant to ! "bloat up" their code. Also, not all API documentation is applicable to interactive environments, where __doc__ would be displayed. *************** *** 166,170 **** pass ! Issue: This breaks 'from __future__ import' statements in Python 2.1 for multiple module docstrings. The Python Reference Manual specifies: --- 179,183 ---- pass ! Issue: This breaks "from __future__ import" statements in Python 2.1 for multiple module docstrings. The Python Reference Manual specifies: *************** *** 173,180 **** only lines that can appear before a future statement are: ! * the module docstring (if any), ! * comments, ! * blank lines, and ! * other future statements. Resolution? --- 186,193 ---- only lines that can appear before a future statement are: ! * the module docstring (if any), ! * comments, ! * blank lines, and ! * other future statements. Resolution? *************** *** 187,193 **** 3. Or should we not even worry about this? There shouldn't be ! __future__ statements in production code, after all. Modules ! with __future__ statements will have to put up with the ! single-docstring limitation. Choice of Docstring Format --- 200,207 ---- 3. Or should we not even worry about this? There shouldn't be ! __future__ statements in production code, after all. Will ! modules with __future__ statements simply have to put up with ! the single-docstring limitation? ! Choice of Docstring Format *************** *** 203,208 **** format being used, a case-insensitive string matching the input parser's module or package name (i.e., the same name as required ! 
to 'import' the module or package), or a registered alias. If no ! __docformat__ is specified, the default format is 'plaintext' for now; this may be changed to the standard format once determined. --- 217,222 ---- format being used, a case-insensitive string matching the input parser's module or package name (i.e., the same name as required ! to "import" the module or package), or a registered alias. If no ! __docformat__ is specified, the default format is "plaintext" for now; this may be changed to the standard format once determined. *************** *** 214,347 **** exists; RFC 1766 is currently being revised to allow 3-letter codes). If no language identifier is specified, the default is ! 'en' for English. The language identifier is passed to the parser and can be used for language-dependent markup features. - DPS Structure - ============= ! - package 'dps' ! - function 'dps.main()' (in 'dps/__init__.py') ! - package 'dps.parsers' ! - module 'dps.parsers.model'; see 'Input Parser API' below. - - package 'dps.formatters' ! - module 'dps.formatters.model'; see 'Output Formatter API' ! below. ! - package 'dps.languages' - - module 'dps.languages.en' (English) ! - others to be added ! - utility modules: 'dps.nodes', 'dps.statemachine', 'dps.utils' ! Command-Line Interface ! ====================== ! XXX To be determined. ! System Python API ! ================= ! XXX To be determined. ! Input Parser API ! ================ ! Each input parser is a module or package exporting a 'Parser' ! class, with the following interface: ! class Parser: ! def __init__(self, inputstring, warninglevel=1, ! errorlevel=3, language='en'): ! """Initialize the Parser instance.""" ! def parse(self): ! """Parse the input string and return a tree.""" ! XXX This needs a lot of work. What is required for this API? ! A model 'Parser' class implementing the full interface along with ! utility functions can be found in the 'dps.parsers.model' module. ! Output Formatter API ! 
==================== ! Each output formatter is a module or package exporting a ! 'Formatter' class, with the following interface: ! class Formatter: ! def __init__(self, domtree, language='en', ! showwarnings=0): ! """Initialize the Formatter instance.""" - def format(self): - """Return a formatted string from the DOM tree.""" ! XXX This also needs a lot of work. What is required for this API? ! XXX How to handle unimplemented elements? ! A model 'Formatter' class implementing the full interface along ! with utility functions can be found in the 'dps.formatters.model' ! module. ! Language Module API ! =================== ! Language modules will contain language-dependent strings and ! mappings. They will be named for their language identifier (as ! defined in 'Choice of Docstring Format' above), converting dashes ! to underscores. - XXX Specifics to be determined. ! Intermediate Data Structure ! =========================== ! A single intermediate data structure is used by the docstring ! processing system, in the interfaces between parsers, the DPS ! itself, and formatters. It is not required that this data ! structure be used internally by any of the componentes. This data ! structure is similar to a DOM tree whose schema is documented in ! an XML DTD (eXtensible Markup Language Document Type Definition), ! which comes in three parts: ! - the Python Plaintext Document Interface DTD, ppdi.dtd [6], ! - the Generic Plaintext Document Interface DTD, gpdi.dtd [7], ! - and the OASIS Exchange Table Model, soextbl.dtd [8]. ! The DTD defines a rich set of elements, suitable for any input ! syntax or output format. The input parser and the output ! formatter share the same intermediate data structure. The ! processing system may do transformations on the data from the ! input parser before passing it on to the output formatter. The ! DTD retains all information necessary to reconstruct the original ! input text, or a reasonable facsimile thereof. ! 
XXX Specifics (about the DOM tree) to be determined. ! Output Management ! ================= ! XXX To be determined. - Type of output: filesystem only, or in-memory data structure too? - File/directory naming & structure conventions. In-memory data - structure should follow filesystem naming; file/directory == - leaf/node. Use a directory hierarchy rather than long file names. - (The files generated by pythondoc used compound file names, like - 'packagename.modulename.classname.html', which were often too long - for the 38-character MacOS file name length limit. This is one of - the reasons pythondoc couldn't run on MacOS). Error Handling --- 228,578 ---- exists; RFC 1766 is currently being revised to allow 3-letter codes). If no language identifier is specified, the default is ! "en" for English. The language identifier is passed to the parser and can be used for language-dependent markup features. ! Docutils Project Model ! ====================== ! :: ! +--------------------------+ ! | Docutils: | ! | docutils.core.Publisher, | ! | docutils.core.publish() | ! +--------------------------+ ! / \ ! / \ ! 1,3,5 / \ 6,8 ! +--------+ +--------+ ! | READER | =======================> | WRITER | ! +--------+ +--------+ ! // \ / \ ! // \ / \ ! 2 // 4 \ 7 / 9 \ ! +--------+ +------------+ +------------+ +--------------+ ! | PARSER |...| reader | | writer |...| DISTRIBUTOR? | ! +--------+ | transforms | | transforms | | | ! | | | | | - one file | ! | - docinfo | | - styling | | - many files | ! | - titles | | - writer- | | - objects in | ! | - linking | | specific | | memory | ! | - lookups | | - etc. | +--------------+ ! | - reader- | +------------+ ! | specific | ! | - parser- | ! | specific | ! | - layout | ! | - etc. | ! +------------+ ! The numbers indicate the path a document would take through the ! code. Double-width lines between reader & parser and between ! reader & writer, indicating that data sent along these paths ! 
should be standard (pure & unextended) DPS doc trees. ! Single-width lines signify that internal tree extensions or ! completely unrelated representations are possible, but they must ! be supported internally at both ends. ! Publisher ! --------- ! The "dps.core" module contains a "Publisher" facade class and ! "publish" convenience function. Publisher encapsulates the ! high-level logic of a Docutils system. The Publisher.publish() ! method passes its input to its Reader, then passes the resulting ! document tree through its Writer to its destination. ! Readers ! ------- ! Readers understand the input context (where the data is coming ! from), send the whole input or discrete "chunks" to the parser, ! and provide the context to bind the chunks together back into a ! cohesive whole. Using transforms_, Readers also resolve ! references, footnote numbers, interpreted text processing, and ! anything else that requires context-sensitive computation. ! Each reader is a module or package exporting a "Reader" class with ! a "read" method. The base "Reader" class can be found in the ! dps/readers/__init__.py module. ! Most Readers will have to be told what parser to use. So far (see ! the list of examples below), only the Python Source Reader ! (PySource) will be able to determine the syntax on its own. ! Responsibilities: ! - Do raw input on the source ("Reader.scan()"). ! - Pass the raw text to the parser, along with a fresh doctree ! root ("Reader.parse()"). ! - Run transforms over the doctree(s) ("Reader.transform()"). ! Examples: ! - Standalone/Raw/Plain: Just read a text file and process it. The ! reader needs to be told which parser to use. Parser-specific ! readers? ! - Python Source: See `Python Source Reader`_ above. ! - Email: RFC-822 headers, quoted excerpts, signatures, MIME parts. ! - PEP: RFC-822 headers, "PEP xxxx" and "RFC xxxx" conversion to ! URIs. Either interpret PEPs' indented sections or convert existing ! PEPs to reStructuredText (or both?). ! 
- Wiki: Global reference lookups of "wiki links" incorporated into ! transforms. (CamelCase only or unrestricted?) Lazy indentation? ! - Web Page: As standalone, but recognize meta fields as meta tags. ! Support for templates of some sort? (After <body>, before </body>?) ! - FAQ: Structured "question & answer(s)" constructs. ! - Compound document: Merge chapters into a book. Master TOC file? ! Parsers ! ------- ! Parsers analyze their input and produce a Docutils `document ! tree`_. They don't know or care anything about the source or ! destination of the data. ! Each input parser is a module or package exporting a "Parser" ! class with a "parse" method. The base "Parser" class can be found ! in the dps/parsers/__init__.py module. ! Responsibilities: Given raw input text and a doctree root node, ! populate the doctree by parsing the input text. ! Example: The only parser implemented so far is for the ! reStructuredText markup. ! Transforms ! ---------- ! Transforms change the document tree from one form to another, add ! to the tree, or prune it. Transforms are run by Reader and Writer ! objects. Some transforms are Reader-specific, some are ! Parser-specific, and others are Writer-specific. The choice and ! order of transforms is specified in the Reader and Writer objects. ! Each transform is a class in a module in the dps/transforms ! package, a subclass of dps.tranforms.Transform. ! Responsibilities: ! - Modify a doctree in-place, either purely transforming one ! structure into another, or adding new structures based on the ! doctree and/or external data. ! Examples (in "dps.transforms"): ! - frontmatter.DocInfo: conversion of document metadata ! (bibliographic information). ! - references.Hyperlinks: resolution of hyperlinks. ! - document.Merger: combining multiple populated doctrees into one. ! ! ! Writers ! ------- ! ! Writers produce the final output (HTML, XML, TeX, etc.). Writers ! translate the internal document tree structure into the final data ! 
format, possibly running output-specific transforms_ first. ! ! Each writer is a module or package exporting a "Writer" class with ! a "write" method. The base "Writer" class can be found in the ! dps/writers/__init__.py module. ! ! Responsibilities: ! ! - Run transforms over the doctree(s). ! ! - Translate doctree(s) into specific output formats. ! ! - Transform references into format-native forms. ! ! - Write output to the destination (possibly via a "Distributor"). ! ! Examples: ! ! - XML: Various forms, such as DocBook. Also, raw doctree XML. ! ! - HTML ! ! - TeX ! ! - Plain text ! ! - reStructuredText? ! ! ! Distributors ! ------------ ! ! Distributors exist for each method of storing the results of ! processing: ! ! - In a single file on disk. ! ! - In a tree of directories and files on disk. ! ! - In a single tree-shaped data structure in memory. ! ! - Some other set of data structures in memory. ! ! @@@ Distributors are currently just an idea; they may or may not ! be practical. Issues: ! ! Is it better for the writer to control the distributor, or ! vice versa? Or should they be equals? ! ! Looking at the list of writers, it seems that only HTML would ! require anything other than monolithic output. Perhaps merge ! the HTML "distributor" into "writer" variants? ! ! Perhaps translator/writer instead of writer/distributor? ! ! Responsibilities: ! ! - Do raw output to the destination. ! ! - Transform references per incarnation (method of distribution). ! ! Examples: ! ! - Single file. ! ! - Multiple files & directories. ! ! - Objects in memory. ! ! ! DPS Package Structure ! ========================== ! ! - Package "dps". ! ! - Module "dps.core" contains facade class "Publisher" and ! convenience function "publish()". See `Publisher API`_ below. ! ! - Module "dps.nodes" contains the Docutils document tree element ! class library plus Visitor pattern base classes. See ! `Document Tree`_ below. ! ! - Module "dps.roman" contains Roman numeral conversion ! 
routines. ! ! - Module "dps.statemachine" contains a finite state machine ! specialized for regular-expression-based text filters. The ! reStructuredText parser implementation is based on this ! module. ! ! - Module "dps.urischemes" contains a mapping of known URI ! schemes ("http", "ftp", "mail", etc.). ! ! - Module "dps.utils" contains utility functions and classes, ! including a logger class ("Reporter"; see `Error Handling`_ ! below). ! ! - Package "dps.parsers": markup parsers_. ! ! - Function "get_parser_class(parsername)" returns a parser ! module by name. Class "Parser" is the base class of ! specific parsers. (dps/parsers/__init__.py) ! ! - Package "dps.parsers.restructuredtext": the reStructuredText ! parser. ! ! - Alternate markup parsers may be added. ! ! - Package "dps.readers": context-aware input readers. ! ! - Function "get_reader_class(readername)" returns a reader ! module by name or alias. Class "Reader" is the base class ! of specific readers. (dps/readers/__init__.py) ! ! - Module "dps.readers.standalone": reads independent document ! files. ! ! - Readers to be added for: Python source code (structure & ! docstrings), PEPs, email, FAQ, and perhaps Wiki and others. ! ! - Package "dps.writers": output format writers. ! ! - Function "get_writer_class(writername)" returns a writer ! module by name. Class "Writer" is the base class of ! specific writers. (dps/writers/__init__.py) ! ! - Module "dps.writers.pprint" is a simple internal document ! tree writer; it writes indented pseudo-XML. ! ! - Module "dps.writers.html" is a simple HyperText Markup ! Language document tree writer for HTML 4.01 and CSS1. ! ! @@@ Change name to html4_css1.py? Support aliases? ! ! - Writers to be added: HTML 3.2 or 4.01-loose, XML (various ! forms, such as DocBook and the raw internal doctree), TeX, ! plaintext, reStructuredText, and perhaps others. ! ! - Package "dps.transforms": tree transform classes. ! ! 
- Class "Transform" is the base class of specific transforms; ! see `Transform API`_ below. (dps/transforms/__init__.py) ! ! - Each module contains related transform classes. ! ! - Package "dps.languages": Language modules contain ! language-dependent strings and mappings. They are named for ! their language identifier (as defined in `Choice of Docstring ! Format`_ above), converting dashes to underscores. ! ! - Function "getlanguage(languagecode)", returns matching ! language module. (dps/languages/__init__.py) ! ! - Module "dps.languages.en" (English). ! ! - Other languages to be added. ! ! ! Front-End Tools ! =============== ! ! @@@ To be determined. ! ! @@@ Document tools & summarize their command-line interfaces. ! ! ! Document Tree ! ============= ! ! A single intermediate data structure is used internally by the ! DPS, in the interfaces between components; it is defined in the ! dps.nodes module. It is not required that this data structure be ! used *internally* by any of the components, just *between* ! components. This data structure is similar to a DOM tree whose ! schema is documented in an XML DTD (eXtensible Markup Language ! Document Type Definition), which comes in two parts: ! ! - the Generic Plaintext Document Interface DTD, gpdi.dtd [6], and ! ! - the OASIS Exchange Table Model, soextbl.dtd [7]. ! ! The DTD defines a rich set of elements, suitable for many input ! and output formats. The DTD retains all information necessary to ! reconstruct the original input text, or a reasonable facsimile ! thereof. Error Handling *************** *** 349,353 **** When the parser encounters an error in markup, it inserts a system ! message (DTD element 'system_message). There are five levels of system messages: --- 580,584 ---- When the parser encounters an error in markup, it inserts a system ! message (DTD element "system_message"). There are five levels of system messages: *************** *** 356,370 **** handled separately from the others. ! 
- Level-1, "INFO": a minor issue that can be ignored. There is no ! effect on the processing. Typically level-1 system messages are ! not reported. - Level-2, "WARNING": an issue that should be addressed. If ignored, there may be unpredictable problems with the output. ! - Level-3, "ERROR": an error that should be addressed. If ! ignored, the output will contain errors. ! - Level-4, "SEVERE": a severe error that must be addressed. Typically level-4 system messages are turned into exceptions which halt processing. If ignored, the output will contain --- 587,604 ---- handled separately from the others. ! - Level-1, "INFO": a minor issue that can be ignored. There is ! little or no effect on the processing. Typically level-1 system ! messages are not reported. - Level-2, "WARNING": an issue that should be addressed. If ignored, there may be unpredictable problems with the output. + Typically level-2 system messages are reported but do not halt + processing ! - Level-3, "ERROR": a major issue that should be addressed. If ! ignored, the output will contain errors. Typically level-3 ! system messages are reported but do not halt processing ! - Level-4, "SEVERE": a critical error that must be addressed. Typically level-4 system messages are turned into exceptions which halt processing. If ignored, the output will contain *************** *** 373,379 **** Although the initial message levels were devised independently, they have a strong correspondence to VMS error condition severity ! levels [9]; the names in quotes for levels 1 through 4 were borrowed from VMS. Error handling has since been influenced by ! the log4j project [10]. --- 607,613 ---- Although the initial message levels were devised independently, they have a strong correspondence to VMS error condition severity ! levels [8]; the names in quotes for levels 1 through 4 were borrowed from VMS. Error handling has since been influenced by ! the log4j project [9]. 
*************** *** 381,391 **** [1] PEP 256, Docstring Processing System Framework, Goodger ! http://www.python.org/peps/pep-0256.html [2] PEP 224, Attribute Docstrings, Lemburg ! http://www.python.org/peps/pep-0224.html ! [3] PEP 257, Docstring Conventions, Goodger, Van Rossum ! http://www.python.org/peps/pep-0257.html [4] http://www.rfc-editor.org/rfc/rfc1766.txt --- 615,625 ---- [1] PEP 256, Docstring Processing System Framework, Goodger ! http://www.python.org/peps/pep-0256.html [2] PEP 224, Attribute Docstrings, Lemburg ! http://www.python.org/peps/pep-0224.html ! [3] PEP 216, Docstring Format, Zadka ! http://www.python.org/peps/pep-0216.html [4] http://www.rfc-editor.org/rfc/rfc1766.txt *************** *** 393,408 **** [5] http://lcweb.loc.gov/standards/iso639-2/englangn.html ! [6] http://docstring.sourceforge.net/spec/ppdi.dtd ! ! [7] http://docstring.sourceforge.net/spec/gpdi.dtd ! [8] http://docstring.sourceforge.net/spec/soextblx.dtd ! [9] http://www.openvms.compaq.com:8000/73final/5841/ 5841pro_027.html#error_cond_severity ! [10] http://jakarta.apache.org/log4j/ ! [11] http://www.python.org/sigs/doc-sig/ --- 627,640 ---- [5] http://lcweb.loc.gov/standards/iso639-2/englangn.html ! [6] http://docstring.sourceforge.net/spec/gpdi.dtd ! [7] http://docstring.sourceforge.net/spec/soextblx.dtd ! [8] http://www.openvms.compaq.com:8000/73final/5841/ 5841pro_027.html#error_cond_severity ! [9] http://jakarta.apache.org/log4j/ ! [10] http://www.python.org/sigs/doc-sig/ *************** *** 421,425 **** This document borrows ideas from the archives of the Python ! Doc-SIG [11]. Thanks to all members past & present. --- 653,657 ---- This document borrows ideas from the archives of the Python ! Doc-SIG [10]. Thanks to all members past & present. |
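The "Docutils Project Model" added in this revision describes a document passing from the Publisher to a Reader, through a Parser and the Reader's transforms, and then to a Writer. The following is a minimal sketch of that flow, not the code checked into CVS. The class and method names (Publisher, publish, Reader.scan/parse/transform/read, Parser.parse, Writer.write) are taken from the revised text; the method bodies, the list used as a "doctree", and the StripBlankLines transform are assumptions made purely for illustration.

```python
# Illustrative sketch only; not the code in the dps/Docutils CVS tree.
# Class and method names come from the spec text; every body, the
# list-based "doctree", and StripBlankLines are hypothetical stand-ins.


class Parser:
    """Populate a doctree root from raw input text."""

    def parse(self, text, root):
        # Stand-in "markup": each input line becomes one node.
        root.extend(text.split("\n"))


class StripBlankLines:
    """Hypothetical transform: prune empty nodes, in place."""

    def __init__(self, doctree):
        self.doctree = doctree

    def transform(self):
        self.doctree[:] = [node for node in self.doctree if node.strip()]


class Reader:
    """Read the source, hand it to the parser, then run transforms."""

    transforms = (StripBlankLines,)

    def __init__(self, parser):
        self.parser = parser

    def scan(self, source):
        # Raw input; in this sketch the source is already a string.
        return source

    def parse(self, text):
        root = []  # fresh doctree root
        self.parser.parse(text, root)
        return root

    def transform(self, doctree):
        for transform_class in self.transforms:
            transform_class(doctree).transform()

    def read(self, source):
        doctree = self.parse(self.scan(source))
        self.transform(doctree)
        return doctree


class Writer:
    """Translate the doctree into an output format (indented text here)."""

    def write(self, doctree):
        return "\n".join("    " + node for node in doctree)


class Publisher:
    """Facade: input -> Reader -> doctree -> Writer -> output."""

    def __init__(self, reader, writer):
        self.reader = reader
        self.writer = writer

    def publish(self, source):
        return self.writer.write(self.reader.read(source))


def publish(source):
    """Convenience function wiring up one reader, parser, and writer."""
    return Publisher(Reader(Parser()), Writer()).publish(source)
```

In this sketch the Parser knows nothing about where the text came from or where it is going, which mirrors the division of responsibilities in the text; the Reader alone decides which transforms run and in what order.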