Thread: [Epydoc-devel] Epydoc URLs and external API linking
From: Daniele V. <pi...@de...> - 2007-02-19 00:05:21
Hello,

I just added to Epydoc a system for referring to external APIs. I know I should have stopped adding features, but there are a couple of points I'd rather not ignore, and implementing this module helped me understand what can be done.

The HTML pages generated by Epydoc currently suffer from a couple of issues:

- URL fragments are not case sensitive, while Python names are: this may lead to ambiguous URLs (bug #1659522)

- the XHTML "name" attribute of anchors is actually copied to the "id" attribute, and XML ids can't start with an underscore (cf. http://www.w3.org/TR/xhtml1/#C_8)

so we should decide whether to keep the 1-1 identity from Python names to URL fragments (which creates URLs that are not strictly valid) or to drop that identity and create a function to mangle the names into valid fragments.

With the provided tool, people could refer to documented objects without messing with the URL details. We may then choose to generate URLs with uglier but XML-compliant fragments.

External API references can be used both in docstrings parsed by Epydoc (to refer to objects documented in other documentation sets) and in stand-alone documents to be transformed into HTML using the provided command line wrapper. They may also be used to refer to API documentation generated by other tools such as Doxygen.

While redirect.html kind of solves the problem of long URLs, it still requires long names to access the object and doesn't solve the naming problems raised above. Furthermore, it uses Javascript to perform its job.

What I like least about the current implementation are the command line options (described below). Suggestions for a better way to specify the external API options are welcome.

The package has been tested with docutils 0.3.7 and the 2007-01-28 snapshot.

---------------------------------------------

Edward, would you like to keep the current URL fragments with their small shortcomings, or do you prefer to mangle fragments?
I'd prefer to decide this before releasing the beta package, which is about ready.

A simple naming scheme could be:

- prefix each uppercase letter and underscore with an underscore
- prefix each name with a letter (let's say N, for "name")

The mapping should be injective even under case-insensitive comparison. Examples:

    MyName   -> N_My_Name
    _my_name -> N__my__name
    __init__ -> N____init____

If you want, I'll implement it and check that it is used everywhere in the HTML writer.

---------------------------------------------

A more detailed description of the system follows, copied from the module docstring.

This module allows a Docutils_ document to refer to elements defined in external API documentation. It is possible to refer to many external APIs from the same document.

Each API documentation is assigned a new interpreted text role: using such interpreted text, a user can specify an object name inside an API documentation. The system will convert such text into a URL and generate a reference to it. For example, if the API ``db`` is defined, being a database package, then a certain method may be referred to as::

    :db:`Connection.cursor()`

To define a new API, an *index file* must be provided. This file contains a mapping from the object name to the URL part required to resolve such object.

Index file
----------

Each line in the index file describes an object.

Each line contains the fully qualified name of the object and the URL at which the documentation is located. The fields are separated by a ``<tab>`` character.

The URLs in the file are relative to the documentation root: the system can be configured to add a prefix in front of each returned URL.

Allowed names
-------------

When a name is used in an API text role, it is split over any *separator*. The separators defined are '``.``', '``::``', '``->``'. All the text from the first noise char (neither a separator nor alphanumeric or '``_``') is discarded. The same algorithm is applied when the index file is read.
First the sequence of name parts is looked for in the provided index file. If no matching name is found, a partial match against the trailing part of the names in the index is performed. If no object is found, or if the trailing part of the name may refer to many objects, a warning is issued and no reference is created.

Configuration
-------------

This module provides the class `ApiLinkReader`, a replacement for the Docutils standalone reader. This reader specifies the settings required for the API canonical roles configuration. The same command line options are exposed by Epydoc.

The script ``apirst2html.py`` is a frontend for the `ApiLinkReader` reader.

API Linking Options::

    --external-api=NAME
                        Define a new API document.  A new interpreted
                        text role NAME will be added.
    --external-api-file=NAME:FILENAME
                        Use records in FILENAME to resolve objects in
                        the API named NAME.
    --external-api-root=NAME:STRING
                        Use STRING as prefix for the URL generated from
                        the API NAME.

.. _Docutils: http://docutils.sourceforge.net/

-- 
Daniele Varrazzo - Develer S.r.l.
http://www.develer.com
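The name-splitting and lookup rules described in the docstring above can be sketched roughly as follows. This is only an illustration written from that description, not the module's actual code: the function names `split_name`, `load_index` and `resolve`, the `root` parameter, and the URLs in the usage note are all assumptions.

```python
import re

# The three separators named in the docstring.
_SEPARATORS = re.compile(r"::|->|\.")
# Longest leading run of name characters and separators; the first
# "noise" character (anything else) ends the match.
_CLEAN = re.compile(r"(?:[A-Za-z0-9_]|::|->|\.)*")


def split_name(text):
    """Split an object name into parts, discarding text from the first noise char."""
    clean = _CLEAN.match(text).group(0)
    return tuple(part for part in _SEPARATORS.split(clean) if part)


def load_index(lines):
    """Build a {name parts: url} mapping from 'name<TAB>url' lines."""
    index = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        name, url = line.split("\t", 1)
        # The same splitting is applied to names read from the index.
        index[split_name(name)] = url
    return index


def resolve(index, text, root=""):
    """Return the URL for `text`, or None if it is unknown or ambiguous."""
    parts = split_name(text)
    if not parts:
        return None
    if parts in index:
        return root + index[parts]
    # Fall back to a match against the trailing parts of the indexed names.
    hits = [url for key, url in index.items() if key[-len(parts):] == parts]
    if len(hits) == 1:
        return root + hits[0]
    return None  # the described system issues a warning in this case
```

With a hypothetical index containing `db.Connection.cursor` and two different `close` methods, `resolve(index, "Connection.cursor()")` finds the single trailing match, while `resolve(index, "close")` returns `None` because the name is ambiguous.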
From: Edward L. <ed...@gr...> - 2007-02-19 16:31:47
On Feb 18, 2007, at 7:04 PM, Daniele Varrazzo wrote:
> Edward, would you like to keep the current URL fragments with their small
> shortcomings or do you prefer to mangle fragments? I prefer to decide it
> before releasing the beta package, which is about ready.
>
> A simple naming scheme could be:
>
> - prefix each uppercase and underscore with an underscore
> - prefix each name with a letter (let's say N, for "name")

Would this be done just for anchors, or for filenames as well? I guess this depends on whether we are assuming that files are written to a case-insensitive filesystem. I don't mind changing anchors too much, but would rather not have the filenames become like N_My_Class-class.html, if we can avoid it. I guess perhaps there could be a command-line switch to control whether the name mangling applies to just the anchors or to the entire URL -- perhaps something like "--case-insensitive-filesystem" to reflect its intended usage.

Thoughts?

-Edward
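For reference, the two-rule naming scheme quoted in this message is small enough to sketch in Python. The function name `mangle_fragment` and its `prefix` parameter are made up for illustration; nothing here is taken from Epydoc's HTML writer.

```python
def mangle_fragment(name, prefix="N"):
    """Prefix each uppercase letter and underscore with '_', then prepend `prefix`."""
    out = []
    for ch in name:
        if ch == "_" or ch.isupper():
            out.append("_")  # escape marker for uppercase letters and underscores
        out.append(ch)
    return prefix + "".join(out)


print(mangle_fragment("MyName"))    # N_My_Name
print(mangle_fragment("_my_name"))  # N__my__name
print(mangle_fragment("__init__"))  # N____init____
```

Because every uppercase letter gains a leading underscore, names that differ only in case, such as `myName` and `myname`, still map to different fragments after lowercasing, which is the property the scheme is after.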
From: Paul P. <pog...@gm...> - 2007-02-19 22:24:07
Edward Loper wrote:
> On Feb 18, 2007, at 7:04 PM, Daniele Varrazzo wrote:
> > Edward, would you like to keep the current URL fragments with their small
> > shortcomings or do you prefer to mangle fragments? I prefer to decide it
> > before releasing the beta package, which is about ready.
> >
> > A simple naming scheme could be:
> >
> > - prefix each uppercase and underscore with an underscore
> > - prefix each name with a letter (let's say N, for "name")
>
> Would this be done just for anchors, or for filenames as well? I
> guess this depends on whether we are assuming that files are written
> to a case-insensitive filesystem. I don't mind changing anchors too
> much, but would rather not have the filenames become like
> N_My_Class-class.html, if we can avoid it. I guess perhaps there could
> be a command-line switch to control whether the name mangling applies to
> just the anchors or to the entire URL -- perhaps something like
> "--case-insensitive-filesystem" to reflect its intended usage.
>
> Thoughts?

Yes, please. Too much mangling is ugly. And in this case it is only for completeness' sake, since for 99% of projects it will not matter. So please make it optional (as proposed) or disable it altogether.

Paul
From: Edward L. <ed...@gr...> - 2007-02-20 16:17:14
On Feb 18, 2007, at 7:04 PM, Daniele Varrazzo wrote:
> A simple naming scheme could be:
>
> - prefix each uppercase and underscore with an underscore
> - prefix each name with a letter (let's say N, for "name")
>
> The mapping should be injective even considering non-case-sensitive
> comparisons. Examples:
>
> MyName -> N_My_Name
> _my_name -> N__my__name
> __init__ -> N____init____

I'd say to go ahead with this, for anchors only, but let's try to keep the mangled names as readable as possible. First, I don't think adding a prefix is necessary, because (1) we don't generate many other anchors anyway, and (2) when we do generate them, they all currently contain "-". We could continue that with any new anchors, thereby avoiding any conflicts.

Second, I don't like using underscore for escaping, because the result looks like it could be a different identifier. If we're going to have a character to designate upcase, why not have it be something that doesn't occur in normal names, so it's clear that it's not part of the name? Here's what we're allowed to use in xhtml for an id:

    Id       ::= (Letter | '_' | ':') (NameChar)*
    NameChar ::= Letter | Digit | '.' | '-' | '_' | ':'
                 | CombiningChar | Extender

So how about using ":" to mark upcase letters? We could then leave lower-case letters and underscores untouched. Example cases:

    original -> transformed       Example url
    ----------------------------------------------------------------------
    MyName   -> :My:Name          http://foo/bar/SomeClass.html#:My:Name
    _my_name -> _my_name          http://foo/bar/SomeClass.html#_my_name
    __init__ -> __init__          http://foo/bar/SomeClass.html#__init__
    MY_CONST -> :M:Y_:C:O:N:S:T   http://foo/bar/SomeClass.html#:M:Y_:C:O:N:S:T

Recall that this mangling is *only* used for anchors of functions, methods, & variables -- not for classes. So in the common case, where functions, methods, and vars are written in all lower case, the mangling will be an identity function.
If we wanted to make the MY_CONST look better, at the expense of complexity, we could use these rules instead:

- If name is all-caps, return name surrounded by ":"
- Otherwise, prefix each capital letter w/ ":"

Making the examples:

    MyName   -> :My:Name     http://foo/bar/SomeClass.html#:My:Name
    _my_name -> _my_name     http://foo/bar/SomeClass.html#_my_name
    __init__ -> __init__     http://foo/bar/SomeClass.html#__init__
    MY_CONST -> :MY_CONST:   http://foo/bar/SomeClass.html#:MY_CONST:

-Edward
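Both variants of the ":" scheme proposed in this message fit in one short function. The sketch below is written only from the rules and examples given here; the name `mangle_anchor` and the `compact_all_caps` flag are hypothetical, and "all-caps" is interpreted as Python's `str.isupper()`, which is an assumption.

```python
def mangle_anchor(name, compact_all_caps=False):
    """Mark uppercase letters with ':'; lowercase letters and '_' are untouched.

    With compact_all_caps=True, an all-caps name is wrapped in ':' instead
    of having every letter marked.
    """
    # str.isupper() is False for names with no cased characters, so a name
    # made only of underscores or digits falls through to the second rule.
    if compact_all_caps and name.isupper():
        return ":" + name + ":"
    return "".join(":" + ch if ch.isupper() else ch for ch in name)


for name in ("MyName", "_my_name", "__init__", "MY_CONST"):
    print(name, "->", mangle_anchor(name), "|",
          mangle_anchor(name, compact_all_caps=True))
```

For all-lowercase names both variants are the identity function, matching the common case described in this message.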