Re: [Epydoc-devel] XML Docbook output

On Mon, 5 Feb 2007 17:41:43 +0100, "Daniele Varrazzo"
<dan...@gm...> wrote :

> > I'm currently an active user of Python and XML Docbook and would
> > like to have my Python documentation generated in XML Docbook.
> >
> > Epydoc seems to be a really fine documentation generator and I would
> > like to help you add an XML Docbook output.
> A sketch of the epydoc structure, and of where you may want to put
> your hands if you want to implement it, is the following::
> 
>              Markups                 Writers
> 
>              epytext |             | LaTeX
>              javadoc | -> build -> | HTML
>            plaintext |             | plaintext
>     restructuredtext |
> 
> Each markup is implemented by a ParsedDocstring subclass. Writers are
> more free-form beasts: there is no base class for writers, whose
> interface is a single function called by the cli.py module after
> options have been parsed and a docindex generated.

Ok I understand.

> The ParsedDocstring interface offers methods to retrieve a docstring
> in the target output format: currently the to_plaintext(),
> to_html() and to_latex() methods are exposed. The concrete subclasses
> are responsible for implementing such methods (while a fallback must
> be provided by the base class).
> 
> Writing a Docbook generator should require the following steps:
> 1. add a new writer implementing the sequences of calls to observe the
> built set of documents;
> 2. add a new method ParsedDocstring.to_docbook() implementing a
> default behavior, which typically is to return little more than the
> plain text version;
> 3. implement the details of converting the specific markups into
> Docbook format.
> 
> 2. should be easy. You may wrap the to_plaintext() output into
> <programlisting> tags, i guess; see ParsedDocstring.to_html() for an
> example.
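
For the record, here is how I read step 2, as a minimal sketch only. The class below is a stand-in I made up so the example runs on its own; it is not epydoc's real ParsedDocstring, and the to_docbook() name and the optional docstring_linker argument are assumptions borrowed from the description above.

```python
from xml.sax.saxutils import escape


class ParsedDocstringSketch:
    """Hypothetical stand-in for epydoc's ParsedDocstring base class."""

    def __init__(self, text):
        self._text = text

    def to_plaintext(self, docstring_linker=None, **options):
        # The real base class derives this from the parsed markup;
        # here we simply hand back the raw text.
        return self._text

    def to_docbook(self, docstring_linker=None, **options):
        # Assumed fallback: escape the XML special characters and wrap
        # the plain text in a <programlisting> element.
        plaintext = self.to_plaintext(docstring_linker, **options)
        return "<programlisting>%s</programlisting>" % escape(plaintext)


print(ParsedDocstringSketch("if a < b & c: pass").to_docbook())
```

Concrete markups would then override to_docbook() one at a time, while everything else keeps falling back to the monospaced listing.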

Ok it's a start

> 1. can be cargo-culted from the LaTeX writer, but this probably would
> lead to more maintenance burden. Probably a nicer work would be to
> refactor the LaTeX writer creating a base class implementing the
> strategy (e.g. write the start; for each class: write it; write the
> end...) of creating a document from the parsed documentation (as
> "document" i mean something that can be read from the beginning to the
> end, which is very different from the hypertext created by the html
> writer) but delegating the details of how to write the leaves (such as
> a paragraph or a string) to concrete subclasses (cf. the strategy
> pattern). Such a base class may be subclassed into concrete writers
> for LaTeX, Docbook, plain text (which is currently less maintained
> than the other writers), and going on with single page html, reST...
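
To check that I understand the proposed refactoring, here is a rough sketch of it. Every name in it (LinearWriter, start, class_section, end) is hypothetical and none of it is taken from the existing epydoc writers; the base class only fixes the order in which the document is produced, and the subclasses decide how each piece is written.

```python
from xml.sax.saxutils import escape


class LinearWriter:
    """Hypothetical base class for writers producing a linear document."""

    def write(self, classes):
        # The sequence (start, each class, end) lives here once,
        # shared by every output format.
        parts = [self.start()]
        for name, doc in classes:
            parts.append(self.class_section(name, doc))
        parts.append(self.end())
        return "\n".join(parts)

    def start(self):
        raise NotImplementedError

    def class_section(self, name, doc):
        raise NotImplementedError

    def end(self):
        raise NotImplementedError


class DocbookWriter(LinearWriter):
    def start(self):
        return "<book>"

    def class_section(self, name, doc):
        return "<section><title>%s</title><para>%s</para></section>" % (
            escape(name), escape(doc))

    def end(self):
        return "</book>"


class LatexWriter(LinearWriter):
    # Note: no LaTeX escaping is done in this sketch.
    def start(self):
        return "\\begin{document}"

    def class_section(self, name, doc):
        return "\\section{%s}\n%s" % (name, doc)

    def end(self):
        return "\\end{document}"


print(DocbookWriter().write([("Foo", "A <small> class")]))
```

With this layout a new output format only has to supply the leaf methods, which is where I would expect the Docbook-specific work to go.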

Seems reasonable.
Does that mean that the LaTeX and html writers are currently completely
independent?
And how is the LaTeX writer currently built? Do you just emit the
LaTeX markup piece by piece in a procedural way?

> Step 3 could require more knowledge of the individual markups. On the
> plus side it can be accomplished gradually, because there is always a
> fallback that would appear as monospaced text. The current to_html()
> implementation would help you of course.

That step could be quite long but rather easy: Docbook markup is
really clear and has none of the subtleties or magic of LaTeX.

> I'd ask Edward if he would welcome the refactoring described in step
> 1. I think that creating a Docbook writer the naive way would lead to
> harder maintenance, inconsistencies between formats and too much
> code duplication.

I of course agree.
I can help a few hours (let's say 3-5) per week at most.

> Let me know if i can help you, but i'd like to hear Ed's advice first.

Well I would have started even without a refactoring, just for the fun
of it but it's better to wait and see if you want to refactor first...

Regards,