Re: [Epydoc-devel] XML Docbook output


Saint Germain wrote:
 >> 1. can be cargo-culted from the LaTeX writer, but this would probably
 >> lead to more maintenance burden. A nicer approach would probably be to
 >> refactor the LaTeX writer, creating a base class that implements the
 >> strategy (e.g. write the start; for each class: write it; write the
 >> end...) of creating a document from the parsed documentation (by
 >> "document" I mean something that can be read from beginning to end,
 >> which is very different from the hypertext created by the html
 >> writer) but delegates the details of how to write the leaves (such as
 >> a paragraph or a string) to concrete subclasses (cf. the strategy
 >> pattern). Such a base class may be subclassed into concrete writers
 >> for LaTeX, Docbook, plain text (which is currently less maintained
 >> than the other writers), and then single-page html, reST...
 >
 > Seems reasonable.
 > Does that mean that the LaTeX and html writers are currently
 > completely independent?

Yes, they are. They actually create very different output: the html
writer creates a hypertext with many index pages summarizing different
facets of a package (the modules, the classes, all the names...), and
each module gets a matching html page, with links to other pages with
annotated and colored code and some javascript thrown in. The LaTeX
writer (into which I don't have much insight: I have rarely used it or
worked on its source) also creates many files, but ties them together
into a document suitable for printing.

There are some common services that both the LaTeX and html writers
could benefit from, but I see many more similarities between two writers
of linear documents, as the LaTeX and Docbook ones would be.

 > And how is the LaTeX writer currently built? Do you just throw out the
 > LaTeX markup, one piece after another, in a procedural way?

I'll try to answer, but I don't know if I understood your question
correctly.

The writer (LaTeX or any other) receives a DocIndex instance holding
the whole result of the analysis Epydoc carried out on the code that
was fed to it. It's the writer's responsibility to decide what to use
from that bulk of information, and how: the writer is a viewer of the
model in the DocIndex instance.

The LaTeX document generation is directed by the write() method, which
creates a top-level file and then iterates over the DocIndex nodes,
running the proper generation function for each class and module it
meets. In this "generation strategy" there are almost no LaTeXisms
(about the only thing that smells like LaTeX is the ".tex" extension of
the generated files).

The functions called by the write() method on a selected item decide
what to write about that item; so for a "class" there is an iteration
over its methods, and for each of them a proper write_something()
function is called. Apart from sporadic snippets of LaTeX code you can
find here and there, the navigation of the DocIndex nodes is basically
the same you would carry out to generate Docbook output (assuming you
want to put the same information in such documents, of course).

The leaf functions called during the index navigation are the ones
where you will find most of the \stuff{\like\this} you can expect in a
LaTeX generator, which should be replaced by
<stuff><like/><this/></stuff> to create a Docbook document.
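
To make that layering concrete, here is a toy sketch (not Epydoc's
actual code). It assumes a plain dict as a stand-in for the DocIndex,
and the names write_start, write_heading, write_method and write_end are
hypothetical; only the overall shape (a markup-free traversal calling
markup-aware leaves) comes from the description above.

```python
# Hypothetical stand-in for the parsed documentation: class name -> methods.
# The real DocIndex API is richer and is not reproduced here.
TOY_INDEX = {"Shape": ["area", "perimeter"], "Circle": ["radius"]}


# --- leaf functions: the only place where LaTeX markup appears ---
def write_start(out):
    out("\\begin{document}\n")


def write_end(out):
    out("\\end{document}\n")


def write_heading(out, title):
    out("\\section{%s}\n" % title)


def write_method(out, name):
    out("\\textbf{%s}\n" % name)


# --- traversal: decides what to write, knows nothing about the markup ---
def write_class(out, class_name, methods):
    write_heading(out, class_name)
    for name in methods:
        write_method(out, name)


def write(out, index):
    write_start(out)
    for class_name in sorted(index):
        write_class(out, class_name, index[class_name])
    write_end(out)


chunks = []
write(chunks.append, TOY_INDEX)
result = "".join(chunks)
print(result)
```

Swapping the four leaf functions for Docbook-emitting ones would leave
write() and write_class() untouched.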

 > Well I would have started even without a refactoring, just for the fun
 > of it but it's better to wait and see if you want to refactor first...

Then, please, start and have a good time while coding :) Please don't
wait for a refactoring initiative handed down from above, mainly because
there's no point in creating a base class for a single subclass: it
would be programming in a vacuum. Trying to adapt the LaTeX writer into
a Docbook writer will instead give you a precise idea of what is common
to both writers and what is specific to each one.

In practice, I'd proceed this way:

  - copy the current LatexWriter class into a LinearDocumentWriter class.
  - create an empty subclass DocbookWriter(LinearDocumentWriter)
  - walk the code from the entry point write(): each time you stumble
    into a latex output string, translate such output into Docbook
    idiom, but put the updated strings into the DocbookWriter.

You will end up with a base class dispatching actions to a concrete
subclass, which is actually an implementation of the strategy design
pattern.
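
As a rough sketch of where those steps might lead (an arrangement that
is sometimes also described as the template method pattern): the hook
names below, the dict used in place of the DocIndex, and the Docbook
tags chosen are all assumptions made for illustration, and the output
has not been validated against the Docbook DTD.

```python
from xml.sax.saxutils import escape


class LinearDocumentWriter:
    """Format-neutral traversal; every hook name here is hypothetical."""

    def write(self, out, classes):
        # `classes` maps class names to method names: a toy stand-in
        # for the DocIndex, not the real Epydoc data structure.
        self.write_start(out)
        for name in sorted(classes):
            self.write_class(out, name, classes[name])
        self.write_end(out)

    def write_class(self, out, name, methods):
        self.write_class_start(out, name)
        for method in methods:
            self.write_method(out, method)
        self.write_class_end(out)

    # Hooks: a subclass that has not translated one yet fails loudly.
    def write_start(self, out):
        raise NotImplementedError

    def write_end(self, out):
        raise NotImplementedError

    def write_class_start(self, out, name):
        raise NotImplementedError

    def write_class_end(self, out):
        raise NotImplementedError

    def write_method(self, out, method):
        raise NotImplementedError


class DocbookWriter(LinearDocumentWriter):
    """Only the markup lives here; the tags are illustrative."""

    def write_start(self, out):
        out("<article>\n")

    def write_end(self, out):
        out("</article>\n")

    def write_class_start(self, out, name):
        out("<section><title>%s</title>\n" % escape(name))

    def write_class_end(self, out):
        out("</section>\n")

    def write_method(self, out, method):
        out("<para>%s</para>\n" % escape(method))


chunks = []
DocbookWriter().write(chunks.append, {"Shape": ["area"]})
docbook = "".join(chunks)
print(docbook)
```

Because the untranslated hooks raise NotImplementedError, running the
writer while walking the code shows which output strings still need a
Docbook counterpart.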

See for example the LatexWriter.write_class() method: it's almost
independent of the output format, except for a single statement:

     # Label our current location.
     out('    \\label{%s}\n' % self.label(doc))

You may replace it with a call:

     self.write_current_location(out, self.label(doc))

and write a matching implementation in the Docbook subclass:

     def write_current_location(self, out, label):
         # pretending I know Docbook markup... (DocBook 4 has <anchor/>)
         out('    <anchor id="%s"/>\n' % label)

Where to dispose of all those messy LaTeX strings? Another concrete
subclass would probably be the ideal bin :)
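
A standalone toy version of that end state, for the one hook discussed
here: the label() scheme and the one-argument write_class() are
simplifications invented for the example (the real methods take a doc
object), and the Docbook side assumes DocBook 4's anchor element.

```python
class LinearDocumentWriter:
    def label(self, name):
        # Hypothetical label scheme; the real LatexWriter.label() differs.
        return "class-" + name.replace(".", "-")

    def write_class(self, out, name):
        # Label our current location, without knowing the output format.
        self.write_current_location(out, self.label(name))

    def write_current_location(self, out, label):
        raise NotImplementedError


class LatexWriter(LinearDocumentWriter):
    # The "bin" for the LaTeX strings moved out of the base class.
    def write_current_location(self, out, label):
        out('    \\label{%s}\n' % label)


class DocbookWriter(LinearDocumentWriter):
    def write_current_location(self, out, label):
        out('    <anchor id="%s"/>\n' % label)


latex, docbook = [], []
LatexWriter().write_class(latex.append, "pkg.Shape")
DocbookWriter().write_class(docbook.append, "pkg.Shape")
print("".join(latex) + "".join(docbook))
```

Both writers share write_class() and label(); they differ only in the
string each one emits for the location marker.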

Hope this helps. Feel free to write if you need help. Regards,

Daniele