[Epydoc-devel] Valid XHTML patch
From: Jacek K. <ja...@bn...> - 2004-10-03 19:52:56
Hello,
I am using Epydoc as the documentation generator for my project. It is
great at extracting the documentation, and the output looks nice (when
frames are disabled), but... I have checked the HTML output with a
validating parser, and it was invalid.
I have looked into the HTML code, and although it was marked as XHTML
1.0 Transitional, it was far from being valid HTML of any kind (according
to the W3C specifications). The code was invalid and ugly: a good example
of how not to write HTML. I would be ashamed to publish such documents on
my project's page (I guess I am not the only one). So I have filed bug
report #1039049 on SourceForge.
Today I looked into the Epydoc code and fixed the HTML generation.
I have attached the patch to the bug report.
What the patch does:
- makes the generated HTML valid XHTML 1.0 Transitional and XHTML 1.0 Frameset,
- replaces deprecated elements (like <font/> and <center/>) with
structural elements and/or proper style,
- replaces <b/> and <i/> (which are not recommended by W3C) with
<strong/> and <em/> or other elements with semantics matching the
usage (e.g. <h1>). Wherever I could guess what the markup means, I added
a "class" attribute so the style of the element may be further
changed,
- moves the layout definitions out of the predefined styles (which differ
only in colors) into a single string, so the same definitions are not
repeated in css.py and may be modified in one place,
- escapes control characters in colorized regular expressions. Without
that, a regexp like r'[\x00-1f]' would result in a 0 byte being included
in the HTML output, which is invalid and makes the rendering of the page
unpredictable (some browsers will treat the byte as EOF).
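To illustrate the last point, here is a minimal Python sketch of this kind of escaping. It is not the code from the patch: the function name escape_control_chars is hypothetical, and leaving tab, newline and carriage return untouched is my assumption (those are allowed in XHTML text).

```python
import re

# C0 control characters except tab (\x09), newline (\x0a) and carriage
# return (\x0d). Keeping those three is an assumption of this sketch.
_CONTROL_CHARS = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f]')


def escape_control_chars(text):
    """Replace each raw control character with a visible \\xNN escape.

    Hypothetical helper: it stands in for whatever the colorizer does
    just before writing a literal character into the HTML output.
    """
    return _CONTROL_CHARS.sub(lambda m: '\\x%02x' % ord(m.group()), text)


if __name__ == '__main__':
    # The 0 byte is written out as the four characters \x00 instead.
    print(escape_control_chars('[\x00-\x1f]'))
```

With something like this applied, the 0 byte never reaches the generated page; the reader sees the escape sequence instead.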
I tried to make the generated pages not differ from the ones generated
without the patch, and they should not differ much in any modern
standard-compliant browser. If they do -- it is a bug (unless they look
better now, of course). If they look worse in some
non-standard-compliant, but "important" browser (read: IE), then some
hack may be needed. But there is no reason to use invalid HTML (ways to
do that with valid HTML may be found on the Net).
Some more things that could be done:
- Include alternate stylesheets in the output, so they can be chosen
while browsing the documentation. That seems easy and I will probably
do that soon.
- XHTML Strict generation. That would probably need many more code
changes and removal of frames support (or making it an option).
- Drop using tables for layout. Epydoc doesn't do that much, as most of
its output consists of real tables.
- Unicode support. The output could always be UTF-8. But, I guess, a lot
of Epydoc code would have to be updated, not only the HTML generation.
Fortunately most code documentation is English only, even in
international projects.
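As a rough idea of the alternate stylesheets item, a small Python sketch follows. It is only an assumption about how it might look, not part of the patch: the function stylesheet_links and the file names are made up for illustration. The first sheet is emitted as the preferred one and the rest as rel="alternate stylesheet", which is how HTML 4.01 lets a browser offer a choice of styles.

```python
from xml.sax.saxutils import quoteattr


def stylesheet_links(stylesheets):
    """Build <link/> elements for a list of (title, href) pairs.

    Hypothetical helper: the first pair becomes the preferred
    stylesheet, every later pair an alternate one.
    """
    lines = []
    for index, (title, href) in enumerate(stylesheets):
        rel = 'stylesheet' if index == 0 else 'alternate stylesheet'
        # quoteattr() escapes the value and adds the surrounding quotes.
        lines.append('<link rel="%s" type="text/css" href=%s title=%s />'
                     % (rel, quoteattr(href), quoteattr(title)))
    return '\n'.join(lines)


if __name__ == '__main__':
    # File names below are illustrative only.
    print(stylesheet_links([('Default', 'epydoc.css'),
                            ('Blue', 'blue.css')]))
```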
I hope you will find my patch useful and that it will be applied to the
Epydoc code.
Greets,
Jacek