Re: [Doxygen-develop] strategies for XHTML support
From: Francesco M. <f18...@ya...> - 2008-03-02 15:01:12
do...@ke... wrote:
> On Sun, Mar 02, 2008 at 02:00:35PM +0100, Francesco Montorsi wrote:
>
>> btw if you are not interested to reach 100% well-formness in a single
>> patch, then the one I've attached seems to work quite well in terms of
>> output rendering (i.e. there are no big differences to the std doxygen
>> HTML4, just some spacing differences). I'm not sure however it
>> well-behaves respect the other output formats...
>
> It might also break some post-processing some places use. I know
> several companies which post-process Doxygen output to create their own
> documentation, but I don't know how robust their processing is.

I think that post-processing the HTML output will be much simpler if doxygen starts outputting XHTML instead of HTML4, which is not valid XML. Companies doing this kind of post-processing will eventually need to change their scripts, but that is probably true after every doxygen release: the structure of the generated HTML is not guaranteed to remain the same and, in fact, it usually changes from one release to the next.

>> (*) = I wanted to enable XHTML output in order to use XSLT stylesheets
>> over it, instead of doing it over the doxygen XML output.
>
> Have you tried processing it with the W3C 'tidy' program? That usually
> does a pretty good job of producing XHTML from HTML with close tags
> missing (what lynx calls "tag soup"), and will produce XML as well as
> XHTML output. (Doing it on the number of files Doxygen creates is a
> pain and slow, though, and you need to disable its comments about how
> 'bad' the original is.)

tidy does a good job, but I think it's a "dirty" solution: its output is not guaranteed to be the "right" one (it repairs the HTML as best it can, but it's still a machine and can't look at the context to understand what the right fix is) and it may generate rendering artefacts (caused by syntactically correct but semantically wrong markup).
It's true that running 'tidy' over the XHTML generated for the doxygen samples (I'm testing with my patch applied) shrinks the validation errors from about 700 to about 30 (great!!), but those 30 still need human review. In the bigger project I'm trying to convert to Doxygen (FYI it's wxWidgets), there would still be hundreds of errors to handle by hand. Not feasible.

It's the doxygen output that should be correct without any further processing. Doxygen cannot continue to produce HTML4 forever (*)! Technologies are evolving, and I think the switch from HTML4 to XHTML is worth some trouble and a few regressions. It's just that sometimes I think all the doxygen sources should be entirely rewritten and reorganized (with more comments!!) in order to fix all of these errors.

In conclusion: I need a pause and some help to complete this patch :) What's your (doxygen team) interest in XHTML? Isn't it one of your priorities?

Francesco

(*) = I also strongly doubt it produces VALID HTML4 now; testing that is not easy, since an HTML4 validation test is much more difficult than an XHTML one and requires me to upload the generated output file by file to the W3C validator.
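
For what it's worth, a rough local check is possible without uploading anything, because well-formed XHTML can be fed to any XML parser. The sketch below is only an illustration and is not part of the patch: the function names and the idea of pointing it at a directory of generated pages are my own assumptions. It tests XML well-formedness only, which is weaker than validating against the XHTML DTD, and the standard-library parser used here does not load the DTD, so named entities such as `&nbsp;` are reported as errors.

```python
import sys
import xml.etree.ElementTree as ET
from pathlib import Path


def well_formedness_error(markup):
    """Return None if markup (str or bytes) parses as XML, else the parser's message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None


def check_directory(html_dir):
    """Map every *.html file under html_dir that is not well-formed to its error."""
    failures = {}
    for path in sorted(Path(html_dir).rglob("*.html")):
        # Pass raw bytes so the parser honours the encoding declared in the file.
        error = well_formedness_error(path.read_bytes())
        if error is not None:
            failures[str(path)] = error
    return failures


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python check_xhtml.py path/to/generated/html
    problems = check_directory(sys.argv[1])
    for name, message in problems.items():
        print(f"{name}: {message}")
    print(f"{len(problems)} file(s) not well-formed")
```

A page that passes this check can at least be loaded by an XSLT processor or any other XML tool; a page that fails it cannot, whatever the browser makes of it.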