Re: [Doxygen-develop] Doxygen and XML...
From: Dimitri v. H. <di...@st...> - 2001-07-23 15:04:21
On Mon, Jul 23, 2001 at 02:34:33PM +0200, Prikryl,Petr wrote:
> Dimitri wrote...
> >
> > I see XML output as an intermediate interface, that would allow
> > several front-ends to produce specific output (e.g. html output or
> > something very different like code metrics). The XML output would
> > contain all information, the front-ends will then pick the appropriate
> > information and transform that into the actual output.
>
> The more I think about it, the more I also incline towards internal
> XML DTD (say doxygen internal XML).

For the moment I even plan to drop DTD's altogether for the internal XML.
The reasons:
1) The output is not very fixed at the moment, so maintaining a
   consistent DTD is more work.
2) DTD's are to be replaced by schemas (see
   http://www.w3.org/TR/xmlschema-0/) in the future (at least that is
   what W3C would like to see).
3) DTD's are difficult to read and do not have that much expressive power.

> > In theory there are plenty of XML tools that can transform XML output
> > into something else. In practice these tools are just not there (at
> > least I haven't seen them). All that there really is, is an easy way to
> > parse XML and build up the structure contained in an XML file into
> > structures in memory. So the plan is to provide a C++/Qt based XML
> > parser that understands doxygen's XML output. People that wish to
> > add support for another output format can do so by using the structures
> > built up by this parser.
>
> I am very new to XML, but there are tools used with DocBook XML and
> they are more general than only for supporting DocBook. This
> requires better analysis.

Ok, let me know what you find.

> >
> > With respect to DocBook format: I have looked at it, but I think it
> > covers only 20% of what doxygen will produce. So any docbook tool
> > (which are currently all SGML based by the way), wouldn't be very
> > useful.
>
> I am not sure (just starting with DocBook), but I think that DocBook
> is much richer than, say, HTML or LaTeX and it is very suitable for
> producing the end documents. It may not fit to be used as
> the internal XML format, but I would see it as the main final output
> format. Let's think about the following approach:
>
>   input sources
>     |
>     +--> doxygen internal XML (by doxygen parsers)
>            |
>            +--> DocBook XML
>            |      |
>            |      +--> HTML
>            |      +--> RTF
>            |      +--> jadetex --> DVI, PDF, PS
>            |      +--> etc.
>            |
>            +--> some other postprocessing of the internal doxygen XML

Yes, this is the way I see it as well. DocBook XML is just one (but a
generic) output format. I think HTML and LaTeX would still be better
off with a direct front-end. See below why.

>
> The important thing to note is that DocBook is not exclusively SGML
> based. While this may have been true in the past, the majority of
> DocBook users probably use DocBook XML these days. Norman Walsh,
> one of the DocBook leaders, also considers XML to be the future of
> DocBook. I suggest to focus on DocBook XML exclusively (instead of
> thinking about DocBook SGML).
>
> What should be clarified is the mentioned 20% coverage of doxygen's
> problems by DocBook.

The 20% is for the internal XML format. All the structure information
that doxygen currently outputs is not covered by DocBook. Basically only
part of the user documentation blocks and their mark-up is present
(things like <para> and <emphasis> are there, but for instance <bold> is
not). Of course DocBook has a lot more when it comes to formatting a
book, but that is all output format related.

But also for DocBook as the only output format I see problems. Think for
instance about the diagrams that doxygen produces. For HTML these result
in images overlaid with a clickable map. In latex these result in a
vector-based EPS picture. I do not see how to get that result from
DocBook XML.
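
To make the diagram point a bit more concrete, here is a minimal sketch in Python (illustration only, not doxygen code). It assumes a hypothetical list of diagram boxes with pixel coordinates and link targets; the names DiagramNode, html_image_map and latex_figure are made up for this example. It shows the two format-specific results described above: an image with a clickable map for HTML, and an included EPS picture for LaTeX.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class DiagramNode:
    """One box in a class diagram (hypothetical structure)."""
    label: str
    href: str
    box: tuple  # (left, top, right, bottom) in pixels


def html_image_map(image: str, name: str, nodes: list) -> str:
    """HTML front-end: a bitmap image overlaid with a clickable map."""
    lines = [
        f'<img src="{escape(image)}" usemap="#{escape(name)}" alt="{escape(name)}">',
        f'<map name="{escape(name)}">',
    ]
    for node in nodes:
        coords = ",".join(str(c) for c in node.box)
        lines.append(
            f'  <area shape="rect" coords="{coords}" '
            f'href="{escape(node.href)}" alt="{escape(node.label)}">'
        )
    lines.append("</map>")
    return "\n".join(lines)


def latex_figure(eps_file: str) -> str:
    """LaTeX front-end: include the vector-based EPS picture.

    Assumes the LaTeX document loads the graphicx package.
    """
    return "\\begin{figure}\n\\includegraphics{" + eps_file + "}\n\\end{figure}"
```

The same node list feeds both functions, but the clickable regions only survive in the HTML result, which is the kind of format-specific detail a single generic intermediate format would have to carry somehow.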
> > I do not know how these ideas match/conflict with the character
> > encoding problems mentioned by Petr. Would using XML like this still
> > solve all those problems?
>
> I guess that yes -- XML will always help to solve the problems. At
> least, the first parsing phase can be done without problems with
> respect to encoding. Once having the correctly marked internal XML,
> all problems with languages and encoding become covered by the XML
> standard.

Ok, I trust you that it all will work out nicely :-)

[snip]

> > In summary doxygen would consist of the following:
> >
> > - the main engine as a library
> > - the xml parser as a library
> > - an extendable configuration parser as a library (contains the
> >   config options for the engine, but can be dynamically extended by the
> >   front-ends to support more options).
> > - a number of front-ends, either as libraries or as standalone tools
> > - some glue to make a user friendly tool out of these.
>
> As far as I understand, the internal XML format will not contain any
> sentences generated by doxygen translators.
> The things like the text around, say, the list of places from where
> the method is called, is not generated into the internal XML.
> Am I right? I would like to be ;-)

Yes, that is the idea. But of course these sentences have to be
somewhere. With multiple output formats there is the chance of
duplication of information. So there should still be one place where the
translations of these sentences are defined.

Regards,
  Dimitri
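
As a rough illustration of the split described in the summary (a parser library that builds in-memory structures, front-ends that pick what they need, and one shared place for translated sentences), here is a small Python sketch. Everything in it is an assumption made for the example: the XML element names, the TRANSLATIONS table and the function names are invented and are not doxygen's actual output format or API. The real parser is planned as C++/Qt; Python is used here only to keep the sketch short.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Hypothetical internal XML; element and attribute names are invented
# for this sketch and are not doxygen's actual output.
SAMPLE = """
<doxygen>
  <class name="Stack">
    <member name="push" kind="function">
      <calledby>main</calledby>
      <calledby>fill</calledby>
    </member>
    <member name="size" kind="function"/>
  </class>
</doxygen>
"""

# The single place where translator sentences are defined, shared by all
# front-ends. The strings are illustrative, not doxygen's real texts.
TRANSLATIONS = {
    "en": {"called_by": "This function is called from:"},
    "nl": {"called_by": "Deze functie wordt aangeroepen vanuit:"},
}


@dataclass
class Member:
    name: str
    kind: str
    called_by: list = field(default_factory=list)


@dataclass
class ClassDoc:
    name: str
    members: list = field(default_factory=list)


def parse_internal_xml(text: str) -> list:
    """Parser library: build in-memory structures from the internal XML."""
    root = ET.fromstring(text.strip())
    classes = []
    for class_elem in root.findall("class"):
        doc = ClassDoc(name=class_elem.get("name"))
        for m in class_elem.findall("member"):
            callers = [e.text for e in m.findall("calledby")]
            doc.members.append(Member(m.get("name"), m.get("kind"), callers))
        classes.append(doc)
    return classes


def text_front_end(classes: list, lang: str = "en") -> str:
    """One front-end: plain text, with sentences taken from TRANSLATIONS."""
    sentence = TRANSLATIONS[lang]["called_by"]
    out = []
    for c in classes:
        out.append(f"class {c.name}")
        for m in c.members:
            out.append(f"  {m.name}()")
            if m.called_by:
                out.append(f"    {sentence} {', '.join(m.called_by)}")
    return "\n".join(out)


def metrics_front_end(classes: list) -> dict:
    """Another front-end: a trivial code metric (members per class)."""
    return {c.name: len(c.members) for c in classes}
```

Note that the XML itself carries only the caller names; the sentence around them is added by the front-end from the shared table, so a second output format would not need its own copy of the translations.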