Re: [Doxygen-develop] Status of XML development? (was Adding of new (all) HTML entities?)


On Fri, Jul 20, 2001 at 09:42:21AM +0200, Prikryl,Petr wrote:
> Hi Michael and others,
> 
> Michael Lindig wrote...
> > you wish the XML feature. Do you have an idea of the XML output format? I
> > think the format should be like this:
> >
> > <DOXYGEN>
> >  	<file path="...">
> [snip]
> > Do you have another idea?
> 
> There is already someone working on the XML part (Dimitri or someone
> else?).

That will be me.

> If you look at the doxygen/addon/xmlgen directory, you can find the
> sources and also doxygen.dtd.  That person is definitely better at XML
> than I am.  It is also important to know doxygen's internal structures
> (which elements the parser extracts) to choose the appropriate tags.

Well, I'm not really an expert in XML, but I'm learning along the way.

> 
> This is not my case.  I am not prepared enough to discuss
> what doxygen's XML tags should look like.
> 
> However, I feel that XML is the right way to go, and I wish that more
> doxygen developers/users were interested in XML output.  Until now,
> I have only heard that some experimental XML support for doxygen
> is being implemented.  I understand that possibly only one or two
> people should work on that in the early stages of development.
> 
> In my last article I tried to explain why I wish XML to be the major
> intermediate and output format, and to express my opinion on adding
> new character entities to the doxygen generators.  Being one of the
> language maintainers, I was thinking hard about how the problems
> with special characters and various encodings should be solved.
> My personal opinion is that it would be endless work without XML.
> Briefly, XML is good for capturing semantics inside documents, and it
> solves the support for various languages and encodings.
> 
> Is there anybody here (or any document) that can explain the current
> status of doxygen's XML development?

XML development has been frozen for some time (mainly due to lack of
time/priority). Initially the XML support was part of doxygen,
but I later decided to move it outside of doxygen and prepare it to
become an output plugin. Meanwhile my ideas have changed slightly.

The plan is to make XML output part of doxygen again, with the
goal of replacing the other output formats in the distant future
(but NOT before there is an equally powerful alternative for each).

I see XML output as an intermediate interface that would allow
several front-ends to produce specific output (e.g. HTML output or
something very different, like code metrics). The XML output would
contain all information; the front-ends would then pick the appropriate
information and transform it into the actual output.

In theory there are plenty of XML tools that can transform XML output
into something else. In practice these tools are just not there (at
least I haven't seen them). All there really is, is an easy way to
parse an XML file and build up the structure it contains in memory.
So the plan is to provide a C++/Qt based XML parser that understands
doxygen's XML output. People who wish to add support for another output
format can do so by using the structures built up by this parser.

The nice thing about having an intermediate file is that the parser and
front-end could also be written in another language such as Python.
Furthermore, other input parsers could produce the same XML output and
benefit from the available front-ends.
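
To make the idea concrete, here is a minimal sketch of a parser plus two
front-ends in Python. The XML layout (the doxygen, file, class, member and
brief elements and their attributes) is invented for illustration and is
not doxygen's actual output format; the names Member, ClassDoc, parse,
html_frontend and metrics_frontend are likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from html import escape

# Hypothetical intermediate XML: element and attribute names are
# assumptions made for this sketch, not doxygen's real format.
SAMPLE_XML = """<doxygen>
  <file path="src/shape.h">
    <class name="Shape">
      <member kind="function" name="area">
        <brief>Returns the area.</brief>
      </member>
      <member kind="function" name="draw"/>
    </class>
  </file>
</doxygen>
"""


@dataclass
class Member:
    kind: str
    name: str
    brief: str = ""


@dataclass
class ClassDoc:
    name: str
    members: list = field(default_factory=list)


def parse(xml_text):
    """Parse the intermediate XML into in-memory structures."""
    root = ET.fromstring(xml_text)
    classes = []
    for cls in root.iter("class"):
        doc = ClassDoc(name=cls.get("name"))
        for m in cls.findall("member"):
            # A missing <brief> element yields an empty description.
            brief = m.findtext("brief", default="")
            doc.members.append(Member(m.get("kind"), m.get("name"), brief))
        classes.append(doc)
    return classes


def html_frontend(classes):
    """Front-end 1: render the structures as a simple HTML fragment."""
    lines = []
    for c in classes:
        lines.append(f"<h1>{escape(c.name)}</h1>")
        lines.append("<ul>")
        for m in c.members:
            lines.append(f"<li>{escape(m.name)}: {escape(m.brief)}</li>")
        lines.append("</ul>")
    return "\n".join(lines)


def metrics_frontend(classes):
    """Front-end 2: compute a few code metrics from the same structures."""
    members = [m for c in classes for m in c.members]
    return {
        "classes": len(classes),
        "members": len(members),
        "undocumented": sum(1 for m in members if not m.brief),
    }


if __name__ == "__main__":
    docs = parse(SAMPLE_XML)
    print(html_frontend(docs))
    print(metrics_frontend(docs))
```

Both front-ends consume the same structures, so adding another output
format would only mean writing another function over the parsed list.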

In summary doxygen would consist of the following:

- the main engine as a library
- the xml parser as a library
- an extendable configuration parser as a library (contains the
  config options for the engine, but can be dynamically extended by the
  front-ends to support more options).
- a number of front-ends, either as libraries or as standalone tools
- some glue to make a user-friendly tool out of these.
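
The extendable configuration parser could work roughly as follows. This is
only a sketch under assumptions: the Config class, its methods, the way
"NAME = value" lines are handled and the METRICS_MAX_MEMBERS option are all
hypothetical and not part of doxygen.

```python
class Config:
    """Option registry that front-ends can extend at run time."""

    def __init__(self):
        self._defaults = {}  # option name -> default value
        self._values = {}    # option name -> value read from a config file

    def register(self, name, default):
        """Declare an option; called by the engine and by each front-end."""
        if name in self._defaults:
            raise ValueError(f"option {name!r} already registered")
        self._defaults[name] = default

    def load(self, text):
        """Read 'NAME = value' lines; '#' starts a comment."""
        for raw in text.splitlines():
            line = raw.split("#", 1)[0].strip()
            if not line:
                continue
            name, sep, value = line.partition("=")
            name = name.strip()
            if not sep or name not in self._defaults:
                raise KeyError(f"unknown or malformed option: {raw!r}")
            self._values[name] = value.strip()

    def get(self, name):
        """Return the configured value, falling back to the default."""
        return self._values.get(name, self._defaults[name])


config = Config()
config.register("PROJECT_NAME", "")           # option owned by the engine
config.register("METRICS_MAX_MEMBERS", "20")  # hypothetical front-end option
config.load("PROJECT_NAME = demo  # set by the user\n")
print(config.get("PROJECT_NAME"), config.get("METRICS_MAX_MEMBERS"))
```

Because unknown names are rejected at load time, a front-end has to
register its options before the user's configuration file is read.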

With respect to the DocBook format: I have looked at it, but I think it
covers only 20% of what doxygen will produce. So the DocBook tools
(which are currently all SGML based, by the way) wouldn't be very
useful.

I do not know how these ideas match/conflict with the character 
encoding problems mentioned by Petr. Would using XML like this still
solve all those problems?

Regards,
  Dimitri