Re: [Libxmlplusplus-general] parametrizing libxml++ for the character /string type

Christophe de VIENNE wrote:

> I'm currently not having the time to do this, but your patches are very 
> welcome. However I won't put them in the current branch but in the unstable 
> one (which does not exists yet but as soon as it is needed, it will) as 
> Murray said.

OK, here is the first patch. The goal was to change the implementation to
delegate whatever we can to libxml2, while respecting the existing API.

So, everything compiles, and the examples run unchanged.

There are, however, a couple of issues which need to be addressed. I hope we can
sort them out together. First, a brief account of what this change does:

All libxml2 structures use '_private' members for application data. I use that
to point to the corresponding libxml++ wrapper class, so we can do a reverse
lookup. For example, to access the first child node of an xmlpp::Node object,
you'd do something like:

reinterpret_cast<Node *>(this->_impl->children->_private)

Easy enough, isn't it?
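
A minimal sketch of this reverse lookup, under stated assumptions: 'RawNode' is a
hypothetical stand-in for libxml2's xmlNode (only the fields used here), and the
'Node' class below is an illustration, not the actual libxml++ class. It uses
static_cast, which is sufficient for converting a void pointer back to the wrapper
type, and guards against a missing child.

```cpp
#include <cassert>

// Hypothetical stand-in for libxml2's xmlNode: only the fields used here.
struct RawNode {
    void    *_private = nullptr;  // application data slot
    RawNode *children = nullptr;  // first child
    RawNode *next     = nullptr;  // next sibling
};

// Illustrative wrapper, loosely modelled on the idea of xmlpp::Node.
class Node {
public:
    // Register the wrapper in the raw node so it can be found again later.
    explicit Node(RawNode *impl) : _impl(impl) { _impl->_private = this; }
    // Unregister on destruction so no dangling wrapper pointer is left behind.
    ~Node() { _impl->_private = nullptr; }
    Node(const Node &) = delete;
    Node &operator=(const Node &) = delete;

    // Reverse lookup: raw first child -> its wrapper.
    // Returns nullptr if there is no child or the child has no wrapper yet.
    Node *first_child() const {
        RawNode *child = _impl->children;
        return child ? static_cast<Node *>(child->_private) : nullptr;
    }

private:
    RawNode *_impl;  // not owned by the wrapper
};

// Small demo: true if the lookup finds the child's wrapper once it exists.
bool reverse_lookup_demo() {
    RawNode parent_raw, child_raw;
    parent_raw.children = &child_raw;

    Node parent(&parent_raw);
    if (parent.first_child() != nullptr) return false;  // child not wrapped yet

    Node child(&child_raw);
    return parent.first_child() == &child;
}
```

Whether wrappers are created eagerly or on first access is left open here; the
sketch only shows the pointer round trip.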

The tricky point is, as said earlier, ownership management. libxml2's nodes
are owned by their parent nodes (and ultimately by the enclosing document),
not by the libxml++ wrapper object. We need to work out how the transfer of
ownership should happen when a node is unlinked from its document / parent node.

Another tricky point is that libxml2 will automagically merge nodes occasionally,
for example if you insert a new text node right after an existing text node.
Thus,

Node *Node::add_child(const std::string &)

may or may not return a new object. It is, however (and luckily), owned by the
parent node, so the caller doesn't have to care. A similar argument can be made
for setting attributes.
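
To illustrate why the return value may or may not be a new object, here is a small
sketch that mimics the merging behaviour with a hypothetical 'SketchNode' type. It
does not call libxml2; the names 'SketchNode' and 'add_text' are assumptions made
up for this example.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplified node; not the libxml++ class.
struct SketchNode {
    bool is_text = false;
    std::string content;
    std::vector<std::unique_ptr<SketchNode>> children;  // parent owns its children

    // Mimics the merge: appending text right after a text node extends the
    // existing node instead of creating a new one.
    SketchNode *add_text(const std::string &text) {
        if (!children.empty() && children.back()->is_text) {
            children.back()->content += text;
            return children.back().get();  // existing node, not a new one
        }
        auto node = std::make_unique<SketchNode>();
        node->is_text = true;
        node->content = text;
        children.push_back(std::move(node));
        return children.back().get();      // freshly created node
    }
};

// Small demo: two consecutive text insertions end up in a single node.
bool merge_demo() {
    SketchNode parent;
    SketchNode *first  = parent.add_text("Hello, ");
    SketchNode *second = parent.add_text("world");
    return first == second
        && parent.children.size() == 1
        && first->content == "Hello, world";
}
```

Because the parent owns the result in both branches, the caller can treat the
returned pointer the same way whether or not a merge happened.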

All this said, I believe there are now a couple of ways to enhance the API itself:

* I'd like to add iterators to make child node and attribute traversal more efficient
   (right now they are copied into a temporary container that is returned)

* I'd like to suggest adding a 'Visitor' interface for simpler traversal of a document,
   notably to externalize it (the 'write' method would look a *lot* simpler)

* the domparser should be refactored into a 'Document' and possibly a single
   'Document *parse_document(std::istream &)' factory function.

* add new node types such as 'processing instruction', 'cdata section', etc.

* new functionality can be added (notably the xpath lookup stuff I have been suggesting)

* do the split into character type agnostic/specific parts, and hook up external
   unicode libraries
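
As a rough sketch of the iterator idea from the list above: instead of copying
children into a temporary container, an iterator can walk the sibling chain and
resolve each wrapper through '_private'. 'RawNode' is again a hypothetical stand-in
for libxml2's xmlNode, and 'ChildIterator', 'children_begin' and 'children_end' are
names invented for this example, not existing libxml++ API.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

// Hypothetical stand-in for libxml2's xmlNode: only the fields used here.
struct RawNode {
    void    *_private = nullptr;  // points back to the wrapper
    RawNode *children = nullptr;  // first child
    RawNode *next     = nullptr;  // next sibling
};

class ChildIterator;

// Illustrative wrapper; registers itself in the raw node's _private slot.
class Node {
public:
    explicit Node(RawNode *impl) : _impl(impl) { _impl->_private = this; }
    ~Node() { _impl->_private = nullptr; }
    Node(const Node &) = delete;
    Node &operator=(const Node &) = delete;

    ChildIterator children_begin() const;
    ChildIterator children_end() const;

private:
    RawNode *_impl;  // not owned by the wrapper
};

// Input iterator over the sibling chain; yields wrapper pointers by value.
class ChildIterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type        = Node *;
    using difference_type   = std::ptrdiff_t;
    using pointer           = void;
    using reference         = Node *;

    explicit ChildIterator(RawNode *cur = nullptr) : _cur(cur) {}

    Node *operator*() const { return static_cast<Node *>(_cur->_private); }
    ChildIterator &operator++() { _cur = _cur->next; return *this; }
    ChildIterator operator++(int) { ChildIterator old(*this); ++*this; return old; }
    bool operator==(const ChildIterator &other) const { return _cur == other._cur; }
    bool operator!=(const ChildIterator &other) const { return _cur != other._cur; }

private:
    RawNode *_cur;  // nullptr marks the end of the chain
};

inline ChildIterator Node::children_begin() const { return ChildIterator(_impl->children); }
inline ChildIterator Node::children_end() const { return ChildIterator(nullptr); }

// Small demo: visits both children in order, without a temporary container.
bool iteration_demo() {
    RawNode root_raw, a_raw, b_raw;
    root_raw.children = &a_raw;
    a_raw.next = &b_raw;
    Node root(&root_raw), a(&a_raw), b(&b_raw);

    Node *expected[] = {&a, &b};
    std::size_t i = 0;
    for (ChildIterator it = root.children_begin(); it != root.children_end(); ++it, ++i) {
        if (i >= 2 || *it != expected[i]) return false;
    }
    return i == 2;
}
```

The sketch assumes every child already has a wrapper; a real implementation would
have to decide what dereferencing an unwrapped node should do.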

Anyway, I guess that's enough for tonight :-)
Let me know what you think of this plan...

Enjoy,
		Stefan