From: Stefanie S. <sch...@ku...> - 2002-10-01 19:21:27
Dear XMLTK developers,

I am parsing an XML file and want to "filter" out certain subtrees as a whole. The task is very similar to that of xrun, but I allow several XPath expressions, including recursive ones, to be registered with the XPath processor. (In fact, I only added a loop to xrun.cpp to achieve this.)

However, due to the implementation of TSAX2XML, I do not always get a "clean" subtree. For example, take

    ./my_xrun / bib/book library.xml

For an XML file like

    <bib>
      <book comment="ordered">
        <auth> <name>Newcomer</name> </auth>
        <auth> <name>Bernstein</name> </auth>
        <title>TA Processing</title>
        <year>1997</year>
      </book>
      ...

the context events are invoked at the following places:

    START context 1
    <bib
    START context 2
    >
      <book comment="ordered">
        <auth> <name>Newcomer</name> </auth>
        <auth> <name>Bernstein</name> </auth>
        <title>TA Processing</title>
        <year>1997</year>
      </book>
    END context 2
    ...

Because of the way TSAX2XML is implemented, the closing bracket of a start tag is sometimes not written until the next event is invoked. Above, the event for context 2 is invoked before the bracket of the <bib tag is closed. The reason is the following strategy:

    bool CTSAX2XML::startElement(XTOKEN xtName)
    {
        _CloseBracketIfNeeded();
        ...

So if I want to take an XML document and write several subtrees to different files, I will run into problems. One example of such a task would be to write all books from a library into one file and collect all authors who have written books in another file. If I simply register the paths with the XPath processor, wait for the start of a context, and write everything up to the end of the context into a file (using TSAX2XML), the file content may not be well-formed XML.

I was hoping one could simply plug existing modules together, and that I could avoid writing my own version of TSAX2XML. I do think the problem will exist for other users as well; filtering out complete subtrees should be a common task.

This is my question: Does this mean I can't use TSAX2XML for a task like this? Will TSAX2Bin have the same behavior?

Thanks,
Steffi
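
The deferred-bracket behavior described in this message can be reproduced with a small stand-alone sketch. Everything in it is hypothetical: LazyXmlWriter, redirect() and extractBook() are illustration names, not part of XMLTK, and the class only imitates the "close the bracket on the next event" strategy quoted from CTSAX2XML::startElement. It shows that switching the output target while a '>' is still pending puts the stray bracket into the subtree file, and that flushing the pending bracket before the switch would avoid it. Whether TSAX2XML exposes a way to force such a flush is not known here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a SAX-to-XML serializer (NOT XMLTK's TSAX2XML).
// Like the strategy quoted above, it leaves a start tag open and writes
// the closing '>' only when the next event arrives.
// Note: no character escaping is done; this is only a sketch.
class LazyXmlWriter {
public:
    explicit LazyXmlWriter(std::string& out) : out_(&out) {}

    // Switch the output target, e.g. when a context starts or ends.
    // If flushFirst is true, a pending '>' is written to the OLD target.
    void redirect(std::string& out, bool flushFirst) {
        if (flushFirst) closeBracketIfNeeded();
        out_ = &out;
    }

    void startElement(const std::string& name) {
        closeBracketIfNeeded();
        *out_ += "<" + name;          // bracket deliberately left open
        open_.push_back(name);
        bracketOpen_ = true;
    }

    void attribute(const std::string& name, const std::string& value) {
        assert(bracketOpen_);         // only valid right after startElement
        *out_ += " " + name + "=\"" + value + "\"";
    }

    void characters(const std::string& text) {
        closeBracketIfNeeded();
        *out_ += text;
    }

    void endElement() {
        assert(!open_.empty());
        closeBracketIfNeeded();
        *out_ += "</" + open_.back() + ">";
        open_.pop_back();
    }

private:
    void closeBracketIfNeeded() {
        if (bracketOpen_) {
            *out_ += ">";
            bracketOpen_ = false;
        }
    }

    std::string* out_;
    std::vector<std::string> open_;
    bool bracketOpen_ = false;
};

// Serializes <bib><book comment="ordered"><title>...</title></book></bib>
// and returns only what was written while the "book" context was active.
std::string extractBook(bool flushFirst) {
    std::string outer, book;
    LazyXmlWriter w(outer);
    w.startElement("bib");
    w.redirect(book, flushFirst);     // START context for the book
    w.startElement("book");
    w.attribute("comment", "ordered");
    w.startElement("title");
    w.characters("TA Processing");
    w.endElement();                   // </title>
    w.endElement();                   // </book>
    w.redirect(outer, true);          // END context
    w.endElement();                   // </bib>
    return book;
}
```

Calling extractBook(false) yields a fragment that begins with the stray '>' belonging to <bib, while extractBook(true) yields a clean <book> subtree. The design choice illustrated is to flush at the context boundary rather than to change the lazy-closing strategy itself.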