From: Jean-Marc V. <jm...@fr...> - 2004-09-16 09:16:04
Sebastian Bossung wrote:

>Hi all
>
>I was at the EDBT summer school on XML and databases last week,
>

There are abstracts and slides here: http://edbtss04.dia.uniroma3.it/
Maybe you could give us your impressions?

>where
>I heard a lot about XQuery but also a fair amount about XML databases
>in general.
>This lead to an idea on node numbering (inspired by two of the talks)
>that I would like to propose as a basis for discussion here. The key
>feature is that the numbering scheme is able to store documents of
>any structure and still allows updates without any renumbering. I put
>up a web page at:
>
>http://www.sts.tu-harburg.de/~se.bossung/numbering/nodenumbering.html
>
>because I felt that some graphics might come handy.
>
>

Thank you and Timo for these contributions.

As most of you on the list know, the current indexing scheme in eXist
cannot handle deeply nested and/or irregular node structures.
Wolfgang proposes an ad hoc document splitting superposed on the
current numbering scheme. You guys come with the ORDPATH scheme.

I don't know if Timo is willing to contribute his research code to
eXist, but in any case my concern is how to integrate new indexing
schemes into eXist. In a word: modularity.

The article mentioned by Sebastian
(http://www.doc.ic.ac.uk/~pjm/diweb2004/DIWeb2004_Part7.pdf) says that
ORDPATH is in the next SQL Server version. The SQL Server developers
probably had an API that made such changes easy. We don't have that.
The long primitive type for the guid is carried around the software
naked. We have to change that, with proper interfaces and factories.

The org.exist.dom.NodeProxy class is a good starting point for an
abstraction of an index. NodeProxy has two long fields: guid and
internalAddress, the second being an address into DOMFile, a kind of
B+tree file. NodeProxy is referenced 596 times, in packages dom,
storage, xquery, and others.
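To make the "interfaces and factories" idea concrete, here is a minimal
sketch of how the raw long could be hidden behind a type. Everything in
it is an assumption made for illustration: NodeId, PathNodeId and
NodeIds are hypothetical names, not existing eXist classes, and the
path-of-integers id is a simplified Dewey-style stand-in, not the real
ORDPATH encoding (which adds careting for inserts and a compressed bit
layout).

```java
import java.util.Arrays;

/** Hypothetical node identifier abstraction; not part of eXist. */
interface NodeId extends Comparable<NodeId> {
    boolean isAncestorOf(NodeId other);
    boolean isParentOf(NodeId other);

    /** True if this node is on the preceding axis of other. */
    default boolean precedes(NodeId other) {
        return compareTo(other) < 0 && !isAncestorOf(other);
    }

    /** True if this node is on the following axis of other. */
    default boolean follows(NodeId other) {
        return compareTo(other) > 0 && !other.isAncestorOf(this);
    }
}

/** Simplified path-style id: one ordinal per tree level (assumption). */
final class PathNodeId implements NodeId {
    private final int[] path;

    PathNodeId(int... path) {
        this.path = path.clone();
    }

    // Note: ids from different implementations cannot be mixed;
    // the casts below would fail with a ClassCastException.
    @Override
    public boolean isAncestorOf(NodeId other) {
        int[] o = ((PathNodeId) other).path;
        // An ancestor's path is a strict prefix of the descendant's path.
        return o.length > path.length
                && Arrays.equals(path, Arrays.copyOf(o, path.length));
    }

    @Override
    public boolean isParentOf(NodeId other) {
        return isAncestorOf(other)
                && ((PathNodeId) other).path.length == path.length + 1;
    }

    /** Document order: compare level by level; a prefix sorts first. */
    @Override
    public int compareTo(NodeId other) {
        int[] o = ((PathNodeId) other).path;
        int n = Math.min(path.length, o.length);
        for (int i = 0; i < n; i++) {
            if (path[i] != o[i]) {
                return Integer.compare(path[i], o[i]);
            }
        }
        return Integer.compare(path.length, o.length);
    }
}

/** Hypothetical factory, so callers never name the concrete class. */
final class NodeIds {
    private NodeIds() {}

    static NodeId of(int... path) {
        return new PathNodeId(path);
    }
}
```

With something like this, code in dom, storage and xquery would only
ask a NodeId about its relations to another NodeId, and a different
numbering scheme could be swapped in behind the factory.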
The new interface for NodeProxy should represent the way it is used in
DOM and XQuery to determine the fundamental relations:
preceding/following, ancestor/descendant, parent/child nodes.

Or maybe we can compute out of the ORDPATH a long index with the
desired properties? Is this what Timo says here:

> The combination of a hierarchical numbering scheme (with variable id length) and a fixed-length node
> id could be a viable solution.

Note that while looking at the code, I found that the following and
preceding axes are not yet supported (in class LocationStep).

I'm willing to contribute to this refactoring, but I need the input of
the database experts.

-- 
Jean-Marc Vanel
Consulting and services / software development & integration
Free software, Web, Java, XML ...
At the cutting edge of technology, at the service of projects
http://jmvanel.free.fr/ ===) CV, software resources - computer science diary :
http://jmvanel.free.fr/computer-notes.html
Worldwide Botanical Knowledge Base http://wwbota.free.fr/
test XML query engine: http://jmvanel.free.fr/protea.html