From: Wolfgang M. <wol...@gm...> - 2006-04-18 19:29:21
> As for Pedro's disconcerting report that it's faster to retrieve all the
> children by exhaustively specifying them by name rather than using the sort
> of abstract notation that XPath envisages (and which implementations are
> supposed to be optimised for), then I think again that it is a little akin
> to retrieving the parent node via its internal ID. Even if it gets results,
> it is so at odds with how an XQuery/XPath implementation or application is
> supposed to work that if it is indeed necessary to use it, then there's
> probably something seriously amiss with the implementation or the
> application, or both.

It was actually me who told Pedro about the use of util:node-by-id, just
because he asked for a way to address nodes selected by a first query for
later retrieval. There's no other reason for using these functions here
than reducing the network load. He didn't have IDs, and for
experimentation, using the internal ID instead seemed to be ok. I thought
it should be clear that he would have to add proper IDs to the elements
themselves later.

Concerning the speed of *[not(self::c)], it indeed used to be very slow
some time ago. As far as I remember, I fixed this a few months back, and
performance should have been ok. Obviously this type of expression is now
slow again, and I told Pedro to check whether using the explicit element
names would be any faster.

This points to a general problem, and I am as much in despair about it as
some users might be. Actually, I sometimes think I'm the only one who is
concerned with performance issues (apart from Evgeny ;-)

Our test suite has grown substantially over the past months. However, no
one cares to provide tests to guarantee that at least the most often used
types of expressions work fast enough on larger document sets.

Right now, most of us (or at least Pierrick and I) are busy with redesign.
I'm trying to complete the switch of eXist's internal indexing scheme,
which should remove a few very annoying problems; Pierrick is mainly
working to align the XQuery implementation with the specs and to test
eXist against the official XQuery test suite (besides that, we still have
those db corruptions to fix). You will understand that we don't have much
time left to be concerned about performance.

It would thus be great if we could get more help from others. Someone
could even collect tests more systematically. If you have a certain
expression that is too slow on your data set, please try to supply an
autonomous test case (one that can be run by other people without needing
special preparation). It would help a lot if we could collect such test
cases in a central place, so whenever a developer is unsure whether he
introduced a performance regression or a memory leak, he could just run
those queries and get at least a general impression.

Wolfgang
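
A collection of such test cases could be as simple as a list of queries with a time budget each, plus a small runner. The following is only a rough sketch in Python, not anything that exists in eXist: the names QueryCase, run_cases and CASES, the element names a, b and c, and the one-second budgets are all assumptions made for illustration. The evaluate argument stands for whatever function actually sends the query text to the database; it is deliberately left open here.

```python
import time
from dataclasses import dataclass


@dataclass
class QueryCase:
    name: str           # label shown in the report
    query: str          # XPath/XQuery text, passed through unchanged
    max_seconds: float  # hypothetical time budget for this query


def run_cases(cases, evaluate, clock=time.perf_counter):
    """Run every query through `evaluate` and time it.

    Returns a list of (name, elapsed_seconds, within_budget) tuples.
    `clock` is injectable so the runner can be checked without a database.
    """
    results = []
    for case in cases:
        start = clock()
        evaluate(case.query)
        elapsed = clock() - start
        results.append((case.name, elapsed, elapsed <= case.max_seconds))
    return results


# Illustrative cases only; element names and budgets are assumptions.
CASES = [
    QueryCase("negated-self", "/root/*[not(self::c)]", 1.0),
    QueryCase("explicit-names", "/root/(a | b)", 1.0),
]
```

A developer would call run_cases(CASES, evaluate) with their own evaluate function before and after a change and compare the elapsed times; the pass/fail flag gives only a general impression, since absolute timings depend on the machine and the data set.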