From: Wolfgang M. <me...@if...> - 2003-10-04 09:23:45
I'm happy to announce a new snapshot release with many important changes:

1) Memory consumption during query processing

While experimenting with a large collection of Asian-language TEI docs, I found that memory consumption is much too high for some types of queries, especially queries on the fulltext of the document. I measured up to 60 MB for a query on some frequent single-char tokens. As a result, major parts of the query engine have been modified to reduce memory consumption during query processing.

Until now, eXist loaded the entire list of text nodes matching a given text token into memory, then checked the nodes against the query context and applied an intersection or union on the resulting sets. All these operations have now been merged into a single step: the list of matching text nodes is no longer kept in memory. Instead, nodes are matched directly against the context node set while scanning the index, and only the relevant nodes are returned. As a result, killer queries like match-all(., 'a.*', 'b.*', 'c.*') will still take some time, but they can't kill the database.

In addition, I replaced the node set implementations with better variants that use less memory. Using simple arrays for node sets is faster than the alternatives, e.g. trees. However, array length is fixed, so arrays are frequently reallocated while the set grows. The new implementation tries to better estimate the expected size of the node set. This estimate is correct in many cases, so only minimal reallocations are required during query processing. The new class org.exist.dom.ExtArrayNodeSet uses a combination of an AVL tree (for document ids) and arrays (for the nodes). I have also spent some time optimizing the various algorithms using profiling information.

Please note that the match highlighting feature still consumes a large amount of memory if the number of text hits is very large. You should disable this feature for Asian-language texts.
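To make the single-step matching described in section 1 concrete, here is a minimal Java sketch. It is not eXist's code: the class and method names are hypothetical, node ids are plain longs, and a simple set-membership test stands in for eXist's real check of a node against the query context. It only contrasts "collect everything, then intersect" with "filter while scanning the index".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; names and types are assumptions, not eXist classes.
public class FilteredIndexScan {

    // Old approach: load every index entry for the token into memory,
    // then intersect the full list with the context set.
    static List<Long> materializeThenFilter(Iterator<Long> indexEntries, Set<Long> context) {
        List<Long> all = new ArrayList<>();
        while (indexEntries.hasNext()) {
            all.add(indexEntries.next());
        }
        all.retainAll(context);
        return all;
    }

    // New approach: test each entry against the context while scanning,
    // so only relevant nodes are ever stored.
    static List<Long> filterWhileScanning(Iterator<Long> indexEntries, Set<Long> context) {
        List<Long> hits = new ArrayList<>();
        while (indexEntries.hasNext()) {
            long id = indexEntries.next();
            if (context.contains(id)) {
                hits.add(id);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Long> index = Arrays.asList(1L, 3L, 5L, 7L, 9L);
        Set<Long> context = new HashSet<>(Arrays.asList(3L, 7L, 8L));
        System.out.println(materializeThenFilter(index.iterator(), context)); // [3, 7]
        System.out.println(filterWhileScanning(index.iterator(), context));   // [3, 7]
    }
}
```

Both methods return the same result; the difference is that the second never holds more than the relevant hits, which is why a very frequent token no longer translates into a very large intermediate list.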
2) XPath query processing

XPath query processing has been changed to reflect the query processing model of XPath 2.0 (and XQuery), i.e. every expression returns a sequence of items, where an item is either a node or an atomic value, and a single item is itself a sequence of length one. I got most of the ideas from Michael Kay's Saxon. The migration to the XPath 2 model has just begun, so the code should be regarded as unstable, though most of your old queries should work.

Best regards,
Wolfgang
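The sequence model from section 2 can be sketched in a few lines of Java. This is an illustration of the idea only: the type names below are hypothetical and are not taken from eXist or Saxon. The point it shows is that code written against a sequence also accepts a single node or atomic value, because an item is a sequence of length one.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical types for illustration; not eXist's or Saxon's actual classes.
public class SequenceModel {

    // Every expression result is an ordered sequence of items.
    interface Sequence {
        int length();
        Item itemAt(int pos);
    }

    // An item is a node or an atomic value, and is itself a one-item sequence.
    interface Item extends Sequence {
        default int length() {
            return 1;
        }
        default Item itemAt(int pos) {
            if (pos != 0) {
                throw new IndexOutOfBoundsException("pos " + pos);
            }
            return this;
        }
    }

    static final class AtomicValue implements Item {
        final Object value;
        AtomicValue(Object value) { this.value = value; }
    }

    static final class NodeItem implements Item {
        final String name;
        NodeItem(String name) { this.name = name; }
    }

    // A general sequence holding any number of items.
    static final class ValueSequence implements Sequence {
        private final List<Item> items;
        ValueSequence(Item... items) { this.items = Arrays.asList(items); }
        public int length() { return items.size(); }
        public Item itemAt(int pos) { return items.get(pos); }
    }

    // Works uniformly on single items and longer sequences.
    static int count(Sequence s) {
        return s.length();
    }

    public static void main(String[] args) {
        Item title = new NodeItem("title");
        Sequence mixed = new ValueSequence(title, new AtomicValue(42));
        System.out.println(count(title)); // 1
        System.out.println(count(mixed)); // 2
    }
}
```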
From: Wolfgang M. <me...@if...> - 2004-01-23 14:24:52
I still cannot commit to CVS. Anyway, I posted a copy of my local version to SourceForge. It should be available here:

http://prdownloads.sourceforge.net/exist/eXist-snapshot-20040123.zip?download

Changes include:

* Redesigned HTTP/REST-style interface: REST-style access is now also available through a servlet, called EXistServlet, which listens on http://localhost:8080/exist/servlet/ by default. Both the stand-alone HTTP server and the EXistServlet are based on the new class RESTServer.

* Fixed ArrayIndexOutOfBounds exception in btree.

* Updated docs.

* Fixed several bugs in the client GUI that have been reported by different users.

* Uploading binary resources is now possible from local and remote clients.

* Fixed bug in serializer: it inserted a newline before the XML declaration (the <?xml ...?> line). This led to parsing errors on the client side.

* Growth of index files: removed btree pages are now better reused.

* Moved LOG4J configuration from conf.xml to log4j.xml: log4j.xml will be loaded either from exist.home, from the location specified by -Dlog4j.configuration, or through the class loader mechanism.

* Changed usage of String methods to run eXist with kaffe.

Regards,
Wolfgang
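The serializer fix in the list above matters because an XML parser rejects a document whose <?xml ...?> declaration is not at the very start of the input. The following Java sketch reproduces the effect with the JDK's built-in JAXP parser; it does not use any eXist code, and the class name is made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical demo class; it only exercises the standard JAXP parser.
public class XmlDeclCheck {

    // Returns true if the text parses as XML, false if the parser rejects it.
    static boolean parses(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String decl = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";
        System.out.println(parses(decl + "<a/>"));        // true
        System.out.println(parses("\n" + decl + "<a/>")); // false
    }
}
```

The JDK parser may also print its own error message to standard error for the second call; the return value is what the example relies on.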
From: Christofer D. <du...@c-...> - 2004-01-23 19:12:43
Your fix for my "missing elements problem" works fine. Now everything is working absolutely fine.

Thanx Wolfgang

Chris