From: Wolfgang M. <me...@if...> - 2002-07-25 15:46:17
Thanks to Mathias' bug report, I have found a severe bug which leads to uncontrolled memory consumption when parsing larger documents. eXist uses several page buffers to cache data- and btree-pages. However, in some cases, pages which were removed from the cache have not been properly garbage collected (there were still valid references to the object). This also applies to version 0.7.1.

The current CVS version should fix the problem. I have made several tests and it basically seems to work now. Using the patched code, storing a 12MB file to the server via XMLRPC took between 86 and 130 seconds (JVM memory settings: 32MB min./128MB max.). I have now also been able to index the same file with memory restricted to only 32MB max. Indexing a 32MB file took about 300 seconds. However, sending this amount of data via XMLRPC made the client crash, so I had to index it locally.

> Does eXist build a complete in-memory parse-tree of the document during
> insertion or is it processed sequentially?

eXist processes documents sequentially using SAX. However, two SAX runs are needed: during the first run, eXist determines the structure of the resulting node tree. In the second run, it stores the actual nodes. Element and fulltext indexes are cached. They will be flushed to disk whenever the JVM runs low on memory (free memory < 5MB).

Best regards,
Wolfgang
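P.S.: For anyone curious, the reference-leak pattern described above can be sketched in a few lines of Java. The class and method names below are hypothetical illustrations, not eXist's actual code: the point is only that removing a page from the cache map is not enough if another field still points at it, so eviction must clear every reference. The 5MB figure in `lowMemory()` is the flush threshold quoted above.

```java
import java.util.HashMap;
import java.util.Map;

class PageBuffer {
    static class Page {
        final long id;
        final byte[] data = new byte[4096]; // simulated page payload
        Page(long id) { this.id = id; }
    }

    private final Map<Long, Page> cache = new HashMap<>();
    private Page lastAccessed; // a second reference that must also be cleared

    void put(Page p) {
        cache.put(p.id, p);
        lastAccessed = p;
    }

    // Leaky variant: removes the page from the map, but lastAccessed
    // still points at it, so the garbage collector cannot reclaim it.
    Page evictLeaky(long id) {
        return cache.remove(id);
    }

    // Fixed variant: drop every reference so the page becomes garbage.
    Page evict(long id) {
        Page p = cache.remove(id);
        if (p != null && lastAccessed == p) {
            lastAccessed = null;
        }
        return p;
    }

    boolean stillReferenced(long id) {
        return cache.containsKey(id)
            || (lastAccessed != null && lastAccessed.id == id);
    }

    // Flush heuristic in the spirit of the post: write cached indexes
    // to disk once the JVM reports less than 5MB of free memory.
    static boolean lowMemory() {
        return Runtime.getRuntime().freeMemory() < 5 * 1024 * 1024;
    }
}
```

After the leaky eviction the page is still reachable and survives every GC cycle; after the fixed eviction it can be collected, which is essentially what the CVS patch restores.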