From: Pierrick B. <pie...@cu...> - 2006-11-08 10:34:07
Hi,

Grieder Bruno wrote:

> 1- using an "update replace" statement to do a direct update of the
> nodes. The query times vary widely but take 5 seconds on average to
> complete.
> The typical entries in exist.log are
>
> 2006-11-08 07:57:16,134 [SocketListener0-7] INFO (RpcConnection.java [doQuery]:256) - query took 9832ms.
> 2006-11-08 07:57:16,135 [SocketListener0-7] DEBUG (RpcConnection.java [queryP]:1484) - found 0
> 2006-11-08 07:57:16,172 [SocketListener0-7] DEBUG (XQuery.java [compile]:156) - Compilation took 27
> 2006-11-08 07:57:16,174 [SocketListener0-7] DEBUG (NativeBroker.java [getXMLResource]:1523) - document '/db/DEMO' not found!
> 2006-11-08 07:57:16,174 [SocketListener0-7] DEBUG (XQueryContext.java [getStaticallyKnownDocuments]:639) - reading collection /db/DEMO
> 2006-11-08 07:57:17,390 [SocketListener0-7] DEBUG (DOMFile.java [removeNode]:1534) - removing page 8636
> 2006-11-08 07:57:19,907 [SocketListener0-7] DEBUG (DOMFile.java [insertAfter]:503) - creating new page: 8636
> 2006-11-08 07:57:19,912 [SocketListener0-7] DEBUG (HTTPUtils.java [addLastModifiedHeader]:61) - mostRecentDocumentTime: 0
> 2006-11-08 07:57:19,921 [SocketListener0-7] DEBUG (HTTPUtils.java [addLastModifiedHeader]:61) - mostRecentDocumentTime: 0
> 2006-11-08 07:57:19,922 [SocketListener0-7] INFO (RpcConnection.java [doQuery]:256) - query took 3749ms.

That doesn't say much :-)

> 2- reading the full document, updating the document nodes in memory,
> storing the full document again. Storage times also vary widely but
> are above 20 seconds on average.
> The typical entries in exist.log are

And this one as well.

> In both cases, the performance hit seems to be on the removal of the
> previous data.

Yes. See below...
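For context, the kind of "update replace" statement discussed above uses eXist's proprietary XQuery update extension (the syntax available in eXist 1.1). This is only an illustrative sketch; the document path and element names are made up, not the poster's:

```xquery
(: eXist 1.1 XQuery update extension - a hypothetical example.
   "/db/DEMO/orders.xml", "order" and "status" are illustrative names. :)
let $status := doc("/db/DEMO/orders.xml")//order[@id = "42"]/status
return
    (: replaces the existing node in place; eXist removes the old node
       from its storage pages and indexes, which is where the reported
       cost appears :)
    update replace $status with <status>shipped</status>
```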
> Scanning the mails, we understand that this is mostly due to the
> cleaning up of indexes; however, we have deactivated most of the
> indexes using the following configuration:

There is still one index that you cannot deactivate: the structural
index, which probably plays a role in your problem. See below.

> It is clear that the bottleneck is linked to significant disk I/O; the
> CPUs are vastly under-utilised.

Yes. See below.

> We have run our tests on 1.1 and 1.1.1 in standalone and as a webapp
> on our application server.

That's the point: the new indexing scheme (1.1) is less efficient than
the old one (1.0) on... delete updates.

> Results did not vary much from one configuration to another.

Of course: your performance problem is closely related to eXist's core,
not to the way you access the DB.

> Would you have any suggestion on the next thing we could try?

Well, we should implement an in-memory process that would defer data
flushing, but you can imagine that it's not an easy task.

Adam Retter has also developed a yet undocumented (as far as I know)
feature for batch transactions. I don't know if this can help you.

Cheers,

-- 
Pierrick Brihaye, informaticien
Service régional de l'Inventaire / DRAC Bretagne
mailto:pie...@cu... / tel: +33 (0)2 99 29 67 78
Have you read http://usenet-fr.news.eu.org/fr-chartes/rfc1855.html ?
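The index deactivation discussed in this exchange is normally done per collection through a collection.xconf document stored under /db/system/config. The poster's actual configuration was not preserved in the archive, so the fragment below is only a minimal sketch of what such a file looks like in eXist 1.1 (disabling the default full-text index, which is the main one a user can switch off; the structural index, as noted above, cannot be deactivated):

```xml
<!-- Sketch of a collection.xconf for eXist 1.1; values are illustrative. -->
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index>
        <!-- disable full-text indexing by default for this collection -->
        <fulltext default="none" attributes="false" alphanum="false"/>
    </index>
</collection>
```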