From: Pierrick B. <pie...@cu...> - 2006-11-08 10:34:07
Hi,

Grieder Bruno wrote:

> 1- using an "update replace" statement to do a direct update of the
> nodes. The query times vary widely but take 5 seconds on average to
> complete.
> The typical entries in exist.log are
>
> 2006-11-08 07:57:16,134 [SocketListener0-7] INFO (RpcConnection.java [doQuery]:256) - query took 9832ms.
> 2006-11-08 07:57:16,135 [SocketListener0-7] DEBUG (RpcConnection.java [queryP]:1484) - found 0
> 2006-11-08 07:57:16,172 [SocketListener0-7] DEBUG (XQuery.java [compile]:156) - Compilation took 27
> 2006-11-08 07:57:16,174 [SocketListener0-7] DEBUG (NativeBroker.java [getXMLResource]:1523) - document '/db/DEMO' not found!
> 2006-11-08 07:57:16,174 [SocketListener0-7] DEBUG (XQueryContext.java [getStaticallyKnownDocuments]:639) - reading collection /db/DEMO
> 2006-11-08 07:57:17,390 [SocketListener0-7] DEBUG (DOMFile.java [removeNode]:1534) - removing page 8636
> 2006-11-08 07:57:19,907 [SocketListener0-7] DEBUG (DOMFile.java [insertAfter]:503) - creating new page: 8636
> 2006-11-08 07:57:19,912 [SocketListener0-7] DEBUG (HTTPUtils.java [addLastModifiedHeader]:61) - mostRecentDocumentTime: 0
> 2006-11-08 07:57:19,921 [SocketListener0-7] DEBUG (HTTPUtils.java [addLastModifiedHeader]:61) - mostRecentDocumentTime: 0
> 2006-11-08 07:57:19,922 [SocketListener0-7] INFO (RpcConnection.java [doQuery]:256) - query took 3749ms.

That doesn't say much :-)

> 2- reading the full document, updating the document nodes in memory,
> storing the full document again. Storage times also vary widely but
> are above 20 seconds on average.
> The typical entries in exist.log are

And this one as well.

> In both cases, the performance hit seems to be on the removal of the
> previous data.

Yes. See below...
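For context, the kind of "update replace" statement discussed above uses eXist's proprietary XQuery update extension (the syntax available in eXist 1.1). This is only an illustrative sketch; the document path and element names are made up, not the poster's:

```xquery
(: eXist 1.1 XQuery update extension - a hypothetical example.
   "/db/DEMO/orders.xml", "order" and "status" are illustrative names. :)
let $status := doc("/db/DEMO/orders.xml")//order[@id = "42"]/status
return
    (: replaces the existing node in place; eXist removes the old node
       from its storage pages and indexes, which is where the reported
       cost appears :)
    update replace $status with <status>shipped</status>
```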
> Scanning the mails, we understand that this is mostly due to the
> cleaning up of indexes; however, we have deactivated most of the
> indexes using the following configuration:

There is still one index that you cannot deactivate: the structural
index, which probably plays a role in your problem. See below.

> It is clear that the bottleneck is linked to significant disk I/O; the
> CPUs are vastly under-utilised.

Yes. See below.

> We have run our tests on 1.1 and 1.1.1 in standalone and as a webapp
> on our application server.

That's the point: the new indexing scheme (1.1) is less efficient than
the old one (1.0) on... delete updates.

> Results did not vary much from one configuration to another.

Of course: your performance problem is closely related to eXist's core,
not to the way you access the DB.

> Would you have any suggestion on the next thing we could try?

Well, we should implement an in-memory process that would defer data
flushing, but you can imagine that it's not an easy task.

Adam Retter has also developed a yet undocumented (as far as I know)
feature for batch transactions. I don't know if this can help you.

Cheers,

-- 
Pierrick Brihaye, informaticien
Service régional de l'Inventaire / DRAC Bretagne
mailto:pie...@cu... / tel: +33 (0)2 99 29 67 78
Have you read http://usenet-fr.news.eu.org/fr-chartes/rfc1855.html ?
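The index deactivation discussed in this exchange is normally done per collection through a collection.xconf document stored under /db/system/config. The poster's actual configuration was not preserved in the archive, so the fragment below is only a minimal sketch of what such a file looks like in eXist 1.1 (disabling the default full-text index, which is the main one a user can switch off; the structural index, as noted above, cannot be deactivated):

```xml
<!-- Sketch of a collection.xconf for eXist 1.1; values are illustrative. -->
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index>
        <!-- disable full-text indexing by default for this collection -->
        <fulltext default="none" attributes="false" alphanum="false"/>
    </index>
</collection>
```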