From: Lizzi, V. <Vin...@ta...> - 2023-08-10 15:41:17
Hi JPR,
That sounds like a good idea. Unfortunately, trying to use JSON for this amount of data also results in out-of-memory errors:
array:join(
    (1 to 200000000) ! ['1']
) => serialize(map{'method': 'json'})
I have gotten eXist to process the actual data in tab-separated text format when eXist is freshly started and has plenty of memory available, but even under these ideal conditions memory usage is still unexpectedly high, and the query sometimes fails with out-of-memory errors.
Thanks,
Vincent
_____________________________________________
Vincent M. Lizzi
Head of Information Standards | Taylor & Francis Group
vin...@ta...
Information Classification: General
From: Jean-Paul Rehr <re...@gm...>
Sent: Thursday, August 10, 2023 10:06 AM
To: Exist-open <exi...@li...>
Subject: Re: [Exist-open] high memory usage when constructing a string
Dear Vincent,
Have you tried outputting JSON-structured data from the database? It will be easier to store and access later using XQuery via arrays and maps. You could even chunk it if eXist chokes on a large file - I've never handled one of that size.
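The chunking idea could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name local:json-chunks and the sample rows are hypothetical, and the sketch does not address the memory used while the rows are first retrieved from the database.

```xquery
xquery version "3.1";

(: Hypothetical helper: split a sequence of strings into chunks of at most
   $size items and serialize each chunk as its own JSON array. :)
declare function local:json-chunks(
    $rows as xs:string*,
    $size as xs:integer
) as xs:string* {
    for $i in 0 to (count($rows) - 1) idiv $size
    let $chunk := subsequence($rows, $i * $size + 1, $size)
    (: Skip the empty chunk produced when $rows is empty. :)
    where exists($chunk)
    return serialize(array { $chunk }, map { 'method': 'json' })
};

(: Sample data standing in for the real rows; yields three JSON arrays. :)
local:json-chunks(('a', 'b', 'c', 'd', 'e'), 2)
```

Each returned string could then be stored as a separate resource, so that no single serialized value has to hold the whole data set at once.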
Cheers,
JPR
On Thu, Aug 10, 2023 at 3:24 PM Lizzi, Vincent <Vin...@ta...> wrote:
Hello eXist folks,
I’m observing that asking eXist to construct a string of about 200 MB uses more than 4 GB of memory.
When running a query such as this that constructs a string of 200,000,000 bytes:
string-join(( (1 to 200000000) ! '1' ))
Monex shows the memory usage climbing to over 4 GB.
This sometimes exhausts the available memory and results in a stack trace being printed to exist.log, the top of which shows:
2023-08-09 23:12:40,553 [qtp353417634-34] ERROR (XQueryServlet.java [process]:549) - Java heap space
java.lang.OutOfMemoryError: Java heap space
at org.exist.dom.memtree.DocumentImpl.addChars(DocumentImpl.java:273) ~[exist-core-6.2.0.jar:6.2.0]
at org.exist.dom.memtree.MemTreeBuilder.characters(MemTreeBuilder.java:382) ~[exist-core-6.2.0.jar:6.2.0]
I would guess that constructing a document uses a lot more memory than constructing a simple string. Is there a way to signal eXist to use a simple string builder instead?
The example above is a simplification of what I actually want to do, which is to retrieve data from a SQL database and store the results in eXist for later processing. The data contains nearly 5 million rows with only 2 columns, so tab-separated text would seem to be the most efficient way to store this data in eXist. Here is the relevant portion of the query:
let $data as xs:string :=
    string-join(
        sql:execute($cn, $sh, (), true())/sql:row/string-join(*/text(), '&#9;'),
        '&#10;'
    )
return xmldb:store-as-binary('/db/temp', 'list.txt', $data)
Is there a more efficient way to do this in eXist?
Thanks,
Vincent
______________________________________________
Vincent M. Lizzi
Head of Information Standards | Taylor & Francis Group
530 Walnut St., Suite 850, Philadelphia, PA 19106
E-Mail: vin...@ta...
Web: www.tandfonline.com
Taylor & Francis is a trading name of Informa UK Limited,
registered in England under no. 1072954
"Everything should be made as simple as possible, but not simpler."
Information Classification: General
_______________________________________________
Exist-open mailing list
Exi...@li...
https://lists.sourceforge.net/lists/listinfo/exist-open