Hi all,
I'm using Blazegraph (BZ) for a project (http://opencitations.net) that aims to store RDF citation data from PubMed Central articles and make them freely available (associated license: CC0). I chose BZ because it lets me handle a large number of triples on a single machine.
However, I've run into a problem when uploading additional triples through the SPARQL endpoint (via the REST SPARQL UPDATE interface) once the triplestore has already been populated with about 4GB of data (according to the related jnl file), about 12 million triples. While the system keeps running and continues to answer SELECT queries correctly, any additional UPDATE query does not seem to be handled correctly. No error is returned, but BZ freezes: it no longer runs any UPDATE and returns nothing, as if it were blocked "forever" on such UPDATE queries. This seems to happen when the RAM of my machine is fully used.
I'm running BZ on a virtual machine (Debian, earliest stable version) with 7GB of RAM and plenty of disk space. The current BZ configuration (file occ.properties) I'm using is the following:
The command I'm using for running BZ is:
java -server -Xmx4g -Dbigdata.propertyFile=occ.properties -Djetty.port=3000 -Djetty.host=127.0.0.1 -jar blazegraph.jar
As far as I understand, this should not be related to the amount of RAM allocated (according to https://wiki.blazegraph.com/wiki/index.php/Hardware_Configuration, 4GB is fine for now). However, I really don't understand why this is happening, so any hint or help is much appreciated.
Thanks again and have a nice day :-)
S.
Silvio,
Thank you. Can you please share the version you are running? If it is
not the 2.1.2 release, can you please try it against the 2.1.2 release?
Also, can you say whether you are using the REST SPARQL Update via
streaming or by posting the contents in the body of the request? The Bulk
Data Load servlet may also be an option:
https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load
Thanks, --Brad
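To make the question about how the update is submitted concrete, here is a minimal Python sketch of the two POST styles defined by the SPARQL 1.1 Protocol: a URL-encoded `update` form field, and the update text sent directly as the request body. It may not map exactly onto what "streaming" means above. The host and port are taken from the java command in the original post; the `/blazegraph/sparql` path, the function names and the timeout value are assumptions for illustration only.

```python
import urllib.parse
import urllib.request

# Host and port come from the java command in the thread; the
# "/blazegraph/sparql" path is an assumption and may differ per install.
ENDPOINT = "http://127.0.0.1:3000/blazegraph/sparql"


def form_update_request(update, endpoint=ENDPOINT):
    """Build a POST that sends the update as a URL-encoded 'update' form field."""
    body = urllib.parse.urlencode({"update": update}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def direct_update_request(update, endpoint=ENDPOINT):
    """Build a POST that sends the update text itself as the request body."""
    return urllib.request.Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


def send(request, timeout_seconds=300.0):
    """Send the request; the timeout (an arbitrary value) makes a stalled
    update raise an exception instead of blocking the client forever."""
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return response.status
```

Calling `send(direct_update_request("INSERT DATA { ... }"))` against a running server would then either return the HTTP status or raise once the timeout expires, which at least turns a silent freeze into a visible client-side error.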
On Sun, Jul 3, 2016 at 2:44 AM, Silvio Peroni essepuntato@users.sf.net
wrote:
Updates need to buffer the intermediate results starting with 2.1.2. We are
working to put those intermediate results on the native heap to remove any
GC burden.
How big is that update? What does the status tab indicate?
Bryan
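Since the size of a single update matters when intermediate results are buffered, one possible client-side mitigation is to split a large load into several smaller INSERT DATA requests. The sketch below is only an illustration under assumptions: the function names and the batch size of 10000 are hypothetical, and nothing in this thread confirms that batching resolves the freeze described above.

```python
from typing import Iterable, Iterator, List


def _insert_data(batch):
    """Wrap a list of triples in a single INSERT DATA update."""
    return "INSERT DATA {\n" + " .\n".join(batch) + " .\n}"


def batched_insert_data(triples: Iterable[str], batch_size: int = 10000) -> Iterator[str]:
    """Yield INSERT DATA updates, each holding at most batch_size triples.

    Each triple is expected as an N-Triples statement without the trailing
    dot, e.g. '<http://example.org/s> <http://example.org/p> <http://example.org/o>'.
    The default batch size is an arbitrary, untuned value.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for triple in triples:
        batch.append(triple)
        if len(batch) == batch_size:
            yield _insert_data(batch)
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield _insert_data(batch)
```

Each yielded string can be sent as its own UPDATE request, so a failure or stall affects one batch rather than the whole load.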
On Jul 3, 2016 6:24 AM, "Brad Bebee" beebs@users.sf.net wrote:
Hi all,
Thanks for your answers. Just to let you know, I've run a few days of tests with the latest 2.1.2 version, and now everything seems to be working correctly.
Thanks again and have a nice day :-)
S.