Re: [Bigdata-developers] Problem while loading yago2s in Bigdata


You MUST go to the "NAMESPACES" tab, choose "Create namespace", specify a
new namespace name, and choose the RDR mode from the list of modes (it
defaults to triples; the choices are triples, rdr, quads).
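
For reference, the mode chosen in the workbench corresponds to a small set of
namespace properties. The sketch below only builds such a property set as
text; it does not talk to the server. The property names are the ones I
recall from the Bigdata configuration options and should be checked against
your release, and the namespace name "yago_rdr" is a made-up example.

```python
def rdr_namespace_properties(namespace: str) -> str:
    """Build a Java-properties style text for a triples + RDR namespace.

    Assumption: these keys match the Bigdata release in use. RDR is the
    "statement identifiers" mode, and it is combined with triples, not quads.
    """
    props = {
        "com.bigdata.rdf.sail.namespace": namespace,
        "com.bigdata.rdf.store.AbstractTripleStore.quads": "false",
        "com.bigdata.rdf.store.AbstractTripleStore.statementIdentifiers": "true",
    }
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


# "yago_rdr" is a hypothetical namespace name used only for illustration.
print(rdr_namespace_properties("yago_rdr"))
```

Creating the namespace through the workbench remains the simplest route; the
properties are shown only to make clear what "RDR mode" means on the
database side.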

For querying, I would advise the SPARQL* syntax.
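
To illustrate the difference between the two query styles, here is a minimal
sketch that builds the same question once in SPARQL* and once with the
standard reification vocabulary. The metadata predicate
http://example.org/source is a hypothetical example, not part of any Bigdata
vocabulary, and the functions only produce query text.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def sparql_star_query(meta_predicate: str) -> str:
    """SPARQL* form: the statement itself appears as << s p o >>."""
    return (
        "SELECT ?s ?p ?o ?meta WHERE {\n"
        f"  <<?s ?p ?o>> <{meta_predicate}> ?meta .\n"
        "}"
    )


def reification_query(meta_predicate: str) -> str:
    """Plain SPARQL form using rdf:subject / rdf:predicate / rdf:object."""
    return (
        f"PREFIX rdf: <{RDF_NS}>\n"
        "SELECT ?s ?p ?o ?meta WHERE {\n"
        "  ?stmt rdf:subject ?s ;\n"
        "        rdf:predicate ?p ;\n"
        "        rdf:object ?o ;\n"
        f"        <{meta_predicate}> ?meta .\n"
        "}"
    )


# Hypothetical predicate, used only to show the shape of both queries.
print(sparql_star_query("http://example.org/source"))
print(reification_query("http://example.org/source"))
```

The SPARQL* form is shorter and states the intent directly, which is the
main reason to prefer it.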

Please try this on a small sample first to be sure that you have everything
set up correctly.

Thanks,
Bryan

On Thu, Oct 16, 2014 at 6:39 PM, Maria Jackson <mar...@gm...>
wrote:

> Dear Bryan,
>
> Just to confirm...
> The RDR option in the update window is disabled when we load a large file
> (i.e. specify its path). Is it OK to only specify the namespace as RDR when
> loading reified triples?
>
> Also can we query reified triples in Bigdata using SPARQL* or will the
> plain old SPARQL with reification work?
>
>
> Cheers,
> Maria
> On Fri, Oct 17, 2014 at 4:04 AM, Bryan Thompson <br...@sy...> wrote:
>
>> The database instance needs to be in the mode that supports RDR.  This is
>> called the "statement identifiers" mode internally for historical reasons.
>> Today the feature is implemented using inlining.  However, the details of
>> the implementation mechanism should be invisible.  That is the whole point
>> of the RDR model.
>>
>> The RDR mode will support efficient reified triples, inference (if
>> enabled), and triples.  We will be introducing RDR support for quads but
>> it is not yet in the released platform.  The workbench might label the
>> option differently.  Choose to create a new namespace.  Then choose the
>> mode that supports triples with statement level provenance / RDR /
>> statement identifiers.  The default mode for the NSS in the WAR deployment
>> is, I believe, quads.
>>
>> There are two new papers that might be of interest to you linked from
>> http://blog.bigdata.com.  They are on RDF*/SPARQL* (a formal treatment
>> of RDR) and on a reconciliation of RDF and property graphs based on the RDR
>> model.  Many thanks to Olaf Hartig for his efforts on these papers.
>>
>> Thanks,
>> Bryan
>>
>>
>> On Thursday, October 16, 2014, Maria Jackson <mar...@gm...>
>> wrote:
>>
>>> Dear Bryan,
>>>
>>> I need to load reified triples. The RDR option in the update window is
>>> disabled when we specify the path. By this I mean: how do we tell BigData
>>> that we are entering reified triples during an update? I am entering reified
>>> triples in BigData according to the W3C standard (
>>> http://www.w3.org/TR/2004/REC-rdf-primer-20040210/#reification).  Also,
>>> do we need to specify the present namespace to be RDR instead of
>>> triples?
>>>
>>> Another question that I have is: how can we query reified triples using
>>> SPARQL? Should I use the query reification vocabulary defined here
>>> http://www.w3.org/2001/sw/DataAccess/rq23/#queryReification or do you
>>> have your own vocabulary for querying RDR data (containing reified triples)?
>>>
>>>
>>> Cheers,
>>> Maria.
>>>
>>> On Thu, Oct 16, 2014 at 3:55 PM, Bryan Thompson <br...@sy...>
>>> wrote:
>>>
>>>> There is a distinction between the workbench (JavaScript in the
>>>> browser) and the database process (Java running inside of a servlet
>>>> container in this case).  In anomalous conditions the workbench might not
>>>> correctly track what is happening on the database side.  I suggest that you
>>>> check the database log output and see what messages were generated during
>>>> that time.  I suspect that you might have something like a "GC overhead
>>>> limit exceeded", which is a type of out of memory error for Java where
>>>> too much of the total time is spent in garbage collection.  Or perhaps some
>>>> other root cause that abnormally terminated the update request in a manner
>>>> that the workbench was unable to identify.
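
The log check suggested above can be sketched as a small scan for the
JVM's out-of-memory messages. This is only an illustration: the sample log
lines are invented, and the location and layout of the real log depend on how
the servlet container is configured.

```python
import re
from typing import Iterable, List, Tuple

# Messages the JVM emits when it runs out of heap or spends too long in GC.
OOM_PATTERN = re.compile(
    r"java\.lang\.OutOfMemoryError|GC overhead limit exceeded"
)


def find_memory_errors(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, text) for every line that reports a memory error."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if OOM_PATTERN.search(line)
    ]


# Invented sample lines; a real run would iterate over the open log file.
sample_log = [
    "INFO : starting update request\n",
    "ERROR: java.lang.OutOfMemoryError: GC overhead limit exceeded\n",
    "INFO : request terminated\n",
]

for number, text in find_memory_errors(sample_log):
    print(f"line {number}: {text}")
```

To scan a real log, pass an open file object instead of the sample list, for
example find_memory_errors(open(path_to_log)), where path_to_log is whatever
file your deployment writes to.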
>>>>
>>>> If the update failed, then the database will not contain any triples.
>>>> If you are trying to load a very large dataset it may make sense to upload
>>>> the data in a series of smaller chunks.
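
One way to produce such chunks is sketched below. It assumes the data is in a
line-oriented serialization such as N-Triples, where every line is a complete
statement; a Turtle file with prefixes or multi-line statements cannot be
split this way. The chunk size and the output file naming are arbitrary
choices for the example.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunk_lines(lines: Iterable[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most chunk_size lines."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def split_file(path: str, chunk_size: int, prefix: str) -> List[str]:
    """Write each chunk of the input file to '<prefix>-NNNNN.nt'."""
    written = []
    with open(path, encoding="utf-8") as source:
        for index, chunk in enumerate(chunk_lines(source, chunk_size)):
            out_path = f"{prefix}-{index:05d}.nt"
            with open(out_path, "w", encoding="utf-8") as target:
                target.writelines(chunk)
            written.append(out_path)
    return written


# Small in-memory demonstration of the chunking itself.
for chunk in chunk_lines(["<a> <b> <c> .\n", "<d> <e> <f> .\n", "<g> <h> <i> .\n"], 2):
    print(len(chunk), "statements")
```

Each resulting file can then be loaded as its own update request, so that a
failure affects only one chunk and is easier to locate.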
>>>>
>>>> There is a "monitor" option that will show you the status of the update
>>>> requests as they are being processed.  When loading large files it will
>>>> echo back on the HTTP connection a summary of the number of statements
>>>> loaded over time during the load.  This will provide you with better
>>>> feedback.
>>>>
>>>> But I think that you have an error condition on the server that has
>>>> halted the load.
>>>>
>>>> Thanks,
>>>> Bryan
>>>>
>>>>
>>>> On Wednesday, October 15, 2014, Maria Jackson <
>>>> mar...@gm...> wrote:
>>>>
>>>>> Dear All,
>>>>>
>>>>> I am trying to load yago2s 18.5GB (
>>>>> http://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/downloads/)
>>>>> in Bigdata. I downloaded bigdata from http://www.bigdata.com/download and
>>>>> I am using Bigdata workbench via http://localhost:9999.
>>>>>
>>>>> I am loading yago2s into BigData's default namespace "kb", using the
>>>>> update window and specifying the file path there. While Bigdata is
>>>>> loading yago2s I notice that it consumes a significant amount of CPU and
>>>>> RAM for 4-5 hours, but after that it stops using RAM. My dilemma is that
>>>>> the BigData workbench keeps showing "Running update.." although
>>>>> BigData does not consume any RAM or CPU for the next 48 hours or so (in
>>>>> fact it keeps showing "Running update.." until I kill the process). Can
>>>>> you please suggest where I am going wrong? After killing the process,
>>>>> BigData is not able to retrieve any tuples (it shows 0 results even for
>>>>> the query select ?a ?b ?c where { ?a ?b ?c }).
>>>>>
>>>>>
>>>>> Also, I am using BigData on a server with 16 cores and 64 GB RAM.
>>>>>
>>>>> Any help in this regard will be deeply appreciated.
>>>>>
>>>>> Cheers,
>>>>> Maria
>>>>>
>>>>
>>>>
>>>> --
>>>> ----
>>>> Bryan Thompson
>>>> Chief Scientist & Founder
>>>> SYSTAP, LLC
>>>> 4501 Tower Road
>>>> Greensboro, NC 27410
>>>> br...@sy...
>>>> http://bigdata.com
>>>> http://mapgraph.io
>>>>
>>>> CONFIDENTIALITY NOTICE:  This email and its contents and attachments
>>>> are for the sole use of the intended recipient(s) and are confidential or
>>>> proprietary to SYSTAP. Any unauthorized review, use, disclosure,
>>>> dissemination or copying of this email or its contents or attachments is
>>>> prohibited. If you have received this communication in error, please notify
>>>> the sender by reply email and permanently delete all copies of the email
>>>> and its contents and attachments.
>>>>
>>>>
>>>
>>
>>
>
