From: Jeremy J C. <jj...@sy...> - 2015-04-15 21:24:31
I will update later with a different experiment in which medium size (1000 or so triples) INSERTS (with no DELETE) are replaced by LOAD GRAPH calls.

Jeremy

> On Apr 14, 2015, at 9:58 AM, Jeremy J Carroll <jj...@sy...> wrote:
>
> I found a CONSTRUCT and LOAD much more performant than a DELETE/INSERT, and was wondering why, and whether there is anything new (to me) about the blazegraph architecture that I should understand.
>
> =====
>
> I had a graph for which I wished to rename almost all URIs.
> The graph had about 3M triples
> I was working in AWS on
>
> I constructed a temporary graph with a rename mapping
> and then tried the following update query:
>
> DELETE {
>   GRAPH <%(abox)s> {
>     ?oldS ?oldP ?oldO
>   }
> }
> INSERT {
>   GRAPH <%(abox)s> {
>     ?newS ?newP ?newO
>   }
> }
> WHERE {
>   graph <%(abox)s> {
>     ?oldS ?oldP ?oldO
>   }
>   GRAPH <x-eg:temporary-graph> {
>     ?oldS <x-eg:replaced-by> ?newS
>   }
>   GRAPH <x-eg:temporary-graph> {
>     ?oldP <x-eg:replaced-by> ?newP
>   }
>   {
>     GRAPH <x-eg:temporary-graph> {
>       ?oldO <x-eg:replaced-by> ?newO
>     }
>   } UNION {
>     graph <%(abox)s> {
>       ?oldS ?oldP ?oldO
>     }
>     FILTER ( isLiteral(?oldO) )
>     BIND ( ?oldO as ?newO )
>   }
> }
>
> where <%(abox)s> is a variable
>
> At the point where we perform this query we have exclusive access to the blaze graph process.
>
> It took over 4 hours, with approx. the first hour showing some change in the query execution stats, and then the last 3 hours showing no change in the stats (the status page in the NSS display is not very useful with these update queries).
> After 4 hours I got bored. Cancel did not work. So I killed blazegraph and restarted.
>
> I then rewrote the code as follows.
>
> I wrote a construct query:
>
> CONSTRUCT {
>   ?newS ?newP ?newO
> }
> WHERE {
>   graph <%(abox)s> {
>     ?oldS ?oldP ?oldO
>   }
>   GRAPH <x-eg:temporary-graph> {
>     ?oldS <x-eg:replaced-by> ?newS
>   }
>   GRAPH <x-eg:temporary-graph> {
>     ?oldP <x-eg:replaced-by> ?newP
>   }
>   {
>     GRAPH <x-eg:temporary-graph> {
>       ?oldO <x-eg:replaced-by> ?newO
>     }
>   } UNION {
>     graph <%(abox)s> {
>       ?oldS ?oldP ?oldO
>     }
>     FILTER ( isLiteral(?oldO) )
>     BIND ( ?oldO as ?newO )
>   }
> }
>
> this created a temporary file.
>
> I replaced the DELETE part with
>
> DROP GRAPH <%(abox)s>
>
> and the INSERT with
>
> LOAD <file://%(tmpfile)s> INTO GRAPH <%(abox)s>
>
> ====
>
> The rewritten code took only a few minutes (less than 5 in total)
> I was expecting some improvement, but not as much as I saw.
>
> My understanding is that each of the three operations is atomic and isolated, but I lost the guarantee linking the three (which I did not need since I had exclusive lock at a higher level).
>
> Was it the atomicity that cost so much?
>
> Jeremy
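
For readers who want to try the rewritten approach, the three steps described in the quoted message (CONSTRUCT to a temporary file, DROP GRAPH, LOAD ... INTO GRAPH) could be driven from Python roughly as follows. This is only a sketch under stated assumptions: `rename_graph`, `run_construct` and `run_update` are hypothetical names that do not appear in the original post, the `.nt` suffix assumes the CONSTRUCT result is serialized as N-Triples, and the `file://` URL assumes a POSIX path on a filesystem the Blazegraph process can read. How the two callables actually talk to the server is deliberately left out.

```python
import os
import tempfile

# Templates copied from the post; %(abox)s and %(tmpfile)s are filled in
# with Python %-formatting, as the original placeholders suggest.
CONSTRUCT_TEMPLATE = """\
CONSTRUCT { ?newS ?newP ?newO }
WHERE {
  GRAPH <%(abox)s> { ?oldS ?oldP ?oldO }
  GRAPH <x-eg:temporary-graph> { ?oldS <x-eg:replaced-by> ?newS }
  GRAPH <x-eg:temporary-graph> { ?oldP <x-eg:replaced-by> ?newP }
  {
    GRAPH <x-eg:temporary-graph> { ?oldO <x-eg:replaced-by> ?newO }
  } UNION {
    GRAPH <%(abox)s> { ?oldS ?oldP ?oldO }
    FILTER ( isLiteral(?oldO) )
    BIND ( ?oldO AS ?newO )
  }
}
"""
DROP_TEMPLATE = "DROP GRAPH <%(abox)s>"
LOAD_TEMPLATE = "LOAD <file://%(tmpfile)s> INTO GRAPH <%(abox)s>"


def rename_graph(abox, run_construct, run_update):
    """Run the CONSTRUCT / DROP / LOAD sequence for the graph `abox`.

    run_construct(query) is assumed to return the serialized RDF as bytes;
    run_update(update) is assumed to submit one SPARQL Update request.
    Both are hypothetical callables supplied by the caller.
    Returns the path of the (already deleted) temporary file.
    """
    rdf_bytes = run_construct(CONSTRUCT_TEMPLATE % {"abox": abox})

    # Assumption: N-Triples output, hence the .nt suffix.
    fd, tmpfile = tempfile.mkstemp(suffix=".nt")
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(rdf_bytes)
        # The three operations are separate requests, so nothing makes
        # them atomic as a group; the caller needs exclusive access.
        run_update(DROP_TEMPLATE % {"abox": abox})
        run_update(LOAD_TEMPLATE % {"abox": abox, "tmpfile": tmpfile})
    finally:
        os.remove(tmpfile)
    return tmpfile
```

Passing the server calls in as functions keeps the sketch independent of any particular HTTP client. As in the post, the renamed triples are written out before the original graph is dropped, so a failed CONSTRUCT leaves the data untouched.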