This list is closed, nobody may subscribe to it.
From: Bryan T. <br...@sy...> - 2013-11-06 20:54:07
|
Sorry, make that http://ci.bigdata.com:8080. The runs are currently about 1-1/2 hours each. The last one failed on a zookeeper cleanup. The current CI run should go to completion.
Thanks, Bryan

From: Bryan Thompson <br...@sy...>
Date: Wednesday, November 6, 2013 3:49 PM
To: Big...@li...
Subject: [Bigdata-developers] ci.bigdata.com

We are standing up CI on a node (http://ci.bigdata.com) that will be visible to everyone (read-only). Hopefully this will provide added transparency. I am still working through the configuration of this service, but it is very close to delivering good builds. Once CI is running smoothly on EC2, I will look at how to export the maven artifacts generated by CI. I believe that we will be able to do this through a plug-in, but that may mean that the maven artifact location will change. I will look at this more tomorrow.
Thanks, Bryan |
|
From: Bryan T. <br...@sy...> - 2013-11-06 20:50:20
|
We are standing up CI on a node (http://ci.bigdata.com) that will be visible to everyone (read-only). Hopefully this will provide added transparency. I am still working through the configuration of this service, but it is very close to delivering good builds. Once CI is running smoothly on EC2, I will look at how to export the maven artifacts generated by CI. I believe that we will be able to do this through a plug-in, but that may mean that the maven artifact location will change. I will look at this more tomorrow.
Thanks, Bryan |
|
From: Bryan T. <br...@sy...> - 2013-11-06 20:48:43
|
Our apologies for the recent outage on bigdata.com and systap.com. Our hosting provider attempted a platform migration that broke our sites. We have migrated the sites to EC2 and are working to clean up any remaining broken links.
Thanks, Bryan

On 11/4/13 9:06 PM, "Bryan Thompson" <br...@sy...> wrote:
> We are experiencing a temporary outage as our hosting provider conducts a migration. These sites will be restored ASAP. The sourceforge sites and services are independent and should remain online throughout.
>
> Thanks,
> Bryan
> --------------------------------------------------------------------------
> November Webinars for C, C++, Fortran Developers
> Accelerate application performance with scalable programming models. Explore techniques for threading, error checking, porting, and tuning. Get the most from the latest Intel processors and coprocessors. See abstracts and register
> http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk
> _______________________________________________
> Bigdata-developers mailing list
> Big...@li...
> https://lists.sourceforge.net/lists/listinfo/bigdata-developers |
|
From: Bryan T. <br...@sy...> - 2013-11-05 02:08:10
|
We are experiencing a temporary outage as our hosting provider conducts a migration. These sites will be restored ASAP. The sourceforge sites and services are independent and should remain online throughout. Thanks, Bryan |
|
From: Mike P. <mi...@sy...> - 2013-10-25 16:32:33
|
I made some pretty extensive changes to the implementation of property paths (ArbitraryLengthPathOp) to address this ticket: https://sourceforge.net/apps/trac/bigdata/ticket/761 Changes checked into the 1.3 pre-release branch (BIGDATA_RELEASE_1_3_0) - revision 7483. I've added significant test coverage and verified all the old tests, but I would still like some feedback on these changes if possible. Does it still work as expected in your application? Is it faster now? Thanks, Mike |
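For readers who want to exercise the reworked ArbitraryLengthPathOp, it is the operator that evaluates SPARQL 1.1 arbitrary-length property paths. A minimal sketch of such a query follows; the `foaf:`/`ex:` vocabulary here is illustrative only and is not taken from the ticket:

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex:   <http://example.org/>

# All people reachable from ex:alice through one or more foaf:knows
# links. The "+" modifier makes this an arbitrary-length path, which
# is the case ArbitraryLengthPathOp handles.
SELECT DISTINCT ?person
WHERE {
  ex:alice foaf:knows+ ?person .
}
```

Running a query like this before and after revision 7483 is one quick way to compare correctness and speed in your own application.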
|
From: Jeremy J C. <jj...@sy...> - 2013-10-24 14:46:47
|
That is really helpful - sounds as though the optimizations you are doing will work for this case, and we needn't try optimizing up-front.
(Of course, to be revisited if there are problems!)
Jeremy J Carroll
Principal Architect
Syapse, Inc.
On Oct 23, 2013, at 5:08 PM, Bryan Thompson <br...@sy...> wrote:
> Some comments inline below. Not sure if they will help.
>
> It sounds like there is some set of shared variables between your delete and insert in the where clause. If you are just doing the cross product of two unrelated queries, then ick. The main cost is the WHERE clause. If that is fast, the whole thing should be fast.
>
> Bryan
>
>> On Oct 23, 2013, at 6:21 PM, "Jeremy J Carroll" <jj...@sy...> wrote:
>>
>> (This is really a help request rather than a developer question)
>>
>>
>> Is INSERT ing the same ground triple multiple times, one per match, in a single UPDATE request, expensive?
>
> In general, only mutation drives writes. If the same triple is asserted multiple times, the index page is only made dirty once. Index operations benefit from locality. If there are a lot of updates, it helps (a lot) to order those index writes, which we do automatically.
>
> However, if the execution pattern produces joins that generate a lot of redundant triples, then that is going to be inefficient.
>
>>
>> We have an update request that can logically be split into the following two parts
>>
>>
>> DELETE
>> { ?s ?p ?o }
>> WHERE
>> {
>> # some complex constraint, matching multiple times, on ?s ?p ?o
>> }
>>
>> ===
>>
>> INSERT
>> {
>> # many ground triples, including references to ?a ?b ?c
>> }
>> WHERE
>> {
>> # constraint matching exactly once including ?a ?b ?c
>> }
>>
>>
>> Now, logically this is equivalent to:
>>
>>
>> DELETE
>> { ?s ?p ?o }
>> INSERT
>> {
>> # many ground triples, including references to ?a ?b ?c
>> }
>> WHERE
>> {
>>
>> # constraint matching exactly once including ?a ?b ?c
>>
>> # some complex constraint, matching multiple times, on ?s ?p ?o
>> }
>>
>>
>> which is quite tempting, because we do not need the client side logic to provide both update commands in a single atomic HTTP POST request.
>>
>> But, it is very easy to imagine server side implementations where this is much less efficient ...
>>
>> Jeremy J Carroll
>> Principal Architect
>> Syapse, Inc.
>>
>>
>
> I would have to go look at the code on this. My recollection is that DELETE INSERT WHERE runs the WHERE once, feeds it into the DELETE and a temporary solution set and then feeds the temporary solution set into the INSERT. It might be that we buffer the solutions before feeding them into the DELETE. There are several different cases.
>
> As a general rule, if your WHERE is efficient, the whole update should be efficient.
>
> The SPARQL UPDATE layer does wind up materializing things as RDF values (vs IVs) rather than optimizing out materializations that are not required. This made the code much, much, MUCH easier.
>
> Hope that helps,
>
> Bryan
>
>>
>>
>> ------------------------------------------------------------------------------
>> October Webinars: Code for Performance
>> Free Intel webinars can help you accelerate application performance.
>> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from
>> the latest Intel processors and coprocessors. See abstracts and register >
>> http://pubads.g.doubleclick.net/gampad/clk?id=60135991&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Bigdata-developers mailing list
>> Big...@li...
>> https://lists.sourceforge.net/lists/listinfo/bigdata-developers
|
|
From: Bryan T. <br...@sy...> - 2013-10-24 00:09:16
|
Some comments inline below. Not sure if they will help.
It sounds like there is some set of shared variables between your delete and insert in the where clause. If you are just doing the cross product of two unrelated queries, then ick. The main cost is the WHERE clause. If that is fast, the whole thing should be fast.
Bryan
> On Oct 23, 2013, at 6:21 PM, "Jeremy J Carroll" <jj...@sy...> wrote:
>
> (This is really a help request rather than a developer question)
>
>
> Is INSERT ing the same ground triple multiple times, one per match, in a single UPDATE request, expensive?
In general, only mutation drives writes. If the same triple is asserted multiple times, the index page is only made dirty once. Index operations benefit from locality. If there are a lot of updates, it helps (a lot) to order those index writes, which we do automatically.
However, if the execution pattern produces joins that generate a lot of redundant triples, then that is going to be inefficient.
>
> We have an update request that can logically be split into the following two parts
>
>
> DELETE
> { ?s ?p ?o }
> WHERE
> {
> # some complex constraint, matching multiple times, on ?s ?p ?o
> }
>
> ===
>
> INSERT
> {
> # many ground triples, including references to ?a ?b ?c
> }
> WHERE
> {
> # constraint matching exactly once including ?a ?b ?c
> }
>
>
> Now, logically this is equivalent to:
>
>
> DELETE
> { ?s ?p ?o }
> INSERT
> {
> # many ground triples, including references to ?a ?b ?c
> }
> WHERE
> {
>
> # constraint matching exactly once including ?a ?b ?c
>
> # some complex constraint, matching multiple times, on ?s ?p ?o
> }
>
>
> which is quite tempting, because we do not need the client side logic to provide both update commands in a single atomic HTTP POST request.
>
> But, it is very easy to imagine server side implementations where this is much less efficient ...
>
> Jeremy J Carroll
> Principal Architect
> Syapse, Inc.
>
>
I would have to go look at the code on this. My recollection is that DELETE INSERT WHERE runs the WHERE once, feeds it into the DELETE and a temporary solution set and then feeds the temporary solution set into the INSERT. It might be that we buffer the solutions before feeding them into the DELETE. There are several different cases.
As a general rule, if your WHERE is efficient, the whole update should be efficient.
The SPARQL UPDATE layer does wind up materializing things as RDF values (vs IVs) rather than optimizing out materializations that are not required. This made the code much, much, MUCH easier.
Hope that helps,
Bryan
|
|
From: Jeremy J C. <jj...@sy...> - 2013-10-23 22:21:54
|
(This is really a help request rather than a developer question)
Is INSERT ing the same ground triple multiple times, one per match, in a single UPDATE request, expensive?
We have an update request that can logically be split into the following two parts
DELETE
{ ?s ?p ?o }
WHERE
{
# some complex constraint, matching multiple times, on ?s ?p ?o
}
===
INSERT
{
# many ground triples, including references to ?a ?b ?c
}
WHERE
{
# constraint matching exactly once including ?a ?b ?c
}
Now, logically this is equivalent to:
DELETE
{ ?s ?p ?o }
INSERT
{
# many ground triples, including references to ?a ?b ?c
}
WHERE
{
# constraint matching exactly once including ?a ?b ?c
# some complex constraint, matching multiple times, on ?s ?p ?o
}
which is quite tempting, because we do not need the client side logic to provide both update commands in a single atomic HTTP POST request.
But, it is very easy to imagine server side implementations where this is much less efficient ...
Jeremy J Carroll
Principal Architect
Syapse, Inc.
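One alternative worth noting: SPARQL 1.1 Update permits several operations in a single request, separated by ";", which keeps the two WHERE clauses independent while still needing only one HTTP POST. A sketch of the two-part update above in that form; the `ex:` predicates are placeholders invented for illustration, not from the original mail:

```sparql
PREFIX ex: <http://example.org/>

# Operation 1: the delete, with its own multi-match constraint.
DELETE { ?s ?p ?o }
WHERE {
  ?s ex:stale true .        # stand-in for the complex constraint on ?s ?p ?o
  ?s ?p ?o .
} ;
# Operation 2: the insert, with its exactly-once constraint.
INSERT { ?a ex:reviewedBy ex:batchJob . }
WHERE {
  ?a ex:id "record-1" .     # stand-in for the constraint matching exactly once
}
```

Per the SPARQL 1.1 Update spec the operations execute in order within the request; whether the request as a whole is atomic on failure is implementation-dependent, so that would need to be checked against the store in use.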
|
|
From: Bryan T. <br...@sy...> - 2013-10-17 17:40:51
|
That sounds good. Is there a utility to generate SPARQL queries from the SPARQL algebra operators?
Thanks, Bryan

On 10/17/13 9:48 AM, "Jerven Bolleman" <jer...@is...> wrote:
> Hi Bryan, all,
>
> Actually thinking about it, it does not matter at all that bigdata uses a different model. As the SPIN queries would be translatable to SPARQL strings which bigdata can parse on its own.
>
> Now I just need to find time to actually work on the SPIN parser again..
>
> Regards,
> Jerven
>
> On 17/10/13 14:53, Bryan Thompson wrote:
>> Peter, we do not use the openrdf query algebra. There were two reasons why we forked the SPARQL grammar.
>>
>> - Custom extensions.
>> - Target a different operator model.
>>
>> We found that the openrdf algebra model hid many of the aspects of the SPARQL query that we wanted to access when rewriting queries in order to efficiently target our backend evaluation.
>>
>> Thanks,
>> Bryan
>>
>> On 10/15/13 6:19 PM, "Peter Ansell" <ans...@gm...> wrote:
>>> Hi Bryan, Jerven,
>>>
>>> If BigData still supports sesame-queryalgebra-model would it be possible to write a converter using that API rather than sesame-queryalgebra-evaluation? It may just need a pluggable factory class (or reuse QueryModelVisitor??) to create the actual nodes and Query objects as necessary.
>>>
>>> Cheers,
>>> Peter
>>>
>>> On 16 October 2013 02:28, Bryan Thompson <br...@sy...> wrote:
>>>> We maintain a fork of the Sesame SPARQL parser in which we have some extensions of the SPARQL language.
>>>>
>>>> We also override evaluation.
>>>>
>>>> Bryan
>>>>
>>>> On 10/15/13 11:24 AM, "Jerven Bolleman" <jer...@is...> wrote:
>>>>> Hi All,
>>>>>
>>>>> I actually did not have time in the last few months to work on this code at all. But the idea was simple once we translate the SPIN serialization into Sesame SPARQL objects they would be easy to run in any sesame store. (Excluding the magic properties which might use JS).
>>>>>
>>>>> For BigData this would be a bit more complicated as in my understanding this does not use the sesame sparql parser (or is it just the evaluation engine?).
>>>>>
>>>>> Regards,
>>>>> Jerven
>>>>>
>>>>> On 15/10/13 03:16, Peter Ansell wrote:
>>>>>> Hi Daniel,
>>>>>>
>>>>>> Jerven Bolleman has been looking into translating SPIN queries to SPARQL using the OpenRDF API [1] [2].
>>>>>>
>>>>>> Extending that work to dynamically create new OpenRDF Functions from SPIN Functions that could be used in queries, in a similar way to the Jena version, may be the next logical step after the parser is completed.
>>>>>>
>>>>>> Cheers,
>>>>>> Peter
>>>>>>
>>>>>> [1] https://openrdf.atlassian.net/browse/SES-1840
>>>>>> [2] https://bitbucket.org/jbollema/sesame/commits/all/tip/SPIN
>>>>>>
>>>>>> On 15 October 2013 01:05, Daniel Mekonnen <da...@sy...> wrote:
>>>>>>> Greetings Jeremey,
>>>>>>>
>>>>>>> Its really nice to see your active participation in the Bigdata project. I've yet to contribute anything back to it myself, but am inching closer.
>>>>>>>
>>>>>>> You are in a particularly privileged position where you are now familiar with both the SPIN API (http://topbraid.org/spin/api/) and Bigdata code bases. How much effort do you think it would take for Bigdata to support SPIN through the open source API? Is there anything you can see that would block the integration of the two? If not, then I'll seriously consider making this my first Bigdata undertaking.
>>>>>>>
>>>>>>> best wishes,
>>>>>>> -Daniel
>
> --
> -------------------------------------------------------------------
> Jerven Bolleman                        Jer...@is...
> SIB Swiss Institute of Bioinformatics  Tel: +41 (0)22 379 58 85
> CMU, rue Michel Servet 1               Fax: +41 (0)22 379 58 58
> 1211 Geneve 4,
> Switzerland          www.isb-sib.ch - www.uniprot.org
> Follow us at https://twitter.com/#!/uniprot
> ------------------------------------------------------------------- |
|
From: Jerven B. <jer...@is...> - 2013-10-17 13:48:12
|
Hi Bryan, all,

Actually thinking about it, it does not matter at all that bigdata uses a different model. As the SPIN queries would be translatable to SPARQL strings which bigdata can parse on its own.

Now I just need to find time to actually work on the SPIN parser again..

Regards,
Jerven

On 17/10/13 14:53, Bryan Thompson wrote:
> Peter, we do not use the openrdf query algebra. There were two reasons why we forked the SPARQL grammar.
>
> - Custom extensions.
> - Target a different operator model.
>
> We found that the openrdf algebra model hid many of the aspects of the SPARQL query that we wanted to access when rewriting queries in order to efficiently target our backend evaluation.
>
> Thanks,
> Bryan

--
-------------------------------------------------------------------
Jerven Bolleman                        Jer...@is...
SIB Swiss Institute of Bioinformatics  Tel: +41 (0)22 379 58 85
CMU, rue Michel Servet 1               Fax: +41 (0)22 379 58 58
1211 Geneve 4,
Switzerland          www.isb-sib.ch - www.uniprot.org
Follow us at https://twitter.com/#!/uniprot
------------------------------------------------------------------- |
|
From: Bryan T. <br...@sy...> - 2013-10-17 12:53:59
|
Peter, we do not use the openrdf query algebra. There were two reasons why we forked the SPARQL grammar.

- Custom extensions.
- Target a different operator model.

We found that the openrdf algebra model hid many of the aspects of the SPARQL query that we wanted to access when rewriting queries in order to efficiently target our backend evaluation.

Thanks,
Bryan

On 10/15/13 6:19 PM, "Peter Ansell" <ans...@gm...> wrote:
> Hi Bryan, Jerven,
>
> If BigData still supports sesame-queryalgebra-model would it be possible to write a converter using that API rather than sesame-queryalgebra-evaluation? It may just need a pluggable factory class (or reuse QueryModelVisitor??) to create the actual nodes and Query objects as necessary.
>
> Cheers,
> Peter |
|
From: Peter A. <ans...@gm...> - 2013-10-15 22:19:50
|
Hi Bryan, Jerven,

If BigData still supports sesame-queryalgebra-model would it be possible to write a converter using that API rather than sesame-queryalgebra-evaluation? It may just need a pluggable factory class (or reuse QueryModelVisitor??) to create the actual nodes and Query objects as necessary.

Cheers,
Peter

On 16 October 2013 02:28, Bryan Thompson <br...@sy...> wrote:
> We maintain a fork of the Sesame SPARQL parser in which we have some extensions of the SPARQL language.
>
> We also override evaluation.
>
> Bryan |
|
From: Jeremy J C. <jj...@sy...> - 2013-10-15 21:24:08
|
I do not believe this issue is fixed with recent improvements in the area.
I tried the following workaround:
SELECT ?patient ?tt ?s
WHERE {
?patient rdf:type ?tt .
{ FILTER (EXISTS{?a ?b ?c} )
BIND ( true as ?s )
}
UNION
{ FILTER (NOT EXISTS{?a ?b ?c} )
BIND ( false as ?s )
}
} LIMIT 1
and the performance of the filter was poor.
The following query does provide a workaround:
SELECT ?patient ?tt ?s
WHERE {
?patient rdf:type ?tt .
{ SELECT (true AS ?s)
{ ?a ?b ?c }
LIMIT 1
}
UNION
{ FILTER (NOT EXISTS{?a ?b ?c } )
BIND ( false as ?s )
}
} LIMIT 1
It does, however, present minor issues with projecting the ?a ?b ?c variables out through the subselect as needed.
I would be happy to accept a trac item on this one, but I am not expecting to do bigdata work this week, and probably not next week either.
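For reference, the behaviour SPARQL 1.1 calls for when EXISTS is used as an expression can be sketched in a few lines of plain Python. This is a toy model, not Bigdata code; every name in it is illustrative:

```python
# Toy sketch (not Bigdata API) of the SPARQL 1.1 semantics at issue: EXISTS
# used as an expression, e.g. inside BIND, evaluates to a plain boolean, so
# BIND(EXISTS{?a ?b ?c} AS ?s) should always bind ?s -- true when the store
# holds at least one triple, false otherwise, never unbound.

def exists(matches):
    """EXISTS(P) is true iff evaluating pattern P yields at least one solution."""
    return len(matches) > 0

def bind_exists(solutions, store, var="s"):
    """Extend every incoming solution with ?s = EXISTS{?a ?b ?c}."""
    value = exists(store)   # the pattern {?a ?b ?c} matches any triple at all
    return [dict(sol, **{var: value}) for sol in solutions]

store = [("p1", "rdf:type", "impact:Patient")]
print(bind_exists([{"patient": "p1"}], store))
# -> [{'patient': 'p1', 's': True}]
```

The point of the sketch is only that ?s must always come back bound to a boolean; an unbound ?s, as reported in the forum post, indicates the expression form of EXISTS is not being evaluated.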
Jeremy J Carroll
Principal Architect
Syapse, Inc.
On Oct 15, 2013, at 1:48 PM, Bryan Thompson <br...@sy...> wrote:
> There have been a few tickets around EXISTS and BIND lately. Any thoughts? Bryan
>
>
> Begin forwarded message:
>
>> From: SourceForge.net <no...@so...>
>> Date: October 15, 2013 at 4:46:21 PM EDT
>> To: SourceForge.net <no...@so...>
>> Subject: [bigdata - Help] Nesting exists within bind gives issues
>>
>>
>> Read and respond to this message at:
>> https://sourceforge.net/projects/bigdata/forums/forum/676946/topic/8756452
>> By: newres
>>
>> Hi,
>>
>> I am currently trying to work with nesting an EXISTS within BIND, which seems
>> to give trouble. A small example:
>>
>>
>> SELECT ?patient ?s
>> WHERE { ?patient rdf:type impact:Patient .
>> BIND ( true AS ?s)
>> }
>>
>> Works fine and binds true to ?s in the result. Meanwhile, when nesting even a
>> trivial EXISTS, which according to the SPARQL specification should return
>> true or false, ?s is always unbound.
>>
>>
>> SELECT ?patient ?s
>> WHERE { ?patient rdf:type impact:Patient .
>> BIND ( EXISTS{?a ?b ?c} AS ?s)
>> }
>>
>> What am I missing here with the nesting?
>>
>> _____________________________________________________________________________________
>> You are receiving this email because you elected to monitor this topic or entire forum.
>> To stop monitoring this topic visit:
>> https://sourceforge.net/projects/bigdata/forums/forum/676946/topic/8756452/unmonitor
>> To stop monitoring this forum visit:
>> https://sourceforge.net/projects/bigdata/forums/forum/676946/unmonitor
> ------------------------------------------------------------------------------
> October Webinars: Code for Performance
> Free Intel webinars can help you accelerate application performance.
> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from
> the latest Intel processors and coprocessors. See abstracts and register >
> http://pubads.g.doubleclick.net/gampad/clk?id=60135031&iu=/4140/ostg.clktrk_______________________________________________
> Bigdata-developers mailing list
> Big...@li...
> https://lists.sourceforge.net/lists/listinfo/bigdata-developers
|
|
From: Bryan T. <br...@sy...> - 2013-10-15 20:49:18
|
There have been a few tickets around EXISTS and BIND lately. Any thoughts? Bryan

Begin forwarded message:

From: SourceForge.net <no...@so...>
Date: October 15, 2013 at 4:46:21 PM EDT
To: SourceForge.net <no...@so...>
Subject: [bigdata - Help] Nesting exists within bind gives issues

Read and respond to this message at:
https://sourceforge.net/projects/bigdata/forums/forum/676946/topic/8756452
By: newres

Hi,

I am currently trying to work with nesting an EXISTS within BIND, which seems to give trouble. A small example:

SELECT ?patient ?s
WHERE { ?patient rdf:type impact:Patient .
BIND ( true AS ?s)
}

Works fine and binds true to ?s in the result. Meanwhile, when nesting even a trivial EXISTS, which according to the SPARQL specification should return true or false, ?s is always unbound.

SELECT ?patient ?s
WHERE { ?patient rdf:type impact:Patient .
BIND ( EXISTS{?a ?b ?c} AS ?s)
}

What am I missing here with the nesting?

_____________________________________________________________________________________
You are receiving this email because you elected to monitor this topic or entire forum.
To stop monitoring this topic visit:
https://sourceforge.net/projects/bigdata/forums/forum/676946/topic/8756452/unmonitor
To stop monitoring this forum visit:
https://sourceforge.net/projects/bigdata/forums/forum/676946/unmonitor |
|
From: Bryan T. <br...@sy...> - 2013-10-15 15:28:56
|
We maintain a fork of the Sesame SPARQL parser in which we have some extensions of the SPARQL language. We also override evaluation. Bryan On 10/15/13 11:24 AM, "Jerven Bolleman" <jer...@is...> wrote: >Hi All, > >I actually did not have time in the last few months to work on this code >at all. But the idea was simple once we translate the SPIN serialization >into Sesame SPARQL objects they would be easy to run >in any sesame store. (Excluding the magic properties which might use JS). > >For BigData this would be a bit more complicated as in my understanding >this does not use the sesame sparql parser (or is it just the evaluation >engine?). > >Regards, >Jerven > > > >On 15/10/13 03:16, Peter Ansell wrote: >> Hi Daniel, >> >> Jerven Bolleman has been looking into translating SPIN queries to >> SPARQL using the OpenRDF API [1] [2]. >> >> Extending that work to dynamically create new OpenRDF Functions from >> SPIN Functions that could be used in queries, in a similar way to the >> Jena version, may be the next logical step after the parser is >> completed. >> >> Cheers, >> >> Peter >> >> [1] https://openrdf.atlassian.net/browse/SES-1840 >> [2] https://bitbucket.org/jbollema/sesame/commits/all/tip/SPIN >> >> On 15 October 2013 01:05, Daniel Mekonnen <da...@sy...> wrote: >>> Greetings Jeremey, >>> >>> Its really nice to see your active participation in the Bigdata >>>project. >>> I've yet to contribute anything back to it myself, but am inching >>>closer. >>> >>> You are in a particularly privileged position where you are now >>>familiar >>> with both the SPIN API (http://topbraid.org/spin/api/) and Bigdata code >>> bases. How much effort do you think it would take for Bigdata to >>>support >>> SPIN through the open source API? Is there anything you can see that >>>would >>> block the integration of the two? If not, then I'll seriously consider >>> making this my first Bigdata undertaking. 
>>> >>> best wishes, >>> >>> -Daniel >>> >>> >>>------------------------------------------------------------------------ >>>------ >>> October Webinars: Code for Performance >>> Free Intel webinars can help you accelerate application performance. >>> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the >>>most >>> from >>> the latest Intel processors and coprocessors. See abstracts and >>>register > >>> >>>http://pubads.g.doubleclick.net/gampad/clk?id=60134071&iu=/4140/ostg.clk >>>trk >>> _______________________________________________ >>> Bigdata-developers mailing list >>> Big...@li... >>> https://lists.sourceforge.net/lists/listinfo/bigdata-developers >>> > > >-- >------------------------------------------------------------------- > Jerven Bolleman Jer...@is... > SIB Swiss Institute of Bioinformatics Tel: +41 (0)22 379 58 85 > CMU, rue Michel Servet 1 Fax: +41 (0)22 379 58 58 > 1211 Geneve 4, > Switzerland www.isb-sib.ch - www.uniprot.org > Follow us at https://twitter.com/#!/uniprot >------------------------------------------------------------------- > >-------------------------------------------------------------------------- >---- >October Webinars: Code for Performance >Free Intel webinars can help you accelerate application performance. >Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most >from >the latest Intel processors and coprocessors. See abstracts and register > >http://pubads.g.doubleclick.net/gampad/clk?id=60135031&iu=/4140/ostg.clktr >k >_______________________________________________ >Bigdata-developers mailing list >Big...@li... >https://lists.sourceforge.net/lists/listinfo/bigdata-developers |
|
From: Jerven B. <jer...@is...> - 2013-10-15 15:25:08
|
Hi All, I actually did not have time in the last few months to work on this code at all. But the idea was simple once we translate the SPIN serialization into Sesame SPARQL objects they would be easy to run in any sesame store. (Excluding the magic properties which might use JS). For BigData this would be a bit more complicated as in my understanding this does not use the sesame sparql parser (or is it just the evaluation engine?). Regards, Jerven On 15/10/13 03:16, Peter Ansell wrote: > Hi Daniel, > > Jerven Bolleman has been looking into translating SPIN queries to > SPARQL using the OpenRDF API [1] [2]. > > Extending that work to dynamically create new OpenRDF Functions from > SPIN Functions that could be used in queries, in a similar way to the > Jena version, may be the next logical step after the parser is > completed. > > Cheers, > > Peter > > [1] https://openrdf.atlassian.net/browse/SES-1840 > [2] https://bitbucket.org/jbollema/sesame/commits/all/tip/SPIN > > On 15 October 2013 01:05, Daniel Mekonnen <da...@sy...> wrote: >> Greetings Jeremey, >> >> Its really nice to see your active participation in the Bigdata project. >> I've yet to contribute anything back to it myself, but am inching closer. >> >> You are in a particularly privileged position where you are now familiar >> with both the SPIN API (http://topbraid.org/spin/api/) and Bigdata code >> bases. How much effort do you think it would take for Bigdata to support >> SPIN through the open source API? Is there anything you can see that would >> block the integration of the two? If not, then I'll seriously consider >> making this my first Bigdata undertaking. >> >> best wishes, >> >> -Daniel >> >> ------------------------------------------------------------------------------ >> October Webinars: Code for Performance >> Free Intel webinars can help you accelerate application performance. >> Explore tips for MPI, OpenMP, advanced profiling, and more. 
Get the most >> from >> the latest Intel processors and coprocessors. See abstracts and register > >> http://pubads.g.doubleclick.net/gampad/clk?id=60134071&iu=/4140/ostg.clktrk >> _______________________________________________ >> Bigdata-developers mailing list >> Big...@li... >> https://lists.sourceforge.net/lists/listinfo/bigdata-developers >> -- ------------------------------------------------------------------- Jerven Bolleman Jer...@is... SIB Swiss Institute of Bioinformatics Tel: +41 (0)22 379 58 85 CMU, rue Michel Servet 1 Fax: +41 (0)22 379 58 58 1211 Geneve 4, Switzerland www.isb-sib.ch - www.uniprot.org Follow us at https://twitter.com/#!/uniprot ------------------------------------------------------------------- |
|
From: Peter A. <ans...@gm...> - 2013-10-15 01:16:12
|
Hi Daniel, Jerven Bolleman has been looking into translating SPIN queries to SPARQL using the OpenRDF API [1] [2]. Extending that work to dynamically create new OpenRDF Functions from SPIN Functions that could be used in queries, in a similar way to the Jena version, may be the next logical step after the parser is completed. Cheers, Peter [1] https://openrdf.atlassian.net/browse/SES-1840 [2] https://bitbucket.org/jbollema/sesame/commits/all/tip/SPIN On 15 October 2013 01:05, Daniel Mekonnen <da...@sy...> wrote: > Greetings Jeremey, > > Its really nice to see your active participation in the Bigdata project. > I've yet to contribute anything back to it myself, but am inching closer. > > You are in a particularly privileged position where you are now familiar > with both the SPIN API (http://topbraid.org/spin/api/) and Bigdata code > bases. How much effort do you think it would take for Bigdata to support > SPIN through the open source API? Is there anything you can see that would > block the integration of the two? If not, then I'll seriously consider > making this my first Bigdata undertaking. > > best wishes, > > -Daniel > > ------------------------------------------------------------------------------ > October Webinars: Code for Performance > Free Intel webinars can help you accelerate application performance. > Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most > from > the latest Intel processors and coprocessors. See abstracts and register > > http://pubads.g.doubleclick.net/gampad/clk?id=60134071&iu=/4140/ostg.clktrk > _______________________________________________ > Bigdata-developers mailing list > Big...@li... > https://lists.sourceforge.net/lists/listinfo/bigdata-developers > |
|
From: Jeremy J C. <jj...@sy...> - 2013-10-14 23:54:27
|
Hi Daniel I am not sure where one would start … presumably you are wanting a server side SPIN implementation - this would be quite a lot of work since SPIN libraries are written on top of Jena which does not figure inside of bigdata Of course one could implement client side (in Jena), but then performance would go through the floor Jeremy J Carroll Principal Architect Syapse, Inc. On Oct 14, 2013, at 7:05 AM, Daniel Mekonnen <da...@sy...> wrote: > Greetings Jeremey, > > Its really nice to see your active participation in the Bigdata project. I've yet to contribute anything back to it myself, but am inching closer. > > You are in a particularly privileged position where you are now familiar with both the SPIN API (http://topbraid.org/spin/api/) and Bigdata code bases. How much effort do you think it would take for Bigdata to support SPIN through the open source API? Is there anything you can see that would block the integration of the two? If not, then I'll seriously consider making this my first Bigdata undertaking. > > best wishes, > > -Daniel > ------------------------------------------------------------------------------ > October Webinars: Code for Performance > Free Intel webinars can help you accelerate application performance. > Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from > the latest Intel processors and coprocessors. See abstracts and register > > http://pubads.g.doubleclick.net/gampad/clk?id=60134071&iu=/4140/ostg.clktrk_______________________________________________ > Bigdata-developers mailing list > Big...@li... > https://lists.sourceforge.net/lists/listinfo/bigdata-developers |
|
From: Daniel M. <da...@sy...> - 2013-10-14 14:05:58
|
Greetings Jeremey, Its really nice to see your active participation in the Bigdata project. I've yet to contribute anything back to it myself, but am inching closer. You are in a particularly privileged position where you are now familiar with both the SPIN API (http://topbraid.org/spin/api/) and Bigdata code bases. How much effort do you think it would take for Bigdata to support SPIN through the open source API? Is there anything you can see that would block the integration of the two? If not, then I'll seriously consider making this my first Bigdata undertaking. best wishes, -Daniel |
|
From: Bryan T. <br...@sy...> - 2013-10-11 22:12:37
|
FYI, I do see a performance impact from NSPIN.READ, but I have not yet
found a value that does better than the default on our standard
benchmarking node. See below.
Bryan
On 10/11/13 6:10 PM, "bigdata®" <no...@so...> wrote:
>#740: performance impact of NSPIN
>----------------------------+---------------------------------------------
>--
> Reporter: jeremy_carroll | Owner: jeremy_carroll
> Type: defect | Status: accepted
> Priority: major | Milestone:
>Component: Bigdata SAIL | Version: BIGDATA_RELEASE_1_2_2
> Keywords: |
>----------------------------+---------------------------------------------
>--
>
>Comment(by thompsonbry):
>
> I can confirm a big impact from NSPIN on the BSBM 100M Explore query mix.
> However, the value of 100000 drops the performance significantly over the
> default of 100. Each trial is a full run of 50 warmup query mixes plus
>500
> query mixes. This is with 16 client threads on a 2011 Intel Mac mini
>with
> an SSD and 16G of RAM (our default setup for running this benchmark). I
>am
> using the bigdata11 host.
>
> {{{
> @100 =~ 49k QMpH (after 5 trials, with peak performance to 52k QMpH)
> @1000 =~ 40k QMpH (after 5 trials, but still unchanged after 20 trials)
> @10000 =~ 35k QMpH (after 5 trials)
> @100000 =~ 18k QMpH (after 4 trials - aborted early since clearly vastly
> slower)
> }}}
>
> Some annotated vmstat output is below.
> {{{
> vmstat -n 60
> procs -----------memory---------- ---swap-- -----io---- -system--
> ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy
>id
> wa
>
> @100
> 24 0 0 2577160 234664 11132924 0 0 1 1 2 1 1
>0
> 99 0
> 24 0 0 2360732 234664 11143836 0 0 291 106 3829 17447 94
> 4 2 0
> 13 0 0 2259448 234664 11164748 0 0 376 7 4424 22839 93
> 5 2 0
> 4 0 0 2168540 234664 11202600 0 0 530 231 5439 39167 84
> 5 10 0
> 6 1 0 2064804 234664 11243004 0 0 534 262 5611 42701 83
> 6 11 0
> 14 0 0 1978540 234664 11246852 0 0 254 71 5432 41527 81
> 6 13 0
> 15 0 0 1901856 234664 11284636 0 0 486 7 5692 43495 82
> 6 12 0
> 25 0 0 1818040 234664 11301536 0 0 466 256 5559 42428 81
> 6 13 0
> 13 0 0 1789488 234664 11336968 0 0 439 71 5757 44099 82
> 6 12 0
> 13 0 0 1683520 234664 11351136 0 0 411 266 5624 43047 81
> 5 13 0
> 11 0 0 1639276 234664 11384632 0 0 415 121 5664 43770 81
> 6 13 0
> 28 0 0 1537180 234664 11398084 0 0 366 277 5640 43359 81
> 6 13 0
> 16 0 0 1489488 234664 11425612 0 0 343 95 5596 43497 82
> 5 13 0
> 0 0 0 4237804 234684 11429704 0 0 186 201 3267 24943 47
> 4 49 0
>
> @1000
> 8 0 0 2103120 234684 11553660 0 0 88 7 2983 16066 90
>3
> 6 0
> 8 0 0 1943396 234684 11562316 0 0 151 178 4013 27923 83
> 4 12 0
> 2 0 0 1856952 234684 11574844 0 0 167 35 4138 38283 66
> 5 29 0
> 11 0 0 1736128 234684 11588228 0 0 171 249 4186 39123 66
> 5 29 0
> 1 0 0 1810508 234684 11600240 0 0 153 227 4079 38426 66
> 4 30 0
> 11 0 0 1613000 234684 11590980 0 0 131 279 3949 36645 65
> 4 31 0
> 6 0 0 1555644 234684 11602468 0 0 141 19 4121 38954 67
> 4 29 0
> 13 0 0 1518924 234684 11614240 0 0 144 201 4171 39545 67
> 4 29 0
> 2 0 0 1462912 234684 11625288 0 0 129 266 4139 39045 67
> 5 29 0
> 4 0 0 1415600 234684 11635752 0 0 123 211 4143 39169 66
> 4 29 0
>
> @10000
> 17 0 0 2215528 234684 11486076 0 0 118 131 2859 17064 93
> 3 4 0
> 9 0 0 2057240 234684 11492604 0 0 153 136 3523 25051 86
> 4 11 0
> 5 0 0 1974700 234684 11505408 0 0 203 237 3865 32972 76
> 4 20 0
> 6 0 0 1874744 234684 11517768 0 0 194 251 3902 33443 76
> 4 20 0
> 5 0 0 1821332 234684 11529316 0 0 180 266 3861 33425 75
> 4 21 0
> 2 0 0 1757648 234684 11540216 0 0 165 8 3885 33745 76
> 4 20 0
> 1 0 0 4106524 234684 11548984 0 0 159 205 3585 31021 70
> 4 27 0
>
> @100000
> 44 0 0 2319320 234684 11440564 0 0 76 91 2051 10715 97
> 2 1 0
> 5 0 0 2147392 234684 11454628 0 0 98 141 2053 12945 98
> 1 1 0
> 11 0 0 2034736 234684 11450632 0 0 105 6 1987 13235 97
> 1 1 0
> 28 0 0 2002932 234684 11468332 0 0 124 177 2113 14912 98
> 1 1 0
> 19 0 0 1915584 234684 11466676 0 0 115 96 2187 15370 95
> 2 3 0
> 1 0 0 1887056 234684 11483456 0 0 101 230 2188 15735 97
> 1 1 0
> 10 0 0 3523848 234684 11468908 0 0 27 19 999 5893 41
>1
> 58 0
> }}}
>
> I am going to try @ 50, 250, and 500 and see if I can find a setting that
> gives higher performance on this machine. The default seems to be pretty
> good.
>
>--
>Ticket URL:
><http://sourceforge.net/apps/trac/bigdata/ticket/740#comment:12>
>bigdata® <http://www.bigdata.com/blog>
>bigdata® is a scale-out storage and computing fabric supporting optional
>transactions, very high concurrency, and very high aggregate IO rates.
|
|
From: Bryan T. <br...@sy...> - 2013-10-10 16:59:32
|
Ok. Bryan From: Jeremy Carroll <jj...@sy...<mailto:jj...@sy...>> Date: Thursday, October 10, 2013 12:57 PM To: Bryan Thompson <br...@sy...<mailto:br...@sy...>> Cc: "Big...@li...<mailto:Big...@li...>" <Big...@li...<mailto:Big...@li...>> Subject: Re: [Bigdata-developers] MIN and MAX aggregates fixed: trac 736 (with question?) I can revert the MIN/MAX change from yesterday (which resurrects the old semantics) and then we can discuss tomorrow how to proceed. -- I did some digging: > Those semantics for error handling were hammered out with lee feigenbaum. I would not comment out or disable that behavior without verifying the correct interpretation if the spec for errors. B The spec has been fairly clear since the introduction of MIN in http://www.w3.org/TR/2010/WD-sparql11-query-20100126/ [[ Definition: Min The multiset of values passed as an argument is converted to a sequence S, this sequence is ordered as per the ORDER BY ASC clause. Min(S) = S0 ]] The current key phrase [[ The[y] make use of the SPARQL ORDER BY ordering definition, to allow ordering over arbitrarily typed expressions. ]] is in http://www.w3.org/TR/2010/WD-sparql11-query-20100601/ Lee himself seems to argue for the current semantics in http://lists.w3.org/Archives/Public/public-rdf-dawg/2009OctDec/0408.html All of this predates the test which seem to be introduced around Dec 2011 in http://bigdata.svn.sourceforge.net/viewvc/bigdata?logsort=cvs&view=revision&revision=5822 Looking at the rec further, there should be a test that unbound comes before other values hmm something like SELECT MIN(?x) MAX(?x) { { BIND(1 as ?x)} UNION { BIND(1 +"x" as ?x)} } I think should return (UNBOUND, 1) === Jeremy J Carroll Principal Architect Syapse, Inc. On Oct 10, 2013, at 8:26 AM, Bryan Thompson <br...@sy...<mailto:br...@sy...>> wrote: Those semantics for error handling were hammered out with lee feigenbaum. 
I would not comment out or disable that behavior without verifying the correct interpretation if the spec for errors. B -------- Original message -------- From: Jeremy J Carroll <jj...@sy...<mailto:jj...@sy...>> Date: 10/10/2013 11:02 AM (GMT-05:00) To: Bryan Thompson <br...@sy...<mailto:br...@sy...>> Cc: Big...@li...<mailto:Big...@li...> Subject: Re: [Bigdata-developers] MIN and MAX aggregates fixed: trac 736 (with question?) Shall I update these tests to not check for error, but to check for giving the correct min or max. It will lose checking that the error is sticky, but that is hard since I don't believe we should have expected errors for MIN and MAX? (We could discuss tomorrow) === Rationale: These tests both check for the old behavior which explicitly does not implement the recommendation (and had the FIXME for fixing it). The key part for both tests is: final IBindingSet data [] = new IBindingSet [] { new ListBindingSet ( new IVariable<?> [] { org, auth, book, lprice }, new IConstant [] { org1, auth1, book1, price9 } ) , new ListBindingSet ( new IVariable<?> [] { org, auth, book, lprice }, new IConstant [] { org1, auth1, book2, price5 } ) , new ListBindingSet ( new IVariable<?> [] { org, auth, book, lprice }, new IConstant [] { org1, auth2, book3, auth2 } ) , new ListBindingSet ( new IVariable<?> [] { org, auth, book, lprice }, new IConstant [] { org2, auth3, book4, price7 } ) }; They then call MAX or MIN on the ?lprice. The old behavior was to raise an error, my guess is that the new behavior is to give auth2 MIN and price9 MAX (or it might be price5 MIN and auth2 MAX) Jeremy J Carroll Principal Architect Syapse, Inc. On Oct 10, 2013, at 7:49 AM, Jeremy J Carroll <jj...@sy...<mailto:jj...@sy...>> wrote: thanks Jeremy J Carroll Principal Architect Syapse, Inc. 
On Oct 10, 2013, at 2:01 AM, Bryan Thompson <br...@sy...<mailto:br...@sy...>> wrote: Jeremy, com.bigdata.bop.rdf.aggregate.TestMAX.test_max_with_errors (from com.bigdata.rdf.TestAll) And also TestMIN.test_min_with_errors are failing in CI. Bryan On Oct 9, 2013, at 8:30 PM, "Jeremy J Carroll" <jj...@sy...<mailto:jj...@sy...>> wrote: I have verified that MIN and MAX both inherit from INeedsMaterialization and added tests that use some of the literals from TestIVComparator that need materialization. Unless Jenkins says otherwise I believe the change is good Jeremy J Carroll Principal Architect Syapse, Inc. On Oct 9, 2013, at 5:04 PM, Jeremy J Carroll <jj...@sy...<mailto:jj...@sy...>> wrote: This is really helpful, I was just about to go home, but think I will spend a few more minutes on this! Jeremy J Carroll Principal Architect Syapse, Inc. On Oct 9, 2013, at 5:00 PM, Bryan Thompson <br...@sy...<mailto:br...@sy...>> wrote: >From the ticket you mentioned -- the INeedsMaterialization interface is used to declare which bops require materialization. This gets lifted from subexpressions to decide if we need to do materialization before entering the expression. SNIP<<< There are some known issues with comparisons of inline and non-inline IVs which MikeP will be addressing shortly. Basically, we need an INeedsMaterialization option for operators which always require the materialization of non-inline IVs. ORDER_BY and aggregation both have this characteristic since they must run over all solutions at once and can not re-try solutions which have failed with a NotMaterializedException? <https://sourceforge.net/apps/trac/bigdata/wiki/NotMaterializedException>. As an initial step in that direction, I have defined an IVComparator and a test suite for that class and also added a compareLiteral(IV,Literal) to IVUtility. While this covers some cases, it does not cover all as demonstrated by TestIVComparator. 
SNIP<<< I think that the question for MIN and MAX is just whether they are imposing the total ordering for SPARQL over RDF Values. Again, I would have to look at the code. That might be an old FIXME. Or it might be a valid problem. However, I think that there are DAWG test cases that cover this. I would check there first and make sure that there is a problem (or an absence of test coverage). Bryan On 10/9/13 7:26 PM, "Jeremy J Carroll" <jj...@sy...<mailto:jj...@sy...>> wrote: Please glance at this code (from MIN.java or MAX.java) ... + private static IVComparator comparator = new IVComparator(); Š - /** - * FIXME This needs to use the ordering define by ORDER_BY. The - * CompareBOp imposes the ordering defined for the "<" operator - * which is less robust and will throw a type exception if you - * attempt to compare unlike Values. - * - * @see https://sourceforge.net/apps/trac/bigdata/ticket/300#comment:5 - */ - if (CompareBOp.compare(iv, min, CompareOp.LT)) { + if (comparator.compare(iv, min)<0) { min = iv; } It seems to work, but I read something about requiring materialization which I did not understand and chose to ignore - was that a mistake? Jeremy J Carroll Principal Architect Syapse, Inc. -------------------------------------------------------------------------- ---- October Webinars: Code for Performance Free Intel webinars can help you accelerate application performance. Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from the latest Intel processors and coprocessors. See abstracts and register > http://pubads.g.doubleclick.net/gampad/clk?id=60134071&iu=/4140/ostg.clktr k _______________________________________________ Bigdata-developers mailing list Big...@li...<mailto:Big...@li...> https://lists.sourceforge.net/lists/listinfo/bigdata-developers |
|
From: Jeremy J C. <jj...@sy...> - 2013-10-10 16:57:37
|
I can revert the MIN/MAX change from yesterday (which resurrects the old semantics) and then we can discuss tomorrow how to proceed. -- I did some digging: > Those semantics for error handling were hammered out with lee feigenbaum. I would not comment out or disable that behavior without verifying the correct interpretation if the spec for errors. B The spec has been fairly clear since the introduction of MIN in http://www.w3.org/TR/2010/WD-sparql11-query-20100126/ [[ Definition: Min The multiset of values passed as an argument is converted to a sequence S, this sequence is ordered as per the ORDER BY ASC clause. Min(S) = S0 ]] The current key phrase [[ The[y] make use of the SPARQL ORDER BY ordering definition, to allow ordering over arbitrarily typed expressions. ]] is in http://www.w3.org/TR/2010/WD-sparql11-query-20100601/ Lee himself seems to argue for the current semantics in http://lists.w3.org/Archives/Public/public-rdf-dawg/2009OctDec/0408.html All of this predates the test which seem to be introduced around Dec 2011 in http://bigdata.svn.sourceforge.net/viewvc/bigdata?logsort=cvs&view=revision&revision=5822 Looking at the rec further, there should be a test that unbound comes before other values hmm something like SELECT MIN(?x) MAX(?x) { { BIND(1 as ?x)} UNION { BIND(1 +"x" as ?x)} } I think should return (UNBOUND, 1) === Jeremy J Carroll Principal Architect Syapse, Inc. On Oct 10, 2013, at 8:26 AM, Bryan Thompson <br...@sy...> wrote: > Those semantics for error handling were hammered out with lee feigenbaum. I would not comment out or disable that behavior without verifying the correct interpretation if the spec for errors. B > > > -------- Original message -------- > From: Jeremy J Carroll <jj...@sy...> > Date: 10/10/2013 11:02 AM (GMT-05:00) > To: Bryan Thompson <br...@sy...> > Cc: Big...@li... > Subject: Re: [Bigdata-developers] MIN and MAX aggregates fixed: trac 736 (with question?) 
|
From: Bryan T. <br...@sy...> - 2013-10-10 15:32:05
|
Those semantics for error handling were hammered out with Lee Feigenbaum. I would not comment out or disable that behavior without verifying the correct interpretation of the spec for errors. B
-------- Original message --------
From: Jeremy J Carroll <jj...@sy...>
Date: 10/10/2013 11:02 AM (GMT-05:00)
To: Bryan Thompson <br...@sy...>
Cc: Big...@li...
Subject: Re: [Bigdata-developers] MIN and MAX aggregates fixed: trac 736 (with question?)
Shall I update these tests to not check for an error, but to check for giving the correct min or max?
It will lose checking that the error is sticky, but that is hard since I don't believe we should have expected errors for MIN and MAX?
(We could discuss tomorrow)
===
Rationale:
These tests both check for the old behavior which explicitly does not implement the recommendation (and had the FIXME for fixing it).
The key part for both tests is:
final IBindingSet data [] = new IBindingSet []
{
new ListBindingSet ( new IVariable<?> [] { org, auth, book, lprice }, new IConstant [] { org1, auth1, book1, price9 } )
, new ListBindingSet ( new IVariable<?> [] { org, auth, book, lprice }, new IConstant [] { org1, auth1, book2, price5 } )
, new ListBindingSet ( new IVariable<?> [] { org, auth, book, lprice }, new IConstant [] { org1, auth2, book3, auth2 } )
, new ListBindingSet ( new IVariable<?> [] { org, auth, book, lprice }, new IConstant [] { org2, auth3, book4, price7 } )
};
They then call MAX or MIN on the ?lprice.
The old behavior was to raise an error; my guess is that the new behavior is to give auth2 as MIN and price9 as MAX (or it might be price5 as MIN and auth2 as MAX)
Jeremy J Carroll
Principal Architect
Syapse, Inc.
On Oct 10, 2013, at 7:49 AM, Jeremy J Carroll <jj...@sy...<mailto:jj...@sy...>> wrote:
thanks
Jeremy J Carroll
Principal Architect
Syapse, Inc.
On Oct 10, 2013, at 2:01 AM, Bryan Thompson <br...@sy...<mailto:br...@sy...>> wrote:
Jeremy,
com.bigdata.bop.rdf.aggregate.TestMAX.test_max_with_errors (from com.bigdata.rdf.TestAll)
And also TestMIN.test_min_with_errors are failing in CI.
Bryan
On Oct 9, 2013, at 8:30 PM, "Jeremy J Carroll" <jj...@sy...<mailto:jj...@sy...>> wrote:
I have verified that MIN and MAX both inherit from INeedsMaterialization
and added tests that use some of the literals from TestIVComparator that need materialization.
Unless Jenkins says otherwise I believe the change is good
Jeremy J Carroll
Principal Architect
Syapse, Inc.
On Oct 9, 2013, at 5:04 PM, Jeremy J Carroll <jj...@sy...<mailto:jj...@sy...>> wrote:
This is really helpful, I was just about to go home, but think I will spend a few more minutes on this!
Jeremy J Carroll
Principal Architect
Syapse, Inc.
On Oct 9, 2013, at 5:00 PM, Bryan Thompson <br...@sy...<mailto:br...@sy...>> wrote:
From the ticket you mentioned -- the INeedsMaterialization interface is
used to declare which bops require materialization. This gets lifted from
subexpressions to decide if we need to do materialization before entering
the expression.
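The lifting Bryan describes — deciding before an expression is evaluated whether its inputs must be materialized, by combining what its sub-operators declare — can be sketched as taking the strongest requirement over the subexpressions. A minimal sketch in plain Java; the enum and method names here are illustrative stand-ins, not the actual bigdata INeedsMaterialization API:

```java
// Sketch of how materialization requirements could be "lifted" from
// subexpressions: the expression as a whole takes the strongest requirement
// declared by any sub-operator. Names are illustrative, not the bigdata API.
class MaterializationLift {

    // Ordered from weakest to strongest.
    enum Requirement { NEVER, SOMETIMES, ALWAYS }

    // Stand-in for the role INeedsMaterialization plays: each operator
    // declares whether its inputs must be materialized RDF Values.
    interface NeedsMaterialization {
        Requirement requirement();
    }

    // Decide, before entering the expression, the strongest requirement
    // among its sub-operators.
    static Requirement lift(java.util.List<NeedsMaterialization> subOps) {
        Requirement lifted = Requirement.NEVER;
        for (NeedsMaterialization op : subOps)
            if (op.requirement().ordinal() > lifted.ordinal())
                lifted = op.requirement();
        return lifted;
    }

    public static void main(String[] args) {
        // An aggregate like MIN/MAX over RDF Values would declare ALWAYS,
        // forcing materialization before the expression runs.
        NeedsMaterialization compare = () -> Requirement.SOMETIMES;
        NeedsMaterialization minAgg = () -> Requirement.ALWAYS;
        System.out.println(lift(java.util.Arrays.asList(compare, minAgg)));
    }
}
```

The point of taking the maximum is that a single ALWAYS sub-operator (such as an aggregate) forces materialization for the whole expression up front.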
>>>SNIP<<<
There are some known issues with comparisons of inline and non-inline
IVs which MikeP will be addressing shortly. Basically, we need an
INeedsMaterialization option for operators which always require the
materialization of non-inline IVs. ORDER_BY and aggregation both have
this characteristic since they must run over all solutions at once and
can not re-try solutions which have failed with a
NotMaterializedException
<https://sourceforge.net/apps/trac/bigdata/wiki/NotMaterializedException>.
As an initial step in that direction, I have defined an IVComparator
and a test suite for that class and also added a
compareLiteral(IV,Literal) to IVUtility. While this covers some cases,
it does not cover all as demonstrated by TestIVComparator.
>>>SNIP<<<
I think that the question for MIN and MAX is just whether they are
imposing the total ordering for SPARQL over RDF Values. Again, I would
have to look at the code. That might be an old FIXME. Or it might be a
valid problem. However, I think that there are DAWG test cases that cover
this. I would check there first and make sure that there is a problem (or
an absence of test coverage).
Bryan
On 10/9/13 7:26 PM, "Jeremy J Carroll" <jj...@sy...<mailto:jj...@sy...>> wrote:
Please glance at this code (from MIN.java or MAX.java)
...
+ private static IVComparator comparator = new IVComparator();

…

- /**
-  * FIXME This needs to use the ordering define by ORDER_BY. The
-  * CompareBOp imposes the ordering defined for the "<" operator
-  * which is less robust and will throw a type exception if you
-  * attempt to compare unlike Values.
-  *
-  * @see https://sourceforge.net/apps/trac/bigdata/ticket/300#comment:5
-  */
- if (CompareBOp.compare(iv, min, CompareOp.LT)) {
+ if (comparator.compare(iv, min) < 0) {
      min = iv;
  }
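The distinction the diff is after — an operator-style "<" is only defined for comparable pairs and raises a type error on unlike values, while an ORDER BY-style total ordering always yields an answer — can be illustrated standalone. These are hypothetical stand-ins, not CompareBOp or IVComparator: IRIs are modeled as String, numeric literals as Double:

```java
// Contrast the two comparison styles: an operator-style "<" (per the SPARQL
// operator model) errors on unlike values; a total ordering always answers.
// Hypothetical stand-ins, not the bigdata CompareBOp/IVComparator classes.
class CompareStyles {

    // Operator "<": defined here only for numeric/numeric pairs.
    static boolean lessThanOperator(Object a, Object b) {
        if (a instanceof Double && b instanceof Double)
            return (Double) a < (Double) b;
        throw new IllegalArgumentException("type error: unlike values");
    }

    // Total ordering: IRIs sort before literals; like kinds compare directly.
    static int totalOrderCompare(Object a, Object b) {
        int ra = (a instanceof String) ? 1 : 2;
        int rb = (b instanceof String) ? 1 : 2;
        if (ra != rb) return Integer.compare(ra, rb);
        if (a instanceof Double) return Double.compare((Double) a, (Double) b);
        return ((String) a).compareTo((String) b);
    }

    public static void main(String[] args) {
        // The total order can compare an IRI against a numeric literal...
        System.out.println(totalOrderCompare("http://example.org/auth2", 5.0) < 0);
        // ...where the "<" operator would throw a type error.
        try {
            lessThanOperator("http://example.org/auth2", 5.0);
        } catch (IllegalArgumentException e) {
            System.out.println("type error");
        }
    }
}
```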
It seems to work, but I read something about requiring materialization
which I did not understand and chose to ignore - was that a mistake?
Jeremy J Carroll
Principal Architect
Syapse, Inc.
--------------------------------------------------------------------------
----
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most
from
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134071&iu=/4140/ostg.clktr
k
_______________________________________________
Bigdata-developers mailing list
Big...@li...<mailto:Big...@li...>
https://lists.sourceforge.net/lists/listinfo/bigdata-developers
|
|
From: Jeremy J C. <jj...@sy...> - 2013-10-10 15:07:47
|
Shall I update these tests to not check for an error, but to check for giving the correct min or max?
It will lose checking that the error is sticky, but that is hard since I don't believe we should have expected errors for MIN and MAX?
(We could discuss tomorrow)
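The "sticky" behavior in question — once any row of a group raises a type error, the aggregate reports an error for the whole group no matter what comes later — can be sketched like this. Plain Java with invented names, not the bigdata operators; a null result stands in for the SPARQL error outcome:

```java
// Sketch of "sticky" aggregate error semantics: once any input raises a
// type error, the whole group yields an error, regardless of later rows.
// Invented names; not the bigdata MIN operator. A null result stands in
// for the SPARQL "error" outcome.
class StickyMin {
    private Integer min = null;    // running minimum over well-typed rows
    private boolean error = false; // latched ("sticky") error flag

    void accept(Object value) {
        if (error) return; // sticky: nothing after the first error matters
        if (!(value instanceof Integer)) { error = true; return; }
        int v = (Integer) value;
        if (min == null || v < min) min = v;
    }

    // The minimum for the group, or null to signal a (sticky) error.
    Integer result() { return error ? null : min; }

    public static void main(String[] args) {
        StickyMin m = new StickyMin();
        m.accept(9);
        m.accept("not a number"); // raises the error
        m.accept(5);              // too late: the error is sticky
        System.out.println(m.result()); // null: the group is in error
    }
}
```

Testing for this latch is what the existing test_min_with_errors / test_max_with_errors cases exercise, which is why dropping the expected-error check loses that coverage.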
===
Rationale:
These tests both check for the old behavior which explicitly does not implement the recommendation (and had the FIXME for fixing it).
The key part for both tests is:
final IBindingSet data [] = new IBindingSet []
{
new ListBindingSet ( new IVariable<?> [] { org, auth, book, lprice }, new IConstant [] { org1, auth1, book1, price9 } )
, new ListBindingSet ( new IVariable<?> [] { org, auth, book, lprice }, new IConstant [] { org1, auth1, book2, price5 } )
, new ListBindingSet ( new IVariable<?> [] { org, auth, book, lprice }, new IConstant [] { org1, auth2, book3, auth2 } )
, new ListBindingSet ( new IVariable<?> [] { org, auth, book, lprice }, new IConstant [] { org2, auth3, book4, price7 } )
};
They then call MAX or MIN on the ?lprice.
The old behavior was to raise an error; my guess is that the new behavior is to give auth2 as MIN and price9 as MAX (or it might be price5 as MIN and auth2 as MAX)
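If auth2 is an IRI, the first guess above is the one the SPARQL ORDER BY total ordering would give, since IRIs sort before all literals: auth2 becomes the MIN and price9 the MAX. A self-contained sketch of that ordering over the four ?lprice bindings; these are plain Java stand-ins, not the bigdata IV classes, and the auth2 URI is invented for illustration:

```java
import java.util.*;

// Sketch of the SPARQL ORDER BY total ordering applied to the fixture's
// ?lprice column. Stand-in term classes, not the bigdata IV model; in the
// total order IRIs precede all literals.
class OrderByMinMax {

    static abstract class Term { abstract int rank(); } // lower rank sorts first

    static final class Iri extends Term {
        final String value;
        Iri(String value) { this.value = value; }
        int rank() { return 1; } // IRIs precede literals in the total order
    }

    static final class NumericLiteral extends Term {
        final double value;
        NumericLiteral(double value) { this.value = value; }
        int rank() { return 2; }
    }

    // Compare by category first; within numeric literals, by value.
    static final Comparator<Term> ORDER_BY = (a, b) -> {
        if (a.rank() != b.rank()) return Integer.compare(a.rank(), b.rank());
        if (a instanceof NumericLiteral)
            return Double.compare(((NumericLiteral) a).value, ((NumericLiteral) b).value);
        return ((Iri) a).value.compareTo(((Iri) b).value);
    };

    // The four ?lprice bindings: price9, price5, auth2, price7.
    static final List<Term> LPRICE = Arrays.asList(
            new NumericLiteral(9), new NumericLiteral(5),
            new Iri("http://example.org/auth2"), new NumericLiteral(7));

    public static void main(String[] args) {
        // The IRI auth2 sorts below every literal, so it is the MIN;
        // price9 is the largest literal, so it is the MAX.
        System.out.println(Collections.min(LPRICE, ORDER_BY) instanceof Iri);
        System.out.println(((NumericLiteral) Collections.max(LPRICE, ORDER_BY)).value);
    }
}
```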
Jeremy J Carroll
Principal Architect
Syapse, Inc.
On Oct 10, 2013, at 7:49 AM, Jeremy J Carroll <jj...@sy...> wrote:
> thanks
> Jeremy J Carroll
> Principal Architect
> Syapse, Inc.
>
>
>
> On Oct 10, 2013, at 2:01 AM, Bryan Thompson <br...@sy...> wrote:
>
>> Jeremy,
>>
>> com.bigdata.bop.rdf.aggregate.TestMAX.test_max_with_errors (from com.bigdata.rdf.TestAll)
>>
>> And also TestMIN.test_min_with_errors are failing in CI.
>>
>> Bryan
>>
>> On Oct 9, 2013, at 8:30 PM, "Jeremy J Carroll" <jj...@sy...> wrote:
>>
>>> I have verified that MIN and MAX both inherit from INeedsMaterialization
>>> and added tests that use some of the literals from TestIVComparator that need materialization.
>>>
>>> Unless Jenkins says otherwise I believe the change is good
>>>
>>> Jeremy J Carroll
>>> Principal Architect
>>> Syapse, Inc.
>>>
>>>
>>>
>>> On Oct 9, 2013, at 5:04 PM, Jeremy J Carroll <jj...@sy...> wrote:
>>>
>>>> This is really helpful, I was just about to go home, but think I will spend a few more minutes on this!
>>>>
>>>>
>>>> Jeremy J Carroll
>>>> Principal Architect
>>>> Syapse, Inc.
>>>>
>>>>
>>>>
>>>> On Oct 9, 2013, at 5:00 PM, Bryan Thompson <br...@sy...> wrote:
>>>>
>>>>> From the ticket you mentioned -- the INeedsMaterialization interface is
>>>>> used to declare which bops require materialization. This gets lifted from
>>>>> subexpressions to decide if we need to do materialization before entering
>>>>> the expression.
>>>>>
>>>>>>>> SNIP<<<
>>>>> There are some known issues with comparisons of inline and non-inline
>>>>> IVs which MikeP will be addressing shortly. Basically, we need an
>>>>> INeedsMaterialization option for operators which always require the
>>>>> materialization of non-inline IVs. ORDER_BY and aggregation both have
>>>>> this characteristic since they must run over all solutions at once and
>>>>> can not re-try solutions which have failed with a
>>>>> NotMaterializedException
>>>>> <https://sourceforge.net/apps/trac/bigdata/wiki/NotMaterializedException>.
>>>>> As an initial step in that direction, I have defined an IVComparator
>>>>> and a test suite for that class and also added a
>>>>> compareLiteral(IV,Literal) to IVUtility. While this covers some cases,
>>>>> it does not cover all as demonstrated by TestIVComparator.
>>>>>
>>>>>>>> SNIP<<<
>>>>>
>>>>> I think that the question for MIN and MAX is just whether they are
>>>>> imposing the total ordering for SPARQL over RDF Values. Again, I would
>>>>> have to look at the code. That might be an old FIXME. Or it might be a
>>>>> valid problem. However, I think that there are DAWG test cases that cover
>>>>> this. I would check there first and make sure that there is a problem (or
>>>>> an absence of test coverage).
>>>>>
>>>>>
>>>>> Bryan
>>>>>
>>>>>
>>>>> On 10/9/13 7:26 PM, "Jeremy J Carroll" <jj...@sy...> wrote:
>>>>>
>>>>>>
>>>>>> Please glance at this code (from MIN.java or MAX.java)
>>>>>>
>>>>>> ...
>>>>>>
>>>>>> + private static IVComparator comparator = new IVComparator();
>>>>>>
>>>>>> …
>>>>>>
>>>>>>
>>>>>> - /**
>>>>>> -  * FIXME This needs to use the ordering define by ORDER_BY. The
>>>>>> -  * CompareBOp imposes the ordering defined for the "<" operator
>>>>>> -  * which is less robust and will throw a type exception if you
>>>>>> -  * attempt to compare unlike Values.
>>>>>> -  *
>>>>>> -  * @see https://sourceforge.net/apps/trac/bigdata/ticket/300#comment:5
>>>>>> -  */
>>>>>> - if (CompareBOp.compare(iv, min, CompareOp.LT)) {
>>>>>> + if (comparator.compare(iv, min) < 0) {
>>>>>>       min = iv;
>>>>>>   }
>>>>>>
>>>>>>
>>>>>>
>>>>>> It seems to work, but I read something about requiring materialization
>>>>>> which I did not understand and chose to ignore - was that a mistake?
>>>>>>
>>>>>> Jeremy J Carroll
>>>>>> Principal Architect
>>>>>> Syapse, Inc.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --------------------------------------------------------------------------
>>>>>> ----
>>>>>> October Webinars: Code for Performance
>>>>>> Free Intel webinars can help you accelerate application performance.
>>>>>> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most
>>>>>> from
>>>>>> the latest Intel processors and coprocessors. See abstracts and register >
>>>>>> http://pubads.g.doubleclick.net/gampad/clk?id=60134071&iu=/4140/ostg.clktr
>>>>>> k
>>>>>> _______________________________________________
>>>>>> Bigdata-developers mailing list
>>>>>> Big...@li...
>>>>>> https://lists.sourceforge.net/lists/listinfo/bigdata-developers
>>>>>
>>>>
>>>
>
|
|
From: Jeremy J C. <jj...@sy...> - 2013-10-10 14:49:39
|
thanks

Jeremy J Carroll
Principal Architect
Syapse, Inc.

On Oct 10, 2013, at 2:01 AM, Bryan Thompson <br...@sy...> wrote:

> Jeremy,
>
> com.bigdata.bop.rdf.aggregate.TestMAX.test_max_with_errors (from com.bigdata.rdf.TestAll)
>
> And also TestMIN.test_min_with_errors are failing in CI.
>
> Bryan
>
> On Oct 9, 2013, at 8:30 PM, "Jeremy J Carroll" <jj...@sy...> wrote:
>
>> I have verified that MIN and MAX both inherit from INeedsMaterialization and added tests that use some of the literals from TestIVComparator that need materialization.
>>
>> Unless Jenkins says otherwise I believe the change is good
>>
>> Jeremy J Carroll
>> Principal Architect
>> Syapse, Inc.
>>
>> On Oct 9, 2013, at 5:04 PM, Jeremy J Carroll <jj...@sy...> wrote:
>>
>>> This is really helpful, I was just about to go home, but think I will spend a few more minutes on this!
>>>
>>> Jeremy J Carroll
>>> Principal Architect
>>> Syapse, Inc.
>>>
>>> On Oct 9, 2013, at 5:00 PM, Bryan Thompson <br...@sy...> wrote:
>>>
>>>> From the ticket you mentioned -- the INeedsMaterialization interface is used to declare which bops require materialization. This gets lifted from subexpressions to decide if we need to do materialization before entering the expression.
>>>>
>>>> >>>SNIP<<<
>>>> There are some known issues with comparisons of inline and non-inline IVs which MikeP will be addressing shortly. Basically, we need an INeedsMaterialization option for operators which always require the materialization of non-inline IVs. ORDER_BY and aggregation both have this characteristic since they must run over all solutions at once and can not re-try solutions which have failed with a NotMaterializedException <https://sourceforge.net/apps/trac/bigdata/wiki/NotMaterializedException>. As an initial step in that direction, I have defined an IVComparator and a test suite for that class and also added a compareLiteral(IV,Literal) to IVUtility. While this covers some cases, it does not cover all as demonstrated by TestIVComparator.
>>>> >>>SNIP<<<
>>>>
>>>> I think that the question for MIN and MAX is just whether they are imposing the total ordering for SPARQL over RDF Values. Again, I would have to look at the code. That might be an old FIXME. Or it might be a valid problem. However, I think that there are DAWG test cases that cover this. I would check there first and make sure that there is a problem (or an absence of test coverage).
>>>>
>>>> Bryan
>>>>
>>>> On 10/9/13 7:26 PM, "Jeremy J Carroll" <jj...@sy...> wrote:
>>>>
>>>>> Please glance at this code (from MIN.java or MAX.java)
>>>>>
>>>>> ...
>>>>>
>>>>> + private static IVComparator comparator = new IVComparator();
>>>>>
>>>>> …
>>>>>
>>>>> - /**
>>>>> -  * FIXME This needs to use the ordering define by ORDER_BY. The
>>>>> -  * CompareBOp imposes the ordering defined for the "<" operator
>>>>> -  * which is less robust and will throw a type exception if you
>>>>> -  * attempt to compare unlike Values.
>>>>> -  *
>>>>> -  * @see https://sourceforge.net/apps/trac/bigdata/ticket/300#comment:5
>>>>> -  */
>>>>> - if (CompareBOp.compare(iv, min, CompareOp.LT)) {
>>>>> + if (comparator.compare(iv, min) < 0) {
>>>>>       min = iv;
>>>>>   }
>>>>>
>>>>> It seems to work, but I read something about requiring materialization which I did not understand and chose to ignore - was that a mistake?
>>>>>
>>>>> Jeremy J Carroll
>>>>> Principal Architect
>>>>> Syapse, Inc.
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>> October Webinars: Code for Performance
>>>>> Free Intel webinars can help you accelerate application performance. Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from the latest Intel processors and coprocessors. See abstracts and register > http://pubads.g.doubleclick.net/gampad/clk?id=60134071&iu=/4140/ostg.clktrk
>>>>> _______________________________________________
>>>>> Bigdata-developers mailing list
>>>>> Big...@li...
>>>>> https://lists.sourceforge.net/lists/listinfo/bigdata-developers