Re: [Bigdata-developers] ALPP performance


You would want to use a custom inference model that just had the exact
rules you needed. Look at FullClosure or FastClosure and at
InferenceEngine.Options.
Bryan

On 11/8/13 12:03 PM, "Jeremy J Carroll" <jj...@sy...> wrote:

>
>Hmmm - actually I should try enabling truth maintenance at some minimal
>level and see what happens
>
>
>Jeremy J Carroll
>Principal Architect
>Syapse, Inc.
>
>
>
>On Nov 8, 2013, at 8:50 AM, Jeremy J Carroll <jj...@sy...> wrote:
>
>> 
>> This message is highlighting a high-level issue to do with ALPPs versus
>>materialized versions of the same query.
>> 
>> Yesterday I finished porting the final piece of the Syapse
>>application's "normal user" functionality from our legacy knowledge base
>>to bigdata.
>> This piece was the faceted browser, which depends heavily on
>>some typing functionality: partial queries that I was writing as
>> 
>> [A] ?object rdf:type / rdfs:subClassOf * ?class
>> 
>> (this is a very small part of a big query that populates every cell of
>>a faceted browse page)
>> 
>> The performance of the initial cut was significantly lower than
>>that of the legacy system: I got a big boost by pulling in a recent
>>change from Mike, but even so I was not in the right ballpark.
>> 
>> On analysis the issue seemed to come down to the rdfs:subClassOf *
>>expressions, and I can meet my performance expectations by materializing
>>the reflexive transitive closure of this property so that the query
>>becomes
>> 
>> [B] ?object rdf:type / syapse:optimizedSubClassOf ?class
>> 
>> (approx: I got a factor of 10 from Mike's changes and a further factor
>>of maybe 5 from materializing)
>> 
>> The architectural question is:
>> 
>> - Should the ALPP code actually do a materialization (which would need
>>to be invalidated on update), probably controlled by an optimization
>>hint, or by counting (e.g. if we call rdfs:subClassOf * sufficiently
>>frequently compared with the updates then we should materialize)?
>> 
>> If it did, I imagine that the performance of the initial query [A]
>>could approach that of the optimized query [B].
>> 
>> Arguments against (other than time and prioritization) are:
>> - this optimization is better done by the end user (as I am doing),
>>where it can be guided by application knowledge (which is true for me -
>>syapse:optimizedSubClassOf is strictly less than rdfs:subClassOf *, e.g.
>>it is only reflexive on classes, and only on those classes that I care
>>about in the sort of query I am supporting)
>> - the cache invalidation is also hard to get right in a general
>>setting, whereas application level knowledge can make cache invalidation
>>trivial (in the syapse application any change to the ontology is a
>>pretty rare admin function, and we can invalidate all ontological caches
>>for every change without any issue)
>> 
>> Arguments for: this is otherwise a conceptually straightforward
>>improvement.
>> 
>> Jeremy J Carroll
>> Principal Architect
>> Syapse, Inc.
>> 
>> 
>> 
>
>
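
For readers following the thread, here is a minimal sketch of the application-level approach behind query [B] together with the "invalidate everything on any ontology change" policy from the arguments against. It is plain Python for illustration only, not the bigdata API, and it assumes the rdfs:subClassOf edges fit in memory. The names reflexive_transitive_closure and SubClassCache are hypothetical.

```python
from collections import defaultdict


def reflexive_transitive_closure(sub_class_of):
    """Return all (class, ancestor) pairs, including (class, class).

    sub_class_of is an iterable of (subclass, superclass) pairs.
    Reflexive pairs are added only for classes that appear in an edge.
    """
    parents = defaultdict(set)
    classes = set()
    for sub, sup in sub_class_of:
        parents[sub].add(sup)
        classes.update((sub, sup))

    closure = set()
    for start in classes:
        # Depth-first walk up the hierarchy; `seen` also guards against cycles.
        seen = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for parent in parents.get(node, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        closure.update((start, ancestor) for ancestor in seen)
    return closure


class SubClassCache:
    """Hypothetical cache: rebuild the whole closure after any ontology change."""

    def __init__(self, edges):
        self._edges = set(edges)
        self._ancestors = None  # built lazily on first lookup

    def update(self, edges):
        # Ontology changes are assumed rare, so drop the entire cache.
        self._edges = set(edges)
        self._ancestors = None

    def classes_of(self, direct_type):
        if self._ancestors is None:
            index = defaultdict(set)
            for sub, sup in reflexive_transitive_closure(self._edges):
                index[sub].add(sup)
            self._ancestors = dict(index)
        return self._ancestors.get(direct_type, set())
```

With the closure stored, the lookup for an object's direct rdf:type becomes a single step rather than a path traversal, which is the effect query [B] aims for. A general-purpose engine could not assume that updates are rare, which is why the invalidation is much simpler here than it would be inside the ALPP code.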
