|
From: Bryan T. <br...@sy...> - 2013-10-09 23:57:18
|
This is a complex subject. I think that there have been postings on it before. The general approach is based on the following:

- bigdata inlines many different kinds of values into the statement indices rather than storing them in the dictionaries. Examples are xsd:int, xsd:integer, xsd:decimal, xsd:float, xsd:double, etc. Anything that is "inline" can be converted directly from an IV back into an RDF Value. Things that are not inline (most URIs, lots of literals) have to be converted through "materialization". Materialization is basically a JOIN against the appropriate index (either ID2TERM or BLOBS).

- RDF and SPARQL are typed only at runtime for most queries. This means that we do not (in general) know the data type of the variable bindings until things are evaluated.

- We handle this late discovery of typing through a pattern in which we first process those solutions that can be handled without materialization, then route the solutions that could not be processed into the materialization step, and then retry the operation. This is called "conditional materialization." It gets applied for FILTERs. So, if you evaluate a function and everything is xsd:int (or anything else that can be converted directly back into an RDF Value), then we do not perform the "materialization" join.

MIN, MAX, AVG, and the other aggregates might be different. I would have to go back and look at the code. I think that we require mandatory materialization in front of the aggregation operations since we cannot do the conditional materialization pattern there. (Aggregation is evaluated in a single operator, so the try-fail-materialize-retry pattern cannot be made to work.) There is an interface that is used to declare the materialization requirements of various things (bops). MIN probably requires materialization.

Does that help?

Bryan

On 10/9/13 7:26 PM, "Jeremy J Carroll" <jj...@sy...> wrote:

>
>Please glance at this code (from MIN.java or MAX.java)
>
>...
>
>+    private static IVComparator comparator = new IVComparator();
>
>...
>
>
>-                /**
>-                 * FIXME This needs to use the ordering define by ORDER_BY. The
>-                 * CompareBOp imposes the ordering defined for the "<" operator
>-                 * which is less robust and will throw a type exception if you
>-                 * attempt to compare unlike Values.
>-                 *
>-                 * @see https://sourceforge.net/apps/trac/bigdata/ticket/300#comment:5
>-                 */
>-                if (CompareBOp.compare(iv, min, CompareOp.LT)) {
>+                if (comparator.compare(iv, min)<0) {
>
>                    min = iv;
>
>                }
>
>It seems to work, but I read something about requiring materialization
>which I did not understand and chose to ignore - was that a mistake?
>
>Jeremy J Carroll
>Principal Architect
>Syapse, Inc.
|
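The try-fail-materialize-retry pattern described for FILTERs can be sketched roughly as follows. This is only an illustration under stated assumptions, not bigdata's actual code: `Term`, `NotMaterializedException`, `filter`, and the integer-valued `dictionary` map are all hypothetical stand-ins (a real IV resolves to an RDF Value through ID2TERM or BLOBS, not to an `Integer` in a `Map`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical sketch of conditional materialization; not bigdata's API. */
class ConditionalMaterializationSketch {

    /** Raised when a value is needed that is neither inline nor materialized. */
    static class NotMaterializedException extends RuntimeException {}

    /** Stand-in for an IV: carries its value inline, or only a dictionary key. */
    static class Term {
        final Integer inlineValue; // non-null for inline values (e.g. xsd:int)
        final long termId;         // dictionary key, used only when not inline
        Integer materialized;      // set by the materialization step

        Term(Integer inlineValue, long termId) {
            this.inlineValue = inlineValue;
            this.termId = termId;
        }

        int value() {
            if (inlineValue != null) return inlineValue;
            if (materialized != null) return materialized;
            throw new NotMaterializedException();
        }
    }

    /** Counts dictionary lookups, i.e. how often the "join" was really needed. */
    static int lookups = 0;

    static List<Term> filter(List<Term> solutions, Predicate<Term> test,
                             Map<Long, Integer> dictionary) {
        List<Term> accepted = new ArrayList<>();
        List<Term> deferred = new ArrayList<>();

        // Pass 1: evaluate the filter without materializing anything.
        for (Term t : solutions) {
            try {
                if (test.test(t)) accepted.add(t);
            } catch (NotMaterializedException e) {
                deferred.add(t); // could not decide yet; route to materialization
            }
        }

        // Pass 2: materialize only the deferred solutions, then retry the filter.
        for (Term t : deferred) {
            Integer v = dictionary.get(t.termId);
            lookups++;
            if (v == null) continue; // unknown term: drop the solution in this sketch
            t.materialized = v;
            if (test.test(t)) accepted.add(t);
        }
        return accepted;
    }
}
```

If every input carries an inline value, the second pass has nothing to do and `lookups` stays at zero, which is the point of the pattern. Note that the sketch emits the deferred solutions after the inline ones, so the input order is not preserved. An aggregate such as MIN folds all solutions inside one operator, so there is no place to route the failures and retry, which is consistent with requiring materialization up front.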