This list is closed, nobody may subscribe to it.
From: <tho...@us...> - 2011-03-05 20:38:44
Revision: 4275 http://bigdata.svn.sourceforge.net/bigdata/?rev=4275&view=rev
Author: thompsonbry
Date: 2011-03-05 20:38:35 +0000 (Sat, 05 Mar 2011)

Log Message:
-----------
I have renamed the RDF-aware Constraint class to SPARQLConstraint to avoid confusion with the version which is NOT aware of SPARQL evaluation semantics (especially type errors). I have added some optimizations to MathBOp and RangeBOp designed to provide a fast path if the left argument evaluates to null and to defer heap allocations until we know that the RangeBOp can be fully evaluated. I have modified SPOPredicate#asBound(...) to trap type errors arising from attempts to evaluate a RangeBOp when some variable(s) are not bound. I am not sure that this is the right thing to do, but it allows the RTO to run. It may be that the underlying problem is making the PartitionedJoinGroup aware of the RangeBOp.

Modified Paths:
--------------
branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JGraph.java
branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/error/SparqlTypeErrorException.java
branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/MathBOp.java
branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java
branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPOPredicate.java
branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java
branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/internal/constraints/TestInlineConstraints.java
branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataEvaluationStrategyImpl3.java

Added Paths:
-----------
branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/SPARQLConstraint.java

Removed Paths:
-------------
branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/Constraint.java

Modified:
branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JGraph.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JGraph.java 2011-03-04 18:54:31 UTC (rev 4274) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JGraph.java 2011-03-05 20:38:35 UTC (rev 4275) @@ -1270,41 +1270,44 @@ if (!PartitionedJoinGroup.canJoinUsingConstraints( new IPredicate[] { v.pred }, vp.pred, C)) { - /* - * If there are no shared variables, either directly or - * indirectly via the constraints, then we can not use this - * as an initial edge. - * - * @todo UNIT TEST : correct rejection of initial paths for - * vertices which are unconstrained joins. - * - * @todo UNIT TEST : correct acceptance of initial paths for - * vertices which are unconstrained joins IFF there are no - * constrained joins in the join graph. - */ + /* + * If there are no shared variables, either directly or + * indirectly via the constraints, then we can not use this + * as an initial edge. + * + * TODO It may be possible to execute the join in one + * direction or the other but not both. This seems to be + * true for RangeBOp. + * + * @todo UNIT TEST : correct rejection of initial paths for + * vertices which are unconstrained joins. + * + * @todo UNIT TEST : correct acceptance of initial paths for + * vertices which are unconstrained joins IFF there are no + * constrained joins in the join graph. + */ continue; - } - - // The path segment - final IPredicate<?>[] preds = new IPredicate[] { v.pred, vp.pred }; + } - // cutoff join of the edge (v,vp) - final EdgeSample edgeSample = Path.cutoffJoin( - queryEngine,// - limit, // sample limit - preds, // ordered path segment. 
- C, // constraints - pathIsComplete,// - v.sample // sourceSample - ); + // The path segment + final IPredicate<?>[] preds = new IPredicate[] { v.pred, vp.pred }; - final Path p = new Path(v, vp, edgeSample); + // cutoff join of the edge (v,vp) + final EdgeSample edgeSample = Path.cutoffJoin(queryEngine,// + limit, // sample limit + preds, // ordered path segment. + C, // constraints + pathIsComplete,// + v.sample // sourceSample + ); - paths.add(p); + final Path p = new Path(v, vp, edgeSample); - } + paths.add(p); + + } } Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/error/SparqlTypeErrorException.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/error/SparqlTypeErrorException.java 2011-03-04 18:54:31 UTC (rev 4274) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/error/SparqlTypeErrorException.java 2011-03-05 20:38:35 UTC (rev 4275) @@ -46,28 +46,49 @@ static public String SPARQL_TYPE_ERROR_0000 = toURI(0); /** + * Type error used to indicate an unbound variable. + */ + static public String SPARQL_TYPE_ERROR_0001 = toURI(1); + + /** * Generic SPARQL type error. * * @see #SPARQL_TYPE_ERROR_0000 */ public SparqlTypeErrorException() { - super(LanguageFamily.SP, ErrorCategory.TY, 0/* errorCode */, - SPARQL_TYPE_ERROR_0000); + this(0/* errorCode */, SPARQL_TYPE_ERROR_0000); } - /** - * @param errorCode - * The four digit error code. - */ - public SparqlTypeErrorException(int errorCode) { + /** + * Type error thrown when there is an unbound variable. + */ + static public class UnboundVarException extends SparqlTypeErrorException { + private static final long serialVersionUID = 1L; + + public UnboundVarException() { + + super(0001/* errorCode */, SPARQL_TYPE_ERROR_0001); + + } + + } + + /** + * @param errorCode + * The four digit error code. + * @param uri + * The uri of the error. 
+ */ + protected SparqlTypeErrorException(final int errorCode, final String uri) { + super(LanguageFamily.SP, ErrorCategory.TY, errorCode, null/* msg */); } - static protected String toURI(int errorCode) { + static protected String toURI(final int errorCode) { return W3CQueryLanguageException.toURI(LanguageFamily.SP, ErrorCategory.TY, errorCode); Deleted: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/Constraint.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/Constraint.java 2011-03-04 18:54:31 UTC (rev 4274) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/Constraint.java 2011-03-05 20:38:35 UTC (rev 4275) @@ -1,119 +0,0 @@ -/* - -Copyright (C) SYSTAP, LLC 2006-2011. All rights reserved. - -Contact: - SYSTAP, LLC - 4501 Tower Road - Greensboro, NC 27410 - lic...@bi... - -This program is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; version 2 of the License. - -This program is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. 
- -You should have received a copy of the GNU General Public License -along with this program; if not, write to the Free Software -Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA - -*/ -package com.bigdata.rdf.internal.constraints; - -import java.util.Map; - -import org.apache.log4j.Logger; - -import com.bigdata.bop.BOp; -import com.bigdata.bop.IBindingSet; -import com.bigdata.bop.IConstraint; -import com.bigdata.bop.IValueExpression; -import com.bigdata.rdf.error.SparqlTypeErrorException; -import com.bigdata.rdf.internal.IV; -import com.bigdata.util.InnerCause; - -/** - * BOpConstraint that wraps a {@link EBVBOp}, which itself computes the - * effective boolean value of an IValueExpression. - */ -public class Constraint extends com.bigdata.bop.constraint.Constraint { - - /** - * - */ - private static final long serialVersionUID = -5796492538735372727L; - - protected static final Logger log = Logger.getLogger(Constraint.class); - - /** - * Convenience method to generate a constraint from a value expression. - */ - public static IConstraint wrap(final IValueExpression<IV> ve) { - if (ve instanceof EBVBOp) - return new Constraint((EBVBOp) ve); - else - return new Constraint(new EBVBOp(ve)); - } - - - public Constraint(final EBVBOp x) { - - this(new BOp[] { x }, null/*annocations*/); - - } - - /** - * Required shallow copy constructor. - */ - public Constraint(final BOp[] args, - final Map<String, Object> anns) { - super(args, anns); - } - - /** - * Required deep copy constructor. 
- */ - public Constraint(final Constraint op) { - super(op); - } - - @Override - public EBVBOp get(final int i) { - return (EBVBOp) super.get(i); - } - - public IValueExpression<IV> getValueExpression() { - return get(0).get(0); - } - - public boolean accept(final IBindingSet bs) { - - try { - - // evaluate the EBV operator - return get(0).get(bs).booleanValue(); - - } catch (Throwable t) { - - if (InnerCause.isInnerCause(t, SparqlTypeErrorException.class)) { - - // trap the type error and filter out the solution - if (log.isInfoEnabled()) - log.info("discarding solution due to type error: " + bs - + " : " + t); - - return false; - - } - - throw new RuntimeException(t); - - } - - } - -} Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/MathBOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/MathBOp.java 2011-03-04 18:54:31 UTC (rev 4274) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/MathBOp.java 2011-03-05 20:38:35 UTC (rev 4275) @@ -131,11 +131,16 @@ final public IV get(final IBindingSet bs) { final IV left = left().get(bs); + + // not yet bound? + if (left == null) + throw new SparqlTypeErrorException.UnboundVarException(); + final IV right = right().get(bs); - // not yet bound - if (left == null || right == null) - throw new SparqlTypeErrorException(); + // not yet bound? 
+ if (right == null) + throw new SparqlTypeErrorException.UnboundVarException(); return IVUtility.numericalMath(left, right, op()); Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java 2011-03-04 18:54:31 UTC (rev 4274) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java 2011-03-05 20:38:35 UTC (rev 4275) @@ -39,6 +39,11 @@ import com.bigdata.rdf.internal.IV; import com.bigdata.rdf.internal.Range; +/** + * Operator used to impose a key-range constraint on an access path. + * + * @author mrpersonick + */ final public class RangeBOp extends BOpBase implements IVariable<Range> { @@ -49,7 +54,6 @@ // private static final Logger log = Logger.getLogger(RangeBOp.class); - public interface Annotations extends ImmutableBOp.Annotations { String VAR = (RangeBOp.class.getName() + ".var").intern(); @@ -75,7 +79,7 @@ /** * Required shallow copy constructor. 
*/ - public RangeBOp(final BOp[] args, Map<String,Object> anns) { + public RangeBOp(final BOp[] args, final Map<String,Object> anns) { super(args,anns); @@ -124,15 +128,22 @@ // log.debug("getting the asBound value"); final IV from = from().get(bs); + +// log.debug("from: " + from); + + // sort of like Var.get(), which returns null when the variable + // is not yet bound + if (from == null) + return null; + final IV to = to().get(bs); -// log.debug("from: " + from); // log.debug("to: " + to); // sort of like Var.get(), which returns null when the variable // is not yet bound - if (from == null || to == null) - return null; + if (to == null) + return null; try { // let Range ctor() do the type checks and valid range checks @@ -149,21 +160,29 @@ final public RangeBOp asBound(final IBindingSet bs) { - final RangeBOp asBound = (RangeBOp) this.clone(); - // log.debug("getting the asBound value"); final IV from = from().get(bs); + +// log.debug("from: " + from); + + // sort of like Var.get(), which returns null when the variable + // is not yet bound + if (from == null) + return this; + final IV to = to().get(bs); -// log.debug("from: " + from); // log.debug("to: " + to); // sort of like Var.get(), which returns null when the variable // is not yet bound - if (from == null || to == null) - return asBound; + if (to == null) + return this; + // Note: defer clone() until everything is bound. 
+ final RangeBOp asBound = (RangeBOp) this.clone(); + asBound._setProperty(Annotations.FROM, new Constant(from)); asBound._setProperty(Annotations.TO, new Constant(to)); Copied: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/SPARQLConstraint.java (from rev 4261, branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/Constraint.java) =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/SPARQLConstraint.java (rev 0) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/SPARQLConstraint.java 2011-03-05 20:38:35 UTC (rev 4275) @@ -0,0 +1,119 @@ +/* + +Copyright (C) SYSTAP, LLC 2006-2011. All rights reserved. + +Contact: + SYSTAP, LLC + 4501 Tower Road + Greensboro, NC 27410 + lic...@bi... + +This program is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; version 2 of the License. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. 
+ +You should have received a copy of the GNU General Public License +along with this program; if not, write to the Free Software +Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + +*/ +package com.bigdata.rdf.internal.constraints; + +import java.util.Map; + +import org.apache.log4j.Logger; + +import com.bigdata.bop.BOp; +import com.bigdata.bop.IBindingSet; +import com.bigdata.bop.IConstraint; +import com.bigdata.bop.IValueExpression; +import com.bigdata.rdf.error.SparqlTypeErrorException; +import com.bigdata.rdf.internal.IV; +import com.bigdata.util.InnerCause; + +/** + * BOpConstraint that wraps a {@link EBVBOp}, which itself computes the + * effective boolean value of an IValueExpression. + */ +public class SPARQLConstraint extends com.bigdata.bop.constraint.Constraint { + + /** + * + */ + private static final long serialVersionUID = -5796492538735372727L; + + protected static final Logger log = Logger.getLogger(SPARQLConstraint.class); + + /** + * Convenience method to generate a constraint from a value expression. + */ + public static IConstraint wrap(final IValueExpression<IV> ve) { + if (ve instanceof EBVBOp) + return new SPARQLConstraint((EBVBOp) ve); + else + return new SPARQLConstraint(new EBVBOp(ve)); + } + + + public SPARQLConstraint(final EBVBOp x) { + + this(new BOp[] { x }, null/*annocations*/); + + } + + /** + * Required shallow copy constructor. + */ + public SPARQLConstraint(final BOp[] args, + final Map<String, Object> anns) { + super(args, anns); + } + + /** + * Required deep copy constructor. 
+ */ + public SPARQLConstraint(final SPARQLConstraint op) { + super(op); + } + + @Override + public EBVBOp get(final int i) { + return (EBVBOp) super.get(i); + } + + public IValueExpression<IV> getValueExpression() { + return get(0).get(0); + } + + public boolean accept(final IBindingSet bs) { + + try { + + // evaluate the EBV operator + return get(0).get(bs).booleanValue(); + + } catch (Throwable t) { + + if (InnerCause.isInnerCause(t, SparqlTypeErrorException.class)) { + + // trap the type error and filter out the solution + if (log.isInfoEnabled()) + log.info("discarding solution due to type error: " + bs + + " : " + t); + + return false; + + } + + throw new RuntimeException(t); + + } + + } + +} Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPOPredicate.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPOPredicate.java 2011-03-04 18:54:31 UTC (rev 4274) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPOPredicate.java 2011-03-05 20:38:35 UTC (rev 4275) @@ -31,6 +31,7 @@ import com.bigdata.bop.IVariableOrConstant; import com.bigdata.bop.NV; import com.bigdata.bop.ap.Predicate; +import com.bigdata.rdf.error.SparqlTypeErrorException; import com.bigdata.rdf.internal.IV; import com.bigdata.rdf.internal.constraints.RangeBOp; import com.bigdata.relation.rule.IAccessPathExpander; @@ -320,13 +321,40 @@ // we don't have a range bop for ?o if (rangeBOp == null) return tmp; - - final RangeBOp asBound = rangeBOp.asBound(bindingSet); - - tmp._setProperty(Annotations.RANGE, asBound); - - return tmp; - - } + try { + + /* + * Attempt to evaluate the RangeBOp. 
+ */ + final RangeBOp asBound = rangeBOp.asBound(bindingSet); + + tmp._setProperty(Annotations.RANGE, asBound); + + } catch (SparqlTypeErrorException.UnboundVarException ex) { + + /* + * If there was an unbound variable in the RangeBOp annotation then + * we will drop the RangeBOp. + * + * FIXME I have modified SPOPredicate#asBound(...) to trap type + * errors arising from attempts to evaluate a RangeBOp when some + * variable(s) are not bound. This presumes that the RangeBOp is in + * addition to (not instead of) the value expression from which the + * RangeBOp constraint was derived. Verify with MikeP. + * + * I am not sure that this is the right thing to do, but it allows + * the RTO to run. It may be that the underlying problem is making + * the PartitionedJoinGroup aware of the RangeBOp such that we do + * not attempt evaluation orders which would cause the RangeBOP to + * throw a type error. This gets into the area of alternative query + * plans rather than just alternative join orderings. 
+ */ + + } + + return tmp; + + } + } Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java 2011-03-04 18:54:31 UTC (rev 4274) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java 2011-03-05 20:38:35 UTC (rev 4275) @@ -1,6 +1,7 @@ package com.bigdata.bop.rdf.joinGraph; import java.io.File; +import java.math.BigInteger; import java.util.Properties; import java.util.UUID; @@ -22,11 +23,13 @@ import com.bigdata.journal.ITx; import com.bigdata.journal.Journal; import com.bigdata.rdf.internal.XSDIntIV; +import com.bigdata.rdf.internal.XSDIntegerIV; import com.bigdata.rdf.internal.constraints.CompareBOp; -import com.bigdata.rdf.internal.constraints.Constraint; import com.bigdata.rdf.internal.constraints.IsBoundBOp; import com.bigdata.rdf.internal.constraints.MathBOp; import com.bigdata.rdf.internal.constraints.NotBOp; +import com.bigdata.rdf.internal.constraints.RangeBOp; +import com.bigdata.rdf.internal.constraints.SPARQLConstraint; import com.bigdata.rdf.internal.constraints.SameTermBOp; import com.bigdata.rdf.internal.constraints.MathBOp.MathOp; import com.bigdata.rdf.model.BigdataLiteral; @@ -273,6 +276,40 @@ } } + /* + * This demonstrates the translation of one of the constraints into a + * key-range constraint on the access path. + * + * FIXME What is the purpose of RangeBOp#var? Why is it simProperty and + * not origProperty + * + * FIXME Is the RangeBOp in addition to, or instead of, the original + * constraint? + * + * [java] PipelineJoin[14](PipelineJoin[13])[ BOp.bopId=14, + * PipelineJoin. 
+ * + * constraints=[Constraint(EBVBOp(CompareBOp(simProperty1,MINUS + * (origProperty1, XSDInteger(120)))[ CompareBOp.op=GT])), + * Constraint(EBVBOp(CompareBOp(simProperty1,PLUS(origProperty1, + * XSDInteger(120)))[ CompareBOp.op=LT]))], + * + * BOp.evaluationContext=ANY, + * + * PipelineJoin.predicate=SPOPredicate[7](product=null, TermId(279564U), simProperty1=null)[ + * + * IPredicate.relationName=[BSBM_284826.spo], + * + * IPredicate.timestamp=1299271044298, + * + * IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL], + * + * SPOPredicate.range=RangeBOp()[ RangeBOp.var=simProperty1, + * RangeBOp.from=MINUS(origProperty1, XSDInteger(120)), + * RangeBOp.to=PLUS(origProperty1, XSDInteger(120)) + * + * ], BOp.bopId=7], QueryHints.optimizer=None] + */ final boolean distinct = true; final IVariable<?>[] selected; final IConstraint[] constraints; @@ -351,6 +388,15 @@ },// new NV(BOp.Annotations.BOP_ID, nextId++),// new NV(Annotations.TIMESTAMP, timestamp),// + new NV(SPOPredicate.Annotations.RANGE, new RangeBOp(// + origProperty1,// FIXME verify correct var w/ MikeP + new MathBOp(origProperty1, new Constant( + new XSDIntegerIV(BigInteger.valueOf(120))), + MathOp.MINUS),// + new MathBOp(origProperty1, new Constant( + new XSDIntegerIV(BigInteger.valueOf(120))), + MathOp.PLUS)// + )),// new NV(IPredicate.Annotations.RELATION_NAME, spoRelation)// ); @@ -373,6 +419,15 @@ },// new NV(BOp.Annotations.BOP_ID, nextId++),// new NV(Annotations.TIMESTAMP, timestamp),// + new NV(SPOPredicate.Annotations.RANGE, new RangeBOp(// + origProperty2,// FIXME verify correct var with MikeP + new MathBOp(origProperty2, new Constant( + new XSDIntegerIV(BigInteger.valueOf(170))), + MathOp.MINUS),// + new MathBOp(origProperty2, new Constant( + new XSDIntegerIV(BigInteger.valueOf(170))), + MathOp.PLUS)// + )),// new NV(IPredicate.Annotations.RELATION_NAME, spoRelation)// ); @@ -433,7 +488,7 @@ // the constraints on the join graph. 
constraints = new IConstraint[ves.length]; for (int i = 0; i < ves.length; i++) { - constraints[i] = Constraint.wrap(ves[i]); + constraints[i] = SPARQLConstraint.wrap(ves[i]); } } @@ -737,17 +792,17 @@ preds = new IPredicate[] { p0, p1, p2, p3, p4, p5, p6 }; // FILTER ( ?p1 > %x% ) - c0 = Constraint.wrap(new CompareBOp(new BOp[] { p1Var, + c0 = SPARQLConstraint.wrap(new CompareBOp(new BOp[] { p1Var, new Constant(x.getIV()) }, NV.asMap(new NV[] { new NV( CompareBOp.Annotations.OP, CompareOp.GT) }))); // FILTER (?p3 < %y% ) - c1 = Constraint.wrap(new CompareBOp(new BOp[] { p3Var, + c1 = SPARQLConstraint.wrap(new CompareBOp(new BOp[] { p3Var, new Constant(y.getIV()) }, NV.asMap(new NV[] { new NV( CompareBOp.Annotations.OP, CompareOp.LT) }))); // FILTER (!bound(?testVar)) - c2 = Constraint.wrap(new NotBOp(new IsBoundBOp(testVar))); + c2 = SPARQLConstraint.wrap(new NotBOp(new IsBoundBOp(testVar))); // the constraints on the join graph. constraints = new IConstraint[] { c0, c1, c2 }; Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/internal/constraints/TestInlineConstraints.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/internal/constraints/TestInlineConstraints.java 2011-03-04 18:54:31 UTC (rev 4274) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/internal/constraints/TestInlineConstraints.java 2011-03-05 20:38:35 UTC (rev 4275) @@ -173,7 +173,7 @@ }, // constraints on the rule. new IConstraint[] { - Constraint.wrap(new CompareBOp(a, new Constant<IV>(_35.getIV()), CompareOp.GT)) + SPARQLConstraint.wrap(new CompareBOp(a, new Constant<IV>(_35.getIV()), CompareOp.GT)) } ); @@ -279,7 +279,7 @@ }, // constraints on the rule. 
new IConstraint[] { - Constraint.wrap(new CompareBOp(a, new Constant<IV>(_35.getIV()), CompareOp.GE)) + SPARQLConstraint.wrap(new CompareBOp(a, new Constant<IV>(_35.getIV()), CompareOp.GE)) }); try { @@ -386,7 +386,7 @@ }, // constraints on the rule. new IConstraint[] { - Constraint.wrap(new CompareBOp(a, new Constant<IV>(_35.getIV()), CompareOp.LT)) + SPARQLConstraint.wrap(new CompareBOp(a, new Constant<IV>(_35.getIV()), CompareOp.LT)) }); if (log.isInfoEnabled()) @@ -494,7 +494,7 @@ }, // constraints on the rule. new IConstraint[] { - Constraint.wrap(new CompareBOp(a, new Constant<IV>(_35.getIV()), CompareOp.LE)) + SPARQLConstraint.wrap(new CompareBOp(a, new Constant<IV>(_35.getIV()), CompareOp.LE)) }); if (log.isInfoEnabled()) @@ -611,7 +611,7 @@ }, // constraints on the rule. new IConstraint[] { - Constraint.wrap(new CompareBOp(a, new MathBOp(dAge, new Constant<IV>(_5.getIV()), MathOp.PLUS), CompareOp.GT)) + SPARQLConstraint.wrap(new CompareBOp(a, new MathBOp(dAge, new Constant<IV>(_5.getIV()), MathOp.PLUS), CompareOp.GT)) }); try { @@ -724,7 +724,7 @@ }, // constraints on the rule. 
new IConstraint[] { - Constraint.wrap(new CompareBOp(a, new Constant<IV>(l2.getIV()), CompareOp.GT)) + SPARQLConstraint.wrap(new CompareBOp(a, new Constant<IV>(l2.getIV()), CompareOp.GT)) }); try { Modified: branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataEvaluationStrategyImpl3.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataEvaluationStrategyImpl3.java 2011-03-04 18:54:31 UTC (rev 4274) +++ branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataEvaluationStrategyImpl3.java 2011-03-05 20:38:35 UTC (rev 4275) @@ -90,7 +90,7 @@ import com.bigdata.rdf.internal.XSDIntegerIV; import com.bigdata.rdf.internal.constraints.AndBOp; import com.bigdata.rdf.internal.constraints.CompareBOp; -import com.bigdata.rdf.internal.constraints.Constraint; +import com.bigdata.rdf.internal.constraints.SPARQLConstraint; import com.bigdata.rdf.internal.constraints.EBVBOp; import com.bigdata.rdf.internal.constraints.IsBNodeBOp; import com.bigdata.rdf.internal.constraints.IsBoundBOp; @@ -1087,10 +1087,10 @@ for (SOp sop : g) { final BOp bop = sop.getBOp(); - if (!(bop instanceof Constraint)) { + if (!(bop instanceof SPARQLConstraint)) { continue; } - final Constraint c = (Constraint) bop; + final SPARQLConstraint c = (SPARQLConstraint) bop; if (!(c.getValueExpression() instanceof CompareBOp)) { continue; } @@ -1679,7 +1679,7 @@ private IConstraint toConstraint(final ValueExpr ve) { final IValueExpression<IV> veBOp = toVE(ve); - return Constraint.wrap(veBOp); + return SPARQLConstraint.wrap(veBOp); } This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
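The MathBOp change above reorders evaluation so that a dedicated UnboundVarException is thrown as soon as the left operand is found to be null, before the right operand is touched. The same idea can be sketched in a minimal standalone form; the class and method names below are simplified illustrations, not the actual bigdata API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the "fast path on unbound variable" pattern:
// check each operand for null immediately after fetching it, and throw a
// dedicated subtype of the type-error exception so callers can trap the
// unbound-variable case specifically (as SPOPredicate#asBound does).
class FastPathSketch {

    static class TypeError extends RuntimeException {}

    static class UnboundVar extends TypeError {}

    static int add(final Map<String, Integer> bindings,
            final String left, final String right) {
        final Integer l = bindings.get(left);
        if (l == null) // fast path: fail before evaluating the right operand
            throw new UnboundVar();
        final Integer r = bindings.get(right);
        if (r == null)
            throw new UnboundVar();
        return l + r;
    }

    public static void main(String[] args) {
        final Map<String, Integer> bs = new HashMap<>();
        bs.put("x", 2);
        bs.put("y", 3);
        System.out.println(add(bs, "x", "y")); // both bound: 5
        try {
            add(bs, "x", "z"); // "z" is unbound
        } catch (UnboundVar e) {
            System.out.println("unbound");
        }
    }
}
```

Because the specific subtype is caught (rather than the generic type error), a caller such as the asBound() logic in the patch can drop just the range annotation while still propagating genuine type errors.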
From: <tho...@us...> - 2011-03-04 18:54:38
Revision: 4274 http://bigdata.svn.sourceforge.net/bigdata/?rev=4274&view=rev
Author: thompsonbry
Date: 2011-03-04 18:54:31 +0000 (Fri, 04 Mar 2011)

Log Message:
-----------
A variety of additional edits to further reduce heap pressure. There is no observable difference for this commit on the BSBM 100M full query mix, but it appears to add ~2000 QMpH on the reduced query mix.

Modified Paths:
--------------
branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOpBase.java
branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/IVariable.java
branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java
branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractRelation.java
branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractResource.java
branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/DefaultResourceLocator.java
branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java
branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/LexiconRelation.java
branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/model/BigdataLiteralImpl.java
branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/model/BigdataValueFactoryImpl.java
branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/model/TestFactory.java
branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/Rule2BOpUtility.java

Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOpBase.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOpBase.java 2011-03-04 15:24:20 UTC (rev 4273) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOpBase.java 2011-03-04 18:54:31 UTC (rev 4274) @@ -75,7 +75,7 @@ /** * An empty array.
*/ - static protected final transient BOp[] NOARGS = new BOp[] {}; + static public final transient BOp[] NOARGS = new BOp[] {}; /** * An empty immutable annotations map. Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/IVariable.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/IVariable.java 2011-03-04 15:24:20 UTC (rev 4273) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/IVariable.java 2011-03-04 18:54:31 UTC (rev 4274) @@ -46,5 +46,8 @@ * Return <code>true</code> iff this is the special variable <code>*</code> */ boolean isWildcard(); - + + /** An empty {@link IVariable} array. */ + IVariable<?>[] EMPTY = new IVariable[0]; + } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java 2011-03-04 15:24:20 UTC (rev 4273) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java 2011-03-04 18:54:31 UTC (rev 4274) @@ -130,10 +130,12 @@ * of those {@link IPredicate}s is not yet known (it will be determined by a * query optimizer when it decides on an evaluation order for those joins). 
*/ - public IConstraint[] getJoinGraphConstraints() { - return joinGraphConstraints - .toArray(new IConstraint[joinGraphConstraints.size()]); - } + public IConstraint[] getJoinGraphConstraints() { + final int size = joinGraphConstraints.size(); + if (size == 0) + return IConstraint.EMPTY; + return joinGraphConstraints.toArray(new IConstraint[size]); + } /** * Return the set of constraints which should be attached to the last join Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractRelation.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractRelation.java 2011-03-04 15:24:20 UTC (rev 4273) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractRelation.java 2011-03-04 18:54:31 UTC (rev 4274) @@ -99,7 +99,15 @@ static public <E> String getFQN(final IRelation<E> relation, final IKeyOrder<? extends E> keyOrder) { - return relation.getNamespace() + "." + keyOrder.getIndexName(); + /* + * TODO We wind up calling this a lot. intern() might help reduce the + * heap requirements while the returned value is being held, but it is + * not reducing the heap pressure caused by this string concatenation. + * To do that we would have to search a cache using the component + * elements [namespace] and [keyOrder]. + */ + return (relation.getNamespace() + "." 
+ keyOrder.getIndexName()) + .intern(); } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractResource.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractResource.java 2011-03-04 15:24:20 UTC (rev 4273) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractResource.java 2011-03-04 18:54:31 UTC (rev 4274) @@ -408,16 +408,10 @@ /* * Resolve the commit time from which this view was materialized (if - * known) + * known and otherwise null). */ - { - - final String val = getProperty(RelationSchema.COMMIT_TIME, null/* default */); + commitTime = (Long)properties.get(RelationSchema.COMMIT_TIME); - commitTime = val == null ? null : Long.valueOf(val); - - } - forceSerialExecution = Boolean.parseBoolean(getProperty( Options.FORCE_SERIAL_EXECUTION, Options.DEFAULT_FORCE_SERIAL_EXECUTION)); Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/DefaultResourceLocator.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/DefaultResourceLocator.java 2011-03-04 15:24:20 UTC (rev 4273) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/DefaultResourceLocator.java 2011-03-04 18:54:31 UTC (rev 4274) @@ -148,7 +148,7 @@ /** * The default timeout for stale entries in milliseconds. */ - protected static transient final long DEFAULT_CACHE_TIMEOUT = (60 * 1000); + protected static transient final long DEFAULT_CACHE_TIMEOUT = (10 * 1000); /** * Ctor uses {@link #DEFAULT_CACHE_CAPACITY} and @@ -663,8 +663,7 @@ * Make the commit time against which we are reading accessible to * the locatable resource. 
*/ - properties.setProperty(RelationSchema.COMMIT_TIME, commitTime2 - .toString()); + properties.put(RelationSchema.COMMIT_TIME, commitTime2); } Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java 2011-03-04 15:24:20 UTC (rev 4273) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java 2011-03-04 18:54:31 UTC (rev 4274) @@ -102,13 +102,22 @@ return (IVariable<IV>) getRequiredProperty(Annotations.VAR); } - public IValueExpression<IV> from() { - return (IValueExpression<IV>) getRequiredProperty(Annotations.FROM); - } + public IValueExpression<IV> from() { + if (from == null) { + from = (IValueExpression<IV>) getRequiredProperty(Annotations.FROM); + } + return from; + } - public IValueExpression<IV> to() { - return (IValueExpression<IV>) getRequiredProperty(Annotations.TO); + public IValueExpression<IV> to() { + if (to == null) { + to = (IValueExpression<IV>) getRequiredProperty(Annotations.TO); + } + return to; } + + // cache to/from lookups. + private transient volatile IValueExpression<IV> to, from; final public Range get(final IBindingSet bs) { @@ -214,18 +223,18 @@ } - final private boolean _equals(final RangeBOp op) { - - return var().equals(op.var()) - && from().equals(op.from()) - && to().equals(op.to()); - - } +// final private boolean _equals(final RangeBOp op) { +// +// return var().equals(op.var()) +// && from().equals(op.from()) +// && to().equals(op.to()); +// +// } /** * Caches the hash code. 
*/ - private int hash = 0; +// private int hash = 0; public int hashCode() { // // int h = hash; Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/LexiconRelation.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/LexiconRelation.java 2011-03-04 15:24:20 UTC (rev 4273) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/LexiconRelation.java 2011-03-04 18:54:31 UTC (rev 4274) @@ -409,6 +409,7 @@ /* * Unshared for any other view of the triple store. */ + System.err.println("Unshared: "+termCacheCapacity);//FIXME remove stderr. termCache = new ConcurrentWeakValueCacheWithBatchedUpdates<IV, BigdataValue>(// termCacheCapacity, // queueCapacity .75f, // loadFactor (.75 is the default) @@ -1121,7 +1122,7 @@ * into the database. Otherwise unknown terms are inserted into * the database. */ - @SuppressWarnings("unchecked") +// @SuppressWarnings("unchecked") public void addTerms(final BigdataValue[] terms, final int numTerms, final boolean readOnly) { @@ -1377,10 +1378,17 @@ } + final int nremaining = terms2.size(); + + if(nremaining == 0) + return EMPTY_VALUE_ARRAY; + return terms2.toArray(new BigdataValue[terms2.size()]); } + private static transient final BigdataValue[] EMPTY_VALUE_ARRAY = new BigdataValue[0]; + /** * Index terms for keyword search. */ @@ -2254,8 +2262,10 @@ @Override protected ConcurrentWeakValueCacheWithBatchedUpdates<IV, BigdataValue> newInstance( NT key, Integer termCacheCapacity) { + final int queueCapacity = 50000;// FIXME termCacheCapacity.intValue(); + System.err.println("Shared : "+termCacheCapacity);//FIXME remove stderr. return new ConcurrentWeakValueCacheWithBatchedUpdates<IV, BigdataValue>(// - termCacheCapacity.intValue(), // queueCapacity + queueCapacity,// backing hard reference LRU queue capacity. 
.75f, // loadFactor (.75 is the default) 16 // concurrency level (16 is the default) ); Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/model/BigdataLiteralImpl.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/model/BigdataLiteralImpl.java 2011-03-04 15:24:20 UTC (rev 4273) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/model/BigdataLiteralImpl.java 2011-03-04 18:54:31 UTC (rev 4274) @@ -90,7 +90,7 @@ this.label = label; // force to lowercase (Sesame does this too). - this.language = (language != null ? language.toLowerCase() : null); + this.language = (language != null ? language.toLowerCase().intern() : null); // this.language = language; this.datatype = datatype; Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/model/BigdataValueFactoryImpl.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/model/BigdataValueFactoryImpl.java 2011-03-04 15:24:20 UTC (rev 4273) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/model/BigdataValueFactoryImpl.java 2011-03-04 18:54:31 UTC (rev 4274) @@ -27,6 +27,8 @@ package com.bigdata.rdf.model; +import java.util.LinkedHashMap; +import java.util.Map; import java.util.UUID; import javax.xml.datatype.XMLGregorianCalendar; @@ -63,9 +65,11 @@ * WARNING: Use {@link #getInstance(String)} NOT this constructor. */ private BigdataValueFactoryImpl() { + + xsdMap = getXSDMap(); } - + /** * Canonicalizing mapping for {@link BigdataValueFactoryImpl}s based on the * namespace of the {@link LexiconRelation}. 
@@ -220,6 +224,15 @@ public static final transient String xsd = NAMESPACE_XSD + "#"; + private final BigdataURIImpl xsd_string = new BigdataURIImpl(this, xsd + + "string"); + + private final BigdataURIImpl xsd_dateTime = new BigdataURIImpl(this, + xsd + "dateTime"); + + private final BigdataURIImpl xsd_date = new BigdataURIImpl(this, + xsd + "date"); + private final BigdataURIImpl xsd_long = new BigdataURIImpl(this, xsd + "long"); @@ -247,6 +260,34 @@ private final BigdataLiteralImpl FALSE = new BigdataLiteralImpl(this, "false", null, xsd_boolean); + /** + * Map for fast resolution of XSD URIs. The keys are the string values of + * the URIs. The values are the URIs. + */ + private final Map<String,BigdataURIImpl> xsdMap; + + /** + * Populate and return a map for fast resolution of XSD URIs. + */ + private Map<String, BigdataURIImpl> getXSDMap() { + + final Map<String, BigdataURIImpl> map = new LinkedHashMap<String, BigdataURIImpl>(); + + final BigdataURIImpl[] a = new BigdataURIImpl[] { xsd_string, + xsd_dateTime, xsd_date, xsd_long, xsd_int, xsd_byte, xsd_short, + xsd_double, xsd_float, xsd_boolean }; + + for (BigdataURIImpl x : a) { + + // stringValue of URI => URI + map.put(x.stringValue(), x); + + } + + return map; + + } + public BigdataLiteralImpl createLiteral(boolean arg0) { return (arg0 ? TRUE : FALSE); @@ -318,8 +359,8 @@ */ if (datatype != null && !(datatype instanceof BigdataURIImpl)) { - datatype = createURI(datatype.stringValue()); - + datatype = createURI(datatype.stringValue()); + } return new BigdataLiteralImpl(this, label, null, @@ -329,6 +370,21 @@ public BigdataURIImpl createURI(final String uriString) { + final String str = uriString; + +// if (str.startsWith(NAMESPACE_XSD)) { + + final BigdataURIImpl tmp = xsdMap.get(str); + + if(tmp != null) { + + // found in canonicalizing map. 
+ return tmp; + + } + +// } + return new BigdataURIImpl(this, uriString); } Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/model/TestFactory.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/model/TestFactory.java 2011-03-04 15:24:20 UTC (rev 4273) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/model/TestFactory.java 2011-03-04 18:54:31 UTC (rev 4274) @@ -70,7 +70,25 @@ assertEquals(12, l1.intValue()); } + + /** + * Unit test verifies that the created URIs are canonical for well-known + * XSD URIs. + */ + public void test_create_xsdInt_canonical() { + + final BigdataURI v1 = vf.createURI(XSD.INT.stringValue()); + + final BigdataURI v2 = vf.createURI(XSD.INT.stringValue()); + + // verify the URI. + assertEquals(v1.stringValue(),XSD.INT.stringValue()); + + // verify the same reference (canonical). + assertTrue(v1 == v2); + } + /** * Unit test for {@link ValueFactory#createLiteral(String, URI)} when the * datatype URI is <code>null</code>. 
Modified: branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/Rule2BOpUtility.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/Rule2BOpUtility.java 2011-03-04 15:24:20 UTC (rev 4273) +++ branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/Rule2BOpUtility.java 2011-03-04 18:54:31 UTC (rev 4274) @@ -31,7 +31,6 @@ import java.util.Arrays; import java.util.Collection; import java.util.Enumeration; -import java.util.Iterator; import java.util.LinkedList; import java.util.List; import java.util.Properties; @@ -56,7 +55,6 @@ import com.bigdata.bop.IVariableOrConstant; import com.bigdata.bop.NV; import com.bigdata.bop.PipelineOp; -import com.bigdata.bop.Var; import com.bigdata.bop.ap.Predicate; import com.bigdata.bop.ap.filter.DistinctFilter; import com.bigdata.bop.bindingSet.HashBindingSet; @@ -109,6 +107,8 @@ protected static final Logger log = Logger.getLogger(Rule2BOpUtility.class); + private static final transient IConstraint[][] NO_ASSIGNED_CONSTRAINTS = new IConstraint[0][]; + /** * Flag to conditionally enable the new named and default graph support. * <p> @@ -344,7 +344,7 @@ // // true iff the database is in quads mode. // final boolean isQuadsQuery = db.isQuads(); - final PipelineOp startOp = applyQueryHints(new StartOp(new BOp[] {}, + final PipelineOp startOp = applyQueryHints(new StartOp(BOpBase.NOARGS, NV.asMap(new NV[] {// new NV(Predicate.Annotations.BOP_ID, idFactory .incrementAndGet()),// @@ -576,22 +576,28 @@ * from SOp2BOpUtility anymore so ok for now */ final IConstraint[][] assignedConstraints; - { - // Extract IConstraint[] from the rule. - final IConstraint[] constraints = new IConstraint[rule.getConstraintCount()]; - for(int i=0; i<constraints.length; i++) { - constraints[i] = rule.getConstraint(i); - } - - // figure out which constraints are attached to which predicates. 
- assignedConstraints = PartitionedJoinGroup.getJoinGraphConstraints( - preds, constraints, - knownBound.toArray(new IVariable<?>[knownBound.size()]), - true// pathIsComplete - ); - } + { - /* + final int nconstraints = rule.getConstraintCount(); + + // Extract IConstraint[] from the rule. + final IConstraint[] constraints = new IConstraint[nconstraints]; + for (int i = 0; i < constraints.length; i++) { + constraints[i] = rule.getConstraint(i); + } + + final int nknownBound = knownBound.size(); + + // figure out which constraints are attached to which + // predicates. + assignedConstraints = PartitionedJoinGroup.getJoinGraphConstraints( + preds, constraints, + nknownBound == 0 ? IVariable.EMPTY : knownBound + .toArray(new IVariable<?>[nknownBound]), true// pathIsComplete + ); + } + + /* * */ for (int i = 0; i < preds.length; i++) { This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
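[Editor's note: the RangeBOp hunk in this commit caches the FROM/TO annotation lookups in transient volatile fields so repeated calls avoid re-resolving the annotation map. The class below is a minimal, hypothetical sketch of that lazy-caching pattern, not the actual RangeBOp code; the names CachedAnnotationLookup and "FROM" are illustrative only.]

```java
import java.util.Map;

// Minimal sketch of the lazy-caching pattern applied to RangeBOp#from()/to():
// resolve an annotation once, then serve it from a transient field thereafter.
public class CachedAnnotationLookup {

    private final Map<String, Object> annotations;

    // Transient + volatile, as in the real class: the cache is not
    // serialized and the cached reference is safely published.
    private transient volatile Object from;

    public CachedAnnotationLookup(final Map<String, Object> annotations) {
        this.annotations = annotations;
    }

    public Object from() {
        if (from == null) {
            // Benign race: two threads may both resolve the value, but they
            // resolve the same immutable annotation, so either result is fine.
            from = annotations.get("FROM");
        }
        return from;
    }
}
```

The same idea works for any immutable annotation that is looked up far more often than it is set.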
From: <tho...@us...> - 2011-03-04 15:24:27
Revision: 4273 http://bigdata.svn.sourceforge.net/bigdata/?rev=4273&view=rev Author: thompsonbry Date: 2011-03-04 15:24:20 +0000 (Fri, 04 Mar 2011) Log Message: ----------- Working on reducing heap pressure under sustained concurrent query by improving sharing of data structures across views of the triple store backed by the same commit point. => Modified BigdataSail#createLTS() to locate the LTS even when it is created new such that the correct properties are materialized from the GRS and made visible to the LTS. This makes it possible to access the pre-materialized Vocabulary and Axioms objects. => Added AbstractResource#getBareProperties(), which does NOT wrap the Properties object. Wrapping the Properties object protects it against inadvertent modification, but doing so makes it impossible to access non-String property values using Hashtable#get(name) since they are inside the protected default Properties object. getBareProperties() can be used in those cases where you need to access non-String property values. The caller is responsible for avoiding mutation to the returned Properties object. => Replaced the LocalTripleStore(Properties) constructor, which was only used by the unit tests, with a static getInstance(properties) method. As of this change, getInstance() correctly reports the properties materialized from the GRS rather than those passed in by the caller. => Modified AbstractTripleStore#getVocabulary() to return the pre-materialized object using getBareProperties() rather than re-materializing it from the GRS. => Modified AbstractTripleStore#getAxioms() to return the pre-materialized object using getBareProperties() rather than re-materializing it from the GRS.
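[Editor's note: the getBareProperties() rationale above hinges on standard java.util.Properties behavior: new Properties(defaults) stores the defaults in a separate field, and Hashtable#get(...) never consults that field, so a non-String object placed in the defaults is invisible through the wrapper. A small self-contained demonstration (hypothetical key name) follows.]

```java
import java.util.Properties;

public class PropertiesWrappingDemo {

    // Returns true iff a non-String value stored in the defaults is
    // visible through a wrapping Properties via Hashtable#get(name).
    public static boolean wrappedViewSeesObject(final String key, final Object value) {
        final Properties defaults = new Properties();
        defaults.put(key, value); // non-String value, stored directly
        final Properties wrapped = new Properties(defaults); // defensive wrapper
        // Hashtable#get(...) does NOT consult the defaults table, and
        // getProperty(...) only returns String values, so the object is
        // unreachable through the wrapper.
        return wrapped.get(key) != null;
    }

    public static void main(final String[] args) {
        final Properties defaults = new Properties();
        defaults.put("commitTime", Long.valueOf(42L));

        // Wrapping (as getProperties() does) hides the materialized object:
        final Properties wrapped = new Properties(defaults);
        System.out.println(wrapped.get("commitTime"));  // null

        // The bare object (as getBareProperties() returns) exposes it:
        System.out.println(defaults.get("commitTime")); // 42
    }
}
```

This is why the commit reads the cached Axioms and Vocabulary objects off the bare Properties rather than the wrapped view.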
Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractResource.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/DefaultResourceLocator.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/store/AbstractTripleStore.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/store/LocalTripleStore.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/lexicon/TestVocabulary.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/rules/TestTruthMaintenance.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/store/TestLocalQuadStore.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/store/TestLocalTripleStore.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/store/TestLocalTripleStoreWithoutInlining.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/store/TestLocalTripleStoreWithoutStatementIdentifiers.java branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataSail.java Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractResource.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractResource.java 2011-03-04 11:06:25 UTC (rev 4272) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractResource.java 2011-03-04 15:24:20 UTC (rev 4273) @@ -513,15 +513,35 @@ } /** - * Return an object wrapping the properties specified to the ctor. + * Wrap and return the properties specified to the ctor. Wrapping the + * {@link Properties} object prevents inadvertent side-effects. */ public final Properties getProperties() { - + return new Properties(properties); } /** + * Return the {@link Properties} object without wrapping it. 
This method can + * be used in those cases where you need to access non-String property + * values. The caller is responsible for avoiding mutation to the returned + * Properties object. + * <p> + * Note: This explicitly does NOT wrap the properties. Doing so makes it + * impossible to access the default properties using Hashtable#get(), which + * in turn means that we can not access non-String objects which have been + * materialized from the GRS in the {@link Properties}. This does introduce + * some potential for side-effects between read-only instances of the same + * resource view which share the same properties object. + */ + protected final Properties getBareProperties() { + + return properties; + + } + + /** * Return the object used to locate indices, relations, and relation * containers and to execute operations on those resources. * <p> @@ -748,4 +768,21 @@ } +// /** +// * Sets the property on the underlying properties object but DOES NOT set +// * the property on the global row store (GRS). This method may be used when +// * a resource is newly created in order to cache objects which are persisted +// * on the GRS. +// * +// * @param name +// * The property name. +// * @param value +// * The property value. +// */ +// protected void setProperty(final String name, final Object value) { +// +// properties.put(name, value); +// +// } + } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/DefaultResourceLocator.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/DefaultResourceLocator.java 2011-03-04 11:06:25 UTC (rev 4272) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/DefaultResourceLocator.java 2011-03-04 15:24:20 UTC (rev 4273) @@ -693,7 +693,7 @@ * @param properties * Configuration properties for the relation. * - * @return A new instance of the identifed resource. 
+ * @return A new instance of the identified resource. */ protected T newInstance(final Class<? extends T> cls, final IIndexManager indexManager, final String namespace, Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/store/AbstractTripleStore.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/store/AbstractTripleStore.java 2011-03-04 11:06:25 UTC (rev 4272) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/store/AbstractTripleStore.java 2011-03-04 15:24:20 UTC (rev 4273) @@ -471,7 +471,8 @@ * @see LexiconRelation * @see KeyBuilder.Options */ - String LEXICON = AbstractTripleStore.class.getName() + ".lexicon"; + String LEXICON = (AbstractTripleStore.class.getName() + ".lexicon") + .intern(); String DEFAULT_LEXICON = "true"; @@ -494,7 +495,8 @@ * this option in order to spend less time writing the forward lexicon * index (and it will also take up less space). */ - String STORE_BLANK_NODES = AbstractTripleStore.class.getName() + ".storeBlankNodes"; + String STORE_BLANK_NODES = (AbstractTripleStore.class.getName() + ".storeBlankNodes") + .intern(); String DEFAULT_STORE_BLANK_NODES = "false"; @@ -556,9 +558,8 @@ * partitions for the statement indices, then SQRT(50) =~ 7 would be a * good choice. */ - String TERMID_BITS_TO_REVERSE = AbstractTripleStore.class - .getName() - + ".termIdBitsToReverse"; + String TERMID_BITS_TO_REVERSE = (AbstractTripleStore.class.getName() + ".termIdBitsToReverse") + .intern(); String DEFAULT_TERMID_BITS_TO_REVERSE = "6"; @@ -569,7 +570,8 @@ * * @see #TEXT_INDEXER_CLASS */ - String TEXT_INDEX = AbstractTripleStore.class.getName() + ".textIndex"; + String TEXT_INDEX = (AbstractTripleStore.class.getName() + ".textIndex") + .intern(); String DEFAULT_TEXT_INDEX = "true"; @@ -578,9 +580,8 @@ * full text index that may be used to lookup datatype literals by * tokens found in the text of those literals. 
*/ - String TEXT_INDEX_DATATYPE_LITERALS = AbstractTripleStore.class - .getName() - + ".textIndex.datatypeLiterals"; + String TEXT_INDEX_DATATYPE_LITERALS = (AbstractTripleStore.class + .getName() + ".textIndex.datatypeLiterals").intern(); String DEFAULT_TEXT_INDEX_DATATYPE_LITERALS = "true"; @@ -589,8 +590,8 @@ * cache provides fast lookup of frequently used RDF {@link Value}s by * their term identifier. */ - String TERM_CACHE_CAPACITY = AbstractTripleStore.class.getName() - + ".termCache.capacity"; + String TERM_CACHE_CAPACITY = (AbstractTripleStore.class.getName() + + ".termCache.capacity").intern(); String DEFAULT_TERM_CACHE_CAPACITY = "500";//"50000"; @@ -609,7 +610,8 @@ * @see NoVocabulary * @see RDFSVocabulary */ - String VOCABULARY_CLASS = AbstractTripleStore.class.getName() + ".vocabularyClass"; + String VOCABULARY_CLASS = (AbstractTripleStore.class.getName() + ".vocabularyClass") + .intern(); String DEFAULT_VOCABULARY_CLASS = RDFSVocabulary.class.getName(); @@ -621,8 +623,9 @@ * {@link BaseAxioms}. This option is ignored if the lexicon is * disabled. Use {@link NoAxioms} to disable inference. */ - String AXIOMS_CLASS = AbstractTripleStore.class.getName() + ".axiomsClass"; - + String AXIOMS_CLASS = (AbstractTripleStore.class.getName() + ".axiomsClass") + .intern(); + String DEFAULT_AXIOMS_CLASS = OwlAxioms.class.getName(); /** @@ -653,7 +656,8 @@ * at query time. Both {@link FastClosure} and {@link FullClosure} are * aware of this and handle it correctly (e.g., as configured). */ - String CLOSURE_CLASS = AbstractTripleStore.class.getName() + ".closureClass"; + String CLOSURE_CLASS = (AbstractTripleStore.class.getName() + ".closureClass") + .intern(); String DEFAULT_CLOSURE_CLASS = FastClosure.class.getName(); @@ -673,7 +677,8 @@ * use the {@link #BLOOM_FILTER}. Otherwise it may be turned off to * realize some (minimal) performance gain. 
*/ - String ONE_ACCESS_PATH = AbstractTripleStore.class.getName() + ".oneAccessPath"; + String ONE_ACCESS_PATH = (AbstractTripleStore.class.getName() + ".oneAccessPath") + .intern(); String DEFAULT_ONE_ACCESS_PATH = "false"; @@ -709,8 +714,9 @@ * which of them would benefit from the SPO bloom filter (TM, * backchainers, SIDs fixed point, etc). */ - String BLOOM_FILTER = AbstractTripleStore.class.getName() + ".bloomFilter"; - + String BLOOM_FILTER = (AbstractTripleStore.class.getName() + ".bloomFilter") + .intern(); + String DEFAULT_BLOOM_FILTER = "true"; /** @@ -727,7 +733,8 @@ * justifications are maintained in a distinct index and are only used * when retracting assertions. */ - String JUSTIFY = AbstractTripleStore.class.getName() + ".justify"; + String JUSTIFY = (AbstractTripleStore.class.getName() + ".justify") + .intern(); String DEFAULT_JUSTIFY = "true"; @@ -776,8 +783,8 @@ * <p> * There are examples for using the provenance mode online. */ - String STATEMENT_IDENTIFIERS = AbstractTripleStore.class.getName() - + ".statementIdentifiers"; + String STATEMENT_IDENTIFIERS = (AbstractTripleStore.class.getName() + ".statementIdentifiers") + .intern(); String DEFAULT_STATEMENT_IDENTIFIERS = "false"; @@ -787,7 +794,8 @@ * {@link #STATEMENT_IDENTIFIERS} option determines whether or not the * provenance mode is enabled. 
*/ - String QUADS = AbstractTripleStore.class.getName() + ".quads"; + String QUADS = (AbstractTripleStore.class.getName() + ".quads") + .intern(); String DEFAULT_QUADS = "false"; @@ -802,8 +810,8 @@ * = <code>false</code></li> * </ul> */ - String TRIPLES_MODE = AbstractTripleStore.class.getName() - + ".triplesMode"; + String TRIPLES_MODE = (AbstractTripleStore.class.getName() + ".triplesMode") + .intern(); String DEFAULT_TRIPLES_MODE = "false"; @@ -818,8 +826,8 @@ * = <code>true</code></li> * </ul> */ - String TRIPLES_MODE_WITH_PROVENANCE = AbstractTripleStore.class.getName() - + ".triplesModeWithProvenance"; + String TRIPLES_MODE_WITH_PROVENANCE = (AbstractTripleStore.class + .getName() + ".triplesModeWithProvenance").intern(); String DEFAULT_TRIPLES_MODE_WITH_PROVENANCE = "false"; @@ -837,8 +845,8 @@ * = <code>com.bigdata.rdf.store.AbstractTripleStore.NoAxioms</code></li> * </ul> */ - String QUADS_MODE = AbstractTripleStore.class.getName() - + ".quadsMode"; + String QUADS_MODE = (AbstractTripleStore.class.getName() + ".quadsMode") + .intern(); String DEFAULT_QUADS_MODE = "false"; @@ -853,8 +861,8 @@ * * @see #DEFAULT_VALUE_FACTORY_CLASS */ - String VALUE_FACTORY_CLASS = AbstractTripleStore.class.getName() - + ".valueFactoryClass"; + String VALUE_FACTORY_CLASS = (AbstractTripleStore.class.getName() + ".valueFactoryClass") + .intern(); String DEFAULT_VALUE_FACTORY_CLASS = BigdataValueFactoryImpl.class .getName(); @@ -872,8 +880,8 @@ * * @see #DEFAULT_TEXT_INDEXER_CLASS */ - String TEXT_INDEXER_CLASS = AbstractTripleStore.class.getName() - + ".textIndexerClass"; + String TEXT_INDEXER_CLASS = (AbstractTripleStore.class.getName() + ".textIndexerClass") + .intern(); String DEFAULT_TEXT_INDEXER_CLASS = BigdataRDFFullTextIndex.class .getName(); @@ -883,8 +891,8 @@ * statement indices rather than using the lexicon to map them to term * identifiers and back. 
*/ - String INLINE_LITERALS = AbstractTripleStore.class.getName() - + ".inlineLiterals"; + String INLINE_LITERALS = (AbstractTripleStore.class.getName() + ".inlineLiterals") + .intern(); String DEFAULT_INLINE_LITERALS = "true"; @@ -895,8 +903,8 @@ * <p> * See {@link Options#STORE_BLANK_NODES}. */ - String INLINE_BNODES = AbstractTripleStore.class.getName() - + ".inlineBNodes"; + String INLINE_BNODES = (AbstractTripleStore.class.getName() + + ".inlineBNodes").intern(); String DEFAULT_INLINE_BNODES = "false"; @@ -911,8 +919,8 @@ * * @see #INLINE_DATE_TIMES_TIMEZONE */ - String INLINE_DATE_TIMES = AbstractTripleStore.class.getName() - + ".inlineDateTimes"; + String INLINE_DATE_TIMES = (AbstractTripleStore.class.getName() + + ".inlineDateTimes").intern(); String DEFAULT_INLINE_DATE_TIMES = "false"; @@ -925,8 +933,8 @@ * * @see #INLINE_DATE_TIMES */ - String INLINE_DATE_TIMES_TIMEZONE = AbstractTripleStore.class.getName() - + ".inlineDateTimesTimezone"; + String INLINE_DATE_TIMES_TIMEZONE = (AbstractTripleStore.class.getName() + + ".inlineDateTimesTimezone").intern(); /** * @see #INLINE_DATE_TIMES_TIMEZONE @@ -944,8 +952,8 @@ * * @see #DEFAULT_EXTENSION_FACTORY_CLASS */ - String EXTENSION_FACTORY_CLASS = AbstractTripleStore.class.getName() - + ".extensionFactoryClass"; + String EXTENSION_FACTORY_CLASS = (AbstractTripleStore.class.getName() + ".extensionFactoryClass") + .intern(); String DEFAULT_EXTENSION_FACTORY_CLASS = DefaultExtensionFactory.class .getName(); @@ -972,8 +980,8 @@ * * @see XXXCShardSplitHandler */ - String CONSTRAIN_XXXC_SHARDS = AbstractTripleStore.class.getName() - + ".constrainXXXCShards"; + String CONSTRAIN_XXXC_SHARDS = (AbstractTripleStore.class.getName() + ".constrainXXXCShards") + .intern(); String DEFAULT_CONSTRAIN_XXXC_SHARDS = "true"; @@ -1326,7 +1334,6 @@ // set property that will let the contained relations locate their container. 
tmp.setProperty(RelationSchema.CONTAINER, getNamespace()); - if (Boolean.valueOf(tmp.getProperty(Options.TEXT_INDEX, Options.DEFAULT_TEXT_INDEX))) { @@ -1437,9 +1444,11 @@ // axioms. map.put(TripleStoreSchema.AXIOMS, axioms); +// setProperty(TripleStoreSchema.AXIOMS,axioms); // vocabulary. map.put(TripleStoreSchema.VOCABULARY, vocab); +// setProperty(TripleStoreSchema.VOCABULARY,vocab); if (lexiconRelation.isTextIndex()) { /* @@ -1548,14 +1557,19 @@ if (axioms == null) { /* - * Extract the de-serialized axiom model from the global row - * store. + * The vocabulary is stored in properties for the triple + * store instance in the global row store. However, we + * pre-materialize those properties so we can directly + * retrieve the vocabulary from the materialized properties. */ - - axioms = (Axioms) getIndexManager().getGlobalRowStore() - .get(RelationSchema.INSTANCE, getNamespace(), - TripleStoreSchema.AXIOMS); + axioms = (Axioms) getBareProperties().get( + TripleStoreSchema.AXIOMS); + +// axioms = (Axioms) getIndexManager().getGlobalRowStore() +// .get(RelationSchema.INSTANCE, getNamespace(), +// TripleStoreSchema.AXIOMS); + if (axioms == null) throw new RuntimeException("No axioms defined? : " + this); @@ -1599,13 +1613,18 @@ if (vocab == null) { /* - * Extract the de-serialized vocabulary from the global row - * store. + * The vocabulary is stored in properties for the triple + * store instance in the global row store. However, we + * pre-materialize those properties so we can directly + * retrieve the vocabulary from the materialized properties. */ - vocab = (Vocabulary) getIndexManager().getGlobalRowStore().get( - RelationSchema.INSTANCE, getNamespace(), + vocab = (Vocabulary) getBareProperties().get( TripleStoreSchema.VOCABULARY); + +// vocab = (Vocabulary) getIndexManager().getGlobalRowStore().get( +// RelationSchema.INSTANCE, getNamespace(), +// TripleStoreSchema.VOCABULARY); if (vocab == null) throw new RuntimeException("No vocabulary defined? 
: " Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/store/LocalTripleStore.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/store/LocalTripleStore.java 2011-03-04 11:06:25 UTC (rev 4272) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/store/LocalTripleStore.java 2011-03-04 15:24:20 UTC (rev 4273) @@ -33,7 +33,6 @@ import com.bigdata.journal.IIndexManager; import com.bigdata.journal.ITx; import com.bigdata.journal.Journal; -import com.bigdata.rdf.spo.SPORelation; import com.bigdata.relation.locator.DefaultResourceLocator; /** @@ -154,47 +153,131 @@ store = (Journal) indexManager; } - + /** * Create or re-open a triple store using a local embedded database. + * <p> + * Note: This is only used by the test suites. */ - /* public */LocalTripleStore(final Properties properties) { + /* public */static LocalTripleStore getInstance(final Properties properties) { - /* - * FIXME This should pass up the existing properties for the KB instance - * when the KB instance is pre-existing. Really though, you should first - * obtain the Journal and then attempt to locate the KB and create it if - * it does not exist. - */ - this(new Journal(properties), "kb"/* namespace */, ITx.UNISOLATED, - properties); - - /* - * FIXME Modify this to use a row scan for the contained relations. - * There is one other place where the same test is being used. The - * reason for this test is that getSPORelation() tries to _locate_ the - * relation, but that will fail if it does not exist. By using the ctor - * and exists() we can test for pre-existence. However, the best route - * is to perform a row scan when the container is created and then we - * can just materialize the existing relations and create them if they - * are not found. - */ - if (!new SPORelation(getIndexManager(), getNamespace() + "." 
- + SPORelation.NAME_SPO_RELATION, getTimestamp(), getProperties()).exists()) { + final String namespace = "kb"; + + // create/re-open journal. + final Journal journal = new Journal(properties); + + try { + + // Check for pre-existing instance. + { + final LocalTripleStore lts = (LocalTripleStore) journal + .getResourceLocator().locate(namespace, ITx.UNISOLATED); + + if (lts != null) { + + return lts; + + } + + } + + // Create a new instance. + { + + final LocalTripleStore lts = new LocalTripleStore( + journal, namespace, ITx.UNISOLATED, properties); + +// if (Boolean.parseBoolean(properties.getProperty( +// BigdataSail.Options.ISOLATABLE_INDICES, +// BigdataSail.Options.DEFAULT_ISOLATABLE_INDICES))) { +// +// final long txCreate = txService.newTx(ITx.UNISOLATED); +// +// final AbstractTripleStore txCreateView = new LocalTripleStore( +// journal, namespace, Long.valueOf(txCreate), properties); +// +// // create the kb instance within the tx. +// txCreateView.create(); +// +// // commit the tx. +// txService.commit(txCreate); +// +// } else { + + lts.create(); + +// } + + } + /* - * If we could not find the SPO relation then presume that this is a - * new KB and create it now. + * Now that we have created the instance locate the triple store + * resource and return it. */ - - create(); + { - } else { - - init(); - - } + final LocalTripleStore lts = (LocalTripleStore) journal + .getResourceLocator().locate(namespace, ITx.UNISOLATED); + if (lts == null) { + + /* + * This should only occur if there is a concurrent destroy, + * which is highly unlikely to say the least. + */ + throw new RuntimeException("Concurrent create/destroy: " + + namespace); + + } + + return lts; + + } + + } catch (Throwable ex) { + + journal.shutdownNow(); + + throw new RuntimeException(ex); + + } + +// /* +// * FIXME This should pass up the existing properties for the KB instance +// * when the KB instance is pre-existing. 
Really though, you should first +// * obtain the Journal and then attempt to locate the KB and create it if +// * it does not exist. +// */ +// this(new Journal(properties), "kb"/* namespace */, ITx.UNISOLATED, +// properties); +// +// /* +// * FIXME Modify this to use a row scan for the contained relations. +// * There is one other place where the same test is being used. The +// * reason for this test is that getSPORelation() tries to _locate_ the +// * relation, but that will fail if it does not exist. By using the ctor +// * and exists() we can test for pre-existence. However, the best route +// * is to perform a row scan when the container is created and then we +// * can just materialize the existing relations and create them if they +// * are not found. +// */ +// if (!new SPORelation(getIndexManager(), getNamespace() + "." +// + SPORelation.NAME_SPO_RELATION, getTimestamp(), getProperties()).exists()) { +// +// /* +// * If we could not find the SPO relation then presume that this is a +// * new KB and create it now. +// */ +// +// create(); +// +// } else { +// +// init(); +// +// } + } /** Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/lexicon/TestVocabulary.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/lexicon/TestVocabulary.java 2011-03-04 11:06:25 UTC (rev 4272) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/lexicon/TestVocabulary.java 2011-03-04 15:24:20 UTC (rev 4273) @@ -160,7 +160,7 @@ public void test_RdfsVocabulary() { - Properties properties = getProperties(); + final Properties properties = getProperties(); // override the default. 
properties.setProperty(Options.VOCABULARY_CLASS, RDFSVocabulary.class Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/rules/TestTruthMaintenance.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/rules/TestTruthMaintenance.java 2011-03-04 11:06:25 UTC (rev 4272) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/rules/TestTruthMaintenance.java 2011-03-04 15:24:20 UTC (rev 4273) @@ -723,7 +723,7 @@ // add two { - StatementBuffer assertionBuffer = new StatementBuffer(tm + final StatementBuffer assertionBuffer = new StatementBuffer(tm .newTempTripleStore(), store, 100/* capacity */); assertionBuffer.add(a, sco, b ); @@ -746,7 +746,7 @@ // retract one { - StatementBuffer retractionBuffer = new StatementBuffer(tm + final StatementBuffer retractionBuffer = new StatementBuffer(tm .newTempTripleStore(), store, 100/* capacity */); retractionBuffer.add(b, sco, c); @@ -759,7 +759,7 @@ if (log.isInfoEnabled()) log.info("\ndump after retraction and re-closure:\n" - + store.dumpStore(true,true,false)); + + store.dumpStore(true, true, false)); } @@ -770,17 +770,18 @@ */ { - TempTripleStore controlStore = new TempTripleStore(store + final TempTripleStore controlStore = new TempTripleStore(store .getProperties()); // Note: maintains closure on the controlStore. 
- TruthMaintenance tmControlStore = new TruthMaintenance( + final TruthMaintenance tmControlStore = new TruthMaintenance( controlStore.getInferenceEngine()); try { - StatementBuffer assertionBuffer = new StatementBuffer( - tmControlStore.newTempTripleStore(), controlStore, 100/* capacity */); + final StatementBuffer assertionBuffer = new StatementBuffer( + tmControlStore.newTempTripleStore(), controlStore, + 100/* capacity */); assertionBuffer.add(a, sco, b); // assertionBuffer.add(c, sco, d ); Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/store/TestLocalQuadStore.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/store/TestLocalQuadStore.java 2011-03-04 11:06:25 UTC (rev 4272) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/store/TestLocalQuadStore.java 2011-03-04 15:24:20 UTC (rev 4273) @@ -114,7 +114,7 @@ protected AbstractTripleStore getStore(final Properties properties) { - return new LocalTripleStore(properties); + return LocalTripleStore.getInstance(properties); } @@ -157,7 +157,7 @@ // Set the file property explicitly. 
properties.setProperty(Options.FILE, file.toString()); - return new LocalTripleStore(properties); + return LocalTripleStore.getInstance(properties); } Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/store/TestLocalTripleStore.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/store/TestLocalTripleStore.java 2011-03-04 11:06:25 UTC (rev 4272) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/store/TestLocalTripleStore.java 2011-03-04 15:24:20 UTC (rev 4273) @@ -121,7 +121,7 @@ protected AbstractTripleStore getStore(final Properties properties) { - return new LocalTripleStore( properties ); + return LocalTripleStore.getInstance( properties ); } @@ -164,7 +164,7 @@ // Set the file property explicitly. properties.setProperty(Options.FILE, file.toString()); - return new LocalTripleStore(properties); + return LocalTripleStore.getInstance(properties); } Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/store/TestLocalTripleStoreWithoutInlining.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/store/TestLocalTripleStoreWithoutInlining.java 2011-03-04 11:06:25 UTC (rev 4272) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/store/TestLocalTripleStoreWithoutInlining.java 2011-03-04 15:24:20 UTC (rev 4273) @@ -126,7 +126,7 @@ protected AbstractTripleStore getStore(final Properties properties) { - return new LocalTripleStore( properties ); + return LocalTripleStore.getInstance( properties ); } @@ -169,7 +169,7 @@ // Set the file property explicitly. 
properties.setProperty(Options.FILE, file.toString()); - return new LocalTripleStore(properties); + return LocalTripleStore.getInstance(properties); } Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/store/TestLocalTripleStoreWithoutStatementIdentifiers.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/store/TestLocalTripleStoreWithoutStatementIdentifiers.java 2011-03-04 11:06:25 UTC (rev 4272) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/store/TestLocalTripleStoreWithoutStatementIdentifiers.java 2011-03-04 15:24:20 UTC (rev 4273) @@ -116,7 +116,7 @@ protected AbstractTripleStore getStore(Properties properties) { - return new LocalTripleStore(properties); + return LocalTripleStore.getInstance(properties); } @@ -159,7 +159,7 @@ // Set the file property explicitly. properties.setProperty(Options.FILE, file.toString()); - return new LocalTripleStore(properties); + return LocalTripleStore.getInstance(properties); } Modified: branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataSail.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataSail.java 2011-03-04 11:06:25 UTC (rev 4272) +++ branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataSail.java 2011-03-04 15:24:20 UTC (rev 4273) @@ -93,7 +93,6 @@ import org.openrdf.query.algebra.TupleExpr; import org.openrdf.query.algebra.ValueConstant; import org.openrdf.query.algebra.Var; -import org.openrdf.query.algebra.evaluation.EvaluationStrategy; import org.openrdf.query.algebra.evaluation.impl.BindingAssigner; import org.openrdf.query.algebra.evaluation.impl.CompareOptimizer; import org.openrdf.query.algebra.evaluation.impl.ConjunctiveConstraintSplitter; @@ -674,8 +673,21 @@ closeOnShutdown = true; } - - private static LocalTripleStore 
createLTS(Properties properties) { + + /** + * If the {@link LocalTripleStore} with the appropriate namespace exists, + * then return it. Otherwise, create the {@link LocalTripleStore}. When the + * properties indicate that full transactional isolation should be + * supported, a new {@link LocalTripleStore} will be created within a + * transaction in order to ensure that it uses isolatable indices. Otherwise + * it is created using the {@link ITx#UNISOLATED} connection. + * + * @param properties + * The properties. + * + * @return The {@link LocalTripleStore}. + */ + private static LocalTripleStore createLTS(final Properties properties) { final Journal journal = new Journal(properties); @@ -689,23 +701,38 @@ // throws an exception if there are inconsistent properties checkProperties(properties); - final LocalTripleStore lts = new LocalTripleStore( - journal, namespace, ITx.UNISOLATED, properties); - try { - final long tx0 = txService.newTx(ITx.READ_COMMITTED); +// final boolean create; +// final long tx0 = txService.newTx(ITx.READ_COMMITTED); +// try { +// // verify kb does not exist (can not be located). +// create = journal.getResourceLocator().locate(namespace, tx0) == null; +// } finally { +// txService.abort(tx0); +// } + + // Check for pre-existing instance. + { - // verify kb does not exist (can not be located). - final boolean create = - journal.getResourceLocator().locate(namespace, tx0) == null; + final LocalTripleStore lts = (LocalTripleStore) journal + .getResourceLocator().locate(namespace, ITx.UNISOLATED); - txService.abort(tx0); + if (lts != null) { + + return lts; + + } + + } -// if (!new SPORelation(journal, namespace + "." -// + SPORelation.NAME_SPO_RELATION, ITx.UNISOLATED, properties).exists()) { - if (create) { + // Create a new instance. 
+// if (create) + { + final LocalTripleStore lts = new LocalTripleStore( + journal, namespace, ITx.UNISOLATED, properties); + if (Boolean.parseBoolean(properties.getProperty( BigdataSail.Options.ISOLATABLE_INDICES, BigdataSail.Options.DEFAULT_ISOLATABLE_INDICES))) { @@ -728,6 +755,31 @@ } } + + /* + * Now that we have created the instance, either using a tx or the + * unisolated connection, locate the triple store resource and + * return it. + */ + { + + final LocalTripleStore lts = (LocalTripleStore) journal + .getResourceLocator().locate(namespace, ITx.UNISOLATED); + + if (lts == null) { + + /* + * This should only occur if there is a concurrent destroy, + * which is highly unlikely to say the least. + */ + throw new RuntimeException("Concurrent create/destroy: " + + namespace); + + } + + return lts; + + } } catch (IOException ex) { @@ -735,8 +787,6 @@ } - return lts; - } private static void checkProperties(Properties properties) This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
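The commit above replaces LocalTripleStore's constructor-plus-exists probe with a locate/create/re-locate sequence: check the resource locator for a pre-existing instance, create one if absent, then locate again and fail if a concurrent destroy raced the create. A minimal standalone sketch of that control flow (the map-backed locator and class names here are illustrative stand-ins, not the bigdata API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class LocateOrCreate {

    /** Stand-in for the journal's resource locator. */
    static final Map<String, Object> locator = new ConcurrentHashMap<>();

    /**
     * Locate the named resource, creating it if absent. The second locate
     * after the create guards against a concurrent destroy, mirroring the
     * "Concurrent create/destroy" RuntimeException in the diff above.
     */
    static Object getInstance(String namespace) {
        // Check for a pre-existing instance.
        Object existing = locator.get(namespace);
        if (existing != null) {
            return existing;
        }
        // Create a new instance (putIfAbsent keeps the winner on a race).
        locator.putIfAbsent(namespace, new Object());
        // Re-locate; fail loudly if a concurrent destroy removed it.
        Object located = locator.get(namespace);
        if (located == null) {
            throw new RuntimeException("Concurrent create/destroy: " + namespace);
        }
        return located;
    }

    public static void main(String[] args) {
        Object a = getInstance("kb");
        Object b = getInstance("kb");
        System.out.println(a == b); // true: the second call locates the existing instance
    }
}
```

Note the diff's catch block shuts the journal down on any failure; the sketch omits that cleanup concern.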
From: <tho...@us...> - 2011-03-04 11:06:31
|
Revision: 4272 http://bigdata.svn.sourceforge.net/bigdata/?rev=4272&view=rev Author: thompsonbry Date: 2011-03-04 11:06:25 +0000 (Fri, 04 Mar 2011) Log Message: ----------- Added AbstractResource#getCommitTime() which reports the commit time from which a resource was materialized for a read-only resource. This is a workaround for https://sourceforge.net/apps/trac/bigdata/ticket/266 (thin tx interface) in support of https://sourceforge.net/apps/trac/bigdata/ticket/222 (improved lexicon term cache sharing). Modified the LexiconRelation to share the term cache for read-only instances materialized from the same commitTime. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractResource.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/RelationSchema.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/DefaultResourceLocator.java branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/relation/locator/TestDefaultResourceLocator.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/LexiconRelation.java Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractResource.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractResource.java 2011-03-03 21:45:15 UTC (rev 4271) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractResource.java 2011-03-04 11:06:25 UTC (rev 4272) @@ -83,6 +83,7 @@ final private String containerNamespace; final private long timestamp; + final private Long commitTime; final private Properties properties; @@ -181,23 +182,6 @@ } -// /** -// * When <code>true</code> the {@link NestedSubqueryWithJoinThreadsTask} is -// * applied. Otherwise the {@link JoinMasterTask} is applied. 
-// * -// * @see Options#NESTED_SUBQUERY -// * -// * @deprecated by {@link BOp} annotations and the pipeline join, which -// * always does better than the older nested subquery evaluation -// * logic. -// */ -// public boolean isNestedSubquery() { -// -//// return false; -// return nestedSubquery; -// -// } - /** * Options for locatable resources. * @@ -376,43 +360,7 @@ * @deprecated by {@link BOp} annotations. */ String DEFAULT_MAX_PARALLEL_SUBQUERIES = "5"; - -// /** -// * Boolean option controls the JOIN evaluation strategy. When -// * <code>true</code>, {@link NestedSubqueryWithJoinThreadsTask} is used -// * to compute joins. When <code>false</code>, {@link JoinMasterTask} is -// * used instead (aka pipeline joins). -// * <p> -// * Note: The default depends on the deployment mode. Nested subquery -// * joins are somewhat faster for local data (temporary stores, journals, -// * and a federation that does not support scale-out). However, pipeline -// * joins are MUCH faster for scale-out so they are used by default -// * whenever {@link IBigdataFederation#isScaleOut()} reports -// * <code>true</code>. -// * <p> -// * Note: Cold query performance for complex high volume queries appears -// * to be better for the pipeline join, so it may make sense to use the -// * pipeline join even for local data. -// * -// * @deprecated The {@link NestedSubqueryWithJoinThreadsTask} is much -// * slower than the pipeline join algorithm, even for a -// * single machine. -// */ -// String NESTED_SUBQUERY = DefaultRuleTaskFactory.class.getName() -// + ".nestedSubquery"; -// /** -// * @todo option to specify the class that will serve as the -// * {@link IRuleTaskFactory} - basically, this is how you choose -// * the join strategy. however, {@link DefaultRuleTaskFactory} -// * needs to be refactored in order to make this choice by -// * {@link Class} rather than by the object's state. Also note -// * that the pipeline join may be better off with maxParallel=0. 
-// */ -// String RULE_TASK_FACTORY_CLASS = "ruleTaskFactoryClass"; -// -// String DEFAULT_RULE_TASK_FACTORY_CLASS = DefaultRuleTaskFactory.class.getName(); - } /** @@ -440,13 +388,8 @@ // Note: Bound before we lookup property values! this.namespace = namespace; - { - String val = properties.getProperty(RelationSchema.CONTAINER); + this.containerNamespace = properties.getProperty(RelationSchema.CONTAINER); - this.containerNamespace = val; - - } - this.timestamp = timestamp; this.properties = properties; @@ -463,6 +406,18 @@ } + /* + * Resolve the commit time from which this view was materialized (if + * known) + */ + { + + final String val = getProperty(RelationSchema.COMMIT_TIME, null/* default */); + + commitTime = val == null ? null : Long.valueOf(val); + + } + forceSerialExecution = Boolean.parseBoolean(getProperty( Options.FORCE_SERIAL_EXECUTION, Options.DEFAULT_FORCE_SERIAL_EXECUTION)); @@ -471,17 +426,6 @@ Options.DEFAULT_MAX_PARALLEL_SUBQUERIES, IntegerValidator.GTE_ZERO); - /* - * Note: The pipeline join is flat out better all around. - */ -// final boolean pipelineIsBetter = (indexManager instanceof IBigdataFederation && ((IBigdataFederation) indexManager) -// .isScaleOut()); -// -// nestedSubquery = Boolean.parseBoolean(getProperty( -// Options.NESTED_SUBQUERY, "false")); -// Boolean -// .toString(!pipelineIsBetter))); - chunkOfChunksCapacity = getProperty(Options.CHUNK_OF_CHUNKS_CAPACITY, Options.DEFAULT_CHUNK_OF_CHUNKS_CAPACITY, IntegerValidator.GT_ZERO); @@ -557,6 +501,18 @@ } /** + * The commit time from which a read-only view was materialized (if known) + * and otherwise <code>null</code>. + * + * @see https://sourceforge.net/apps/trac/bigdata/ticket/266 + */ + protected Long getCommitTime() { + + return commitTime; + + } + + /** * Return an object wrapping the properties specified to the ctor. 
*/ public final Properties getProperties() { Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/RelationSchema.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/RelationSchema.java 2011-03-03 21:45:15 UTC (rev 4271) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/RelationSchema.java 2011-03-04 11:06:25 UTC (rev 4272) @@ -1,5 +1,6 @@ package com.bigdata.relation; +import com.bigdata.relation.locator.DefaultResourceLocator; import com.bigdata.relation.locator.ILocatableResource; import com.bigdata.sparse.KeyType; import com.bigdata.sparse.Schema; @@ -53,6 +54,20 @@ + ".container"; /** + * A dynamically injected property which can reveal the commit time from + * which a locatable resource was materialized. + * + * @see DefaultResourceLocator + * @see AbstractResource + * @see https://sourceforge.net/apps/trac/bigdata/ticket/266 + * + * TODO This is a workaround for and should be removed when we replace + * the native long tx identifier with a thin interface. + */ + public static final String COMMIT_TIME = (RelationSchema.class.getPackage() + .getName() + ".commitTime").intern(); + + /** * A shared instance. */ public transient static final RelationSchema INSTANCE = new RelationSchema(); Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/DefaultResourceLocator.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/DefaultResourceLocator.java 2011-03-03 21:45:15 UTC (rev 4271) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/DefaultResourceLocator.java 2011-03-04 11:06:25 UTC (rev 4272) @@ -552,6 +552,7 @@ * across resource views backed by the same commit point (and also avoid * unnecessary GRS reads). 
*/ + Long commitTime2 = null; final Map<String, Object> map; if (TimestampUtility.isReadOnly(timestamp) && !TimestampUtility.isReadCommitted(timestamp) @@ -569,6 +570,9 @@ // find the timestamp associated with that commit record. final long commitTime = commitRecord.getTimestamp(); + // Save commitTime to stuff into the properties. + commitTime2 = commitTime; + // Check the cache before materializing the properties from the // GRS. final Map<String, Object> cachedMap = propertyCache.get(new NT( @@ -653,6 +657,17 @@ properties.putAll(map); + if (commitTime2 != null) { + + /* + * Make the commit time against which we are reading accessible to + * the locatable resource. + */ + properties.setProperty(RelationSchema.COMMIT_TIME, commitTime2 + .toString()); + + } + if (log.isTraceEnabled()) { log.trace("Read properties: indexManager=" + indexManager Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/relation/locator/TestDefaultResourceLocator.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/relation/locator/TestDefaultResourceLocator.java 2011-03-03 21:45:15 UTC (rev 4271) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/relation/locator/TestDefaultResourceLocator.java 2011-03-04 11:06:25 UTC (rev 4272) @@ -28,11 +28,9 @@ package com.bigdata.relation.locator; -import java.lang.ref.WeakReference; import java.nio.ByteBuffer; import java.util.Iterator; import java.util.List; -import java.util.Map; import java.util.Properties; import java.util.Set; import java.util.UUID; @@ -389,6 +387,16 @@ assertEquals(tx1, view_tx1.getTimestamp()); assertEquals(tx2, view_tx2.getTimestamp()); + /* + * Read-only views report the commit time from which they were + * materialized. 
+ */ + assertEquals(null, ((MockRelation) view_un).getCommitTime()); + assertEquals(Long.valueOf(lastCommitTime), + ((MockRelation) view_tx1).getCommitTime()); + assertEquals(Long.valueOf(lastCommitTime), + ((MockRelation) view_tx2).getCommitTime()); + // each view has its own Properties object. final Properties p_un = view_un.getProperties(); final Properties p_tx1 = view_tx1.getProperties(); @@ -460,6 +468,14 @@ private IIndex ndx; /** + * Exposed to the unit tests. + */ + @Override + public Long getCommitTime() { + return super.getCommitTime(); + } + + /** * @param indexManager * @param namespace * @param timestamp Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/LexiconRelation.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/LexiconRelation.java 2011-03-03 21:45:15 UTC (rev 4271) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/LexiconRelation.java 2011-03-04 11:06:25 UTC (rev 4272) @@ -81,6 +81,7 @@ import com.bigdata.journal.IIndexManager; import com.bigdata.journal.IResourceLock; import com.bigdata.journal.ITx; +import com.bigdata.journal.TimestampUtility; import com.bigdata.rawstore.Bytes; import com.bigdata.rdf.internal.IDatatypeURIResolver; import com.bigdata.rdf.internal.IExtensionFactory; @@ -114,6 +115,8 @@ import com.bigdata.striterator.ChunkedArrayIterator; import com.bigdata.striterator.IChunkedOrderedIterator; import com.bigdata.striterator.IKeyOrder; +import com.bigdata.util.CanonicalFactory; +import com.bigdata.util.NT; import cutthecrap.utils.striterators.Resolver; import cutthecrap.utils.striterators.Striterator; @@ -383,12 +386,37 @@ AbstractTripleStore.Options.TERM_CACHE_CAPACITY, AbstractTripleStore.Options.DEFAULT_TERM_CACHE_CAPACITY)); - termCache = new ConcurrentWeakValueCacheWithBatchedUpdates<IV, BigdataValue>(// - termCacheCapacity, // queueCapacity - .75f, // loadFactor 
(.75 is the default) - 16 // concurrency level (16 is the default) - ); + final Long commitTime = getCommitTime(); + if (commitTime != null && TimestampUtility.isReadOnly(timestamp)) { + + /* + * Shared for read-only views from sample commit time. Sharing + * allows us to reuse the same instances of the term cache for + * queries reading from the same commit point. The cache size is + * automatically increased to take advantage of the fact that it + * is a shared resource. + * + * Note: Sharing is limited to the same commit time to prevent + * life cycle issues across drop/create sequences for the triple + * store. + */ + termCache = termCacheFactory.getInstance(new NT(namespace, + commitTime.longValue()), termCacheCapacity * 2); + + } else { + + /* + * Unshared for any other view of the triple store. + */ + termCache = new ConcurrentWeakValueCacheWithBatchedUpdates<IV, BigdataValue>(// + termCacheCapacity, // queueCapacity + .75f, // loadFactor (.75 is the default) + 16 // concurrency level (16 is the default) + ); + + } + } { @@ -560,13 +588,9 @@ } // discard the value factory for the lexicon's namespace. - this.valueFactory.remove(getNamespace()); + valueFactory.remove(getNamespace()); - if (termCache != null) { - - termCache.clear(); - - } + termCache.clear(); } finally { @@ -1792,9 +1816,7 @@ if (log.isInfoEnabled()) log.info("nterms=" + n + ", numNotFound=" + numNotFound - + (termCache!=null?(", cacheSize=" + termCache.size() + "\n" -// + termCache.getStatistics() - ): "")); + + ", cacheSize=" + termCache.size()); /* * sort term identifiers into index order. @@ -2010,27 +2032,12 @@ // Set the term identifier. 
value.setIV(tid); - if (termCache != null) { + final BigdataValue tmp = termCache.putIfAbsent(tid, value); -// synchronized (termCache) { -// -// if (termCache.get(id) == null) { -// -// termCache.put(id, value, false/* dirty */); -// -// } -// -// } - - final BigdataValue tmp = termCache.putIfAbsent(tid, - value); + if (tmp != null) { - if (tmp != null) { + value = tmp; - value = tmp; - - } - } /* @@ -2237,9 +2244,25 @@ * Or perhaps this can be rolled into the {@link ValueFactory} impl * along with the reverse bnodes mapping? */ - private ConcurrentWeakValueCacheWithBatchedUpdates<IV, BigdataValue> termCache; + private ConcurrentWeakValueCacheWithBatchedUpdates<IV, BigdataValue> termCache; /** + * Factory used for {@link #termCache} for read-only views of the lexicon. + */ + static private CanonicalFactory<NT/* key */, ConcurrentWeakValueCacheWithBatchedUpdates<IV, BigdataValue>, Integer/* state */> termCacheFactory = new CanonicalFactory<NT, ConcurrentWeakValueCacheWithBatchedUpdates<IV, BigdataValue>, Integer>( + 1/* queueCapacity */) { + @Override + protected ConcurrentWeakValueCacheWithBatchedUpdates<IV, BigdataValue> newInstance( + NT key, Integer termCacheCapacity) { + return new ConcurrentWeakValueCacheWithBatchedUpdates<IV, BigdataValue>(// + termCacheCapacity.intValue(), // queueCapacity + .75f, // loadFactor (.75 is the default) + 16 // concurrency level (16 is the default) + ); + } + }; + + /** * The {@link ILexiconConfiguration} instance, which will determine how * terms are encoded and decoded in the key space. */ @@ -2337,16 +2360,9 @@ } - // test the term cache. - if (termCache != null) { + // test the term cache, passing IV from caller as the cache key. + return termCache.get(tid); - // Note: passing the IV from the caller as the cache key. - return termCache.get(tid); - - } - - return null; - } /** @@ -2403,34 +2419,13 @@ // This sets the term identifier. value.setIV(iv); - if (termCache != null) { + // Note: passing the IV object as the key. 
+ final BigdataValue tmp = termCache.putIfAbsent(iv, value); -// synchronized (termCache) { -// -// /* -// * Note: This code block is synchronized to address a possible race -// * condition where concurrent threads resolve the term against the -// * database. It both threads attempt to insert their resolved term -// * definitions, which are distinct objects, into the cache then one -// * will get an IllegalStateException since the other's object will -// * already be in the cache. -// */ -// -// if (termCache.get(id) == null) { -// -// termCache.put(id, value, false/* dirty */); -// -// } - - // Note: passing the IV object as the key. - final BigdataValue tmp = termCache.putIfAbsent(iv, value); + if (tmp != null) { - if (tmp != null) { + value = tmp; - value = tmp; - - } - } assert value.getIV() == iv : "expecting iv=" + iv + ", but found " @@ -2530,7 +2525,7 @@ * do not replace the entry if there is one already there. */ - if (termCache != null && impl.getValueFactory() == valueFactory) { + if (impl.getValueFactory() == valueFactory) { if (storeBlankNodes || !tid.isBNode()) { This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
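The LexiconRelation change in revision 4272 shares a single term cache across all read-only views materialized from the same commit point by canonicalizing on a (namespace, commitTime) key. The essential pattern is a factory over a concurrent map; a simplified sketch (the class name and the plain-map cache are illustrative assumptions — the actual code uses bigdata's CanonicalFactory, an NT composite key, and a weak-value cache):

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class TermCacheFactory {

    // One cache per (namespace, commitTime) pair; Map.Entry serves as a
    // simple composite key in place of bigdata's NT class.
    private final Map<Map.Entry<String, Long>, Map<Long, String>> caches =
            new ConcurrentHashMap<>();

    /**
     * Return the shared cache for (namespace, commitTime), creating it
     * atomically if absent. Read-only views reading from the same commit
     * point therefore see one cache instance; unisolated or read-write
     * views should bypass this factory and allocate a private cache, as
     * the diff above does.
     */
    Map<Long, String> getInstance(String namespace, long commitTime) {
        return caches.computeIfAbsent(
                new SimpleImmutableEntry<>(namespace, commitTime),
                k -> new ConcurrentHashMap<>());
    }

    public static void main(String[] args) {
        TermCacheFactory f = new TermCacheFactory();
        boolean shared = f.getInstance("kb", 100L) == f.getInstance("kb", 100L);
        boolean distinct = f.getInstance("kb", 100L) != f.getInstance("kb", 200L);
        System.out.println(shared && distinct); // true
    }
}
```

Keying on the commit time rather than the namespace alone is what prevents life-cycle issues across drop/create sequences for the triple store.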
From: <tho...@us...> - 2011-03-03 21:45:22
Revision: 4271 http://bigdata.svn.sourceforge.net/bigdata/?rev=4271&view=rev Author: thompsonbry Date: 2011-03-03 21:45:15 +0000 (Thu, 03 Mar 2011) Log Message: ----------- Disabling the RangeBOp transform in the committed version. This was accidentally enabled in a previous commit, but the RangeBOp transformation is not yet fully integrated. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataEvaluationStrategyImpl3.java Modified: branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataEvaluationStrategyImpl3.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataEvaluationStrategyImpl3.java 2011-03-03 21:36:04 UTC (rev 4270) +++ branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataEvaluationStrategyImpl3.java 2011-03-03 21:45:15 UTC (rev 4271) @@ -814,7 +814,7 @@ */ attachNamedGraphsFilterToSearches(sopTree); - if (true) { + if (false) { /* * Look for numerical filters that can be rotated inside predicates */
From: <tho...@us...> - 2011-03-03 21:36:10
Revision: 4270 http://bigdata.svn.sourceforge.net/bigdata/?rev=4270&view=rev Author: thompsonbry Date: 2011-03-03 21:36:04 +0000 (Thu, 03 Mar 2011) Log Message: ----------- Javadoc edits and notes on future direction for the RTO. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JGraph.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/AbstractJoinGraphTestCase.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java 2011-03-03 21:32:03 UTC (rev 4269) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java 2011-03-03 21:36:04 UTC (rev 4270) @@ -1041,6 +1041,8 @@ PipelineOp lastOp = null; + final Set<IVariable<?>> knownBound = new LinkedHashSet<IVariable<?>>(); + for (int i = 0; i < preds.length; i++) { // The next vertex in the selected join order. @@ -1065,11 +1067,49 @@ assignedConstraints[i])); } + // collect variables used as arguments by this predicate. + final Set<IVariable<?>> pvars = new LinkedHashSet<IVariable<?>>(); + { + final Iterator<IVariable<?>> vitr = BOpUtility + .getArgumentVariables(p); + while (vitr.hasNext()) { + pvars.add(vitr.next()); + } + } + + // figure out if there are ANY shared variables. 
+ boolean shared = false; + { + for(IVariable<?> v : pvars) { + if(knownBound.contains(v)) { + shared = true; + break; + } + } + } + + /* + * FIXME Explore the merit of this optimization with MikeP, + * including consideration of the PIPELINE_QUEUE_CAPACITY and + * whether or not to request an analytic join (hash join). + */ + if (false && !shared) { + System.err.println("Full cross product join: " + p); + /* + * Force at-once evaluation to ensure that we evaluate the AP + * for [p] exactly once. + */ + anns.add(new NV(PipelineOp.Annotations.PIPELINED, false)); + } + final PipelineJoin<?> joinOp = new PipelineJoin(// lastOp == null ? new BOp[0] : new BOp[] { lastOp }, // anns.toArray(new NV[anns.size()])// ); + // Add predicate argument variables to [knownBound]. + knownBound.addAll(pvars); + lastOp = joinOp; } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JGraph.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JGraph.java 2011-03-03 21:32:03 UTC (rev 4269) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JGraph.java 2011-03-03 21:36:04 UTC (rev 4270) @@ -423,6 +423,14 @@ * #of paths with cardinality estimate underflow to jump up and down * due to the sample which is making its way through each path in * each round. + * + * TODO The RTO needs an escape hatch here. FOr example, if the sum + * of the expected IOs for some path(s) strongly dominates all other + * paths sharing the same vertices, then we should prune those paths + * even if there is a cardinality estimate underflow in those paths. + * This will allow us to focus our efforts on those paths having + * less IO cost while we seek cardinality estimates which do not + * underflow. 
*/ int nunderflow; Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/AbstractJoinGraphTestCase.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/AbstractJoinGraphTestCase.java 2011-03-03 21:32:03 UTC (rev 4269) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/AbstractJoinGraphTestCase.java 2011-03-03 21:36:04 UTC (rev 4270) @@ -493,6 +493,16 @@ + ", elapsed=" + q.getElapsed() + ", nout=" + nout + ", nchunks=" + nchunks + ", stats=" + stats); +// if(false) { +// +// final StringBuilder sb = new StringBuilder(); +// +// QueryLog.log(true/* tableHeader */, q, sb); +// +// System.err.println(sb); +// +// } + return stats; } Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java 2011-03-03 21:32:03 UTC (rev 4269) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java 2011-03-03 21:36:04 UTC (rev 4270) @@ -534,7 +534,8 @@ if(false){ // Run some fixed order. // final IPredicate<?>[] path = { p5, p6, p0, p2, p1, p4, p3 }; - final IPredicate<?>[] path = { p5, p3, p1, p2, p4, p6, p0 }; +// final IPredicate<?>[] path = { p5, p3, p1, p2, p4, p6, p0 }; + final IPredicate<?>[] path = { p3, p5, p1, p2, p6, p4, p0 }; runQuery("FIXED ORDER", queryEngine, distinct, selected, path, constraints); } This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
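The r4270 patch above tracks the variables already bound by earlier joins (`knownBound`) and flags any predicate sharing none of them, since such a join degenerates into a full cross product and is a candidate for at-once (non-pipelined) evaluation. The following is a minimal sketch of that detection pattern; the names (`findCrossProductJoins`, variable names as strings) are illustrative, not the actual bigdata API:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch of the [knownBound] bookkeeping from r4270: walk the predicates in
// their selected join order and report each predicate that shares no variable
// with any earlier predicate (a full cross product join).
class CrossProductCheck {

    public static List<Integer> findCrossProductJoins(List<Set<String>> predVars) {
        final Set<String> knownBound = new LinkedHashSet<>();
        final List<Integer> crossProducts = new ArrayList<>();
        for (int i = 0; i < predVars.size(); i++) {
            final Set<String> pvars = predVars.get(i);
            // figure out if there are ANY shared variables.
            boolean shared = false;
            for (String v : pvars) {
                if (knownBound.contains(v)) {
                    shared = true;
                    break;
                }
            }
            if (i > 0 && !shared) {
                // Candidate for at-once evaluation so the access path for
                // this predicate is read exactly once.
                crossProducts.add(i);
            }
            // These variables are bound for all downstream joins.
            knownBound.addAll(pvars);
        }
        return crossProducts;
    }

    public static void main(String[] args) {
        // (?x ?y) join (?y ?z) join (?a ?b): the third join shares nothing.
        final List<Set<String>> preds = List.of(
                Set.of("x", "y"), Set.of("y", "z"), Set.of("a", "b"));
        System.out.println(findCrossProductJoins(preds)); // prints [2]
    }
}
```

The real patch only gathers this information; the `if (false && !shared)` guard shows the at-once evaluation step was left disabled pending review.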
From: <tho...@us...> - 2011-03-03 21:32:09
|
Revision: 4269 http://bigdata.svn.sourceforge.net/bigdata/?rev=4269&view=rev Author: thompsonbry Date: 2011-03-03 21:32:03 +0000 (Thu, 03 Mar 2011) Log Message: ----------- QueryLog is now willing to return a formatted StringBuilder so you can log to the console, etc. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/engine/QueryLog.java Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/engine/QueryLog.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/engine/QueryLog.java 2011-03-03 21:31:26 UTC (rev 4268) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/engine/QueryLog.java 2011-03-03 21:32:03 UTC (rev 4269) @@ -75,12 +75,16 @@ static public void log(final IRunningQuery q) { if (log.isInfoEnabled()) { - + try { - logDetailRows(q); + final StringBuilder sb = new StringBuilder(1024); + + logDetailRows(q,sb); - logSummaryRow(q); + logSummaryRow(q,sb); + + log.info(sb); } catch (RuntimeException t) { @@ -93,9 +97,33 @@ } /** + * Log the query. + * + * @param q + * The query. + * @param sb + * Where to write the log message. + */ + static public void log(final boolean includeTableHeader, + final IRunningQuery q, final StringBuilder sb) { + + if(includeTableHeader) { + + sb.append(getTableHeader()); + + } + + logDetailRows(q, sb); + + logSummaryRow(q, sb); + + } + + /** * Log a detail row for each operator in the query. 
 */ - static private void logDetailRows(final IRunningQuery q) { + static private void logDetailRows(final IRunningQuery q, + final StringBuilder sb) { final Integer[] order = BOpUtility.getEvaluationOrder(q.getQuery()); @@ -103,8 +131,10 @@ for (Integer bopId : order) { - log.info(getTableRow(q, orderIndex, bopId, false/* summary */)); + sb.append(getTableRow(q, orderIndex, bopId, false/* summary */)); +// sb.append('\n'); + orderIndex++; } @@ -114,9 +144,11 @@ /** * Log a summary row for the query. */ - static private void logSummaryRow(final IRunningQuery q) { + static private void logSummaryRow(final IRunningQuery q, final StringBuilder sb) { - log.info(getTableRow(q, -1/* orderIndex */, q.getQuery().getId(), true/* summary */)); + sb.append(getTableRow(q, -1/* orderIndex */, q.getQuery().getId(), true/* summary */)); + +// sb.append('\n'); } |
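The r4269 refactoring above changes the `QueryLog` helpers to append into a caller-supplied `StringBuilder` instead of calling `log.info(...)` directly, so one formatting path can serve the logger, the console, or a test. A minimal sketch of that pattern follows; the class and method names are hypothetical stand-ins, not the actual `QueryLog` API:

```java
// Sketch of the "format into a StringBuilder, let the caller choose the
// sink" pattern from r4269. Detail rows and the summary row are appended
// exactly once each, with an optional table header.
class QueryTableFormat {

    static void logDetailRows(String[] rows, StringBuilder sb) {
        for (String row : rows)
            sb.append(row).append('\n');
    }

    static void logSummaryRow(String summary, StringBuilder sb) {
        sb.append(summary).append('\n');
    }

    public static String format(boolean includeTableHeader, String[] rows,
            String summary) {
        final StringBuilder sb = new StringBuilder(1024);
        if (includeTableHeader)
            sb.append("op\tstats\n"); // hypothetical header row
        logDetailRows(rows, sb);
        logSummaryRow(summary, sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // The caller decides where the formatted table goes, e.g. the console:
        System.err.print(format(true,
                new String[] { "join\t10", "filter\t3" }, "total\t13"));
    }
}
```

This is the same design choice that lets the `AbstractJoinGraphTestCase` diff in r4270 dump a query table to `System.err` during tests.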
From: <tho...@us...> - 2011-03-03 21:31:32
|
Revision: 4268 http://bigdata.svn.sourceforge.net/bigdata/?rev=4268&view=rev Author: thompsonbry Date: 2011-03-03 21:31:26 +0000 (Thu, 03 Mar 2011) Log Message: ----------- Optimization for BOpUtility#getSharedVariables() with unit test. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOpUtility.java branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/util/TestBOpUtility_sharedVariables.java Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOpUtility.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOpUtility.java 2011-03-03 21:27:02 UTC (rev 4267) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOpUtility.java 2011-03-03 21:31:26 UTC (rev 4268) @@ -1108,11 +1108,8 @@ // if (p == c) // throw new IllegalArgumentException(); - // The set of variables which are shared. - final Set<IVariable<?>> sharedVars = new LinkedHashSet<IVariable<?>>(); - // Collect the variables appearing anywhere in [p]. - final Set<IVariable<?>> p1vars = new LinkedHashSet<IVariable<?>>(); + Set<IVariable<?>> p1vars = null; { final Iterator<IVariable<?>> itr = BOpUtility @@ -1120,12 +1117,29 @@ while (itr.hasNext()) { + if(p1vars == null) { + + // lazy initialization. + p1vars = new LinkedHashSet<IVariable<?>>(); + + } + p1vars.add(itr.next()); } } + if (p1vars == null) { + + // Fast path when no variables in [p]. + return Collections.emptySet(); + + } + + // The set of variables which are shared. + Set<IVariable<?>> sharedVars = null; + // Consider the variables appearing anywhere in [c]. { @@ -1138,16 +1152,26 @@ if (p1vars.contains(avar)) { - sharedVars.add(avar); + if (sharedVars == null) { + // lazy initialization. 
+ sharedVars = new LinkedHashSet<IVariable<?>>(); + + } + + sharedVars.add(avar); + } } } - return sharedVars; + if (sharedVars == null) + return Collections.emptySet(); + return sharedVars; + } } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/util/TestBOpUtility_sharedVariables.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/util/TestBOpUtility_sharedVariables.java 2011-03-03 21:27:02 UTC (rev 4267) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/util/TestBOpUtility_sharedVariables.java 2011-03-03 21:31:26 UTC (rev 4268) @@ -98,6 +98,14 @@ @SuppressWarnings("unchecked") public void test_getSharedVariables_nothingShared() { + // nothing shared because no variables for one operand. + assertTrue(BOpUtility.getSharedVars(new Constant<Integer>(12), Var.var("y")) .isEmpty()); + + // nothing shared because no variables for the other operand. + assertTrue(BOpUtility.getSharedVars(Var.var("y"),new Constant<Integer>(12)) .isEmpty()); + + // nothing shared. assertTrue(BOpUtility.getSharedVars(Var.var("x"), Var.var("y")) .isEmpty()); |
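The r4268 optimization above lazily initializes both sets in `BOpUtility#getSharedVars()` and returns the shared immutable `Collections.emptySet()` on the common no-overlap paths, avoiding two `LinkedHashSet` allocations per call. A self-contained sketch of that allocation pattern, with plain string "variables" standing in for the operator trees:

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch of the lazy-initialization pattern from r4268: no heap allocation
// until a variable is actually seen, and the immutable empty set on the
// fast paths.
class SharedVars {

    public static Set<String> getSharedVars(Iterable<String> p, Iterable<String> c) {
        // Collect the variables appearing anywhere in [p].
        Set<String> p1vars = null;
        for (String v : p) {
            if (p1vars == null)
                p1vars = new LinkedHashSet<>(); // lazy initialization.
            p1vars.add(v);
        }
        if (p1vars == null)
            return Collections.emptySet(); // fast path: no variables in [p].
        // The set of variables which are shared.
        Set<String> sharedVars = null;
        for (String v : c) {
            if (p1vars.contains(v)) {
                if (sharedVars == null)
                    sharedVars = new LinkedHashSet<>(); // lazy initialization.
                sharedVars.add(v);
            }
        }
        return sharedVars == null ? Collections.emptySet() : sharedVars;
    }

    public static void main(String[] args) {
        System.out.println(getSharedVars(List.of("x", "y"), List.of("y", "z"))); // prints [y]
        System.out.println(getSharedVars(List.of(), List.of("y"))); // prints []
    }
}
```

The new unit tests in the diff cover exactly these fast paths: a constant paired with a variable shares nothing in either argument order.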
From: <tho...@us...> - 2011-03-03 21:27:09
|
Revision: 4267 http://bigdata.svn.sourceforge.net/bigdata/?rev=4267&view=rev Author: thompsonbry Date: 2011-03-03 21:27:02 +0000 (Thu, 03 Mar 2011) Log Message: ----------- Refactored the DefaultResourceLocator to remove the base class, which was simply a cache. Added a propertyCache to the DefaultResourceLocator to provide sharing of the materialized properties from the global row store across views which are backed by the same commit point. This sharing is only done for the local Journal right now as it depends on fast access to the commit time. This change could be extended to scale-out with a refactor to replace the native long transaction identifier with a thin interface capable of reporting both the transaction identifier and the commit point against which the transaction is reading. Added unit test for properties caching by the DefaultResourceLocator. Modified Journal to expose access to historical views of the global row store. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/journal/Journal.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractResource.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/DefaultResourceLocator.java branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/relation/locator/TestDefaultResourceLocator.java Removed Paths: ------------- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/AbstractCachingResourceLocator.java Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/journal/Journal.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/journal/Journal.java 2011-03-03 18:38:51 UTC (rev 4266) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/journal/Journal.java 2011-03-03 21:27:02 UTC (rev 4267) @@ -1125,14 +1125,39 @@ /* * global row store. 
+ */ + public SparseRowStore getGlobalRowStore() { + + return getGlobalRowStoreHelper().getGlobalRowStore(); + + } + + /** + * Return a view of the global row store as of the specified timestamp. This + * is mainly used to provide access to historical views. * + * @param timestamp + * The specified timestamp. + * + * @return The global row store view -or- <code>null</code> if no view + * exists as of that timestamp. + */ + public SparseRowStore getGlobalRowStore(final long timestamp) { + + return getGlobalRowStoreHelper().get(timestamp); + + } + + /** + * Return the {@link GlobalRowStoreHelper}. + * <p> * Note: An atomic reference provides us with a "lock" object which doubles * as a reference. We are not relying on its CAS properties. */ - public SparseRowStore getGlobalRowStore() { + private final GlobalRowStoreHelper getGlobalRowStoreHelper() { + + GlobalRowStoreHelper t = globalRowStoreHelper.get(); - GlobalRowStoreHelper t = globalRowStoreHelper.get(); - if (t == null) { synchronized (globalRowStoreHelper) { @@ -1151,14 +1176,14 @@ .set(t = new GlobalRowStoreHelper(this)); } - + } } - return globalRowStoreHelper.get().getGlobalRowStore(); + return globalRowStoreHelper.get(); + } - } final private AtomicReference<GlobalRowStoreHelper> globalRowStoreHelper = new AtomicReference<GlobalRowStoreHelper>(); /* Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractResource.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractResource.java 2011-03-03 18:38:51 UTC (rev 4266) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/AbstractResource.java 2011-03-03 21:27:02 UTC (rev 4267) @@ -47,7 +47,6 @@ import com.bigdata.journal.IIndexManager; import com.bigdata.journal.IResourceLock; import com.bigdata.journal.IResourceLockService; -import com.bigdata.rawstore.Bytes; import com.bigdata.rdf.rules.FastClosure; import 
com.bigdata.rdf.rules.FullClosure; import com.bigdata.rdf.rules.RuleFastClosure5; @@ -691,12 +690,12 @@ } // Write the map on the row store. - final Map afterMap = indexManager.getGlobalRowStore().write( - RelationSchema.INSTANCE, map); - + final Map<String, Object> afterMap = indexManager.getGlobalRowStore() + .write(RelationSchema.INSTANCE, map); + if(log.isDebugEnabled()) { - log.debug("Properties after write: "+afterMap); + log.debug("Properties after write: " + afterMap); } Deleted: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/AbstractCachingResourceLocator.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/AbstractCachingResourceLocator.java 2011-03-03 18:38:51 UTC (rev 4266) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/AbstractCachingResourceLocator.java 2011-03-03 21:27:02 UTC (rev 4267) @@ -1,191 +0,0 @@ -/* - -Copyright (C) SYSTAP, LLC 2006-2008. All rights reserved. - -Contact: - SYSTAP, LLC - 4501 Tower Road - Greensboro, NC 27410 - lic...@bi... - -This program is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; version 2 of the License. - -This program is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. 
- -You should have received a copy of the GNU General Public License -along with this program; if not, write to the Free Software -Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA - -*/ -/* - * Created on Jun 30, 2008 - */ - -package com.bigdata.relation.locator; - -import java.lang.ref.WeakReference; -import java.util.concurrent.TimeUnit; - -import org.apache.log4j.Logger; - -import com.bigdata.cache.ConcurrentWeakValueCache; -import com.bigdata.cache.ConcurrentWeakValueCacheWithTimeout; -import com.bigdata.util.NT; - -/** - * Abstract base class for {@link IResourceLocator}s with caching. The cache - * uses {@link WeakReference}s so that cache entries will be cleared if the - * referenced item is cleared. - * - * @author <a href="mailto:tho...@us...">Bryan Thompson</a> - * @version $Id$ - */ -abstract public class AbstractCachingResourceLocator<T extends ILocatableResource> - implements IResourceLocator<T> { - - protected static final Logger log = Logger - .getLogger(AbstractCachingResourceLocator.class); - - protected static final boolean INFO = log.isInfoEnabled(); - - final private transient ConcurrentWeakValueCache<NT, T> cache; - - final private int capacity; - - /** - * The cache capacity. - */ - final public int capacity() { - - return capacity; - - } - - /** - * - * @param capacity - * The cache capacity. - * @param timeout - * The timeout in milliseconds for stale entries. - */ - protected AbstractCachingResourceLocator(final int capacity, - final long timeout) { - - if (capacity <= 0) - throw new IllegalArgumentException(); - - if (timeout < 0) - throw new IllegalArgumentException(); - - this.capacity = capacity; - -// this.cache = new WeakValueCache<NT, T>(new LRUCache<NT, T>(capacity)); - - this.cache = new ConcurrentWeakValueCacheWithTimeout<NT, T>(capacity, - TimeUnit.MILLISECONDS.toNanos(timeout)); - - } - - /** - * Looks up the resource in the cache (thread-safe since the underlying - * cache is thread-safe). 
- * - * @param namespace - * - * @param timestamp - * - * @return The relation -or- <code>null</code> iff it is not in the cache. - */ - protected T get(final String namespace, final long timestamp) { - - if (namespace == null) - throw new IllegalArgumentException(); - - final T r = cache.get(new NT(namespace, timestamp)); - - if (INFO) { - - log.info((r == null ? "miss" : "hit ") + ": namespace=" + namespace - + ", timestamp=" + timestamp); - - } - - return r; - - } - - /** - * Places the resource in the cache. - * <p> - * Note: The underlying cache is thread-safe. However, when adding an entry - * to the cache the caller MUST be synchronized on the named resource, use - * {@link #get(String, long)} to determine that there is no such entry in - * the cache, and then {@link #put(ILocatableResource)} the entry. - * <p> - * Note: Read committed views are allowed into the cache. - * <p> - * For a Journal, this depends on Journal#getIndex(name,timestamp) returning - * a ReadCommittedView for an index so that the view does in fact have - * read-committed semantics. - * <p> - * For a federation, read-committed semantics are achieved by the - * IClientIndex implementations since they always make standoff requests to - * one (or more) data services. Those requests allow the data service to - * resolve the then most recent view for the index for each request. - * - * @param resource - * The resource. - */ - protected void put(final T resource) { - - if (resource == null) - throw new IllegalArgumentException(); - - final String namespace = resource.getNamespace().toString(); - - final long timestamp = resource.getTimestamp(); - - if (INFO) { - - log.info("Caching: namespace=" + namespace + ", timestamp=" - + timestamp); - - } - - cache.put(new NT(namespace, timestamp), resource); - -// cache.put(new NT(namespace, timestamp), resource, false/* dirty */); - - } - - /** - * Clears any resource having the same namespace and timestamp from the - * cache. 
- * <p> - * Note: The caller MUST be synchronized on the named resource. - * - * @return <code>true</code> iff there was an entry in the cache for the - * same resource namespace and timestamp, in which case it was - * cleared from the cache. - */ - protected boolean clear(final String namespace, final long timestamp) { - - if (namespace == null) - throw new IllegalArgumentException(); - - if (cache.remove(new NT(namespace, timestamp)) != null) { - - return true; - - } - - return false; - - } - -} Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/DefaultResourceLocator.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/DefaultResourceLocator.java 2011-03-03 18:38:51 UTC (rev 4266) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/relation/locator/DefaultResourceLocator.java 2011-03-03 21:27:02 UTC (rev 4267) @@ -33,24 +33,30 @@ import java.util.Properties; import java.util.UUID; import java.util.WeakHashMap; +import java.util.concurrent.TimeUnit; import java.util.concurrent.atomic.AtomicReference; import java.util.concurrent.locks.Lock; import org.apache.log4j.Logger; import com.bigdata.btree.IIndex; +import com.bigdata.cache.ConcurrentWeakValueCache; +import com.bigdata.cache.ConcurrentWeakValueCacheWithTimeout; import com.bigdata.cache.LRUCache; import com.bigdata.concurrent.NamedLock; import com.bigdata.journal.AbstractTask; +import com.bigdata.journal.ICommitRecord; import com.bigdata.journal.IIndexManager; import com.bigdata.journal.IIndexStore; import com.bigdata.journal.Journal; import com.bigdata.journal.TemporaryStore; +import com.bigdata.journal.TimestampUtility; import com.bigdata.relation.AbstractResource; import com.bigdata.relation.IRelation; import com.bigdata.relation.RelationSchema; import com.bigdata.service.IBigdataFederation; import com.bigdata.sparse.SparseRowStore; +import com.bigdata.util.NT; /** * 
Generic implementation relies on a ctor for the resource with the following @@ -108,21 +114,27 @@ * @param <T> * The generic type of the [R]elation. */ -public class DefaultResourceLocator<T extends ILocatableResource> extends - AbstractCachingResourceLocator<T> implements IResourceLocator<T> { +public class DefaultResourceLocator<T extends ILocatableResource> // + implements IResourceLocator<T> { protected static final transient Logger log = Logger .getLogger(DefaultResourceLocator.class); - - protected static final boolean INFO = log.isInfoEnabled(); - - protected static final boolean DEBUG = log.isDebugEnabled(); protected final transient IIndexManager indexManager; private final IResourceLocator<T> delegate; - + /** + * Cache for recently located resources. + */ + final private transient ConcurrentWeakValueCache<NT, T> resourceCache; + + /** + * Cache for recently materialized properties from the GRS. + */ + final /*private*/ transient ConcurrentWeakValueCache<NT, Map<String,Object>> propertyCache; + + /** * Provides locks on a per-namespace basis for higher concurrency. */ private final transient NamedLock<String> namedLock = new NamedLock<String>(); @@ -170,8 +182,6 @@ final IResourceLocator<T> delegate, final int cacheCapacity, final long cacheTimeout) { - super(cacheCapacity, cacheTimeout); - if (indexManager == null) throw new IllegalArgumentException(); @@ -179,6 +189,18 @@ this.delegate = delegate;// MAY be null. + if (cacheCapacity <= 0) + throw new IllegalArgumentException(); + + if (cacheTimeout < 0) + throw new IllegalArgumentException(); + + this.resourceCache = new ConcurrentWeakValueCacheWithTimeout<NT, T>( + cacheCapacity, TimeUnit.MILLISECONDS.toNanos(cacheTimeout)); + + this.propertyCache = new ConcurrentWeakValueCacheWithTimeout<NT, Map<String, Object>>( + cacheCapacity, TimeUnit.MILLISECONDS.toNanos(cacheTimeout)); + } // @todo hotspot 2% total query time. 
@@ -187,46 +209,86 @@ if (namespace == null) throw new IllegalArgumentException(); - if (INFO) { + if (log.isInfoEnabled()) { - log.info("namespace=" + namespace+", timestamp="+timestamp); + log.info("namespace=" + namespace + ", timestamp=" + timestamp); } + T resource = null; + final NT nt; + + /* + * Note: The drawback with resolving the resource against the + * [commitTime] is that the views will be the same object instance and + * will have the timestamp associated with the [commitTime] rather than + * the caller's timestamp. This breaks the assumption that + * resource#getTimestamp() returns the transaction identifier for a + * read-only transaction. In order to fix that, we resort to sharing the + * Properties object instead of the resource view. + */ +// if (TimestampUtility.isReadOnly(timestamp) +// && indexManager instanceof Journal) { +// +// /* +// * If we are looking on a local Journal (standalone database) then +// * we resolve the caller's [timestamp] to the commit point against +// * which the resource will be located and handle caching of the +// * resource using that commit point. This is done in order to share +// * a read-only view of a resource with any request which would be +// * serviced by the same commit point. Any such views are read-only +// * and immutable. +// */ +// +// final Journal journal = (Journal) indexManager; +// +// // find the commit record on which we need to read. +// final long commitTime = journal.getCommitRecord( +// TimestampUtility.asHistoricalRead(timestamp)) +// .getTimestamp(); +// +// nt = new NT(namespace, commitTime); +// +// } else { + + nt = new NT(namespace, timestamp); +// +// } + // test cache: hotspot 93% of method time. - T resource = get(namespace, timestamp); + resource = resourceCache.get(nt); if (resource != null) { - if (DEBUG) + if (log.isDebugEnabled()) log.debug("cache hit: " + resource); // cache hit. 
return resource; } - + /* - * Since there was a cache miss, acquire a lock the named relation so - * that the locate + cache.put sequence will be atomic. + * Since there was a cache miss, acquire a lock for the named relation + * so that the locate + cache.put sequence will be atomic. */ final Lock lock = namedLock.acquireLock(namespace); try { // test cache now that we have the lock. - resource = get(namespace, timestamp); + resource = resourceCache.get(nt); if (resource != null) { - if (DEBUG) + if (log.isDebugEnabled()) log.debug("cache hit: " + resource); return resource; } - if (INFO) + if (log.isInfoEnabled()) log.info("cache miss: namespace=" + namespace + ", timestamp=" + timestamp); @@ -250,7 +312,7 @@ * resolve this request. */ - if(INFO) { + if(log.isInfoEnabled()) { log.info("Not found - passing to delegate: namespace=" + namespace + ", timestamp=" + timestamp); @@ -262,7 +324,7 @@ if (resource != null) { - if (INFO) { + if (log.isInfoEnabled()) { log.info("delegate answered: " + resource); @@ -275,15 +337,15 @@ } if (log.isInfoEnabled()) - log.info("Not found: namespace=" + namespace + ", timestamp=" - + timestamp); + log.info("Not found: namespace=" + namespace + + ", timestamp=" + timestamp); // not found. return null; } - if (DEBUG) { + if (log.isDebugEnabled()) { log.debug(properties.toString()); @@ -310,7 +372,7 @@ } - if (DEBUG) { + if (log.isDebugEnabled()) { log.debug("Implementation class=" + cls.getName()); @@ -321,8 +383,8 @@ properties); // Add to the cache. - put(resource); - + resourceCache.put(nt, resource); + return resource; } finally { @@ -373,7 +435,7 @@ * and removed from the [seeAlso] weak value cache. */ - if (INFO) + if (log.isInfoEnabled()) log.info("Closed? 
" + indexManager); } else { @@ -400,7 +462,7 @@ if (properties != null) { - if (INFO) { + if (log.isInfoEnabled()) { log.info("Found: namespace=" + namespace + " on " + indexManager); @@ -428,7 +490,7 @@ if (properties != null) { - if (INFO) { + if (log.isInfoEnabled()) { log.info("Found: namespace=" + namespace + " on " + indexManager); @@ -460,28 +522,19 @@ * @param indexManager * @param namespace * The resource identifier - this is the primary key. - * @param timestampIsIgnored + * @param timestamp * The timestamp of the resource view. * * @return The {@link Properties} iff there is a logical row for the given * namespace. - * - * @todo The timestamp of the resource view is currently ignored. This - * probably should be modified to use the corresponding view of the - * global row store rather than always using the read-committed / - * unisolated view. That would make the properties immutable for a - * historical resource view and thus more easily cached. However, - * it would also make it impossible to modify those properties for - * historical views as any changes would only apply to views whose - * commit time was after the change to the global row store. */ protected Properties locateResourceOn(final IIndexManager indexManager, - final String namespace, final long timestampIsIgnored) { + final String namespace, final long timestamp) { - if (INFO) { + if (log.isInfoEnabled()) { log.info("indexManager=" + indexManager + ", namespace=" - + namespace + ", timestamp=" + timestampIsIgnored); + + namespace + ", timestamp=" + timestamp); } @@ -489,23 +542,105 @@ * Look at the global row store view corresponding to the specified * timestamp. * - * @todo caching may be useful here for historical reads. 
+ * Note: caching here is important in order to reduce the heap pressure + * associated with large numbers of concurrent historical reads against + * the same commit point when those reads are performed within read-only + * transactions and, hence, each read is performed with a DISTINCT + * timestamp. Since the timestamps are distinct, the resource [cache] + * will have cache misses. This code provides for a [propertyCache] + * which ensures that we share the materialized properties from the GRS + * across resource views backed by the same commit point (and also avoid + * unnecessary GRS reads). */ - final SparseRowStore rowStore = indexManager - .getGlobalRowStore(/*timestamp*/); - - final Map<String, Object> map = rowStore == null ? null : rowStore - .read(RelationSchema.INSTANCE, namespace); + final Map<String, Object> map; + if (TimestampUtility.isReadOnly(timestamp) + && !TimestampUtility.isReadCommitted(timestamp) + && indexManager instanceof Journal) { -// System.err.println("Reading properties: namespace="+namespace+", timestamp="+timestampIsIgnored); -// log.fatal("Reading properties: "+namespace,new RuntimeException()); - + final Journal journal = (Journal) indexManager; + + // find the commit record on which we need to read. + final ICommitRecord commitRecord = journal + .getCommitRecord(TimestampUtility + .asHistoricalRead(timestamp)); + + if (commitRecord != null) { + + // find the timestamp associated with that commit record. + final long commitTime = commitRecord.getTimestamp(); + + // Check the cache before materializing the properties from the + // GRS. + final Map<String, Object> cachedMap = propertyCache.get(new NT( + namespace, commitTime)); + + if (cachedMap != null) { + + // The properties are in the cache. + map = cachedMap; + + } else { + + // Use the GRS view as of that commit point. + final SparseRowStore rowStore = journal + .getGlobalRowStore(commitTime); + + // Read the properties from the GRS. + map = rowStore == null ? 
null : rowStore.read( + RelationSchema.INSTANCE, namespace); + + if (map != null) { + + // Stuff the properties into the cache. + propertyCache.put(new NT(namespace, commitTime), map); + + } + + } + + } else { + + /* + * No such commit record. + * + * @todo We can probably just return [null] for this case. + */ + + final SparseRowStore rowStore = indexManager + .getGlobalRowStore(/* timestamp */); + + // Read the properties from the GRS. + map = rowStore == null ? null : rowStore.read( + RelationSchema.INSTANCE, namespace); + + } + + } else { + + /* + * @todo The timestamp of the resource view is currently ignored. + * This probably should be modified to use the corresponding view of + * the global row store rather than always using the read-committed + * / unisolated view, which will require exposing a + * getGlobalRowStore(timestamp) method on IIndexStore. + */ + + final SparseRowStore rowStore = indexManager + .getGlobalRowStore(/* timestamp */); + + // Read the properties from the GRS. + map = rowStore == null ? null : rowStore.read( + RelationSchema.INSTANCE, namespace); + + } + if (map == null) { - if (DEBUG) { + if (log.isDebugEnabled()) { - log.debug("No properties: indexManager=" + indexManager - + ", namespace=" + namespace); + log.debug("Not found: indexManager=" + indexManager + + ", namespace=" + namespace + ", timestamp=" + + timestamp); } @@ -513,21 +648,23 @@ } + // wrap with properties object to prevent cross view mutation. final Properties properties = new Properties(); properties.putAll(map); - if (DEBUG) { + if (log.isTraceEnabled()) { - log.debug("Read properties: indexManager=" + indexManager - + ", namespace=" + namespace + " :: " + properties); + log.trace("Read properties: indexManager=" + indexManager + + ", namespace=" + namespace + ", timestamp=" + timestamp + + " :: " + properties); } return properties; } - + /** * Create a new view of the relation. 
* @@ -588,7 +725,7 @@ r.init(); - if(INFO) { + if(log.isInfoEnabled()) { log.info("new instance: "+r); @@ -614,7 +751,7 @@ * @param instance * The instance. */ - public T putInstance(T instance) { + public T putInstance(final T instance) { if (instance == null) throw new IllegalArgumentException(); @@ -623,7 +760,7 @@ final long timestamp = instance.getTimestamp(); - if (INFO) { + if (log.isInfoEnabled()) { log.info("namespace=" + namespace+", timestamp="+timestamp); @@ -634,11 +771,13 @@ try { - final T tmp = get(namespace, timestamp); + final NT nt = new NT(namespace, timestamp); + final T tmp = resourceCache.get(nt); + if (tmp != null) { - if(INFO) { + if(log.isInfoEnabled()) { log.info("Existing instance already in cache: "+tmp); @@ -648,10 +787,10 @@ } - put(instance); - - if (INFO) { + resourceCache.put(nt, instance); + if (log.isInfoEnabled()) { + log.info("Instance added to cache: " + instance); } @@ -665,15 +804,15 @@ } } - + /** - * Resources that hold hard references to local index objects MUST discarded - * during abort processing. Otherwise the same resource objects will be - * returned from the cache and buffered writes on the indices for those - * relations (if they are local index objects) will still be visible, this - * defeating the abort semantics. + * Resources that hold hard references to local index objects MUST be + * discarded during abort processing. Otherwise the same resource objects + * will be returned from the cache and buffered writes on the indices for + * those relations (if they are local index objects) will still be visible, + * thus defeating the abort semantics. 
*/ - public void discard(ILocatableResource<T> instance) { + public void discard(final ILocatableResource<T> instance) { if (instance == null) throw new IllegalArgumentException(); @@ -682,9 +821,9 @@ final long timestamp = instance.getTimestamp(); - if (INFO) { + if (log.isInfoEnabled()) { - log.info("namespace=" + namespace+", timestamp="+timestamp); + log.info("namespace=" + namespace + ", timestamp=" + timestamp); } @@ -693,10 +832,16 @@ try { - final boolean found = clear(namespace, timestamp); - - if (INFO) { + final NT nt = new NT(namespace, timestamp); + /* + * Clear the resource cache, but we do not need to clear the + * property cache since it only retains immutable historical state. + */ + final boolean found = resourceCache.remove(nt) != null; + + if (log.isInfoEnabled()) { + log.info("instance=" + instance + ", found=" + found); } @@ -744,7 +889,7 @@ */ seeAlso.put(indexManager, null); - if (INFO) { + if (log.isInfoEnabled()) { log.info("size=" + seeAlso.size() + ", added indexManager=" + indexManager); Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/relation/locator/TestDefaultResourceLocator.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/relation/locator/TestDefaultResourceLocator.java 2011-03-03 18:38:51 UTC (rev 4266) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/relation/locator/TestDefaultResourceLocator.java 2011-03-03 21:27:02 UTC (rev 4267) @@ -28,13 +28,14 @@ package com.bigdata.relation.locator; +import java.lang.ref.WeakReference; +import java.nio.ByteBuffer; import java.util.Iterator; import java.util.List; +import java.util.Map; import java.util.Properties; import java.util.Set; import java.util.UUID; -import java.util.concurrent.ExecutorService; -import java.util.concurrent.Executors; import junit.framework.TestCase2; @@ -49,11 +50,11 @@ import com.bigdata.journal.Journal.Options; import com.bigdata.rdf.spo.ISPO; 
import com.bigdata.relation.AbstractRelation; -import com.bigdata.relation.accesspath.IAccessPath; +import com.bigdata.relation.AbstractResource; import com.bigdata.service.IBigdataFederation; import com.bigdata.striterator.IChunkedOrderedIterator; import com.bigdata.striterator.IKeyOrder; -import com.bigdata.util.concurrent.DaemonThreadFactory; +import com.bigdata.util.NT; /** * Test suite for location relations, etc. @@ -102,8 +103,6 @@ final Journal store = new Journal( properties ); - final ExecutorService executorService = Executors.newCachedThreadPool(DaemonThreadFactory.defaultThreadFactory()); - final String namespace = "test"; try { @@ -221,12 +220,239 @@ store.destroy(); - executorService.shutdownNow(); - } } + /** + * Unit test for property caching for locatable resources. + */ + public void test_propertyCache() { + + final Properties properties = getProperties(); + + final Journal store = new Journal( properties ); + + final String namespace = "test"; + + try { + + // write a small record onto the journal and force a commit. + { + final ByteBuffer b = ByteBuffer.allocate(4); + b.putInt(0); + b.flip(); + store.write(b); + assertNotSame(0L,store.commit()); + } + + // verify resource can not be located yet. + { + // resource does not exist in UNISOLATED view. + assertNull(store.getResourceLocator().locate(namespace, + ITx.UNISOLATED)); + + // resource does not exist at lastCommitTime. + assertNull(store.getResourceLocator().locate(namespace, + store.getLastCommitTime())); + + // resource does not exist at read-only tx. + { + final long tx = store.newTx(ITx.READ_COMMITTED); + try { + assertNull(store.getResourceLocator().locate(namespace, + store.getLastCommitTime())); + } finally { + store.abort(tx); + } + } + } + + // instantiate relation. + MockRelation mockRelation = new MockRelation(store, namespace, + ITx.UNISOLATED, properties); + + // verify resource still can not be located. + { + // resource does not exist in UNISOLATED view. 
+ assertNull(store.getResourceLocator().locate(namespace, + ITx.UNISOLATED)); + + // resource does not exist at lastCommitTime. + assertNull(store.getResourceLocator().locate(namespace, + store.getLastCommitTime())); + + // resource does not exist at read-only tx. + { + final long tx = store.newTx(ITx.READ_COMMITTED); + try { + assertNull(store.getResourceLocator().locate(namespace, + store.getLastCommitTime())); + } finally { + store.abort(tx); + } + } + } + + // create the resource, which writes the properties into the GRS. + mockRelation.create(); + + /* + */ + { + + /* + * The UNISOLATED view of the resource should be locatable now + * since the writes on the global row store are unisolated. + */ + assertNotNull(store.getResourceLocator().locate(namespace, + ITx.UNISOLATED)); + + // a request for the unisolated view gives us the same instance. + assertTrue(store.getResourceLocator().locate(namespace, + ITx.UNISOLATED) == mockRelation); + + /* + * The read-committed view of the resource is also locatable. + */ + assertNotNull(store.getResourceLocator().locate(namespace, + ITx.READ_COMMITTED)); + + /* + * The read committed view is not the same instance as the + * unisolated view. + */ + assertTrue(((MockRelation) store.getResourceLocator().locate( + namespace, ITx.READ_COMMITTED)) != mockRelation); + + } + + // commit time immediately proceeding this commit. + final long priorCommitTime = store.getLastCommitTime(); + + // commit, noting the commit time. + final long lastCommitTime = store.commit(); + + if(log.isInfoEnabled()) { + log.info("priorCommitTime=" + priorCommitTime); + log.info("lastCommitTime =" + lastCommitTime); + } + + /* + * Now create a few transactions against the newly created resource + * and verify that the views returned for those transactions are + * distinct, but that they share the same set of default Properties + * (e.g., the propertyCache is working). + * + * @todo also test a read-historical read ! 
+ */ + final long tx1 = store.newTx(store.getLastCommitTime()); // read-only tx + final long tx2 = store.newTx(store.getLastCommitTime()); // read-only tx + final long ts1 = store.getLastCommitTime() - 1; // historical read + try { + + assertTrue(tx1 != tx2); + assertTrue(ts1 != tx1); + assertTrue(ts1 != tx2); + + /* + * @todo There might not be enough commit latency to have + * lastCommitTime - 1 be GT priorCommitTime. If this happens + * either resolve issue 145 or add some latency into the test. + * + * @see http://sourceforge.net/apps/trac/bigdata/ticket/145 + */ + assertTrue(ts1 > priorCommitTime); + + // unisolated view. + final AbstractResource<?> view_un = (AbstractResource<?>) store + .getResourceLocator().locate(namespace, ITx.UNISOLATED); + assertNotNull(view_un); + + // tx1 view. + final AbstractResource<?> view_tx1 = (AbstractResource<?>) store + .getResourceLocator().locate(namespace, tx1); + assertNotNull(view_tx1); + + // tx2 view. + final AbstractResource<?> view_tx2 = (AbstractResource<?>) store + .getResourceLocator().locate(namespace, tx2); + assertNotNull(view_tx2); + + // all views are distinct. + assertTrue(view_un != view_tx1); + assertTrue(view_un != view_tx2); + assertTrue(view_tx1 != view_tx2); + + // each view has its correct timestamp. + assertEquals(ITx.UNISOLATED, view_un.getTimestamp()); + assertEquals(tx1, view_tx1.getTimestamp()); + assertEquals(tx2, view_tx2.getTimestamp()); + + // each view has its own Properties object. + final Properties p_un = view_un.getProperties(); + final Properties p_tx1 = view_tx1.getProperties(); + final Properties p_tx2 = view_tx2.getProperties(); + assertTrue(p_un != p_tx1); + assertTrue(p_un != p_tx2); + assertTrue(p_tx1 != p_tx2); + + /* + * Verify that the [propertyCache] is working. + * + * Note: Unfortunately, I have not been able to devise any means + * of testing the [propertyCache] without exposing that as a + * package private object. 
+ */ + final DefaultResourceLocator<?> locator = (DefaultResourceLocator<?>) store + .getResourceLocator(); + + // Not cached for the UNISOLATED view (mutable views can not be + // cached). + assertNull(locator.propertyCache.get(new NT(namespace, + ITx.UNISOLATED))); + +// if (true) { +// final Iterator<Map.Entry<NT, WeakReference<Map<String, Object>>>> itr = locator.propertyCache +// .entryIterator(); +// while (itr.hasNext()) { +// final Map.Entry<NT, WeakReference<Map<String, Object>>> e = itr +// .next(); +// System.err.println(e.getKey() + " => " +// + e.getValue().get()); +// } +// } + + // Not cached for the actual tx ids or read-only timestamp. + assertNull(locator.propertyCache.get(new NT(namespace,tx1))); + assertNull(locator.propertyCache.get(new NT(namespace,tx2))); + assertNull(locator.propertyCache.get(new NT(namespace,ts1))); + + /* + * Cached for the last commit time, which should have been used + * to hand back the Properties for {tx1, tx2, ts1}. + */ + assertNotNull(locator.propertyCache.get(new NT(namespace, + lastCommitTime))); + + // nothing for the prior commit time. 
+ assertNull(locator.propertyCache.get(new NT(namespace, + priorCommitTime))); + + } finally { + store.abort(tx1); + store.abort(tx2); + } + + } finally { + + store.destroy(); + + } + + } + + @SuppressWarnings("unchecked") private static class MockRelation extends AbstractRelation { static final private String indexName = "foo"; @@ -298,66 +524,41 @@ @Override public String getFQN(IKeyOrder keyOrder) { - // TODO Auto-generated method stub return null; } public long delete(IChunkedOrderedIterator itr) { - // TODO Auto-generated method stub return 0; } public long insert(IChunkedOrderedIterator itr) { - // TODO Auto-generated method stub return 0; } -// public IAccessPath getAccessPath(IPredicate predicate) { -// // TODO Auto-generated method stub -// return null; -// } - - public long getElementCount(boolean exact) { - // TODO Auto-generated method stub - return 0; - } - public Set getIndexNames() { - // TODO Auto-generated method stub return null; } - public IKeyOrder getPrimaryKeyOrder() { - // TODO Auto-generated method stub return null; } public Iterator getKeyOrders() { - // TODO Auto-generated method stub return null; } public IKeyOrder getKeyOrder(IPredicate p) { - // TODO Auto-generated method stub return null; } - -// public Object newElement(IPredicate predicate, IBindingSet bindingSet) { -// // TODO Auto-generated method stub -// return null; -// } public Object newElement(List a, IBindingSet bindingSet) { - // TODO Auto-generated method stub return null; } public Class<ISPO> getElementClass() { - return null; + } - } - } + } // class MockRelation } This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
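The commit above caches a resource's default Properties under a (namespace, timestamp) key, deliberately never caching the mutable UNISOLATED view. A rough, hypothetical sketch of that pattern is below; the class, field, and method names are illustrative only (the real code lives in DefaultResourceLocator and uses bigdata's NT key and its own cache class), and 0L is assumed as the UNISOLATED marker as in ITx.UNISOLATED.

```java
import java.lang.ref.WeakReference;
import java.util.Objects;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Illustrative sketch of a (namespace, timestamp) keyed property cache.
 * Only immutable (historical) views are cached; weak values let the GC
 * reclaim cached Properties under memory pressure.
 */
public class PropertyCacheSketch {

    /** Composite cache key: namespace plus commit timestamp. */
    static final class NT {
        final String namespace;
        final long timestamp;
        NT(final String namespace, final long timestamp) {
            this.namespace = namespace;
            this.timestamp = timestamp;
        }
        @Override public boolean equals(final Object o) {
            if (!(o instanceof NT)) return false;
            final NT t = (NT) o;
            return timestamp == t.timestamp && namespace.equals(t.namespace);
        }
        @Override public int hashCode() {
            return Objects.hash(namespace, timestamp);
        }
    }

    /** Mutable-view marker (assumption: mirrors ITx.UNISOLATED == 0L). */
    static final long UNISOLATED = 0L;

    private final ConcurrentHashMap<NT, WeakReference<Properties>> cache =
            new ConcurrentHashMap<>();

    /** Cache only immutable (historical) views; never the mutable one. */
    public void put(final String namespace, final long timestamp,
            final Properties p) {
        if (timestamp == UNISOLATED)
            return; // mutable views must not be cached
        cache.put(new NT(namespace, timestamp), new WeakReference<>(p));
    }

    /** Return the cached Properties, or null if absent or reclaimed. */
    public Properties get(final String namespace, final long timestamp) {
        final WeakReference<Properties> ref =
                cache.get(new NT(namespace, timestamp));
        return ref == null ? null : ref.get();
    }
}
```

This mirrors what the test asserts: a lookup keyed by the UNISOLATED timestamp must come back null, while historical timestamps hit the shared cached instance.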
From: <tho...@us...> - 2011-03-03 18:38:58
Revision: 4266 http://bigdata.svn.sourceforge.net/bigdata/?rev=4266&view=rev Author: thompsonbry Date: 2011-03-03 18:38:51 +0000 (Thu, 03 Mar 2011) Log Message: ----------- https://sourceforge.net/apps/trac/bigdata/ticket/265 - Added a method to re-build the full text index and a unit test for the same (see TestFullTextIndex). - Javadoc edits to the ITextIndex interface. https://sourceforge.net/apps/trac/bigdata/ticket/221 - Modified AbstractJournal#dropIndex(name) to conditionally invoke removeAll() on the index in order to reclaim its storage when the index is backed by the RWStore. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/journal/AbstractJournal.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/BigdataRDFFullTextIndex.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/ITextIndexer.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/LexiconRelation.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/lexicon/TestFullTextIndex.java Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/journal/AbstractJournal.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/journal/AbstractJournal.java 2011-03-03 17:12:31 UTC (rev 4265) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/journal/AbstractJournal.java 2011-03-03 18:38:51 UTC (rev 4266) @@ -3448,14 +3448,25 @@ } - /** - * Drops the named index. The index will no longer participate in atomic - * commits and will not be visible to new transactions. Resources are NOT - * reclaimed on the {@link AbstractJournal} (it is an immortal store) and - * historical states of the index will continue to be accessible. - */ + /** + * Drops the named index. The index will no longer participate in atomic + * commits and will not be visible to new transactions. 
Storage will be + * reclaimed IFF the backing store support that functionality. + */ public void dropIndex(final String name) { + final BTree ndx = getIndex(name); + + if(ndx == null) + throw new NoSuchIndexException(name); + + if(getBufferStrategy() instanceof RWStrategy) { + /* + * Reclaim storage associated with the index. + */ + ndx.removeAll(); + } + final ReadLock lock = _fieldReadWriteLock.readLock(); lock.lock(); Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/BigdataRDFFullTextIndex.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/BigdataRDFFullTextIndex.java 2011-03-03 17:12:31 UTC (rev 4265) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/BigdataRDFFullTextIndex.java 2011-03-03 18:38:51 UTC (rev 4266) @@ -110,11 +110,15 @@ assertWritable(); - getIndexManager().dropIndex(getNamespace() + "." + NAME_SEARCH); + final String name = getNamespace() + "." + NAME_SEARCH; + + getIndexManager().dropIndex(name); } - public void index(int capacity, Iterator<BigdataValue> valuesIterator) { + public void index(final int capacity, + final Iterator<BigdataValue> valuesIterator) { + final TokenBuffer buffer = new TokenBuffer(capacity, this); int n = 0; Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/ITextIndexer.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/ITextIndexer.java 2011-03-03 17:12:31 UTC (rev 4265) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/ITextIndexer.java 2011-03-03 18:38:51 UTC (rev 4266) @@ -68,6 +68,9 @@ * are tokenized using the default {@link Locale}. * </p> * + * @param capacity + * A hint to the underlying layer about the buffer size before an + * incremental flush of the index. 
* @param itr * Iterator visiting the terms to be indexed. * Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/LexiconRelation.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/LexiconRelation.java 2011-03-03 17:12:31 UTC (rev 4265) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/LexiconRelation.java 2011-03-03 18:38:51 UTC (rev 4266) @@ -63,21 +63,24 @@ import com.bigdata.btree.IRangeQuery; import com.bigdata.btree.ITuple; import com.bigdata.btree.ITupleIterator; +import com.bigdata.btree.ITupleSerializer; import com.bigdata.btree.IndexMetadata; import com.bigdata.btree.filter.PrefixFilter; +import com.bigdata.btree.filter.TupleFilter; import com.bigdata.btree.keys.DefaultKeyBuilderFactory; import com.bigdata.btree.keys.IKeyBuilder; import com.bigdata.btree.keys.KVO; import com.bigdata.btree.keys.KeyBuilder; import com.bigdata.btree.keys.StrengthEnum; +import com.bigdata.btree.proc.IResultHandler; import com.bigdata.btree.proc.AbstractKeyArrayIndexProcedure.ResultBuffer; import com.bigdata.btree.proc.AbstractKeyArrayIndexProcedure.ResultBufferHandler; import com.bigdata.btree.proc.BatchLookup.BatchLookupConstructor; -import com.bigdata.btree.proc.IResultHandler; import com.bigdata.btree.raba.IRaba; import com.bigdata.cache.ConcurrentWeakValueCacheWithBatchedUpdates; import com.bigdata.journal.IIndexManager; import com.bigdata.journal.IResourceLock; +import com.bigdata.journal.ITx; import com.bigdata.rawstore.Bytes; import com.bigdata.rdf.internal.IDatatypeURIResolver; import com.bigdata.rdf.internal.IExtensionFactory; @@ -88,6 +91,7 @@ import com.bigdata.rdf.internal.TermId; import com.bigdata.rdf.lexicon.Term2IdWriteProc.Term2IdWriteProcConstructor; import com.bigdata.rdf.model.BigdataBNode; +import com.bigdata.rdf.model.BigdataLiteral; import com.bigdata.rdf.model.BigdataURI; import 
com.bigdata.rdf.model.BigdataValue; import com.bigdata.rdf.model.BigdataValueFactory; @@ -1619,6 +1623,86 @@ } /** + * Utility method to (re-)build the full text index. This is a high latency + * operation for a database of any significant size. You must be using the + * unisolated view of the {@link AbstractTripleStore} for this operation. + * {@link AbstractTripleStore.Options#TEXT_INDEX} must be enabled. This + * operation is only supported when the {@link ITextIndexer} uses the + * {@link FullTextIndex} class. + * + * TODO This will have to be redone once we finish + * http://sourceforge.net/apps/trac/bigdata/ticket/109 (store large literals + * as blobs) since the ID2TERM index will disappear. + */ + @SuppressWarnings("unchecked") + public void rebuildTextIndex() { + + if (getTimestamp() != ITx.UNISOLATED) + throw new UnsupportedOperationException(); + + if(!textIndex) + throw new UnsupportedOperationException(); + + final ITextIndexer textIndexer = getSearchEngine(); + + if (textIndexer == null) { + throw new UnsupportedOperationException(); + } + + // destroy the existing text index. + textIndexer.destroy(); + + // create a new index. + textIndexer.create(); + + // the index to scan for the RDF Literals. + final IIndex id2term = getId2TermIndex(); + + // used to decode the + final ITupleSerializer tupSer = id2term.getIndexMetadata() + .getTupleSerializer(); + + /* + * Visit all plain, language code, and datatype literals in the lexicon. + * + * Note: This uses a filter on the ITupleIterator in order to filter out + * non-literal terms before they are shipped from a remote index shard. 
+ */ + final Iterator<BigdataValue> itr = new Striterator(id2term + .rangeIterator(null/* fromKey */, null/* toKey */, + 0/* capacity */, IRangeQuery.DEFAULT, + new TupleFilter<BigdataValue>() { + private static final long serialVersionUID = 1L; + protected boolean isValid( + final ITuple<BigdataValue> obj) { + final IV iv = (IV) tupSer.deserializeKey(obj); + if (!iv.isInline() && iv.isLiteral()) { + return true; + } + return false; + } + })).addFilter(new Resolver() { + private static final long serialVersionUID = 1L; + + protected Object resolve(final Object obj) { + final BigdataLiteral lit = (BigdataLiteral) tupSer + .deserialize((ITuple) obj); +// System.err.println("lit: "+lit); + return lit; + } + }); + + final int capacity = 10000; + + while (itr.hasNext()) { + + indexTermText(capacity, itr); + + } + + } + + /** * Batch resolution of internal values to {@link BigdataValue}s. * * @param ivs Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/lexicon/TestFullTextIndex.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/lexicon/TestFullTextIndex.java 2011-03-03 17:12:31 UTC (rev 4265) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/lexicon/TestFullTextIndex.java 2011-03-03 18:38:51 UTC (rev 4266) @@ -362,5 +362,161 @@ } } + + /** + * Unit test for {@link LexiconRelation#rebuildTextIndex()}. + */ + public void test_rebuildIndex() { + + AbstractTripleStore store = getStore(); + + try { + + assertNotNull(store.getLexiconRelation().getSearchEngine()); + + final BigdataValueFactory f = store.getValueFactory(); + + final BigdataValue[] terms = new BigdataValue[] {// + f.createLiteral("abc"),// + f.createLiteral("abc", "en"),// + f.createLiteral("good day", "en"),// + f.createLiteral("gutten tag", "de"),// + f.createLiteral("tag team", "en"),// + f.createLiteral("the first day", "en"),// // 'the' is a stopword. 
+ + f.createURI("http://www.bigdata.com"),// + f.asValue(RDF.TYPE),// + f.asValue(RDFS.SUBCLASSOF),// + f.asValue(XMLSchema.DECIMAL),// + + f.createBNode(UUID.randomUUID().toString()),// + f.createBNode("a12"),// + }; + + store.addTerms(terms); + + if(log.isInfoEnabled()) { + log.info(store.getLexiconRelation().dumpTerms()); + } + + /* + * Note: the language code is only used when tokenizing literals. It + * IS NOT applied as a filter to the recovered literals. + */ + + assertExpectedHits(store, "abc", null/* languageCode */, + new BigdataValue[] { // + f.createLiteral("abc"),// + f.createLiteral("abc", "en") // + }); + + assertExpectedHits(store, "tag", "en", new BigdataValue[] {// + f.createLiteral("gutten tag", "de"), // + f.createLiteral("tag team", "en") // + }); + + assertExpectedHits(store, "tag", "de", new BigdataValue[] {// + f.createLiteral("gutten tag", "de"), // + f.createLiteral("tag team", "en") // + }); + + assertExpectedHits(store, "GOOD DAY", "en", // + .0f, // minCosine + new BigdataValue[] {// + f.createLiteral("good day", "en"), // + f.createLiteral("the first day", "en") // + }); + + assertExpectedHits(store, "GOOD DAY", "en", // + .4f, // minCosine + new BigdataValue[] {// + f.createLiteral("good day", "en"), // + }); + + assertExpectedHits(store, "day", "en", // + .0f, // minCosine + new BigdataValue[] { + f.createLiteral("good day", "en"), + f.createLiteral("the first day", "en") }); + + // 'the' is a stopword, so there are no hits. + assertExpectedHits(store, "the", "en", new BigdataValue[] {}); + + /* + * re-open the store before search to verify that the data were made + * restart safe. + */ + if (store.isStable()) { + + store.commit(); + + store = reopenStore(store); + + } + + // rebuild the full text index. + store.getLexiconRelation().rebuildTextIndex(); + + /* + * re-open the store before search to verify that the data were made + * restart safe. 
+ */ + if (store.isStable()) { + + store.commit(); + + store = reopenStore(store); + + } + + // re-verify the full text index. + { + + assertNotNull(store.getLexiconRelation().getSearchEngine()); + + assertExpectedHits(store, "abc", null/* languageCode */, + new BigdataValue[] { // + f.createLiteral("abc"),// + f.createLiteral("abc", "en") // + }); + + assertExpectedHits(store, "tag", "en", new BigdataValue[] {// + f.createLiteral("gutten tag", "de"), // + f.createLiteral("tag team", "en") // + }); + + assertExpectedHits(store, "tag", "de", new BigdataValue[] {// + f.createLiteral("gutten tag", "de"), // + f.createLiteral("tag team", "en") // + }); + + assertExpectedHits(store, "GOOD DAY", "en", // + .0f, // minCosine + new BigdataValue[] {// + f.createLiteral("good day", "en"), // + f.createLiteral("the first day", "en") // + }); + + assertExpectedHits(store, "GOOD DAY", "en", // + .4f, // minCosine + new BigdataValue[] {// + f.createLiteral("good day", "en"), // + }); + + assertExpectedHits(store, "day", "en", // + .0f, // minCosine + new BigdataValue[] { + f.createLiteral("good day", "en"), + f.createLiteral("the first day", "en") }); + + } + + } finally { + + store.__tearDownUnitTest(); + + } + + } } This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
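The rebuildTextIndex() change above follows a general pattern: destroy the derived index, recreate it, then stream the filtered entries of the primary index back in using bounded batches (the commit uses a capacity of 10000 and filters out non-literal terms before they leave the index shard). The sketch below shows that generic pattern only; the interface and method names are hypothetical, not the bigdata API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/**
 * Generic sketch of a drop-and-rebuild for a derived index: recreate it,
 * then re-index the filtered primary entries in bounded batches.
 */
public class RebuildSketch {

    /** Hypothetical derived-index abstraction (illustrative only). */
    interface DerivedIndex<T> {
        void destroy();
        void create();
        void indexBatch(List<T> batch);
    }

    /**
     * Rebuild the derived index from the primary iterator, keeping only
     * items accepted by the filter. Returns the number of items indexed.
     */
    public static <T> int rebuild(final DerivedIndex<T> index,
            final Iterator<T> primary, final Predicate<T> filter,
            final int capacity) {
        index.destroy(); // discard the stale derived state
        index.create();
        int total = 0;
        final List<T> batch = new ArrayList<>(capacity);
        while (primary.hasNext()) {
            final T item = primary.next();
            if (!filter.test(item))
                continue; // e.g. skip non-literal terms before indexing
            batch.add(item);
            if (batch.size() == capacity) { // incremental flush
                index.indexBatch(batch);
                total += batch.size();
                batch.clear();
            }
        }
        if (!batch.isEmpty()) { // flush the final partial batch
            index.indexBatch(batch);
            total += batch.size();
        }
        return total;
    }
}
```

Batching matters here because, as the commit notes, a rebuild is a high-latency operation on any database of significant size; the capacity bounds memory while amortizing per-write overhead.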
From: <tho...@us...> - 2011-03-03 17:12:38
Revision: 4265 http://bigdata.svn.sourceforge.net/bigdata/?rev=4265&view=rev Author: thompsonbry Date: 2011-03-03 17:12:31 +0000 (Thu, 03 Mar 2011) Log Message: ----------- https://sourceforge.net/apps/trac/bigdata/ticket/221 Added an efficient code path for BTree#removeAll() when using the RWStore without delete markers for the index. There is now a unit test for this code path in TestRWJournal. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/AbstractBTree.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/AbstractNode.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/BTree.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/Leaf.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/Node.java branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/btree/TestDirtyIterators.java branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/btree/TestIterators.java branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/btree/TestRemoveAll.java branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/rwstore/TestRWJournal.java Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/AbstractBTree.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/AbstractBTree.java 2011-03-03 17:01:49 UTC (rev 4264) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/AbstractBTree.java 2011-03-03 17:12:31 UTC (rev 4265) @@ -625,7 +625,7 @@ * split may occur without forcing eviction of either node participating in * the split. * <p> - * Note: The code in {@link Node#postOrderNodeIterator(boolean)} and + * Note: The code in {@link Node#postOrderNodeIterator(boolean, boolean)} and * {@link DirtyChildIterator} MUST NOT touch the hard reference queue since * those iterators are used when persisting a node using a post-order * traversal. 
If a hard reference queue eviction drives the serialization of @@ -3481,8 +3481,8 @@ * * Note: This iterator only visits dirty nodes. */ - final Iterator<AbstractNode> itr = node - .postOrderNodeIterator(true/* dirtyNodesOnly */); + final Iterator<AbstractNode> itr = node.postOrderNodeIterator( + true/* dirtyNodesOnly */, false/* nodesOnly */); while (itr.hasNext()) { Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/AbstractNode.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/AbstractNode.java 2011-03-03 17:01:49 UTC (rev 4264) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/AbstractNode.java 2011-03-03 17:12:31 UTC (rev 4265) @@ -551,25 +551,46 @@ } - public Iterator<AbstractNode> postOrderNodeIterator() { + final public Iterator<AbstractNode> postOrderNodeIterator() { - return postOrderNodeIterator(false); + return postOrderNodeIterator(false/* dirtyNodesOnly */, false/* nodesOnly */); } /** - * Post-order traversal of nodes and leaves in the tree. For any given - * node, its children are always visited before the node itself (hence - * the node occurs in the post-order position in the traversal). The - * iterator is NOT safe for concurrent modification. + * Post-order traversal of nodes and leaves in the tree. For any given node, + * its children are always visited before the node itself (hence the node + * occurs in the post-order position in the traversal). The iterator is NOT + * safe for concurrent modification. * * @param dirtyNodesOnly * When true, only dirty nodes and leaves will be visited - * + * * @return Iterator visiting {@link AbstractNode}s. 
*/ - abstract public Iterator<AbstractNode> postOrderNodeIterator(boolean dirtyNodesOnly); + final public Iterator<AbstractNode> postOrderNodeIterator( + final boolean dirtyNodesOnly) { + return postOrderNodeIterator(dirtyNodesOnly, false/* nodesOnly */); + + } + + /** + * Post-order traversal of nodes and leaves in the tree. For any given node, + * its children are always visited before the node itself (hence the node + * occurs in the post-order position in the traversal). The iterator is NOT + * safe for concurrent modification. + * + * @param dirtyNodesOnly + * When true, only dirty nodes and leaves will be visited + * @param nodesOnly + * When <code>true</code>, the leaves will not be visited. + * + * @return Iterator visiting {@link AbstractNode}s. + */ + abstract public Iterator<AbstractNode> postOrderNodeIterator( + final boolean dirtyNodesOnly, final boolean nodesOnly); + public ITupleIterator entryIterator() { return rangeIterator(null/* fromKey */, null/* toKey */, Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/BTree.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/BTree.java 2011-03-03 17:01:49 UTC (rev 4264) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/BTree.java 2011-03-03 17:12:31 UTC (rev 4265) @@ -28,6 +28,7 @@ import java.lang.ref.WeakReference; import java.lang.reflect.Constructor; +import java.util.Iterator; import java.util.concurrent.atomic.AtomicLong; import com.bigdata.BigdataStatics; @@ -1174,19 +1175,64 @@ assertNotReadOnly(); - /* - * FIXME Per https://sourceforge.net/apps/trac/bigdata/ticket/221, we - * should special case this for the RWStore when delete markers are not - * enabled and just issue deletes against each node and leave in the - * BTree. 
This could be done using a post-order traversal of the nodes - * and leaves such that the parent is not removed from the store until - * its children have been removed. The deletes should be low-level - * IRawStore#delete(addr) invocations without maintenance to the B+Tree - * data structures. Afterwards replaceRootWithEmptyLeaf() should be - * invoked to discard the hard reference ring buffer and associate a new - * root leaf with the B+Tree. - */ - if (getIndexMetadata().getDeleteMarkers() + if (!getIndexMetadata().getDeleteMarkers() + && getStore() instanceof RWStrategy) { + + /* + * Per https://sourceforge.net/apps/trac/bigdata/ticket/221, we + * special case this for the RWStore when delete markers are not + * enabled and just issue deletes against each node and leave in the + * BTree. This is done using a post-order traversal of the nodes and + * leaves such that the parent is not removed from the store until + * its children have been removed. The deletes are low-level + * IRawStore#delete(addr) invocations without maintenance to the + * B+Tree data structures. Afterwards replaceRootWithEmptyLeaf() is + * invoked to discard the hard reference ring buffer and associate a + * new root leaf with the B+Tree. + * + * FIXME https://sourceforge.net/apps/trac/bigdata/ticket/217 - we + * should update the performance counters when we use this short + * cut to release the storage associated with the B+Tree. + */ + + /* + * Visit all Nodes using a pre-order traversal, but do not + * materialize the leaves. + */ + final Iterator<AbstractNode> itr = getRoot().postOrderNodeIterator( + false/* dirtyNodesOnly */, true/* nodesOnly */); + + while(itr.hasNext()) { + + final Node node = (Node) itr.next(); + + final int nchildren = node.getChildCount(); + + for (int i = 0; i < nchildren; i++) { + + final long childAddr = node.getChildAddr(i); + + if(childAddr != 0L) { + + // delete persistent child. 
+ getStore().delete(childAddr); + + } + + } + + } + + // delete root iff persistent. + if (getRoot().getIdentity() != 0L) { + + getStore().delete(getRoot().getIdentity()); + + } + + replaceRootWithEmptyLeaf(); + + } else if (getIndexMetadata().getDeleteMarkers() || getStore() instanceof RWStrategy) { /* Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/Leaf.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/Leaf.java 2011-03-03 17:01:49 UTC (rev 4264) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/Leaf.java 2011-03-03 17:12:31 UTC (rev 4265) @@ -1708,17 +1708,24 @@ /** * Visits this leaf if unless it is not dirty and the flag is true, in which * case the returned iterator will not visit anything. + * + * {@inheritDoc} */ @Override @SuppressWarnings("unchecked") - public Iterator<AbstractNode> postOrderNodeIterator(final boolean dirtyNodesOnly) { + public Iterator<AbstractNode> postOrderNodeIterator( + final boolean dirtyNodesOnly, final boolean nodesOnly) { if (dirtyNodesOnly && ! isDirty() ) { return EmptyIterator.DEFAULT; + } else if(nodesOnly) { + + return EmptyIterator.DEFAULT; + } else { - + return new SingleValueIterator(this); } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/Node.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/Node.java 2011-03-03 17:01:49 UTC (rev 4264) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/Node.java 2011-03-03 17:12:31 UTC (rev 4265) @@ -2874,7 +2874,7 @@ @Override @SuppressWarnings("unchecked") public Iterator<AbstractNode> postOrderNodeIterator( - final boolean dirtyNodesOnly) { + final boolean dirtyNodesOnly, final boolean nodesOnly) { if (dirtyNodesOnly && !dirty) { @@ -2886,7 +2886,7 @@ * Iterator append this node to the iterator in the post-order position. 
*/ - return new Striterator(postOrderIterator1(dirtyNodesOnly)) + return new Striterator(postOrderIterator1(dirtyNodesOnly,nodesOnly)) .append(new SingleValueIterator(this)); } @@ -2916,7 +2916,7 @@ */ @SuppressWarnings("unchecked") private Iterator<AbstractNode> postOrderIterator1( - final boolean dirtyNodesOnly) { + final boolean dirtyNodesOnly,final boolean nodesOnly) { /* * Iterator visits the direct children, expanding them in turn with a @@ -2969,8 +2969,8 @@ // visit the children (recursive post-order // traversal). final Striterator itr = new Striterator( - ((Node) child) - .postOrderIterator1(dirtyNodesOnly)); + ((Node) child).postOrderIterator1( + dirtyNodesOnly, nodesOnly)); // append this node in post-order position. itr.append(new SingleValueIterator(child)); @@ -2985,6 +2985,9 @@ // BTree.log.debug("child is leaf: " + child); // Visit the leaf itself. + if (nodesOnly) + return EmptyIterator.DEFAULT; + return new SingleValueIterator(child); } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/btree/TestDirtyIterators.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/btree/TestDirtyIterators.java 2011-03-03 17:01:49 UTC (rev 4264) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/btree/TestDirtyIterators.java 2011-03-03 17:12:31 UTC (rev 4265) @@ -38,7 +38,7 @@ * mechanisms. * * @see Node#childIterator(boolean) - * @see AbstractNode#postOrderNodeIterator(boolean) + * @see AbstractNode#postOrderNodeIterator(boolean, boolean) * * @see TestIterators, which handles iterators that do not differentiate between * clear and dirty nodes. 
Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/btree/TestIterators.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/btree/TestIterators.java 2011-03-03 17:01:49 UTC (rev 4264) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/btree/TestIterators.java 2011-03-03 17:12:31 UTC (rev 4265) @@ -40,12 +40,12 @@ * * @see Leaf#entryIterator() * @see Node#childIterator(boolean) - * @see AbstractNode#postOrderNodeIterator(boolean) + * @see AbstractNode#postOrderNodeIterator(boolean, boolean) * * @see TestDirtyIterators, which handles tests when some nodes or leaves are * NOT dirty and verifies that the iterators do NOT visit such nodes or * leaves. This tests {@link AbstractNode#postOrderNodeIterator()} as well - * since that is just {@link AbstractNode#postOrderNodeIterator(boolean)} + * since that is just {@link AbstractNode#postOrderNodeIterator(boolean, boolean)} * with <code>false</code> passed in. * * @author <a href="mailto:tho...@us...">Bryan Thompson</a> Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/btree/TestRemoveAll.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/btree/TestRemoveAll.java 2011-03-03 17:01:49 UTC (rev 4264) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/btree/TestRemoveAll.java 2011-03-03 17:12:31 UTC (rev 4265) @@ -27,15 +27,19 @@ package com.bigdata.btree; +import java.util.Properties; import java.util.UUID; import org.apache.log4j.Level; import com.bigdata.btree.keys.KeyBuilder; +import com.bigdata.journal.BufferMode; +import com.bigdata.journal.Journal; import com.bigdata.journal.TestRestartSafe; import com.bigdata.rawstore.Bytes; import com.bigdata.rawstore.IRawStore; import com.bigdata.rawstore.SimpleMemoryRawStore; +import com.bigdata.rwstore.RWStore; /** * Test suite for {@link BTree#removeAll()}. 
@@ -165,5 +169,5 @@ } } - + } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/rwstore/TestRWJournal.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/rwstore/TestRWJournal.java 2011-03-03 17:01:49 UTC (rev 4264) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/rwstore/TestRWJournal.java 2011-03-03 17:12:31 UTC (rev 4265) @@ -33,14 +33,18 @@ import java.util.ArrayList; import java.util.Properties; import java.util.TreeMap; +import java.util.UUID; import junit.extensions.proxy.ProxyTestSuite; import junit.framework.Test; +import com.bigdata.btree.BTree; import com.bigdata.btree.IIndex; import com.bigdata.btree.ITuple; import com.bigdata.btree.ITupleIterator; import com.bigdata.btree.IndexMetadata; +import com.bigdata.btree.SimpleEntry; +import com.bigdata.btree.keys.KeyBuilder; import com.bigdata.journal.AbstractInterruptsTestCase; import com.bigdata.journal.AbstractJournalTestCase; import com.bigdata.journal.AbstractMRMWTestCase; @@ -56,6 +60,7 @@ import com.bigdata.journal.TestJournalBasics; import com.bigdata.journal.Journal.Options; import com.bigdata.rawstore.AbstractRawStoreTestCase; +import com.bigdata.rawstore.Bytes; import com.bigdata.rawstore.IRawStore; import com.bigdata.util.InnerCause; @@ -228,7 +233,97 @@ } - /** + /** + * Unit tests for optimization when using the {@link RWStore} but not using + * delete markers. In this case, a post-order traversal is used to + * efficiently delete the nodes and leaves and the root leaf is then + * replaced. 
+ */ + public void test_removeAllRWStore() { + + final Journal store = new Journal(getProperties()); + + try { + + final BTree btree; + { + + final IndexMetadata metadata = new IndexMetadata(UUID + .randomUUID()); + + metadata.setBranchingFactor(3); + + metadata.setDeleteMarkers(false); + + btree = BTree.create(store, metadata); + + } + + final KeyBuilder keyBuilder = new KeyBuilder(Bytes.SIZEOF_INT); + + final int NTRIALS = 100; + + final int NINSERTS = 1000; + + final double removeAllRate = 0.001; + + final double checkpointRate = 0.001; + + for (int i = 0; i < NTRIALS; i++) { + + for (int j = 0; j < NINSERTS; j++) { + + if (r.nextDouble() < checkpointRate) { + + // flush to the backing store. + if (log.isInfoEnabled()) + log.info("checkpoint: " + btree.getStatistics()); + + btree.writeCheckpoint(); + + } + + if (r.nextDouble() < removeAllRate) { + + if (log.isInfoEnabled()) + log.info("removeAll: " + btree.getStatistics()); + + btree.removeAll(); + + } + + final int tmp = r.nextInt(10000); + + final byte[] key = keyBuilder.reset().append(tmp).getKey(); + + btree.insert(key, new SimpleEntry(tmp)); + + } + + } + + if (log.isInfoEnabled()) + log.info("will removeAll: "+btree.getStatistics()); + + btree.removeAll(); + + if (log.isInfoEnabled()) + log.info("will checkpoint: " + btree.getStatistics()); + + btree.writeCheckpoint(); + + if (log.isInfoEnabled()) + log.info(" did checkpoint: " + btree.getStatistics()); + + } finally { + + store.destroy(); + + } + + } + + /** * Test suite integration for {@link AbstractRestartSafeTestCase}. * * @todo there are several unit tests in this class that deal with This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
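The `removeAll()` optimization exercised by the test above walks the tree in post order so that every child's storage record is freed before its parent's, after which the root can be replaced by an empty leaf. A minimal, self-contained sketch of that traversal order — `PersistedNode`, `addr`, and `deletePostOrder` are made-up names for illustration, not the bigdata B+Tree API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: shows the post-order rule the RWStore optimization relies on,
// namely that each child's record is freed before its parent's record.
class PersistedNode {
    final long addr;                    // stand-in for the node's store address
    final List<PersistedNode> children; // empty for a leaf

    PersistedNode(final long addr, final PersistedNode... children) {
        this.addr = addr;
        this.children = Arrays.asList(children);
    }

    /** Free the children first (recursively), then this node's own record. */
    void deletePostOrder(final List<Long> freed) {
        for (PersistedNode child : children)
            child.deletePostOrder(freed);
        freed.add(addr); // stands in for getStore().delete(addr)
    }
}
```

With two leaves under a root, the freed addresses come out leaves-first and root last, which is why the real code can safely replace the root with an empty leaf once the traversal completes.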
From: <tho...@us...> - 2011-03-03 17:01:55
|
Revision: 4264 http://bigdata.svn.sourceforge.net/bigdata/?rev=4264&view=rev Author: thompsonbry Date: 2011-03-03 17:01:49 +0000 (Thu, 03 Mar 2011) Log Message: ----------- BTreeStatistics#toString() was writing leafCount for the nodeCount. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/BTreeStatistics.java Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/BTreeStatistics.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/BTreeStatistics.java 2011-03-03 16:37:20 UTC (rev 4263) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/BTreeStatistics.java 2011-03-03 17:01:49 UTC (rev 4264) @@ -96,7 +96,7 @@ ",entryCount=" + entryCount+ // ",height=" + height+ // ",leafCount=" + leafCount+ // - ",nodeCount=" + leafCount+ // + ",nodeCount=" + nodeCount+ // ",utilReport=" + utilReport+ // "}"; } |
From: <tho...@us...> - 2011-03-03 16:37:30
|
Revision: 4263 http://bigdata.svn.sourceforge.net/bigdata/?rev=4263&view=rev Author: thompsonbry Date: 2011-03-03 16:37:20 +0000 (Thu, 03 Mar 2011) Log Message: ----------- Removed @Overrides which are causing compile errors. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java 2011-03-02 19:15:21 UTC (rev 4262) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java 2011-03-03 16:37:20 UTC (rev 4263) @@ -168,29 +168,29 @@ } - @Override +// @Override public boolean isVar() { return true; } - @Override +// @Override public boolean isConstant() { return false; } - @Override +// @Override public Range get() { // log.debug("somebody tried to get me"); return null; } - @Override +// @Override public String getName() { return var().getName(); } - @Override +// @Override public boolean isWildcard() { return false; } |
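The commit above comments out `@Override` on methods that implement interface methods. The log message does not say which toolchain rejected them, but this is the classic Java 5 vs. Java 6 difference: under a 1.5 source level, `@Override` is only legal on a method that overrides a superclass method, not on an interface implementation. A minimal illustration — `Named` and `Impl` are hypothetical names, not bigdata classes:

```java
// Hypothetical example of the rule the commit works around. Under
// javac -source 1.5, uncommenting the annotation below is a compile
// error, because getName() implements an interface method rather than
// overriding a superclass method; Java 6 and later accept it.
interface Named {
    String getName();
}

class Impl implements Named {
    // @Override  // rejected by -source 1.5; fine on Java 6+
    public String getName() {
        return "impl";
    }
}
```

Commenting the annotations out (rather than deleting them) keeps the intent visible in the source while letting the stricter compiler pass.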
From: <tho...@us...> - 2011-03-02 19:15:30
|
Revision: 4262 http://bigdata.svn.sourceforge.net/bigdata/?rev=4262&view=rev Author: thompsonbry Date: 2011-03-02 19:15:21 +0000 (Wed, 02 Mar 2011) Log Message: ----------- Some optimizations of heap churn (the bop deep copy semantics are now shallow copy semantics, which is sufficient to maintain their immutable contract). Created a canonicalizing factory and applied it to the BigdataValueFactoryImpl, which is already shared for a given namespace. I am still looking at how to apply this to the LexiconRelation to increase sharing of the term cache. Added private loggers for the NanoHTTP class hierarchy. Added test w/o shared variables to TestPipelineJoin. Interned various strings which are used as annotation names. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOp.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOpBase.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/Constant.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/IConstraint.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/IPredicate.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/PipelineOp.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/aggregate/AggregateBase.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/ap/Predicate.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/ap/SampleIndex.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/bset/ConditionalRoutingOp.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/bset/CopyOp.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/constraint/INConstraint.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/controller/AbstractSubqueryOp.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/controller/SubqueryOp.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/engine/QueryEngine.java 
branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/join/PipelineJoin.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JoinGraph.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/mutation/InsertOp.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/ComparatorOp.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/DistinctBindingSetOp.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/GroupByOp.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/SliceOp.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/SortOp.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/IndexMetadata.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/counters/httpd/CounterSetHTTPD.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/counters/httpd/CounterSetHTTPDServer.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/util/httpd/AbstractHTTPD.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/util/httpd/NanoHTTPD.java branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/engine/PipelineDelayOp.java branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/join/TestPipelineJoin.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/bop/rdf/aggregate/GROUP_CONCAT.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/bop/rdf/join/DataSetJoin.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/CompareBOp.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/IsInlineBOp.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/MathBOp.java 
branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/model/BigdataValueFactoryImpl.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPOPredicate.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/store/AbstractTripleStore.java branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataEvaluationStrategyImpl3.java branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/bench/NanoSparqlServer.java Added Paths: ----------- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/util/CanonicalFactory.java Property Changed: ---------------- branches/QUADS_QUERY_BRANCH/bigdata-perf/ Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOp.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOp.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -226,10 +226,13 @@ // * @return <code>true</code> if all arguments and annotations are the same. // */ // boolean sameData(final BOp o); - - /** - * Interface declaring well known annotations. - */ + + /** + * Interface declaring well known annotations. + * <p> + * Note: Annotation names should be {@link String#intern() interned} in + * order to avoid having duplicate values for those strings on the heap. + */ public interface Annotations { /** @@ -238,7 +241,7 @@ * identifier for the {@link BOp} within the context of its owning * query. */ - String BOP_ID = BOp.class.getName() + ".bopId"; + String BOP_ID = (BOp.class.getName() + ".bopId").intern(); /** * The timeout for the operator evaluation (milliseconds). @@ -253,7 +256,7 @@ * be interpreted with respect to the time when the query began to * execute. 
*/ - String TIMEOUT = BOp.class.getName() + ".timeout"; + String TIMEOUT = (BOp.class.getName() + ".timeout").intern(); /** * The default timeout for operator evaluation. @@ -266,7 +269,7 @@ * * @see BOpEvaluationContext */ - String EVALUATION_CONTEXT = BOp.class.getName() + ".evaluationContext"; + String EVALUATION_CONTEXT = (BOp.class.getName() + ".evaluationContext").intern(); BOpEvaluationContext DEFAULT_EVALUATION_CONTEXT = BOpEvaluationContext.ANY; @@ -280,7 +283,7 @@ * * @see BOp#isController() */ - String CONTROLLER = BOp.class.getName()+".controller"; + String CONTROLLER = (BOp.class.getName()+".controller").intern(); boolean DEFAULT_CONTROLLER = false; Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOpBase.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOpBase.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOpBase.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -85,30 +85,30 @@ * not. A "copy on write" map might be better. */ static protected final transient Map<String,Object> NOANNS = Collections.emptyMap(); - - /** - * The argument values - <strong>direct access to this field is - * discouraged</strong> - the field is protected to support - * <em>mutation</em> APIs and should not be relied on for other purposes. - * <p> - * Note: This field is reported out as a {@link List} so we can make it - * thread safe and, if desired, immutable. However, it is internally a - * simple array and exposed to subclasses so they can implement mutation - * operations which return deep copies in which the argument values have - * been modified. 
- * <p> - * If we allow mutation of the arguments then caching of the arguments (or - * annotations) by classes such as {@link EQ} will cause {@link #clone()} to - * fail because (a) it will do a field-by-field copy on the concrete - * implementation class; and (b) it will not consistently update the cached - * references. In order to "fix" this problem, any classes which cache - * arguments or annotations would have to explicitly overrides - * {@link #clone()} in order to set those fields based on the arguments on - * the cloned {@link BOpBase} class. - * <p> - * Note: This must be at least "effectively" final per the effectively - * immutable contract for {@link BOp}s. - */ + + /** + * The argument values - <strong>direct access to this field is + * discouraged</strong> - the field is protected to support + * <em>mutation</em> APIs and should not be relied on for other purposes. + * <p> + * Note: This field is reported out as a {@link List} so we can make it + * thread safe and, if desired, immutable. However, it is internally a + * simple array. Subclasses can implement mutation operations which return + * deep copies in which the argument values have been modified using + * {@link #_set(int, BOp)}. + * <p> + * If we allowed mutation of the arguments (outside of the object creation + * pattern) then caching of the arguments (or annotations) by classes such + * as {@link EQ} will cause {@link #clone()} to fail because (a) it will do + * a field-by-field copy on the concrete implementation class; and (b) it + * will not consistently update the cached references. In order to "fix" + * this problem, any classes which cache arguments or annotations would have + * to explicitly overrides {@link #clone()} in order to set those fields + * based on the arguments on the cloned {@link BOpBase} class. + * <p> + * Note: This must be at least "effectively" final per the effectively + * immutable contract for {@link BOp}s. 
+ */ private final BOp[] args; /** @@ -118,7 +118,14 @@ * immutable contract for {@link BOp}s. */ private final Map<String,Object> annotations; - + + /** + * The default initial capacity used for an empty annotation map -- empty + * maps use the minimum initial capacity to avoid waste since we create a + * large number of {@link BOp}s during query evaluation. + */ + static private transient final int DEFAULT_INITIAL_CAPACITY = 2; + /** * Check the operator argument. * @@ -199,7 +206,12 @@ // deep copy the annotations. // annotations = deepCopy(op.annotations); // Note: only shallow copy is required to achieve immutable semantics! - args = Arrays.copyOf(op.args, op.args.length); + if (op.args == NOARGS || op.args.length == 0) { + // fast path for zero arity operators. + args = NOARGS; + } else { + args = Arrays.copyOf(op.args, op.args.length); + } annotations = new LinkedHashMap<String, Object>(op.annotations); } @@ -232,12 +244,57 @@ checkArgs(args); this.args = args; - - this.annotations = (annotations == null ? new LinkedHashMap<String, Object>() - : annotations); - + + this.annotations = (annotations == null ? new LinkedHashMap<String, Object>( + DEFAULT_INITIAL_CAPACITY) + : annotations); + } + /* + * Note: This will not work since the keys provide the strong references to + * the values.... For this purpose we need to use a ConcurrentHashMap with + * an access policy which did not rely on weak references to clear its + * entries. + */ +// static private Map<String, Object> internMap(final Map<String, Object> anns) { +// final int initialCapacity = (int) (anns.size() / .75f/* loadFactor */) + 1; +// final Map<String, Object> t = new LinkedHashMap<String, Object>( +// initialCapacity); +// for(Map.Entry<String,Object> e : t.entrySet()) { +// final String k = intern(e.getKey()); +// t.put(k, e.getValue()); +// } +// return t; +// } +// +// /** +// * Intern the string within a canonicalizing hash map using weak values. 
+// * @param s +// * @return +// */ +// static private String intern(final String s) { +// +// final String t = termCache.putIfAbsent(s, s); +// +// if (t != null) +// return t; +// +// return s; +// +// } +// +// /** +// * A canonicalizing hash map using weak values. Entries will be cleared from +// * the map once their values are no longer referenced. +// * ConcurrentWeakValueCacheWithBatchedUpdates +// */ +// static private transient final ConcurrentWeakValueCacheWithBatchedUpdates<String,String> termCache = new ConcurrentWeakValueCacheWithBatchedUpdates<String,String>(// +// 1000, // queueCapacity +// .75f, // loadFactor (.75 is the default) +// 16 // concurrency level (16 is the default) +// ); + final public Map<String, Object> annotations() { return Collections.unmodifiableMap(annotations); @@ -375,10 +432,10 @@ * before returning control to the caller. This would result in less * heap churn. */ - static protected BOp[] deepCopy(final BOp[] a) { - if (a == NOARGS) { - // fast path for zero arity operators. - return a; + static protected BOp[] deepCopy(final BOp[] a) { + if (a == NOARGS || a.length == 0) { + // fast path for zero arity operators. + return NOARGS; } final BOp[] t = new BOp[a.length]; for (int i = 0; i < a.length; i++) { Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/Constant.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/Constant.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/Constant.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -44,7 +44,7 @@ * The {@link IVariable} which is bound to that constant value * (optional). 
*/ - String VAR = Constant.class.getName() + ".var"; + String VAR = (Constant.class.getName() + ".var").intern(); } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/IConstraint.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/IConstraint.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/IConstraint.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -52,4 +52,9 @@ */ public boolean accept(IBindingSet bindingSet); + /** + * A zero length empty {@link IConstraint} array. + */ + public IConstraint[] EMPTY = new IConstraint[0]; + } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/IPredicate.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/IPredicate.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/IPredicate.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -84,7 +84,7 @@ * @see https://sourceforge.net/apps/trac/bigdata/ticket/180 (Migrate * the RDFS inference and truth maintenance logic to BOPs) */ - String RELATION_NAME = IPredicate.class.getName() + ".relationName"; + String RELATION_NAME = (IPredicate.class.getName() + ".relationName").intern(); // /** // * The {@link IKeyOrder} which will be used to read on the relation. @@ -99,7 +99,7 @@ /** * <code>true</code> iff the predicate has SPARQL optional semantics. */ - String OPTIONAL = IPredicate.class.getName() + ".optional"; + String OPTIONAL = (IPredicate.class.getName() + ".optional").intern(); // /** // * Constraints on the elements read from the relation. 
@@ -139,7 +139,7 @@ * * @see IRangeQuery#rangeIterator(byte[], byte[], int, int, IFilter) */ - String INDEX_LOCAL_FILTER = IPredicate.class.getName() + ".indexLocalFilter"; + String INDEX_LOCAL_FILTER = (IPredicate.class.getName() + ".indexLocalFilter").intern(); /** * An optional {@link BOpFilterBase} to be applied to the elements of @@ -156,7 +156,7 @@ * one another. You can chain {@link FilterBase} filters together as * well. */ - String ACCESS_PATH_FILTER = IPredicate.class.getName() + ".accessPathFilter"; + String ACCESS_PATH_FILTER = (IPredicate.class.getName() + ".accessPathFilter").intern(); /** * Access path expander pattern. This allows you to wrap or replace the @@ -185,13 +185,13 @@ * * @see IAccessPathExpander */ - String ACCESS_PATH_EXPANDER = IPredicate.class.getName() + ".accessPathExpander"; + String ACCESS_PATH_EXPANDER = (IPredicate.class.getName() + ".accessPathExpander").intern(); /** * The partition identifier -or- <code>-1</code> if the predicate does * not address a specific shard. */ - String PARTITION_ID = IPredicate.class.getName() + ".partitionId"; + String PARTITION_ID = (IPredicate.class.getName() + ".partitionId").intern(); int DEFAULT_PARTITION_ID = -1; @@ -233,7 +233,7 @@ * * @see BOpEvaluationContext */ - String REMOTE_ACCESS_PATH = IPredicate.class.getName() + ".remoteAccessPath"; + String REMOTE_ACCESS_PATH = (IPredicate.class.getName() + ".remoteAccessPath").intern(); boolean DEFAULT_REMOTE_ACCESS_PATH = true; @@ -245,8 +245,8 @@ * * @see #DEFAULT_FULLY_BUFFERED_READ_THRESHOLD */ - String FULLY_BUFFERED_READ_THRESHOLD = IPredicate.class.getName() - + ".fullyBufferedReadThreshold"; + String FULLY_BUFFERED_READ_THRESHOLD = (IPredicate.class.getName() + + ".fullyBufferedReadThreshold").intern(); /** * Default for {@link #FULLY_BUFFERED_READ_THRESHOLD}. 
@@ -277,7 +277,7 @@ * * @see #DEFAULT_FLAGS */ - String FLAGS = IPredicate.class.getName() + ".flags"; + String FLAGS = (IPredicate.class.getName() + ".flags").intern(); /** * The default flags will visit the keys and values of the non-deleted @@ -302,7 +302,7 @@ * * @see #TIMESTAMP */ - String MUTATION = IPredicate.class.getName() + ".mutation"; + String MUTATION = (IPredicate.class.getName() + ".mutation").intern(); boolean DEFAULT_MUTATION = false; @@ -312,7 +312,7 @@ * * @see #MUTATION */ - String TIMESTAMP = IPredicate.class.getName() + ".timestamp"; + String TIMESTAMP = (IPredicate.class.getName() + ".timestamp").intern(); // /** // * An optional {@link IConstraint}[] which places restrictions on the Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/PipelineOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/PipelineOp.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/PipelineOp.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -71,7 +71,7 @@ * the ancestor in the operator tree which serves as the default sink * for binding sets (optional, default is the parent). */ - String SINK_REF = PipelineOp.class.getName() + ".sinkRef"; + String SINK_REF = (PipelineOp.class.getName() + ".sinkRef").intern(); /** * The value of the annotation is the {@link BOp.Annotations#BOP_ID} of @@ -80,8 +80,8 @@ * * @see #ALT_SINK_GROUP */ - String ALT_SINK_REF = PipelineOp.class.getName() - + ".altSinkRef"; + String ALT_SINK_REF = (PipelineOp.class.getName() + ".altSinkRef") + .intern(); /** * The value reported by {@link PipelineOp#isSharedState()} (default @@ -96,7 +96,8 @@ * When <code>true</code>, the {@link QueryEngine} will impose the * necessary constraints when the operator is evaluated. 
*/ - String SHARED_STATE = PipelineOp.class.getName() + ".sharedState"; + String SHARED_STATE = (PipelineOp.class.getName() + ".sharedState") + .intern(); boolean DEFAULT_SHARED_STATE = false; @@ -116,7 +117,7 @@ * have less effect and performance tends to be best around a modest * value (10) for those annotations. */ - String MAX_PARALLEL = PipelineOp.class.getName() + ".maxParallel"; + String MAX_PARALLEL = (PipelineOp.class.getName() + ".maxParallel").intern(); /** * @see #MAX_PARALLEL @@ -136,8 +137,8 @@ * data to be assigned to an evaluation task is governed by * {@link #MAX_MEMORY} instead. */ - String MAX_MESSAGES_PER_TASK = PipelineOp.class.getName() - + ".maxMessagesPerTask"; + String MAX_MESSAGES_PER_TASK = (PipelineOp.class.getName() + + ".maxMessagesPerTask").intern(); /** * @see #MAX_MESSAGES_PER_TASK @@ -151,8 +152,8 @@ * amount of data which can be buffered on the JVM heap during pipelined * query evaluation. */ - String PIPELINE_QUEUE_CAPACITY = PipelineOp.class.getName() - + ".pipelineQueueCapacity"; + String PIPELINE_QUEUE_CAPACITY = (PipelineOp.class.getName() + + ".pipelineQueueCapacity").intern(); /** * @see #PIPELINE_QUEUE_CAPACITY @@ -165,7 +166,7 @@ * "blocked" evaluation depending on how it buffers its data for * evaluation. */ - String PIPELINED = PipelineOp.class.getName() + ".pipelined"; + String PIPELINED = (PipelineOp.class.getName() + ".pipelined").intern(); /** * @see #PIPELINED @@ -201,7 +202,7 @@ * semantics. Such operators MUST throw an exception if the value of * this annotation could result in multiple evaluation passes. 
*/ - String MAX_MEMORY = PipelineOp.class.getName() + ".maxMemory"; + String MAX_MEMORY = (PipelineOp.class.getName() + ".maxMemory").intern(); /** * @see #MAX_MEMORY Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/aggregate/AggregateBase.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/aggregate/AggregateBase.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/aggregate/AggregateBase.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -103,14 +103,14 @@ * The aggregate function identifier ({@link FunctionCode#COUNT}, * {@link FunctionCode#SUM}, etc). */ - String FUNCTION_CODE = AggregateBase.class.getName() + ".functionCode"; + String FUNCTION_CODE = (AggregateBase.class.getName() + ".functionCode").intern(); /** * Optional boolean property indicates whether the aggregate applies to * the distinct within group solutions (default * {@value #DEFAULT_DISTINCT}). 
*/ - String DISTINCT = AggregateBase.class.getName() + ".distinct"; + String DISTINCT = (AggregateBase.class.getName() + ".distinct").intern(); boolean DEFAULT_DISTINCT = false; Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/ap/Predicate.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/ap/Predicate.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/ap/Predicate.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -35,7 +35,6 @@ import com.bigdata.bop.Constant; import com.bigdata.bop.IBindingSet; import com.bigdata.bop.IConstant; -import com.bigdata.bop.IConstraint; import com.bigdata.bop.IElement; import com.bigdata.bop.IPredicate; import com.bigdata.bop.IVariable; Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/ap/SampleIndex.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/ap/SampleIndex.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/ap/SampleIndex.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -117,7 +117,7 @@ /** * The sample limit (default {@value #DEFAULT_LIMIT}). */ - String LIMIT = SampleIndex.class.getName() + ".limit"; + String LIMIT = (SampleIndex.class.getName() + ".limit").intern(); int DEFAULT_LIMIT = 100; @@ -126,7 +126,7 @@ * (default {@value #DEFAULT_SEED}). A non-zero value may be used to * create a repeatable sample. */ - String SEED = SampleIndex.class.getName() + ".seed"; + String SEED = (SampleIndex.class.getName() + ".seed").intern(); long DEFAULT_SEED = 0L; @@ -134,12 +134,12 @@ * The {@link IPredicate} describing the access path to be sampled * (required). 
*/ - String PREDICATE = SampleIndex.class.getName() + ".predicate"; + String PREDICATE = (SampleIndex.class.getName() + ".predicate").intern(); /** * The type of sample to take (default {@value #DEFAULT_SAMPLE_TYPE)}. */ - String SAMPLE_TYPE = SampleIndex.class.getName() + ".sampleType"; + String SAMPLE_TYPE = (SampleIndex.class.getName() + ".sampleType").intern(); String DEFAULT_SAMPLE_TYPE = SampleType.RANDOM.name(); Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/bset/ConditionalRoutingOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/bset/ConditionalRoutingOp.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/bset/ConditionalRoutingOp.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -69,7 +69,7 @@ * When the condition is not satisfied, the binding set is routed to the * alternative sink. */ - String CONDITION = ConditionalRoutingOp.class.getName() + ".condition"; + String CONDITION = (ConditionalRoutingOp.class.getName() + ".condition").intern(); } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/bset/CopyOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/bset/CopyOp.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/bset/CopyOp.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -68,13 +68,13 @@ * An optional {@link IConstraint}[] which places restrictions on the * legal patterns in the variable bindings. */ - String CONSTRAINTS = CopyOp.class.getName() + ".constraints"; + String CONSTRAINTS = (CopyOp.class.getName() + ".constraints").intern(); /** * An optional {@link IBindingSet}[] to be used <strong>instead</strong> * of the default source. 
*/ - String BINDING_SETS = CopyOp.class.getName() + ".bindingSets"; + String BINDING_SETS = (CopyOp.class.getName() + ".bindingSets").intern(); } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/constraint/INConstraint.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/constraint/INConstraint.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/constraint/INConstraint.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -59,7 +59,7 @@ /** * The variable against which the constraint is applied. */ - String VARIABLE = INConstraint.class.getName() + ".variable"; + String VARIABLE = (INConstraint.class.getName() + ".variable").intern(); /** * The set of allowed values for that variable. @@ -67,7 +67,7 @@ * @todo allow large sets to be specified by reference to a resource * which is then materialized on demand during evaluation. */ - String SET = INConstraint.class.getName() + ".set"; + String SET = (INConstraint.class.getName() + ".set").intern(); } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/controller/AbstractSubqueryOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/controller/AbstractSubqueryOp.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/controller/AbstractSubqueryOp.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -90,20 +90,21 @@ public interface Annotations extends PipelineOp.Annotations { - /** - * The ordered {@link BOp}[] of subqueries to be evaluated for each - * binding set presented (required). - */ - String SUBQUERIES = SubqueryOp.class.getName() + ".subqueries"; + /** + * The ordered {@link BOp}[] of subqueries to be evaluated for each + * binding set presented (required). 
+ */ + String SUBQUERIES = (AbstractSubqueryOp.class.getName() + ".subqueries") + .intern(); - /** - * The maximum parallelism with which the subqueries will be evaluated - * (default is unlimited). - */ - String MAX_PARALLEL_SUBQUERIES = AbstractSubqueryOp.class.getName() - + ".maxParallelSubqueries"; + /** + * The maximum parallelism with which the subqueries will be evaluated + * (default is unlimited). + */ + String MAX_PARALLEL_SUBQUERIES = (AbstractSubqueryOp.class.getName() + ".maxParallelSubqueries") + .intern(); - int DEFAULT_MAX_PARALLEL_SUBQUERIES = Integer.MAX_VALUE; + int DEFAULT_MAX_PARALLEL_SUBQUERIES = Integer.MAX_VALUE; } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/controller/SubqueryOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/controller/SubqueryOp.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/controller/SubqueryOp.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -74,14 +74,14 @@ * {@link SubqueryOp} (required). This should be a * {@link PipelineOp}. */ - String SUBQUERY = SubqueryOp.class.getName() + ".subquery"; + String SUBQUERY = (SubqueryOp.class.getName() + ".subquery").intern(); /** * When <code>true</code> the subquery has optional semantics (if the * subquery fails, the original binding set will be passed along to the * downstream sink anyway) (default {@value #DEFAULT_OPTIONAL}). 
*/ - String OPTIONAL = SubqueryOp.class.getName() + ".optional"; + String OPTIONAL = (SubqueryOp.class.getName() + ".optional").intern(); boolean DEFAULT_OPTIONAL = false; Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/engine/QueryEngine.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/engine/QueryEngine.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/engine/QueryEngine.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -228,8 +228,8 @@ * {@link QueryEngine#newRunningQuery(QueryEngine, UUID, boolean, IQueryClient, PipelineOp)} * in which case they might not support this option. */ - String RUNNING_QUERY_CLASS = QueryEngine.class.getName() - + ".runningQueryClass"; + String RUNNING_QUERY_CLASS = (QueryEngine.class.getName() + + ".runningQueryClass").intern(); // String DEFAULT_RUNNING_QUERY_CLASS = StandaloneChainedRunningQuery.class.getName(); String DEFAULT_RUNNING_QUERY_CLASS = ChunkedRunningQuery.class.getName(); Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/join/PipelineJoin.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/join/PipelineJoin.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/join/PipelineJoin.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -121,14 +121,14 @@ * The {@link IPredicate} which is used to generate the * {@link IAccessPath}s during the join. */ - String PREDICATE = PipelineJoin.class.getName() + ".predicate"; + String PREDICATE = (PipelineJoin.class.getName() + ".predicate").intern(); /** * An optional {@link IVariable}[] identifying the variables to be * retained in the {@link IBindingSet}s written out by the operator. All * variables are retained unless this annotation is specified. 
*/ - String SELECT = PipelineJoin.class.getName() + ".select"; + String SELECT = (PipelineJoin.class.getName() + ".select").intern(); // /** // * Marks the join as "optional" in the SPARQL sense. Binding sets which @@ -149,7 +149,7 @@ * An {@link IConstraint}[] which places restrictions on the legal * patterns in the variable bindings (optional). */ - String CONSTRAINTS = PipelineJoin.class.getName() + ".constraints"; + String CONSTRAINTS = (PipelineJoin.class.getName() + ".constraints").intern(); /** * The maximum parallelism with which the pipeline will consume the @@ -175,7 +175,7 @@ * this option might well go away which would allow us to simplify * the PipelineJoin implementation. */ - String MAX_PARALLEL_CHUNKS = PipelineJoin.class.getName() + ".maxParallelChunks"; + String MAX_PARALLEL_CHUNKS = (PipelineJoin.class.getName() + ".maxParallelChunks").intern(); int DEFAULT_MAX_PARALLEL_CHUNKS = 0; @@ -195,8 +195,8 @@ * * @todo unit tests when (en|dis)abled. */ - String COALESCE_DUPLICATE_ACCESS_PATHS = PipelineJoin.class.getName() - + ".coalesceDuplicateAccessPaths"; + String COALESCE_DUPLICATE_ACCESS_PATHS = (PipelineJoin.class.getName() + + ".coalesceDuplicateAccessPaths").intern(); boolean DEFAULT_COALESCE_DUPLICATE_ACCESS_PATHS = true; @@ -206,7 +206,7 @@ * * @todo Unit tests for this feature (it is used by the JoinGraph). 
*/ - String LIMIT = PipelineJoin.class.getName() + ".limit"; + String LIMIT = (PipelineJoin.class.getName() + ".limit").intern(); long DEFAULT_LIMIT = Long.MAX_VALUE; Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -801,7 +801,7 @@ if (constraints == null) { // replace with an empty array. - constraints = new IConstraint[0]; + constraints = IConstraint.EMPTY; } /* Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JoinGraph.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JoinGraph.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JoinGraph.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -83,25 +83,25 @@ /** * The variables which are projected out of the join graph. */ - String SELECTED = JoinGraph.class.getName() + ".selected"; + String SELECTED = (JoinGraph.class.getName() + ".selected").intern(); /** * The vertices of the join graph, expressed an an {@link IPredicate}[] * (required). */ - String VERTICES = JoinGraph.class.getName() + ".vertices"; + String VERTICES = (JoinGraph.class.getName() + ".vertices").intern(); /** * The constraints on the join graph, expressed an an * {@link IConstraint}[] (optional, defaults to no constraints). */ - String CONSTRAINTS = JoinGraph.class.getName() + ".constraints"; + String CONSTRAINTS = (JoinGraph.class.getName() + ".constraints").intern(); /** * The initial limit for cutoff sampling (default * {@value #DEFAULT_LIMIT}). 
*/ - String LIMIT = JoinGraph.class.getName() + ".limit"; + String LIMIT = (JoinGraph.class.getName() + ".limit").intern(); int DEFAULT_LIMIT = 100; @@ -110,7 +110,7 @@ * cardinality will be used to generate the initial join paths (default * {@value #DEFAULT_NEDGES}). This must be a positive integer. */ - String NEDGES = JoinGraph.class.getName() + ".nedges"; + String NEDGES = (JoinGraph.class.getName() + ".nedges").intern(); int DEFAULT_NEDGES = 2; @@ -119,7 +119,7 @@ * * @see SampleIndex.SampleType */ - String SAMPLE_TYPE = JoinGraph.class.getName() + ".sampleType"; + String SAMPLE_TYPE = (JoinGraph.class.getName() + ".sampleType").intern(); String DEFAULT_SAMPLE_TYPE = SampleType.RANDOM.name(); Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/mutation/InsertOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/mutation/InsertOp.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/mutation/InsertOp.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -78,17 +78,17 @@ * @see IPredicate#asBound(IBindingSet) * @see IRelation#newElement(java.util.List, IBindingSet) */ - String SELECTED = InsertOp.class.getName() + ".selected"; + String SELECTED = (InsertOp.class.getName() + ".selected").intern(); /** * The namespace of the relation to which the index belongs. */ - String RELATION = InsertOp.class.getName() + ".relation"; + String RELATION = (InsertOp.class.getName() + ".relation").intern(); /** * The {@link IKeyOrder} for the index. 
*/ - String KEY_ORDER = InsertOp.class.getName() + ".keyOrder"; + String KEY_ORDER = (InsertOp.class.getName() + ".keyOrder").intern(); } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/ComparatorOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/ComparatorOp.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/ComparatorOp.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -55,7 +55,7 @@ * will be imposed and the order (ascending or descending) for each * variable. */ - String ORDER = ComparatorOp.class.getName() + ".order"; + String ORDER = (ComparatorOp.class.getName() + ".order").intern(); } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/DistinctBindingSetOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/DistinctBindingSetOp.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/DistinctBindingSetOp.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -51,7 +51,7 @@ * Binding sets with distinct values for the specified variables will be * passed on. 
*/ - String VARIABLES = DistinctBindingSetOp.class.getName() + ".variables"; + String VARIABLES = (DistinctBindingSetOp.class.getName() + ".variables").intern(); } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/GroupByOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/GroupByOp.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/GroupByOp.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -59,7 +59,7 @@ * {@link #GROUP_BY} declaration as simple {@link IVariable} s; or (b) * be declared by {@link #COMPUTE}. */ - String SELECT = GroupByOp.class.getName() + ".select"; + String SELECT = (GroupByOp.class.getName() + ".select").intern(); // /** // * The ordered set of {@link IValueExpression}s which are to be @@ -90,7 +90,7 @@ * the aggregation groups (required). Variables references will be * resolved against the incoming solutions. */ - String GROUP_BY = GroupByOp.class.getName() + ".groupBy"; + String GROUP_BY = (GroupByOp.class.getName() + ".groupBy").intern(); /** * An {@link IConstraint}[] applied to the aggregated solutions @@ -99,7 +99,7 @@ * TODO Should be the BEV of an {@link IValueExpression}, which might or * might not be an {@link IConstraint}. */ - String HAVING = GroupByOp.class.getName() + ".having"; + String HAVING = (GroupByOp.class.getName() + ".having").intern(); } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/SliceOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/SliceOp.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/SliceOp.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -88,7 +88,7 @@ /** * The first solution to be returned to the caller (origin ZERO). 
*/ - String OFFSET = SliceOp.class.getName() + ".offset"; + String OFFSET = (SliceOp.class.getName() + ".offset").intern(); long DEFAULT_OFFSET = 0L; @@ -96,7 +96,7 @@ * The maximum #of solutions to be returned to the caller (default is * all). */ - String LIMIT = SliceOp.class.getName() + ".limit"; + String LIMIT = (SliceOp.class.getName() + ".limit").intern(); /** * A value of {@link Long#MAX_VALUE} is used to indicate that there is Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/SortOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/SortOp.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/SortOp.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -55,7 +55,7 @@ * * @see ComparatorOp */ - String COMPARATOR = MemorySortOp.class.getName() + ".comparator"; + String COMPARATOR = (MemorySortOp.class.getName() + ".comparator").intern(); } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/IndexMetadata.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/IndexMetadata.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/btree/IndexMetadata.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -316,9 +316,9 @@ * * @see #DEFAULT_BLOOM_FILTER */ - String BLOOM_FILTER = com.bigdata.btree.BTree.class.getPackage() + String BLOOM_FILTER = (com.bigdata.btree.BTree.class.getPackage() .getName() - + ".bloomFilter"; + + ".bloomFilter").intern(); String DEFAULT_BLOOM_FILTER = "false"; @@ -379,9 +379,9 @@ * application is performing sustained writes on the index (hundreds of * thousands to millions of records). 
*/ - String WRITE_RETENTION_QUEUE_CAPACITY = com.bigdata.btree.AbstractBTree.class + String WRITE_RETENTION_QUEUE_CAPACITY = (com.bigdata.btree.AbstractBTree.class .getPackage().getName() - + ".writeRetentionQueue.capacity"; + + ".writeRetentionQueue.capacity").intern(); /** * The #of entries on the write retention queue that will be scanned for @@ -392,9 +392,9 @@ * incremental writes occur iff the {@link AbstractNode#referenceCount} * is zero and the node or leaf is dirty. */ - String WRITE_RETENTION_QUEUE_SCAN = com.bigdata.btree.AbstractBTree.class + String WRITE_RETENTION_QUEUE_SCAN = (com.bigdata.btree.AbstractBTree.class .getPackage().getName() - + ".writeRetentionQueue.scan"; + + ".writeRetentionQueue.scan").intern(); String DEFAULT_WRITE_RETENTION_QUEUE_CAPACITY = "500";// was 500 @@ -408,17 +408,17 @@ * * FIXME {@link KeyBuilder} configuration support is not finished. */ - String KEY_BUILDER_FACTORY = com.bigdata.btree.AbstractBTree.class + String KEY_BUILDER_FACTORY = (com.bigdata.btree.AbstractBTree.class .getPackage().getName() - + "keyBuilderFactory"; + + "keyBuilderFactory").intern(); /** * Override the {@link IRabaCoder} used for the keys in the nodes of a * B+Tree (the default is a {@link FrontCodedRabaCoder} instance). 
*/ - String NODE_KEYS_CODER = com.bigdata.btree.AbstractBTree.class + String NODE_KEYS_CODER = (com.bigdata.btree.AbstractBTree.class .getPackage().getName() - + "nodeKeysCoder"; + + "nodeKeysCoder").intern(); /** * Override the {@link IRabaCoder} used for the keys of leaves in @@ -426,9 +426,9 @@ * * @see DefaultTupleSerializer#setLeafKeysCoder(IRabaCoder) */ - String LEAF_KEYS_CODER = com.bigdata.btree.AbstractBTree.class + String LEAF_KEYS_CODER = (com.bigdata.btree.AbstractBTree.class .getPackage().getName() - + ".leafKeysCoder"; + + ".leafKeysCoder").intern(); /** * Override the {@link IRabaCoder} used for the values of leaves in @@ -436,9 +436,9 @@ * * @see DefaultTupleSerializer#setLeafValuesCoder(IRabaCoder) */ - String LEAF_VALUES_CODER = com.bigdata.btree.AbstractBTree.class + String LEAF_VALUES_CODER = (com.bigdata.btree.AbstractBTree.class .getPackage().getName() - + ".leafValuesCoder"; + + ".leafValuesCoder").intern(); // /** // * Option determines whether or not per-child locks are used by @@ -492,7 +492,7 @@ * also need to override the {@link Checkpoint} class - for * example the {@link MetadataIndex} does this. */ - String BTREE_CLASS_NAME = BTree.class.getName()+".className"; + String BTREE_CLASS_NAME = (BTree.class.getName()+".className").intern(); /** * The name of an optional property whose value specifies the branching @@ -501,7 +501,7 @@ * @see #DEFAULT_BTREE_BRANCHING_FACTOR * @see #INDEX_SEGMENT_BRANCHING_FACTOR */ - String BTREE_BRANCHING_FACTOR = BTree.class.getName()+".branchingFactor"; + String BTREE_BRANCHING_FACTOR = (BTree.class.getName()+".branchingFactor").intern(); /** * The default branching factor for a mutable {@link BTree}. @@ -595,8 +595,8 @@ * * FIXME Record level compression support is not finished. 
*/ - String BTREE_RECORD_COMPRESSOR_FACTORY = BTree.class.getName() - + ".recordCompressorFactory"; + String BTREE_RECORD_COMPRESSOR_FACTORY = (BTree.class.getName() + + ".recordCompressorFactory").intern(); /** * @@ -614,9 +614,9 @@ * The name of the property whose value specifies the branching factory * for an immutable {@link IndexSegment}. */ - String INDEX_SEGMENT_BRANCHING_FACTOR = IndexSegment.class + String INDEX_SEGMENT_BRANCHING_FACTOR = (IndexSegment.class .getName() - + ".branchingFactor"; + + ".branchingFactor").intern(); /** * The default branching factor for an {@link IndexSegment}. @@ -646,8 +646,8 @@ * @todo should be on by default? (but verify that the unit tests do * not run out of memory when it is enabled by default). */ - String INDEX_SEGMENT_BUFFER_NODES = IndexSegment.class.getName() - + ".bufferNodes"; + String INDEX_SEGMENT_BUFFER_NODES = (IndexSegment.class.getName() + + ".bufferNodes").intern(); /** * @see #INDEX_SEGMENT_BUFFER_NODES @@ -711,8 +711,8 @@ * * FIXME Record level compression support is not finished. */ - String INDEX_SEGMENT_RECORD_COMPRESSOR_FACTORY = IndexSegment.class.getName() - + ".recordCompressorFactory"; + String INDEX_SEGMENT_RECORD_COMPRESSOR_FACTORY = (IndexSegment.class.getName() + + ".recordCompressorFactory").intern(); /** * @@ -796,9 +796,9 @@ * {@link AbstractSubtask} sink handling writes for the associated index * partition. */ - String MASTER_QUEUE_CAPACITY = AsynchronousIndexWriteConfiguration.class + String MASTER_QUEUE_CAPACITY = (AsynchronousIndexWriteConfiguration.class .getName() - + ".masterQueueCapacity"; + + ".masterQueueCapacity").intern(); String DEFAULT_MASTER_QUEUE_CAPACITY = "5000"; @@ -806,9 +806,9 @@ * The desired size of the chunks that the master will draw from its * queue. 
*/ - String MASTER_CHUNK_SIZE = AsynchronousIndexWriteConfiguration.class + String MASTER_CHUNK_SIZE = (AsynchronousIndexWriteConfiguration.class .getName() - + ".masterChunkSize"; + + ".masterChunkSize").intern(); String DEFAULT_MASTER_CHUNK_SIZE = "10000"; @@ -816,9 +816,9 @@ * The time in nanoseconds that the master will combine smaller chunks * so that it can satisfy the desired <i>masterChunkSize</i>. */ - String MASTER_CHUNK_TIMEOUT_NANOS = AsynchronousIndexWriteConfiguration.class + String MASTER_CHUNK_TIMEOUT_NANOS = (AsynchronousIndexWriteConfiguration.class .getName() - + ".masterChunkTimeoutNanos"; + + ".masterChunkTimeoutNanos").intern(); String DEFAULT_MASTER_CHUNK_TIMEOUT_NANOS = "" + TimeUnit.MILLISECONDS.toNanos(50); @@ -830,9 +830,9 @@ * the sink remains responsible rather than blocking inside of the * {@link IAsynchronousIterator} for long periods of time. */ - String SINK_POLL_TIMEOUT_NANOS = AsynchronousIndexWriteConfiguration.class + String SINK_POLL_TIMEOUT_NANOS = (AsynchronousIndexWriteConfiguration.class .getName() - + ".sinkPollTimeoutNanos"; + + ".sinkPollTimeoutNanos").intern(); String DEFAULT_SINK_POLL_TIMEOUT_NANOS = "" + TimeUnit.MILLISECONDS.toNanos(50); @@ -840,9 +840,9 @@ /** * The capacity of the internal queue for the per-sink output buffer. */ - String SINK_QUEUE_CAPACITY = AsynchronousIndexWriteConfiguration.class + String SINK_QUEUE_CAPACITY = (AsynchronousIndexWriteConfiguration.class .getName() - + ".sinkQueueCapacity"; + + ".sinkQueueCapacity").intern(); String DEFAULT_SINK_QUEUE_CAPACITY = "5000"; @@ -850,9 +850,9 @@ * The desired size of the chunks written that will be written by the * {@link AbstractSubtask sink}. */ - String SINK_CHUNK_SIZE = AsynchronousIndexWriteConfiguration.class + String SINK_CHUNK_SIZE = (AsynchronousIndexWriteConfiguration.class .getName() - + ".sinkChunkSize"; + + ".sinkChunkSize").intern(); String DEFAULT_SINK_CHUNK_SIZE = "10000"; @@ -865,9 +865,9 @@ * the index partition. 
This makes it much easier to adjust the * performance since you simply adjust the {@link #SINK_CHUNK_SIZE}. */ - String SINK_CHUNK_TIMEOUT_NANOS = AsynchronousIndexWriteConfiguration.class + String SINK_CHUNK_TIMEOUT_NANOS = (AsynchronousIndexWriteConfiguration.class .getName() - + ".sinkChunkTimeoutNanos"; + + ".sinkChunkTimeoutNanos").intern(); String DEFAULT_SINK_CHUNK_TIMEOUT_NANOS = "" + Long.MAX_VALUE; @@ -890,9 +890,9 @@ * sink is writing. */ // GTE chunkTimeout - String SINK_IDLE_TIMEOUT_NANOS = AsynchronousIndexWriteConfiguration.class + String SINK_IDLE_TIMEOUT_NANOS = (AsynchronousIndexWriteConfiguration.class .getName() - + ".sinkIdleTimeoutNanos"; + + ".sinkIdleTimeoutNanos").intern(); String DEFAULT_SINK_IDLE_TIMEOUT_NANOS = "" + Long.MAX_VALUE; @@ -916,9 +916,9 @@ * * @see OverflowManager.Options#SCATTER_SPLIT_ENABLED */ - String SCATTER_SPLIT_ENABLED = ScatterSplitConfiguration.class + String SCATTER_SPLIT_ENABLED = (ScatterSplitConfiguration.class .getName() - + ".enabled"; + + ".enabled").intern(); String DEFAULT_SCATTER_SPLIT_ENABLED = "true"; @@ -935,9 +935,9 @@ * performed. The allowable range is therefore constrained to * <code>(0.1 : 1.0)</code>. */ - String SCATTER_SPLIT_PERCENT_OF_SPLIT_THRESHOLD = ScatterSplitConfiguration.class + String SCATTER_SPLIT_PERCENT_OF_SPLIT_THRESHOLD = (ScatterSplitConfiguration.class .getName() - + ".percentOfSplitThreshold"; + + ".percentOfSplitThreshold").intern(); String DEFAULT_SCATTER_SPLIT_PERCENT_OF_SPLIT_THRESHOLD = ".25"; @@ -946,9 +946,9 @@ * to use all discovered data services (default * {@value #DEFAULT_SCATTER_SPLIT_DATA_SERVICE_COUNT}). 
*/ - String SCATTER_SPLIT_DATA_SERVICE_COUNT = ScatterSplitConfiguration.class + String SCATTER_SPLIT_DATA_SERVICE_COUNT = (ScatterSplitConfiguration.class .getName() - + ".dataServiceCount"; + + ".dataServiceCount").intern(); String DEFAULT_SCATTER_SPLIT_DATA_SERVICE_COUNT = "0"; @@ -980,9 +980,9 @@ * asynchronous index writes in order to obtain high throughput with * sustained index writes. */ - String SCATTER_SPLIT_INDEX_PARTITION_COUNT = ScatterSplitConfiguration.class + String SCATTER_SPLIT_INDEX_PARTITION_COUNT = (ScatterSplitConfiguration.class .getName() - + ".indexPartitionCount"; + + ".indexPartitionCount").intern(); String DEFAULT_SCATTER_SPLIT_INDEX_PARTITION_COUNT = "0"; Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/counters/httpd/CounterSetHTTPD.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/counters/httpd/CounterSetHTTPD.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/counters/httpd/CounterSetHTTPD.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -12,6 +12,8 @@ import java.util.Properties; import java.util.Vector; +import org.apache.log4j.Logger; + import com.bigdata.counters.CounterSet; import com.bigdata.counters.query.CounterSetSelector; import com.bigdata.counters.query.ICounterSelector; @@ -30,6 +32,8 @@ */ public class CounterSetHTTPD extends AbstractHTTPD { + static private final Logger log = Logger.getLogger(CounterSetHTTPD.class); + /** * The {@link CounterSet} exposed by this service. 
*/ Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/counters/httpd/CounterSetHTTPDServer.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/counters/httpd/CounterSetHTTPDServer.java 2011-03-01 01:05:33 UTC (rev 4261) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/counters/httpd/CounterSetHTTPDServer.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -140,7 +140,7 @@ Logger.getLogger(XHTMLRenderer.class).setLevel(level); // set logging level on the service. - NanoHTTPD.log.setLevel(level); + Logger.getLogger(NanoHTTPD.class).setLevel(level); } else if (arg.equals("-events")) { Added: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/util/CanonicalFactory.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/util/CanonicalFactory.java (rev 0) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/util/CanonicalFactory.java 2011-03-02 19:15:21 UTC (rev 4262) @@ -0,0 +1,128 @@ +package com.bigdata.util; + +import com.bigdata.cache.ConcurrentWeakValueCache; + +/** + * A pattern for a canonicalizing factory based on a map with weak values. + * + * @param <KEY> + * @param <VAL> + * @param <STATE> + * + * @author thompsonbry + */ +abstract public class CanonicalFactory<KEY, VAL, STATE> { + + /** + * Canonicalizing mapping. + */ +// private WeakValueCache<KEY, VAL> cache; + private ConcurrentWeakValueCache<KEY,VAL> cache; + + /** + * + * @param queueCapacity + * The capacity of the backing hard reference queue. This places + * a lower bound on the #of instances which will be retained by + * the factory. + */ + public CanonicalFactory(final int queueCapacity) { + +// cache = new WeakValueCache<KEY, VAL>(new LRUCache<KEY, VAL>(queueCapacity)); + cache = new ConcurrentWeakValueCache<KEY, VAL>(queueCapacity); + + } + + /** + * Canonical factory pattern. 
+ * + * @param key + * The key. + * @param state + * Additional state from the caller which will be passed through + * to {@link #newInstance(Object, Object)} when creating a new + * instance (optional). + * + * @return The instance paired with that key. + * + * @throws IllegalArgumentException + * if the key is <code>null</code>. + */ + public VAL getInstance(final KEY key, final STATE state) { + + if (key == null) + throw new IllegalArgumentException(); + + // check first w/o lock. + VAL val = cache.get(key); + + if (val != null) { + /* + * Fast code path if entry exists for that key. This amortizes the + * lock costs by relying on the striped locks of the CHM to provide + * less lock contention. + */ + return val; + } + + // obtain lock + synchronized (cache) { + + // check with lock held + val = cache.get(key); + + if (val == null) { + + // create an instance + val = newInstance(key,state); + + // pair that instance with the key in the map. +// cache.put(key, val, true/* dirty */); + cache.put(key, val); + + } + + return val; + + } + + } + + /** + * Remove an entry from the cache. + * <p> + * Note: It is sometimes necessary to clear a cache entry. For example, if a + * pers... [truncated message content] |
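The `CanonicalFactory` added in this commit combines a weak-value cache with a check-then-lock-then-recheck lookup: a lock-free fast path for existing entries, and a synchronized slow path that re-checks before creating. The real class uses bigdata's internal `ConcurrentWeakValueCache`; the sketch below is a generic, self-contained approximation that substitutes a `ConcurrentHashMap` of `WeakReference` values, so all names and types here are illustrative rather than the bigdata API.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class Main<K, V> {

    // Weak values let canonical instances be reclaimed by the GC once no
    // caller holds them; a cleared entry simply reads as absent.
    private final ConcurrentHashMap<K, WeakReference<V>> cache =
            new ConcurrentHashMap<>();

    private final Function<K, V> factory;

    public Main(final Function<K, V> factory) {
        this.factory = factory;
    }

    public V getInstance(final K key) {
        if (key == null)
            throw new IllegalArgumentException();
        // Fast path: no explicit lock; the CHM's striped locks keep
        // contention low when the entry already exists.
        WeakReference<V> ref = cache.get(key);
        V val = (ref == null ? null : ref.get());
        if (val != null)
            return val;
        // Slow path: re-check under the lock, then create and publish,
        // so at most one instance is ever paired with a given key.
        synchronized (cache) {
            ref = cache.get(key);
            val = (ref == null ? null : ref.get());
            if (val == null) {
                val = factory.apply(key);
                cache.put(key, new WeakReference<>(val));
            }
            return val;
        }
    }

    public static void main(final String[] args) {
        final Main<String, Object> f = new Main<>(k -> new Object());
        // The same key always resolves to the same canonical instance.
        System.out.println(f.getInstance("a") == f.getInstance("a"));
    }
}
```

The committed class additionally takes a hard-reference queue capacity so a minimum number of instances stay strongly reachable; that detail is omitted from the sketch.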
From: <tho...@us...> - 2011-03-01 01:05:41
Revision: 4261 http://bigdata.svn.sourceforge.net/bigdata/?rev=4261&view=rev Author: thompsonbry Date: 2011-03-01 01:05:33 +0000 (Tue, 01 Mar 2011) Log Message: ----------- Commit of partial support for rotating a key-range constraint onto an access path from/to key based on a path by MikeP. The logic in the SAIL which recognizes and lifts the key-range constraint onto the predicate is disabled in this commit. Look at ~817 of BigdataSailEvaluationStrategy3 to enable this behavior. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOpBase.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/IPredicate.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/ap/Predicate.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/constraint/BooleanValueExpression.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/constraint/Constraint.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/join/PipelineJoin.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/striterator/AbstractKeyOrder.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/bop/rdf/aggregate/SUM.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/IVUtility.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/CompareBOp.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/Constraint.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/MathBOp.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/ValueExpressionBOp.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPOKeyOrder.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPOPredicate.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java 
branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/internal/constraints/TestInlineConstraints.java branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataEvaluationStrategyImpl3.java branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/QueryHints.java branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/QueryOptimizerEnum.java branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/Rule2BOpUtility.java branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/bench/NanoSparqlClient.java branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/sop/SOp2BOpUtility.java branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/sop/SOpTree.java Added Paths: ----------- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/Range.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java branches/QUADS_QUERY_BRANCH/bigdata-sails/src/test/com/bigdata/rdf/sail/TestRangeBOp.java Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOpBase.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOpBase.java 2011-02-28 16:26:37 UTC (rev 4260) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/BOpBase.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -195,9 +195,12 @@ */ public BOpBase(final BOpBase op) { // deep copy the arguments. - args = deepCopy(op.args); +// args = deepCopy(op.args); // deep copy the annotations. - annotations = deepCopy(op.annotations); +// annotations = deepCopy(op.annotations); + // Note: only shallow copy is required to achieve immutable semantics! 
+ args = Arrays.copyOf(op.args, op.args.length); + annotations = new LinkedHashMap<String, Object>(op.annotations); } // /** Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/IPredicate.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/IPredicate.java 2011-02-28 16:26:37 UTC (rev 4260) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/IPredicate.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -678,6 +678,12 @@ * @return The newly annotated {@link IPredicate}. */ public IPredicate<E> setBOpId(int bopId); + + /** + * Return a copy of this predicate with a different {@link IVariableOrConstant} + * for the arg specified by the supplied index parameter. + */ + public IPredicate<E> setArg(int index, IVariableOrConstant arg); /** * Return <code>true</code> iff this operator is an access path which writes Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/ap/Predicate.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/ap/Predicate.java 2011-02-28 16:26:37 UTC (rev 4260) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/ap/Predicate.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -419,6 +419,16 @@ } + public Predicate<E> setArg(final int index, final IVariableOrConstant arg) { + + final Predicate<E> tmp = this.clone(); + + tmp._set(index, arg); + + return tmp; + + } + /** * Add an {@link Annotations#INDEX_LOCAL_FILTER}. When there is a filter for * the named property, the filters are combined. 
Otherwise the filter is Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/constraint/BooleanValueExpression.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/constraint/BooleanValueExpression.java 2011-02-28 16:26:37 UTC (rev 4260) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/constraint/BooleanValueExpression.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -27,16 +27,16 @@ import java.util.Map; import com.bigdata.bop.BOp; +import com.bigdata.bop.BOpBase; import com.bigdata.bop.IBindingSet; import com.bigdata.bop.IValueExpression; -import com.bigdata.bop.ImmutableBOp; /** * Base class for boolean value expression BOps. Value expressions perform some * evaluation on one or more value expressions as input and produce one * boolean as output. */ -public abstract class BooleanValueExpression extends ImmutableBOp +public abstract class BooleanValueExpression extends BOpBase implements IValueExpression<Boolean> { /** Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/constraint/Constraint.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/constraint/Constraint.java 2011-02-28 16:26:37 UTC (rev 4260) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/constraint/Constraint.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -29,14 +29,14 @@ import org.apache.log4j.Logger; import com.bigdata.bop.BOp; +import com.bigdata.bop.BOpBase; import com.bigdata.bop.IBindingSet; import com.bigdata.bop.IConstraint; -import com.bigdata.bop.ImmutableBOp; /** * BOpConstraint that wraps a {@link BooleanValueExpression}. 
*/ -public class Constraint extends ImmutableBOp implements IConstraint { +public class Constraint extends BOpBase implements IConstraint { /** * Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/join/PipelineJoin.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/join/PipelineJoin.java 2011-02-28 16:26:37 UTC (rev 4260) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/join/PipelineJoin.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -1593,9 +1593,11 @@ this.accessPath = context.getAccessPath(relation, predicate); - if (log.isDebugEnabled()) - log.debug("joinOp=" + joinOp + ", #bindingSets=" + n - + ", accessPath=" + accessPath); + if (log.isDebugEnabled()) { + log.debug("joinOp=" + joinOp); + log.debug("#bindingSets=" + n); + log.debug("accessPath=" + accessPath); + } // convert to array for thread-safe traversal. this.bindingSets = bindingSets.toArray(new IBindingSet[n]); @@ -1644,7 +1646,11 @@ // range count of the as-bound access path (should be cached). final long rangeCount = accessPath .rangeCount(false/* exact */); - + + if (log.isDebugEnabled()) { + log.debug("range count: " + rangeCount); + } + stats.accessPathCount.increment(); stats.accessPathRangeCount.add(rangeCount); Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/striterator/AbstractKeyOrder.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/striterator/AbstractKeyOrder.java 2011-02-28 16:26:37 UTC (rev 4260) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/striterator/AbstractKeyOrder.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -72,7 +72,7 @@ } - final public byte[] getFromKey(final IKeyBuilder keyBuilder, + public byte[] getFromKey(final IKeyBuilder keyBuilder, final IPredicate<E> predicate) { keyBuilder.reset(); @@ -99,17 +99,43 @@ } - return noneBound ? 
null : keyBuilder.getKey(); + final byte[] key = noneBound ? null : keyBuilder.getKey(); + return key; + } - final public byte[] getToKey(final IKeyBuilder keyBuilder, + public byte[] getToKey(final IKeyBuilder keyBuilder, final IPredicate<E> predicate) { - final byte[] from = getFromKey(keyBuilder, predicate); + keyBuilder.reset(); - return from == null ? null : SuccessorUtil.successor(from); + final int keyArity = getKeyArity(); // use the key's "arity". + boolean noneBound = true; + + for (int i = 0; i < keyArity; i++) { + + final IVariableOrConstant<?> term = predicate.get(getKeyOrder(i)); + + // Note: term MAY be null for the context position. + if (term == null || term.isVar()) + break; + + /* + * Note: If you need to override the default IKeyBuilder behavior do + * it in the invoked method. + */ + appendKeyComponent(keyBuilder, i, term.get()); + + noneBound = false; + + } + + final byte[] key = noneBound ? null : keyBuilder.getKey(); + + return key == null ? null : SuccessorUtil.successor(key); + } /** Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/bop/rdf/aggregate/SUM.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/bop/rdf/aggregate/SUM.java 2011-02-28 16:26:37 UTC (rev 4260) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/bop/rdf/aggregate/SUM.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -25,8 +25,6 @@ import java.util.Map; -import org.openrdf.query.algebra.MathExpr.MathOp; - import com.bigdata.bop.BOp; import com.bigdata.bop.BOpBase; import com.bigdata.bop.IBindingSet; @@ -34,10 +32,10 @@ import com.bigdata.bop.IVariable; import com.bigdata.bop.aggregate.AggregateBase; import com.bigdata.bop.aggregate.IAggregate; -import com.bigdata.bop.aggregate.AggregateBase.FunctionCode; import com.bigdata.rdf.internal.IV; import com.bigdata.rdf.internal.IVUtility; import com.bigdata.rdf.internal.XSDLongIV; +import 
com.bigdata.rdf.internal.constraints.MathBOp.MathOp; import com.bigdata.rdf.model.BigdataLiteral; /** Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/IVUtility.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/IVUtility.java 2011-02-28 16:26:37 UTC (rev 4260) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/IVUtility.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -32,11 +32,10 @@ import java.util.ArrayList; import java.util.UUID; -import org.openrdf.query.algebra.MathExpr.MathOp; - import com.bigdata.btree.keys.IKeyBuilder; import com.bigdata.btree.keys.KeyBuilder; import com.bigdata.rawstore.Bytes; +import com.bigdata.rdf.internal.constraints.MathBOp.MathOp; import com.bigdata.rdf.model.BigdataBNode; import com.bigdata.rdf.model.BigdataLiteral; @@ -235,6 +234,10 @@ return new XSDDecimalIV(left.multiply(right)); case DIVIDE: return new XSDDecimalIV(left.divide(right)); + case MIN: + return new XSDDecimalIV(left.compareTo(right) < 0 ? left : right); + case MAX: + return new XSDDecimalIV(left.compareTo(right) > 0 ? left : right); default: throw new UnsupportedOperationException(); } @@ -253,6 +256,10 @@ return new XSDIntegerIV(left.multiply(right)); case DIVIDE: return new XSDIntegerIV(left.divide(right)); + case MIN: + return new XSDIntegerIV(left.compareTo(right) < 0 ? left : right); + case MAX: + return new XSDIntegerIV(left.compareTo(right) > 0 ? 
left : right); default: throw new UnsupportedOperationException(); } @@ -271,6 +278,10 @@ return new XSDFloatIV(left*right); case DIVIDE: return new XSDFloatIV(left/right); + case MIN: + return new XSDFloatIV(Math.min(left,right)); + case MAX: + return new XSDFloatIV(Math.max(left,right)); default: throw new UnsupportedOperationException(); } @@ -289,6 +300,10 @@ return new XSDDoubleIV(left*right); case DIVIDE: return new XSDDoubleIV(left/right); + case MIN: + return new XSDDoubleIV(Math.min(left,right)); + case MAX: + return new XSDDoubleIV(Math.max(left,right)); default: throw new UnsupportedOperationException(); } @@ -307,6 +322,10 @@ return new XSDIntIV(left*right); case DIVIDE: return new XSDIntIV(left/right); + case MIN: + return new XSDIntIV(Math.min(left,right)); + case MAX: + return new XSDIntIV(Math.max(left,right)); default: throw new UnsupportedOperationException(); } @@ -325,6 +344,10 @@ return new XSDLongIV(left*right); case DIVIDE: return new XSDLongIV(left/right); + case MIN: + return new XSDLongIV(Math.min(left,right)); + case MAX: + return new XSDLongIV(Math.max(left,right)); default: throw new UnsupportedOperationException(); } Added: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/Range.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/Range.java (rev 0) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/Range.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -0,0 +1,73 @@ +/** + +Copyright (C) SYSTAP, LLC 2011. All rights reserved. + +Contact: + SYSTAP, LLC + 4501 Tower Road + Greensboro, NC 27410 + lic...@bi... + +This program is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; version 2 of the License. 
+ +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with this program; if not, write to the Free Software +Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*/ + +package com.bigdata.rdf.internal; + +import java.io.Serializable; + +/** + * Represents a numerical range of IVs - a lower bound and an upper bound. + * Useful for constraining predicates to a particular range of values for the + * object. + */ +public class Range implements Serializable { + + /** + * + */ + private static final long serialVersionUID = -706615195901299026L; + + private final IV from, to; + + /** + * Construct a numerical range using two IVs. The range includes the from + * and to value (>= from && <= to). Non-inclusive from and to must be + * accomplished using a filter. The from must be less than or equal to the + * to. 
+ */ + public Range(final IV from, final IV to) { + + if (!from.isNumeric()) + throw new IllegalArgumentException("not numeric: " + from); + if (!to.isNumeric()) + throw new IllegalArgumentException("not numeric: " + to); + + final int compare = IVUtility.numericalCompare(from, to); + if (compare > 0) + throw new IllegalArgumentException("invalid range: " + from+">"+to); + + this.from = from; + this.to = to; + + } + + public IV from() { + return from; + } + + public IV to() { + return to; + } + +} Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/CompareBOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/CompareBOp.java 2011-02-28 16:26:37 UTC (rev 4260) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/CompareBOp.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -93,6 +93,10 @@ public CompareBOp(final CompareBOp op) { super(op); } + + public CompareOp op() { + return (CompareOp) getRequiredProperty(Annotations.OP); + } public boolean accept(final IBindingSet s) { @@ -103,7 +107,7 @@ if (left == null || right == null) throw new SparqlTypeErrorException(); - final CompareOp op = (CompareOp) getProperty(Annotations.OP); + final CompareOp op = op(); if (left.isTermId() && right.isTermId()) { if (op == CompareOp.EQ || op == CompareOp.NE) { Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/Constraint.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/Constraint.java 2011-02-28 16:26:37 UTC (rev 4260) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/Constraint.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -86,6 +86,10 @@ return (EBVBOp) super.get(i); } + public IValueExpression<IV> 
getValueExpression() { + return get(0).get(0); + } + public boolean accept(final IBindingSet bs) { try { Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/MathBOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/MathBOp.java 2011-02-28 16:26:37 UTC (rev 4260) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/MathBOp.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -62,6 +62,25 @@ } + public enum MathOp { + PLUS, + MINUS, + MULTIPLY, + DIVIDE, + MIN, + MAX; + + public static MathOp valueOf(org.openrdf.query.algebra.MathExpr.MathOp op) { + switch(op) { + case PLUS: return MathOp.PLUS; + case MINUS: return MathOp.MINUS; + case MULTIPLY: return MathOp.MULTIPLY; + case DIVIDE: return MathOp.DIVIDE; + } + throw new IllegalArgumentException(); + } + } + /** * * @param left @@ -189,5 +208,5 @@ return h; } - + } Added: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java (rev 0) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/RangeBOp.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -0,0 +1,245 @@ +/** + +Copyright (C) SYSTAP, LLC 2006-2011. All rights reserved. + +Contact: + SYSTAP, LLC + 4501 Tower Road + Greensboro, NC 27410 + lic...@bi... + +This program is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; version 2 of the License. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with this program; if not, write to the Free Software +Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*/ +package com.bigdata.rdf.internal.constraints; + +import java.util.Map; + +import org.apache.log4j.Logger; + +import com.bigdata.bop.BOp; +import com.bigdata.bop.BOpBase; +import com.bigdata.bop.Constant; +import com.bigdata.bop.IBindingSet; +import com.bigdata.bop.IConstant; +import com.bigdata.bop.IValueExpression; +import com.bigdata.bop.IVariable; +import com.bigdata.bop.IVariableOrConstant; +import com.bigdata.bop.ImmutableBOp; +import com.bigdata.bop.NV; +import com.bigdata.rdf.error.SparqlTypeErrorException; +import com.bigdata.rdf.internal.IV; +import com.bigdata.rdf.internal.Range; + +final public class RangeBOp extends BOpBase + implements IVariable<Range> { + + /** + * + */ + private static final long serialVersionUID = 3368581489737593349L; + +// private static final Logger log = Logger.getLogger(RangeBOp.class); + + + public interface Annotations extends ImmutableBOp.Annotations { + + String VAR = RangeBOp.class.getName() + ".var"; + + String FROM = RangeBOp.class.getName() + ".from"; + + String TO = RangeBOp.class.getName() + ".to"; + + } + + @SuppressWarnings("rawtypes") + public RangeBOp(final IVariable<IV> var, + final IValueExpression<IV> from, + final IValueExpression<IV> to) { + + this(NOARGS, + NV.asMap(new NV(Annotations.VAR, var), + new NV(Annotations.FROM, from), + new NV(Annotations.TO, to))); + + } + + /** + * Required shallow copy constructor. + */ + public RangeBOp(final BOp[] args, Map<String,Object> anns) { + + super(args,anns); + + if (getProperty(Annotations.VAR) == null + || getProperty(Annotations.FROM) == null + || getProperty(Annotations.TO) == null) { + + throw new IllegalArgumentException(); + + } + + } + + /** + * Required deep copy constructor. 
+ */ + public RangeBOp(final RangeBOp op) { + + super(op); + + } + + public IVariable<IV> var() { + return (IVariable<IV>) getRequiredProperty(Annotations.VAR); + } + + public IValueExpression<IV> from() { + return (IValueExpression<IV>) getRequiredProperty(Annotations.FROM); + } + + public IValueExpression<IV> to() { + return (IValueExpression<IV>) getRequiredProperty(Annotations.TO); + } + + final public Range get(final IBindingSet bs) { + +// log.debug("getting the asBound value"); + + final IV from = from().get(bs); + final IV to = to().get(bs); + +// log.debug("from: " + from); +// log.debug("to: " + to); + + // sort of like Var.get(), which returns null when the variable + // is not yet bound + if (from == null || to == null) + return null; + + try { + // let Range ctor() do the type checks and valid range checks + return new Range(from, to); + } catch (IllegalArgumentException ex) { + // log the reason the range is invalid +// if (log.isInfoEnabled()) +// log.info("dropping solution: " + ex.getMessage()); + // drop the solution + throw new SparqlTypeErrorException(); + } + + } + + final public RangeBOp asBound(final IBindingSet bs) { + + final RangeBOp asBound = (RangeBOp) this.clone(); + +// log.debug("getting the asBound value"); + + final IV from = from().get(bs); + final IV to = to().get(bs); + +// log.debug("from: " + from); +// log.debug("to: " + to); + + // sort of like Var.get(), which returns null when the variable + // is not yet bound + if (from == null || to == null) + return asBound; + + asBound._setProperty(Annotations.FROM, new Constant(from)); + asBound._setProperty(Annotations.TO, new Constant(to)); + + return asBound; + + } + + final public boolean isFullyBound() { + + return from() instanceof IConstant && to() instanceof IConstant; + + } + + @Override + public boolean isVar() { + return true; + } + + @Override + public boolean isConstant() { + return false; + } + + @Override + public Range get() { +// log.debug("somebody tried to get me"); 
+ + return null; + } + + @Override + public String getName() { + return var().getName(); + } + + @Override + public boolean isWildcard() { + return false; + } + + + final public boolean equals(final IVariableOrConstant op) { + + if (op == null) + return false; + + if (this == op) + return true; + + if (op instanceof IVariable<?>) { + + return var().getName().equals(((IVariable<?>) op).getName()); + + } + + return false; + + } + + final private boolean _equals(final RangeBOp op) { + + return var().equals(op.var()) + && from().equals(op.from()) + && to().equals(op.to()); + + } + + /** + * Caches the hash code. + */ + private int hash = 0; + public int hashCode() { +// +// int h = hash; +// if (h == 0) { +// h = 31 * h + var().hashCode(); +// h = 31 * h + from().hashCode(); +// h = 31 * h + to().hashCode(); +// hash = h; +// } +// return h; +// + return var().hashCode(); + } + +} Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/ValueExpressionBOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/ValueExpressionBOp.java 2011-02-28 16:26:37 UTC (rev 4260) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/ValueExpressionBOp.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -27,8 +27,8 @@ import java.util.Map; import com.bigdata.bop.BOp; +import com.bigdata.bop.BOpBase; import com.bigdata.bop.IValueExpression; -import com.bigdata.bop.ImmutableBOp; import com.bigdata.rdf.internal.IV; /** @@ -36,7 +36,7 @@ * evaluation on one or more value expressions as input and produce one * value expression as output (boolean, numeric value, etc.) 
*/ -public abstract class ValueExpressionBOp extends ImmutableBOp +public abstract class ValueExpressionBOp extends BOpBase implements IValueExpression<IV> { /** Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPOKeyOrder.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPOKeyOrder.java 2011-02-28 16:26:37 UTC (rev 4260) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPOKeyOrder.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -35,14 +35,19 @@ import java.util.Iterator; import java.util.NoSuchElementException; +import org.apache.log4j.Logger; + +import com.bigdata.bop.IConstant; import com.bigdata.bop.IPredicate; import com.bigdata.bop.IVariableOrConstant; import com.bigdata.btree.keys.IKeyBuilder; +import com.bigdata.btree.keys.SuccessorUtil; import com.bigdata.rdf.internal.IV; import com.bigdata.rdf.internal.IVUtility; +import com.bigdata.rdf.internal.Range; +import com.bigdata.rdf.internal.constraints.RangeBOp; import com.bigdata.rdf.model.StatementEnum; import com.bigdata.striterator.AbstractKeyOrder; -import com.bigdata.striterator.IKeyOrder; /** * Represents the key order used by an index for a triple relation. @@ -65,6 +70,8 @@ */ private static final long serialVersionUID = 87501920529732159L; + private static Logger log = Logger.getLogger(SPOKeyOrder.class); + /* * Note: these constants make it possible to use switch(index()) constructs. */ @@ -466,11 +473,92 @@ // // } - @Override + public byte[] getFromKey(final IKeyBuilder keyBuilder, + final IPredicate<ISPO> predicate) { + + keyBuilder.reset(); + + final int keyArity = getKeyArity(); // use the key's "arity". 
+ + boolean noneBound = true; + + final RangeBOp range = (RangeBOp) + predicate.getProperty(SPOPredicate.Annotations.RANGE); + + for (int i = 0; i < keyArity; i++) { + + final int index = getKeyOrder(i); + + final IVariableOrConstant<?> term = predicate.get(index); + + if (term == null || term.isVar()) { + if (index == 2 && range != null && range.isFullyBound()) { + final IConstant<IV> c = (IConstant<IV>) range.from(); + appendKeyComponent(keyBuilder, i, c.get()); + noneBound = false; + } else { + break; + } + } else { + appendKeyComponent(keyBuilder, i, term.get()); + noneBound = false; + } + + } + + final byte[] key = noneBound ? null : keyBuilder.getKey(); + + return key; + + } + + public byte[] getToKey(final IKeyBuilder keyBuilder, + final IPredicate<ISPO> predicate) { + + keyBuilder.reset(); + + final int keyArity = getKeyArity(); // use the key's "arity". + + boolean noneBound = true; + + final RangeBOp range = (RangeBOp) + predicate.getProperty(SPOPredicate.Annotations.RANGE); + + for (int i = 0; i < keyArity; i++) { + + final int index = getKeyOrder(i); + + final IVariableOrConstant<?> term = predicate.get(index); + + // Note: term MAY be null for the context position. + if (term == null || term.isVar()) { + if (index == 2 && range != null && range.isFullyBound()) { + final IConstant<IV> c = (IConstant<IV>) range.to(); + appendKeyComponent(keyBuilder, i, c.get()); + noneBound = false; + } else { + break; + } + } else { + appendKeyComponent(keyBuilder, i, term.get()); + noneBound = false; + } + + } + + final byte[] key = noneBound ? null : keyBuilder.getKey(); + + return key == null ? 
null : SuccessorUtil.successor(key); + + } + + protected void appendKeyComponent(final IKeyBuilder keyBuilder, final int index, final Object keyComponent) { ((IV) keyComponent).encode(keyBuilder); + +// log.debug("appending key component: " + keyComponent); } @@ -672,32 +760,34 @@ static public SPOKeyOrder getKeyOrder(final IPredicate<ISPO> predicate, final int keyArity) { - final Object s = predicate.get(0).isVar() ? null : predicate - .get(0).get(); + final RangeBOp range = (RangeBOp) + predicate.getProperty(SPOPredicate.Annotations.RANGE); + + final boolean rangeIsBound = range != null && range.isFullyBound(); + + final boolean s = !predicate.get(0).isVar(); - final Object p = predicate.get(1).isVar() ? null : predicate - .get(1).get(); + final boolean p = !predicate.get(1).isVar(); - final Object o = predicate.get(2).isVar() ? null : predicate - .get(2).get(); + final boolean o = !predicate.get(2).isVar() || rangeIsBound; if (keyArity == 3) { // Note: Context is ignored! - if (s != null && p != null && o != null) { + if (s && p && o) { return SPO; - } else if (s != null && p != null) { + } else if (s && p) { return SPO; - } else if (s != null && o != null) { + } else if (s && o) { return OSP; - } else if (p != null && o != null) { + } else if (p && o) { return POS; - } else if (s != null) { + } else if (s) { return SPO; - } else if (p != null) { + } else if (p) { return POS; - } else if (o != null) { + } else if (o) { return OSP; } else { return SPO; @@ -708,39 +798,31 @@ @SuppressWarnings("unchecked") final IVariableOrConstant<IV> t = predicate.get(3); - final IV c = t == null ? null : (t.isVar() ? 
null : t.get()); + final boolean c = t != null && !t.isVar(); - /* - * if ((s == null && p == null && o == null && c == null) || (s != - * null && p == null && o == null && c == null) || (s != null && p - * != null && o == null && c == null) || (s != null && p != null && - * o != null && c == null) || (s != null && p != null && o != null - * && c != null)) { return SPOKeyOrder.SPOC; } - */ - - if ((s == null && p != null && o == null && c == null) - || (s == null && p != null && o != null && c == null) - || (s == null && p != null && o != null && c != null)) { + if ((!s && p && !o && !c) + || (!s && p && o && !c) + || (!s && p && o && c)) { return POCS; } - if ((s == null && p == null && o != null && c == null) - || (s == null && p == null && o != null && c != null) - || (s != null && p == null && o != null && c != null)) { + if ((!s && !p && o && !c) + || (!s && !p && o && c) + || (s && !p && o && c)) { return OCSP; } - if ((s == null && p == null && o == null && c != null) - || (s != null && p == null && o == null && c != null) - || (s != null && p != null && o == null && c != null)) { + if ((!s && !p && !o && c) + || (s && !p && !o && c) + || (s && p && !o && c)) { return CSPO; } - if ((s == null && p != null && o == null && c != null)) { + if ((!s && p && !o && c)) { return PCSO; } - if ((s != null && p == null && o != null && c == null)) { + if ((s && !p && o && !c)) { return SOPC; } Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPOPredicate.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPOPredicate.java 2011-02-28 16:26:37 UTC (rev 4260) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/spo/SPOPredicate.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -26,12 +26,15 @@ import java.util.Map; import com.bigdata.bop.BOp; +import com.bigdata.bop.Constant; import com.bigdata.bop.IBindingSet; import 
com.bigdata.bop.IConstant; import com.bigdata.bop.IVariableOrConstant; import com.bigdata.bop.NV; import com.bigdata.bop.ap.Predicate; import com.bigdata.rdf.internal.IV; +import com.bigdata.rdf.internal.Range; +import com.bigdata.rdf.internal.constraints.RangeBOp; import com.bigdata.relation.rule.IAccessPathExpander; /** @@ -49,10 +52,17 @@ public class SPOPredicate extends Predicate<ISPO> { /** - * - */ - private static final long serialVersionUID = 1L; + * + */ + private static final long serialVersionUID = 3517916629931687107L; + public interface Annotations extends Predicate.Annotations { + + String RANGE = SPOPredicate.class.getName() + ".range"; + + } + + /** * Variable argument version of the shallow copy constructor. */ @@ -275,9 +285,9 @@ } @SuppressWarnings("unchecked") - final public IVariableOrConstant<IV> o() { + final public IVariableOrConstant o() { - return (IVariableOrConstant<IV>) get(2/* o */); + return (IVariableOrConstant) get(2/* o */); } @@ -287,6 +297,12 @@ return (IVariableOrConstant<IV>) get(3/* c */); } + + final public RangeBOp range() { + + return (RangeBOp) getProperty(Annotations.RANGE); + + } /** * Strengthened return type. 
@@ -296,8 +312,23 @@ @Override public SPOPredicate asBound(final IBindingSet bindingSet) { - return (SPOPredicate) super.asBound(bindingSet); + if (bindingSet == null) + throw new IllegalArgumentException(); + final SPOPredicate tmp = (SPOPredicate) super.asBound(bindingSet); + + final RangeBOp rangeBOp = range(); + + // we don't have a range bop for ?o + if (rangeBOp == null) + return tmp; + + final RangeBOp asBound = rangeBOp.asBound(bindingSet); + + tmp._setProperty(Annotations.RANGE, asBound); + + return tmp; + } } Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java 2011-02-28 16:26:37 UTC (rev 4260) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java 2011-03-01 01:05:33 UTC (rev 4261) @@ -5,7 +5,6 @@ import java.util.UUID; import org.openrdf.query.algebra.Compare.CompareOp; -import org.openrdf.query.algebra.MathExpr.MathOp; import org.openrdf.rio.RDFFormat; import com.bigdata.bop.BOp; @@ -18,7 +17,6 @@ import com.bigdata.bop.NV; import com.bigdata.bop.Var; import com.bigdata.bop.IPredicate.Annotations; -import com.bigdata.bop.engine.QueryEngine; import com.bigdata.bop.engine.QueryLog; import com.bigdata.bop.joinGraph.rto.JoinGraph; import com.bigdata.journal.ITx; @@ -30,6 +28,7 @@ import com.bigdata.rdf.internal.constraints.MathBOp; import com.bigdata.rdf.internal.constraints.NotBOp; import com.bigdata.rdf.internal.constraints.SameTermBOp; +import com.bigdata.rdf.internal.constraints.MathBOp.MathOp; import com.bigdata.rdf.model.BigdataLiteral; import com.bigdata.rdf.model.BigdataURI; import com.bigdata.rdf.model.BigdataValue; Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/internal/constraints/TestInlineConstraints.java 
===================================================================
--- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/internal/constraints/TestInlineConstraints.java	2011-02-28 16:26:37 UTC (rev 4260)
+++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/rdf/internal/constraints/TestInlineConstraints.java	2011-03-01 01:05:33 UTC (rev 4261)
@@ -39,7 +39,6 @@
 import org.openrdf.model.vocabulary.RDF;
 import org.openrdf.query.QueryEvaluationException;
 import org.openrdf.query.algebra.Compare.CompareOp;
-import org.openrdf.query.algebra.MathExpr.MathOp;

 import com.bigdata.bop.BOp;
 import com.bigdata.bop.BOpUtility;
@@ -48,46 +47,36 @@
 import com.bigdata.bop.IConstant;
 import com.bigdata.bop.IConstraint;
 import com.bigdata.bop.IPredicate;
+import com.bigdata.bop.IPredicate.Annotations;
 import com.bigdata.bop.IValueExpression;
 import com.bigdata.bop.IVariable;
 import com.bigdata.bop.IVariableOrConstant;
 import com.bigdata.bop.NV;
 import com.bigdata.bop.PipelineOp;
 import com.bigdata.bop.Var;
-import com.bigdata.bop.IPredicate.Annotations;
 import com.bigdata.bop.bindingSet.HashBindingSet;
 import com.bigdata.bop.engine.IRunningQuery;
 import com.bigdata.bop.engine.QueryEngine;
 import com.bigdata.bop.fed.QueryEngineFactory;
-import com.bigdata.bop.joinGraph.IEvaluationPlan;
-import com.bigdata.bop.joinGraph.IEvaluationPlanFactory;
-import com.bigdata.bop.joinGraph.fast.DefaultEvaluationPlanFactory2;
 import com.bigdata.btree.IRangeQuery;
 import com.bigdata.rdf.error.SparqlTypeErrorException;
 import com.bigdata.rdf.internal.IV;
 import com.bigdata.rdf.internal.IVUtility;
 import com.bigdata.rdf.internal.XSDBooleanIV;
+import com.bigdata.rdf.internal.constraints.MathBOp.MathOp;
 import com.bigdata.rdf.model.BigdataLiteral;
 import com.bigdata.rdf.model.BigdataURI;
 import com.bigdata.rdf.model.BigdataValue;
 import com.bigdata.rdf.model.BigdataValueFactory;
 import com.bigdata.rdf.rio.StatementBuffer;
-import com.bigdata.rdf.rules.RuleContextEnum;
 import com.bigdata.rdf.sail.BigdataSail;
 import com.bigdata.rdf.sail.Rule2BOpUtility;
-import com.bigdata.rdf.sail.sop.SOp2BOpUtility;
-import com.bigdata.rdf.sail.sop.UnsupportedOperatorException;
 import com.bigdata.rdf.spo.SPOPredicate;
 import com.bigdata.rdf.store.AbstractTripleStore;
 import com.bigdata.rdf.store.ProxyTestCase;
-import com.bigdata.relation.accesspath.ElementFilter;
 import com.bigdata.relation.accesspath.IAsynchronousIterator;
 import com.bigdata.relation.rule.IRule;
 import com.bigdata.relation.rule.Rule;
-import com.bigdata.relation.rule.eval.ActionEnum;
-import com.bigdata.relation.rule.eval.IJoinNexus;
-import com.bigdata.relation.rule.eval.IJoinNexusFactory;
-import com.bigdata.relation.rule.eval.ISolution;
 import com.bigdata.striterator.ChunkedWrappedIterator;
 import com.bigdata.striterator.Dechunkerator;
 import com.bigdata.striterator.IChunkedOrderedIterator;

Modified: branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataEvaluationStrategyImpl3.java
===================================================================
--- branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataEvaluationStrategyImpl3.java	2011-02-28 16:26:37 UTC (rev 4260)
+++ branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/BigdataEvaluationStrategyImpl3.java	2011-03-01 01:05:33 UTC (rev 4261)
@@ -28,6 +28,7 @@
 import org.openrdf.query.algebra.And;
 import org.openrdf.query.algebra.Bound;
 import org.openrdf.query.algebra.Compare;
+import org.openrdf.query.algebra.Compare.CompareOp;
 import org.openrdf.query.algebra.Filter;
 import org.openrdf.query.algebra.Group;
 import org.openrdf.query.algebra.IsBNode;
@@ -37,7 +38,6 @@
 import org.openrdf.query.algebra.Join;
 import org.openrdf.query.algebra.LeftJoin;
 import org.openrdf.query.algebra.MathExpr;
-import org.openrdf.query.algebra.MathExpr.MathOp;
 import org.openrdf.query.algebra.MultiProjection;
 import org.openrdf.query.algebra.Not;
 import org.openrdf.query.algebra.Or;
@@ -51,7 +51,6 @@
 import org.openrdf.query.algebra.SameTerm;
 import org.openrdf.query.algebra.StatementPattern;
 import org.openrdf.query.algebra.StatementPattern.Scope;
-import org.openrdf.query.algebra.Str;
 import org.openrdf.query.algebra.TupleExpr;
 import org.openrdf.query.algebra.UnaryTupleOperator;
 import org.openrdf.query.algebra.Union;
@@ -85,6 +84,10 @@
 import com.bigdata.rdf.internal.DummyIV;
 import com.bigdata.rdf.internal.IV;
 import com.bigdata.rdf.internal.XSDBooleanIV;
+import com.bigdata.rdf.internal.XSDDecimalIV;
+import com.bigdata.rdf.internal.XSDDoubleIV;
+import com.bigdata.rdf.internal.XSDIntIV;
+import com.bigdata.rdf.internal.XSDIntegerIV;
 import com.bigdata.rdf.internal.constraints.AndBOp;
 import com.bigdata.rdf.internal.constraints.CompareBOp;
 import com.bigdata.rdf.internal.constraints.Constraint;
@@ -94,8 +97,10 @@
 import com.bigdata.rdf.internal.constraints.IsLiteralBOp;
 import com.bigdata.rdf.internal.constraints.IsURIBOp;
 import com.bigdata.rdf.internal.constraints.MathBOp;
+import com.bigdata.rdf.internal.constraints.MathBOp.MathOp;
 import com.bigdata.rdf.internal.constraints.NotBOp;
 import com.bigdata.rdf.internal.constraints.OrBOp;
+import com.bigdata.rdf.internal.constraints.RangeBOp;
 import com.bigdata.rdf.internal.constraints.SameTermBOp;
 import com.bigdata.rdf.lexicon.LexiconRelation;
 import com.bigdata.rdf.model.BigdataValue;
@@ -809,6 +814,17 @@
          */
         attachNamedGraphsFilterToSearches(sopTree);

+        if (false) {
+            /*
+             * Look for numerical filters that can be rotated inside predicates
+             */
+            final Iterator<SOpGroup> groups = sopTree.groups();
+            while (groups.hasNext()) {
+                final SOpGroup g = groups.next();
+                attachRangeBOps(g);
+            }
+        }
+
         /*
          * Gather variables required by Sesame outside of the query
          * evaluation (projection and global sesame filters).
@@ -939,7 +955,7 @@ final IChunkedOrderedIterator<IBindingSet> it2 = new ChunkedWrappedIterator<IBindingSet>( new Dechunkerator<IBindingSet>(it1)); - + // Materialize IVs as RDF Values. final CloseableIteration<BindingSet, QueryEvaluationException> result = // Monitor IRunningQuery and cancel if Sesame iterator is closed. @@ -954,315 +970,6 @@ } -// /** -// * This is the method that will attempt to take a top-level join or left -// * join and turn it into a native bigdata rule. The Sesame operators Join -// * and LeftJoin share only the common base class BinaryTupleOperator, but -// * other BinaryTupleOperators are not supported by this method. Other -// * specific types of BinaryTupleOperators will cause this method to throw -// * an exception. -// * <p> -// * This method will also turn a single top-level StatementPattern into a -// * rule with one predicate in it. -// * <p> -// * Note: As a pre-condition, the {@link Value}s in the query expression -// * MUST have been rewritten as {@link BigdataValue}s and their term -// * identifiers MUST have been resolved. Any term identifier that remains -// * {@link IRawTripleStore#NULL} is an indication that there is no entry for -// * that {@link Value} in the database. Since the JOINs are required (vs -// * OPTIONALs), that means that there is no solution for the JOINs and an -// * {@link EmptyIteration} is returned rather than evaluating the query. 
-// * -// * @param join -// * @return native bigdata rule -// * @throws UnsupportedOperatorException -// * this exception will be thrown if the Sesame join contains any -// * SPARQL language constructs that cannot be converted into -// * the bigdata native rule model -// * @throws QueryEvaluationException -// */ -// private IRule createNativeQueryOld(final TupleExpr join) -// throws UnsupportedOperatorException, -// QueryEvaluationException { -// -// if (!(join instanceof StatementPattern || -// join instanceof Join || join instanceof LeftJoin || -// join instanceof Filter)) { -// throw new AssertionError( -// "only StatementPattern, Join, and LeftJoin supported"); -// } -// -// // flattened collection of statement patterns nested within this join, -// // along with whether or not each one is optional -// final Map<StatementPattern, Boolean> stmtPatterns = -// new LinkedHashMap<StatementPattern, Boolean>(); -// // flattened collection of filters nested within this join -// final Collection<Filter> filters = new LinkedList<Filter>(); -// -// // will throw EncounteredUnknownTupleExprException if the join -// // contains something we don't handle yet -//// collectStatementPatterns(join, stmtPatterns, filters); -// -// if (false) { -// for (Map.Entry<StatementPattern, Boolean> entry : -// stmtPatterns.entrySet()) { -// log.debug(entry.getKey() + ", optional=" + entry.getValue()); -// } -// for (Filter filter : filters) { -// log.debug(filter.getCondition()); -// } -// } -// -// // generate tails -// Collection<IPredicate> tails = new LinkedList<IPredicate>(); -// // keep a list of free text searches for later to solve a named graphs -// // problem -// final Map<IPredicate, StatementPattern> searches = -// new HashMap<IPredicate, StatementPattern>(); -// for (Map.Entry<StatementPattern, Boolean> entry : stmtPatterns -// .entrySet()) { -// StatementPattern sp = entry.getKey(); -// boolean optional = entry.getValue(); -// IPredicate tail = toPredicate(sp, optional); -// 
// encountered a value not in the database lexicon -// if (tail == null) { -// if (log.isDebugEnabled()) { -// log.debug("could not generate tail for: " + sp); -// } -// if (optional) { -// // for optionals, just skip the tail -// continue; -// } else { -// // for non-optionals, skip the entire rule -// return null; -// } -// } -// if (tail.getAccessPathExpander() instanceof FreeTextSearchExpander) { -// searches.put(tail, sp); -// } -// tails.add(tail); -// } -// -// /* -// * When in quads mode, we need to go through the free text searches and -// * make sure that they are properly filtered for the dataset where -// * needed. Joins will take care of this, so we only need to add a filter -// * when a search variable does not appear in any other tails that are -// * non-optional. -// * -// * @todo Bryan seems to think this can be fixed with a DISTINCT JOIN -// * mechanism in the rule evaluation. -// */ -// if (database.isQuads() && dataset != null) { -// for (IPredicate search : searches.keySet()) { -// final Set<URI> graphs; -// StatementPattern sp = searches.get(search); -// switch (sp.getScope()) { -// case DEFAULT_CONTEXTS: { -// /* -// * Query against the RDF merge of zero or more source -// * graphs. -// */ -// graphs = dataset.getDefaultGraphs(); -// break; -// } -// case NAMED_CONTEXTS: { -// /* -// * Query against zero or more named graphs. -// */ -// graphs = dataset.getNamedGraphs(); -// break; -// } -// default: -// throw new AssertionError(); -// } -// if (graphs == null) { -// continue; -// } -// // why would we use a constant with a free text search??? 
-// if (search.get(0).isConstant()) { -// throw new AssertionError(); -// } -// // get ahold of the search variable -// com.bigdata.bop.Var searchVar = -// (com.bigdata.bop.Var) search.get(0); -// if (log.isDebugEnabled()) { -// log.debug(searchVar); -// } -// // start by assuming it needs filtering, guilty until proven -// // innocent -// boolean needsFilter = true; -// // check the other tails one by one -// for (IPredicate<ISPO> tail : tails) { -// IAccessPathExpander<ISPO> expander = -// tail.getAccessPathExpander(); -// // only concerned with non-optional tails that are not -// // themselves magic searches -// if (expander instanceof FreeTextSearchExpander -// || tail.isOptional()) { -// continue; -// } -// // see if the search variable appears in this tail -// boolean appears = false; -// for (int i = 0; i < tail.arity(); i++) { -// IVariableOrConstant term = tail.get(i); -// if (log.isDebugEnabled()) { -// log.debug(term); -// } -// if (term.equals(searchVar)) { -// appears = true; -// break; -// } -// } -// // if it appears, we don't need a filter -// if (appears) { -// needsFilter = false; -// break; -// } -// } -// // if it needs a filter, add it to the expander -// if (needsFilter) { -// if (log.isDebugEnabled()) { -// log.debug("needs filter: " + searchVar); -// } -// FreeTextSearchExpander expander = (FreeTextSearchExpander) -// search.getAccessPathExpander(); -// expander.addNamedGraphsFilter(graphs); -// } -// } -// } -// -// // generate constraints -// final Collection<IConstraint> constraints = -// new LinkedList<IConstraint>(); -// final Iterator<Filter> filterIt = filters.iterator(); -// while (filterIt.hasNext()) { -// final Filter filter = filterIt.next(); -// final IConstraint constraint = toConstraint(filter.getCondition()); -// if (constraint != null) { -// // remove if we are able to generate a native constraint for it -// if (log.isDebugEnabled()) { -// log.debug("able to generate a constraint: " + constraint); -// } -// filterIt.remove(); 
-// constraints.add(constraint); -// } -// } -// -// /* -// * FIXME Native slice, DISTINCT, etc. are all commented out for now. -// * Except for ORDER_BY, support exists for all of these features in the -// * native rules, but we need to separate the rewrite of the tupleExpr -// * and its evaluation in order to properly handle this stuff. -// */ -// IQueryOptions queryOptions = QueryOptions.NONE; -// // if (slice) { -// // if (!distinct && !union) { -// // final ISlice slice = new Slice(offset, limit); -// // queryOptions = new QueryOptions(false/* distinct */, -// // true/* stable */, null/* orderBy */, slice); -// // } -// // } else { -// // if (distinct && !union) { -// // queryOptions = QueryOptions.DISTINCT; -// // } -// // } -// -//// if (log.isDebugEnabled()) { -//// for (IPredicate<ISPO> tail : tails) { -//// IAccessPathExpander<ISPO> expander = tail.getAccessPathExpander(); -//// if (expander != null) { -//// IAccessPath<ISPO> accessPath = database.getSPORelation() -//// .getAccessPath(tail); -//// accessPath = expander.getAccessPath(accessPath); -//// IChunkedOrderedIterator<ISPO> it = accessPath.iterator(); -//// while (it.hasNext()) { -//// log.debug(it.next().toString(database)); -//// } -//// } -//// } -//// } -// -// /* -// * Collect a set of variables required beyond just the join (i.e. -// * aggregation, projection, filters, etc.) 
-// */ -// Set<String> required = new HashSet<String>(); -// -// try { -// -// QueryModelNode p = join; -// while (true) { -// p = p.getParentNode(); -// if (log.isDebugEnabled()) { -// log.debug(p.getClass()); -// } -// if (p instanceof UnaryTupleOperator) { -// required.addAll(collectVariables((UnaryTupleOperator) p)); -// } -// if (p instanceof QueryRoot) { -// break; -// } -// } -// -// if (filters.size() > 0) { -// for (Filter filter : filters) { -// required.addAll(collectVariables((UnaryTupleOperator) filter)); -// } -// } -// -// } catch (Exception ex) { -// throw new QueryEvaluationException(ex); -// } -// -// IVariable[] requiredVars = new IVariable[required.size()]; -// int i = 0; -// for (String v : required) { -// requiredVars[i++] = com.bigdata.bop.Var.var(v); -// } -// -// if (log.isDebugEnabled()) { -// log.debug("required binding names: " + Arrays.toString(requiredVars)); -// } -// -//// if (starJoins) { // database.isQuads() == false) { -//// if (log.isDebugEnabled()) { -//// log.debug("generating star joins"); -//// } -//// tails = generateStarJoins(tails); -//// } -// -// // generate native rule -// IRule rule = new Rule("nativeJoin", -// // @todo should serialize the query string here for the logs. -// null, // head -// tails.toArray(new IPredicate[tails.size()]), queryOptions, -// // constraints on the rule. -// constraints.size() > 0 ? 
constraints -// .toArray(new IConstraint[constraints.size()]) : null, -// null/* constants */, null/* taskFactory */, requiredVars); -// -// if (BigdataStatics.debug) { -// System.err.println(join.toString()); -// System.err.println(rule.toString()); -// } -// -// // we have filters that we could not translate natively -// if (filters.size() > 0) { -// if (log.isDebugEnabled()) { -// log.debug("could not translate " + filters.size() -// + " filters into native constraints:"); -// for (Filter filter : filters) { -// log.debug("\n" + filter.getCondition()); -// } -// } -// // use the basic filter iterator for remaining filters -//// rule = new ProxyRuleWithSesameFilters(rule, filters); -// } -// -// return rule; -// -// } private void attachNamedGraphsFilterToSearches(final SOpTree sopTree) { @@ -1370,7 +1077,125 @@ } } + + protected void attachRangeBOps(final SOpGroup g) { + final Map<IVariable,Collection<IValueExpression>> lowerBounds = + new LinkedHashMap<IVariable,Collection<IValueExpression>>(); + final Map<IVariable,Collection<IValueExpression>> upperBounds = + new LinkedHashMap<IVariable,Collection<IValueExpression>>(); + + for (SOp sop : g) { + final BOp bop = sop.getBOp(); + if (!(bop instanceof Constraint)) { + continue; + } + final Constraint c = (Constraint) bop; + if (!(c.getValueExpression() instanceof Com... [truncated message content] |
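The commit log above describes trapping SPARQL type errors when a RangeBOp is evaluated against a binding set with unbound variables, so that the join can still run with the range left unresolved. A simplified, self-contained sketch of that pattern follows; the class and method names here are hypothetical stand-ins for illustration only, not the actual bigdata classes:

```java
import java.util.HashMap;
import java.util.Map;

public class RangeSketch {

    /** Stand-in for com.bigdata.rdf.error.SparqlTypeErrorException. */
    static class SparqlTypeErrorException extends RuntimeException {
        private static final long serialVersionUID = 1L;
    }

    /** Evaluate a toy numeric expression against a binding set. */
    static Integer eval(final String var, final Map<String, Integer> bindingSet) {
        final Integer val = bindingSet.get(var);
        if (val == null) {
            // Fast path: the argument is unbound so the expression cannot be
            // evaluated - signal a SPARQL type error before doing any work.
            throw new SparqlTypeErrorException();
        }
        return val + 1;
    }

    /**
     * Mirror of the asBound(...) pattern: attempt to evaluate the range, but
     * trap type errors arising from unbound variables and leave the range
     * unresolved (null) so evaluation can proceed.
     */
    static Integer asBound(final String var, final Map<String, Integer> bindingSet) {
        try {
            return eval(var, bindingSet);
        } catch (SparqlTypeErrorException ex) {
            return null; // variable not bound yet; defer the range.
        }
    }

    public static void main(String[] args) {
        final Map<String, Integer> bs = new HashMap<>();
        bs.put("x", 41);
        System.out.println(asBound("x", bs)); // 42
        System.out.println(asBound("y", bs)); // null (y is unbound)
    }
}
```

As the log message notes, trapping the error at this level is a judgment call; an alternative would be to make the join-group partitioning aware of the range annotation so the situation never arises.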
From: <tho...@us...> - 2011-02-28 16:26:45

Revision: 4260
          http://bigdata.svn.sourceforge.net/bigdata/?rev=4260&view=rev
Author:   thompsonbry
Date:     2011-02-28 16:26:37 +0000 (Mon, 28 Feb 2011)

Log Message:
-----------
Some more work on the RTO.

- added SELECTed variables to the test cases.

- added [distinct] flag and run DISTINCT when specified to the test cases.

- more fiddling with the estCard, estRead, and estCost.

Modified Paths:
--------------
    branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java
    branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JGraph.java
    branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JoinGraph.java
    branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/Path.java
    branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/joinGraph/TestPartitionedJoinGroup_canJoinUsingConstraints.java
    branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/AbstractJoinGraphTestCase.java
    branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java
    branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBarData.java
    branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnLubm.java

Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java
===================================================================
--- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java	2011-02-28 16:24:53 UTC (rev 4259)
+++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java	2011-02-28 16:26:37 UTC (rev 4260)
@@ -23,6 +23,7 @@
 import com.bigdata.bop.PipelineOp;
 import com.bigdata.bop.join.PipelineJoin;
 import com.bigdata.bop.joinGraph.rto.JoinGraph;
+import com.bigdata.bop.solutions.DistinctBindingSetOp;
 import com.bigdata.bop.solutions.SliceOp;

 /**
@@ -956,58 +957,87 @@

     }

-    /**
-     * Generate a query plan from an ordered collection of predicates.
-     *
-     * @param p
-     *            The join path.
-     *
-     * @return The query plan.
-     *
-     *         FIXME Select only those variables required by downstream
-     *         processing or explicitly specified by the caller (in the case
-     *         when this is a subquery, the caller has to declare which
-     *         variables are selected and will be returned out of the subquery).
-     *
-     *         FIXME For scale-out, we need to either mark the join's evaluation
-     *         context based on whether or not the access path is local or
-     *         remote (and whether the index is key-range distributed or hash
-     *         partitioned).
-     *
-     *         FIXME Add a method to generate a runnable query plan from the
-     *         collection of predicates and constraints on the
-     *         {@link PartitionedJoinGroup} together with an ordering over the
-     *         join graph. This is a bit different for the join graph and the
-     *         optionals in the tail plan. The join graph itself should either
-     *         be a {@link JoinGraph} operator which gets evaluated at run time
-     *         or reordered by whichever optimizer is selected for the query
-     *         (query hints).
-     *
-     * @todo The order of the {@link IPredicate}s in the tail plan is currently
-     *       unchanged from their given order (optional joins without
-     *       constraints can not reduce the selectivity of the query). However,
-     *       it could be worthwhile to run optionals with constraints before
-     *       those without constraints since the constraints can reduce the
-     *       selectivity of the query. If we do this, then we need to reorder
-     *       the optionals based on the partial order imposed what variables
-     *       they MIGHT bind (which are not bound by the join graph).
-     *
-     * @todo multiple runFirst predicates can be evaluated in parallel unless
-     *       they have shared variables. When there are no shared variables,
-     *       construct a TEE pattern such that evaluation proceeds in parallel.
- * When there are shared variables, the runFirst predicates must be - * ordered based on those shared variables (at which point, it is - * probably an error to flag them as runFirst). - */ - static public PipelineOp getQuery(final BOpIdFactory idFactory, - final IPredicate<?>[] preds, final IConstraint[] constraints) { + /** + * Generate a query plan from an ordered collection of predicates. + * + * @param distinct + * <code>true</code> iff only the distinct solutions are desired. + * @param selected + * The variable(s) to be projected out of the join graph. + * @param preds + * The join path which will be used to execute the join graph. + * @param constraints + * The constraints on the join graph. + * + * @return The query plan. + * + * FIXME Select only those variables required by downstream + * processing or explicitly specified by the caller (in the case + * when this is a subquery, the caller has to declare which + * variables are selected and will be returned out of the subquery). + * + * FIXME For scale-out, we need to either mark the join's evaluation + * context based on whether or not the access path is local or + * remote (and whether the index is key-range distributed or hash + * partitioned). + * + * FIXME Add a method to generate a runnable query plan from the + * collection of predicates and constraints on the + * {@link PartitionedJoinGroup} together with an ordering over the + * join graph. This is a bit different for the join graph and the + * optionals in the tail plan. The join graph itself should either + * be a {@link JoinGraph} operator which gets evaluated at run time + * or reordered by whichever optimizer is selected for the query + * (query hints). + * + * @todo The order of the {@link IPredicate}s in the tail plan is currently + * unchanged from their given order (optional joins without + * constraints can not reduce the selectivity of the query). 
However, + * it could be worthwhile to run optionals with constraints before + * those without constraints since the constraints can reduce the + * selectivity of the query. If we do this, then we need to reorder + * the optionals based on the partial order imposed what variables + * they MIGHT bind (which are not bound by the join graph). + * + * @todo multiple runFirst predicates can be evaluated in parallel unless + * they have shared variables. When there are no shared variables, + * construct a TEE pattern such that evaluation proceeds in parallel. + * When there are shared variables, the runFirst predicates must be + * ordered based on those shared variables (at which point, it is + * probably an error to flag them as runFirst). + */ + static public PipelineOp getQuery(final BOpIdFactory idFactory, + final boolean distinct, final IVariable<?>[] selected, + final IPredicate<?>[] preds, final IConstraint[] constraints) { + /* + * Reserve ids used by the join graph or its constraints. + */ + { + for (IPredicate<?> p : preds) { + idFactory.reserve(p.getId()); + } + if (constraints != null) { + for (IConstraint c : constraints) { + final Iterator<BOp> itr = BOpUtility + .preOrderIteratorWithAnnotations(c); + while (itr.hasNext()) { + final BOp y = itr.next(); + final Integer anId = (Integer) y + .getProperty(BOp.Annotations.BOP_ID); + if (anId != null) + idFactory.reserve(anId.intValue()); + } + } + } + } + // figure out which constraints are attached to which predicates. final IConstraint[][] assignedConstraints = PartitionedJoinGroup .getJoinGraphConstraints(preds, constraints, null/*knownBound*/, true/*pathIsComplete*/); - final PipelineJoin<?>[] joins = new PipelineJoin[preds.length]; +// final PipelineJoin<?>[] joins = new PipelineJoin[preds.length]; PipelineOp lastOp = null; @@ -1016,6 +1046,7 @@ // The next vertex in the selected join order. final IPredicate<?> p = preds[i]; + // Annotations for this join. 
final List<NV> anns = new LinkedList<NV>(); anns.add(new NV(PipelineJoin.Annotations.PREDICATE, p)); @@ -1027,23 +1058,35 @@ // // anns.add(new NV(PipelineJoin.Annotations.SELECT, vars.toArray(new IVariable[vars.size()]))); - if (assignedConstraints[i] != null - && assignedConstraints[i].length > 0) - anns - .add(new NV(PipelineJoin.Annotations.CONSTRAINTS, - assignedConstraints[i])); + if (assignedConstraints[i] != null + && assignedConstraints[i].length > 0) { + // attach constraints to this join. + anns.add(new NV(PipelineJoin.Annotations.CONSTRAINTS, + assignedConstraints[i])); + } - final PipelineJoin<?> joinOp = new PipelineJoin( - lastOp == null ? new BOp[0] : new BOp[] { lastOp }, anns - .toArray(new NV[anns.size()])); + final PipelineJoin<?> joinOp = new PipelineJoin(// + lastOp == null ? new BOp[0] : new BOp[] { lastOp }, // + anns.toArray(new NV[anns.size()])// + ); - joins[i] = joinOp; - lastOp = joinOp; } -// final PipelineOp queryOp = lastOp; + if (distinct) { + lastOp = new DistinctBindingSetOp(new BOp[] { lastOp }, NV + .asMap(new NV[] { + new NV(PipelineOp.Annotations.BOP_ID, idFactory + .nextId()), // + new NV(PipelineOp.Annotations.EVALUATION_CONTEXT, + BOpEvaluationContext.CONTROLLER),// + new NV(PipelineOp.Annotations.SHARED_STATE, true),// + new NV(DistinctBindingSetOp.Annotations.VARIABLES, + selected),// + })// + ); + } /* * FIXME Why does wrapping with this slice appear to be @@ -1052,7 +1095,7 @@ * * [This should perhaps be moved into the caller.] 
*/ - final PipelineOp queryOp = new SliceOp(new BOp[] { lastOp }, NV + lastOp = new SliceOp(new BOp[] { lastOp }, NV .asMap(new NV[] { new NV(JoinGraph.Annotations.BOP_ID, idFactory.nextId()), // new NV(JoinGraph.Annotations.EVALUATION_CONTEXT, @@ -1061,9 +1104,7 @@ }) // ); -// final PipelineOp queryOp = lastOp; - - return queryOp; + return lastOp; } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JGraph.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JGraph.java 2011-02-28 16:24:53 UTC (rev 4259) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JGraph.java 2011-02-28 16:26:37 UTC (rev 4260) @@ -37,6 +37,7 @@ import java.util.List; import java.util.Map; import java.util.Set; +import java.util.concurrent.atomic.AtomicInteger; import org.apache.log4j.Logger; @@ -415,6 +416,27 @@ while (paths.length > 0 && round < nvertices - 1) { + /* + * Resample the paths. + * + * Note: Since the vertex samples are random, it is possible for the + * #of paths with cardinality estimate underflow to jump up and down + * due to the sample which is making its way through each path in + * each round. + */ + int nunderflow; + + while ((nunderflow = resamplePaths(queryEngine, limit, round, + paths, edgeSamples)) > 0) { + + log.warn("resampling in round=" + round + " : " + nunderflow + + " paths have cardinality estimate underflow."); + + } + + /* + * Extend the paths by one vertex. + */ paths = expand(queryEngine, limit, round++, paths, edgeSamples); } @@ -426,8 +448,11 @@ } - // Should be one winner. - assert paths.length == 1; + // Should be one winner. + if (paths.length != 1) { + throw new AssertionError("Expected one path but have " + + paths.length + " paths."); + } if (log.isInfoEnabled()) { @@ -629,48 +654,30 @@ } - /** - * Do one breadth first expansion. 
In each breadth first expansion we extend - * each of the active join paths by one vertex for each remaining vertex - * which enjoys a constrained join with that join path. In the event that - * there are no remaining constrained joins, we will extend the join path - * using an unconstrained join if one exists. In all, there are three - * classes of joins to be considered: - * <ol> - * <li>The target predicate directly shares a variable with the source join - * path. Such joins are always constrained since the source predicate will - * have bound that variable.</li> - * <li>The target predicate indirectly shares a variable with the source - * join path via a constraint can run for the target predicate and which - * shares a variable with the source join path. These joins are indirectly - * constrained by the shared variable in the constraint. BSBM Q5 is an - * example of this case.</li> - * <li>Any predicates may always be join to an existing join path. However, - * joins which do not share variables either directly or indirectly will be - * full cross products. Therefore such joins are added to the join path only - * after all constrained joins have been consumed.</li> - * </ol> - * - * @param queryEngine - * The query engine. - * @param limitIn - * The limit (this is automatically multiplied by the round to - * increase the sample size in each round). - * @param round - * The round number in [1:n]. - * @param a - * The set of paths from the previous round. For the first round, - * this is formed from the initial set of edges to consider. - * @param edgeSamples - * A map used to associate join path segments (expressed as an - * ordered array of bopIds) with {@link EdgeSample}s to avoid - * redundant effort. - * - * @return The set of paths which survived pruning in this round. 
- * - * @throws Exception - */ - public Path[] expand(final QueryEngine queryEngine, int limitIn, + /** + * Resample the initial vertices for the specified join paths and then + * resample the cutoff join for each given join path in path order. + * + * @param queryEngine + * The query engine. + * @param limitIn + * The original limit. + * @param round + * The round number in [1:n]. + * @param a + * The set of paths from the previous round. For the first round, + * this is formed from the initial set of edges to consider. + * @param edgeSamples + * A map used to associate join path segments (expressed as an + * ordered array of bopIds) with {@link EdgeSample}s to avoid + * redundant effort. + * + * @return The number of join paths which are experiencing cardinality + * estimate underflow. + * + * @throws Exception + */ + public int resamplePaths(final QueryEngine queryEngine, int limitIn, final int round, final Path[] a, final Map<PathIds, EdgeSample> edgeSamples) throws Exception { @@ -684,38 +691,60 @@ throw new IllegalArgumentException(); if (a.length == 0) throw new IllegalArgumentException(); - -// // increment the limit by itself in each round. -// final int limit = (round + 1) * limitIn; + + /* + * Re-sample the vertices which are the initial vertex of any of the + * existing paths. + * + * Note: We do not need to resample vertices unless they are the first + * vertex in some path. E.g., the initial vertices from which we start. + * The inputs to an EdgeSample are always either the sample of an + * initial vertex or the sample of a prior cutoff join in the join + * path's own history. + * + * Note: A request to re-sample a vertex is a NOP unless the limit has + * been increased since the last time the vertex was sampled. It is also + * a NOP if the vertex has been fully materialized. 
+ * + * Note: Before resampling the vertices, decide what the maximum limit + * will be for each vertex by examining the paths using that vertex, + * their current sample limit (TODO there should be a distinct limit for + * the vertex and for each cutoff join), and whether or not each path + * experiences a cardinality estimate underflow. + */ + { - if (log.isDebugEnabled()) - log.debug("round=" + round + ", #paths(in)=" + a.length); + if (log.isDebugEnabled()) + log.debug("Re-sampling in-use vertices."); - /* - * Re-sample the vertices which are the initial vertex of any of the - * existing paths. - * - * Note: We do not need to resample vertices unless they are the first - * vertex in some path. E.g., the initial vertices from which we start. - * The inputs to an EdgeSample are always either the sample of an - * initial vertex or the sample of a prior cutoff join in the join - * path's own history. - * - * Note: A request to re-sample a vertex is a NOP unless the limit has - * been increased since the last time the vertex was sampled. It is also - * a NOP if the vertex has been fully materialized. - */ - if (log.isDebugEnabled()) - log.debug("Re-sampling in-use vertices."); + final Map<Vertex, AtomicInteger/* limit */> vertexLimit = new LinkedHashMap<Vertex, AtomicInteger>(); - for (Path x : a) { + for (Path x : a) { - final int limit = x.getNewLimit(limitIn); + final int limit = x.getNewLimit(limitIn); - x.vertices[0].sample(queryEngine, limit, sampleType); + final Vertex v = x.vertices[0]; - } + AtomicInteger theLimit = vertexLimit.get(v); + if (theLimit == null) { + vertexLimit.put(v, theLimit = new AtomicInteger()); + } + theLimit.set(limit); + } + + for (Path x : a) { + + final Vertex v = x.vertices[0]; + + final int limit = vertexLimit.get(v).intValue(); + + v.sample(queryEngine, limit, sampleType); + + } + + } + /* * Re-sample the cutoff join for each edge in each of the existing * paths using the newly re-sampled vertices. 
@@ -732,6 +761,7 @@ if (log.isDebugEnabled()) log.debug("Re-sampling in-use path segments."); + int nunderflow = 0; for (Path x : a) { /* @@ -759,8 +789,8 @@ if (edgeSample != null && edgeSample.limit < limit && !edgeSample.isExact()) { - if (log.isDebugEnabled()) - log.debug("Will resample at higher limit: " + ids); + if (log.isTraceEnabled()) + log.trace("Will resample at higher limit: " + ids); // Time to resample this edge. edgeSamples.remove(ids); edgeSample = null; @@ -833,8 +863,8 @@ priorEdgeSample// ); - if (log.isDebugEnabled()) - log.debug("Resampled: " + ids + " : " + edgeSample); + if (log.isTraceEnabled()) + log.trace("Resampled: " + ids + " : " + edgeSample); if (edgeSamples.put(ids, edgeSample) != null) throw new AssertionError(); @@ -853,9 +883,80 @@ // Save the result on the path. x.edgeSample = priorEdgeSample; + + if (x.edgeSample.estimateEnum == EstimateEnum.Underflow) { + if (log.isDebugEnabled()) + log.debug("Cardinality underflow: " + x); + nunderflow++; + } } // next Path [x]. + return nunderflow; + + } + + /** + * Do one breadth first expansion. In each breadth first expansion we extend + * each of the active join paths by one vertex for each remaining vertex + * which enjoys a constrained join with that join path. In the event that + * there are no remaining constrained joins, we will extend the join path + * using an unconstrained join if one exists. In all, there are three + * classes of joins to be considered: + * <ol> + * <li>The target predicate directly shares a variable with the source join + * path. Such joins are always constrained since the source predicate will + * have bound that variable.</li> + * <li>The target predicate indirectly shares a variable with the source + * join path via a constraint can run for the target predicate and which + * shares a variable with the source join path. These joins are indirectly + * constrained by the shared variable in the constraint. 
BSBM Q5 is an + * example of this case.</li> + * <li>Any predicates may always be join to an existing join path. However, + * joins which do not share variables either directly or indirectly will be + * full cross products. Therefore such joins are added to the join path only + * after all constrained joins have been consumed.</li> + * </ol> + * + * @param queryEngine + * The query engine. + * @param limitIn + * The original limit. + * @param round + * The round number in [1:n]. + * @param a + * The set of paths from the previous round. For the first round, + * this is formed from the initial set of edges to consider. + * @param edgeSamples + * A map used to associate join path segments (expressed as an + * ordered array of bopIds) with {@link EdgeSample}s to avoid + * redundant effort. + * + * @return The set of paths which survived pruning in this round. + * + * @throws Exception + */ + public Path[] expand(final QueryEngine queryEngine, int limitIn, + final int round, final Path[] a, + final Map<PathIds, EdgeSample> edgeSamples) throws Exception { + + if (queryEngine == null) + throw new IllegalArgumentException(); + if (limitIn <= 0) + throw new IllegalArgumentException(); + if (round <= 0) + throw new IllegalArgumentException(); + if (a == null) + throw new IllegalArgumentException(); + if (a.length == 0) + throw new IllegalArgumentException(); + +// // increment the limit by itself in each round. +// final int limit = (round + 1) * limitIn; + + if (log.isDebugEnabled()) + log.debug("round=" + round + ", #paths(in)=" + a.length); + /* * Expand each path one step from each vertex which branches to an * unused vertex. @@ -1005,12 +1106,13 @@ final Path[] paths_tp1_pruned = pruneJoinPaths(paths_tp1, edgeSamples); - if (log.isDebugEnabled()) - log.debug("\n*** round=" + round - + " : generated paths\n" - + JGraph.showTable(paths_tp1, paths_tp1_pruned)); + if (log.isDebugEnabled()) // shows which paths were pruned. 
+ log.info("\n*** round=" + round + ": paths{in=" + a.length + + ",considered=" + paths_tp1.length + ",out=" + + paths_tp1_pruned.length + "}\n" + + JGraph.showTable(paths_tp1, paths_tp1_pruned)); - if (log.isInfoEnabled()) + if (log.isInfoEnabled()) // only shows the surviving paths. log.info("\n*** round=" + round + ": paths{in=" + a.length + ",considered=" + paths_tp1.length + ",out=" + paths_tp1_pruned.length @@ -1243,11 +1345,6 @@ final Path Pi = a[i]; if (Pi.edgeSample == null) throw new RuntimeException("Not sampled: " + Pi); - if (neverPruneUnderflow - && Pi.edgeSample.estimateEnum == EstimateEnum.Underflow) { - // Do not prune if path has cardinality underflow. - continue; - } if (Pi.vertices.length < maxPathLen) { /* * Only the most recently generated set of paths survive to @@ -1256,28 +1353,32 @@ pruned.add(Pi); continue; } - if (pruned.contains(Pi)) + if (pruned.contains(Pi)) { + // already pruned. continue; + } + if (neverPruneUnderflow + && Pi.edgeSample.estimateEnum == EstimateEnum.Underflow) { + // Do not use path to prune if path has cardinality underflow. + continue; + } for (int j = 0; j < a.length; j++) { if (i == j) continue; final Path Pj = a[j]; if (Pj.edgeSample == null) throw new RuntimeException("Not sampled: " + Pj); - if (neverPruneUnderflow - && Pj.edgeSample.estimateEnum == EstimateEnum.Underflow) { - // Do not prune if path has cardinality underflow. - continue; + if (pruned.contains(Pj)) { + // already pruned. + continue; } - if (pruned.contains(Pj)) - continue; final boolean isPiSuperSet = Pi.isUnorderedVariant(Pj); if (!isPiSuperSet) { // Can not directly compare these join paths. 
continue; } - final long costPi = Pi.sumEstCard; - final long costPj = Pj.sumEstCard; + final long costPi = Pi.sumEstCost; + final long costPj = Pj.sumEstCost; final boolean lte = costPi <= costPj; List<Integer> prunedByThisPath = null; if (lte) { @@ -1418,7 +1519,7 @@ static public String showTable(final Path[] a,final Path[] pruned) { final StringBuilder sb = new StringBuilder(); final Formatter f = new Formatter(sb); - f.format("%-4s %10s%1s * %10s (%8s %8s %8s %8s %8s %8s) = %10s %10s%1s : %10s %10s %10s", + f.format("%-4s %10s%1s * %10s (%8s %8s %8s %8s %8s %8s) = %10s %10s%1s : %10s %10s %10s %10s", "path",// "srcCard",// "",// sourceSampleExact @@ -1434,8 +1535,9 @@ "estRead",// "estCard",// "",// estimateIs(Exact|LowerBound|UpperBound) - "sumEstRead",// sumEstimatedTuplesRead - "sumEstCard",// sumEstimatedCardinality + "sumEstRead",// + "sumEstCard",// + "sumEstCost",// "joinPath\n" ); for (int i = 0; i < a.length; i++) { @@ -1453,11 +1555,11 @@ } final EdgeSample edgeSample = x.edgeSample; if (edgeSample == null) { - f.format("%4d %10s%1s * %10s (%8s %8s %8s %8s %8s %8s) = %10s %10s%1s : %10s %10s",// - i, NA, "", NA, NA, NA, NA, NA, NA, NA, NA, NA, "", NA, + f.format("%4d %10s%1s * %10s (%8s %8s %8s %8s %8s %8s) = %10s %10s%1s : %10s %10s %10s",// + i, NA, "", NA, NA, NA, NA, NA, NA, NA, NA, NA, "", NA, NA, NA); } else { - f.format("%4d %10d%1s * % 10.2f (%8d %8d %8d %8d %8d %8d) = %10d % 10d%1s : % 10d % 10d", // + f.format("%4d %10d%1s * % 10.2f (%8d %8d %8d %8d %8d %8d) = %10d % 10d%1s : % 10d % 10d % 10d", // i,// edgeSample.sourceSample.estCard,// edgeSample.sourceSample.estimateEnum.getCode(),// @@ -1473,7 +1575,8 @@ edgeSample.estCard,// edgeSample.estimateEnum.getCode(),// x.sumEstRead,// - x.sumEstCard// + x.sumEstCard,// + x.sumEstCost ); } sb.append(" ["); Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JoinGraph.java =================================================================== --- 
branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JoinGraph.java 2011-02-28 16:24:53 UTC (rev 4259) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JoinGraph.java 2011-02-28 16:26:37 UTC (rev 4260) @@ -38,6 +38,7 @@ import com.bigdata.bop.IBindingSet; import com.bigdata.bop.IConstraint; import com.bigdata.bop.IPredicate; +import com.bigdata.bop.IVariable; import com.bigdata.bop.NV; import com.bigdata.bop.PipelineOp; import com.bigdata.bop.ap.SampleIndex; @@ -79,6 +80,11 @@ */ public interface Annotations extends PipelineOp.Annotations { + /** + * The variables which are projected out of the join graph. + */ + String SELECTED = JoinGraph.class.getName() + ".selected"; + /** * The vertices of the join graph, expressed an an {@link IPredicate}[] * (required). @@ -120,6 +126,15 @@ } /** + * @see Annotations#SELECTED + */ + public IVariable<?>[] getSelected() { + + return (IVariable[]) getRequiredProperty(Annotations.SELECTED); + + } + + /** * @see Annotations#VERTICES */ public IPredicate<?>[] getVertices() { @@ -176,6 +191,15 @@ super(args, anns); // required property. + final IVariable<?>[] selected = (IVariable[]) getProperty(Annotations.SELECTED); + + if (selected == null) + throw new IllegalArgumentException(Annotations.SELECTED); + + if (selected.length == 0) + throw new IllegalArgumentException(Annotations.SELECTED); + + // required property. final IPredicate<?>[] vertices = (IPredicate[]) getProperty(Annotations.VERTICES); if (vertices == null) @@ -253,6 +277,12 @@ } + /** + * {@inheritDoc} + * + * + * TODO where to handle DISTINCT, ORDER BY, GROUP BY for join graph? + */ public Void call() throws Exception { // Create the join graph. @@ -266,9 +296,10 @@ // Factory avoids reuse of bopIds assigned to the predicates. final BOpIdFactory idFactory = new BOpIdFactory(); - // Generate the query from the join path. 
- final PipelineOp queryOp = PartitionedJoinGroup.getQuery(idFactory, - p.getPredicates(), getConstraints()); + // Generate the query from the join path. + final PipelineOp queryOp = PartitionedJoinGroup.getQuery(idFactory, + false/* distinct */, getSelected(), p.getPredicates(), + getConstraints()); // Run the query, blocking until it is done. JoinGraph.runSubquery(context, queryOp); Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/Path.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/Path.java 2011-02-28 16:24:53 UTC (rev 4259) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/Path.java 2011-02-28 16:26:37 UTC (rev 4260) @@ -158,21 +158,26 @@ * reflects the #of tuples read from the disk. These two measures are * tracked separately and then combined into the {@link #sumEstCost}. */ - final public double sumEstCost; - - /** - * Combine the cumulative expected cardinality and the cumulative expected - * tuples read to produce an overall measure of the expected cost of the - * join path if it were to be fully executed. - * - * @return The cumulative estimated cost of the join path. - */ - private static double getTotalCost(final Path p) { + final public long sumEstCost; - final long total = p.sumEstCard + // - p.sumEstRead// - ; + /** + * Combine the cumulative expected cardinality and the cumulative expected + * tuples read to produce an overall measure of the expected cost of the + * join path if it were to be fully executed. + * + * @return The cumulative estimated cost of the join path. + * + * TODO Compute this incrementally as estCost using estRead and + * estCard and then take the running sum as sumEstCost and update + * the JGraph trace appropriately. 
+ */ + private static long getCost(final long sumEstRead, final long sumEstCard) { + final long total; +// total = sumEstCard + sumEstRead; // intermediate results + IO. +// total = sumEstRead; // just IO + total = sumEstCard; // just intermediate results. + return total; } @@ -193,8 +198,12 @@ // sb.append(e.getLabel()); // first = false; // } - sb.append("],cumEstCard=" + sumEstCard - + ",sample=" + edgeSample + "}"); + sb.append("]"); + sb.append(",sumEstRead=" + sumEstRead); + sb.append(",sumEstCard=" + sumEstCard); + sb.append(",sumEstCost=" + sumEstCost); + sb.append(",sample=" + edgeSample); + sb.append("}"); return sb.toString(); } @@ -243,18 +252,18 @@ this.edgeSample = edgeSample; /* - * The estimated cardinality of the cutoff join of (v0,v1). - */ - this.sumEstCard = edgeSample.estCard; - - /* * The expected #of tuples read for the full join of (v0,v1). This is * everything which could be visited for [v0] plus the #of tuples read * from [v1] during the cutoff join times the (adjusted) join hit ratio. */ this.sumEstRead = v0.sample.estCard + edgeSample.estRead; - this.sumEstCost = getTotalCost(this); + /* + * The estimated cardinality of the cutoff join of (v0,v1). + */ + this.sumEstCard = edgeSample.estCard; + + this.sumEstCost = getCost(this.sumEstRead, this.sumEstCard); } @@ -312,7 +321,7 @@ this.sumEstRead = sumEstRead; - this.sumEstCost = getTotalCost(this); + this.sumEstCost = getCost(this.sumEstRead, this.sumEstCard); } @@ -618,29 +627,27 @@ // The new vertex. final Vertex targetVertex = vnew; - /* - * Chain sample the edge. - * - * Note: ROX uses the intermediate result I(p) for the existing path as - * the input when sampling the edge. The corresponding concept for us is - * the sample for this Path, which will have all variable bindings - * produced so far. In order to estimate the cardinality of the new join - * path we have to do a one step cutoff evaluation of the new Edge, - * given the sample available on the current Path. 
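The one-step cutoff evaluation described in the comment above reduces to simple arithmetic: run the join against a limited sample of the input, then scale the observed join hit ratio by the estimated input cardinality. A hedged sketch follows; the method name and the simplification are mine, not the EdgeSample API.

```java
public class CutoffJoinEstimate {

    /**
     * Estimate output cardinality from a cutoff (limited) sample of a join.
     * Simplified model of the RTO's edge sampling: the join hit ratio is the
     * observed outputs per consumed input, scaled by the estimated input
     * cardinality.
     *
     * @param inputEstCard estimated cardinality of the input solutions.
     * @param inConsumed   #of sampled input solutions actually consumed.
     * @param outObserved  #of output solutions observed before the cutoff.
     */
    static long estimateCardinality(final long inputEstCard,
            final int inConsumed, final int outObserved) {
        if (inConsumed == 0) {
            // No inputs consumed: this is a cardinality estimate underflow.
            return 0L;
        }
        final double joinHitRatio = (double) outObserved / inConsumed;
        return (long) (inputEstCard * joinHitRatio);
    }

    public static void main(final String[] args) {
        // 100 sampled inputs produced 250 outputs; input estCard was 1000.
        System.out.println(estimateCardinality(1000L, 100, 250)); // 2500
    }
}
```

When the sample produces zero outputs and is not exact, the estimate underflows, which is exactly the case the surrounding comment says blocks further probing on that join path.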
- * - * FIXME It is possible for the resulting edge sample to be empty (no - * solutions). Unless the sample also happens to be exact, this is an - * indication that the estimated cardinality has underflowed. We track - * the estimated cumulative cardinality, so this does not make the join - * path an immediate winner, but it does mean that we can not probe - * further on that join path as we lack any intermediate solutions to - * feed into the downstream joins. To resolve that, we have to increase - * the sample limit (unless the path is the winner, in which case we can - * fully execute the join path segment and materialize the results and - * use those to probe further, but this will require the use of the - * memory manager to keep the materialized intermediate results off of - * the Java heap). - */ + /* + * Chain sample the edge. + * + * Note: ROX uses the intermediate result I(p) for the existing path as + * the input when sampling the edge. The corresponding concept for us is + * the sample for this Path, which will have all variable bindings + * produced so far. In order to estimate the cardinality of the new join + * path we have to do a one step cutoff evaluation of the new Edge, + * given the sample available on the current Path. + * + * Note: It is possible for the resulting edge sample to be empty (no + * solutions). Unless the sample also happens to be exact, this is an + * indication that the estimated cardinality has underflowed. We track + * the estimated cumulative cardinality, so this does not make the join + * path an immediate winner, but it does mean that we can not probe + * further on that join path as we lack any intermediate solutions to + * feed into the downstream joins. To resolve that, we have to increase + * the sample limit (unless the path is the winner, in which case we can + * fully execute the join path segment and materialize the results and + * use those to probe further). 
+ */ // Ordered array of all predicates including the target vertex. final IPredicate<?>[] preds2; @@ -846,15 +853,22 @@ runningQuery.iterator()); while (itr.hasNext()) { bset = itr.next(); - result.add(bset); - nresults++; // TODO break out if cutoff join over produces! + result.add(bset); + if (nresults++ >= limit) { + // Break out if cutoff join over produces! + break; + } } } finally { - // verify no problems. - runningQuery.get(); // TODO CANCEL query once limit is satisfied THEN check the future for errors. + // ensure terminated regardless. + runningQuery.cancel(true/* mayInterruptIfRunning */); } - } finally { - runningQuery.cancel(true/* mayInterruptIfRunning */); + } finally { + // verify no problems. + if (runningQuery.getCause() != null) { + // wrap throwable from abnormal termination. + throw new RuntimeException(runningQuery.getCause()); + } } // The join hit ratio can be computed directly from these stats. Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/joinGraph/TestPartitionedJoinGroup_canJoinUsingConstraints.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/joinGraph/TestPartitionedJoinGroup_canJoinUsingConstraints.java 2011-02-28 16:24:53 UTC (rev 4259) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/joinGraph/TestPartitionedJoinGroup_canJoinUsingConstraints.java 2011-02-28 16:26:37 UTC (rev 4260) @@ -809,6 +809,49 @@ } /** + * <code>[5 6 0 2 1 4 3]</code>. + * + * FIXME The above join path produces a false ZERO result for the query and + * all of the join path segments below produce a false exact ZERO (0E) + * cardinality estimate. Figure out why. The final path chosen could have + * been any of the one step extensions of a path with a false 0E cardinality + * estimate. 
+ * + * <pre> + * INFO : 3529 main com.bigdata.bop.joinGraph.rto.JGraph.expand(JGraph.java:1116): + * ** round=4: paths{in=14,considered=26,out=6} + * path srcCard * f ( in sumRgCt tplsRead out limit adjCard) = estRead estCard : sumEstRead sumEstCard sumEstCost joinPath + * 0 0E * 0.00 ( 0 0 0 0 200 0) = 0 0E : 1 0 0 [ 5 6 0 2 1 4 ] + * 1 0E * 0.00 ( 0 0 0 0 200 0) = 0 0E : 1 0 0 [ 5 6 0 2 4 3 ] + * 2 0E * 0.00 ( 0 0 0 0 200 0) = 0 0E : 1 0 0 [ 5 6 0 4 1 3 ] + * 3 0E * 0.00 ( 0 0 0 0 200 0) = 0 0E : 1 0 0 [ 5 6 2 1 4 3 ] + * 4 208 * 1.00 ( 26 26 26 26 400 26) = 26 208 : 16576 1447 1447 [ 5 3 1 2 4 0 ] + * 5 0E * 0.00 ( 0 0 0 0 200 0) = 0 0E : 2 1 1 [ 5 3 6 0 1 2 ] + * </pre> + */ + public void test_attachConstraints_BSBM_Q5_path04() { + + final IPredicate<?>[] path = { p5, p6, p0, p2, p1, p4, p3 }; + + final IConstraint[][] actual = PartitionedJoinGroup + .getJoinGraphConstraints(path, constraints, + null/* knownBoundVars */, true/* pathIsComplete */); + + final Set<IConstraint>[] expected = new Set[] { // + NA, // p5 + asSet(new IConstraint[] { c0, c2 }), // p6 + NA, // p0 + NA, // p2 + NA, // p1 + NA, // p4 + C1, // p3 + }; + + assertSameConstraints(expected, actual); + + } + + /** * Verifies that the right set of constraints is attached at each of the * vertices of a join path. Comparison of {@link IConstraint} instances is * by reference. 
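The constraint-attachment behavior exercised by this test can be modeled simply: walk the join order, accumulate the bound variables, and attach each constraint at the first predicate by which all of its variables are bound. The sketch below is a hedged simplification of PartitionedJoinGroup.getJoinGraphConstraints() using variable-name sets rather than the real IPredicate/IConstraint types.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ConstraintAttachmentSketch {

    /**
     * Return the index of the first predicate in the join order at which all
     * variables used by the constraint have been bound, or -1 if the
     * constraint can never run against this join order.
     */
    static int attachmentPoint(final List<Set<String>> predVars,
            final Set<String> constraintVars) {
        final Set<String> bound = new HashSet<>();
        for (int i = 0; i < predVars.size(); i++) {
            bound.addAll(predVars.get(i)); // variables bound by predicate i.
            if (bound.containsAll(constraintVars))
                return i; // constraint may run here and at later joins.
        }
        return -1;
    }

    public static void main(final String[] args) {
        final List<Set<String>> joinOrder = List.of(
                Set.of("product", "label"),          // p0
                Set.of("product", "origProperty1"),  // p1
                Set.of("product", "simProperty1"));  // p2
        // A constraint over (origProperty1, simProperty1) attaches at p2.
        System.out.println(attachmentPoint(joinOrder,
                Set.of("origProperty1", "simProperty1"))); // 2
    }
}
```

This is also why a constraint can make a join "indirectly constrained": the constraint shares a variable with the join path even when the target predicate does not.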
Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/AbstractJoinGraphTestCase.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/AbstractJoinGraphTestCase.java 2011-02-28 16:24:53 UTC (rev 4259) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/AbstractJoinGraphTestCase.java 2011-02-28 16:26:37 UTC (rev 4260) @@ -43,6 +43,7 @@ import com.bigdata.bop.IBindingSet; import com.bigdata.bop.IConstraint; import com.bigdata.bop.IPredicate; +import com.bigdata.bop.IVariable; import com.bigdata.bop.PipelineOp; import com.bigdata.bop.ap.SampleIndex.SampleType; import com.bigdata.bop.engine.BOpStats; @@ -82,12 +83,8 @@ private Journal jnl; -// protected AbstractTripleStore database; + protected QueryEngine queryEngine; -// private String namespace; - - private QueryEngine queryEngine; - /** The initial sampling limit. */ static private final int limit = 100; @@ -236,12 +233,14 @@ * JVM run using the known solutions produced by the runtime versus * static query optimizers. 
*/ - protected IPredicate<?>[] doTest(final IPredicate<?>[] preds, - final IConstraint[] constraints) throws Exception { + protected IPredicate<?>[] doTest(final boolean distinct, + final IVariable<?>[] selected, final IPredicate<?>[] preds, + final IConstraint[] constraints) throws Exception { if (warmUp) - runQuery("Warmup", queryEngine, runStaticQueryOptimizer( - getQueryEngine(), preds), constraints); + runQuery("Warmup", queryEngine, distinct, selected, + runStaticQueryOptimizer(getQueryEngine(), preds), + constraints); /* * Run the runtime query optimizer once (its cost is not counted @@ -269,8 +268,8 @@ if (runGivenOrder) { final long begin = System.currentTimeMillis(); - final BOpStats stats = runQuery(GIVEN, queryEngine, preds, - constraints); + final BOpStats stats = runQuery(GIVEN, queryEngine, distinct, + selected, preds, constraints); final long nout = stats.unitsOut.get(); if (i == 0) givenSolutions = nout; @@ -285,9 +284,9 @@ if (runStaticQueryOptimizer) { final long begin = System.currentTimeMillis(); - final BOpStats stats = runQuery(STATIC, queryEngine, - runStaticQueryOptimizer(getQueryEngine(), preds), - constraints); + final BOpStats stats = runQuery(STATIC, queryEngine, distinct, + selected, runStaticQueryOptimizer(getQueryEngine(), + preds), constraints); final long nout = stats.unitsOut.get(); if (i == 0) staticSolutions = nout; @@ -311,8 +310,8 @@ // Evaluate the query using the selected join order. final long begin = System.currentTimeMillis(); - final BOpStats stats = runQuery(RUNTIME, queryEngine, - runtimePredOrder, constraints); + final BOpStats stats = runQuery(RUNTIME, queryEngine, distinct, + selected, runtimePredOrder, constraints); final long nout = stats.unitsOut.get(); if (i == 0) runtimeSolutions = nout; @@ -428,14 +427,15 @@ } - /** - * Run a query joining a set of {@link IPredicate}s in the given join order. - * - * @return The stats for the last operator in the pipeline. 
- */ - private static BOpStats runQuery(final String msg, - final QueryEngine queryEngine, final IPredicate<?>[] predOrder, - final IConstraint[] constraints) throws Exception { + /** + * Run a query joining a set of {@link IPredicate}s in the given join order. + * + * @return The stats for the last operator in the pipeline. + */ + protected static BOpStats runQuery(final String msg, + final QueryEngine queryEngine, final boolean distinct, + final IVariable<?>[] selected, final IPredicate<?>[] predOrder, + final IConstraint[] constraints) throws Exception { if (log.isInfoEnabled()) log.info("Running " + msg); @@ -455,18 +455,16 @@ } final PipelineOp queryOp = PartitionedJoinGroup.getQuery(idFactory, - predOrder, constraints); + distinct, selected, predOrder, constraints); System.out.println(BOpUtility.toString(queryOp)); - // submit query to runtime optimizer. + // run the query, counting results and chunks. + long nout = 0; + long nchunks = 0; final IRunningQuery q = queryEngine.eval(queryOp); - try { - // drain the query results. - long nout = 0; - long nchunks = 0; final IAsynchronousIterator<IBindingSet[]> itr = q.iterator(); try { while (itr.hasNext()) { @@ -477,25 +475,26 @@ } finally { itr.close(); } + } finally { + // ensure terminated. + q.cancel(true/* mayInterruptIfRunning */); + } - // check the Future for the query. - q.get(); + // Check the Future for the query. + if (q.getCause() != null) { + // Wrap Throwable from abnormal termination. + throw new RuntimeException(q.getCause()); + } - // show the results. - final BOpStats stats = q.getStats().get(queryOp.getId()); + // show the results. 
+ final BOpStats stats = q.getStats().get(queryOp.getId()); - System.err.println(msg + " : ids=" + Arrays.toString(ids) - + ", elapsed=" + q.getElapsed() + ", nout=" + nout - + ", nchunks=" + nchunks + ", stats=" + stats); + System.err.println(msg + " : ids=" + Arrays.toString(ids) + + ", elapsed=" + q.getElapsed() + ", nout=" + nout + + ", nchunks=" + nchunks + ", stats=" + stats); - return stats; + return stats; - } finally { - - q.cancel(true/* mayInterruptIfRunning */); - - } - } /** Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java 2011-02-28 16:24:53 UTC (rev 4259) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java 2011-02-28 16:26:37 UTC (rev 4260) @@ -18,6 +18,7 @@ import com.bigdata.bop.NV; import com.bigdata.bop.Var; import com.bigdata.bop.IPredicate.Annotations; +import com.bigdata.bop.engine.QueryEngine; import com.bigdata.bop.engine.QueryLog; import com.bigdata.bop.joinGraph.rto.JoinGraph; import com.bigdata.journal.ITx; @@ -273,6 +274,8 @@ } } + final boolean distinct = true; + final IVariable<?>[] selected; final IConstraint[] constraints; final IPredicate[] preds; final IPredicate p0, p1, p2, p3, p4, p5, p6; @@ -285,6 +288,8 @@ final IVariable origProperty1 = Var.var("origProperty1"); final IVariable origProperty2 = Var.var("origProperty2"); + selected = new IVariable[] { product, productLabel }; + // The name space for the SPO relation. 
final String[] spoRelation = new String[] { namespace + ".spo" }; @@ -473,9 +478,10 @@ test_bsbm_q5 : Total times: static=8871, runtime=8107, delta(static-runtime)=764 */ - final IPredicate<?>[] runtimeOrder = doTest(preds, null/* constraints */); - assertEquals("runtimeOrder", new int[] { 1, 2, 0, 4, 6, 3, 5 }, - BOpUtility.getPredIds(runtimeOrder)); + final IPredicate<?>[] runtimeOrder = doTest(distinct, selected, + preds, null/* constraints */); + assertEquals("runtimeOrder", new int[] { 1, 2, 0, 4, 6, 3, 5 }, + BOpUtility.getPredIds(runtimeOrder)); } // Run w/ constraints. @@ -508,11 +514,32 @@ test_bsbm_q5 : Total times: static=7312, runtime=3305, delta(static-runtime)=4007 */ - final IPredicate<?>[] runtimeOrder = doTest(preds, constraints); - assertEquals("runtimeOrder", new int[] { 1, 2, 4, 3, 6, 5, 0 }, - BOpUtility.getPredIds(runtimeOrder)); + final IPredicate<?>[] runtimeOrder = doTest(distinct, selected, + preds, constraints); + /* + * FIXME The RTO produces join paths on some runs which appear to + * have no solutions. I've written a unit test for constraint + * attachment for the case below, but the constraints appear to be + * attached correctly. I've also run the "bad" join path directly + * (see below) and it finds the correct #of solutions. This is + * pretty weird. + */ + // [5, 3, 1, 2, 4, 6, 0] - Ok and faster. + // [5, 3, 1, 2, 4, 6, 0] - Ok and faster (8828 vs 3621) + // [5, 6, 0, 2, 1, 4, 3] - no results!!! + // [5, 6, 0, 2, 1, 4, 3] - again, no results. + assertEquals("runtimeOrder", new int[] { 1, 2, 4, 3, 6, 5, 0 }, + BOpUtility.getPredIds(runtimeOrder)); } + if(false){ + // Run some fixed order. 
+// final IPredicate<?>[] path = { p5, p6, p0, p2, p1, p4, p3 }; + final IPredicate<?>[] path = { p5, p3, p1, p2, p4, p6, p0 }; + runQuery("FIXED ORDER", queryEngine, distinct, selected, path, + constraints); + } + } /** @@ -600,6 +627,8 @@ } } + final boolean distinct = false; + final IVariable<?>[] selected; final IConstraint[] constraints; final IPredicate[] preds; final IPredicate p0, p1, p2, p3, p4, p5, p6; @@ -611,6 +640,8 @@ final IVariable p3Var = Var.var("p3"); final IVariable testVar = Var.var("testVar"); + selected = new IVariable[]{product,label}; + // The name space for the SPO relation. final String[] spoRelation = new String[] { namespace + ".spo" }; @@ -730,8 +761,8 @@ * FIXME The optional join group is part of the tail plan and can not be * fed into the RTO right now. */ - final IPredicate<?>[] runtimeOrder = doTest(preds, new IConstraint[] { - c0, c1 }); + final IPredicate<?>[] runtimeOrder = doTest(distinct, selected, preds, + new IConstraint[] { c0, c1 }); } Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBarData.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBarData.java 2011-02-28 16:24:53 UTC (rev 4259) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBarData.java 2011-02-28 16:26:37 UTC (rev 4260) @@ -265,15 +265,16 @@ } } - final IPredicate[] preds; - final IPredicate p0, p1, p2, p3, p4, p5; + final IVariable<?>[] selected; + final IPredicate<?>[] preds; + final IPredicate<?> p0, p1, p2, p3, p4, p5; { // a, value, d, b, f - final IVariable<?> a = Var.var("a"); - final IVariable<?> value = Var.var("value"); - final IVariable<?> d = Var.var("d"); - final IVariable<?> b = Var.var("b"); - final IVariable<?> f = Var.var("f"); + final IVariable<?> a = Var.var("a"); // ?item + final IVariable<?> value = Var.var("value"); 
// ?value + final IVariable<?> d = Var.var("d"); // ?type + final IVariable<?> b = Var.var("b"); // ?order + final IVariable<?> f = Var.var("f"); // ?employeeNum final IVariable<?> g0 = Var.var("g0"); final IVariable<?> g1 = Var.var("g1"); @@ -281,13 +282,14 @@ final IVariable<?> g3 = Var.var("g3"); final IVariable<?> g4 = Var.var("g4"); final IVariable<?> g5 = Var.var("g5"); - + selected = new IVariable[]{f,d}; + // The name space for the SPO relation. final String[] spoRelation = new String[] { namespace + ".spo" }; - // The name space for the Lexicon relation. - final String[] lexRelation = new String[] { namespace + ".lex" }; +// // The name space for the Lexicon relation. +// final String[] lexRelation = new String[] { namespace + ".lex" }; final long timestamp = database.getIndexManager().getLastCommitTime(); @@ -348,7 +350,10 @@ } - final IPredicate<?>[] runtimeOrder = doTest(preds, null/* constraints */); + // TODO Should use GROUP BY with SELECT expression rather than DISTINCT. + final boolean distinct = true; + final IPredicate<?>[] runtimeOrder = doTest(distinct, selected, preds, + null/* constraints */); { /* @@ -360,8 +365,10 @@ */ // after the refactor. - final int[] expected = new int[]{0, 1, 2, 3, 4, 5}; - + final int[] expected; + expected = new int[] { 0, 1, 2, 3, 4, 5 }; // estCard +// expected = new int[] { 0, 1, 2, 3, 4, 5 }; // estRead + // before the refactor. 
// final int[] expected = new int[] { 0, 1, 3, 2, 4, 5 }; Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnLubm.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnLubm.java 2011-02-28 16:24:53 UTC (rev 4259) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnLubm.java 2011-02-28 16:26:37 UTC (rev 4260) @@ -368,13 +368,17 @@ } } - final IPredicate[] preds; - final IPredicate p0, p1, p2, p3, p4, p5; + final boolean distinct = false; + final IVariable<?>[] selected; + final IPredicate<?>[] preds; + final IPredicate<?> p0, p1, p2, p3, p4, p5; { final IVariable<?> x = Var.var("x"); final IVariable<?> y = Var.var("y"); final IVariable<?> z = Var.var("z"); + selected = new IVariable[] { x, y, z }; + // The name space for the SPO relation. final String[] relation = new String[] { namespace + ".spo" }; @@ -437,29 +441,34 @@ preds = new IPredicate[] { p0, p1, p2, p3, p4, p5 }; } - final IPredicate<?>[] runtimeOrder = doTest(preds, null/* constraints */); + final IPredicate<?>[] runtimeOrder = doTest(distinct, selected, preds, + null/* constraints */); - if(!useExistingJournal) { - /* - * Verify that the runtime optimizer produced the expected join - * path. - * - * Note: There are no solutions for this query against U1. The - * optimizer is only providing the fastest path to prove that. We - * have to use a larger data set if we want to verify the optimizers - * join path for a query which produces solutions in the data. 
- */ - + final int[] expected; + if(useExistingJournal) { + // after refactor on U50 + expected = new int[] {2, 4, 5, 3, 0, 1}; // based on estCard +// expected = new int[] {1, 4, 5, 3, 0, 2}; // based on estRead + } else { // order produced after refactor - final int[] expected = new int[] { 4, 5, 0, 1, 2, 3 }; + expected = new int[] { 4, 5, 0, 1, 2, 3 }; // order produced before refactor. -// final int[] expected = new int[] { 4, 5, 0, 3, 1, 2 }; +// expected = new int[] { 4, 5, 0, 3, 1, 2 }; + } - assertEquals("runtimeOrder", expected, BOpUtility - .getPredIds(runtimeOrder)); - } + /* + * Verify that the runtime optimizer produced the expected join path. + * + * Note: There are no solutions for this query against U1. The optimizer + * is only providing the fastest path to prove that. We have to use a + * larger data set if we want to verify the optimizers join path for a + * query which produces solutions in the data. + */ + assertEquals("runtimeOrder", expected, BOpUtility + .getPredIds(runtimeOrder)); + } // LUBM_Q2 /** @@ -527,13 +536,17 @@ } } - final IPredicate[] preds; - final IPredicate p0, p1, p2, p3, p4; + final boolean distinct = false; + final IVariable<?>[] selected; + final IPredicate<?>[] preds; + final IPredicate<?> p0, p1, p2, p3, p4; { final IVariable<?> x = Var.var("x"); final IVariable<?> y = Var.var("y"); final IVariable<?> z = Var.var("z"); + selected = new IVariable[]{x,y,z}; + // The name space for the SPO relation. final String[] relation = new String[] { namespace + ".spo" }; @@ -588,17 +601,18 @@ preds = new IPredicate[] { p0, p1, p2, p3, p4 }; } - final IPredicate<?>[] runtimeOrder = doTest(preds, null/* constraints */); + final IPredicate<?>[] runtimeOrder = doTest(distinct, selected, preds, + null/* constraints */); - if (!useExistingJournal) { - /* - * Verify that the runtime optimizer produced the expected join - * path. 
- */ - assertEquals("runtimeOrder", new int[] { 3, 0, 2, 1, 4 }, - BOpUtility.getPredIds(runtimeOrder)); - } - + /* + * Verify that the runtime optimizer produced the expected join path. + */ + final int[] expected; + expected = new int[] { 3, 0, 2, 1, 4 };// estCard +// expected = new int[] { 3, 0, 2, 1, 4 };// estRead + assertEquals("runtimeOrder", expected, BOpUtility + .getPredIds(runtimeOrder)); + } // LUBM Q8 /** @@ -667,13 +681,17 @@ } } - final IPredicate[] preds; - final IPredicate p0, p1, p2, p3, p4, p5; + final boolean distinct = false; + final IVariable<?>[] selected; + final IPredicate<?>[] preds; + final IPredicate<?> p0, p1, p2, p3, p4, p5; { final IVariable<?> x = Var.var("x"); final IVariable<?> y = Var.var("y"); final IVariable<?> z = Var.var("z"); + selected = new IVariable[] { x, y, z }; + // The name space for the SPO relation. final String[] relation = new String[] { namespace + ".spo" }; @@ -736,23 +754,26 @@ preds = new IPredicate[] { p0, p1, p2, p3, p4, p5 }; } - final IPredicate<?>[] runtimeOrder = doTest(preds, null/* constraints */); + final IPredicate<?>[] runtimeOrder = doTest(distinct, selected, preds, + null/* constraints */); - if (!useExistingJournal) { - /* - * Verify that the runtime optimiz... [truncated message content] |
From: <tho...@us...> - 2011-02-28 16:24:59
|
Revision: 4259 http://bigdata.svn.sourceforge.net/bigdata/?rev=4259&view=rev Author: thompsonbry Date: 2011-02-28 16:24:53 +0000 (Mon, 28 Feb 2011) Log Message: ----------- Fixed parameter checks for DistinctBindingSetOp Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/DistinctBindingSetOp.java branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/solutions/TestDistinctBindingSets.java Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/DistinctBindingSetOp.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/DistinctBindingSetOp.java 2011-02-27 21:14:54 UTC (rev 4258) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/solutions/DistinctBindingSetOp.java 2011-02-28 16:24:53 UTC (rev 4259) @@ -1,5 +1,6 @@ package com.bigdata.bop.solutions; +import java.util.Arrays; import java.util.LinkedList; import java.util.List; import java.util.Map; @@ -7,6 +8,8 @@ import java.util.concurrent.ConcurrentHashMap; import java.util.concurrent.FutureTask; +import org.apache.log4j.Logger; + import com.bigdata.bop.BOp; import com.bigdata.bop.BOpContext; import com.bigdata.bop.ConcurrentHashMapAnnotations; @@ -32,6 +35,9 @@ */ public class DistinctBindingSetOp extends PipelineOp { + private final static transient Logger log = Logger + .getLogger(DistinctBindingSetOp.class); + /** * */ @@ -74,11 +80,16 @@ } // shared state is used to share the hash table. 
- if (isSharedState()) { + if (!isSharedState()) { throw new UnsupportedOperationException(Annotations.SHARED_STATE + "=" + isSharedState()); } - + + final IVariable<?>[] vars = (IVariable[]) getProperty(Annotations.VARIABLES); + + if (vars == null || vars.length == 0) + throw new IllegalArgumentException(); + } /** @@ -266,8 +277,14 @@ final Solution s = new Solution(r); + if (log.isTraceEnabled()) + log.trace("considering: " + Arrays.toString(r)); + final boolean distinct = map.putIfAbsent(s, s) == null; + if (distinct && log.isDebugEnabled()) + log.debug("accepted: " + Arrays.toString(r)); + return distinct ? r : null; } @@ -297,8 +314,6 @@ for (IBindingSet bset : a) { -// System.err.println("considering: " + bset); - /* * Test to see if this solution is distinct from those * already seen. @@ -315,9 +330,6 @@ * this operator. */ -// System.err.println("accepted: " -// + Arrays.toString(vals)); - final ListBindingSet tmp = new ListBindingSet(); for (int i = 0; i < vars.length; i++) { Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/solutions/TestDistinctBindingSets.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/solutions/TestDistinctBindingSets.java 2011-02-27 21:14:54 UTC (rev 4258) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/solutions/TestDistinctBindingSets.java 2011-02-28 16:24:53 UTC (rev 4259) @@ -42,6 +42,7 @@ import com.bigdata.bop.IConstant; import com.bigdata.bop.IVariable; import com.bigdata.bop.NV; +import com.bigdata.bop.PipelineOp; import com.bigdata.bop.Var; import com.bigdata.bop.bindingSet.ArrayBindingSet; import com.bigdata.bop.bindingSet.HashBindingSet; @@ -160,7 +161,66 @@ data = null; } + + public void test_ctor_correctRejection() { + + final Var<?> x = Var.var("x"); + final int distinctId = 1; + + // w/o variables. 
+ try { + new DistinctBindingSetOp(new BOp[]{}, + NV.asMap(new NV[]{// + new NV(DistinctBindingSetOp.Annotations.BOP_ID,distinctId),// +// new NV(DistinctBindingSetOp.Annotations.VARIABLES,new IVariable[]{x}),// + new NV(PipelineOp.Annotations.EVALUATION_CONTEXT, + BOpEvaluationContext.CONTROLLER),// + new NV(PipelineOp.Annotations.SHARED_STATE, + true),// + })); + fail("Expecting: "+IllegalArgumentException.class); + } catch(IllegalArgumentException ex) { + if(log.isInfoEnabled()) + log.info("Ignoring expected exception: "+ex); + } + + // w/o evaluation on the query controller. + try { + new DistinctBindingSetOp(new BOp[]{}, + NV.asMap(new NV[]{// + new NV(DistinctBindingSetOp.Annotations.BOP_ID,distinctId),// + new NV(DistinctBindingSetOp.Annotations.VARIABLES,new IVariable[]{x}),// +// new NV(PipelineOp.Annotations.EVALUATION_CONTEXT, +// BOpEvaluationContext.CONTROLLER),// + new NV(PipelineOp.Annotations.SHARED_STATE, + true),// + })); + fail("Expecting: "+UnsupportedOperationException.class); + } catch(UnsupportedOperationException ex) { + if(log.isInfoEnabled()) + log.info("Ignoring expected exception: "+ex); + } + + // w/o shared state. + try { + new DistinctBindingSetOp(new BOp[]{}, + NV.asMap(new NV[]{// + new NV(DistinctBindingSetOp.Annotations.BOP_ID,distinctId),// + new NV(DistinctBindingSetOp.Annotations.VARIABLES,new IVariable[]{x}),// + new NV(PipelineOp.Annotations.EVALUATION_CONTEXT, + BOpEvaluationContext.CONTROLLER),// +// new NV(PipelineOp.Annotations.SHARED_STATE, +// true),// + })); + fail("Expecting: "+UnsupportedOperationException.class); + } catch(UnsupportedOperationException ex) { + if(log.isInfoEnabled()) + log.info("Ignoring expected exception: "+ex); + } + + } + /** * Unit test for distinct. 
* @@ -179,10 +239,10 @@ NV.asMap(new NV[]{// new NV(DistinctBindingSetOp.Annotations.BOP_ID,distinctId),// new NV(DistinctBindingSetOp.Annotations.VARIABLES,new IVariable[]{x}),// - new NV(MemorySortOp.Annotations.EVALUATION_CONTEXT, + new NV(PipelineOp.Annotations.EVALUATION_CONTEXT, BOpEvaluationContext.CONTROLLER),// -// new NV(MemorySortOp.Annotations.SHARED_STATE, -// true),// + new NV(PipelineOp.Annotations.SHARED_STATE, + true),// })); // the expected solutions This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
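The parameter checks fixed above (rejecting a `DistinctBindingSetOp` without `VARIABLES`, controller evaluation, or shared state) guard the operator's core idiom: a shared `ConcurrentHashMap` where `putIfAbsent(...) == null` identifies the first occurrence of a projected solution. A minimal, self-contained sketch of that idiom follows — illustrative names only, not the bigdata classes:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Illustrative sketch of the DISTINCT test used by the operator: the first
 * thread to putIfAbsent() a given combination of projected values wins, so
 * the map can be safely shared across concurrent invocations (which is why
 * the operator requires SHARED_STATE and evaluation on the controller).
 */
class DistinctSketch {

    private final ConcurrentHashMap<List<Object>, Boolean> seen =
            new ConcurrentHashMap<>();

    /** Return true iff this combination of values has not been seen before. */
    boolean accept(final Object[] vals) {
        // Arrays use identity equality, so wrap in a List for value equality.
        return seen.putIfAbsent(Arrays.asList(vals), Boolean.TRUE) == null;
    }

    public static void main(final String[] args) {
        final DistinctSketch op = new DistinctSketch();
        System.out.println(op.accept(new Object[] { "x1", "y1" })); // true
        System.out.println(op.accept(new Object[] { "x1", "y1" })); // false (duplicate)
        System.out.println(op.accept(new Object[] { "x2", "y1" })); // true
    }
}
```

The shared-state requirement exists because a per-invocation map would only deduplicate within a single chunk of solutions, not across the whole pipeline.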
From: <tho...@us...> - 2011-02-27 21:15:01
|
Revision: 4258 http://bigdata.svn.sourceforge.net/bigdata/?rev=4258&view=rev Author: thompsonbry Date: 2011-02-27 21:14:54 +0000 (Sun, 27 Feb 2011) Log Message: ----------- Modified the non-SPARQL constraint to not trap any exceptions since there are no strong semantics for that. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/Constraint.java Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/Constraint.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/Constraint.java 2011-02-27 21:14:12 UTC (rev 4257) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/java/com/bigdata/rdf/internal/constraints/Constraint.java 2011-02-27 21:14:54 UTC (rev 4258) @@ -34,6 +34,7 @@ import com.bigdata.bop.IValueExpression; import com.bigdata.rdf.error.SparqlTypeErrorException; import com.bigdata.rdf.internal.IV; +import com.bigdata.util.InnerCause; /** * BOpConstraint that wraps a {@link EBVBOp}, which itself computes the @@ -85,22 +86,30 @@ return (EBVBOp) super.get(i); } - public boolean accept(final IBindingSet bs) { - - try { - - // evaluate the EBV operator - return get(0).get(bs).booleanValue(); - - } catch (SparqlTypeErrorException ex) { - - // trap the type error and filter out the solution - if (log.isInfoEnabled()) - log.info("discarding solution due to type error: " + bs); - return false; - - } - - } - + public boolean accept(final IBindingSet bs) { + + try { + + // evaluate the EBV operator + return get(0).get(bs).booleanValue(); + + } catch (Throwable t) { + + if (InnerCause.isInnerCause(t, SparqlTypeErrorException.class)) { + + // trap the type error and filter out the solution + if (log.isInfoEnabled()) + log.info("discarding solution due to type error: " + bs + + " : " + t); + + return false; + + } + + throw new RuntimeException(t); + + } + + } + } 
|
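The `InnerCause.isInnerCause(...)` test introduced in the diff above distinguishes SPARQL type errors (which silently filter out the solution, per SPARQL evaluation semantics) from genuine failures (which must propagate). A minimal sketch of the pattern, with illustrative class names standing in for the bigdata ones; the cause-chain walk approximates what `InnerCause` does:

```java
import java.util.concurrent.Callable;

/**
 * Illustrative sketch: a type error anywhere in the cause chain means
 * "discard this solution"; anything else is rethrown as a real error.
 */
class ConstraintSketch {

    /** Stand-in for SparqlTypeErrorException. */
    static class TypeErrorException extends RuntimeException {}

    /** Walk the cause chain looking for an instance of the given class. */
    static boolean isInnerCause(Throwable t, final Class<? extends Throwable> cls) {
        while (t != null) {
            if (cls.isInstance(t))
                return true;
            t = t.getCause();
        }
        return false;
    }

    /** Evaluate a boolean expression; type errors filter out the solution. */
    static boolean accept(final Callable<Boolean> ebv) {
        try {
            return ebv.call();
        } catch (Throwable t) {
            if (isInnerCause(t, TypeErrorException.class))
                return false; // discard the solution.
            throw new RuntimeException(t); // everything else propagates.
        }
    }

    public static void main(final String[] args) {
        System.out.println(accept(() -> true));  // true
        // A type error, even wrapped inside another exception, filters:
        System.out.println(accept(() -> { throw new TypeErrorException(); }));
        System.out.println(accept(() -> {
            throw new RuntimeException(new TypeErrorException());
        }));
    }
}
```

Catching `Throwable` rather than the type error directly matters because operators may wrap the type error in another exception as it crosses thread or query-engine boundaries.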
From: <tho...@us...> - 2011-02-27 21:14:18
|
Revision: 4257 http://bigdata.svn.sourceforge.net/bigdata/?rev=4257&view=rev Author: thompsonbry Date: 2011-02-27 21:14:12 +0000 (Sun, 27 Feb 2011) Log Message: ----------- Missed on the last commit. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnLubm.java Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnLubm.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnLubm.java 2011-02-27 21:14:03 UTC (rev 4256) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnLubm.java 2011-02-27 21:14:12 UTC (rev 4257) @@ -141,7 +141,7 @@ * large data set to assess the relative performance of the static and * runtime query optimizers). */ - static private final boolean useExistingJournal = false; + static private final boolean useExistingJournal = true; protected Journal getJournal(final Properties properties) throws Exception { |
From: <tho...@us...> - 2011-02-27 21:14:11
|
Revision: 4256 http://bigdata.svn.sourceforge.net/bigdata/?rev=4256&view=rev Author: thompsonbry Date: 2011-02-27 21:14:03 +0000 (Sun, 27 Feb 2011) Log Message: ----------- Progress on the runtime optimizer. - Tracking more statistics in the JGraph trace. - Tracking the expected #of tuples read as well as the expected cardinality. This lets us look at a proxy for IO costs, but real IO estimates are trickier. - Made the limit dynamic on a per-path basis and responsive when there is cardinality estimate underflow. - Never prune a path if the cardinality estimate has underflowed. Instead, increase the sample limit more quickly. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/EdgeSample.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/EstimatedCardinalityComparator.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JGraph.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/Path.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/SampleBase.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/Vertex.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/VertexSample.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/AbstractJoinGraphTestCase.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBSBMData.java branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/TestJoinGraphOnBarData.java Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/EdgeSample.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/EdgeSample.java 2011-02-27 21:11:23 UTC (rev 4255) +++ 
branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/EdgeSample.java 2011-02-27 21:14:03 UTC (rev 4256) @@ -45,6 +45,11 @@ public final int inputCount; /** + * The #of tuples read from the access path when processing the cutoff join. + */ + public final long tuplesRead; + + /** * The #of binding sets generated before the join was cutoff. * <p> * Note: If the outputCount is zero then this is a good indicator that there @@ -53,10 +58,11 @@ */ public final long outputCount; - /** - * The #of tuples read from the access path when processing the cutoff join. - */ - public final long tuplesRead; + /** + * The adjusted cardinality estimate for the cutoff join (this is + * {@link #outputCount} as adjusted for a variety of edge conditions). + */ + public final long adjCard; /** * The ratio of the #of input samples consumed to the #of output samples @@ -64,37 +70,71 @@ */ public final double f; - /** - * Create an object which encapsulates a sample of an edge. - * - * @param sourceSample - * The input sample. - * @param limit - * The limit used to sample the edge (this is the limit on the - * #of solutions generated by the cutoff join used when this - * sample was taken). - * @param inputCount - * The #of binding sets out of the source sample vertex sample - * which were consumed. - * @param outputCount - * The #of binding sets generated before the join was cutoff. - * @param tuplesRead - * The #of tuples read from the access path when processing the - * cutoff join. - */ + /** + * The sum of the fast range count for each access path tested. + * <p> + * Note: We use pipeline joins to sample cutoff joins so there will be one + * access path read for each solution in. However, a hash join could be used + * when the operator is fully executed. The hash join will have one access + * path on which we read for all input solutions and the range count of the + * access path will be larger since the access path will be less + * constrained. 
+ */ + public final long sumRangeCount; + + /** + * Estimated tuples read if the operator were fully executed. This is in + * contrast to {@link SampleBase#estCard}, which is the estimated output + * cardinality if the operator were fully executed. + * + * TODO The actual IOs depend on the join type (hash join versus pipeline + * join) and whether or not the file has index order (segment versus + * journal). A hash join will read once on the AP. A pipeline join will read + * once per input solution. A key-range read on a segment uses multi-block + * IO while a key-range read on a journal uses random IO. Also, remote + * access path reads are more expensive than sharded or hash partitioned + * access path reads in scale-out. + */ + public final long estRead; + + /** + * Create an object which encapsulates a sample of an edge. + * + * @param sourceSample + * The input sample. + * @param limit + * The limit used to sample the edge (this is the limit on the + * #of solutions generated by the cutoff join used when this + * sample was taken). + * @param inputCount + * The #of binding sets out of the source sample vertex sample + * which were consumed. + * @param tuplesRead + * The #of tuples read from the access path when processing the + * cutoff join. + * @param outputCount + * The #of binding sets generated before the join was cutoff. + * @param adjustedCard + * The adjusted cardinality estimate for the cutoff join (this is + * <i>outputCount</i> as adjusted for a variety of edge + * conditions). 
+ */ EdgeSample(final SampleBase sourceSample,// final int inputCount, // + final long tuplesRead,// + final long sumRangeCount,// final long outputCount,// - final long tuplesRead,// + final long adjustedCard,// final double f, // // args to SampleBase - final long estimatedCardinality,// + final long estCard,// + final long estRead,// final int limit,// final EstimateEnum estimateEnum,// final IBindingSet[] sample// ) { - super(estimatedCardinality, limit, estimateEnum, sample); + super(estCard, limit, estimateEnum, sample); if (sourceSample == null) throw new IllegalArgumentException(); @@ -103,22 +143,31 @@ this.inputCount = inputCount; + this.tuplesRead = tuplesRead; + + this.sumRangeCount = sumRangeCount; + this.outputCount = outputCount; + + this.adjCard = adjustedCard; - this.tuplesRead = tuplesRead; + this.f = f; - this.f = f; + this.estRead = estRead; } @Override protected void toString(final StringBuilder sb) { - sb.append(", sourceEstimatedCardinality=" - + sourceSample.estimatedCardinality); - sb.append(", sourceEstimateEnum=" + sourceSample.estimateEnum); - sb.append(", inputCount=" + inputCount); - sb.append(", outputCount=" + outputCount); - sb.append(", f=" + f); - } + sb.append(", sourceEstCard=" + sourceSample.estCard); + sb.append(", sourceEstimateEnum=" + sourceSample.estimateEnum); + sb.append(", inputCount=" + inputCount); + sb.append(", tuplesRead=" + tuplesRead); + sb.append(", sumRangeCount=" + sumRangeCount); + sb.append(", outputCount=" + outputCount); + sb.append(", adjustedCard=" + adjCard); + sb.append(", f=" + f); + sb.append(", estRead=" + estRead); + } } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/EstimatedCardinalityComparator.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/EstimatedCardinalityComparator.java 2011-02-27 21:11:23 UTC (rev 4255) +++ 
branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/EstimatedCardinalityComparator.java 2011-02-27 21:14:03 UTC (rev 4256) @@ -50,8 +50,8 @@ // o2 is not weighted. sort o2 to the end. return -1; } - final long id1 = o1.edgeSample.estimatedCardinality; - final long id2 = o2.edgeSample.estimatedCardinality; + final long id1 = o1.edgeSample.estCard; + final long id2 = o2.edgeSample.estCard; if (id1 < id2) return -1; if (id1 > id2) Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JGraph.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JGraph.java 2011-02-27 21:11:23 UTC (rev 4255) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/JGraph.java 2011-02-27 21:14:03 UTC (rev 4256) @@ -83,9 +83,9 @@ * dominated paths. * * TODO Compare the cumulative expected cardinality of a join path with the - * expected cost of a join path. The latter allows us to also explore - * alternative join strategies, such as the parallel subquery versus scan and - * filter decision for named graph and default graph SPARQL queries. + * tuples read of a join path. The latter allows us to also explore alternative + * join strategies, such as the parallel subquery versus scan and filter + * decision for named graph and default graph SPARQL queries. * * TODO Coalescing duplicate access paths can dramatically reduce the work * performed by a pipelined nested index subquery. (A hash join eliminates all @@ -93,14 +93,6 @@ * pipeline nested index subquery join, then should the runtime query optimizer * prefer paths with duplicate access paths? * - * TODO How can we handle things like lexicon joins. A lexicon join is is only - * evaluated when the dynamic type of a variable binding indicates that the RDF - * Value must be materialized by a join against the ID2T index. 
Binding sets - * having inlined values can simply be routed around the join against the ID2T - * index. Routing around saves network IO in scale-out where otherwise we would - * route binding sets having identifiers which do not need to be materialized to - * the ID2T shards. - * * @todo Examine the overhead of the runtime optimizer. Look at ways to prune * its costs. For example, by pruning the search, by recognizing when the * query is simple enough to execute directly, by recognizing when we have @@ -217,8 +209,10 @@ */ public class JGraph { - private static final transient Logger log = Logger.getLogger(JGraph.class); + private static final String NA = "N/A"; + private static final transient Logger log = Logger.getLogger(JGraph.class); + /** * Vertices of the join graph. */ @@ -254,32 +248,37 @@ return sb.toString(); } - /** - * - * @param v - * The vertices of the join graph. These are - * {@link IPredicate}s associated with required joins. - * @param constraints - * The constraints of the join graph (optional). Since all - * joins in the join graph are required, constraints are - * dynamically attached to the first join in which all of - * their variables are bound. - * - * @throws IllegalArgumentException - * if the vertices is <code>null</code>. - * @throws IllegalArgumentException - * if the vertices is an empty array. - * @throws IllegalArgumentException - * if any element of the vertices is <code>null</code>. - * @throws IllegalArgumentException - * if any constraint uses a variable which is never bound by - * the given predicates. - * @throws IllegalArgumentException - * if <i>sampleType</i> is <code>null</code>. - * - * @todo unit test for a constraint using a variable which is never - * bound. - */ + /** + * + * @param v + * The vertices of the join graph. These are {@link IPredicate}s + * associated with required joins. + * @param constraints + * The constraints of the join graph (optional). 
Since all joins + * in the join graph are required, constraints are dynamically + * attached to the first join in which all of their variables are + * bound. + * + * @throws IllegalArgumentException + * if the vertices is <code>null</code>. + * @throws IllegalArgumentException + * if the vertices is an empty array. + * @throws IllegalArgumentException + * if any element of the vertices is <code>null</code>. + * @throws IllegalArgumentException + * if any constraint uses a variable which is never bound by the + * given predicates. + * @throws IllegalArgumentException + * if <i>sampleType</i> is <code>null</code>. + * + * @todo unit test for a constraint using a variable which is never bound. + * the constraint should be attached at the last vertex in the join + * path. this will cause the query to fail unless the variable was + * already bound, e.g., by a parent query or in the solutions pumped + * into the {@link JoinGraph} operator. + * + * @todo unit test when the join graph has a single vertex. + */ public JGraph(final IPredicate<?>[] v, final IConstraint[] constraints, final SampleType sampleType) { @@ -557,30 +556,35 @@ */ sampleAllVertices(queryEngine, limit); - if (log.isDebugEnabled()) { - final StringBuilder sb = new StringBuilder(); - sb.append("Vertices:\n"); - for (Vertex v : V) { - sb.append(v.toString()); - sb.append("\n"); - } - log.debug(sb.toString()); - } + if (log.isInfoEnabled()) { + final StringBuilder sb = new StringBuilder(); + sb.append("Sampled vertices:\n"); + for (Vertex v : V) { + if (v.sample != null) { + sb.append("id="+v.pred.getId()+" : "); + sb.append(v.sample.toString()); + sb.append("\n"); + } + } + log.info(sb.toString()); + } /* * Estimate the cardinality for each edge. 
*/ final Path[] a = estimateInitialEdgeWeights(queryEngine, limit); - if (log.isDebugEnabled()) { - final StringBuilder sb = new StringBuilder(); - sb.append("All possible initial paths:\n"); - for (Path x : a) { - sb.append(x.toString()); - sb.append("\n"); - } - log.debug(sb.toString()); - } +// if (log.isDebugEnabled()) { +// final StringBuilder sb = new StringBuilder(); +// sb.append("All possible initial paths:\n"); +// for (Path x : a) { +// sb.append(x.toString()); +// sb.append("\n"); +// } +// log.debug(sb.toString()); +// } + if (log.isInfoEnabled()) + log.info("\n*** Initial Paths\n" + JGraph.showTable(a)); /* * Choose the initial set of paths. @@ -681,12 +685,11 @@ if (a.length == 0) throw new IllegalArgumentException(); - // increment the limit by itself in each round. - final int limit = (round + 1) * limitIn; +// // increment the limit by itself in each round. +// final int limit = (round + 1) * limitIn; if (log.isDebugEnabled()) - log.debug("round=" + round + ", limit=" + limit - + ", #paths(in)=" + a.length); + log.debug("round=" + round + ", #paths(in)=" + a.length); /* * Re-sample the vertices which are the initial vertex of any of the @@ -703,10 +706,12 @@ * a NOP if the vertex has been fully materialized. */ if (log.isDebugEnabled()) - log.debug("Re-sampling in-use vertices: limit=" + limit); + log.debug("Re-sampling in-use vertices."); for (Path x : a) { + final int limit = x.getNewLimit(limitIn); + x.vertices[0].sample(queryEngine, limit, sampleType); } @@ -725,11 +730,22 @@ * a given path prefix more than once per round. */ if (log.isDebugEnabled()) - log.debug("Re-sampling in-use path segments: limit=" + limit); + log.debug("Re-sampling in-use path segments."); for (Path x : a) { - // The cutoff join sample of the one step shorter path segment. + /* + * Get the new sample limit for the path. 
+ * + * TODO We only need to increase the sample limit starting at the + * vertex where we have a cardinality underflow or variability in + * the cardinality estimate. This is increasing the limit in each + * round of expansion, which means that we are reading more data + * than we really need to read. + */ + final int limit = x.getNewLimit(limitIn); + + // The cutoff join sample of the one step shorter path segment. EdgeSample priorEdgeSample = null; for (int segmentLength = 2; segmentLength <= x.vertices.length; segmentLength++) { @@ -775,6 +791,7 @@ queryEngine, limit,// x.getPathSegment(2),// 1st edge. C,// constraints + V.length == 2,// pathIsComplete x.vertices[0].sample// source sample. ); @@ -812,6 +829,7 @@ limit,// x.getPathSegment(ids.length()),// C, // constraints + V.length == ids.length(), // pathIsComplete priorEdgeSample// ); @@ -844,14 +862,19 @@ */ if (log.isDebugEnabled()) - log.debug("Expanding paths: limit=" + limit + ", #paths(in)=" - + a.length); + log.debug("Expanding paths: #paths(in)=" + a.length); final List<Path> tmp = new LinkedList<Path>(); for (Path x : a) { - /* + /* + * We already increased the sample limit for the path in the loop + * above. + */ + final int limit = x.edgeSample.limit; + + /* * The set of vertices used to expand this path in this round. */ final Set<Vertex> used = new LinkedHashSet<Vertex>(); @@ -916,9 +939,10 @@ // add the new vertex to the set of used vertices. used.add(tVertex); - // Extend the path to the new vertex. - final Path p = x.addEdge(queryEngine, limit, - tVertex, /*dynamicEdge,*/ C); + // Extend the path to the new vertex. + final Path p = x + .addEdge(queryEngine, limit, tVertex, /* dynamicEdge, */ + C, x.getVertexCount() + 1 == V.length/* pathIsComplete */); // Add to the set of paths for this round. tmp.add(p); @@ -954,8 +978,9 @@ final Vertex tVertex = nothingShared.iterator().next(); // Extend the path to the new vertex. 
- final Path p = x.addEdge(queryEngine, limit, - tVertex,/*dynamicEdge*/ C); + final Path p = x + .addEdge(queryEngine, limit, tVertex,/* dynamicEdge */ + C, x.getVertexCount() + 1 == V.length/* pathIsComplete */); // Add to the set of paths for this round. tmp.add(p); @@ -981,12 +1006,12 @@ final Path[] paths_tp1_pruned = pruneJoinPaths(paths_tp1, edgeSamples); if (log.isDebugEnabled()) - log.debug("\n*** round=" + round + ", limit=" + limit + log.debug("\n*** round=" + round + " : generated paths\n" + JGraph.showTable(paths_tp1, paths_tp1_pruned)); if (log.isInfoEnabled()) - log.info("\n*** round=" + round + ", limit=" + limit + log.info("\n*** round=" + round + ": paths{in=" + a.length + ",considered=" + paths_tp1.length + ",out=" + paths_tp1_pruned.length + "}\n" + JGraph.showTable(paths_tp1_pruned)); @@ -1012,15 +1037,31 @@ return null; } - /** - * Obtain a sample and estimated cardinality (fast range count) for each - * vertex. - * - * @param queryEngine - * The query engine. - * @param limit - * The sample size. - */ + /** + * Obtain a sample and estimated cardinality (fast range count) for each + * vertex. + * + * @param queryEngine + * The query engine. + * @param limit + * The sample size. + * + * TODO Only sample vertices with an index. + * + * TODO Consider other cases where we can avoid sampling a vertex + * or an initial edge. + * <p> + * Be careful about rejecting high cardinality vertices here as + * they can lead to good solutions (see the "bar" data set + * example). + * <p> + * BSBM Q5 provides a counter example where (unless we translate + * it into a key-range constraint on an index) some vertices do + * not share a variable directly and hence will materialize the + * full cross product before filtering which is *really* + * expensive. 
+ * + */ public void sampleAllVertices(final QueryEngine queryEngine, final int limit) { for (Vertex v : V) { @@ -1068,7 +1109,9 @@ * create a join path with a single edge (v,vp) using the sample * obtained from the cutoff join. */ - + + final boolean pathIsComplete = 2 == V.length; + for (int i = 0; i < V.length; i++) { final Vertex v1 = V[i]; @@ -1106,7 +1149,7 @@ */ final Vertex v, vp; - if (v1.sample.estimatedCardinality < v2.sample.estimatedCardinality) { + if (v1.sample.estCard < v2.sample.estCard) { v = v1; vp = v2; } else { @@ -1143,6 +1186,7 @@ limit, // sample limit preds, // ordered path segment. C, // constraints + pathIsComplete,// v.sample // sourceSample ); @@ -1181,6 +1225,7 @@ */ public Path[] pruneJoinPaths(final Path[] a, final Map<PathIds, EdgeSample> edgeSamples) { + final boolean neverPruneUnderflow = true; /* * Find the length of the longest path(s). All shorter paths are * dropped in each round. @@ -1198,7 +1243,12 @@ final Path Pi = a[i]; if (Pi.edgeSample == null) throw new RuntimeException("Not sampled: " + Pi); - if (Pi.vertices.length < maxPathLen) { + if (neverPruneUnderflow + && Pi.edgeSample.estimateEnum == EstimateEnum.Underflow) { + // Do not prune if path has cardinality underflow. + continue; + } + if (Pi.vertices.length < maxPathLen) { /* * Only the most recently generated set of paths survive to * the next round. @@ -1214,16 +1264,21 @@ final Path Pj = a[j]; if (Pj.edgeSample == null) throw new RuntimeException("Not sampled: " + Pj); - if (pruned.contains(Pj)) + if (neverPruneUnderflow + && Pj.edgeSample.estimateEnum == EstimateEnum.Underflow) { + // Do not prune if path has cardinality underflow. + continue; + } + if (pruned.contains(Pj)) continue; final boolean isPiSuperSet = Pi.isUnorderedVariant(Pj); if (!isPiSuperSet) { // Can not directly compare these join paths. 
continue; } - final long costPi = Pi.cumulativeEstimatedCardinality; - final long costPj = Pj.cumulativeEstimatedCardinality; - final boolean lte = costPi <= costPj; + final long costPi = Pi.sumEstCard; + final long costPj = Pj.sumEstCard; + final boolean lte = costPi <= costPj; List<Integer> prunedByThisPath = null; if (lte) { prunedByThisPath = new LinkedList<Integer>(); @@ -1363,17 +1418,24 @@ static public String showTable(final Path[] a,final Path[] pruned) { final StringBuilder sb = new StringBuilder(); final Formatter f = new Formatter(sb); - f.format("%-6s %10s%1s * %10s (%6s %6s %6s) = %10s%1s : %10s %10s", + f.format("%-4s %10s%1s * %10s (%8s %8s %8s %8s %8s %8s) = %10s %10s%1s : %10s %10s %10s", "path",// - "sourceCard",// + "srcCard",// "",// sourceSampleExact "f",// + // ( "in",// - "read",// + "sumRgCt",//sumRangeCount + "tplsRead",// "out",// + "limit",// + "adjCard",// + // ) = + "estRead",// "estCard",// "",// estimateIs(Exact|LowerBound|UpperBound) - "sumEstCard",// + "sumEstRead",// sumEstimatedTuplesRead + "sumEstCard",// sumEstimatedCardinality "joinPath\n" ); for (int i = 0; i < a.length; i++) { @@ -1391,21 +1453,27 @@ } final EdgeSample edgeSample = x.edgeSample; if (edgeSample == null) { - f.format("%6d %10s%1s * %10s (%6s %6s %6s) = %10s%1s : %10s",// - i, "N/A", "", "N/A", "N/A", "N/A", "N/A", "N/A", "", - "N/A"); + f.format("%4d %10s%1s * %10s (%8s %8s %8s %8s %8s %8s) = %10s %10s%1s : %10s %10s",// + i, NA, "", NA, NA, NA, NA, NA, NA, NA, NA, NA, "", NA, + NA); } else { - f.format("%6d %10d%1s * % 10.2f (%6d %6d %6d) = % 10d%1s : % 10d", // + f.format("%4d %10d%1s * % 10.2f (%8d %8d %8d %8d %8d %8d) = %10d % 10d%1s : % 10d % 10d", // i,// - edgeSample.sourceSample.estimatedCardinality,// + edgeSample.sourceSample.estCard,// edgeSample.sourceSample.estimateEnum.getCode(),// edgeSample.f,// edgeSample.inputCount,// + edgeSample.sumRangeCount,// edgeSample.tuplesRead,// edgeSample.outputCount,// - edgeSample.estimatedCardinality,// + 
edgeSample.limit,// + edgeSample.adjCard,// + // = + edgeSample.estRead,// + edgeSample.estCard,// edgeSample.estimateEnum.getCode(),// - x.cumulativeEstimatedCardinality// + x.sumEstRead,// + x.sumEstCard// ); } sb.append(" ["); @@ -1443,70 +1511,103 @@ /* * @todo show limit on samples? */ - f.format("%6s %10s%1s * %10s (%6s %6s %6s) = %10s%1s : %10s",// - "vertex", - "sourceCard",// + f.format("%4s %10s%1s * %10s (%8s %8s %8s %8s %8s %8s) = %10s %10s%1s : %10s %10s",// %10s %10s",// + "vert", + "srcCard",// "",// sourceSampleExact "f",// + // ( "in",// - "read",// + "sumRgCt",// sumRangeCount + "tplsRead",// tuplesRead "out",// + "limit",// + "adjCard",// + // ) = + "estRead",// "estCard",// "",// estimateIs(Exact|LowerBound|UpperBound) + "sumEstRead",// "sumEstCard"// ); - long sumEstCard = 0; + long sumEstRead = 0; // sum(estRead), where estRead := tuplesRead*f + long sumEstCard = 0; // sum(estCard) +// double sumEstCost = 0; // sum(f(estCard,estRead)) for (int i = 0; i < x.vertices.length; i++) { final int[] ids = BOpUtility .getPredIds(x.getPathSegment(i + 1)); final int predId = x.vertices[i].pred.getId(); final SampleBase sample; - if(i==0) { - sample = x.vertices[i].sample; - } else { - // edge sample from the caller's map. - sample = edgeSamples.get(new PathIds(ids)); - } - if (sample != null) { - sumEstCard += sample.estimatedCardinality; - if (sample instanceof EdgeSample) - sumEstCard += ((EdgeSample) sample).tuplesRead; - } + if (i == 0) { + sample = x.vertices[i].sample; + if (sample != null) { + sumEstRead = sample.estCard; // dbls as estRead for vtx + sumEstCard = sample.estCard; + } + } else { + // edge sample from the caller's map. 
+ sample = edgeSamples.get(new PathIds(ids)); + if (sample != null) { + sumEstRead+= ((EdgeSample) sample).estRead; + sumEstCard += ((EdgeSample) sample).estCard; + } + } sb.append("\n"); if (sample == null) { - f.format("% 6d %10s%1s * %10s (%6s %6s %6s) = %10s%1s : %10s",// + f.format("% 4d %10s%1s * %10s (%8s %8s %8s %8s %8s %8s) = %10s %10s%1s : %10s %10s",// %10s %10s",// predId,// - "N/A", "", "N/A", "N/A", "N/A", "N/A", "N/A", "", "N/A"); + NA, "", NA, NA, NA, NA, NA, NA, NA, NA, NA, "", NA, NA);//,NA,NA); } else if(sample instanceof VertexSample) { - // Show the vertex sample for the initial vertex. - f.format("% 6d %10s%1s * %10s (%6s %6s %6s) = % 10d%1s : %10d",// + /* + * Show the vertex sample for the initial vertex. + * + * Note: we do not store all fields for a vertex sample + * which are stored for an edge sample because so many of + * the values are redundant for a vertex sample. Therefore, + * this sets up local variables which are equivalent to the + * various edge sample columns that we will display. + */ + final long sumRangeCount = sample.estCard; + final long estRead = sample.estCard; + final long tuplesRead = Math.min(sample.estCard, sample.limit); + final long outputCount = Math.min(sample.estCard, sample.limit); + final long adjCard = Math.min(sample.estCard, sample.limit); + f.format("% 4d %10s%1s * %10s (%8s %8s %8s %8s %8s %8s) = % 10d % 10d%1s : %10d %10d",// %10d %10s",// predId,// - "N/A",//sample.sourceSample.estimatedCardinality,// - " ",//sample.sourceSample.isExact() ? 
"E" : "",// + " ",//srcSample.estCard + " ",//srcSample.estimateEnum " ",//sample.f,// - "N/A",//sample.inputCount,// - "N/A",//sample.tuplesRead,// - "N/A",//sample.outputCount,// - sample.estimatedCardinality,// + " ",//sample.inputCount, + sumRangeCount,// + tuplesRead,// + outputCount,// + sample.limit,// limit + adjCard,// adjustedCard + estRead,// estRead + sample.estCard,// estCard sample.estimateEnum.getCode(),// + sumEstRead,// sumEstCard// -// e.cumulativeEstimatedCardinality// ); } else { // Show the sample for a cutoff join with the 2nd+ vertex. final EdgeSample edgeSample = (EdgeSample)sample; - f.format("% 6d %10d%1s * % 10.2f (%6d %6d %6d) = % 10d%1s : %10d",// + f.format("% 4d %10d%1s * % 10.2f (%8d %8d %8d %8d %8d %8d) = % 10d % 10d%1s : %10d %10d",// %10d %10",// predId,// - edgeSample.sourceSample.estimatedCardinality,// + edgeSample.sourceSample.estCard,// edgeSample.sourceSample.estimateEnum.getCode(),// edgeSample.f,// edgeSample.inputCount,// + edgeSample.sumRangeCount,// edgeSample.tuplesRead,// edgeSample.outputCount,// - edgeSample.estimatedCardinality,// + edgeSample.limit,// + edgeSample.adjCard,// + edgeSample.estRead,// + edgeSample.estCard,// edgeSample.estimateEnum.getCode(),// + sumEstRead,// sumEstCard// -// e.cumulativeEstimatedCardinality// ); } } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/Path.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/Path.java 2011-02-27 21:11:23 UTC (rev 4255) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/Path.java 2011-02-27 21:14:03 UTC (rev 4256) @@ -91,19 +91,43 @@ * is computed by the constructor and cached as it is used repeatedly. */ private final IPredicate<?>[] preds; + + /** + * The sample obtained by the step-wise cutoff evaluation of the ordered + * edges of the path. 
+ * <p> + * Note: This sample is generated one edge at a time rather than by + * attempting the cutoff evaluation of the entire join path (the latter + * approach does allow us to limit the amount of work to be done to satisfy + * the cutoff). + * <p> + * Note: This is updated when we resample the path prior to expanding the + * path with another vertex. + */ + EdgeSample edgeSample;// TODO rename pathSample? + + /** + * Examine the path. If there is a cardinality underflow, then boost the + * sampling limit. Otherwise, increase the sample by the caller's value. + * + * @param limitIn + * The default increment for the sample limit. + * + * @return The limit to use when resampling this path. + */ + public int getNewLimit(final int limitIn) { + + if (edgeSample.estimateEnum == EstimateEnum.Underflow) { + + return edgeSample.limit * 2; + + } + + return edgeSample.limit + limitIn; + + } /** - * The sample obtained by the step-wise cutoff evaluation of the ordered - * edges of the path. - * <p> - * Note: This sample is generated one edge at a time rather than by - * attempting the cutoff evaluation of the entire join path (the latter - * approach does allow us to limit the amount of work to be done to - * satisfy the cutoff). - */ - EdgeSample edgeSample; - - /** * The cumulative estimated cardinality of the path. This is zero for an * empty path. For a path consisting of a single edge, this is the estimated * cardinality of that edge. When creating a new path by adding an edge to @@ -118,29 +142,36 @@ * vertex in path order. The EdgeSamples are maintained in a map * managed by JGraph during optimization. */ - final public long cumulativeEstimatedCardinality; + final public long sumEstCard; + /** + * The cumulative estimated #of tuples that would be read for this path if + * it were to be fully executed (sum(tuplesRead*f) for each step in the + * path). + */ + final public long sumEstRead; + + /** + * The expected cost of this join path if it were to be fully executed. 
This + * is a function of {@link #sumEstCard} and {@link #sumEstRead}. The + * former reflects the #of intermediate solutions generated. The latter + * reflects the #of tuples read from the disk. These two measures are + * tracked separately and then combined into the {@link #sumEstCost}. + */ + final public double sumEstCost; + /** - * Combine the cumulative estimated cost of the source path with the cost of - * the edge sample and return the cumulative estimated cost of the new path. + * Combine the cumulative expected cardinality and the cumulative expected + * tuples read to produce an overall measure of the expected cost of the + * join path if it were to be fully executed. * - * @param cumulativeEstimatedCardinality - * The cumulative estimated cost of the source path. - * @param edgeSample - * The cost estimate for the cutoff join required to extend the - * source path to the new path. - * @return The cumulative estimated cost of the new path. - * - * FIXME Figure out how to properly combine/weight the #of tuples - * read and the #of solutions produced! + * @return The cumulative estimated cost of the join path. 
*/ - static private long add(final long cumulativeEstimatedCardinality, - final EdgeSample edgeSample) { + private static double getTotalCost(final Path p) { - final long total = cumulativeEstimatedCardinality + // - edgeSample.estimatedCardinality // -// + edgeSample.tuplesRead // - ; + final long total = p.sumEstCard + // + p.sumEstRead// + ; return total; @@ -162,7 +193,7 @@ // sb.append(e.getLabel()); // first = false; // } - sb.append("],cumEstCard=" + cumulativeEstimatedCardinality + sb.append("],cumEstCard=" + sumEstCard + ",sample=" + edgeSample + "}"); return sb.toString(); } @@ -205,48 +236,52 @@ if (edgeSample.getSample() == null) throw new IllegalArgumentException(); -// this.edges = Collections.singletonList(e); + this.vertices = new Vertex[]{v0,v1}; - this.vertices = new Vertex[]{v0,v1};//getVertices(edges); - this.preds = getPredicates(vertices); this.edgeSample = edgeSample; - this.cumulativeEstimatedCardinality = add(0L/*cumulativeEstimatedCardinality*/,edgeSample); -// edgeSample.estimatedCardinality +// -// edgeSample.tuplesRead// this is part of the cost too. -// ; + /* + * The estimated cardinality of the cutoff join of (v0,v1). + */ + this.sumEstCard = edgeSample.estCard; -// this.cumulativeEstimatedCardinality = // -// edgeSample.estimatedCardinality +// -// edgeSample.tuplesRead// this is part of the cost too. -// ; + /* + * The expected #of tuples read for the full join of (v0,v1). This is + * everything which could be visited for [v0] plus the #of tuples read + * from [v1] during the cutoff join times the (adjusted) join hit ratio. + */ + this.sumEstRead = v0.sample.estCard + edgeSample.estRead; + this.sumEstCost = getTotalCost(this); + } - /** - * Private constructor used when we extend a path. - * - * @param vertices - * The ordered array of vertices in the new path. The last entry - * in this array is the vertex which was used to extend the path. 
- * @param preds - * The ordered array of predicates in the new path (correlated - * with the vertices and passed in since it is already computed - * by the caller). - * @param cumulativeEstimatedCardinality - * The cumulative estimated cardinality of the new path. - * @param edgeSample - * The sample from the cutoff join of the last vertex added to - * this path. - */ + /** + * Private constructor used when we extend a path. + * + * @param vertices + * The ordered array of vertices in the new path. The last entry + * in this array is the vertex which was used to extend the path. + * @param preds + * The ordered array of predicates in the new path (correlated + * with the vertices and passed in since it is already computed + * by the caller). + * @param edgeSample + * The sample from the cutoff join of the last vertex added to + * this path. + * @param sumEstCard + * The cumulative estimated cardinality of the new path. + * @param sumEstRead + * The cumulative estimated tuples read of the new path. 
+ */ private Path(// final Vertex[] vertices,// final IPredicate<?>[] preds,// -// final List<Edge> edges,// - final long cumulativeEstimatedCardinality,// - final EdgeSample edgeSample// + final EdgeSample edgeSample,// + final long sumEstCard,// + final long sumEstRead// ) { if (vertices == null) @@ -258,7 +293,7 @@ if (vertices.length != preds.length) throw new IllegalArgumentException(); - if (cumulativeEstimatedCardinality < 0) + if (sumEstCard < 0) throw new IllegalArgumentException(); if (edgeSample == null) @@ -267,15 +302,17 @@ if (edgeSample.getSample() == null) throw new IllegalArgumentException(); -// this.edges = Collections.unmodifiableList(edges); + this.vertices = vertices; - this.vertices = vertices; - - this.preds = preds; - - this.cumulativeEstimatedCardinality = cumulativeEstimatedCardinality; + this.preds = preds; - this.edgeSample = edgeSample; + this.edgeSample = edgeSample; + + this.sumEstCard = sumEstCard; + + this.sumEstRead = sumEstRead; + + this.sumEstCost = getTotalCost(this); } @@ -552,6 +589,9 @@ * The new vertex. * @param constraints * The join graph constraints (if any). + * @param pathIsComplete + * <code>true</code> iff all vertices in the join graph are + * incorporated into this path. * * @return The new path. The materialized sample for the new path is the * sample obtained by the cutoff join for the edge added to the @@ -560,7 +600,8 @@ * @throws Exception */ public Path addEdge(final QueryEngine queryEngine, final int limit, - final Vertex vnew, final IConstraint[] constraints) + final Vertex vnew, final IConstraint[] constraints, + final boolean pathIsComplete) throws Exception { if (vnew == null) @@ -626,59 +667,63 @@ limit, // preds2,// constraints,// + pathIsComplete,// this.edgeSample // the source sample. ); - { - final long cumulativeEstimatedCardinality = add( - this.cumulativeEstimatedCardinality, edgeSample2); + // Extend the path. 
+ final Path tmp = new Path(// + vertices2,// + preds2,// + edgeSample2,// + this.sumEstCard + edgeSample2.estCard,// sumEstCard + this.sumEstRead + edgeSample2.estRead // sumEstRead + ); - // Extend the path. - final Path tmp = new Path(vertices2, preds2, - cumulativeEstimatedCardinality, edgeSample2); + return tmp; - return tmp; - - } - } - /** - * Cutoff join of the last vertex in the join path. - * <p> - * <strong>The caller is responsible for protecting against needless - * re-sampling.</strong> This includes cases where a sample already exists - * at the desired sample limit and cases where the sample is already exact. - * - * @param queryEngine - * The query engine. - * @param limit - * The limit for the cutoff join. - * @param path - * The path segment, which must include the target vertex as the - * last component of the path segment. - * @param constraints - * The constraints declared for the join graph (if any). The - * appropriate constraints will be applied based on the variables - * which are known to be bound as of the cutoff join for the last - * vertex in the path segment. - * @param sourceSample - * The input sample for the cutoff join. When this is a one-step - * estimation of the cardinality of the edge, then this sample is - * taken from the {@link VertexSample}. When the edge (vSource, - * vTarget) extends some {@link Path}, then this is taken from - * the {@link EdgeSample} for that {@link Path}. - * - * @return The result of sampling that edge. - * - * @throws Exception - */ + /** + * Cutoff join of the last vertex in the join path. + * <p> + * <strong>The caller is responsible for protecting against needless + * re-sampling.</strong> This includes cases where a sample already exists + * at the desired sample limit and cases where the sample is already exact. + * + * @param queryEngine + * The query engine. + * @param limit + * The limit for the cutoff join. 
+ * @param path + * The path segment, which must include the target vertex as the + * last component of the path segment. + * @param constraints + * The constraints declared for the join graph (if any). The + * appropriate constraints will be applied based on the variables + * which are known to be bound as of the cutoff join for the last + * vertex in the path segment. + * @param pathIsComplete + * <code>true</code> iff all vertices in the join graph are + * incorporated into this path. + * @param sourceSample + * The input sample for the cutoff join. When this is a one-step + * estimation of the cardinality of the edge, then this sample is + * taken from the {@link VertexSample}. When the edge (vSource, + * vTarget) extends some {@link Path}, then this is taken from + * the {@link EdgeSample} for that {@link Path}. + * + * @return The result of sampling that edge. + * + * @throws Exception + */ static public EdgeSample cutoffJoin(// final QueryEngine queryEngine,// final int limit,// final IPredicate<?>[] path,// final IConstraint[] constraints,// + final boolean pathIsComplete,// final SampleBase sourceSample// ) throws Exception { @@ -702,8 +747,8 @@ // Figure out which constraints attach to each predicate. final IConstraint[][] constraintAttachmentArray = PartitionedJoinGroup - .getJoinGraphConstraints(path, constraints,null/*knownVariables*/, - false/*FIXME pathIsComplete*/); + .getJoinGraphConstraints(path, constraints, null/*knownBound*/, + pathIsComplete); // The constraint(s) (if any) for this join. final IConstraint[] c = constraintAttachmentArray[path.length - 1]; @@ -793,6 +838,7 @@ final List<IBindingSet> result = new LinkedList<IBindingSet>(); try { + int nresults = 0; try { IBindingSet bset = null; // Figure out the #of source samples consumed. @@ -801,10 +847,11 @@ while (itr.hasNext()) { bset = itr.next(); result.add(bset); + nresults++; // TODO break out if cutoff join over produces! } } finally { // verify no problems. 
- runningQuery.get(); + runningQuery.get(); // TODO CANCEL query once limit is satisfied THEN check the future for errors. } } finally { runningQuery.cancel(true/* mayInterruptIfRunning */); @@ -822,8 +869,11 @@ final int inputCount = (int) joinStats.inputSolutions.get(); // #of solutions out. - long outputCount = joinStats.outputSolutions.get(); + final long outputCount = joinStats.outputSolutions.get(); + // #of solutions out as adjusted for various edge conditions. + final long adjustedCard; + // cumulative range count of the sampled access paths. final long sumRangeCount = joinStats.accessPathRangeCount.get(); @@ -838,6 +888,7 @@ * number of output solutions. */ estimateEnum = EstimateEnum.Exact; + adjustedCard = outputCount; } else if (inputCount == 1 && outputCount == limit) { /* * If the inputCount is ONE (1) and the outputCount is the limit, @@ -856,11 +907,11 @@ * are really better to be dropped. */ // replace outputCount with the sum of the sampled range counts. - outputCount = sumRangeCount; + adjustedCard = sumRangeCount; estimateEnum = EstimateEnum.LowerBound; } else if ((sourceSample.estimateEnum != EstimateEnum.Exact) - && inputCount == Math.min(sourceSample.limit, - sourceSample.estimatedCardinality) && outputCount == 0) { + /*&& inputCount == Math.min(sourceSample.limit, + sourceSample.estimatedCardinality) */ && outputCount == 0) { /* * When the source sample was not exact, the inputCount is EQ to the * lesser of the source range count and the source sample limit, and @@ -874,10 +925,16 @@ * Note: An apparent join hit ratio of zero does NOT imply that the * join will be empty (unless the source vertex sample is actually * the fully materialized access path - this case is covered above). 
+ * + * path sourceCard * f ( in read out limit adjCard) = estCard : sumEstCard joinPath + * 15 4800L * 0.00 ( 200 200 0 300 0) = 0 : 3633 [ 3 1 6 5 ] + */ estimateEnum = EstimateEnum.Underflow; + adjustedCard = outputCount; } else { estimateEnum = EstimateEnum.Normal; + adjustedCard = outputCount; } /* @@ -891,20 +948,43 @@ * read. */ final long tuplesRead = joinStats.accessPathUnitsIn.get(); - - final double f = outputCount == 0 ? 0 - : (outputCount / (double) inputCount); - final long estimatedCardinality = (long) (sourceSample.estimatedCardinality * f); + /* + * Compute the hit-join ratio based on the adjusted cardinality + * estimate. + */ + final double f = adjustedCard == 0 ? 0 + : (adjustedCard / (double) inputCount); +// final double f = outputCount == 0 ? 0 +// : (outputCount / (double) inputCount); + // estimated output cardinality of fully executed operator. + final long estCard = (long) (sourceSample.estCard * f); + + /* + * estimated tuples read for fully executed operator + * + * TODO The actual IOs depend on the join type (hash join versus + * pipeline join) and whether or not the file has index order (segment + * versus journal). A hash join will read once on the AP. A pipeline + * join will read once per input solution. A key-range read on a segment + * uses multi-block IO while a key-range read on a journal uses random + * IO. Also, remote access path reads are more expensive than sharded + * or hash partitioned access path reads in scale-out. + */ + final long estRead = (long) (sumRangeCount * f); + final EdgeSample edgeSample = new EdgeSample(// sourceSample,// inputCount,// + tuplesRead,// + sumRangeCount,// outputCount, // - tuplesRead,// + adjustedCard,// f, // // args to SampleBase - estimatedCardinality, // + estCard, // estimated output cardinality if fully executed. + estRead, // estimated tuples read if fully executed. 
limit, // estimateEnum,// result.toArray(new IBindingSet[result.size()])); Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/SampleBase.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/SampleBase.java 2011-02-27 21:11:23 UTC (rev 4255) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/SampleBase.java 2011-02-27 21:14:03 UTC (rev 4256) @@ -56,11 +56,17 @@ private static final transient Logger log = Logger .getLogger(SampleBase.class); - /** - * The estimated cardinality of the underlying access path (for a vertex) or - * the join (for a cutoff join). - */ - public final long estimatedCardinality; + /** + * The total estimated cardinality of the underlying access path (for a + * vertex) or the join path segment (for a cutoff join). + * + * TODO When using a non-perfect index, the estimated cardinality is only + * part of the cost. The #of tuples scanned is also important. Even when + * scanning and filtering in key order this could trigger random IOs unless + * the file has index order (an IndexSegment file has index order but a + * BTree on a journal does not). + */ + public final long estCard; /** * The limit used to produce the {@link #getSample() sample}. 
@@ -156,7 +162,7 @@ if (sample == null) throw new IllegalArgumentException(); - this.estimatedCardinality = estimatedCardinality; + this.estCard = estimatedCardinality; this.limit = limit; @@ -180,7 +186,7 @@ public String toString() { final StringBuilder sb = new StringBuilder(); sb.append(getClass().getSimpleName()); - sb.append("{estimatedCardinality=" + estimatedCardinality); + sb.append("{estCard=" + estCard); sb.append(",limit=" + limit); sb.append(",estimateEnum=" + estimateEnum); { Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/Vertex.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/Vertex.java 2011-02-27 21:11:23 UTC (rev 4255) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/Vertex.java 2011-02-27 21:14:03 UTC (rev 4256) @@ -153,7 +153,7 @@ final IAccessPath ap = context.getAccessPath(r, pred); final long rangeCount = oldSample == null ? ap - .rangeCount(false/* exact */) : oldSample.estimatedCardinality; + .rangeCount(false/* exact */) : oldSample.estCard; if (rangeCount <= limit) { Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/VertexSample.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/VertexSample.java 2011-02-27 21:11:23 UTC (rev 4255) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/VertexSample.java 2011-02-27 21:14:03 UTC (rev 4256) @@ -37,7 +37,7 @@ * historical view or even a time scale of query which is significantly * faster than update). * - * @param estimatedCardinality + * @param estCard * The estimated cardinality. * @param limit * The cutoff limit used to make that cardinality estimate. @@ -49,10 +49,10 @@ * @param sample * The sample. 
*/ - public VertexSample(final long estimatedCardinality, final int limit, + public VertexSample(final long estCard, final int limit, final EstimateEnum estimateEnum, final IBindingSet[] sample) { - super(estimatedCardinality, limit, estimateEnum, sample); + super(estCard, limit, estimateEnum, sample); switch (estimateEnum) { case Normal: Modified: branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/AbstractJoinGraphTestCase.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/AbstractJoinGraphTestCase.java 2011-02-27 21:11:23 UTC (rev 4255) +++ branches/QUADS_QUERY_BRANCH/bigdata-rdf/src/test/com/bigdata/bop/rdf/joinGraph/AbstractJoinGraphTestCase.java 2011-02-27 21:14:03 UTC (rev 4256) @@ -39,6 +39,7 @@ import com.bigdata.bop.BOpContextBase; import com.bigdata.bop.BOpIdFactory; +import com.bigdata.bop.BOpUtility; import com.bigdata.bop.IBindingSet; import com.bigdata.bop.IConstraint; import com.bigdata.bop.IPredicate; @@ -89,8 +90,17 @@ /** The initial sampling limit. */ static private final int limit = 100; - - /** The #of edges considered for the initial paths. */ + + /** + * The #of edges considered for the initial paths. + * + * FIXME We need to consider all of the low cardinality vertices, e.g., BSBM + * Q5 has 3 such vertices. Also, we should not consider vertices when + * looking for the initial edges which are relatively unconstrained (e.g., + * 1-bound). This could be handled by sampling the top-N vertices in reverse + * rank order of their cardinality and any with a cardinality LT 10x the + * initial sample limit. 
+ */ static private final int nedges = 2; static private final SampleType sampleType = SampleType.EVEN; @@ -240,9 +250,14 @@ final IPredicate<?>[] runtimePredOrder = runRuntimeQueryOptimizer( getQueryEngine(), limit, nedges, sampleType, preds, constraints); + long totalGivenTime = 0; long totalRuntimeTime = 0; long totalStaticTime = 0; + long givenSolutions = 0; + long runtimeSolutions = 0; + long staticSolutions = ... [truncated message content] |
From: <tho...@us...> - 2011-02-27 21:11:29
Revision: 4255 http://bigdata.svn.sourceforge.net/bigdata/?rev=4255&view=rev Author: thompsonbry Date: 2011-02-27 21:11:23 +0000 (Sun, 27 Feb 2011) Log Message: ----------- Removed a place holder for a unit test. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/joinGraph/TestPartitionedJoinGroup.java Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/joinGraph/TestPartitionedJoinGroup.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/joinGraph/TestPartitionedJoinGroup.java 2011-02-27 21:11:03 UTC (rev 4254) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/joinGraph/TestPartitionedJoinGroup.java 2011-02-27 21:11:23 UTC (rev 4255) @@ -467,12 +467,13 @@ } - /** - * @todo test with headPlan. - */ - public void test_something() { - fail("write tests"); - } +// /** +// * @todo test with headPlan (actually, I think that we will remove +// * the head plan from the PartitionedJoinGraph). +// */ +// public void test_something_headPlan() { +// fail("write tests"); +// } /** * Verifies that the iterator visits the specified objects in some arbitrary This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
From: <tho...@us...> - 2011-02-27 21:11:09
Revision: 4254 http://bigdata.svn.sourceforge.net/bigdata/?rev=4254&view=rev Author: thompsonbry Date: 2011-02-27 21:11:03 +0000 (Sun, 27 Feb 2011) Log Message: ----------- Modified the SPARQL Constraint operator to handle nested type errors. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/constraint/Constraint.java branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/constraint/Constraint.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/constraint/Constraint.java 2011-02-27 15:09:56 UTC (rev 4253) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/constraint/Constraint.java 2011-02-27 21:11:03 UTC (rev 4254) @@ -81,19 +81,27 @@ public boolean accept(final IBindingSet bs) { - try { +// try { // evaluate the BVE operator return ((BooleanValueExpression) get(0)).get(bs); - - } catch (Exception ex) { - - // trap the type error and filter out the solution - if (log.isInfoEnabled()) - log.info("discarding solution due to error: " + bs); - return false; - - } + +// } catch (Throwable t) { +// +// if (InnerCause.isInnerCause(t, SparqlTypeErrorException.class)) { +// +// // trap the type error and filter out the solution +// if (log.isInfoEnabled()) +// log.info("discarding solution due to type error: " + bs +// + " : " + t); +// +// return false; +// +// } +// +// throw new RuntimeException(t); +// +// } } Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java 2011-02-27 15:09:56 UTC (rev 4253) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java 
2011-02-27 21:11:03 UTC (rev 4254) @@ -1061,6 +1061,8 @@ }) // ); +// final PipelineOp queryOp = lastOp; + return queryOp; }
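The commented-out block in the rev 4254 diff above sketches the intended pattern: trap a SPARQL type error anywhere in the cause chain and silently filter the solution, while letting any other error propagate. A self-contained illustration follows; the nested classes here are stand-ins for bigdata's `InnerCause` and `SparqlTypeErrorException`, not the real implementations:

```java
/**
 * Hedged sketch of the type-error trapping pattern from the Constraint diff:
 * a type error nested anywhere in the cause chain filters the solution;
 * other errors are rethrown. All names are illustrative stand-ins.
 */
public class TypeErrorFilter {

    /** Stand-in for SparqlTypeErrorException. */
    static class SparqlTypeError extends RuntimeException {}

    /** Stand-in for the BooleanValueExpression operator. */
    interface BooleanValueExpression {
        boolean get(Object bindingSet);
    }

    /** Stand-in for InnerCause: true iff clazz occurs in t's cause chain. */
    static boolean isInnerCause(Throwable t, final Class<? extends Throwable> clazz) {
        while (t != null) {
            if (clazz.isInstance(t))
                return true;
            t = t.getCause();
        }
        return false;
    }

    static boolean accept(final BooleanValueExpression bve, final Object bset) {
        try {
            // evaluate the BVE operator
            return bve.get(bset);
        } catch (Throwable t) {
            if (isInnerCause(t, SparqlTypeError.class)) {
                // trap the type error and filter out the solution
                return false;
            }
            throw new RuntimeException(t);
        }
    }

    public static void main(final String[] args) {
        // A nested type error is filtered rather than propagated.
        System.out.println(accept(bs -> {
            throw new RuntimeException(new SparqlTypeError());
        }, null)); // false
        System.out.println(accept(bs -> true, null)); // true
    }
}
```

Per SPARQL evaluation semantics, a FILTER whose expression raises a type error eliminates the solution, which is why the type error is swallowed here rather than failing the query.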
From: <tho...@us...> - 2011-02-27 15:10:02
Revision: 4253 http://bigdata.svn.sourceforge.net/bigdata/?rev=4253&view=rev Author: thompsonbry Date: 2011-02-27 15:09:56 +0000 (Sun, 27 Feb 2011) Log Message: ----------- Wrote some more unit tests to verify correct constraint attachment for various join paths constructed from the BSBM Q5 join graph. I did not identify any problems with these unit tests even though the query results for the RTO and static solutions are different. This suggests that the problem may somehow lie in the manner in which the query is being evaluated. Modified Paths: -------------- branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/joinGraph/TestPartitionedJoinGroup_canJoinUsingConstraints.java Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/joinGraph/TestPartitionedJoinGroup_canJoinUsingConstraints.java =================================================================== --- branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/joinGraph/TestPartitionedJoinGroup_canJoinUsingConstraints.java 2011-02-25 23:01:42 UTC (rev 4252) +++ branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/joinGraph/TestPartitionedJoinGroup_canJoinUsingConstraints.java 2011-02-27 15:09:56 UTC (rev 4253) @@ -27,6 +27,8 @@ package com.bigdata.bop.joinGraph; +import java.util.Arrays; +import java.util.Collections; import java.util.LinkedHashSet; import java.util.Map; import java.util.Random; @@ -49,7 +51,6 @@ import com.bigdata.bop.constraint.AND; import com.bigdata.bop.constraint.BooleanValueExpression; import com.bigdata.bop.constraint.Constraint; -import com.bigdata.bop.joinGraph.PartitionedJoinGroup; import com.bigdata.bop.joinGraph.rto.JGraph; /** @@ -716,4 +717,144 @@ } + /* + * Unit tests for attaching constraints using a specific join path. + */ + + private final Set<IConstraint> asSet(IConstraint[] a) { + + return new LinkedHashSet<IConstraint>(Arrays.asList(a)); + + } + + /** no constraints. 
*/ + final Set<IConstraint> NA = Collections.emptySet(); + + /** {@link #c0} attaches when any of [p0,p2,p4,p6] runs for the 1st time. */ + final Set<IConstraint> C0 = asSet(new IConstraint[]{c0}); + + /** {@link #c1} attaches when 2nd of (p3,p4) runs (in any order). */ + final Set<IConstraint> C1 = asSet(new IConstraint[] { c1 }); + + /** {@link #c2} attaches when 2nd of (p5,p6) runs (in any order). */ + final Set<IConstraint> C2 = asSet(new IConstraint[] { c2 }); + + /** <code>path = [1, 2, 4, 6, 0, 3, 5]</code> */ + public void test_attachConstraints_BSBM_Q5_path01() { + + final IPredicate<?>[] path = { p1, p2, p4, p6, p0, p3, p5 }; + + final IConstraint[][] actual = PartitionedJoinGroup + .getJoinGraphConstraints(path, constraints, + null/* knownBoundVars */, true/* pathIsComplete */); + + final Set<IConstraint>[] expected = new Set[] { // + NA, // p1 + C0, // p2 + NA, // p4 + NA, // p6 + NA, // p0 + C1, // p3 + C2, // p5 + }; + + assertSameConstraints(expected, actual); + + } + + /** <code>[5, 3, 1, 0, 2, 4, 6]</code>. */ + public void test_attachConstraints_BSBM_Q5_path02() { + + final IPredicate<?>[] path = { p5, p3, p1, p0, p2, p4, p6 }; + + final IConstraint[][] actual = PartitionedJoinGroup + .getJoinGraphConstraints(path, constraints, + null/* knownBoundVars */, true/* pathIsComplete */); + + final Set<IConstraint>[] expected = new Set[] { // + NA, // p5 + NA, // p3 + NA, // p1 + C0, // p0 + NA, // p2 + C1, // p4 + C2, // p6 + }; + + assertSameConstraints(expected, actual); + + } + + /** <code>[3, 4, 5, 6, 1, 2, 0]</code> (key-range constraint variant). 
*/ + public void test_attachConstraints_BSBM_Q5_path03() { + + final IPredicate<?>[] path = { p3, p4, p5, p6, p1, p2, p0 }; + + final IConstraint[][] actual = PartitionedJoinGroup + .getJoinGraphConstraints(path, constraints, + null/* knownBoundVars */, true/* pathIsComplete */); + + final Set<IConstraint>[] expected = new Set[] { // + NA, // p3 + asSet(new IConstraint[]{c0,c1}), // p4 + NA, // p5 + C2, // p6 + NA, // p1 + NA, // p2 + NA, // p0 + }; + + assertSameConstraints(expected, actual); + + } + + /** + * Verifies that the right set of constraints is attached at each of the + * vertices of a join path. Comparison of {@link IConstraint} instances is + * by reference. + * + * @param expected + * @param actual + */ + static void assertSameConstraints(final Set<IConstraint>[] expected, + final IConstraint[][] actual) { + + assertEquals("length", expected.length, actual.length); + + for (int i = 0; i < expected.length; i++) { + + final Set<IConstraint> e = expected[i]; + final IConstraint[] a = actual[i]; + + if (e.size() != a.length) { + fail("Differs at expected[" + i + "] : expecting " + e.size() + + ", not " + a.length + " elements: " + + Arrays.toString(a)); + } + + for (int j = 0; j < a.length; j++) { + + boolean foundRef = false; + for (IConstraint t : e) { + + if (t == a[j]) { + foundRef = true; + break; + } + + } + + if (!foundRef) { + + fail("Differs at expected[" + i + "][" + j + "] : actual=" + + a[j]); + + } + + } + + } + + } + }
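The attachment rule these rev 4253 tests exercise — each constraint runs at the first predicate in the join path by which all of its variables are bound — can be sketched independently of the bigdata classes. Everything below is illustrative; the real logic lives in `PartitionedJoinGroup.getJoinGraphConstraints(...)`:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 * Illustrative sketch (not the real API) of constraint attachment: a
 * constraint attaches at the first path position where all of its variables
 * have been bound by the predicates run so far.
 */
public class ConstraintAttachment {

    /**
     * @param pathVars       per predicate in path order, the variables it binds
     * @param constraintVars per constraint, the variables it uses
     * @return per path position, the indices of the constraints attached there
     */
    static List<List<Integer>> attach(final List<Set<String>> pathVars,
            final List<Set<String>> constraintVars) {
        final List<List<Integer>> out = new ArrayList<>();
        final Set<String> bound = new HashSet<>();   // vars bound so far
        final Set<Integer> done = new HashSet<>();   // constraints placed
        for (Set<String> vars : pathVars) {
            bound.addAll(vars);
            final List<Integer> here = new ArrayList<>();
            for (int c = 0; c < constraintVars.size(); c++) {
                if (!done.contains(c) && bound.containsAll(constraintVars.get(c))) {
                    here.add(c);
                    done.add(c);
                }
            }
            out.add(here);
        }
        return out;
    }

    public static void main(final String[] args) {
        // c0 uses {x}; c1 uses {y,z}. The path binds {x}, then {y}, then {z},
        // so c0 attaches at position 0 and c1 at position 2.
        System.out.println(attach(
                Arrays.asList(Set.of("x"), Set.of("y"), Set.of("z")),
                Arrays.asList(Set.of("x"), Set.of("y", "z")))); // [[0], [], [1]]
    }
}
```

This is why, in the BSBM Q5 tests above, `c1` attaches at whichever of (p3, p4) runs second regardless of path order, and why the `pathIsComplete` flag added in rev 4251 matters: on a complete path every constraint must have attached no later than the last predicate.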
From: <tho...@us...> - 2011-02-25 23:01:48
Revision: 4252 http://bigdata.svn.sourceforge.net/bigdata/?rev=4252&view=rev
Author: thompsonbry
Date: 2011-02-25 23:01:42 +0000 (Fri, 25 Feb 2011)

Log Message:
-----------
Fixing a compile error in the build.

Modified Paths:
--------------
    branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/Path.java

Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/Path.java
===================================================================
--- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/Path.java	2011-02-25 21:20:03 UTC (rev 4251)
+++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/rto/Path.java	2011-02-25 23:01:42 UTC (rev 4252)
@@ -702,7 +702,8 @@

         // Figure out which constraints attach to each predicate.
         final IConstraint[][] constraintAttachmentArray = PartitionedJoinGroup
-                .getJoinGraphConstraints(path, constraints);
+                .getJoinGraphConstraints(path, constraints, null/*knownVariables*/,
+                        false/*FIXME pathIsComplete*/);

         // The constraint(s) (if any) for this join.
         final IConstraint[] c = constraintAttachmentArray[path.length - 1];
From: <tho...@us...> - 2011-02-25 21:20:10
Revision: 4251 http://bigdata.svn.sourceforge.net/bigdata/?rev=4251&view=rev
Author: thompsonbry
Date: 2011-02-25 21:20:03 +0000 (Fri, 25 Feb 2011)

Log Message:
-----------
Added a pathIsComplete boolean to PartitionedJoinGroup. When true, all constraints are attached no later than the last predicate.

Modified Paths:
--------------
    branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java
    branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/joinGraph/TestPartitionedJoinGroup.java
    branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/Rule2BOpUtility.java

Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java
===================================================================
--- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java	2011-02-25 21:18:01 UTC (rev 4250)
+++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/bop/joinGraph/PartitionedJoinGroup.java	2011-02-25 21:20:03 UTC (rev 4251)
@@ -134,46 +134,52 @@
                 .toArray(new IConstraint[joinGraphConstraints.size()]);
     }

-    /**
-     * Return the set of constraints which should be attached to the last join
-     * in the given the join path. All joins in the join path must be
-     * non-optional joins (that is, part of either the head plan or the join
-     * graph).
-     * <p>
-     * The rule followed by this method is that each constraint will be attached
-     * to the first non-optional join at which all of its variables are known to
-     * be bound. It is assumed that constraints are attached to each join in the
-     * join path by a consistent logic, e.g., as dictated by this method.
-     *
-     * @param joinPath
-     *            An ordered array of predicate identifiers representing a
-     *            specific sequence of non-optional joins.
-     *
-     * @return The constraints which should be attached to the last join in the
-     *         join path.
-     *
-     * @throws IllegalArgumentException
-     *             if the join path is <code>null</code>.
-     * @throws IllegalArgumentException
-     *             if the join path is empty.
-     * @throws IllegalArgumentException
-     *             if any element of the join path is <code>null</code>.
-     * @throws IllegalArgumentException
-     *             if any predicate specified in the join path is not known to
-     *             this class.
-     * @throws IllegalArgumentException
-     *             if any predicate specified in the join path is optional.
-     *
-     * @todo Implement (or refactor) the logic to decide which variables need to
-     *       be propagated and which can be dropped. This decision logic will
-     *       need to be available to the runtime query optimizer.
-     *
-     * @todo This does not pay attention to the head plan. If there can be
-     *       constraints on the head plan then either this should be modified
-     *       such that it can decide where they attach or we need to have a
-     *       method which does the same thing for the head plan.
-     */
-    public IConstraint[] getJoinGraphConstraints(final int[] pathIds) {
+    /**
+     * Return the set of constraints which should be attached to the last join
+     * in the given the join path. All joins in the join path must be
+     * non-optional joins (that is, part of either the head plan or the join
+     * graph).
+     * <p>
+     * The rule followed by this method is that each constraint will be attached
+     * to the first non-optional join at which all of its variables are known to
+     * be bound. It is assumed that constraints are attached to each join in the
+     * join path by a consistent logic, e.g., as dictated by this method.
+     *
+     * @param joinPath
+     *            An ordered array of predicate identifiers representing a
+     *            specific sequence of non-optional joins.
+     * @param pathIsComplete
+     *            <code>true</code> iff the <i>path</i> represents a complete
+     *            join path. When <code>true</code>, any constraints which have
+     *            not already been attached will be attached to the last
+     *            predicate in the join path.
+     *
+     * @return The constraints which should be attached to the last join in the
+     *         join path.
+     *
+     * @throws IllegalArgumentException
+     *             if the join path is <code>null</code>.
+     * @throws IllegalArgumentException
+     *             if the join path is empty.
+     * @throws IllegalArgumentException
+     *             if any element of the join path is <code>null</code>.
+     * @throws IllegalArgumentException
+     *             if any predicate specified in the join path is not known to
+     *             this class.
+     * @throws IllegalArgumentException
+     *             if any predicate specified in the join path is optional.
+     *
+     * @todo Implement (or refactor) the logic to decide which variables need to
+     *       be propagated and which can be dropped. This decision logic will
+     *       need to be available to the runtime query optimizer.
+     *
+     * @todo This does not pay attention to the head plan. If there can be
+     *       constraints on the head plan then either this should be modified
+     *       such that it can decide where they attach or we need to have a
+     *       method which does the same thing for the head plan.
+     */
+    public IConstraint[] getJoinGraphConstraints(final int[] pathIds,
+            final boolean pathIsComplete) {

         /*
          * Verify arguments and resolve bopIds to predicates.
@@ -212,57 +218,59 @@

         }

-        return getJoinGraphConstraints(path, joinGraphConstraints
-                .toArray(new IConstraint[joinGraphConstraints.size()]))[pathIds.length - 1];
+        final IConstraint[] constraints = joinGraphConstraints
+                .toArray(new IConstraint[joinGraphConstraints.size()]);
+
+        final IConstraint[][] attachedConstraints = getJoinGraphConstraints(
+                path, constraints, null/* knownBound */, pathIsComplete);
+
+        return attachedConstraints[pathIds.length - 1];
+
     }

-    static public IConstraint[][] getJoinGraphConstraints(
-            final IPredicate<?>[] path, final IConstraint[] joinGraphConstraints) {
+// static public IConstraint[][] getJoinGraphConstraints(
+//         final IPredicate<?>[] path, final IConstraint[] joinGraphConstraints) {
+//
+//     return getJoinGraphConstraints(path, joinGraphConstraints, null/*knownBound*/);
+//
+// }

-        return getJoinGraphConstraints(path, joinGraphConstraints, null);
-
-    }
-
-    /**
-     * Given a join path, return the set of constraints to be associated with
-     * each join in that join path. Only those constraints whose variables are
-     * known to be bound will be attached.
-     *
-     * @param path
-     *            The join path.
-     * @param joinGraphConstraints
-     *            The constraints to be applied to the join path (optional).
-     * @param knownBoundVars
-     *            Variables that are known to be bound as inputs to this
-     *            join graph (parent queries).
-     *
-     * @return The constraints to be paired with each element of the join path.
-     *
-     * @throws IllegalArgumentException
-     *             if the join path is <code>null</code>.
-     * @throws IllegalArgumentException
-     *             if the join path is empty.
-     * @throws IllegalArgumentException
-     *             if any element of the join path is <code>null</code>.
-     * @throws IllegalArgumentException
-     *             if any element of the join graph constraints is
-     *             <code>null</code>.
-     *
-     * @todo It should be an error if a variable appear in a constraint is not
-     *       bound by any possible join path. However, it may not be possible to
-     *       determine this by local examination of a join graph since we do not
-     *       know which variables may be presented as already bound when the
-     *       join graph is evaluated (but we can only run the join graph
-     *       currently against static source binding sets and for that case this
-     *       is knowable).
-     *
-     *       FIXME Unit tests.
-     */
+    /**
+     * Given a join path, return the set of constraints to be associated with
+     * each join in that join path. Only those constraints whose variables are
+     * known to be bound will be attached.
+     *
+     * @param path
+     *            The join path.
+     * @param joinGraphConstraints
+     *            The constraints to be applied to the join path (optional).
+     * @param knownBoundVars
+     *            Variables that are known to be bound as inputs to this join
+     *            graph (parent queries).
+     * @param pathIsComplete
+     *            <code>true</code> iff the <i>path</i> represents a complete
+     *            join path. When <code>true</code>, any constraints which have
+     *            not already been attached will be attached to the last predicate
+     *            in the join path.
+     *
+     * @return The constraints to be paired with each element of the join path.
+     *
+     * @throws IllegalArgumentException
+     *             if the join path is <code>null</code>.
+     * @throws IllegalArgumentException
+     *             if the join path is empty.
+     * @throws IllegalArgumentException
+     *             if any element of the join path is <code>null</code>.
+     * @throws IllegalArgumentException
+     *             if any element of the join graph constraints is
+     *             <code>null</code>.
+     */
     static public IConstraint[][] getJoinGraphConstraints(
-            final IPredicate<?>[] path,
-            final IConstraint[] joinGraphConstraints,
-            final IVariable<?>[] knownBoundVars) {
+            final IPredicate<?>[] path,//
+            final IConstraint[] joinGraphConstraints,//
+            final IVariable<?>[] knownBoundVars,//
+            final boolean pathIsComplete//
+            ) {

         if (path == null)
             throw new IllegalArgumentException();

@@ -343,7 +351,7 @@

                 boolean attach = false;

-                if (i == path.length-1) {
+                if (pathIsComplete && i == path.length - 1) {

                     // attach all unused constraints to last predicate
                     attach = true;

@@ -700,12 +708,11 @@
         /*
          * Find the constraints that will run with each vertex of the new
          * join path.
-         *
-         * TODO This is a forward reference to a different package, so maybe
-         * move the canJoinWithConstraints() method to that package?
          */
         final IConstraint[][] constraintRunArray = getJoinGraphConstraints(
-                newPath, constraints);
+                newPath, constraints, null/*knownBound*/,
+                true/*pathIsComplete*/
+                );

         /*
          * Consider only the constraints attached to the last vertex in the
@@ -997,7 +1004,8 @@

         // figure out which constraints are attached to which predicates.
         final IConstraint[][] assignedConstraints = PartitionedJoinGroup
-                .getJoinGraphConstraints(preds, constraints);
+                .getJoinGraphConstraints(preds, constraints, null/*knownBound*/,
+                        true/*pathIsComplete*/);

         final PipelineJoin<?>[] joins = new PipelineJoin[preds.length];

Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/joinGraph/TestPartitionedJoinGroup.java
===================================================================
--- branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/joinGraph/TestPartitionedJoinGroup.java	2011-02-25 21:18:01 UTC (rev 4250)
+++ branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/bop/joinGraph/TestPartitionedJoinGroup.java	2011-02-25 21:20:03 UTC (rev 4251)
@@ -249,25 +249,32 @@

             // System.out.println(Arrays.toString(actual));

             // c1 is applied when x is bound. x is bound by p0.
-            assertEquals(new IConstraint[] { c1 }, fixture
-                    .getJoinGraphConstraints(new int[] { p1.getId(),
-                            p0.getId() }));
+            assertEquals(new IConstraint[] { c1 }, fixture
+                    .getJoinGraphConstraints(//
+                            new int[] { p1.getId(), p0.getId() },//
+                            false// pathIsComplete
+                            ));

             /*
              * c1 is applied when x is bound. x is bound by p0. p0 is the
              * last predicate in this join path, so c1 is attached to p0.
              */
             assertEquals(new IConstraint[] { c1 }, fixture
-                    .getJoinGraphConstraints(new int[] { p0.getId()}));
+                    .getJoinGraphConstraints(//
+                            new int[] { p0.getId()},//
+                            false//pathIsComplete
+                            ));

-            /*
-             * c2 is applied when y is bound. y is bound by p1. p1 is the
-             * last predicate in this join path, p1 is the last predicate in
-             * this join path so c2 is attached to p1.
-             */
-            assertEquals(new IConstraint[] { c2 }, fixture
-                    .getJoinGraphConstraints(new int[] { p0.getId(),
-                            p1.getId() }));
+            /*
+             * c2 is applied when y is bound. y is bound by p1. p1 is the
+             * last predicate in this join path, p1 is the last predicate in
+             * this join path so c2 is attached to p1.
+             */
+            assertEquals(new IConstraint[] { c2 }, fixture
+                    .getJoinGraphConstraints(//
+                            new int[] { p0.getId(), p1.getId() },//
+                            false// pathIsComplete
+                            ));

         }

Modified: branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/Rule2BOpUtility.java
===================================================================
--- branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/Rule2BOpUtility.java	2011-02-25 21:18:01 UTC (rev 4250)
+++ branches/QUADS_QUERY_BRANCH/bigdata-sails/src/java/com/bigdata/rdf/sail/Rule2BOpUtility.java	2011-02-25 21:20:03 UTC (rev 4251)
@@ -586,7 +586,9 @@

             // figure out which constraints are attached to which predicates.
             assignedConstraints = PartitionedJoinGroup.getJoinGraphConstraints(
                     preds, constraints,
-                    knownBound.toArray(new IVariable<?>[knownBound.size()]));
+                    knownBound.toArray(new IVariable<?>[knownBound.size()]),
+                    true// pathIsComplete
+                    );
         }

         /*
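The r4251 change above implements one rule: each constraint attaches to the first predicate in the join path at which all of its variables are bound, and when `pathIsComplete` is true any still-unattached constraints are forced onto the last predicate. That rule can be sketched in isolation as follows. This is a minimal illustration, not the bigdata implementation: the `Pred`/`Con` model, the `attach` method, and all names in it are hypothetical.

```java
import java.util.*;

public class ConstraintAttachmentSketch {

    /** A predicate binds some variables when it runs (hypothetical model). */
    public record Pred(String id, Set<String> binds) {}

    /** A constraint can run once all of its variables are bound (hypothetical model). */
    public record Con(String id, Set<String> uses) {}

    /**
     * Attach each constraint to the first predicate in the path at which all
     * of its variables are bound. When pathIsComplete is true, constraints
     * still unattached at the end of the path are forced onto the last
     * predicate (mirroring the pathIsComplete semantics described above).
     */
    public static Map<String, List<String>> attach(List<Pred> path,
            List<Con> cons, boolean pathIsComplete) {
        final Map<String, List<String>> out = new LinkedHashMap<>();
        for (Pred p : path) {
            out.put(p.id(), new ArrayList<>());
        }
        final Set<String> bound = new HashSet<>();
        final Set<Con> remaining = new LinkedHashSet<>(cons);
        for (int i = 0; i < path.size(); i++) {
            final Pred p = path.get(i);
            bound.addAll(p.binds()); // variables bound after this join runs
            final boolean last = (i == path.size() - 1);
            for (Iterator<Con> it = remaining.iterator(); it.hasNext();) {
                final Con c = it.next();
                if (bound.containsAll(c.uses()) || (pathIsComplete && last)) {
                    out.get(p.id()).add(c.id());
                    it.remove();
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        final List<Pred> path = List.of(
                new Pred("p1", Set.of("x")),
                new Pred("p0", Set.of("y")));
        final List<Con> cons = List.of(
                new Con("c1", Set.of("x")),
                new Con("c2", Set.of("z"))); // z is never bound by this path
        // Without pathIsComplete, c2 is never attached:
        System.out.println(attach(path, cons, false)); // {p1=[c1], p0=[]}
        // With pathIsComplete, c2 is forced onto the last predicate:
        System.out.println(attach(path, cons, true));  // {p1=[c1], p0=[c2]}
    }
}
```

The `false/*FIXME pathIsComplete*/` in the r4252 fix above corresponds to the first call here: on a partial path no constraint may be forcibly attached, because a later join might be the correct first attachment point.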