From: <tho...@us...> - 2014-01-04 21:59:15
|
Revision: 7726 http://bigdata.svn.sourceforge.net/bigdata/?rev=7726&view=rev Author: thompsonbry Date: 2014-01-04 21:59:08 +0000 (Sat, 04 Jan 2014) Log Message: ----------- The best way to analyze this is to use hint:atOnce to force each operator in the query plan to execute only once. You can then look at the opCount column in the detailed Explain view and see whether the correct number of operator invocations was reported. It should be ONE (1) for each operator where the atOnce hint was correctly applied. If the atOnce hint is applied to an operator, then it sets pipelined:=false as follows: com.bigdata.bop.PipelineOp.pipelined = false I have modified BOPStats to NOT pre-increment opCount. I have modified AbstractRunningQuery.haltOp() to post-increment opCount One consequence is that the opCount will not update until the operator has halted. Thus, it is a "how many times did this operator run successfully counter" rather than a "how many times did this operator start counter". However, it is now reporting the correct values. I cross validated this by logging out the following in ChunkedRunningQuery: {{{ if (log.isInfoEnabled()) log.info("Running task: bop=" + bundle.bopId + (pipelined?"":", atOnceReady=" + atOnceReady) + ", bop=" + bop.toShortString() + ", messages=" + naccepted + ", solutions=" + solutionsAccepted + (log.isDebugEnabled()?", runState=" + runStateString():"")); getQueryEngine().execute(cft); }}} This is the code that actually runs the operator. I then verified that the data on operator invocations was correct. I also cross correlated where pipelined:=false in the query plan with opCount:=1, i.e., if atOnce evaluation was imposed on the operator, then we had only one invocation reported for that operator. See #793. Modified Paths: -------------- branches/BIGDATA_RELEASE_1_3_0/bigdata/src/java/com/bigdata/bop/engine/AbstractRunningQuery.java branches/BIGDATA_RELEASE_1_3_0/bigdata/src/java/com/bigdata/bop/engine/BOpStats.java Modified: branches/BIGDATA_RELEASE_1_3_0/bigdata/src/java/com/bigdata/bop/engine/AbstractRunningQuery.java =================================================================== --- branches/BIGDATA_RELEASE_1_3_0/bigdata/src/java/com/bigdata/bop/engine/AbstractRunningQuery.java 2014-01-04 21:46:36 UTC (rev 7725) +++ branches/BIGDATA_RELEASE_1_3_0/bigdata/src/java/com/bigdata/bop/engine/AbstractRunningQuery.java 2014-01-04 21:59:08 UTC (rev 7726) @@ -825,18 +825,38 @@ log.trace(msg.toString()); // update per-operator statistics. - final BOpStats tmp = statsMap.putIfAbsent(msg.getBOpId(), msg.getStats()); + { + // Data race on insert into CHM. + BOpStats tmp = statsMap.putIfAbsent(msg.getBOpId(), + msg.getStats()); - /* - * Combine stats, but do not combine a stats object with itself. - * - * @see https://sourceforge.net/apps/trac/bigdata/ticket/464 (Query - * Statistics do not update correctly on cluster) - */ - if (tmp != null && tmp != msg.getStats()) { - tmp.add(msg.getStats()); + /** + * Combine stats, but do not combine a stats object with itself. + * + * @see <a + * href="https://sourceforge.net/apps/trac/bigdata/ticket/464"> + * Query Statistics do not update correctly on cluster</a> + */ + if (tmp == null) { + // won the data race. + tmp = msg.getStats(); + } else { + // lost the data race. + if (tmp != msg.getStats()) { + tmp.add(msg.getStats()); + } + } + /** + * Post-increment now that we know who one the data race. + * + * @see <a + * href="https://sourceforge.net/apps/trac/bigdata/ticket/793"> + * Explain reports incorrect value for opCount</a> + */ + tmp.opCount.increment(); + // log.warn("bop=" + getBOp(msg.getBOpId()).toShortString() + // + " : stats=" + tmp); } -// log.warn(msg.toString() + " : stats=" + tmp); switch (runState.haltOp(msg)) { case Running: Modified: branches/BIGDATA_RELEASE_1_3_0/bigdata/src/java/com/bigdata/bop/engine/BOpStats.java =================================================================== --- branches/BIGDATA_RELEASE_1_3_0/bigdata/src/java/com/bigdata/bop/engine/BOpStats.java 2014-01-04 21:46:36 UTC (rev 7725) +++ branches/BIGDATA_RELEASE_1_3_0/bigdata/src/java/com/bigdata/bop/engine/BOpStats.java 2014-01-04 21:59:08 UTC (rev 7726) @@ -30,7 +30,6 @@ import java.io.Serializable; import com.bigdata.bop.BOp; -import com.bigdata.bop.PipelineOp; import com.bigdata.counters.CAT; /** @@ -55,18 +54,15 @@ */ final public CAT elapsed = new CAT(); - /** - * The #of instances of a given operator which have been created for a given - * query. This provides interesting information about the #of task instances - * for each operator which were required to execute a query. - * - * TODO Due to the way this is incremented, this is always ONE (1) if - * {@link PipelineOp.Annotations#SHARED_STATE} is <code>true</code> (it - * reflects the #of times {@link #add(BOpStats)} was invoked plus one for - * the ctor rather than the #of times the operator task was invoked). This - * should be changed to reflect the #of operator task instances created - * instead. - */ + /** + * The #of instances of a given operator which have been started (and + * successully terminated) for a given query. This provides interesting + * information about the #of task instances for each operator which were + * required to execute a query. + * + * @see <a href="https://sourceforge.net/apps/trac/bigdata/ticket/793"> + * Explain reports incorrect value for opCount</a> + */ final public CAT opCount = new CAT(); /** @@ -127,10 +123,16 @@ /** * Constructor. + * <p> + * Note: Do not pre-increment {@link #opCount}. See {@link #add(BOpStats)} + * and {@link AbstractRunningQuery#haltOp(IHaltOpMessage)}. + * + * @see <a href="https://sourceforge.net/apps/trac/bigdata/ticket/793"> + * Explain reports incorrect value for opCount</a> */ public BOpStats() { - opCount.increment(); +// opCount.increment(); } This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |