From: Koichi S. <koi...@gm...> - 2013-08-26 01:44:15
I guess option 1 needs the least effort. On the other hand, option 3 will
be useful in user functions when we allow more XC-specific functions. If
option 1 or 2 can be integrated easily with option 3, I agree to take the
option with the least effort.

Regards;
---
Koichi Suzuki

2013/8/23 Ashutosh Bapat <ash...@en...>

> Hi All,
> Consider the following query,
> select * from (select avg(val2), val from tab1) a join (select avg(val2),
> val from tab2) b using (val);
> where tab1 and tab2 are tables distributed on column val. The query gets
> shipped completely using FQS, but not through the standard planner. The
> standard planner, as of now, doesn't have the ability to ship subquery
> RTEs. I am trying a fix for the same.
>
> If the subquery is completely shippable, i.e. the corresponding rel has a
> subplan with RemoteQuery as the top plan, which doesn't have any
> coordinator quals and does not require projection (doesn't have any
> unshippable expressions in the targetlist), it can be used in the same
> fashion as a table on datanodes, except that fetching data requires
> firing a query, which should be the same as the query constructed in the
> RemoteQuery node. While constructing this query, if there are any
> aggregates in the query, those need to be finalised on the datanode
> (remember, we get transitioned results for aggregates from the datanodes
> by default). In order to specify that the datanode should finalise the
> aggregates, we add a finalisation function to the aggregate. This is done
> during the deparsing phase, using the flag finalise_aggs. This flag was
> earlier in Query, then moved to RemoteQuery to avoid changes to
> PostgreSQL structures. But having it in RemoteQuery implies that the
> whole query in that node should have aggregates finalised. That may not
> be true once we start reducing subquery relations, since in such cases
> you may not want to finalise aggregates in the top query but do want to
> do that for a subquery.
> Thus it fits to have this switch in the Query structure. Now that we
> have seen one back and forth of this flag, I would like to discuss some
> more options for getting finalised results from the datanode and have a
> poll on which one is best.
>
> (This discussion assumes that readers know that we construct a Query
> structure for the query to be passed to the datanode, and then deparse
> it.)
>
> 1. Use the finalise_aggs flag in Query and, while deparsing the query,
> add the finalisation function. Less impact on PG code.
>
> 2. Add a final function node on top of the aggregate nodes in the Query
> to be sent to the datanode and deparse this Query structure. No change in
> PG code, but we need to add final function nodes on each aggregate node
> by pulling those from everywhere in the query, so some coding is
> involved. Deparsing is expected to take care of properly constructing the
> query automatically.
>
> 3. Add an aggregate directive like "finalise", in line with other
> directives like "order by", etc. This requires a syntax change in PG,
> which can be invasive.
>
> Any other ideas?
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
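
To make option 1 concrete, here is a minimal Python sketch of a per-query finalise_aggs flag that is consulted at deparse time. It is not Postgres-XC code. The Query and Aggref classes are toy stand-ins for the real structures, and final_fn(...) is an invented wrapper name used only to mark where a finalisation function would be added; the actual deparsed syntax is not given in this thread. A GROUP BY is added so the toy subquery is valid SQL.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Aggref:
    """Toy stand-in for an aggregate call in a targetlist (hypothetical)."""
    func: str  # aggregate name, e.g. "avg"
    arg: str   # argument text, e.g. "val2"


@dataclass
class Query:
    """Toy stand-in for a Query node; finalise_aggs is the option 1 flag."""
    targets: List[Union[Aggref, str]]   # aggregates or plain column names
    from_item: Union[str, "Query"]      # table name or nested subquery
    alias: Optional[str] = None         # alias when used as a subquery
    group_by: List[str] = field(default_factory=list)
    finalise_aggs: bool = False         # finalise aggregates of THIS query only


def deparse(q: Query) -> str:
    """Build SQL text, wrapping aggregates only where the flag is set."""
    cols = []
    for t in q.targets:
        if isinstance(t, Aggref):
            text = f"{t.func}({t.arg})"
            if q.finalise_aggs:
                # final_fn is an invented marker, not real Postgres-XC syntax.
                text = f"final_fn({text})"
            cols.append(text)
        else:
            cols.append(t)
    if isinstance(q.from_item, Query):
        # Each nested Query carries its own flag, independent of the parent.
        src = f"({deparse(q.from_item)}) {q.from_item.alias}"
    else:
        src = q.from_item
    sql = f"SELECT {', '.join(cols)} FROM {src}"
    if q.group_by:
        sql += " GROUP BY " + ", ".join(q.group_by)
    return sql


# The subquery wants finalised aggregates; the top query does not.
sub = Query(targets=[Aggref("avg", "val2"), "val"], from_item="tab1",
            alias="a", group_by=["val"], finalise_aggs=True)
top = Query(targets=[Aggref("count", "*")], from_item=sub)
print(deparse(top))
```

Because the flag sits on each Query rather than on the enclosing RemoteQuery, the subquery's avg is wrapped while the top-level count is left as a transition result, which is the case the flag's placement in RemoteQuery cannot express.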