From: Michael P. <mic...@gm...> - 2012-06-27 06:34:39
|
On Wed, Jun 27, 2012 at 3:23 PM, Ashutosh Bapat < ash...@en...> wrote: > Hi Michael, > That's a good catch and good that we came upon it before finalising the > fix. > > While creating the view definition, it tries to parse the SELECT statement > and while doing so, tries to resolve the aggregate function. On datanode, > aggregates have transition type as return type, which in this case is > polymorphic (not acceptable as return type). We manage such aggregates by > not pushing the aggregates down to the datanodes, but in this case don't > look at what can be pushed or not inside the view definition. > Yes, OK. > > What we may want to do, and is hard to do, is to dis-assemble the view > definition at coordinator and send the relevant information (the one stored > in catalogs?) to the datanode to be stored directly (without involving > parsing etc.). The same may need to be done with all the utilities, but > this is a massive change, and something which needs to be thought through > properly. > This is... well... not simple. And not completely related to this fix. > > > On Wed, Jun 27, 2012 at 11:05 AM, Michael Paquier < > mic...@gm...> wrote: > >> Hi, >> >> I wrote a patch enabling the creation of views on Datanodes to get rid of >> this function problem. The fix is attached. >> However, while digging into this issue, I found a problem with types and >> views, for example: >> create table aa (a int); >> create type aa_type as enum ('1','2','3','4','5','6'); >> create view aa_v as select max(a::aa_type) from aa; -- created on all the >> nodes >> ERROR: column "max" has pseudo-type anyenum >> >> This error comes from heap.c, where a check is done on the type of the >> column. >> The problem is that in the case of aggregates, we use the transition type >> on Datanodes, which is a pseudo-type and is by definition forbidden for as >> a column type. >> The aggregate modification comes from here: >> --- a/src/backend/parser/parse_agg.c >> +++ b/src/backend/parser/parse_agg.c >> @@ -209,6 +209,7 @@ transformAggregateCall(ParseState *pstate, Aggref >> *agg, >> aggform = (Form_pg_aggregate) GETSTRUCT(aggTuple); >> agg->aggtrantype = aggform->aggtranstype; >> agg->agghas_collectfn = OidIsValid(aggform->aggcollectfn); >> + //Error comes from this one: >> if (IS_PGXC_DATANODE) >> agg->aggtype = agg->aggtrantype; >> >> Associating a transition type on Datanodes for aggregates is correct, but >> until now we have never created views on Datanodes. >> Btw, a fix for this second issue is included in the patch attached. What >> I simply did was bypassing the error on Datanodes as we may have a >> pseudo-type in the case of an aggregate. Ashutosh, comments on that? >> >> >> >> On Wed, Jun 20, 2012 at 2:59 PM, Ashutosh Bapat < >> ash...@en...> wrote: >> >>> >>> >>> On Wed, Jun 20, 2012 at 10:25 AM, Michael Paquier < >>> mic...@gm...> wrote: >>> >>>> >>>> >>>> On Wed, Jun 20, 2012 at 12:58 PM, Ashutosh Bapat < >>>> ash...@en...> wrote: >>>> >>>>> >>>>> >>>>> On Wed, Jun 20, 2012 at 9:18 AM, Michael Paquier < >>>>> mic...@gm...> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Wed, Jun 20, 2012 at 12:46 PM, Ashutosh Bapat < >>>>>> ash...@en...> wrote: >>>>>> >>>>>>> One fix, I can think of is to create volatile functions only on >>>>>>> coordinator. Although, I would still take a step back, and find out why we >>>>>>> took the decision not to store views on the datanodes. >>>>>>> >>>>>> View => projection of table data => need distribution type of table >>>>>> => distribution data only available on Coordinator for data distribution => >>>>>> no sense to define views on Datanodes >>>>>> >>>>> >>>>> In the case, where a view type is used as function argument or return >>>>> type, it does make sense to have the view definition on the datanodes. The >>>>> implication behind my question is whether there is any correctness problem >>>>> by creating view and related definitions at the datanodes. >>>>> >>>> By taking this question from another angle: >>>> Are there any problems to push down clauses using views to Datanodes? >>>> >>> >>> Having view definitions on the datanode does not imply that we have to >>> push the clauses using views to the datanodes. In fact, even if we want to, >>> we won't be able to do so, as the view resolution happens even before we >>> take into consideration the distribution. >>> >>> >>>> Just based on correctness, the answer is no problem. Btw, the function >>>> using a view should be volatile as it reads data, so it will not be used on >>>> Datanodes at all... >>>> >>> >>> We are not using view here, we are using datatype which corresponds to >>> the view result. Using such datatype does not necessarily mean that we >>> touch any of the data. For example, see the function (modified version of >>> the example given by Dimitrije) below >>> >>> CREATE OR REPLACE FUNCTION some_function() RETURNS SETOF some_view AS >>> $body$ >>> BEGIN >>> return (1, 1); >>> END; >>> $body$ >>> LANGUAGE 'plpgsql' >>> COST 100; >>> >>> This function is certainly immutable (certainly not volatile), and thus >>> pushable to the datanodes. For such functions, it having view definitions >>> at the datanodes will be helpful. >>> >>> >>>> -- >>>> Michael Paquier >>>> http://michael.otacoo.com >>>> >>> >>> >>> >>> -- >>> Best Wishes, >>> Ashutosh Bapat >>> EntepriseDB Corporation >>> The Enterprise Postgres Company >>> >>> >> >> >> -- >> Michael Paquier >> http://michael.otacoo.com >> > > > > -- > Best Wishes, > Ashutosh Bapat > EntepriseDB Corporation > The Enterprise Postgres Company > > -- Michael Paquier http://michael.otacoo.com |