From: Ashutosh B. <ash...@en...> - 2012-06-27 06:23:28
|
Hi Michael, That's a good catch and good that we came upon it before finalising the fix. While creating the view definition, it tries to parse the SELECT statement and while doing so, tries to resolve the aggregate function. On datanode, aggregates have transition type as return type, which in this case is polymorphic (not acceptable as return type). We manage such aggregates by not pushing the aggregates down to the datanodes, but in this case don't look at what can be pushed or not inside the view definition. What we may want to do, and is hard to do, is to dis-assemble the view definition at coordinator and send the relevant information (the one stored in catalogs?) to the datanode to be stored directly (without involving parsing etc.). The same may need to be done with all the utilities, but this is a massive change, and something which needs to be thought through properly. On Wed, Jun 27, 2012 at 11:05 AM, Michael Paquier <mic...@gm... > wrote: > Hi, > > I wrote a patch enabling the creation of views on Datanodes to get rid of > this function problem. The fix is attached. > However, while digging into this issue, I found a problem with types and > views, for example: > create table aa (a int); > create type aa_type as enum ('1','2','3','4','5','6'); > create view aa_v as select max(a::aa_type) from aa; -- created on all the > nodes > ERROR: column "max" has pseudo-type anyenum > > This error comes from heap.c, where a check is done on the type of the > column. > The problem is that in the case of aggregates, we use the transition type > on Datanodes, which is a pseudo-type and is by definition forbidden for as > a column type. > The aggregate modification comes from here: > --- a/src/backend/parser/parse_agg.c > +++ b/src/backend/parser/parse_agg.c > @@ -209,6 +209,7 @@ transformAggregateCall(ParseState *pstate, Aggref *agg, > aggform = (Form_pg_aggregate) GETSTRUCT(aggTuple); > agg->aggtrantype = aggform->aggtranstype; > agg->agghas_collectfn = OidIsValid(aggform->aggcollectfn); > + //Error comes from this one: > if (IS_PGXC_DATANODE) > agg->aggtype = agg->aggtrantype; > > Associating a transition type on Datanodes for aggregates is correct, but > until now we have never created views on Datanodes. > Btw, a fix for this second issue is included in the patch attached. What I > simply did was bypassing the error on Datanodes as we may have a > pseudo-type in the case of an aggregate. Ashutosh, comments on that? > > > > On Wed, Jun 20, 2012 at 2:59 PM, Ashutosh Bapat < > ash...@en...> wrote: > >> >> >> On Wed, Jun 20, 2012 at 10:25 AM, Michael Paquier < >> mic...@gm...> wrote: >> >>> >>> >>> On Wed, Jun 20, 2012 at 12:58 PM, Ashutosh Bapat < >>> ash...@en...> wrote: >>> >>>> >>>> >>>> On Wed, Jun 20, 2012 at 9:18 AM, Michael Paquier < >>>> mic...@gm...> wrote: >>>> >>>>> >>>>> >>>>> On Wed, Jun 20, 2012 at 12:46 PM, Ashutosh Bapat < >>>>> ash...@en...> wrote: >>>>> >>>>>> One fix, I can think of is to create volatile functions only on >>>>>> coordinator. Although, I would still take a step back, and find out why we >>>>>> took the decision not to store views on the datanodes. >>>>>> >>>>> View => projection of table data => need distribution type of table => >>>>> distribution data only available on Coordinator for data distribution => no >>>>> sense to define views on Datanodes >>>>> >>>> >>>> In the case, where a view type is used as function argument or return >>>> type, it does make sense to have the view definition on the datanodes. The >>>> implication behind my question is whether there is any correctness problem >>>> by creating view and related definitions at the datanodes. >>>> >>> By taking this question from another angle: >>> Are there any problems to push down clauses using views to Datanodes? >>> >> >> Having view definitions on the datanode does not imply that we have to >> push the clauses using views to the datanodes. In fact, even if we want to, >> we won't be able to do so, as the view resolution happens even before we >> take into consideration the distribution. >> >> >>> Just based on correctness, the answer is no problem. Btw, the function >>> using a view should be volatile as it reads data, so it will not be used on >>> Datanodes at all... >>> >> >> We are not using view here, we are using datatype which corresponds to >> the view result. Using such datatype does not necessarily mean that we >> touch any of the data. For example, see the function (modified version of >> the example given by Dimitrije) below >> >> CREATE OR REPLACE FUNCTION some_function() RETURNS SETOF some_view AS >> $body$ >> BEGIN >> return (1, 1); >> END; >> $body$ >> LANGUAGE 'plpgsql' >> COST 100; >> >> This function is certainly immutable (certainly not volatile), and thus >> pushable to the datanodes. For such functions, it having view definitions >> at the datanodes will be helpful. >> >> >>> -- >>> Michael Paquier >>> http://michael.otacoo.com >>> >> >> >> >> -- >> Best Wishes, >> Ashutosh Bapat >> EntepriseDB Corporation >> The Enterprise Postgres Company >> >> > > > -- > Michael Paquier > http://michael.otacoo.com > -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Enterprise Postgres Company |