|
From: Michael P. <mic...@gm...> - 2012-06-27 06:34:39
|
On Wed, Jun 27, 2012 at 3:23 PM, Ashutosh Bapat <
ash...@en...> wrote:
> Hi Michael,
> That's a good catch and good that we came upon it before finalising the
> fix.
>
> While creating the view definition, it tries to parse the SELECT statement
> and while doing so, tries to resolve the aggregate function. On datanode,
> aggregates have transition type as return type, which in this case is
> polymorphic (not acceptable as return type). We manage such aggregates by
> not pushing the aggregates down to the datanodes, but in this case don't
> look at what can be pushed or not inside the view definition.
>
Yes, OK.
>
> What we may want to do, and is hard to do, is to dis-assemble the view
> definition at coordinator and send the relevant information (the one stored
> in catalogs?) to the datanode to be stored directly (without involving
> parsing etc.). The same may need to be done with all the utilities, but
> this is a massive change, and something which needs to be thought through
> properly.
>
This is... well... not simple. And not completely related to this fix.
>
>
> On Wed, Jun 27, 2012 at 11:05 AM, Michael Paquier <
> mic...@gm...> wrote:
>
>> Hi,
>>
>> I wrote a patch enabling the creation of views on Datanodes to get rid of
>> this function problem. The fix is attached.
>> However, while digging into this issue, I found a problem with types and
>> views, for example:
>> create table aa (a int);
>> create type aa_type as enum ('1','2','3','4','5','6');
>> create view aa_v as select max(a::aa_type) from aa; -- created on all the
>> nodes
>> ERROR: column "max" has pseudo-type anyenum
>>
>> This error comes from heap.c, where a check is done on the type of the
>> column.
>> The problem is that in the case of aggregates, we use the transition type
>> on Datanodes, which is a pseudo-type and is by definition forbidden for as
>> a column type.
>> The aggregate modification comes from here:
>> --- a/src/backend/parser/parse_agg.c
>> +++ b/src/backend/parser/parse_agg.c
>> @@ -209,6 +209,7 @@ transformAggregateCall(ParseState *pstate, Aggref
>> *agg,
>> aggform = (Form_pg_aggregate) GETSTRUCT(aggTuple);
>> agg->aggtrantype = aggform->aggtranstype;
>> agg->agghas_collectfn = OidIsValid(aggform->aggcollectfn);
>> + //Error comes from this one:
>> if (IS_PGXC_DATANODE)
>> agg->aggtype = agg->aggtrantype;
>>
>> Associating a transition type on Datanodes for aggregates is correct, but
>> until now we have never created views on Datanodes.
>> Btw, a fix for this second issue is included in the patch attached. What
>> I simply did was bypassing the error on Datanodes as we may have a
>> pseudo-type in the case of an aggregate. Ashutosh, comments on that?
>>
>>
>>
>> On Wed, Jun 20, 2012 at 2:59 PM, Ashutosh Bapat <
>> ash...@en...> wrote:
>>
>>>
>>>
>>> On Wed, Jun 20, 2012 at 10:25 AM, Michael Paquier <
>>> mic...@gm...> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Jun 20, 2012 at 12:58 PM, Ashutosh Bapat <
>>>> ash...@en...> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Wed, Jun 20, 2012 at 9:18 AM, Michael Paquier <
>>>>> mic...@gm...> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Jun 20, 2012 at 12:46 PM, Ashutosh Bapat <
>>>>>> ash...@en...> wrote:
>>>>>>
>>>>>>> One fix, I can think of is to create volatile functions only on
>>>>>>> coordinator. Although, I would still take a step back, and find out why we
>>>>>>> took the decision not to store views on the datanodes.
>>>>>>>
>>>>>> View => projection of table data => need distribution type of table
>>>>>> => distribution data only available on Coordinator for data distribution =>
>>>>>> no sense to define views on Datanodes
>>>>>>
>>>>>
>>>>> In the case, where a view type is used as function argument or return
>>>>> type, it does make sense to have the view definition on the datanodes. The
>>>>> implication behind my question is whether there is any correctness problem
>>>>> by creating view and related definitions at the datanodes.
>>>>>
>>>> By taking this question from another angle:
>>>> Are there any problems to push down clauses using views to Datanodes?
>>>>
>>>
>>> Having view definitions on the datanode does not imply that we have to
>>> push the clauses using views to the datanodes. In fact, even if we want to,
>>> we won't be able to do so, as the view resolution happens even before we
>>> take into consideration the distribution.
>>>
>>>
>>>> Just based on correctness, the answer is no problem. Btw, the function
>>>> using a view should be volatile as it reads data, so it will not be used on
>>>> Datanodes at all...
>>>>
>>>
>>> We are not using view here, we are using datatype which corresponds to
>>> the view result. Using such datatype does not necessarily mean that we
>>> touch any of the data. For example, see the function (modified version of
>>> the example given by Dimitrije) below
>>>
>>> CREATE OR REPLACE FUNCTION some_function() RETURNS SETOF some_view AS
>>> $body$
>>> BEGIN
>>> return (1, 1);
>>> END;
>>> $body$
>>> LANGUAGE 'plpgsql'
>>> COST 100;
>>>
>>> This function is certainly immutable (certainly not volatile), and thus
>>> pushable to the datanodes. For such functions, it having view definitions
>>> at the datanodes will be helpful.
>>>
>>>
>>>> --
>>>> Michael Paquier
>>>> http://michael.otacoo.com
>>>>
>>>
>>>
>>>
>>> --
>>> Best Wishes,
>>> Ashutosh Bapat
>>> EntepriseDB Corporation
>>> The Enterprise Postgres Company
>>>
>>>
>>
>>
>> --
>> Michael Paquier
>> http://michael.otacoo.com
>>
>
>
>
> --
> Best Wishes,
> Ashutosh Bapat
> EntepriseDB Corporation
> The Enterprise Postgres Company
>
>
--
Michael Paquier
http://michael.otacoo.com
|