Thread: [Postgres-xc-developers] Shippability of statement triggers

[Postgres-xc-developers] Shippability of statement triggers

From: Amit K. <ami...@en...> - 2013-05-10 05:04:22

We can consider applying the usual row-trigger rules of trigger
shippability to statement triggers:
If the trigger function is shippable, execute the trigger on datanode, else
on coordinator.

It is not as trivial as it sounds though. For a non-FQS'ed DML, the DML is
executed on the datanode once for each row to be processed. So if a user
updates 10 rows with a non-shippable query, the coordinator will execute a
parameterized remote update query on the datanode for each of the 10 ctids
found using the quals. And if we execute shippable statement triggers on the
datanode, the statement trigger will be executed 10 times on the datanode. Is
this what the user expects?
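
To make the counting problem concrete, here is a minimal Python sketch. It is a toy model, not Postgres-XC code: `Datanode`, `run_remote_update`, and `non_fqs_update` are hypothetical names, and it assumes that a statement trigger shipped to the datanode fires once per remote statement the datanode receives.

```python
from dataclasses import dataclass


@dataclass
class Datanode:
    """Toy datanode that counts how often a shipped statement trigger fires."""
    stmt_trigger_fired: int = 0
    rows_updated: int = 0

    def run_remote_update(self, ctid):
        # Each remote UPDATE is a complete statement from the datanode's
        # point of view, so a statement trigger living here fires every time.
        self.stmt_trigger_fired += 1
        self.rows_updated += 1


def non_fqs_update(datanode, ctids):
    """Coordinator side of a non-FQS UPDATE: one parameterized remote
    update per ctid that matched the quals."""
    for ctid in ctids:
        datanode.run_remote_update(ctid)


dn = Datanode()
# Ten matching rows, identified by hypothetical (block, offset) ctids.
non_fqs_update(dn, [(0, n) for n in range(1, 11)])
print(dn.rows_updated, dn.stmt_trigger_fired)  # one user statement, ten firings
```

The user issued a single UPDATE, yet in this model the datanode-side counter ends up equal to the number of rows touched.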

From the user's perspective, the statement is executed once, so the
statement trigger should be fired only once. A typical use case is that
user queries need to be logged or audited. So we need to prevent statement
triggers from firing on the datanode for a non-FQS'ed query. But should the
user have to define the statement trigger function as immutable in such a
case? Maybe not in this auditing scenario. It is not very clear what a
shippable statement trigger would mean to the user exactly. If the function
really does not access the database, as the immutable definition requires,
then it does not matter anyway how many times it gets executed on the
datanode for a given statement.

I think the solution is to *always* fire statement triggers on the
*coordinator*, regardless of shippability and of whether the query is FQS or
non-FQS. For an FQS query, we need to explicitly fire the statement trigger
before/after the FQS'ed query node is executed, maybe inside the
ExecRemoteQuery() function itself.

Comments?

Re: [Postgres-xc-developers] Shippability of statement triggers

From: Ashutosh B. <ash...@en...> - 2013-05-10 05:59:15

Hi Amit,
This looks fine from a correctness perspective and may be OK for this
release. I hope we can still FQS the query and not ship the triggers. How
would it affect performance?

Michael,
Can you comment on this, since you are the one who implemented statement
triggers?




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: [Postgres-xc-developers] Shippability of statement triggers

From: Amit K. <ami...@en...> - 2013-05-10 06:07:21

On 10 May 2013 11:29, Ashutosh Bapat <ash...@en...> wrote:

>
> Hi Amit,
> This looks fine from a correctness perspective and may be OK for this
> release. I hope we can still FQS the query and not ship the triggers.
>

Yes, that's the idea. We handle statement triggers explicitly on the
coordinator for FQS. We define a resultRelInfo in RemoteQuery if the query is
FQS, and then call the before-statement (BS) and after-statement (AS)
triggers in ExecInitRemoteQuery() and ExecEndRemoteQuery() respectively, if
resultRelInfo is set.
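
A rough sketch of this idea, written in Python rather than the executor's C. It is only an illustrative model: the names mirror the ones mentioned in this mail (`RemoteQuery`, `resultRelInfo`, `ExecInitRemoteQuery()`, `ExecEndRemoteQuery()`), but the classes, the `run` driver, and the event log are assumptions made up for the example and do not reflect the real structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResultRelInfo:
    """Stand-in for the executor's result relation descriptor."""
    relname: str


@dataclass
class RemoteQuery:
    """Toy plan node; result_rel_info is set only for an FQS'ed DML."""
    sql: str
    result_rel_info: Optional[ResultRelInfo] = None


def exec_init_remote_query(node, log):
    # Before-statement (BS) triggers fire on the coordinator at init time.
    if node.result_rel_info is not None:
        log.append(f"BS trigger on coordinator for {node.result_rel_info.relname}")


def exec_remote_query(node, log):
    # The whole statement is shipped as-is; no trigger is shipped with it.
    log.append(f"ship to datanode: {node.sql}")


def exec_end_remote_query(node, log):
    # After-statement (AS) triggers fire on the coordinator at shutdown.
    if node.result_rel_info is not None:
        log.append(f"AS trigger on coordinator for {node.result_rel_info.relname}")


def run(node):
    """Drive one node through init, execution, and end; return the event log."""
    log = []
    exec_init_remote_query(node, log)
    exec_remote_query(node, log)
    exec_end_remote_query(node, log)
    return log


fqs_dml = RemoteQuery("UPDATE t SET v = v + 1", ResultRelInfo("t"))
for line in run(fqs_dml):
    print(line)
```

In this model a node without `result_rel_info` (for example a shipped SELECT) goes through the same three calls but fires no trigger, which is the point of guarding on the field.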


> How would it affect performance?
>

Since the query would still be FQSed even with a non-shippable statement
trigger, this does not impact performance. The statement trigger would be
fired only once.


Re: [Postgres-xc-developers] Shippability of statement triggers

From: Koichi S. <koi...@gm...> - 2013-05-10 10:30:30

I think this is a reasonable point to start with.
However, in general, a TRIGGER is an expensive action, so I believe we should
provide more sophisticated optimization of TRIGGER actions. I understand
this is not an issue of TRIGGER itself. We need to look into the rewriter,
function shippability (the current shippability is safe but not optimal --
I'd like to discuss this shortly), and statement shippability, by which I
mean partial statement shippability.

I understand that trigger actions need careful and sophisticated design and
have a complicated relationship with other components of query processing.


Best;

----------
Koichi Suzuki



Re: [Postgres-xc-developers] Shippability of statement triggers

From: Amit K. <ami...@en...> - 2013-05-10 12:10:02

On 10 May 2013 16:00, Koichi Suzuki <koi...@gm...> wrote:

> I think this is a reasonable point to start with.
> However, in general, a TRIGGER is an expensive action, so I believe we should
> provide more sophisticated optimization of TRIGGER actions. I understand
> this is not an issue of TRIGGER itself. We need to look into the rewriter,
> function shippability (the current shippability is safe but not optimal --
> I'd like to discuss this shortly), and statement shippability, by which I
> mean partial statement shippability.
>
>
That's a long-standing issue we have been discussing: what are the various
ways in which a user can make trigger functions execute safely on the
datanode. Actually, the performance impact is particularly critical for
row triggers, and not so much for statement triggers, since they execute only
once.


Re: [Postgres-xc-developers] Shippability of statement triggers

From: Michael P. <mic...@gm...> - 2013-05-10 12:29:25

On Fri, May 10, 2013 at 2:59 PM, Ashutosh Bapat <
ash...@en...> wrote:

> Can you comment on this, since you are the one who implemented statement
> triggers?
>
About the shippability of triggers, here is some food for thought:
http://michael.otacoo.com/postgresql-2/triggers-in-a-cluster-database/

In short, a trigger can be fired on a remote node only if:
- the query that triggered it is FQSed
- the fired procedure is immutable. There might be cases where a trigger
with a stable procedure could be shippable, but this would be dangerous...
There are already some APIs I implemented that you can use if they are not
used already (in trigger.c or somewhere in the shippability APIs of the
optimizer; I don't recall precisely).
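
The two conditions in the list above amount to a simple predicate. A minimal sketch in Python, using hypothetical names (`Volatility`, `trigger_is_shippable`) that are not the Postgres-XC APIs mentioned here; it only restates the rule as written in this message:

```python
from enum import Enum


class Volatility(Enum):
    # Values follow PostgreSQL's pg_proc.provolatile codes.
    IMMUTABLE = "i"
    STABLE = "s"
    VOLATILE = "v"


def trigger_is_shippable(query_is_fqs, func_volatility):
    """Return True only when both conditions from the list hold:
    the firing query is FQS'ed and the trigger procedure is immutable."""
    return query_is_fqs and func_volatility is Volatility.IMMUTABLE


print(trigger_is_shippable(True, Volatility.IMMUTABLE))   # True
print(trigger_is_shippable(True, Volatility.STABLE))      # False
print(trigger_is_shippable(False, Volatility.IMMUTABLE))  # False
```

Treating a stable procedure as non-shippable is the conservative choice, matching the "this would be dangerous" caveat.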

Just knowing that 99% of triggers do not use immutable functions is enough
to fire all the triggers on Coordinators. You will get bad performance for
row triggers, but if you don't do that, data consistency is badly endangered.
However, for correctness you should leave the door open to immutable triggers
being shippable.
-- 
Michael

Re: [Postgres-xc-developers] Shippability of statement triggers

From: Ashutosh B. <ash...@en...> - 2013-05-10 12:31:25

On Fri, May 10, 2013 at 5:59 PM, Michael Paquier <mic...@gm...> wrote:

>
>
>
> On Fri, May 10, 2013 at 2:59 PM, Ashutosh Bapat <
> ash...@en...> wrote:
>
>> Can you comment on this, since you are the one who implemented statement
>> triggers?
>>
> About the shippability of triggers, here is some food for brain:
> http://michael.otacoo.com/postgresql-2/triggers-in-a-cluster-database/
>
> In short, a trigger can be fired on a remote node only if:
> - the query that triggered it is FQSed
>

That's busted, and hence this whole mail thread.


> - the fired procedure is immutable. There might be cases where a trigger
> with a stable procedure could be shippable but this would be dangerous...
> There are already some APIs I implemented that you can use if they are not
> used already (trigger.c or something in the shippability APIs of the
> optimizer I don't recall precisely).
>

Unfortunately, those APIs were found to be heavily erroneous, and Amit is
currently working on writing correct ones.





-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: [Postgres-xc-developers] Shippability of statement triggers

From: Amit K. <ami...@en...> - 2013-05-10 13:01:01


Actually I don't want to go into the general shippability of statements in
a trigger context; that is another issue with a bigger scope. My point in
this mail thread concerns one specific case: what to do about statement
triggers, and whether we should indeed run *statement* triggers on the
datanode on the basis that the trigger function is safe to run there. Here is
the key point I made:

"For a non-FQS'ed DML, the DML is executed on the datanode once for each row
to be processed. So if a user updates 10 rows with a non-shippable query, the
coordinator will execute a parameterized remote update query on the datanode
for each of the 10 ctids found using the quals. And if we execute shippable
statement triggers on the datanode, the statement trigger will be executed 10
times on the datanode, which is not what the user expects. That is the reason
we should always fire statement triggers on the coordinator, regardless of
anything else."

Let me know if you have any comments on this specific point.


Re: [Postgres-xc-developers] Shippability of statement triggers

From: Koichi S. <koi...@gm...> - 2013-05-10 15:38:37

2013/5/10 Amit Khandekar <ami...@en...>

> Actually I don't want to go into the general shippability of statements in
> a trigger context; that is another issue with a bigger scope. My point in
> this mail thread concerns one specific case: what to do about statement
> triggers, and whether we should indeed run *statement* triggers on the
> datanode on the basis that the trigger function is safe to run there. Here is
> the key point I made:
>

I agree that the general function shippability is another issue to solve.

>
> "For a non-FQS'ed DML, the DML is executed on the datanode once for each row
> to be processed. So if a user updates 10 rows with a non-shippable query, the
> coordinator will execute a parameterized remote update query on the datanode
> for each of the 10 ctids found using the quals. And if we execute shippable
> statement triggers on the datanode, the statement trigger will be executed 10
> times on the datanode, which is not what the user expects. That is the reason
> we should always fire statement triggers on the coordinator, regardless of
> anything else."
>
> Let me know if you have any comments on this specific point.
>

This is quite a reasonable approach, I think. A statement trigger should be
fired only once, and because only a coordinator is aware of such statements,
it is quite natural that the statement trigger should be fired on the
coordinator.

Best;
---
Koichi Suzuki

