From: Amit K. <ami...@en...> - 2013-05-10 05:04:22
|
We can consider applying the usual row-trigger shippability rules to statement triggers: if the trigger function is shippable, execute the trigger on the datanode, else on the coordinator.

It is not as trivial as it sounds, though. For a non-FQS'ed DML, a DML is executed on the datanode for each row to be processed. So if a user updates 10 rows with a non-shippable query, the coordinator will execute a parameterized remote update query on the datanode for each of the 10 ctids found using the quals. And if we execute shippable statement triggers on the datanode, the statement trigger will be executed 10 times on the datanode. Is this what the user expects?

From the user's perspective, the statement is executed once, so the statement trigger should be fired only once. A typical use case is that user queries need to be logged/audited. So we need to prevent firing statement triggers on the datanode for a non-FQS'ed query. But should the user define the stmt trigger function as immutable in such a case? Maybe not in this auditing scenario. But it is not very clear what a shippable statement trigger would mean to the user exactly. If the function is really one that does not access the database, as per the immutable definition, then it does not matter anyway how many times it gets executed on the datanode for a given statement.

I think the solution is to *always* fire statement triggers on the *coordinator*, regardless of shippability or whether the query is FQS or non-FQS. For an FQS query, we need to explicitly fire the stmt trigger before/after the FQS'ed query node is executed, maybe inside the ExecRemoteQuery() function itself.

Comments? |
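The firing-count argument above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `Node`, `run_non_fqs_update` and `count_firings` are hypothetical names invented for this illustration, not Postgres-XC code, and the model only counts trigger firings rather than executing any SQL.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical toy node that only counts statement-trigger firings."""
    name: str
    stmt_trigger_fired: int = 0

    def fire_statement_trigger(self) -> None:
        self.stmt_trigger_fired += 1


def run_non_fqs_update(ctids, coordinator, datanode, trigger_on_datanode):
    """Model a non-FQS'ed UPDATE: one parameterized remote update per ctid."""
    if not trigger_on_datanode:
        # The coordinator sees the user's statement exactly once.
        coordinator.fire_statement_trigger()
    for _ctid in ctids:
        # From the datanode's point of view, each remote update is its own
        # statement, so a shipped statement trigger fires once per row.
        if trigger_on_datanode:
            datanode.fire_statement_trigger()


def count_firings(trigger_on_datanode, n_rows=10):
    """Return (coordinator firings, datanode firings) for an n_rows update."""
    coordinator, datanode = Node("coordinator"), Node("datanode")
    ctids = [(0, i) for i in range(1, n_rows + 1)]
    run_non_fqs_update(ctids, coordinator, datanode, trigger_on_datanode)
    return coordinator.stmt_trigger_fired, datanode.stmt_trigger_fired
```

In this model, `count_firings(True)` gives `(0, 10)` (the shipped trigger fires once per row), while `count_firings(False)` gives `(1, 0)`, which matches the single firing the user would expect.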
From: Ashutosh B. <ash...@en...> - 2013-05-10 05:59:15
|
Hi Amit,

This looks fine from a correctness perspective and may be OK for this release. I hope we can still FQS the query and not ship the triggers. How would it affect performance?

Michael, can you comment on this, since you are the one who implemented statement triggers?

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company |
From: Amit K. <ami...@en...> - 2013-05-10 06:07:21
|
On 10 May 2013 11:29, Ashutosh Bapat <ash...@en...> wrote:

> Hi Amit,
> This looks fine from correctness perspective and may be ok for this
> release. I hope, we can still FQS the query and not ship the triggers.

Yes, that's the idea. We handle statement triggers explicitly on the coordinator for FQS. We define a resultRelInfo in RemoteQuery if the query is FQS, and then call the BS (before-statement) and AS (after-statement) triggers in ExecInitRemoteQuery() and ExecEndRemoteQuery() if resultRelInfo is set.

> How would it affect performance?

Since the query would still be FQSed even with a non-shippable stmt trigger, this does not impact performance. The stmt trigger would be fired only once. |
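The control flow described in this reply can be sketched as follows. This is a toy Python model, not the actual C executor code: the class and function names below only mirror the identifiers mentioned in the message (RemoteQuery, resultRelInfo, ExecInitRemoteQuery(), ExecEndRemoteQuery()), and the `log` field and `run` helper are assumptions added purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ResultRelInfo:
    """Hypothetical stand-in for the result relation of a DML."""
    relation: str


@dataclass
class RemoteQuery:
    """Hypothetical stand-in for an FQS'ed remote query node."""
    sql: str
    result_rel_info: Optional[ResultRelInfo] = None  # set only for FQS'ed DML
    log: List[str] = field(default_factory=list)


def exec_init_remote_query(node: RemoteQuery) -> None:
    # Fire the before-statement trigger once, on the coordinator.
    if node.result_rel_info is not None:
        node.log.append(f"BS trigger on coordinator: {node.result_rel_info.relation}")


def exec_remote_query(node: RemoteQuery) -> None:
    # The query itself is still shipped whole (FQS), without the triggers.
    node.log.append(f"ship to datanode: {node.sql}")


def exec_end_remote_query(node: RemoteQuery) -> None:
    # Fire the after-statement trigger once, on the coordinator.
    if node.result_rel_info is not None:
        node.log.append(f"AS trigger on coordinator: {node.result_rel_info.relation}")


def run(node: RemoteQuery) -> List[str]:
    """Run the three phases in order and return the event log."""
    exec_init_remote_query(node)
    exec_remote_query(node)
    exec_end_remote_query(node)
    return node.log
```

For example, `run(RemoteQuery("UPDATE t SET a = 1", ResultRelInfo("t")))` logs the BS trigger, the shipped query, and the AS trigger, each exactly once; a node without `result_rel_info` only logs the shipped query.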
From: Koichi S. <koi...@gm...> - 2013-05-10 10:30:30
|
I think this is a reasonable point to start with.

However, in general, TRIGGER is an expensive action, so I believe we should provide more sophisticated optimization of TRIGGER actions. I understand this is not an issue of TRIGGER. We need to look into the rewriter, function shippability (the current shippability is safe but not optimal; I'd like to discuss this shortly), and statement shippability, by which I mean partial statement shippability.

I understand that trigger actions need careful and sophisticated design and have a complicated relationship with other components of query processing.

Best;
----------
Koichi Suzuki |
From: Amit K. <ami...@en...> - 2013-05-10 12:10:02
|
On 10 May 2013 16:00, Koichi Suzuki <koi...@gm...> wrote:

> I think this is a reasonable point to start with.
> However, in general, TRIGGER is an expensive action so I believe we should
> provide more sophisticated optimization of TRIGGER action. I understand
> this is not an issue of TRIGGER. We need to look into rewriter, function
> shippability (current shippability is safe but not optimal -- I'd like to
> discuss this shortly), and statement shippability, I mean partial statement
> shippability.

That's a long-standing issue that we have been discussing: what are the various ways in which a user can make trigger functions execute safely on the datanode. Actually, the performance impact is particularly critical for row triggers, not so much for statement triggers, since they execute only once.

> I understand trigger action needs careful and sophisticated design and has
> complicated relationship with other component of query processing. |
From: Michael P. <mic...@gm...> - 2013-05-10 12:29:25
|
On Fri, May 10, 2013 at 2:59 PM, Ashutosh Bapat <ash...@en...> wrote:

> Can you comment on this, since you are the one who implemented statement
> triggers?

About the shippability of triggers, here is some food for thought:
http://michael.otacoo.com/postgresql-2/triggers-in-a-cluster-database/

In short, a trigger can be fired on a remote node only if:
- the query that triggered it is FQSed
- the fired procedure is immutable. There might be cases where a trigger with a stable procedure could be shippable, but this would be dangerous...

There are already some APIs I implemented that you can use if they are not used already (in trigger.c or somewhere in the shippability APIs of the optimizer; I don't recall precisely).

Just knowing that 99% of triggers do not use immutable functions is enough to fire all the triggers on Coordinators. You will get bad performance for row triggers, but if you don't do that, data consistency is badly endangered. However, for correctness you should leave the door open to immutable triggers being shippable.
--
Michael |
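The two-condition rule stated in this message can be written down as a small predicate. This is a sketch only: `Volatility`, `trigger_is_shippable` and `where_to_fire` are hypothetical names for illustration and are not the existing Postgres-XC shippability APIs mentioned above. The enum values follow the `provolatile` codes used in PostgreSQL's `pg_proc` catalog.

```python
from enum import Enum


class Volatility(Enum):
    """Function volatility classes (values match pg_proc.provolatile)."""
    IMMUTABLE = "i"
    STABLE = "s"
    VOLATILE = "v"


def trigger_is_shippable(query_is_fqs: bool, volatility: Volatility) -> bool:
    """Both conditions from the message must hold for remote firing.

    STABLE is deliberately treated as non-shippable, following the
    "this would be dangerous" remark.
    """
    return query_is_fqs and volatility is Volatility.IMMUTABLE


def where_to_fire(query_is_fqs: bool, volatility: Volatility) -> str:
    """Return the node type on which the trigger would be fired."""
    if trigger_is_shippable(query_is_fqs, volatility):
        return "datanode"
    return "coordinator"
```

Under this rule only the combination of an FQSed query and an immutable procedure lands on the datanode; every other combination falls back to the coordinator, which is the conservative default argued for here.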
From: Ashutosh B. <ash...@en...> - 2013-05-10 12:31:25
|
On Fri, May 10, 2013 at 5:59 PM, Michael Paquier <mic...@gm...> wrote:

> In short, a trigger can be fired on a remote node only if:
> - the query that triggered it is FQSed

That's busted, and hence this whole mail thread.

> - the fired procedure is immutable. There might be cases where a trigger
> with a stable procedure could be shippable but this would be dangerous...
> There are already some APIs I implemented that you can use if they are not
> used already (trigger.c or something in the shippability APIs of the
> optimizer I don't recall precisely).

Unfortunately, those APIs were found to be heavily erroneous, and Amit is currently working on writing correct APIs.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company |
From: Amit K. <ami...@en...> - 2013-05-10 13:01:01
|
On 10 May 2013 17:59, Michael Paquier <mic...@gm...> wrote:

> In short, a trigger can be fired on a remote node only if:
> - the query that triggered it is FQSed
> - the fired procedure is immutable. There might be cases where a trigger
> with a stable procedure could be shippable but this would be dangerous...
>
> Just knowing that 99% of triggers do not use immutable functions is enough
> to shoot all the triggers on Coordinators, you will be bad performance for
> row triggers but if you don't do that data consistency is badly endangered.

Actually, I don't want to go into the general shippability of statements in a trigger context; that is another issue with a bigger scope. My point in this mail thread concerns this specific case: what to do about statement triggers, and whether we should indeed run *statement* triggers on the datanode on the basis that the trigger function is safe to run there. Here is the key point I made:

"For a non-FQS'ed DML, a DML is executed on the datanode for each row to be processed. So if a user updates 10 rows with a non-shippable query, the coordinator will execute a parameterized remote update query on the datanode for each of the 10 ctids found using the quals. And if we execute shippable statement triggers on the datanode, the statement trigger will be executed 10 times on the datanode, which is not what the user expects. That's the reason we should always fire statement triggers on the coordinator, regardless of anything else."

Let me know if you have any comments on this specific point. |
From: Koichi S. <koi...@gm...> - 2013-05-10 15:38:37
|
2013/5/10 Amit Khandekar <ami...@en...>

> Actually I don't want to go into general shippability of statements in
> trigger context; that is another issue having a bigger scope. My point for
> this mail thread is for this specific case: What to do about statement
> triggers, whether we should indeed run *statement* triggers on datanode on
> the basis that the trigger function is safe to be run on datanode.

I agree that general function shippability is another issue to solve.

> "For a non-FQS'ed DML, a DML is executed on datanode for each row to be
> processed. So if a user updates 10 rows with a non-shippable query, the
> coordinator will execute a parameterized remote update query on datanode
> for each of the 10 ctids found using the quals. And if we execute shippable
> statement triggers on datanode, the statement trigger will be executed 10
> times on datanode, which is not expected from the user. That's the reason
> we should fire statement triggers always on coordinator regardless of
> anything"
>
> Let me know if you have any comments on this specific point.

This is quite a reasonable approach, I think. A statement trigger should be fired only once, and because only a coordinator is aware of such statements, it is quite natural that the statement trigger should be fired on the coordinator.

Best;
---
Koichi Suzuki |