From: Ashutosh B. <ash...@en...> - 2013-03-01 07:39:39
Hi Amit,

Your write-up summarizes the requirements well; thanks for it. Let me comment on the open issues below.

Issue: whether we should fetch the whole OLD row from the datanode (not just the ctid, node_id and required columns) and store it at the coordinator for processing, OR fetch each row (OLD and NEW variants) while processing each row. Both have performance impacts: the first has disk impact for a large number of rows, whereas the second has network impact for querying rows. Is it possible to do some analytical assessment of which of them would be better? If you can come up with something concrete (maybe numbers or formulae), we will be able to judge better which one to pick.

See a few other comments inline.

On Tue, Feb 19, 2013 at 10:08 AM, Amit Khandekar <ami...@en...> wrote:

> Below are details covering the various points that need to be
> considered for the implementation of BEFORE UPDATE and AFTER UPDATE row
> triggers. Because UPDATE covers issues that might occur for both
> DELETE and INSERT, I have targeted UPDATE first. Attached is
> the WIP patch if you need to peek into the specifics.
>
> AFTER triggers
> --------------
>
> AFTER triggers are executed only at the end of the statement/transaction.
> The trigger data is saved in a trigger event queue along with the OLD
> and NEW ctids. Just before the query/transaction end, the OLD and
> NEW tuples corresponding to the OLD and NEW ctids are fetched using
> heap_fetch(SnapshotAny), and each of the saved trigger events is
> executed. For XC (in the patch), we fetch the tuple remotely from the
> datanode, and in addition to the ctid we store the xc_node_id in
> the trigger event data. I have created a function to do the remote fetch;
> this function is called instead of heap_fetch().
>
> To get the NEW row's ctid, we add the junk attributes ctid and node_id
> to the RETURNING target list of the UPDATE.
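
As a rough sketch of the rewrites described above (the table tab1, its columns key and val, and the WHERE clause are hypothetical; only the ctid/xc_node_id junk attributes come from the design):

```sql
-- NEW row's ctid/node_id come back via the UPDATE's RETURNING list:
UPDATE tab1 SET val = val + 1 WHERE key = 42
RETURNING ctid, xc_node_id;

-- OLD row's ctid/node_id come from the SELECT subplan:
SELECT ctid, xc_node_id FROM tab1 WHERE key = 42;
```
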
> To get the OLD row's ctid, we already have the ctid in the SELECT subplan.
>
> Currently we fetch all attributes in the SELECT subplans. I have
> created another patch to fetch only the required attributes, but have
> not merged it into this patch. Ultimately, when triggers are
> present, we want the SELECT to look like this:
> SELECT ctid, xc_node_id FROM tab1
>
> Another way to do the AFTER ROW triggers would be to store all the NEW
> and OLD rows in coordinator memory instead of fetching them when
> needed. This has the potential of disk writes for a large number of
> updated rows.
>
> When there is a WHEN condition on the OLD and NEW row for AFTER
> triggers, the WHEN condition is evaluated immediately rather than
> deferred until the query/transaction end, and the triggers that
> evaluate to false are not added to the event queue. So if WHEN is
> present, we need to make the OLD row available by fetching it in the
> SELECT subplan. For this we add another junk attribute, wholerow, to
> the SELECT target list. In that case, the SELECT will look like this:
> SELECT ctid, xc_node_id, tab1.*::tab1 FROM tab1
>
>
> BEFORE triggers
> ---------------
>
> These are executed just before the row is inserted/updated. There can
> be multiple triggers on the same table. The ROW returned by one
> trigger function becomes the NEW row of the next trigger, while the
> OLD row remains the same across all of the triggers. So the NEW row
> can keep changing, because each trigger function can modify the NEW
> row before returning it. Ultimately, the last NEW row is the one that
> gets into the table.
>
> We need to add the wholerow junk attribute to the SELECT for fetching
> the OLD row. The NEW row is then created by overwriting the OLD row
> values with the modified column values. Once we have the OLD and NEW
> tuples, the rest of the work is taken care of by the existing PG
> trigger implementation.
>
> We have been using the subplan slot to generate the data for BINDing
> the parameter values of the UPDATE.
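
The NEW-row chaining described above can be illustrated with a plain-SQL sketch (the table, function and trigger names are hypothetical; this shows standard PG row-trigger behavior, which is what XC must reproduce):

```sql
-- Two BEFORE UPDATE row triggers on the same table: the NEW row
-- returned by the first trigger becomes the NEW row seen by the
-- second, and the row returned by the second is what gets stored.
CREATE FUNCTION bump_a() RETURNS trigger AS $$
BEGIN
    NEW.val := NEW.val + 1;   -- modify NEW before returning it
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE FUNCTION bump_b() RETURNS trigger AS $$
BEGIN
    NEW.val := NEW.val * 10;  -- sees the NEW produced by bump_a
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Triggers on the same event fire in name order, so trig_a runs first.
CREATE TRIGGER trig_a BEFORE UPDATE ON tab1
    FOR EACH ROW EXECUTE PROCEDURE bump_a();
CREATE TRIGGER trig_b BEFORE UPDATE ON tab1
    FOR EACH ROW EXECUTE PROCEDURE bump_b();
```
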
> Now, for BEFORE ROW triggers, the data to be BOUND is that of the NEW
> row. This NEW tuple slot does not have the ctid and node_id junk
> attribute values, so we need to generate the BIND data using the data
> from the NEW row plus the junk attributes from the source data.
> Similarly, there are changes in the way we generate the parameter
> markers for the remote UPDATE statement.
>
>
> TODOs:
> ======
>
> 1. The current implementation patch is a very basic working model.
> There are various TODOs marked in the patch.
>
> 2. One important TODO for BEFORE triggers: just before invoking the
> trigger functions, PG row-locks the tuple (exclusive) via
> GetTupleForTrigger() and fetches the locked version from the table.
> This ensures that while all the triggers for that table are executed,
> no one else can update that particular row.
> In the patch, we haven't locked the row. We need to lock it either by:
> 1. executing SELECT * FROM tab1 WHERE ctid = <ctid_val> FOR UPDATE,
> and then using the returned row as the OLD row;
> OR
> 2. having the UPDATE subplan itself do SELECT ... FOR UPDATE, so that
> the row is already locked and we don't have to lock it again.
> #2 is simple, though it might cause somewhat longer waits in general.
> With #1, the locks would be acquired only when a particular row is
> actually updated, but they would still be released only at transaction
> end, so #1 might not be worth implementing.
> Also, #1 requires another explicit remote fetch for the
> lock-and-get-latest-version operation.
> I am more inclined towards #2.

Option #2, however, has the problem of locking too many rows when there are coordinator quals in the subplans, IOW when the number of rows finally updated is smaller than the number of rows fetched from the datanode. It can cause unwanted deadlocks. Unless there is a way to release these extra locks, I am afraid this option will be a problem.

> 3.
> The BEFORE trigger function can change the distribution column
> itself. We need to add a check at the end of the trigger executions.

Good, you thought about that. Yes, we should check it.

> 4. Fetching the OLD row for WHEN clause handling.
>
> 5. Testing with a mix of shippable and non-shippable ROW triggers.
>
> 6. Other types of triggers. INSTEAD triggers are anticipated to work
> without significant changes, but they are yet to be tested.
> INSERT/DELETE triggers: most of the infrastructure has been done while
> implementing UPDATE triggers, but some changes specific to INSERT and
> DELETE are yet to be done.
> Deferred triggers are yet to be tested.
>
> 7. Regression analysis. There are some new failures. I will post
> another, more polished version of the patch after regression analysis
> and after fixing the various TODOs.
>
> Comments welcome.
>
> _______________________________________________
> Postgres-xc-developers mailing list
> Pos...@li...
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company