From: Ashutosh B. <ash...@en...> - 2013-03-01 08:56:51
On Fri, Mar 1, 2013 at 1:53 PM, Nikhil Sontakke <ni...@st...> wrote:

> > Issue: Whether we should fetch the whole row from the datanode (the OLD
> > row) and not just ctid, node_id and the required columns, and store it
> > at the coordinator for the processing, OR whether we should fetch each
> > row (OLD and NEW variants) while processing each row.
> >
> > Both of them have performance impacts - the first one has disk impact
> > for a large number of rows whereas the second has network impact for
> > querying rows. Is it possible to do some analytical assessment as to
> > which of them would be better? If you can come up with something
> > concrete (maybe numbers or formulae) we will be able to judge better as
> > to which one to pick up.
>
> Or we can consider a hybrid approach of getting the rows in batches of
> 1000 or so if possible as well. That way they get into coordinator
> memory in one shot and can be processed in batches. Obviously this
> should be considered only if it's not going to be a complicated
> implementation.
>
> >> Currently we fetch all attributes in the SELECT subplans. I have
> >> created another patch to fetch only the required attributes, but have
> >> not merged that into this patch.
>
> Do we have other places where we unnecessarily fetch all attributes?
> ISTM, this should be fixed as a performance improvement first, ahead of
> everything else.
>
> >> 2. One important TODO for the BEFORE trigger is this: Just before
> >> invoking the trigger functions, in PG, the tuple is row-locked
> >> (exclusive) by GetTupleForTrigger() and the locked version is fetched
> >> from the table. So it is made sure that while all the triggers for
> >> that table are executed, no one can update that particular row.
> >> In the patch, we haven't locked the row. We need to lock it either by
> >> executing:
> >> 1. SELECT * FROM tab1 WHERE ctid = <ctid_val> FOR UPDATE, and then
> >> use the returned row as the OLD row.
> >> OR
> >> 2. The UPDATE subplan itself should have SELECT ... FOR UPDATE so that
> >> the row is already locked, and we don't have to lock it again.
> >> #2 is simple, though it might cause somewhat longer waits in general.
> >> Using #1, though the locks would be acquired only when the particular
> >> row is updated, the locks would be released only after transaction
> >> end, so #1 might not be worth implementing.
> >> Also, #1 requires another explicit remote fetch for the
> >> lock-and-get-latest-version operation.
> >> I am more inclined towards #2.
> >
> > Option #2, however, has the problem of locking too many rows if there
> > are coordinator quals in the subplans, IOW the number of rows finally
> > updated is smaller than the number of rows fetched from the datanode.
> > It can cause unwanted deadlocks. Unless there is a way to release these
> > extra locks, I am afraid this option will be a problem.
>
> Deadlocks? ISTM, we can get more lock waits because of this, but I do
> not see deadlock scenarios..

:) The simple rule is: if the DBMS takes transaction-long locks that the
application doesn't expect, there are bound to be deadlocks -
application-induced deadlocks. An application would assume that statement A
does not lock row rA because rA is not being updated, but under option #2
it does get locked FOR UPDATE. The same application would assume that
statement B does not lock row rB the same way.

With this context, consider the following sequence of events:

1. Statement A updates row rB.
2. Statement B updates row rA.
3. Statement A tries to update row rM, but XC tries to lock rA (because of
   option #2) and waits.
4. Statement B tries to update row rN, but XC tries to lock rB (because of
   option #2) and waits.

Neither A nor B can proceed, and thus we have a deadlock, even though the
application doesn't expect those statements to deadlock.
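To make this concrete, here is a minimal two-session sketch. The table, the
data and the pick_row() qual are made up purely for illustration, and it
assumes that under option #2 the shippable part of the scan (grp = 0) runs
FOR UPDATE on the datanodes while the volatile qual is evaluated on the
coordinator, so the scan locks more rows than the statement finally updates:

    -- setup (hypothetical table and data)
    CREATE TABLE t (id int PRIMARY KEY, val int, grp int)
        DISTRIBUTE BY HASH (id);
    INSERT INTO t VALUES (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0);

    -- a volatile qual, so it is evaluated on the coordinator, not shipped
    CREATE FUNCTION pick_row(i int, target int) RETURNS bool
        AS $$ SELECT $1 = $2 $$ LANGUAGE sql VOLATILE;

    -- session A
    BEGIN;
    UPDATE t SET val = val + 1 WHERE id = 2;        -- updates and locks rB

    -- session B
    BEGIN;
    UPDATE t SET val = val + 1 WHERE id = 1;        -- updates and locks rA

    -- session A: only id = 3 (rM) satisfies the coordinator qual, but the
    -- grp = 0 scan runs FOR UPDATE under option #2, so it also tries to
    -- lock id = 1 (rA) and waits on session B
    UPDATE t SET val = val + 1 WHERE grp = 0 AND pick_row(id, 3);

    -- session B: only id = 4 (rN) qualifies, but its scan blocks on a row
    -- already locked by session A (rB) -> neither session can proceed
    UPDATE t SET val = val + 1 WHERE grp = 0 AND pick_row(id, 4);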
> With the FQS shipping work being done by Ashutosh, will we also ship
> major chunks of subplans to the datanodes? If yes, then row locking
> will only involve required tuples (hopefully) from the coordinator's
> point of view.

The push-down will work only when there are shippable subplans, but if
they are not ...

> Also, something radical: can we invent a new type of FOR [NODE] UPDATE
> lock to minimize the impact of such locking of rows on datanodes?
>
> Regards,
> Nikhils
>
> >> 3. The BEFORE trigger function can change the distribution column
> >> itself. We need to add a check at the end of the trigger executions.
> >
> > Good, you thought about that. Yes, we should check it.
>
> >> 4. Fetching OLD row for WHEN clause handling.
> >>
> >> 5. Testing with a mix of shippable and non-shippable ROW triggers.
> >>
> >> 6. Other types of triggers. INSTEAD triggers are anticipated to work
> >> without significant changes, but they are yet to be tested.
> >> INSERT/DELETE triggers: most of the infrastructure has been done while
> >> implementing UPDATE triggers, but some changes specific to INSERT and
> >> DELETE are yet to be done.
> >> Deferred triggers are yet to be tested.
> >>
> >> 7. Regression analysis. There are some new failures. Will post another
> >> fair version of the patch after regression analysis and fixing various
> >> TODOs.
> >>
> >> Comments welcome.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company
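PS: On the idea of fetching the rows in batches of ~1000: from the
coordinator's side that is essentially a cursor over the row-fetching query.
A rough sketch of the shape of it (table, cursor and qual are made up here;
whether our subplan machinery can be driven this way without much surgery is
of course the real question):

    BEGIN;
    DECLARE old_rows CURSOR FOR
        SELECT ctid, xc_node_id, * FROM tab1 WHERE val > 0;
    -- loop until FETCH returns no rows:
    FETCH 1000 FROM old_rows;   -- one batch lands in coordinator memory
    --   fire the BEFORE ROW triggers for the batch, then apply the UPDATEs
    --   using the saved (ctid, xc_node_id) pairs
    CLOSE old_rows;
    COMMIT;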
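PS: On point 3 (a BEFORE trigger changing the distribution column), a trivial
made-up example of the hazard the check has to catch, assuming a table
distributed by hash on dept_id:

    CREATE TABLE emp (id int, name text, dept_id int)
        DISTRIBUTE BY HASH (dept_id);

    CREATE FUNCTION bump_dept() RETURNS trigger AS $$
    BEGIN
        NEW.dept_id := NEW.dept_id + 1;   -- rewrites the distribution column
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER emp_before_upd BEFORE UPDATE ON emp
        FOR EACH ROW EXECUTE PROCEDURE bump_dept();

After the trigger functions have run, NEW may hash to a different node than
the OLD tuple being updated in place, so the check at the end of the trigger
executions has to catch this; what we then do about it (error out, or
re-route the row) is a separate question.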