From: Amit K. <ami...@en...> - 2013-04-19 11:33:25
|
On 17 April 2013 16:46, Ashutosh Bapat <ash...@en...>wrote: > Hi Amit, > Thanks for completing this tedius work. It's pretty complicated. > > As I understand it right, the patch deals with following things > > For after row triggers, PG stores the fireable triggers as events with > ctid of the row as pointer to the row on which the event should be carried > out. For INSERT and DELETE there is only one ctid viz. new or old resp. For > UPDATE, ctids of both the new and old row are stored. For some reason (I am > not clear about the reasons) after row triggers are fired after queueing > the events and thus we need some storage on coordinator to store the > affected tuples, so that they can be retrieved while firing the triggers. > We do not save the entire tuple in the trigger event, to save memory if > there are multiple events that need the same tuple. In PG, ctid of the row > suffices to fetch the row from the heap, which acts as the storage itself. > In XC, however, we need some storage to store the tuples to be fed to > trigger events and need a pointer for each row stored. This pointer will be > saved in the trigger event, and will be used to fetch the row. Your patch > uses two tuplestores to store old and new rows resp. For Update we will use > both the tuplestores, but for INSERT we will use only one of them. > > Here are my comments > 1. As I understand, the tuplestore has a different kind of pointer than > ctid and thus you have created a union in Trigger event structure. Can we > use hash based storage instead of tuplestore? The hash based storage will > have advantages like a. existing ctid, nodeoid combination can be used as > the key in hash store, thus not requiring any union (but will need to add > an OID member). The same ItemPointer structure can be then used, instead of > creating prototypes for XC. b. Hash is ideally a random access storage > unlike tuplestore, which needs some kind of sequential access. c. At places > there is code to first get a pointer in Tuplestore before actually adding > the row, which complicates the code. Hash storage will not have this > problem since the key is independent of the position in hash storage. > > 2. Using two separate tuplestores for new and old tuples is waste of > memory. A tuplestore allocates 1K memory by default, thus having two tuple > stores requires double the amount of memory. If in worst case, the > tuplestore overflows to disk, we will have two files created on file > system, causing random sequential writes on disk, which will affect the > performance. This will mean that the same row pointer can not be used for > OLD and NEW, but that should be fine, as PG itself doesn't need that > condition. > > 3. The tuple store management code is too much tied with the static > structures in trigger.c. We need to isolate this code in a separate file, > so that this approach can be used for other features like constraints if > required. Please separate this code into a separate file with well defined > interfaces like function to add a row to storage, get its pointer, fetch > the row from storage, delete the row from storage (?), destroy the storage > etc. and use them for trigger functionality. In the same file, we need a > prologue describing the need of these interfaces and description of the > interfaces itself. In fact, if this infrastructure is also needed in PG, we > should put it in PG. > > 4. While using two tuplestores we have hardcoded the tuplestore indices as > 0 and 1. Instead of that can we use some macros OR even better use > different variables for both of them. Same goes for all 2 sized arrays that > are defined for the same purpose. > > 5. Please look at the current trigger regression tests. If they do not > cover all the possible test scenarios please add them in the regression. > Testing all the scenarios (various combinations of type triggers, DMLs) is > critical here. > > If you find that the current implementation is working fine, all the above > points can be taken up later after the 1.1 release. The testing can be take > up between beta and GA, and others can be taken up in next release. But > it's important to at least study these approaches. > Thanks Ashutosh for the valuable comments and for your patience in reviewing this work. I will come back to the above points once I commit the trigger support. > Some specific comments > 1. In function pgxc_ar_init_rowstore(), we have used palloc0 + memcpy + > pfree() instead of repalloc + memzero new entries. Repalloc allows to > extend the existing memory allocation without moving the contents (if > possible) and has advantage that it wouldn't fail if sum of allocated > memory and required memory is greater than available memory but required > memory is less than the available memory. So, it's always advantageous to > use Repalloc. Why haven't we used repalloc here? > > I had thought palloc0 + pfree would be simpler, but repalloc code turned out to be as simple. I have changed the code to use repalloc. Thanks. > 2. Can we extend pgxc_ar_goto_end(), to be goto_anywhere function, where > end is a special position. E.g pgxc_ar_goto(ARTupInfo, Pos), where Pos can > be a valid index OR a special position END. pgxc_ar_goto(ARTupInfo, END) > would act as pgxc_ar_goto_end() and pgxc_ar_goto(ARTupInfo. Pos != END) > would replace the tuplestore advance loop in pgxc_ar_dofetch(). The > function may accept a flag backwards if that is required. > So suppose in pgxc_ar_dofetch() I extract first part of scanning the tuplestore upto the fetchpos into a new function pgxc_ar_goto(). And then continue the actual fetch part in the same function pgxc_ar_dofetch(). Then in pgxc_ar_goto() we would have to special-case this goto_end() functionality. The usage of advance_by counter to get to the required position cannot be used. Because for going to the end, we need tuplestore_eof() condition and that condition cannot be mixed up in that code. Also, the advance-by related code has to be skipped. So, since anyways we have two different codes for two scenarios, I think it is better to keep a different function pgxc_ar_gotoend() for the scan-upto-end scenario. > > On Mon, Apr 15, 2013 at 9:54 AM, Amit Khandekar < > ami...@en...> wrote: > >> >>> On Fri, Apr 5, 2013 at 2:38 PM, Amit Khandekar < >>> ami...@en...> wrote: >>> >>>> FYI .. I will use the following document to keep updating the >>>> implementation details for "Saving AR trigger rows in tuplestore" : >>>> >>>> >>>> https://docs.google.com/document/d/158IPS9npmfNsOWPN6ZYgPy91aowTUNP7L7Fl9zBBGqs/edit?usp=sharing >>>> >>> >> Attached is the patch to support after-row triggers. The above doc is >> updated. Yet to analyse the regression tests. The attached test.sql is the >> one I used for unit testing, it is not yet ready to be inserted into >> regression suite. I will be working next on the regression and Ashutosh's >> comments on before-row triggers >> >> Also I haven't yet rebased the rowtriggers branch over the new >> merge-related changes in the master branch. This patch is over the >> rowtriggers branch; I did not push this patch onto the rowtriggers branch >> as well, although I intended to do it, but suspected of some possible >> issues if I push the rowtriggers branch after the recent merge-related >> changes going on in the repository. First I will rebase all the rowtriggers >> branch changes onto the new master branch. >> >> >> >> >> >> >>>> >>>> >>>> >>>> >>>> ------------------------------------------------------------------------------ >>>> Minimize network downtime and maximize team effectiveness. >>>> Reduce network management and security costs.Learn how to hire >>>> the most talented Cisco Certified professionals. Visit the >>>> Employer Resources Portal >>>> http://www.cisco.com/web/learning/employer_resources/index.html >>>> _______________________________________________ >>>> Postgres-xc-developers mailing list >>>> Pos...@li... >>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>>> >>>> >>> >>> >>> -- >>> Pavan Deolasee >>> http://www.linkedin.com/in/pavandeolasee >>> >> >> >> >> ------------------------------------------------------------------------------ >> Precog is a next-generation analytics platform capable of advanced >> analytics on semi-structured data. The platform includes APIs for building >> apps and a phenomenal toolset for data science. Developers can use >> our toolset for easy data analysis & visualization. Get a free account! >> http://www2.precog.com/precogplatform/slashdotnewsletter >> _______________________________________________ >> Postgres-xc-developers mailing list >> Pos...@li... >> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >> >> > > > -- > Best Wishes, > Ashutosh Bapat > EntepriseDB Corporation > The Enterprise Postgres Company > |