From: Ashutosh B. <ash...@en...> - 2013-04-17 11:16:57
|
Hi Amit, Thanks for completing this tedius work. It's pretty complicated. As I understand it right, the patch deals with following things For after row triggers, PG stores the fireable triggers as events with ctid of the row as pointer to the row on which the event should be carried out. For INSERT and DELETE there is only one ctid viz. new or old resp. For UPDATE, ctids of both the new and old row are stored. For some reason (I am not clear about the reasons) after row triggers are fired after queueing the events and thus we need some storage on coordinator to store the affected tuples, so that they can be retrieved while firing the triggers. We do not save the entire tuple in the trigger event, to save memory if there are multiple events that need the same tuple. In PG, ctid of the row suffices to fetch the row from the heap, which acts as the storage itself. In XC, however, we need some storage to store the tuples to be fed to trigger events and need a pointer for each row stored. This pointer will be saved in the trigger event, and will be used to fetch the row. Your patch uses two tuplestores to store old and new rows resp. For Update we will use both the tuplestores, but for INSERT we will use only one of them. Here are my comments 1. As I understand, the tuplestore has a different kind of pointer than ctid and thus you have created a union in Trigger event structure. Can we use hash based storage instead of tuplestore? The hash based storage will have advantages like a. existing ctid, nodeoid combination can be used as the key in hash store, thus not requiring any union (but will need to add an OID member). The same ItemPointer structure can be then used, instead of creating prototypes for XC. b. Hash is ideally a random access storage unlike tuplestore, which needs some kind of sequential access. c. At places there is code to first get a pointer in Tuplestore before actually adding the row, which complicates the code. Hash storage will not have this problem since the key is independent of the position in hash storage. 2. Using two separate tuplestores for new and old tuples is waste of memory. A tuplestore allocates 1K memory by default, thus having two tuple stores requires double the amount of memory. If in worst case, the tuplestore overflows to disk, we will have two files created on file system, causing random sequential writes on disk, which will affect the performance. This will mean that the same row pointer can not be used for OLD and NEW, but that should be fine, as PG itself doesn't need that condition. 3. The tuple store management code is too much tied with the static structures in trigger.c. We need to isolate this code in a separate file, so that this approach can be used for other features like constraints if required. Please separate this code into a separate file with well defined interfaces like function to add a row to storage, get its pointer, fetch the row from storage, delete the row from storage (?), destroy the storage etc. and use them for trigger functionality. In the same file, we need a prologue describing the need of these interfaces and description of the interfaces itself. In fact, if this infrastructure is also needed in PG, we should put it in PG. 4. While using two tuplestores we have hardcoded the tuplestore indices as 0 and 1. Instead of that can we use some macros OR even better use different variables for both of them. Same goes for all 2 sized arrays that are defined for the same purpose. 5. Please look at the current trigger regression tests. If they do not cover all the possible test scenarios please add them in the regression. Testing all the scenarios (various combinations of type triggers, DMLs) is critical here. If you find that the current implementation is working fine, all the above points can be taken up later after the 1.1 release. The testing can be take up between beta and GA, and others can be taken up in next release. But it's important to at least study these approaches. Some specific comments 1. In function pgxc_ar_init_rowstore(), we have used palloc0 + memcpy + pfree() instead of repalloc + memzero new entries. Repalloc allows to extend the existing memory allocation without moving the contents (if possible) and has advantage that it wouldn't fail if sum of allocated memory and required memory is greater than available memory but required memory is less than the available memory. So, it's always advantageous to use Repalloc. Why haven't we used repalloc here? 2. Can we extend pgxc_ar_goto_end(), to be goto_anywhere function, where end is a special position. E.g pgxc_ar_goto(ARTupInfo, Pos), where Pos can be a valid index OR a special position END. pgxc_ar_goto(ARTupInfo, END) would act as pgxc_ar_goto_end() and pgxc_ar_goto(ARTupInfo. Pos != END) would replace the tuplestore advance loop in pgxc_ar_dofetch(). The function may accept a flag backwards if that is required. Huh, done at last! On Mon, Apr 15, 2013 at 9:54 AM, Amit Khandekar < ami...@en...> wrote: > >> On Fri, Apr 5, 2013 at 2:38 PM, Amit Khandekar < >> ami...@en...> wrote: >> >>> FYI .. I will use the following document to keep updating the >>> implementation details for "Saving AR trigger rows in tuplestore" : >>> >>> >>> https://docs.google.com/document/d/158IPS9npmfNsOWPN6ZYgPy91aowTUNP7L7Fl9zBBGqs/edit?usp=sharing >>> >> > Attached is the patch to support after-row triggers. The above doc is > updated. Yet to analyse the regression tests. The attached test.sql is the > one I used for unit testing, it is not yet ready to be inserted into > regression suite. I will be working next on the regression and Ashutosh's > comments on before-row triggers > > Also I haven't yet rebased the rowtriggers branch over the new > merge-related changes in the master branch. This patch is over the > rowtriggers branch; I did not push this patch onto the rowtriggers branch > as well, although I intended to do it, but suspected of some possible > issues if I push the rowtriggers branch after the recent merge-related > changes going on in the repository. First I will rebase all the rowtriggers > branch changes onto the new master branch. > > > > > > >>> >>> >>> >>> >>> ------------------------------------------------------------------------------ >>> Minimize network downtime and maximize team effectiveness. >>> Reduce network management and security costs.Learn how to hire >>> the most talented Cisco Certified professionals. Visit the >>> Employer Resources Portal >>> http://www.cisco.com/web/learning/employer_resources/index.html >>> _______________________________________________ >>> Postgres-xc-developers mailing list >>> Pos...@li... >>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>> >>> >> >> >> -- >> Pavan Deolasee >> http://www.linkedin.com/in/pavandeolasee >> > > > > ------------------------------------------------------------------------------ > Precog is a next-generation analytics platform capable of advanced > analytics on semi-structured data. The platform includes APIs for building > apps and a phenomenal toolset for data science. Developers can use > our toolset for easy data analysis & visualization. Get a free account! > http://www2.precog.com/precogplatform/slashdotnewsletter > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > > -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Enterprise Postgres Company |