Re: [Postgres-xc-developers] Saving AR trigger rows in tuplestore

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

On 17 April 2013 16:46, Ashutosh Bapat <ash...@en...>wrote:

> Hi Amit,
> Thanks for completing this tedius work. It's pretty complicated.
>
> As I understand it right, the patch deals with following things
>
> For after row triggers, PG stores the fireable triggers as events with
> ctid of the row as pointer to the row on which the event should be carried
> out. For INSERT and DELETE there is only one ctid viz. new or old resp. For
> UPDATE, ctids of both the new and old row are stored. For some reason (I am
> not clear about the reasons) after row triggers are fired after queueing
> the events and thus we need some storage on coordinator to store the
> affected tuples, so that they can be retrieved while firing the triggers.
> We do not save the entire tuple in the trigger event, to save memory if
> there are multiple events that need the same tuple. In PG, ctid of the row
> suffices to fetch the row from the heap, which acts as the storage itself.
> In XC, however, we need some storage to store the tuples to be fed to
> trigger events and need a pointer for each row stored. This pointer will be
> saved in the trigger event, and will be used to fetch the row. Your patch
> uses two tuplestores to store old and new rows resp. For Update we will use
> both the tuplestores, but for INSERT we will use only one of them.
>
> Here are my comments
> 1. As I understand, the tuplestore has a different kind of pointer than
> ctid and thus you have created a union in Trigger event structure. Can we
> use hash based storage instead of tuplestore? The hash based storage will
> have advantages like a. existing ctid, nodeoid combination can be used as
> the key in hash store, thus not requiring any union (but will need to add
> an OID member). The same ItemPointer structure can be then used, instead of
> creating prototypes for XC. b. Hash is ideally a random access storage
> unlike tuplestore, which needs some kind of sequential access. c. At places
> there is code to first get a pointer in Tuplestore before actually adding
> the row, which complicates the code. Hash storage will not have this
> problem since the key is independent of the position in hash storage.
>
> 2. Using two separate tuplestores for new and old tuples is waste of
> memory. A tuplestore allocates 1K memory by default, thus having two tuple
> stores requires double the amount of memory. If in worst case, the
> tuplestore overflows to disk, we will have two files created on file
> system, causing random sequential writes on disk, which will affect the
> performance. This will mean that the same row pointer can not be used for
> OLD and NEW, but that should be fine, as PG itself doesn't need that
> condition.
>
> 3. The tuple store management code is too much tied with the static
> structures in trigger.c. We need to isolate this code in a separate file,
> so that this approach can be used for other features like constraints if
> required. Please separate this code into a separate file with well defined
> interfaces like function to add a row to storage, get its pointer, fetch
> the row from storage, delete the row from storage (?), destroy the storage
> etc. and use them for trigger functionality. In the same file, we need a
> prologue describing the need of these interfaces and description of the
> interfaces itself. In fact, if this infrastructure is also needed in PG, we
> should put it in PG.
>
> 4. While using two tuplestores we have hardcoded the tuplestore indices as
> 0 and 1. Instead of that can we use some macros OR even better use
> different variables for both of them. Same goes for all 2 sized arrays that
> are defined for the same purpose.
>
> 5. Please look at the current trigger regression tests. If they do not
> cover all the possible test scenarios please add them in the regression.
> Testing all the scenarios (various combinations of type triggers, DMLs) is
> critical here.
>
> If you find that the current implementation is working fine, all the above
> points can be taken up later after the 1.1 release. The testing can be take
> up between beta and GA, and others can be taken up in next release. But
> it's important to at least study these approaches.
>

Thanks Ashutosh for the valuable comments and for your patience in
reviewing this work. I will come back to the above points once I commit the
trigger support.

> Some specific comments
> 1. In function pgxc_ar_init_rowstore(), we have used palloc0 + memcpy +
> pfree() instead of repalloc + memzero new entries. Repalloc allows to
> extend the existing memory allocation without moving the contents (if
> possible) and has advantage that it wouldn't fail if sum of allocated
> memory and required memory is greater than available memory but required
> memory is less than the available memory. So, it's always advantageous to
> use Repalloc. Why haven't we used repalloc here?
>
>
I had thought palloc0 + pfree would be simpler, but repalloc code turned
out to be as simple. I have changed the code to use repalloc. Thanks.

> 2. Can we extend pgxc_ar_goto_end(), to be goto_anywhere function, where
> end is a special position. E.g pgxc_ar_goto(ARTupInfo, Pos), where Pos can
> be a valid index OR a special position END. pgxc_ar_goto(ARTupInfo, END)
> would act as pgxc_ar_goto_end() and pgxc_ar_goto(ARTupInfo. Pos != END)
> would replace the tuplestore advance loop in pgxc_ar_dofetch(). The
> function may accept a flag backwards if that is required.
>

So suppose in pgxc_ar_dofetch() I extract first part of scanning the
tuplestore upto the fetchpos into a new function pgxc_ar_goto(). And then
continue the actual fetch part in the same function pgxc_ar_dofetch(). Then
in pgxc_ar_goto() we would have to special-case this goto_end()
functionality. The usage of advance_by counter to get to the required
position cannot be used. Because for going to the end, we need
tuplestore_eof() condition and that condition cannot be mixed up in that
code. Also, the advance-by related code has to be skipped. So, since
anyways we have two different codes for two scenarios, I think it is better
to keep a different function pgxc_ar_gotoend() for the scan-upto-end
scenario.

>
> On Mon, Apr 15, 2013 at 9:54 AM, Amit Khandekar <
> ami...@en...> wrote:
>
>>
>>> On Fri, Apr 5, 2013 at 2:38 PM, Amit Khandekar <
>>> ami...@en...> wrote:
>>>
>>>> FYI .. I will use the following document to keep updating the
>>>> implementation details for "Saving AR trigger rows in tuplestore" :
>>>>
>>>>
>>>> https://docs.google.com/document/d/158IPS9npmfNsOWPN6ZYgPy91aowTUNP7L7Fl9zBBGqs/edit?usp=sharing
>>>>
>>>
>> Attached is the patch to support after-row triggers. The above doc is
>> updated. Yet to analyse the regression tests. The attached test.sql is the
>> one I used for unit testing, it is not yet ready to be inserted into
>> regression suite. I will be working next on the regression and Ashutosh's
>> comments on before-row triggers
>>
>> Also I haven't yet rebased the rowtriggers branch over the new
>> merge-related changes in the master branch. This patch is over the
>> rowtriggers branch; I did not push this patch onto the rowtriggers branch
>> as well, although I intended to do it, but suspected of some possible
>> issues if I push the rowtriggers branch after the recent  merge-related
>> changes going on in the repository. First I will rebase all the rowtriggers
>> branch changes onto the new master branch.
>>
>>
>>
>>
>>
>>
>>>>
>>>>
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Minimize network downtime and maximize team effectiveness.
>>>> Reduce network management and security costs.Learn how to hire
>>>> the most talented Cisco Certified professionals. Visit the
>>>> Employer Resources Portal
>>>> http://www.cisco.com/web/learning/employer_resources/index.html
>>>> _______________________________________________
>>>> Postgres-xc-developers mailing list
>>>> Pos...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>>>
>>>>
>>>
>>>
>>> --
>>> Pavan Deolasee
>>> http://www.linkedin.com/in/pavandeolasee
>>>
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Precog is a next-generation analytics platform capable of advanced
>> analytics on semi-structured data. The platform includes APIs for building
>> apps and a phenomenal toolset for data science. Developers can use
>> our toolset for easy data analysis & visualization. Get a free account!
>> http://www2.precog.com/precogplatform/slashdotnewsletter
>> _______________________________________________
>> Postgres-xc-developers mailing list
>> Pos...@li...
>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>
>>
>
>
> --
> Best Wishes,
> Ashutosh Bapat
> EntepriseDB Corporation
> The Enterprise Postgres Company
>