postgres-xc-developers Mailing List for Postgres-XC (Page 45)

Brought to you by: ahsanhadi, amitdkhan, ashutoshbapat, gabbasb, and 3 others

postgres-xc-developers — Postgres-XC hackers and developers

You can subscribe to this list here.

2010	Jan	Feb	Mar	Apr (10)	May (17)	Jun (3)	Jul	Aug	Sep (8)	Oct (18)	Nov (51)	Dec (74)
2011	Jan (47)	Feb (44)	Mar (44)	Apr (102)	May (35)	Jun (25)	Jul (56)	Aug (69)	Sep (32)	Oct (37)	Nov (31)	Dec (16)
2012	Jan (34)	Feb (127)	Mar (218)	Apr (252)	May (80)	Jun (137)	Jul (205)	Aug (159)	Sep (35)	Oct (50)	Nov (82)	Dec (52)
2013	Jan (107)	Feb (159)	Mar (118)	Apr (163)	May (151)	Jun (89)	Jul (106)	Aug (177)	Sep (49)	Oct (63)	Nov (46)	Dec (7)
2014	Jan (65)	Feb (128)	Mar (40)	Apr (11)	May (4)	Jun (8)	Jul (16)	Aug (11)	Sep (4)	Oct (1)	Nov (5)	Dec (16)
2015	Jan (5)	Feb	Mar (2)	Apr (5)	May (4)	Jun (12)	Jul	Aug	Sep	Oct	Nov	Dec (4)
2019	Jan	Feb	Mar	Apr	May	Jun	Jul (2)	Aug	Sep	Oct	Nov	Dec

Flat | Threaded

<< < 1 .. 43 44 45 46 47 .. 139 > >> (Page 45 of 139)

[Postgres-xc-developers] pgxc_ctl (C version) first commit will be out.

From: Koichi S. <koi...@gm...> - 2013-04-04 09:45:39

Hello;

Candidate of the first commit of pgxc_ctl (should be at contrib) will be
found at
https://github.com/postgres-xc/postgres-xc/tree/pgxc_ctl/contrib/pgxc_ctl

It provides generally all the capabilities of bash-version pgxc_ctl, much
faster becaus most of the real work is done background in parallel.

Next target is to include Abbas's node additon/removal, as well as
gtm_proxy and other slave addition and removal.

Regards;
----------
Koichi Suzuki

[Postgres-xc-developers] Commit 52a8aea4290851 (Merge with 9.2.3)

From: Pavan D. <pav...@gm...> - 2013-04-04 09:25:27

Hello,

While I am sure you must be doing a right thing, but since it struck to me,
I thought I should raise it here. I saw a commit message which has a
message body "Merge branch 'REL_9_2_3' into master". Its kind of a red flag
to me. I hope we are *not* merging any point releases of PostgreSQL in the
master branch of Postgres-XC. In the past, I have spent considerable time
in fixing similar mistakes and we should not be repeating that.

To explain this point further, we should always be merging only master
branch of PostgreSQL. Later, if we make a Postgres-XC release based on a
stable release of PostgreSQL, say 9.2, we should branch of Postgres-XC
repository at the same commit point as PostgreSQL did and do a release. Any
bug fixes on that stable release will then go into only that branch while
the main development would continue on the master branch. If we mistakenly
merge a PostgreSQL's point release such as 9.2.3, then we will have commits
in Postgres-XC master branch which will later conflict terribly with the
master branch commits of PostgreSQL, since the same bug may have been fixed
in PostgreSQL's master branch too.

Sorry, if this is all noise and you are doing the right thing. But then the
commit message should look different.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee

Re: [Postgres-xc-developers] Implementation details for BEFORE and AFTER ROW trigger support

From: Amit K. <ami...@en...> - 2013-04-04 06:57:02

On 3 April 2013 15:10, Ashutosh Bapat <ash...@en...>wrote:

> Hi Amit,
> Given the magnitude of the change, I think we should break this work into
> smaller self-sufficient patches and commit them (either on master or in a
> separate branch for trigger work). This will allow us to review and commit
> small amount of work and set it aside, rather than going over everything in
> every round.
>

I have created a new branch "rowtriggers" in
postgres-xc.git.sourceforge.net/gitroot/postgres-xc/postgres-xc, where I
have dumped incremental changes.


>
> On Wed, Apr 3, 2013 at 10:46 AM, Amit Khandekar <
> ami...@en...> wrote:
>
>>
>>
>>
>> On 26 March 2013 15:53, Ashutosh Bapat <ash...@en...>wrote:
>>
>>>
>>>
>>> On Tue, Mar 26, 2013 at 8:56 AM, Amit Khandekar <
>>> ami...@en...> wrote:
>>>
>>>>
>>>>
>>>> On 4 March 2013 11:11, Amit Khandekar <ami...@en...>wrote:
>>>>
>>>>> On 1 March 2013 13:53, Nikhil Sontakke <ni...@st...> wrote:
>>>>> >>
>>>>> >> Issue: Whether we should fetch the whole from the datanode (OLD
>>>>> row) and not
>>>>> >> just ctid and node_id and required columns and store it at the
>>>>> coordinator
>>>>> >> for the processing  OR whether we should fetch each row (OLD and NEW
>>>>> >> variants) while processing each row.
>>>>> >>
>>>>> >> Both of them have performance impacts - the first one has disk
>>>>> impact for
>>>>> >> large number of rows whereas the second has network impact for
>>>>> querying
>>>>> >> rows. Is it possible to do some analytical assessment as to which
>>>>> of them
>>>>> >> would be better? If you can come up with something concrete (may be
>>>>> numbers
>>>>> >> or formulae) we will be able to judge better as to which one to
>>>>> pick up.
>>>>>
>>>>> Will check if we can come up with some sensible analysis or figures.
>>>>>
>>>>>
>>>> I have done some analysis on both of these approaches here:
>>>>
>>>> https://docs.google.com/document/d/10QPPq_go_wHqKqhmOFXjJAokfdLR8OaUyZVNDu47GWk/edit?usp=sharing
>>>>
>>>> In practical terms, we anyways would need to implement (B). The reason
>>>> is because when the trigger has conditional execution(WHEN clause) we
>>>> *have* to fetch the rows beforehand, so there is no point in fetching all
>>>> of them again at the end of the statement when we already have them
>>>> locally. So may be it would be too ambitious to have have both
>>>> implementations, at least for this release.
>>>>
>>>>
>>> I agree here. We can certainly optimize for various cases later, but we
>>> should have something which would give all the functionality (albeit at a
>>> lower performance for now).
>>>
>>>
>>>> So I am focussing on (B) right now. We have two options:
>>>>
>>>> 1. Store all rows in palloced memory, and save the HeapTuple pointers
>>>> in the trigger queue, and directly access the OLD and NEW rows using these
>>>> pointers when needed. Here we will have no control over how much memory we
>>>> should use for the old and new records, and this might even hamper system
>>>> performance, let alone XC performance.
>>>> 2. Other option is to use tuplestore. Here, we need to store the
>>>> positions of the records in the tuplestore. So for a particular tigger
>>>> event, fetch by the position. From what I understand, tuplestore can be
>>>> advanced only sequentially in either direction. So when the read pointer is
>>>> at position 6 and we need to fetch a record at position 10, we need to call
>>>> tuplestore_advance() 4 times, and this call involves palloc/pfree overhead
>>>> because it calls tuplestore_gettuple(). But the trigger records are not
>>>> distributed so randomly. In fact a set of trigger events for a particular
>>>> event id are accessed in the same order as the order in which they are
>>>> queued. So for a particular event id, only the first access call will
>>>> require random access. tuplestore supports multiple read pointers, so may
>>>> be we can make use of that to access the first record using the closest
>>>> read pointer.
>>>>
>>>>
>>> Using palloc will be a problem if the size of data fetched is more that
>>> what could fit in memory. Also pallocing frequently is going to be
>>> performance problem. Let's see how does the tuple store approach go.
>>>
>>
>> While I am working on the AFTER ROW optimization, here's a patch that has
>> only BEFORE ROW trigger support, so that it can get you started with first
>> round of review. The regression is not analyzed fully yet. Besides the AR
>> trigger related changes, I have also stripped the logic of whether to run
>> the trigger on datanode or coordinator; this logic depends on both before
>> and after triggers.
>>
>>
>>>
>>>
>>>>
>>>>
>>>>>  >>
>>>>> >
>>>>> > Or we can consider a hybrid approach of getting the rows in batches
>>>>> of
>>>>> > 1000 or so if possible as well. That ways they get into coordinator
>>>>> > memory in one shot and can be processed in batches. Obviously this
>>>>> > should be considered if it's not going to be a complicated
>>>>> > implementation.
>>>>>
>>>>> It just occurred to me that it would not be that hard to optimize the
>>>>> row-fetching-by-ctid as shown below:
>>>>> 1. When it is time to fire the queued triggers at the
>>>>> statement/transaction end, initialize cursors - one cursor per
>>>>> datanode - which would do: SELECT remote_heap_fetch(table_name,
>>>>> '<ctidlist>'); We can form this ctidlist out of the trigger even list.
>>>>> 2. For each trigger event entry in the trigger queue, FETCH NEXT using
>>>>> the appropriate cursor name according to the datanode id to which the
>>>>> trigger entry belongs.
>>>>>
>>>>> >
>>>>> >>> Currently we fetch all attributes in the SELECT subplans. I have
>>>>> >>> created another patch to fetch only the required attribtues, but
>>>>> have
>>>>> >>> not merged that into this patch.
>>>>> >
>>>>> > Do we have other places where we unnecessary fetch all attributes?
>>>>> > ISTM, this should be fixed as a performance improvement first ahead
>>>>> of
>>>>> > everything else.
>>>>>
>>>>> I believe DML subplan is the only remaining place where we fetch all
>>>>> attributes. And yes, this is a must-have for triggers, otherwise, the
>>>>> other optimizations would be of no use.
>>>>>
>>>>> >
>>>>> >>> 2.  One important TODO for BEFORE trigger is this: Just before
>>>>> >>> invoking the trigger functions, in PG, the tuple is row-locked
>>>>> >>> (exclusive) by GetTupleTrigger() and the locked version is fetched
>>>>> >>> from the table. So it is made sure that while all the triggers for
>>>>> >>> that table are executed, no one can update that particular row.
>>>>> >>> In the patch, we haven't locked the row. We need to lock it either
>>>>> by
>>>>> >>> executing :
>>>>> >>>    1. SELECT * from tab1 where ctid = <ctid_val> FOR UPDATE, and
>>>>> then
>>>>> >>> use the returned ROW as the OLD row.
>>>>> >>> OR
>>>>> >>>    2. The UPDATE subplan itself should have SELECT for UPDATE so
>>>>> that
>>>>> >>> the row is already locked, and we don't have to lock it again.
>>>>> >>> #2 is simple though it might cause some amount of longer waits in
>>>>> general.
>>>>> >>> Using #1, though the locks would be acquired only when the
>>>>> particular
>>>>> >>> row is updated, the locks would be released only after transaction
>>>>> >>> end, so #1 might not be worth implementing.
>>>>> >>> Also #1 requires another explicit remote fetch for the
>>>>> >>> lock-and-get-latest-version operation.
>>>>> >>> I am more inclined towards #2.
>>>>> >>>
>>>>> >> The option #2 however, has problem of locking too many rows if
>>>>> there are
>>>>> >> coordinator quals in the subplans IOW the number of rows finally
>>>>> updated are
>>>>> >> lesser than the number of rows fetched from the datanode. It can
>>>>> cause
>>>>> >> unwanted deadlocks. Unless there is a way to release these extra
>>>>> locks, I am
>>>>> >> afraid this option will be a problem.
>>>>>
>>>>> True. Regardless of anything else - whether it is deadlocks or longer
>>>>> waits, we should not lock rows that are not to be updated.
>>>>>
>>>>> There is a more general row-locking issue that we need to solve first
>>>>> : 3606317. I anticipate that solving this will solve the trigger
>>>>> specific lock issue. So for triggers, this is a must-have, and I am
>>>>> going to solve this issue as part of this bug 3606317.
>>>>>
>>>>> >>
>>>>> > Deadlocks? ISTM, we can get more lock waits because of this but I do
>>>>> > not see deadlock scenarios..
>>>>> >
>>>>> > With the FQS shipping work being done by Ashutosh, will we also ship
>>>>> > major chunks of subplans to the datanodes? If yes, then row locking
>>>>> > will only involve required tuples (hopefully) from the coordinator's
>>>>> > point of view.
>>>>> >
>>>>> > Also, something radical is can be invent a new type of FOR [NODE]
>>>>> > UPDATE type lock to minimize the impact of such locking of rows on
>>>>> > datanodes?
>>>>> >
>>>>> > Regards,
>>>>> > Nikhils
>>>>> >
>>>>> >>>
>>>>> >>> 3.  The BEFORE trigger function can change the distribution column
>>>>> >>> itself. We need to add a check at the end of the trigger
>>>>> executions.
>>>>> >>>
>>>>> >>
>>>>> >> Good, you thought about that. Yes we should check it.
>>>>> >>
>>>>> >>>
>>>>> >>> 4. Fetching OLD row for WHEN clause handling.
>>>>> >>>
>>>>> >>> 5. Testing with mix of Shippable and non-shippable ROW triggers
>>>>> >>>
>>>>> >>> 6. Other types of triggers. INSTEAD triggers are anticipated to
>>>>> work
>>>>> >>> without significant changes, but they are yet to be tested.
>>>>> >>> INSERT/DELETE triggers: Most of the infrastructure has been done
>>>>> while
>>>>> >>> implementing UPDATE triggers. But some changes specific to INSERT
>>>>> and
>>>>> >>> DELETE are yet to be done.
>>>>> >>> Deferred triggers to be tested.
>>>>> >>>
>>>>> >>> 7. Regression analysis. There are some new failures. Will post
>>>>> another
>>>>> >>> fair version of the patch after regression analysis and fixing
>>>>> various
>>>>> >>> TODOs.
>>>>> >>>
>>>>> >>> Comments welcome.
>>>>> >>>
>>>>> >>>
>>>>> >>>
>>>>> ------------------------------------------------------------------------------
>>>>> >>> Everyone hates slow websites. So do we.
>>>>> >>> Make your web apps faster with AppDynamics
>>>>> >>> Download AppDynamics Lite for free today:
>>>>> >>> http://p.sf.net/sfu/appdyn_d2d_feb
>>>>> >>> _______________________________________________
>>>>> >>> Postgres-xc-developers mailing list
>>>>> >>> Pos...@li...
>>>>> >>>
>>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>>>> >>>
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> --
>>>>> >> Best Wishes,
>>>>> >> Ashutosh Bapat
>>>>> >> EntepriseDB Corporation
>>>>> >> The Enterprise Postgres Company
>>>>> >>
>>>>> >>
>>>>> ------------------------------------------------------------------------------
>>>>> >> Everyone hates slow websites. So do we.
>>>>> >> Make your web apps faster with AppDynamics
>>>>> >> Download AppDynamics Lite for free today:
>>>>> >> http://p.sf.net/sfu/appdyn_d2d_feb
>>>>> >> _______________________________________________
>>>>> >> Postgres-xc-developers mailing list
>>>>> >> Pos...@li...
>>>>> >> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>>>> >>
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> > StormDB - http://www.stormdb.com
>>>>> > The Database Cloud
>>>>> > Postgres-XC Support and Service
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Best Wishes,
>>> Ashutosh Bapat
>>> EntepriseDB Corporation
>>> The Enterprise Postgres Company
>>>
>>
>>
>
>
> --
> Best Wishes,
> Ashutosh Bapat
> EntepriseDB Corporation
> The Enterprise Postgres Company
>

Re: [Postgres-xc-developers] Issues with schema qualified tables and \COPY

From: Koichi S. <koi...@gm...> - 2013-04-04 06:43:52

Thanks Nikhil.   I found the similar problem in ALTER TABLE regression.
Hope this fixes the regression too.

----------
Koichi Suzuki


2013/4/3 Nikhil Sontakke <ni...@st...>

> Hi,
>
> Looks like \COPY does not handle schema qualified tables properly. The
> COPY command that is sent to the remote nodes forgets to schema
> qualify the table:
>
> create schema foo;
> create table foo.bar(x int);
> \copy foo.bar (x) FROM 'file.txt' with csv;
>
> The above fails. The fix is to schema-qualify the table before sending
> it on the wire. We need to be careful about temp tables though. PFA,
> patch against HEAD for the same.
>
> Regards,
> Nikhils
> --
> StormDB - http://www.stormdb.com
> The Database Cloud
>
>
> ------------------------------------------------------------------------------
> Minimize network downtime and maximize team effectiveness.
> Reduce network management and security costs.Learn how to hire
> the most talented Cisco Certified professionals. Visit the
> Employer Resources Portal
> http://www.cisco.com/web/learning/employer_resources/index.html
> _______________________________________________
> Postgres-xc-developers mailing list
> Pos...@li...
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>
>

Re: [Postgres-xc-developers] GTM Standby Crash

From: Koichi S. <koi...@gm...> - 2013-04-04 06:42:22

Sorry taking too long.

I will commit it soon.

Kind Regards;
----------
Koichi Suzuki


2013/4/3 Nikhil Sontakke <ni...@st...>

> >> I think I added the lines in question.   Because you are the original
> >> author if gtm, it's wonderful if you take a look at it.
> >>
> >
> > Looks good to me. The list under question is global to the process and
> > hence the list cells must be allocated in the process-level top
> > context i.e. TopMostMemoryContext. I checked and there are other
> > places in the code where its explained why its necessary to allocate
> > them in the said context.
> >
>
> This commit is still pending. It's an important fix for GTM standby.
>
> Regards,
> Nikhils
>
> > Thanks,
> > Pavan
> >
> > --
> > Pavan Deolasee
> > http://www.linkedin.com/in/pavandeolasee
>
>
>
> --
> StormDB - http://www.stormdb.com
> The Database Cloud
>

Re: [Postgres-xc-developers] GTM Standby Crash

From: Nikhil S. <ni...@st...> - 2013-04-03 09:48:20

>> I think I added the lines in question.   Because you are the original
>> author if gtm, it's wonderful if you take a look at it.
>>
>
> Looks good to me. The list under question is global to the process and
> hence the list cells must be allocated in the process-level top
> context i.e. TopMostMemoryContext. I checked and there are other
> places in the code where its explained why its necessary to allocate
> them in the said context.
>

This commit is still pending. It's an important fix for GTM standby.

Regards,
Nikhils

> Thanks,
> Pavan
>
> --
> Pavan Deolasee
> http://www.linkedin.com/in/pavandeolasee



-- 
StormDB - http://www.stormdb.com
The Database Cloud

[Postgres-xc-developers] Issues with schema qualified tables and \COPY

From: Nikhil S. <ni...@st...> - 2013-04-03 09:45:16

Attachments: pgxc_remote_copy_HEAD.patch

Hi,

Looks like \COPY does not handle schema qualified tables properly. The
COPY command that is sent to the remote nodes forgets to schema
qualify the table:

create schema foo;
create table foo.bar(x int);
\copy foo.bar (x) FROM 'file.txt' with csv;

The above fails. The fix is to schema-qualify the table before sending
it on the wire. We need to be careful about temp tables though. PFA,
patch against HEAD for the same.

Regards,
Nikhils
-- 
StormDB - http://www.stormdb.com
The Database Cloud

Re: [Postgres-xc-developers] Implementation details for BEFORE and AFTER ROW trigger support

From: Ashutosh B. <ash...@en...> - 2013-04-03 09:25:12

Hi Amit,
Given the magnitude of the change, I think we should break this work into
smaller self-sufficient patches and commit them (either on master or in a
separate branch for trigger work). This will allow us to review and commit
small amount of work and set it aside, rather than going over everything in
every round.


On Wed, Apr 3, 2013 at 10:46 AM, Amit Khandekar <
ami...@en...> wrote:

>
>
>
> On 26 March 2013 15:53, Ashutosh Bapat <ash...@en...>wrote:
>
>>
>>
>> On Tue, Mar 26, 2013 at 8:56 AM, Amit Khandekar <
>> ami...@en...> wrote:
>>
>>>
>>>
>>> On 4 March 2013 11:11, Amit Khandekar <ami...@en...>wrote:
>>>
>>>> On 1 March 2013 13:53, Nikhil Sontakke <ni...@st...> wrote:
>>>> >>
>>>> >> Issue: Whether we should fetch the whole from the datanode (OLD row)
>>>> and not
>>>> >> just ctid and node_id and required columns and store it at the
>>>> coordinator
>>>> >> for the processing  OR whether we should fetch each row (OLD and NEW
>>>> >> variants) while processing each row.
>>>> >>
>>>> >> Both of them have performance impacts - the first one has disk
>>>> impact for
>>>> >> large number of rows whereas the second has network impact for
>>>> querying
>>>> >> rows. Is it possible to do some analytical assessment as to which of
>>>> them
>>>> >> would be better? If you can come up with something concrete (may be
>>>> numbers
>>>> >> or formulae) we will be able to judge better as to which one to pick
>>>> up.
>>>>
>>>> Will check if we can come up with some sensible analysis or figures.
>>>>
>>>>
>>> I have done some analysis on both of these approaches here:
>>>
>>> https://docs.google.com/document/d/10QPPq_go_wHqKqhmOFXjJAokfdLR8OaUyZVNDu47GWk/edit?usp=sharing
>>>
>>> In practical terms, we anyways would need to implement (B). The reason
>>> is because when the trigger has conditional execution(WHEN clause) we
>>> *have* to fetch the rows beforehand, so there is no point in fetching all
>>> of them again at the end of the statement when we already have them
>>> locally. So may be it would be too ambitious to have have both
>>> implementations, at least for this release.
>>>
>>>
>> I agree here. We can certainly optimize for various cases later, but we
>> should have something which would give all the functionality (albeit at a
>> lower performance for now).
>>
>>
>>> So I am focussing on (B) right now. We have two options:
>>>
>>> 1. Store all rows in palloced memory, and save the HeapTuple pointers in
>>> the trigger queue, and directly access the OLD and NEW rows using these
>>> pointers when needed. Here we will have no control over how much memory we
>>> should use for the old and new records, and this might even hamper system
>>> performance, let alone XC performance.
>>> 2. Other option is to use tuplestore. Here, we need to store the
>>> positions of the records in the tuplestore. So for a particular tigger
>>> event, fetch by the position. From what I understand, tuplestore can be
>>> advanced only sequentially in either direction. So when the read pointer is
>>> at position 6 and we need to fetch a record at position 10, we need to call
>>> tuplestore_advance() 4 times, and this call involves palloc/pfree overhead
>>> because it calls tuplestore_gettuple(). But the trigger records are not
>>> distributed so randomly. In fact a set of trigger events for a particular
>>> event id are accessed in the same order as the order in which they are
>>> queued. So for a particular event id, only the first access call will
>>> require random access. tuplestore supports multiple read pointers, so may
>>> be we can make use of that to access the first record using the closest
>>> read pointer.
>>>
>>>
>> Using palloc will be a problem if the size of data fetched is more that
>> what could fit in memory. Also pallocing frequently is going to be
>> performance problem. Let's see how does the tuple store approach go.
>>
>
> While I am working on the AFTER ROW optimization, here's a patch that has
> only BEFORE ROW trigger support, so that it can get you started with first
> round of review. The regression is not analyzed fully yet. Besides the AR
> trigger related changes, I have also stripped the logic of whether to run
> the trigger on datanode or coordinator; this logic depends on both before
> and after triggers.
>
>
>>
>>
>>>
>>>
>>>>  >>
>>>> >
>>>> > Or we can consider a hybrid approach of getting the rows in batches of
>>>> > 1000 or so if possible as well. That ways they get into coordinator
>>>> > memory in one shot and can be processed in batches. Obviously this
>>>> > should be considered if it's not going to be a complicated
>>>> > implementation.
>>>>
>>>> It just occurred to me that it would not be that hard to optimize the
>>>> row-fetching-by-ctid as shown below:
>>>> 1. When it is time to fire the queued triggers at the
>>>> statement/transaction end, initialize cursors - one cursor per
>>>> datanode - which would do: SELECT remote_heap_fetch(table_name,
>>>> '<ctidlist>'); We can form this ctidlist out of the trigger even list.
>>>> 2. For each trigger event entry in the trigger queue, FETCH NEXT using
>>>> the appropriate cursor name according to the datanode id to which the
>>>> trigger entry belongs.
>>>>
>>>> >
>>>> >>> Currently we fetch all attributes in the SELECT subplans. I have
>>>> >>> created another patch to fetch only the required attribtues, but
>>>> have
>>>> >>> not merged that into this patch.
>>>> >
>>>> > Do we have other places where we unnecessary fetch all attributes?
>>>> > ISTM, this should be fixed as a performance improvement first ahead of
>>>> > everything else.
>>>>
>>>> I believe DML subplan is the only remaining place where we fetch all
>>>> attributes. And yes, this is a must-have for triggers, otherwise, the
>>>> other optimizations would be of no use.
>>>>
>>>> >
>>>> >>> 2.  One important TODO for BEFORE trigger is this: Just before
>>>> >>> invoking the trigger functions, in PG, the tuple is row-locked
>>>> >>> (exclusive) by GetTupleTrigger() and the locked version is fetched
>>>> >>> from the table. So it is made sure that while all the triggers for
>>>> >>> that table are executed, no one can update that particular row.
>>>> >>> In the patch, we haven't locked the row. We need to lock it either
>>>> by
>>>> >>> executing :
>>>> >>>    1. SELECT * from tab1 where ctid = <ctid_val> FOR UPDATE, and
>>>> then
>>>> >>> use the returned ROW as the OLD row.
>>>> >>> OR
>>>> >>>    2. The UPDATE subplan itself should have SELECT for UPDATE so
>>>> that
>>>> >>> the row is already locked, and we don't have to lock it again.
>>>> >>> #2 is simple though it might cause some amount of longer waits in
>>>> general.
>>>> >>> Using #1, though the locks would be acquired only when the
>>>> particular
>>>> >>> row is updated, the locks would be released only after transaction
>>>> >>> end, so #1 might not be worth implementing.
>>>> >>> Also #1 requires another explicit remote fetch for the
>>>> >>> lock-and-get-latest-version operation.
>>>> >>> I am more inclined towards #2.
>>>> >>>
>>>> >> The option #2 however, has problem of locking too many rows if there
>>>> are
>>>> >> coordinator quals in the subplans IOW the number of rows finally
>>>> updated are
>>>> >> lesser than the number of rows fetched from the datanode. It can
>>>> cause
>>>> >> unwanted deadlocks. Unless there is a way to release these extra
>>>> locks, I am
>>>> >> afraid this option will be a problem.
>>>>
>>>> True. Regardless of anything else - whether it is deadlocks or longer
>>>> waits, we should not lock rows that are not to be updated.
>>>>
>>>> There is a more general row-locking issue that we need to solve first
>>>> : 3606317. I anticipate that solving this will solve the trigger
>>>> specific lock issue. So for triggers, this is a must-have, and I am
>>>> going to solve this issue as part of this bug 3606317.
>>>>
>>>> >>
>>>> > Deadlocks? ISTM, we can get more lock waits because of this but I do
>>>> > not see deadlock scenarios..
>>>> >
>>>> > With the FQS shipping work being done by Ashutosh, will we also ship
>>>> > major chunks of subplans to the datanodes? If yes, then row locking
>>>> > will only involve required tuples (hopefully) from the coordinator's
>>>> > point of view.
>>>> >
>>>> > Also, something radical is can be invent a new type of FOR [NODE]
>>>> > UPDATE type lock to minimize the impact of such locking of rows on
>>>> > datanodes?
>>>> >
>>>> > Regards,
>>>> > Nikhils
>>>> >
>>>> >>>
>>>> >>> 3.  The BEFORE trigger function can change the distribution column
>>>> >>> itself. We need to add a check at the end of the trigger executions.
>>>> >>>
>>>> >>
>>>> >> Good, you thought about that. Yes we should check it.
>>>> >>
>>>> >>>
>>>> >>> 4. Fetching OLD row for WHEN clause handling.
>>>> >>>
>>>> >>> 5. Testing with mix of Shippable and non-shippable ROW triggers
>>>> >>>
>>>> >>> 6. Other types of triggers. INSTEAD triggers are anticipated to work
>>>> >>> without significant changes, but they are yet to be tested.
>>>> >>> INSERT/DELETE triggers: Most of the infrastructure has been done
>>>> while
>>>> >>> implementing UPDATE triggers. But some changes specific to INSERT
>>>> and
>>>> >>> DELETE are yet to be done.
>>>> >>> Deferred triggers to be tested.
>>>> >>>
>>>> >>> 7. Regression analysis. There are some new failures. Will post
>>>> another
>>>> >>> fair version of the patch after regression analysis and fixing
>>>> various
>>>> >>> TODOs.
>>>> >>>
>>>> >>> Comments welcome.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> ------------------------------------------------------------------------------
>>>> >>> Everyone hates slow websites. So do we.
>>>> >>> Make your web apps faster with AppDynamics
>>>> >>> Download AppDynamics Lite for free today:
>>>> >>> http://p.sf.net/sfu/appdyn_d2d_feb
>>>> >>> _______________________________________________
>>>> >>> Postgres-xc-developers mailing list
>>>> >>> Pos...@li...
>>>> >>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>>> >>>
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Best Wishes,
>>>> >> Ashutosh Bapat
>>>> >> EntepriseDB Corporation
>>>> >> The Enterprise Postgres Company
>>>> >>
>>>> >>
>>>> ------------------------------------------------------------------------------
>>>> >> Everyone hates slow websites. So do we.
>>>> >> Make your web apps faster with AppDynamics
>>>> >> Download AppDynamics Lite for free today:
>>>> >> http://p.sf.net/sfu/appdyn_d2d_feb
>>>> >> _______________________________________________
>>>> >> Postgres-xc-developers mailing list
>>>> >> Pos...@li...
>>>> >> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>>> >>
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > StormDB - http://www.stormdb.com
>>>> > The Database Cloud
>>>> > Postgres-XC Support and Service
>>>>
>>>
>>>
>>
>>
>> --
>> Best Wishes,
>> Ashutosh Bapat
>> EntepriseDB Corporation
>> The Enterprise Postgres Company
>>
>
>


-- 
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Enterprise Postgres Company

Re: [Postgres-xc-developers] Add a new node to the cluster [Feature ID 3608379]

From: Abbas B. <abb...@en...> - 2013-04-01 08:38:37

On Mon, Apr 1, 2013 at 11:02 AM, Amit Khandekar <
ami...@en...> wrote:

>
>
>
> On 31 March 2013 14:07, Abbas Butt <abb...@en...> wrote:
>
>> Hi,
>> Attached please find the revised patch for restore mode. This patch has
>> to be applied on top of the patches I sent earlier for
>>  3608377,
>>  3608376 &
>>  3608375.
>>
>> I have also attached some scripts and a C file useful for testing the
>> whole procedure. It is a database that has many objects in it.
>>
>> Here are the revised instructions for adding new nodes to the cluster.
>>
>> ======================================
>>
>> Here are the steps to add a new coordinator
>>
>>  1) Initdb new coordinator
>>   /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data_cord3 --nodename
>> coord_3
>>
>>  2) Make necessary changes in its postgresql.conf, in particular specify
>> new coordinator name and pooler port
>>
>>  3) Connect to any of the existing coordinators & lock the cluster for
>> backup, do not close this session
>>     ./psql postgres -p 5432
>>     select pgxc_lock_for_backup();
>>
>>  4) Connect to any of the existing coordinators and take backup of the
>> database
>>     ./pg_dumpall -p 5432 -s --include-nodes --dump-nodes
>> --file=/home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1100_all_objects_coord.sql
>>
>>  5) Start the new coordinator specify --restoremode while starting the
>> coordinator
>>     ./postgres --restoremode -D ../data_cord3 -p 5455
>>
>>  6) Create the new database on the new coordinator - optional
>>     ./createdb test -p 5455
>>
>>  7) Restore the backup that was taken from an existing coordinator by
>> connecting to the new coordinator directly
>>     ./psql -d test -f
>> /home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1100_all_objects_coord.sql
>> -p 5455
>>
>>  8) Quit the new coordinator
>>
>>  9) Start the new coordinator as a by specifying --coordinator
>>     ./postgres --coordinator -D ../data_cord3 -p 5455
>>
>> 10) Create the new coordinator on rest of the coordinators and reload
>> configuration
>>      CREATE NODE COORD_3 WITH (HOST = 'localhost', type = 'coordinator',
>> PORT = 5455);
>>     SELECT pgxc_pool_reload();
>>
>> 11) Quit the session of step 3, this will unlock the cluster
>>
>> 12) The new coordinator is now ready
>>     ./psql test -p 5455
>>     create table test_new_coord(a int, b int);
>>     \q
>>     ./psql test -p 5432
>>     select * from test_new_coord;
>>
>> *======================================*
>> *======================================*
>>
>> Here are the steps to add a new datanode
>>
>>
>>  1) Initdb new datanode
>>     /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data3 --nodename
>> data_node_3
>>
>>  2) Make necessary changes in its postgresql.conf, in particular specify
>> new datanode name
>>
>>  3) Connect to any of the existing coordinators & lock the cluster for
>> backup, do not close this session
>>     ./psql postgres -p 5432
>>     select pgxc_lock_for_backup();
>>
>>  4) Connect to any of the existing datanodes and take backup of the
>> database
>>     ./pg_dumpall -p 15432 -s --include-nodes
>> --file=/home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1122_all_objects_dn1.sql
>>
>
>
> Why do we need --include-nodes on datanode ?
>

Agreed, this option should not be used.


>
> ----
>
>
> +     *    The dump taken from a datanode does NOT contain any DISTRIBUTE
> BY
> +     *    clause. This fact is used here to make sure that when the
> +     *    DISTRIBUTE BY clause is missing in the statemnet the system
> +     *    should not try to find out the node list itself.
> +     */
> +    if ((IS_PGXC_COORDINATOR || (isRestoreMode && stmt->distributeby !=
> NULL))
> +        && relkind == RELKIND_RELATION)
>
> How do we enforce not having DISTRIBUTE BY clause in the pg_dump output if
> it's a datanode ?
>

We do not have to enforce it, since the pgxc_class catalog table has no
information in it on datanodes, hence dump will not contain any DISTRIBUTE
BY clause.


> Also, can we just error out in restore mode if the DISTRIBUTE BY clause is
> present ?
>

No we cannot error out, because while adding a coordinator DISTRIBUTE BY
clause will be present, and since we have started the server by using
--restoremode in place of --datanode or --coordinator we do not know
whether the user is adding a new datanode or a new coordinator.


>
> -----
>
>
>>  5) Start the new datanode specify --restoremode while starting the it
>>     ./postgres --restoremode -D ../data3 -p 35432
>>
>
>
> It seems you have disabled use of GTM in restore mode.
>

I did not.


> For e.g. in GetNewTransactionId(), we get a global tansaction id only if
> it's a coordinator or if IsPGXCNodeXactDatanodeDirect() is true. But
> IsPGXCNodeXactDatanodeDirect() will now return false in restore mode.
>

No, I have not changed the function IsPGXCNodeXactDatanodeDirect, it would
behave exactly as it used to. I changed the function IsPGXCNodeXactReadOnly.



> Is there any specific reason for disabling use of GTM in restore mode ?
>

No reason. GTM should be used.


> I don't see any harm in using GTM. In fact, it is better to start using
> global xids as soon as possible.
>

Exactly. I just verified that the statement
xid = (TransactionId) BeginTranGTM(timestamp)
in function GetNewTransactionId is called in restore mode.


>
>
>>
>>  6) Restore the backup that was taken from an existing datanode by
>> connecting to the new datanode directly
>>      ./psql -d postgres -f
>> /home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1122_all_objects_dn1.sql
>> -p 35432
>>
>>  7) Quit the new datanode
>>
>>  8) Start the new datanode as a datanode by specifying --datanode
>>     ./postgres --datanode -D ../data3 -p 35432
>>
>>  9) Create the new datanode on all the coordinators and reload
>> configuration
>>     CREATE NODE DATA_NODE_3 WITH (HOST = 'localhost', type = 'datanode',
>> PORT = 35432);
>>     SELECT pgxc_pool_reload();
>>
>> 10) Quit the session of step 3, this will unlock the cluster
>>
>> 11) Redistribute data by using ALTER TABLE REDISTRIBUTE
>>
>> 12) The new daatnode is now ready
>>     ./psql test
>>     create table test_new_dn(a int, b int) distribute by replication;
>>     insert into test_new_dn values(1,2);
>>     EXECUTE DIRECT ON (data_node_1) 'SELECT * from test_new_dn';
>>     EXECUTE DIRECT ON (data_node_2) 'SELECT * from test_new_dn';
>>     EXECUTE DIRECT ON (data_node_3) 'SELECT * from test_new_dn';
>>
>> ======================================
>>
>> On Wed, Mar 27, 2013 at 5:02 PM, Abbas Butt <abb...@en...>wrote:
>>
>>> Feature ID 3608379
>>>
>>> On Fri, Mar 1, 2013 at 5:48 PM, Amit Khandekar <
>>> ami...@en...> wrote:
>>>
>>>> On 1 March 2013 01:30, Abbas Butt <abb...@en...> wrote:
>>>> >
>>>> >
>>>> > On Thu, Feb 28, 2013 at 12:44 PM, Amit Khandekar
>>>> > <ami...@en...> wrote:
>>>> >>
>>>> >>
>>>> >>
>>>> >> On 28 February 2013 10:23, Abbas Butt <abb...@en...>
>>>> wrote:
>>>> >>>
>>>> >>> Hi All,
>>>> >>>
>>>> >>> Attached please find a patch that provides a new command line
>>>> argument
>>>> >>> for postgres called --restoremode.
>>>> >>>
>>>> >>> While adding a new node to the cluster we need to restore the
>>>> schema of
>>>> >>> existing database to the new node.
>>>> >>> If the new node is a datanode and we connect directly to it, it
>>>> does not
>>>> >>> allow DDL, because it is in read only mode &
>>>> >>> If the new node is a coordinator, it will send DDLs to all the other
>>>> >>> coordinators which we do not want it to do.
>>>> >>
>>>> >>
>>>> >> What if we allow writes in standalone mode, so that we would
>>>> initialize
>>>> >> the new node using standalone mode instead of --restoremode ?
>>>> >
>>>> >
>>>> > Please take a look at the patch, I am using --restoremode in place of
>>>> > --coordinator & --datanode. I am not sure how would stand alone mode
>>>> fit in
>>>> > here.
>>>>
>>>> I was trying to see if we can avoid adding a new mode, instead, use
>>>> standalone mode for all the purposes for which restoremode is used.
>>>> Actually I checked the documentation, it says this mode is used only
>>>> for debugging or recovery purposes, so now I myself am a bit hesitent
>>>> about this mode for the purpose of restoring.
>>>>
>>>> >
>>>> >>
>>>> >>
>>>> >>>
>>>> >>> To provide ability to restore on the new node a new command line
>>>> argument
>>>> >>> is provided.
>>>> >>> It is to be provided in place of --coordinator OR --datanode.
>>>> >>> In restore mode both coordinator and datanode are internally
>>>> treated as a
>>>> >>> datanode.
>>>> >>> For more details see patch comments.
>>>> >>>
>>>> >>> After this patch one can add a new node to the cluster.
>>>> >>>
>>>> >>> Here are the steps to add a new coordinator
>>>> >>>
>>>> >>>
>>>> >>>  1) Initdb new coordinator
>>>> >>>         /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data_cord3
>>>> >>> --nodename coord_3
>>>> >>>
>>>> >>>  2) Make necessary changes in its postgresql.conf, in particular
>>>> specify
>>>> >>> new coordinator name and pooler port
>>>> >>>
>>>> >>>  3) Connect to any of the existing coordinators & lock the cluster
>>>> for
>>>> >>> backup
>>>> >>>         ./psql postgres -p 5432
>>>> >>>         SET xc_lock_for_backup=yes;
>>>> >>>         \q
>>>> >>
>>>> >>
>>>> >> I haven't given a thought on the earlier patch you sent for cluster
>>>> lock
>>>> >> implementation; may be we can discuss this on that thread, but just
>>>> a quick
>>>> >> question:
>>>> >>
>>>> >> Does the cluster-lock command wait for the ongoing DDL commands to
>>>> finish
>>>> >> ? If not, we have problems. The subsequent pg_dump would not contain
>>>> objects
>>>> >> created by these particular DDLs.
>>>> >
>>>> >
>>>> > Suppose you have a two coordinator cluster. Assume one client
>>>> connected to
>>>> > each. Suppose one client issues a lock cluster command and the other
>>>> issues
>>>> > a DDL. Is this what you mean by an ongoing DDL? If true then answer
>>>> to your
>>>> > question is Yes.
>>>> >
>>>> > Suppose you have a prepared transaction that has a DDL in it, again
>>>> if this
>>>> > can be considered an on going DDL, then again answer to your question
>>>> is
>>>> > Yes.
>>>> >
>>>> > Suppose you have a two coordinator cluster. Assume one client
>>>> connected to
>>>> > each. One client starts a transaction and issues a DDL, the second
>>>> client
>>>> > issues a lock cluster command, the first commits the transaction. If
>>>> this is
>>>> > an ongoing DDL, then the answer to your question is No. But its a
>>>> matter of
>>>> > deciding which camp are we going to put COMMIT in, the allow camp, or
>>>> the
>>>> > deny camp. I decided to put it in allow camp, because I have not yet
>>>> written
>>>> > any code to detect whether a transaction being committed has a DDL in
>>>> it or
>>>> > not, and stopping all transactions from committing looks too
>>>> restrictive to
>>>> > me.
>>>> >
>>>> > Do you have some other meaning of an ongoing DDL?
>>>> >
>>>> > I agree that we should have discussed this on the right thread. Lets
>>>> > continue this discussion on that thread.
>>>>
>>>> Continued on the other thread.
>>>>
>>>> >
>>>> >>
>>>> >>
>>>> >>>
>>>> >>>
>>>> >>>  4) Connect to any of the existing coordinators and take backup of
>>>> the
>>>> >>> database
>>>> >>>         ./pg_dump -p 5432 -C -s
>>>> >>>
>>>> --file=/home/edb/Desktop/NodeAddition/dumps/101_all_objects_coord.sql test
>>>> >>>
>>>> >>>  5) Start the new coordinator specify --restoremode while starting
>>>> the
>>>> >>> coordinator
>>>> >>>         ./postgres --restoremode -D ../data_cord3 -p 5455
>>>> >>>
>>>> >>>  6) connect to the new coordinator directly
>>>> >>>         ./psql postgres -p 5455
>>>> >>>
>>>> >>>  7) create all the datanodes and the rest of the coordinators on
>>>> the new
>>>> >>> coordiantor & reload configuration
>>>> >>>         CREATE NODE DATA_NODE_1 WITH (HOST = 'localhost', type =
>>>> >>> 'datanode', PORT = 15432, PRIMARY);
>>>> >>>         CREATE NODE DATA_NODE_2 WITH (HOST = 'localhost', type =
>>>> >>> 'datanode', PORT = 25432);
>>>> >>>
>>>> >>>         CREATE NODE COORD_1 WITH (HOST = 'localhost', type =
>>>> >>> 'coordinator', PORT = 5432);
>>>> >>>         CREATE NODE COORD_2 WITH (HOST = 'localhost', type =
>>>> >>> 'coordinator', PORT = 5433);
>>>> >>>
>>>> >>>         SELECT pgxc_pool_reload();
>>>> >>>
>>>> >>>  8) quit psql
>>>> >>>
>>>> >>>  9) Create the new database on the new coordinator
>>>> >>>         ./createdb test -p 5455
>>>> >>>
>>>> >>> 10) create the roles and table spaces manually, the dump does not
>>>> contain
>>>> >>> roles or table spaces
>>>> >>>         ./psql test -p 5455
>>>> >>>         CREATE ROLE admin WITH LOGIN CREATEDB CREATEROLE;
>>>> >>>         CREATE TABLESPACE my_space LOCATION
>>>> >>> '/usr/local/pgsql/my_space_location';
>>>> >>>         \q
>>>> >>>
>>>> >>
>>>> >> Will pg_dumpall help ? It dumps roles also.
>>>> >
>>>> >
>>>> > Yah , but I am giving example of pg_dump so this step has to be there.
>>>> >
>>>> >>
>>>> >>
>>>> >>
>>>> >>>
>>>> >>> 11) Restore the backup that was taken from an existing coordinator
>>>> by
>>>> >>> connecting to the new coordinator directly
>>>> >>>          ./psql -d test -f
>>>> >>> /home/edb/Desktop/NodeAddition/dumps/101_all_objects_coord.sql -p
>>>> 5455
>>>> >>>
>>>> >>> 11) Quit the new coordinator
>>>> >>>
>>>> >>> 12) Connect to any of the existing coordinators & unlock the cluster
>>>> >>>         ./psql postgres -p 5432
>>>> >>>         SET xc_lock_for_backup=no;
>>>> >>>         \q
>>>> >>>
>>>> >>
>>>> >> Unlocking the cluster has to be done *after* the node is added into
>>>> the
>>>> >> cluster.
>>>> >
>>>> >
>>>> > Very true. I stand corrected. This means CREATE NODE has to be
>>>> allowed when
>>>> > xc_lock_for_backup is set.
>>>> >
>>>> >>
>>>> >>
>>>> >>
>>>> >>>
>>>> >>> 13) Start the new coordinator as a by specifying --coordinator
>>>> >>>         ./postgres --coordinator -D ../data_cord3 -p 5455
>>>> >>>
>>>> >>> 14) Create the new coordinator on rest of the coordinators and
>>>> reload
>>>> >>> configuration
>>>> >>>         CREATE NODE COORD_3 WITH (HOST = 'localhost', type =
>>>> >>> 'coordinator', PORT = 5455);
>>>> >>>         SELECT pgxc_pool_reload();
>>>> >>>
>>>> >>> 15) The new coordinator is now ready
>>>> >>>         ./psql test -p 5455
>>>> >>>         create table test_new_coord(a int, b int);
>>>> >>>         \q
>>>> >>>         ./psql test -p 5432
>>>> >>>         select * from test_new_coord;
>>>> >>>
>>>> >>>
>>>> >>> Here are the steps to add a new datanode
>>>> >>>
>>>> >>>
>>>> >>>  1) Initdb new datanode
>>>> >>>         /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data3
>>>> --nodename
>>>> >>> data_node_3
>>>> >>>
>>>> >>>  2) Make necessary changes in its postgresql.conf, in particular
>>>> specify
>>>> >>> new datanode name
>>>> >>>
>>>> >>>  3) Connect to any of the existing coordinators & lock the cluster
>>>> for
>>>> >>> backup
>>>> >>>         ./psql postgres -p 5432
>>>> >>>         SET xc_lock_for_backup=yes;
>>>> >>>         \q
>>>> >>>
>>>> >>>  4) Connect to any of the existing datanodes and take backup of the
>>>> >>> database
>>>> >>>         ./pg_dump -p 15432 -C -s
>>>> >>> --file=/home/edb/Desktop/NodeAddition/dumps/102_all_objects_dn1.sql
>>>> test
>>>> >>>
>>>> >>>  5) Start the new datanode specify --restoremode while starting the
>>>> it
>>>> >>>         ./postgres --restoremode -D ../data3 -p 35432
>>>> >>>
>>>> >>>  6) Create the new database on the new datanode
>>>> >>>         ./createdb test -p 35432
>>>> >>>
>>>> >>>  7) create the roles and table spaces manually, the dump does not
>>>> contain
>>>> >>> roles or table spaces
>>>> >>>         ./psql test -p 35432
>>>> >>>         CREATE ROLE admin WITH LOGIN CREATEDB CREATEROLE;
>>>> >>>         CREATE TABLESPACE my_space LOCATION
>>>> >>> '/usr/local/pgsql/my_space_location';
>>>> >>>         \q
>>>> >>>
>>>> >>>  8) Restore the backup that was taken from an existing datanode by
>>>> >>> connecting to the new datanode directly
>>>> >>>          ./psql -d test -f
>>>> >>> /home/edb/Desktop/NodeAddition/dumps/102_all_objects_dn1.sql -p
>>>> 35432
>>>> >>>
>>>> >>>  9) Quit the new datanode
>>>> >>>
>>>> >>> 10) Connect to any of the existing coordinators & unlock the cluster
>>>> >>>         ./psql postgres -p 5432
>>>> >>>         SET xc_lock_for_backup=no;
>>>> >>>         \q
>>>> >>>
>>>> >>> 11) Start the new datanode as a datanode by specifying --datanode
>>>> >>>         ./postgres --datanode -D ../data3 -p 35432
>>>> >>>
>>>> >>> 12) Create the new datanode on all the coordinators and reload
>>>> >>> configuration
>>>> >>>         CREATE NODE DATA_NODE_3 WITH (HOST = 'localhost', type =
>>>> >>> 'datanode', PORT = 35432);
>>>> >>>         SELECT pgxc_pool_reload();
>>>> >>>
>>>> >>> 13) Redistribute data by using ALTER TABLE REDISTRIBUTE
>>>> >>>
>>>> >>> 14) The new daatnode is now ready
>>>> >>>         ./psql test
>>>> >>>         create table test_new_dn(a int, b int) distribute by
>>>> replication;
>>>> >>>         insert into test_new_dn values(1,2);
>>>> >>>         EXECUTE DIRECT ON (data_node_1) 'SELECT * from test_new_dn';
>>>> >>>         EXECUTE DIRECT ON (data_node_2) 'SELECT * from test_new_dn';
>>>> >>>         EXECUTE DIRECT ON (data_node_3) 'SELECT * from test_new_dn';
>>>> >>>
>>>> >>> Please note that the steps assume that the patch sent earlier
>>>> >>> 1_lock_cluster.patch in mail subject [Patch to lock cluster] is
>>>> applied.
>>>> >>>
>>>> >>> I have also attached test database scripts, that would help in patch
>>>> >>> review.
>>>> >>>
>>>> >>> Comments are welcome.
>>>> >>>
>>>> >>> --
>>>> >>> Abbas
>>>> >>> Architect
>>>> >>> EnterpriseDB Corporation
>>>> >>> The Enterprise PostgreSQL Company
>>>> >>>
>>>> >>> Phone: 92-334-5100153
>>>> >>>
>>>> >>> Website: www.enterprisedb.com
>>>> >>> EnterpriseDB Blog: http://blogs.enterprisedb.com/
>>>> >>> Follow us on Twitter: http://www.twitter.com/enterprisedb
>>>> >>>
>>>> >>> This e-mail message (and any attachment) is intended for the use of
>>>> >>> the individual or entity to whom it is addressed. This message
>>>> >>> contains information from EnterpriseDB Corporation that may be
>>>> >>> privileged, confidential, or exempt from disclosure under applicable
>>>> >>> law. If you are not the intended recipient or authorized to receive
>>>> >>> this for the intended recipient, any use, dissemination,
>>>> distribution,
>>>> >>> retention, archiving, or copying of this communication is strictly
>>>> >>> prohibited. If you have received this e-mail in error, please notify
>>>> >>> the sender immediately by reply e-mail and delete this message.
>>>> >>>
>>>> >>>
>>>> ------------------------------------------------------------------------------
>>>> >>> Everyone hates slow websites. So do we.
>>>> >>> Make your web apps faster with AppDynamics
>>>> >>> Download AppDynamics Lite for free today:
>>>> >>> http://p.sf.net/sfu/appdyn_d2d_feb
>>>> >>> _______________________________________________
>>>> >>> Postgres-xc-developers mailing list
>>>> >>> Pos...@li...
>>>> >>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>>> >>>
>>>> >>
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > --
>>>> > Abbas
>>>> > Architect
>>>> > EnterpriseDB Corporation
>>>> > The Enterprise PostgreSQL Company
>>>> >
>>>> > Phone: 92-334-5100153
>>>> >
>>>> > Website: www.enterprisedb.com
>>>> > EnterpriseDB Blog: http://blogs.enterprisedb.com/
>>>> > Follow us on Twitter: http://www.twitter.com/enterprisedb
>>>> >
>>>> > This e-mail message (and any attachment) is intended for the use of
>>>> > the individual or entity to whom it is addressed. This message
>>>> > contains information from EnterpriseDB Corporation that may be
>>>> > privileged, confidential, or exempt from disclosure under applicable
>>>> > law. If you are not the intended recipient or authorized to receive
>>>> > this for the intended recipient, any use, dissemination, distribution,
>>>> > retention, archiving, or copying of this communication is strictly
>>>> > prohibited. If you have received this e-mail in error, please notify
>>>> > the sender immediately by reply e-mail and delete this message.
>>>>
>>>
>>>
>>>
>>> --
>>> --
>>> Abbas
>>> Architect
>>> EnterpriseDB Corporation
>>> The Enterprise PostgreSQL Company
>>>
>>> Phone: 92-334-5100153
>>>
>>> Website: www.enterprisedb.com
>>> EnterpriseDB Blog: http://blogs.enterprisedb.com/
>>> Follow us on Twitter: http://www.twitter.com/enterprisedb
>>>
>>> This e-mail message (and any attachment) is intended for the use of
>>> the individual or entity to whom it is addressed. This message
>>> contains information from EnterpriseDB Corporation that may be
>>> privileged, confidential, or exempt from disclosure under applicable
>>> law. If you are not the intended recipient or authorized to receive
>>> this for the intended recipient, any use, dissemination, distribution,
>>> retention, archiving, or copying of this communication is strictly
>>> prohibited. If you have received this e-mail in error, please notify
>>> the sender immediately by reply e-mail and delete this message.
>>
>>
>>
>>
>> --
>> --
>> Abbas
>> Architect
>> EnterpriseDB Corporation
>> The Enterprise PostgreSQL Company
>>
>> Phone: 92-334-5100153
>>
>> Website: www.enterprisedb.com
>> EnterpriseDB Blog: http://blogs.enterprisedb.com/
>> Follow us on Twitter: http://www.twitter.com/enterprisedb
>>
>> This e-mail message (and any attachment) is intended for the use of
>> the individual or entity to whom it is addressed. This message
>> contains information from EnterpriseDB Corporation that may be
>> privileged, confidential, or exempt from disclosure under applicable
>> law. If you are not the intended recipient or authorized to receive
>> this for the intended recipient, any use, dissemination, distribution,
>> retention, archiving, or copying of this communication is strictly
>> prohibited. If you have received this e-mail in error, please notify
>> the sender immediately by reply e-mail and delete this message.
>>
>
>


-- 
--
Abbas
Architect
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: 92-334-5100153

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of
the individual or entity to whom it is addressed. This message
contains information from EnterpriseDB Corporation that may be
privileged, confidential, or exempt from disclosure under applicable
law. If you are not the intended recipient or authorized to receive
this for the intended recipient, any use, dissemination, distribution,
retention, archiving, or copying of this communication is strictly
prohibited. If you have received this e-mail in error, please notify
the sender immediately by reply e-mail and delete this message.

Re: [Postgres-xc-developers] Patch for fetching only required columns in case of DMLs - Bug 3609665

From: Ahsan H. <ahs...@en...> - 2013-04-01 08:21:25

Hi Amit,

As i understand this is not directly related to the trigger implementation
but it is a optimization in general for DML queries and it will how affect
how trigger function? Secondly you want to test your trigger implementation
with this optimization in place? I believe we need to have this
optimization checked in before the feature freeze date.

So i am wondering if it is worth Ashutosh reviewing this now or waiting to
review this after you have submitted the trigger implementation patch. I am
assuming that you will take most/all of this week to submit the trigger's
patch?

On Mon, Apr 1, 2013 at 11:58 AM, Amit Khandekar <
ami...@en...> wrote:

> The reason I wanted to get this checked in is because I wanted to test the
> trigger work with how the final target list would look like , i.e. having
> just ctid, node_id attributes. Currently the code accidentally was picking
> up the OLD row incorrectly from the redundant attributes, and then I
> realized it should not be doing so, although the results were correct.
>
> But I am ok if you feel you want to review the trigger work first. I have
> begun with the tuplestore-approach implementation for  AFTER triggers.
>

Did you do any benchmarking between the two approaches before you decided
to go with the tuplestore-approach?

Thanks,
Ahsan



>
>
> On 1 April 2013 12:17, Ashutosh Bapat <ash...@en...>wrote:
>
>>
>>
>> On Mon, Apr 1, 2013 at 12:14 PM, Amit Khandekar <
>> ami...@en...> wrote:
>>
>>> It will not block or affect the trigger support as far as correctness is
>>> concerned. But for triggers, we need to fetch the complete row, so we don't
>>> want to include both the OLD row and  all the unnecessary columns in the
>>> SELECT list.
>>>
>>>
>> If it's so, I think, we will leave it aside for now, complete minimal
>> trigger work and then get back to it. Is that fine?
>>
>>
>>> Currently the SELECT is like this :
>>> SELECT col1, col2, col3 .... , ctid, xc_node_id from tab1
>>> For triggers, it would be :
>>> SELECT col1, col2, col3 .... , , tab1.* ctid, xc_node_id from tab1
>>>
>>> What we want is :
>>> Without triggers:
>>> SELECT ctid, xc_node_id from tab1
>>> With triggers:
>>> SELECT ctid, xc_node_id, tab1.*  from tab1
>>>
>>>
>>>
>>>
>>> On 1 April 2013 10:49, Ashutosh Bapat <ash...@en...>wrote:
>>>
>>>> Hi Amit,
>>>> Does this have any relation with triggers, if so, can you please
>>>> explain this relation?
>>>>
>>>> On Mon, Apr 1, 2013 at 9:57 AM, Amit Khandekar <
>>>> ami...@en...> wrote:
>>>>
>>>>> Attached is a patch dml-targetlist.patch which prevents unnecessary
>>>>> columns from being fetched from the ModifyTable SELECT subplans (bug
>>>>> 3609665<https://sourceforge.net/tracker/?func=detail&aid=3609665&group_id=311227&atid=1310232>
>>>>> )
>>>>>
>>>>> -------
>>>>>
>>>>> In expand_targetlist(), currently Vars of all the table attributes
>>>>> that are absent are added into the target list.  The patch adds NULL
>>>>> constants instead of Var nodes, so that those attributes will not be
>>>>> fetched from the SELECT. We still want to keep NULL constants because we
>>>>> need the complete tuple. We don't want to change the PG way of having all
>>>>> the attributes of the NEW ROW tuple descriptor of the source plan slot.
>>>>> Also other PG things like constraint expressions and triggers assume this
>>>>> format, so it is convenient to not modify this design.
>>>>>
>>>>> Using NULL constants will cause the remote SELECT statement to
>>>>> completely exclude all the unnecessary columns. It does not even have NULLs.
>>>>>
>>>>>
>>>>> Below is how the new plan will look now:
>>>>>
>>>>> # explain verbose update newtab *set name5 = f(name5) where name4 =
>>>>> f(name4)*;
>>>>>  Update on public.newtab  (cost=0.00..250.00 rows=1000 width=234)
>>>>>    Node/s: data_node_1, data_node_2
>>>>>    Node expr: newtab.id
>>>>>    Remote query: UPDATE ONLY public.newtab SET name5 = $6 WHERE ctid =
>>>>> $7 AND xc_node_id = $9
>>>>>    ->  Data Node Scan on newtab "_REMOTE_TABLE_QUERY_"
>>>>> (cost=0.00..250.00 rows=1000 width=234)
>>>>>          Output: newtab.id, NULL::character varying, NULL::character
>>>>> varying, NULL::character varying, NULL::character varying, f(newtab.name5),
>>>>> newtab.ctid, newtab.*, newtab.xc_node_id
>>>>>          Node/s: data_node_1, data_node_2
>>>>>          Remote query: *SELECT id, name5, ctid, newtab.*::newtab,
>>>>> xc_node_id, name4 *FROM ONLY newtab WHERE true
>>>>>          Coordinator quals: ((newtab.name4)::text =
>>>>> (f(newtab.name4))::text)
>>>>>
>>>>>
>>>>> Notice the following:
>>>>>
>>>>> 1. Remote query now does not have any other columns namely name1,
>>>>> name2, name3.
>>>>>
>>>>> 2. The column id is still present. This is because it is a
>>>>> distribution column, and currently we use this column to decide the
>>>>> target node at execution time. Unfortunately I could not exclude the
>>>>> distribution column from the SELECT subplan target list. Doing this
>>>>> required nodeid-based target datanode determination for updates/deletes.
>>>>> Currently update/deletes use distribution column even if we fetch
>>>>> xc_node_id. I had almost come up with this nodeid-based implementation
>>>>> except that EXPLAIN did not correctly show the en_expr expression if that
>>>>> expression is a xc_node_id Var. Have attached another patch
>>>>> (working_nodeid_based.patch) that includes both : selecting reqd columns,
>>>>> plus node_id based target datanode determination. Due to the
>>>>> explain-of-en_expr issue, we need to first checkin dml-targetlist.patch,
>>>>> because that optimization is important for trigger implementation. The
>>>>> nodeid-based implementation does not compliment any of the changes in this
>>>>> patch.
>>>>>
>>>>> -------
>>>>>
>>>>> In rewriteTargetListUD(), we had added into the parse->targetlist the
>>>>> attributes need for quals; the RemoteQuery->base_tlist anyways does this
>>>>> job of getting the attributes added up in the base_tlist so it is not
>>>>> required to do this in rewrite. I have removed it.
>>>>>
>>>>> -------
>>>>>
>>>>> Since now there are only required columns in the SELECT subplan, we
>>>>> cannot fire check and not-null constraints on coordinator. For this, in the
>>>>> patch, we now include those columns that are referred in the constraint
>>>>> expressions. Also, we keep the not-null constraint check deferred for the
>>>>> datanode. We anyways need the final values for the constraints to get
>>>>> fired, and on datanode we know these are the final values. So in general
>>>>> when possible we should not fire constraints on coordinator. When all the
>>>>> constraints are shippable, they should be fired on datanode, else they
>>>>> should be fired on coordinator. In fact, we don't currently have a check
>>>>> whether the constraint is shippable, so this is a bug. Have added this
>>>>> constraint-shippablity check also.
>>>>>
>>>>> ------
>>>>>
>>>>> In the future, we could also consider optimization in the BIND data
>>>>> row. Currently it includes NULL values for parameters that are not used in
>>>>> the remote statement. We could just skip those parameters instead and thus
>>>>> reduce the data row to some extent. But this requires tweaking the
>>>>> parameter numbers ($1, $2 etc) generated in the deparsed statement.
>>>>>
>>>>> ------
>>>>>
>>>>>
>>>>> Tests
>>>>> ======
>>>>>
>>>>> I have used xc_constraints test to add new check-constraint related
>>>>> tests. Otherwise, the existing tests are more than enough to test this
>>>>> patch. See the other regression test changes in the patch to get an idea of
>>>>> how this patch affects the plan target list.
>>>>>
>>>>> In xc_FQS.sql, the target list of FQSed queries are printed. I suspect
>>>>> that the target list does not reflect correct columns currently. I suspect
>>>>> set_plan_references changes the varattnos but they do not end up  correctly
>>>>> reflecting the actual target list. The diff in the test file is because we
>>>>> have lesser columns, but the column names anyway do not show the correct
>>>>> columns.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>> Own the Future-Intel&reg; Level Up Game Demo Contest 2013
>>>>> Rise to greatness in Intel's independent game demo contest.
>>>>> Compete for recognition, cash, and the chance to get your game
>>>>> on Steam. $5K grand prize plus 10 genre and skill prizes.
>>>>> Submit your demo by 6/6/13. http://p.sf.net/sfu/intel_levelupd2d
>>>>> _______________________________________________
>>>>> Postgres-xc-developers mailing list
>>>>> Pos...@li...
>>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best Wishes,
>>>> Ashutosh Bapat
>>>> EntepriseDB Corporation
>>>> The Enterprise Postgres Company
>>>>
>>>
>>>
>>
>>
>> --
>> Best Wishes,
>> Ashutosh Bapat
>> EntepriseDB Corporation
>> The Enterprise Postgres Company
>>
>
>
>
> ------------------------------------------------------------------------------
> Own the Future-Intel&reg; Level Up Game Demo Contest 2013
> Rise to greatness in Intel's independent game demo contest.
> Compete for recognition, cash, and the chance to get your game
> on Steam. $5K grand prize plus 10 genre and skill prizes.
> Submit your demo by 6/6/13. http://p.sf.net/sfu/intel_levelupd2d
> _______________________________________________
> Postgres-xc-developers mailing list
> Pos...@li...
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>
>


-- 
Ahsan Hadi
Snr Director Product Development
EnterpriseDB Corporation
The Enterprise Postgres Company

Phone: +92-51-8358874
Mobile: +92-333-5162114

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of the
individual or entity to whom it is addressed. This message contains
information from EnterpriseDB Corporation that may be privileged,
confidential, or exempt from disclosure under applicable law. If you are
not the intended recipient or authorized to receive this for the intended
recipient, any use, dissemination, distribution, retention, archiving, or
copying of this communication is strictly prohibited. If you have received
this e-mail in error, please notify the sender immediately by reply e-mail
and delete this message.

Re: [Postgres-xc-developers] Add a new node to the cluster [Feature ID 3608379]

From: Abbas B. <abb...@en...> - 2013-04-01 08:18:15

On Mon, Apr 1, 2013 at 8:21 AM, Koichi Suzuki <koi...@gm...>wrote:

> Thank you very much for the summary.   It helps much.
>
> I have a couple of questions on this.
> ----------
> Koichi Suzuki
>
>
> 2013/3/31 Abbas Butt <abb...@en...>
>
>> Hi,
>> Attached please find the revised patch for restore mode. This patch has
>> to be applied on top of the patches I sent earlier for
>>  3608377,
>>  3608376 &
>>  3608375.
>>
>> I have also attached some scripts and a C file useful for testing the
>> whole procedure. It is a database that has many objects in it.
>>
>> Here are the revised instructions for adding new nodes to the cluster.
>>
>> ======================================
>>
>> Here are the steps to add a new coordinator
>>
>>  1) Initdb new coordinator
>>   /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data_cord3 --nodename
>> coord_3
>>
>>  2) Make necessary changes in its postgresql.conf, in particular specify
>> new coordinator name and pooler port
>>
>>  3) Connect to any of the existing coordinators & lock the cluster for
>> backup, do not close this session
>>     ./psql postgres -p 5432
>>     select pgxc_lock_for_backup();
>>
>
>>  4) Connect to any of the existing coordinators and take backup of the
>> database
>>     ./pg_dumpall -p 5432 -s --include-nodes --dump-nodes
>> --file=/home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1100_all_objects_coord.sql
>>
>
>
> Here, because only the default database is available, if the user is not
> "postgres" or default database specified in initdb, we need to specify the
> database name (and the owner name, if necessary).   Similar comments to
> others.
>

Agreed, once can choose to specify the default database to connect to.


>
>>  5) Start the new coordinator specify --restoremode while starting the
>> coordinator
>>     ./postgres --restoremode -D ../data_cord3 -p 5455
>>
>
>>  6) Create the new database on the new coordinator - optional
>>     ./createdb test -p 5455
>>
>
> I believe that pg_dumpall copies the definition of existing databases and
> this operation is usually unnecessary. If new database is needed, then this
> should be created after the new coordinator is up and registered to all the
> other coordinators.
>

True, this is an optional step and would not be required in many cases.


>
>
>
>>  7) Restore the backup that was taken from an existing coordinator by
>> connecting to the new coordinator directly
>>     ./psql -d test -f
>> /home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1100_all_objects_coord.sql
>> -p 5455
>>
>
>>  8) Quit the new coordinator
>>
>>  9) Start the new coordinator as a by specifying --coordinator
>>     ./postgres --coordinator -D ../data_cord3 -p 5455
>>
>> 10) Create the new coordinator on rest of the coordinators and reload
>> configuration
>>      CREATE NODE COORD_3 WITH (HOST = 'localhost', type = 'coordinator',
>> PORT = 5455);
>>     SELECT pgxc_pool_reload();
>>
>> 11) Quit the session of step 3, this will unlock the cluster
>>
>> 12) The new coordinator is now ready
>>     ./psql test -p 5455
>>     create table test_new_coord(a int, b int);
>>     \q
>>     ./psql test -p 5432
>>     select * from test_new_coord;
>>
>> *======================================*
>> *======================================*
>>
>> Here are the steps to add a new datanode
>>
>>
>>  1) Initdb new datanode
>>     /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data3 --nodename
>> data_node_3
>>
>>  2) Make necessary changes in its postgresql.conf, in particular specify
>> new datanode name
>>
>>  3) Connect to any of the existing coordinators & lock the cluster for
>> backup, do not close this session
>>     ./psql postgres -p 5432
>>     select pgxc_lock_for_backup();
>>
>>  4) Connect to any of the existing datanodes and take backup of the
>> database
>>     ./pg_dumpall -p 15432 -s --include-nodes
>> --file=/home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1122_all_objects_dn1.sql
>>
>>  5) Start the new datanode specify --restoremode while starting the it
>>     ./postgres --restoremode -D ../data3 -p 35432
>>
>>  6) Restore the backup that was taken from an existing datanode by
>> connecting to the new datanode directly
>>      ./psql -d postgres -f
>> /home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1122_all_objects_dn1.sql
>> -p 35432
>>
>>  7) Quit the new datanode
>>
>
> Same comment as the coordinator.
>
>>
>>  8) Start the new datanode as a datanode by specifying --datanode
>>     ./postgres --datanode -D ../data3 -p 35432
>>
>>  9) Create the new datanode on all the coordinators and reload
>> configuration
>>     CREATE NODE DATA_NODE_3 WITH (HOST = 'localhost', type = 'datanode',
>> PORT = 35432);
>>     SELECT pgxc_pool_reload();
>>
>> 10) Quit the session of step 3, this will unlock the cluster
>>
>> 11) Redistribute data by using ALTER TABLE REDISTRIBUTE
>>
>> 12) The new daatnode is now ready
>>     ./psql test
>>     create table test_new_dn(a int, b int) distribute by replication;
>>     insert into test_new_dn values(1,2);
>>     EXECUTE DIRECT ON (data_node_1) 'SELECT * from test_new_dn';
>>     EXECUTE DIRECT ON (data_node_2) 'SELECT * from test_new_dn';
>>     EXECUTE DIRECT ON (data_node_3) 'SELECT * from test_new_dn';
>>
>> ======================================
>>
>> On Wed, Mar 27, 2013 at 5:02 PM, Abbas Butt <abb...@en...>wrote:
>>
>>> Feature ID 3608379
>>>
>>> On Fri, Mar 1, 2013 at 5:48 PM, Amit Khandekar <
>>> ami...@en...> wrote:
>>>
>>>> On 1 March 2013 01:30, Abbas Butt <abb...@en...> wrote:
>>>> >
>>>> >
>>>> > On Thu, Feb 28, 2013 at 12:44 PM, Amit Khandekar
>>>> > <ami...@en...> wrote:
>>>> >>
>>>> >>
>>>> >>
>>>> >> On 28 February 2013 10:23, Abbas Butt <abb...@en...>
>>>> wrote:
>>>> >>>
>>>> >>> Hi All,
>>>> >>>
>>>> >>> Attached please find a patch that provides a new command line
>>>> argument
>>>> >>> for postgres called --restoremode.
>>>> >>>
>>>> >>> While adding a new node to the cluster we need to restore the
>>>> schema of
>>>> >>> existing database to the new node.
>>>> >>> If the new node is a datanode and we connect directly to it, it
>>>> does not
>>>> >>> allow DDL, because it is in read only mode &
>>>> >>> If the new node is a coordinator, it will send DDLs to all the other
>>>> >>> coordinators which we do not want it to do.
>>>> >>
>>>> >>
>>>> >> What if we allow writes in standalone mode, so that we would
>>>> initialize
>>>> >> the new node using standalone mode instead of --restoremode ?
>>>> >
>>>> >
>>>> > Please take a look at the patch, I am using --restoremode in place of
>>>> > --coordinator & --datanode. I am not sure how would stand alone mode
>>>> fit in
>>>> > here.
>>>>
>>>> I was trying to see if we can avoid adding a new mode, instead, use
>>>> standalone mode for all the purposes for which restoremode is used.
>>>> Actually I checked the documentation, it says this mode is used only
>>>> for debugging or recovery purposes, so now I myself am a bit hesitent
>>>> about this mode for the purpose of restoring.
>>>>
>>>> >
>>>> >>
>>>> >>
>>>> >>>
>>>> >>> To provide ability to restore on the new node a new command line
>>>> argument
>>>> >>> is provided.
>>>> >>> It is to be provided in place of --coordinator OR --datanode.
>>>> >>> In restore mode both coordinator and datanode are internally
>>>> treated as a
>>>> >>> datanode.
>>>> >>> For more details see patch comments.
>>>> >>>
>>>> >>> After this patch one can add a new node to the cluster.
>>>> >>>
>>>> >>> Here are the steps to add a new coordinator
>>>> >>>
>>>> >>>
>>>> >>>  1) Initdb new coordinator
>>>> >>>         /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data_cord3
>>>> >>> --nodename coord_3
>>>> >>>
>>>> >>>  2) Make necessary changes in its postgresql.conf, in particular
>>>> specify
>>>> >>> new coordinator name and pooler port
>>>> >>>
>>>> >>>  3) Connect to any of the existing coordinators & lock the cluster
>>>> for
>>>> >>> backup
>>>> >>>         ./psql postgres -p 5432
>>>> >>>         SET xc_lock_for_backup=yes;
>>>> >>>         \q
>>>> >>
>>>> >>
>>>> >> I haven't given a thought on the earlier patch you sent for cluster
>>>> lock
>>>> >> implementation; may be we can discuss this on that thread, but just
>>>> a quick
>>>> >> question:
>>>> >>
>>>> >> Does the cluster-lock command wait for the ongoing DDL commands to
>>>> finish
>>>> >> ? If not, we have problems. The subsequent pg_dump would not contain
>>>> objects
>>>> >> created by these particular DDLs.
>>>> >
>>>> >
>>>> > Suppose you have a two coordinator cluster. Assume one client
>>>> connected to
>>>> > each. Suppose one client issues a lock cluster command and the other
>>>> issues
>>>> > a DDL. Is this what you mean by an ongoing DDL? If true then answer
>>>> to your
>>>> > question is Yes.
>>>> >
>>>> > Suppose you have a prepared transaction that has a DDL in it, again
>>>> if this
>>>> > can be considered an on going DDL, then again answer to your question
>>>> is
>>>> > Yes.
>>>> >
>>>> > Suppose you have a two coordinator cluster. Assume one client
>>>> connected to
>>>> > each. One client starts a transaction and issues a DDL, the second
>>>> client
>>>> > issues a lock cluster command, the first commits the transaction. If
>>>> this is
>>>> > an ongoing DDL, then the answer to your question is No. But its a
>>>> matter of
>>>> > deciding which camp are we going to put COMMIT in, the allow camp, or
>>>> the
>>>> > deny camp. I decided to put it in allow camp, because I have not yet
>>>> written
>>>> > any code to detect whether a transaction being committed has a DDL in
>>>> it or
>>>> > not, and stopping all transactions from committing looks too
>>>> restrictive to
>>>> > me.
>>>> >
>>>> > Do you have some other meaning of an ongoing DDL?
>>>> >
>>>> > I agree that we should have discussed this on the right thread. Lets
>>>> > continue this discussion on that thread.
>>>>
>>>> Continued on the other thread.
>>>>
>>>> >
>>>> >>
>>>> >>
>>>> >>>
>>>> >>>
>>>> >>>  4) Connect to any of the existing coordinators and take backup of
>>>> the
>>>> >>> database
>>>> >>>         ./pg_dump -p 5432 -C -s
>>>> >>>
>>>> --file=/home/edb/Desktop/NodeAddition/dumps/101_all_objects_coord.sql test
>>>> >>>
>>>> >>>  5) Start the new coordinator specify --restoremode while starting
>>>> the
>>>> >>> coordinator
>>>> >>>         ./postgres --restoremode -D ../data_cord3 -p 5455
>>>> >>>
>>>> >>>  6) connect to the new coordinator directly
>>>> >>>         ./psql postgres -p 5455
>>>> >>>
>>>> >>>  7) create all the datanodes and the rest of the coordinators on
>>>> the new
>>>> >>> coordiantor & reload configuration
>>>> >>>         CREATE NODE DATA_NODE_1 WITH (HOST = 'localhost', type =
>>>> >>> 'datanode', PORT = 15432, PRIMARY);
>>>> >>>         CREATE NODE DATA_NODE_2 WITH (HOST = 'localhost', type =
>>>> >>> 'datanode', PORT = 25432);
>>>> >>>
>>>> >>>         CREATE NODE COORD_1 WITH (HOST = 'localhost', type =
>>>> >>> 'coordinator', PORT = 5432);
>>>> >>>         CREATE NODE COORD_2 WITH (HOST = 'localhost', type =
>>>> >>> 'coordinator', PORT = 5433);
>>>> >>>
>>>> >>>         SELECT pgxc_pool_reload();
>>>> >>>
>>>> >>>  8) quit psql
>>>> >>>
>>>> >>>  9) Create the new database on the new coordinator
>>>> >>>         ./createdb test -p 5455
>>>> >>>
>>>> >>> 10) create the roles and table spaces manually, the dump does not
>>>> contain
>>>> >>> roles or table spaces
>>>> >>>         ./psql test -p 5455
>>>> >>>         CREATE ROLE admin WITH LOGIN CREATEDB CREATEROLE;
>>>> >>>         CREATE TABLESPACE my_space LOCATION
>>>> >>> '/usr/local/pgsql/my_space_location';
>>>> >>>         \q
>>>> >>>
>>>> >>
>>>> >> Will pg_dumpall help ? It dumps roles also.
>>>> >
>>>> >
>>>> > Yah , but I am giving example of pg_dump so this step has to be there.
>>>> >
>>>> >>
>>>> >>
>>>> >>
>>>> >>>
>>>> >>> 11) Restore the backup that was taken from an existing coordinator
>>>> by
>>>> >>> connecting to the new coordinator directly
>>>> >>>          ./psql -d test -f
>>>> >>> /home/edb/Desktop/NodeAddition/dumps/101_all_objects_coord.sql -p
>>>> 5455
>>>> >>>
>>>> >>> 11) Quit the new coordinator
>>>> >>>
>>>> >>> 12) Connect to any of the existing coordinators & unlock the cluster
>>>> >>>         ./psql postgres -p 5432
>>>> >>>         SET xc_lock_for_backup=no;
>>>> >>>         \q
>>>> >>>
>>>> >>
>>>> >> Unlocking the cluster has to be done *after* the node is added into
>>>> the
>>>> >> cluster.
>>>> >
>>>> >
>>>> > Very true. I stand corrected. This means CREATE NODE has to be
>>>> allowed when
>>>> > xc_lock_for_backup is set.
>>>> >
>>>> >>
>>>> >>
>>>> >>
>>>> >>>
>>>> >>> 13) Start the new coordinator as a by specifying --coordinator
>>>> >>>         ./postgres --coordinator -D ../data_cord3 -p 5455
>>>> >>>
>>>> >>> 14) Create the new coordinator on rest of the coordinators and
>>>> reload
>>>> >>> configuration
>>>> >>>         CREATE NODE COORD_3 WITH (HOST = 'localhost', type =
>>>> >>> 'coordinator', PORT = 5455);
>>>> >>>         SELECT pgxc_pool_reload();
>>>> >>>
>>>> >>> 15) The new coordinator is now ready
>>>> >>>         ./psql test -p 5455
>>>> >>>         create table test_new_coord(a int, b int);
>>>> >>>         \q
>>>> >>>         ./psql test -p 5432
>>>> >>>         select * from test_new_coord;
>>>> >>>
>>>> >>>
>>>> >>> Here are the steps to add a new datanode
>>>> >>>
>>>> >>>
>>>> >>>  1) Initdb new datanode
>>>> >>>         /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data3
>>>> --nodename
>>>> >>> data_node_3
>>>> >>>
>>>> >>>  2) Make necessary changes in its postgresql.conf, in particular
>>>> specify
>>>> >>> new datanode name
>>>> >>>
>>>> >>>  3) Connect to any of the existing coordinators & lock the cluster
>>>> for
>>>> >>> backup
>>>> >>>         ./psql postgres -p 5432
>>>> >>>         SET xc_lock_for_backup=yes;
>>>> >>>         \q
>>>> >>>
>>>> >>>  4) Connect to any of the existing datanodes and take backup of the
>>>> >>> database
>>>> >>>         ./pg_dump -p 15432 -C -s
>>>> >>> --file=/home/edb/Desktop/NodeAddition/dumps/102_all_objects_dn1.sql
>>>> test
>>>> >>>
>>>> >>>  5) Start the new datanode specify --restoremode while starting the
>>>> it
>>>> >>>         ./postgres --restoremode -D ../data3 -p 35432
>>>> >>>
>>>> >>>  6) Create the new database on the new datanode
>>>> >>>         ./createdb test -p 35432
>>>> >>>
>>>> >>>  7) create the roles and table spaces manually, the dump does not
>>>> contain
>>>> >>> roles or table spaces
>>>> >>>         ./psql test -p 35432
>>>> >>>         CREATE ROLE admin WITH LOGIN CREATEDB CREATEROLE;
>>>> >>>         CREATE TABLESPACE my_space LOCATION
>>>> >>> '/usr/local/pgsql/my_space_location';
>>>> >>>         \q
>>>> >>>
>>>> >>>  8) Restore the backup that was taken from an existing datanode by
>>>> >>> connecting to the new datanode directly
>>>> >>>          ./psql -d test -f
>>>> >>> /home/edb/Desktop/NodeAddition/dumps/102_all_objects_dn1.sql -p
>>>> 35432
>>>> >>>
>>>> >>>  9) Quit the new datanode
>>>> >>>
>>>> >>> 10) Connect to any of the existing coordinators & unlock the cluster
>>>> >>>         ./psql postgres -p 5432
>>>> >>>         SET xc_lock_for_backup=no;
>>>> >>>         \q
>>>> >>>
>>>> >>> 11) Start the new datanode as a datanode by specifying --datanode
>>>> >>>         ./postgres --datanode -D ../data3 -p 35432
>>>> >>>
>>>> >>> 12) Create the new datanode on all the coordinators and reload
>>>> >>> configuration
>>>> >>>         CREATE NODE DATA_NODE_3 WITH (HOST = 'localhost', type =
>>>> >>> 'datanode', PORT = 35432);
>>>> >>>         SELECT pgxc_pool_reload();
>>>> >>>
>>>> >>> 13) Redistribute data by using ALTER TABLE REDISTRIBUTE
>>>> >>>
>>>> >>> 14) The new daatnode is now ready
>>>> >>>         ./psql test
>>>> >>>         create table test_new_dn(a int, b int) distribute by
>>>> replication;
>>>> >>>         insert into test_new_dn values(1,2);
>>>> >>>         EXECUTE DIRECT ON (data_node_1) 'SELECT * from test_new_dn';
>>>> >>>         EXECUTE DIRECT ON (data_node_2) 'SELECT * from test_new_dn';
>>>> >>>         EXECUTE DIRECT ON (data_node_3) 'SELECT * from test_new_dn';
>>>> >>>
>>>> >>> Please note that the steps assume that the patch sent earlier
>>>> >>> 1_lock_cluster.patch in mail subject [Patch to lock cluster] is
>>>> applied.
>>>> >>>
>>>> >>> I have also attached test database scripts, that would help in patch
>>>> >>> review.
>>>> >>>
>>>> >>> Comments are welcome.
>>>> >>>
>>>> >>> --
>>>> >>> Abbas
>>>> >>> Architect
>>>> >>> EnterpriseDB Corporation
>>>> >>> The Enterprise PostgreSQL Company
>>>> >>>
>>>> >>> Phone: 92-334-5100153
>>>> >>>
>>>> >>> Website: www.enterprisedb.com
>>>> >>> EnterpriseDB Blog: http://blogs.enterprisedb.com/
>>>> >>> Follow us on Twitter: http://www.twitter.com/enterprisedb
>>>> >>>
>>>> >>> This e-mail message (and any attachment) is intended for the use of
>>>> >>> the individual or entity to whom it is addressed. This message
>>>> >>> contains information from EnterpriseDB Corporation that may be
>>>> >>> privileged, confidential, or exempt from disclosure under applicable
>>>> >>> law. If you are not the intended recipient or authorized to receive
>>>> >>> this for the intended recipient, any use, dissemination,
>>>> distribution,
>>>> >>> retention, archiving, or copying of this communication is strictly
>>>> >>> prohibited. If you have received this e-mail in error, please notify
>>>> >>> the sender immediately by reply e-mail and delete this message.
>>>> >>>
>>>> >>>
>>>> ------------------------------------------------------------------------------
>>>> >>> Everyone hates slow websites. So do we.
>>>> >>> Make your web apps faster with AppDynamics
>>>> >>> Download AppDynamics Lite for free today:
>>>> >>> http://p.sf.net/sfu/appdyn_d2d_feb
>>>> >>> _______________________________________________
>>>> >>> Postgres-xc-developers mailing list
>>>> >>> Pos...@li...
>>>> >>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>>> >>>
>>>> >>
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > --
>>>> > Abbas
>>>> > Architect
>>>> > EnterpriseDB Corporation
>>>> > The Enterprise PostgreSQL Company
>>>> >
>>>> > Phone: 92-334-5100153
>>>> >
>>>> > Website: www.enterprisedb.com
>>>> > EnterpriseDB Blog: http://blogs.enterprisedb.com/
>>>> > Follow us on Twitter: http://www.twitter.com/enterprisedb
>>>> >
>>>> > This e-mail message (and any attachment) is intended for the use of
>>>> > the individual or entity to whom it is addressed. This message
>>>> > contains information from EnterpriseDB Corporation that may be
>>>> > privileged, confidential, or exempt from disclosure under applicable
>>>> > law. If you are not the intended recipient or authorized to receive
>>>> > this for the intended recipient, any use, dissemination, distribution,
>>>> > retention, archiving, or copying of this communication is strictly
>>>> > prohibited. If you have received this e-mail in error, please notify
>>>> > the sender immediately by reply e-mail and delete this message.
>>>>
>>>
>>>
>>>
>>> --
>>> --
>>> Abbas
>>> Architect
>>> EnterpriseDB Corporation
>>> The Enterprise PostgreSQL Company
>>>
>>> Phone: 92-334-5100153
>>>
>>> Website: www.enterprisedb.com
>>> EnterpriseDB Blog: http://blogs.enterprisedb.com/
>>> Follow us on Twitter: http://www.twitter.com/enterprisedb
>>>
>>> This e-mail message (and any attachment) is intended for the use of
>>> the individual or entity to whom it is addressed. This message
>>> contains information from EnterpriseDB Corporation that may be
>>> privileged, confidential, or exempt from disclosure under applicable
>>> law. If you are not the intended recipient or authorized to receive
>>> this for the intended recipient, any use, dissemination, distribution,
>>> retention, archiving, or copying of this communication is strictly
>>> prohibited. If you have received this e-mail in error, please notify
>>> the sender immediately by reply e-mail and delete this message.
>>
>>
>>
>>
>> --
>> --
>> Abbas
>> Architect
>> EnterpriseDB Corporation
>> The Enterprise PostgreSQL Company
>>
>> Phone: 92-334-5100153
>>
>> Website: www.enterprisedb.com
>> EnterpriseDB Blog: http://blogs.enterprisedb.com/
>> Follow us on Twitter: http://www.twitter.com/enterprisedb
>>
>> This e-mail message (and any attachment) is intended for the use of
>> the individual or entity to whom it is addressed. This message
>> contains information from EnterpriseDB Corporation that may be
>> privileged, confidential, or exempt from disclosure under applicable
>> law. If you are not the intended recipient or authorized to receive
>> this for the intended recipient, any use, dissemination, distribution,
>> retention, archiving, or copying of this communication is strictly
>> prohibited. If you have received this e-mail in error, please notify
>> the sender immediately by reply e-mail and delete this message.
>>
>>
>> ------------------------------------------------------------------------------
>> Own the Future-Intel(R) Level Up Game Demo Contest 2013
>> Rise to greatness in Intel's independent game demo contest. Compete
>> for recognition, cash, and the chance to get your game on Steam.
>> $5K grand prize plus 10 genre and skill prizes. Submit your demo
>> by 6/6/13. http://altfarm.mediaplex.com/ad/ck/12124-176961-30367-2
>>
>> _______________________________________________
>> Postgres-xc-developers mailing list
>> Pos...@li...
>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>
>>
>


-- 
--
Abbas
Architect
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: 92-334-5100153

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of
the individual or entity to whom it is addressed. This message
contains information from EnterpriseDB Corporation that may be
privileged, confidential, or exempt from disclosure under applicable
law. If you are not the intended recipient or authorized to receive
this for the intended recipient, any use, dissemination, distribution,
retention, archiving, or copying of this communication is strictly
prohibited. If you have received this e-mail in error, please notify
the sender immediately by reply e-mail and delete this message.

Re: [Postgres-xc-developers] Patch for fetching only required columns in case of DMLs - Bug 3609665

From: Amit K. <ami...@en...> - 2013-04-01 06:58:52

The reason I wanted to get this checked in is because I wanted to test the
trigger work with how the final target list would look like , i.e. having
just ctid, node_id attributes. Currently the code accidentally was picking
up the OLD row incorrectly from the redundant attributes, and then I
realized it should not be doing so, although the results were correct.

But I am ok if you feel you want to review the trigger work first. I have
begun with the tuplestore-approach implementation for  AFTER triggers.



On 1 April 2013 12:17, Ashutosh Bapat <ash...@en...>wrote:

>
>
> On Mon, Apr 1, 2013 at 12:14 PM, Amit Khandekar <
> ami...@en...> wrote:
>
>> It will not block or affect the trigger support as far as correctness is
>> concerned. But for triggers, we need to fetch the complete row, so we don't
>> want to include both the OLD row and  all the unnecessary columns in the
>> SELECT list.
>>
>>
> If it's so, I think, we will leave it aside for now, complete minimal
> trigger work and then get back to it. Is that fine?
>
>
>> Currently the SELECT is like this :
>> SELECT col1, col2, col3 .... , ctid, xc_node_id from tab1
>> For triggers, it would be :
>> SELECT col1, col2, col3 .... , , tab1.* ctid, xc_node_id from tab1
>>
>> What we want is :
>> Without triggers:
>> SELECT ctid, xc_node_id from tab1
>> With triggers:
>> SELECT ctid, xc_node_id, tab1.*  from tab1
>>
>>
>>
>>
>> On 1 April 2013 10:49, Ashutosh Bapat <ash...@en...>wrote:
>>
>>> Hi Amit,
>>> Does this have any relation with triggers, if so, can you please explain
>>> this relation?
>>>
>>> On Mon, Apr 1, 2013 at 9:57 AM, Amit Khandekar <
>>> ami...@en...> wrote:
>>>
>>>> Attached is a patch dml-targetlist.patch which prevents unnecessary
>>>> columns from being fetched from the ModifyTable SELECT subplans (bug
>>>> 3609665<https://sourceforge.net/tracker/?func=detail&aid=3609665&group_id=311227&atid=1310232>
>>>> )
>>>>
>>>> -------
>>>>
>>>> In expand_targetlist(), currently Vars of all the table attributes that
>>>> are absent are added into the target list.  The patch adds NULL constants
>>>> instead of Var nodes, so that those attributes will not be fetched from the
>>>> SELECT. We still want to keep NULL constants because we need the complete
>>>> tuple. We don't want to change the PG way of having all the attributes of
>>>> the NEW ROW tuple descriptor of the source plan slot. Also other PG things
>>>> like constraint expressions and triggers assume this format, so it is
>>>> convenient to not modify this design.
>>>>
>>>> Using NULL constants will cause the remote SELECT statement to
>>>> completely exclude all the unnecessary columns. It does not even have NULLs.
>>>>
>>>>
>>>> Below is how the new plan will look now:
>>>>
>>>> # explain verbose update newtab *set name5 = f(name5) where name4 =
>>>> f(name4)*;
>>>>  Update on public.newtab  (cost=0.00..250.00 rows=1000 width=234)
>>>>    Node/s: data_node_1, data_node_2
>>>>    Node expr: newtab.id
>>>>    Remote query: UPDATE ONLY public.newtab SET name5 = $6 WHERE ctid =
>>>> $7 AND xc_node_id = $9
>>>>    ->  Data Node Scan on newtab "_REMOTE_TABLE_QUERY_"
>>>> (cost=0.00..250.00 rows=1000 width=234)
>>>>          Output: newtab.id, NULL::character varying, NULL::character
>>>> varying, NULL::character varying, NULL::character varying, f(newtab.name5),
>>>> newtab.ctid, newtab.*, newtab.xc_node_id
>>>>          Node/s: data_node_1, data_node_2
>>>>          Remote query: *SELECT id, name5, ctid, newtab.*::newtab,
>>>> xc_node_id, name4 *FROM ONLY newtab WHERE true
>>>>          Coordinator quals: ((newtab.name4)::text =
>>>> (f(newtab.name4))::text)
>>>>
>>>>
>>>> Notice the following:
>>>>
>>>> 1. Remote query now does not have any other columns namely name1,
>>>> name2, name3.
>>>>
>>>> 2. The column id is still present. This is because it is a
>>>> distribution column, and currently we use this column to decide the
>>>> target node at execution time. Unfortunately I could not exclude the
>>>> distribution column from the SELECT subplan target list. Doing this
>>>> required nodeid-based target datanode determination for updates/deletes.
>>>> Currently update/deletes use distribution column even if we fetch
>>>> xc_node_id. I had almost come up with this nodeid-based implementation
>>>> except that EXPLAIN did not correctly show the en_expr expression if that
>>>> expression is a xc_node_id Var. Have attached another patch
>>>> (working_nodeid_based.patch) that includes both : selecting reqd columns,
>>>> plus node_id based target datanode determination. Due to the
>>>> explain-of-en_expr issue, we need to first checkin dml-targetlist.patch,
>>>> because that optimization is important for trigger implementation. The
>>>> nodeid-based implementation does not compliment any of the changes in this
>>>> patch.
>>>>
>>>> -------
>>>>
>>>> In rewriteTargetListUD(), we had added into the parse->targetlist the
>>>> attributes need for quals; the RemoteQuery->base_tlist anyways does this
>>>> job of getting the attributes added up in the base_tlist so it is not
>>>> required to do this in rewrite. I have removed it.
>>>>
>>>> -------
>>>>
>>>> Since now there are only required columns in the SELECT subplan, we
>>>> cannot fire check and not-null constraints on coordinator. For this, in the
>>>> patch, we now include those columns that are referred in the constraint
>>>> expressions. Also, we keep the not-null constraint check deferred for the
>>>> datanode. We anyways need the final values for the constraints to get
>>>> fired, and on datanode we know these are the final values. So in general
>>>> when possible we should not fire constraints on coordinator. When all the
>>>> constraints are shippable, they should be fired on datanode, else they
>>>> should be fired on coordinator. In fact, we don't currently have a check
>>>> whether the constraint is shippable, so this is a bug. Have added this
>>>> constraint-shippablity check also.
>>>>
>>>> ------
>>>>
>>>> In the future, we could also consider optimization in the BIND data
>>>> row. Currently it includes NULL values for parameters that are not used in
>>>> the remote statement. We could just skip those parameters instead and thus
>>>> reduce the data row to some extent. But this requires tweaking the
>>>> parameter numbers ($1, $2 etc) generated in the deparsed statement.
>>>>
>>>> ------
>>>>
>>>>
>>>> Tests
>>>> ======
>>>>
>>>> I have used xc_constraints test to add new check-constraint related
>>>> tests. Otherwise, the existing tests are more than enough to test this
>>>> patch. See the other regression test changes in the patch to get an idea of
>>>> how this patch affects the plan target list.
>>>>
>>>> In xc_FQS.sql, the target list of FQSed queries are printed. I suspect
>>>> that the target list does not reflect correct columns currently. I suspect
>>>> set_plan_references changes the varattnos but they do not end up  correctly
>>>> reflecting the actual target list. The diff in the test file is because we
>>>> have lesser columns, but the column names anyway do not show the correct
>>>> columns.
>>>>
>>>>
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Own the Future-Intel&reg; Level Up Game Demo Contest 2013
>>>> Rise to greatness in Intel's independent game demo contest.
>>>> Compete for recognition, cash, and the chance to get your game
>>>> on Steam. $5K grand prize plus 10 genre and skill prizes.
>>>> Submit your demo by 6/6/13. http://p.sf.net/sfu/intel_levelupd2d
>>>> _______________________________________________
>>>> Postgres-xc-developers mailing list
>>>> Pos...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>>>
>>>>
>>>
>>>
>>> --
>>> Best Wishes,
>>> Ashutosh Bapat
>>> EntepriseDB Corporation
>>> The Enterprise Postgres Company
>>>
>>
>>
>
>
> --
> Best Wishes,
> Ashutosh Bapat
> EntepriseDB Corporation
> The Enterprise Postgres Company
>

Re: [Postgres-xc-developers] Patch for fetching only required columns in case of DMLs - Bug 3609665

From: Ashutosh B. <ash...@en...> - 2013-04-01 06:48:03

On Mon, Apr 1, 2013 at 12:14 PM, Amit Khandekar <
ami...@en...> wrote:

> It will not block or affect the trigger support as far as correctness is
> concerned. But for triggers, we need to fetch the complete row, so we don't
> want to include both the OLD row and  all the unnecessary columns in the
> SELECT list.
>
>
If it's so, I think, we will leave it aside for now, complete minimal
trigger work and then get back to it. Is that fine?


> Currently the SELECT is like this :
> SELECT col1, col2, col3 .... , ctid, xc_node_id from tab1
> For triggers, it would be :
> SELECT col1, col2, col3 .... , , tab1.* ctid, xc_node_id from tab1
>
> What we want is :
> Without triggers:
> SELECT ctid, xc_node_id from tab1
> With triggers:
> SELECT ctid, xc_node_id, tab1.*  from tab1
>
>
>
>
> On 1 April 2013 10:49, Ashutosh Bapat <ash...@en...>wrote:
>
>> Hi Amit,
>> Does this have any relation with triggers, if so, can you please explain
>> this relation?
>>
>> On Mon, Apr 1, 2013 at 9:57 AM, Amit Khandekar <
>> ami...@en...> wrote:
>>
>>> Attached is a patch dml-targetlist.patch which prevents unnecessary
>>> columns from being fetched from the ModifyTable SELECT subplans (bug
>>> 3609665<https://sourceforge.net/tracker/?func=detail&aid=3609665&group_id=311227&atid=1310232>
>>> )
>>>
>>> -------
>>>
>>> In expand_targetlist(), currently Vars of all the table attributes that
>>> are absent are added into the target list.  The patch adds NULL constants
>>> instead of Var nodes, so that those attributes will not be fetched from the
>>> SELECT. We still want to keep NULL constants because we need the complete
>>> tuple. We don't want to change the PG way of having all the attributes of
>>> the NEW ROW tuple descriptor of the source plan slot. Also other PG things
>>> like constraint expressions and triggers assume this format, so it is
>>> convenient to not modify this design.
>>>
>>> Using NULL constants will cause the remote SELECT statement to
>>> completely exclude all the unnecessary columns. It does not even have NULLs.
>>>
>>>
>>> Below is how the new plan will look now:
>>>
>>> # explain verbose update newtab *set name5 = f(name5) where name4 =
>>> f(name4)*;
>>>  Update on public.newtab  (cost=0.00..250.00 rows=1000 width=234)
>>>    Node/s: data_node_1, data_node_2
>>>    Node expr: newtab.id
>>>    Remote query: UPDATE ONLY public.newtab SET name5 = $6 WHERE ctid =
>>> $7 AND xc_node_id = $9
>>>    ->  Data Node Scan on newtab "_REMOTE_TABLE_QUERY_"
>>> (cost=0.00..250.00 rows=1000 width=234)
>>>          Output: newtab.id, NULL::character varying, NULL::character
>>> varying, NULL::character varying, NULL::character varying, f(newtab.name5),
>>> newtab.ctid, newtab.*, newtab.xc_node_id
>>>          Node/s: data_node_1, data_node_2
>>>          Remote query: *SELECT id, name5, ctid, newtab.*::newtab,
>>> xc_node_id, name4 *FROM ONLY newtab WHERE true
>>>          Coordinator quals: ((newtab.name4)::text =
>>> (f(newtab.name4))::text)
>>>
>>>
>>> Notice the following:
>>>
>>> 1. Remote query now does not have any other columns namely name1,
>>> name2, name3.
>>>
>>> 2. The column id is still present. This is because it is a distribution
>>> column, and currently we use this column to decide the target node at
>>> execution time. Unfortunately I could not exclude the distribution column
>>> from the SELECT subplan target list. Doing this required nodeid-based
>>> target datanode determination for updates/deletes. Currently update/deletes
>>> use distribution column even if we fetch xc_node_id. I had almost come up
>>> with this nodeid-based implementation except that EXPLAIN did not correctly
>>> show the en_expr expression if that expression is a xc_node_id Var. Have
>>> attached another patch (working_nodeid_based.patch) that includes both :
>>> selecting reqd columns, plus node_id based target datanode determination.
>>> Due to the explain-of-en_expr issue, we need to first checkin
>>> dml-targetlist.patch, because that optimization is important for trigger
>>> implementation. The nodeid-based implementation does not compliment any of
>>> the changes in this patch.
>>>
>>> -------
>>>
>>> In rewriteTargetListUD(), we had added into the parse->targetlist the
>>> attributes need for quals; the RemoteQuery->base_tlist anyways does this
>>> job of getting the attributes added up in the base_tlist so it is not
>>> required to do this in rewrite. I have removed it.
>>>
>>> -------
>>>
>>> Since now there are only required columns in the SELECT subplan, we
>>> cannot fire check and not-null constraints on coordinator. For this, in the
>>> patch, we now include those columns that are referred in the constraint
>>> expressions. Also, we keep the not-null constraint check deferred for the
>>> datanode. We anyways need the final values for the constraints to get
>>> fired, and on datanode we know these are the final values. So in general
>>> when possible we should not fire constraints on coordinator. When all the
>>> constraints are shippable, they should be fired on datanode, else they
>>> should be fired on coordinator. In fact, we don't currently have a check
>>> whether the constraint is shippable, so this is a bug. Have added this
>>> constraint-shippablity check also.
>>>
>>> ------
>>>
>>> In the future, we could also consider optimization in the BIND data row.
>>> Currently it includes NULL values for parameters that are not used in the
>>> remote statement. We could just skip those parameters instead and thus
>>> reduce the data row to some extent. But this requires tweaking the
>>> parameter numbers ($1, $2 etc) generated in the deparsed statement.
>>>
>>> ------
>>>
>>>
>>> Tests
>>> ======
>>>
>>> I have used xc_constraints test to add new check-constraint related
>>> tests. Otherwise, the existing tests are more than enough to test this
>>> patch. See the other regression test changes in the patch to get an idea of
>>> how this patch affects the plan target list.
>>>
>>> In xc_FQS.sql, the target list of FQSed queries are printed. I suspect
>>> that the target list does not reflect correct columns currently. I suspect
>>> set_plan_references changes the varattnos but they do not end up  correctly
>>> reflecting the actual target list. The diff in the test file is because we
>>> have lesser columns, but the column names anyway do not show the correct
>>> columns.
>>>
>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Own the Future-Intel&reg; Level Up Game Demo Contest 2013
>>> Rise to greatness in Intel's independent game demo contest.
>>> Compete for recognition, cash, and the chance to get your game
>>> on Steam. $5K grand prize plus 10 genre and skill prizes.
>>> Submit your demo by 6/6/13. http://p.sf.net/sfu/intel_levelupd2d
>>> _______________________________________________
>>> Postgres-xc-developers mailing list
>>> Pos...@li...
>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>>
>>>
>>
>>
>> --
>> Best Wishes,
>> Ashutosh Bapat
>> EntepriseDB Corporation
>> The Enterprise Postgres Company
>>
>
>


-- 
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Enterprise Postgres Company

Re: [Postgres-xc-developers] Patch for fetching only required columns in case of DMLs - Bug 3609665

From: Amit K. <ami...@en...> - 2013-04-01 06:44:53

It will not block or affect the trigger support as far as correctness is
concerned. But for triggers, we need to fetch the complete row, so we don't
want to include both the OLD row and  all the unnecessary columns in the
SELECT list.

Currently the SELECT is like this :
SELECT col1, col2, col3 .... , ctid, xc_node_id from tab1
For triggers, it would be :
SELECT col1, col2, col3 .... , , tab1.* ctid, xc_node_id from tab1

What we want is :
Without triggers:
SELECT ctid, xc_node_id from tab1
With triggers:
SELECT ctid, xc_node_id, tab1.*  from tab1



On 1 April 2013 10:49, Ashutosh Bapat <ash...@en...>wrote:

> Hi Amit,
> Does this have any relation with triggers, if so, can you please explain
> this relation?
>
> On Mon, Apr 1, 2013 at 9:57 AM, Amit Khandekar <
> ami...@en...> wrote:
>
>> Attached is a patch dml-targetlist.patch which prevents unnecessary
>> columns from being fetched from the ModifyTable SELECT subplans (bug
>> 3609665<https://sourceforge.net/tracker/?func=detail&aid=3609665&group_id=311227&atid=1310232>
>> )
>>
>> -------
>>
>> In expand_targetlist(), currently Vars of all the table attributes that
>> are absent are added into the target list.  The patch adds NULL constants
>> instead of Var nodes, so that those attributes will not be fetched from the
>> SELECT. We still want to keep NULL constants because we need the complete
>> tuple. We don't want to change the PG way of having all the attributes of
>> the NEW ROW tuple descriptor of the source plan slot. Also other PG things
>> like constraint expressions and triggers assume this format, so it is
>> convenient to not modify this design.
>>
>> Using NULL constants will cause the remote SELECT statement to completely
>> exclude all the unnecessary columns. It does not even have NULLs.
>>
>>
>> Below is how the new plan will look now:
>>
>> # explain verbose update newtab *set name5 = f(name5) where name4 =
>> f(name4)*;
>>  Update on public.newtab  (cost=0.00..250.00 rows=1000 width=234)
>>    Node/s: data_node_1, data_node_2
>>    Node expr: newtab.id
>>    Remote query: UPDATE ONLY public.newtab SET name5 = $6 WHERE ctid = $7
>> AND xc_node_id = $9
>>    ->  Data Node Scan on newtab "_REMOTE_TABLE_QUERY_"
>> (cost=0.00..250.00 rows=1000 width=234)
>>          Output: newtab.id, NULL::character varying, NULL::character
>> varying, NULL::character varying, NULL::character varying, f(newtab.name5),
>> newtab.ctid, newtab.*, newtab.xc_node_id
>>          Node/s: data_node_1, data_node_2
>>          Remote query: *SELECT id, name5, ctid, newtab.*::newtab,
>> xc_node_id, name4 *FROM ONLY newtab WHERE true
>>          Coordinator quals: ((newtab.name4)::text =
>> (f(newtab.name4))::text)
>>
>>
>> Notice the following:
>>
>> 1. Remote query now does not have any other columns namely name1, name2,
>> name3.
>>
>> 2. The column id is still present. This is because it is a distribution
>> column, and currently we use this column to decide the target node at
>> execution time. Unfortunately I could not exclude the distribution column
>> from the SELECT subplan target list. Doing this required nodeid-based
>> target datanode determination for updates/deletes. Currently update/deletes
>> use distribution column even if we fetch xc_node_id. I had almost come up
>> with this nodeid-based implementation except that EXPLAIN did not correctly
>> show the en_expr expression if that expression is a xc_node_id Var. Have
>> attached another patch (working_nodeid_based.patch) that includes both :
>> selecting reqd columns, plus node_id based target datanode determination.
>> Due to the explain-of-en_expr issue, we need to first checkin
>> dml-targetlist.patch, because that optimization is important for trigger
>> implementation. The nodeid-based implementation does not compliment any of
>> the changes in this patch.
>>
>> -------
>>
>> In rewriteTargetListUD(), we had added into the parse->targetlist the
>> attributes need for quals; the RemoteQuery->base_tlist anyways does this
>> job of getting the attributes added up in the base_tlist so it is not
>> required to do this in rewrite. I have removed it.
>>
>> -------
>>
>> Since now there are only required columns in the SELECT subplan, we
>> cannot fire check and not-null constraints on coordinator. For this, in the
>> patch, we now include those columns that are referred in the constraint
>> expressions. Also, we keep the not-null constraint check deferred for the
>> datanode. We anyways need the final values for the constraints to get
>> fired, and on datanode we know these are the final values. So in general
>> when possible we should not fire constraints on coordinator. When all the
>> constraints are shippable, they should be fired on datanode, else they
>> should be fired on coordinator. In fact, we don't currently have a check
>> whether the constraint is shippable, so this is a bug. Have added this
>> constraint-shippablity check also.
>>
>> ------
>>
>> In the future, we could also consider optimization in the BIND data row.
>> Currently it includes NULL values for parameters that are not used in the
>> remote statement. We could just skip those parameters instead and thus
>> reduce the data row to some extent. But this requires tweaking the
>> parameter numbers ($1, $2 etc) generated in the deparsed statement.
>>
>> ------
>>
>>
>> Tests
>> ======
>>
>> I have used xc_constraints test to add new check-constraint related
>> tests. Otherwise, the existing tests are more than enough to test this
>> patch. See the other regression test changes in the patch to get an idea of
>> how this patch affects the plan target list.
>>
>> In xc_FQS.sql, the target list of FQSed queries are printed. I suspect
>> that the target list does not reflect correct columns currently. I suspect
>> set_plan_references changes the varattnos but they do not end up  correctly
>> reflecting the actual target list. The diff in the test file is because we
>> have lesser columns, but the column names anyway do not show the correct
>> columns.
>>
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Own the Future-Intel&reg; Level Up Game Demo Contest 2013
>> Rise to greatness in Intel's independent game demo contest.
>> Compete for recognition, cash, and the chance to get your game
>> on Steam. $5K grand prize plus 10 genre and skill prizes.
>> Submit your demo by 6/6/13. http://p.sf.net/sfu/intel_levelupd2d
>> _______________________________________________
>> Postgres-xc-developers mailing list
>> Pos...@li...
>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>
>>
>
>
> --
> Best Wishes,
> Ashutosh Bapat
> EntepriseDB Corporation
> The Enterprise Postgres Company
>

Re: [Postgres-xc-developers] Add a new node to the cluster [Feature ID 3608379]

From: Amit K. <ami...@en...> - 2013-04-01 06:02:56

On 31 March 2013 14:07, Abbas Butt <abb...@en...> wrote:

> Hi,
> Attached please find the revised patch for restore mode. This patch has to
> be applied on top of the patches I sent earlier for
>  3608377,
>  3608376 &
>  3608375.
>
> I have also attached some scripts and a C file useful for testing the
> whole procedure. It is a database that has many objects in it.
>
> Here are the revised instructions for adding new nodes to the cluster.
>
> ======================================
>
> Here are the steps to add a new coordinator
>
>  1) Initdb new coordinator
>   /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data_cord3 --nodename
> coord_3
>
>  2) Make necessary changes in its postgresql.conf, in particular specify
> new coordinator name and pooler port
>
>  3) Connect to any of the existing coordinators & lock the cluster for
> backup, do not close this session
>     ./psql postgres -p 5432
>     select pgxc_lock_for_backup();
>
>  4) Connect to any of the existing coordinators and take backup of the
> database
>     ./pg_dumpall -p 5432 -s --include-nodes --dump-nodes
> --file=/home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1100_all_objects_coord.sql
>
>  5) Start the new coordinator specify --restoremode while starting the
> coordinator
>     ./postgres --restoremode -D ../data_cord3 -p 5455
>
>  6) Create the new database on the new coordinator - optional
>     ./createdb test -p 5455
>
>  7) Restore the backup that was taken from an existing coordinator by
> connecting to the new coordinator directly
>     ./psql -d test -f
> /home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1100_all_objects_coord.sql
> -p 5455
>
>  8) Quit the new coordinator
>
>  9) Start the new coordinator as a by specifying --coordinator
>     ./postgres --coordinator -D ../data_cord3 -p 5455
>
> 10) Create the new coordinator on rest of the coordinators and reload
> configuration
>      CREATE NODE COORD_3 WITH (HOST = 'localhost', type = 'coordinator',
> PORT = 5455);
>     SELECT pgxc_pool_reload();
>
> 11) Quit the session of step 3, this will unlock the cluster
>
> 12) The new coordinator is now ready
>     ./psql test -p 5455
>     create table test_new_coord(a int, b int);
>     \q
>     ./psql test -p 5432
>     select * from test_new_coord;
>
> *======================================*
> *======================================*
>
> Here are the steps to add a new datanode
>
>
>  1) Initdb new datanode
>     /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data3 --nodename
> data_node_3
>
>  2) Make necessary changes in its postgresql.conf, in particular specify
> new datanode name
>
>  3) Connect to any of the existing coordinators & lock the cluster for
> backup, do not close this session
>     ./psql postgres -p 5432
>     select pgxc_lock_for_backup();
>
>  4) Connect to any of the existing datanodes and take backup of the
> database
>     ./pg_dumpall -p 15432 -s --include-nodes
> --file=/home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1122_all_objects_dn1.sql
>


Why do we need --include-nodes on datanode ?

----


+     *    The dump taken from a datanode does NOT contain any DISTRIBUTE BY
+     *    clause. This fact is used here to make sure that when the
+     *    DISTRIBUTE BY clause is missing in the statemnet the system
+     *    should not try to find out the node list itself.
+     */
+    if ((IS_PGXC_COORDINATOR || (isRestoreMode && stmt->distributeby !=
NULL))
+        && relkind == RELKIND_RELATION)

How do we enforce not having DISTRIBUTE BY clause in the pg_dump output if
it's a datanode ? Also, can we just error out in restore mode if the
DISTRIBUTE BY clause is present ?

-----


>  5) Start the new datanode specify --restoremode while starting the it
>     ./postgres --restoremode -D ../data3 -p 35432
>


It seems you have disabled use of GTM in restore mode. For e.g. in
GetNewTransactionId(), we get a global tansaction id only if it's a
coordinator or if IsPGXCNodeXactDatanodeDirect() is true. But
IsPGXCNodeXactDatanodeDirect() will now return false in restore mode. Is
there any specific reason for disabling use of GTM in restore mode ? I
don't see any harm in using GTM. In fact, it is better to start using
global xids as soon as possible.


>
>  6) Restore the backup that was taken from an existing datanode by
> connecting to the new datanode directly
>      ./psql -d postgres -f
> /home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1122_all_objects_dn1.sql
> -p 35432
>
>  7) Quit the new datanode
>
>  8) Start the new datanode as a datanode by specifying --datanode
>     ./postgres --datanode -D ../data3 -p 35432
>
>  9) Create the new datanode on all the coordinators and reload
> configuration
>     CREATE NODE DATA_NODE_3 WITH (HOST = 'localhost', type = 'datanode',
> PORT = 35432);
>     SELECT pgxc_pool_reload();
>
> 10) Quit the session of step 3, this will unlock the cluster
>
> 11) Redistribute data by using ALTER TABLE REDISTRIBUTE
>
> 12) The new daatnode is now ready
>     ./psql test
>     create table test_new_dn(a int, b int) distribute by replication;
>     insert into test_new_dn values(1,2);
>     EXECUTE DIRECT ON (data_node_1) 'SELECT * from test_new_dn';
>     EXECUTE DIRECT ON (data_node_2) 'SELECT * from test_new_dn';
>     EXECUTE DIRECT ON (data_node_3) 'SELECT * from test_new_dn';
>
> ======================================
>
> On Wed, Mar 27, 2013 at 5:02 PM, Abbas Butt <abb...@en...>wrote:
>
>> Feature ID 3608379
>>
>> On Fri, Mar 1, 2013 at 5:48 PM, Amit Khandekar <
>> ami...@en...> wrote:
>>
>>> On 1 March 2013 01:30, Abbas Butt <abb...@en...> wrote:
>>> >
>>> >
>>> > On Thu, Feb 28, 2013 at 12:44 PM, Amit Khandekar
>>> > <ami...@en...> wrote:
>>> >>
>>> >>
>>> >>
>>> >> On 28 February 2013 10:23, Abbas Butt <abb...@en...>
>>> wrote:
>>> >>>
>>> >>> Hi All,
>>> >>>
>>> >>> Attached please find a patch that provides a new command line
>>> argument
>>> >>> for postgres called --restoremode.
>>> >>>
>>> >>> While adding a new node to the cluster we need to restore the schema
>>> of
>>> >>> existing database to the new node.
>>> >>> If the new node is a datanode and we connect directly to it, it does
>>> not
>>> >>> allow DDL, because it is in read only mode &
>>> >>> If the new node is a coordinator, it will send DDLs to all the other
>>> >>> coordinators which we do not want it to do.
>>> >>
>>> >>
>>> >> What if we allow writes in standalone mode, so that we would
>>> initialize
>>> >> the new node using standalone mode instead of --restoremode ?
>>> >
>>> >
>>> > Please take a look at the patch, I am using --restoremode in place of
>>> > --coordinator & --datanode. I am not sure how would stand alone mode
>>> fit in
>>> > here.
>>>
>>> I was trying to see if we can avoid adding a new mode, instead, use
>>> standalone mode for all the purposes for which restoremode is used.
>>> Actually I checked the documentation, it says this mode is used only
>>> for debugging or recovery purposes, so now I myself am a bit hesitent
>>> about this mode for the purpose of restoring.
>>>
>>> >
>>> >>
>>> >>
>>> >>>
>>> >>> To provide ability to restore on the new node a new command line
>>> argument
>>> >>> is provided.
>>> >>> It is to be provided in place of --coordinator OR --datanode.
>>> >>> In restore mode both coordinator and datanode are internally treated
>>> as a
>>> >>> datanode.
>>> >>> For more details see patch comments.
>>> >>>
>>> >>> After this patch one can add a new node to the cluster.
>>> >>>
>>> >>> Here are the steps to add a new coordinator
>>> >>>
>>> >>>
>>> >>>  1) Initdb new coordinator
>>> >>>         /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data_cord3
>>> >>> --nodename coord_3
>>> >>>
>>> >>>  2) Make necessary changes in its postgresql.conf, in particular
>>> specify
>>> >>> new coordinator name and pooler port
>>> >>>
>>> >>>  3) Connect to any of the existing coordinators & lock the cluster
>>> for
>>> >>> backup
>>> >>>         ./psql postgres -p 5432
>>> >>>         SET xc_lock_for_backup=yes;
>>> >>>         \q
>>> >>
>>> >>
>>> >> I haven't given a thought on the earlier patch you sent for cluster
>>> lock
>>> >> implementation; may be we can discuss this on that thread, but just a
>>> quick
>>> >> question:
>>> >>
>>> >> Does the cluster-lock command wait for the ongoing DDL commands to
>>> finish
>>> >> ? If not, we have problems. The subsequent pg_dump would not contain
>>> objects
>>> >> created by these particular DDLs.
>>> >
>>> >
>>> > Suppose you have a two coordinator cluster. Assume one client
>>> connected to
>>> > each. Suppose one client issues a lock cluster command and the other
>>> issues
>>> > a DDL. Is this what you mean by an ongoing DDL? If true then answer to
>>> your
>>> > question is Yes.
>>> >
>>> > Suppose you have a prepared transaction that has a DDL in it, again if
>>> this
>>> > can be considered an on going DDL, then again answer to your question
>>> is
>>> > Yes.
>>> >
>>> > Suppose you have a two coordinator cluster. Assume one client
>>> connected to
>>> > each. One client starts a transaction and issues a DDL, the second
>>> client
>>> > issues a lock cluster command, the first commits the transaction. If
>>> this is
>>> > an ongoing DDL, then the answer to your question is No. But its a
>>> matter of
>>> > deciding which camp are we going to put COMMIT in, the allow camp, or
>>> the
>>> > deny camp. I decided to put it in allow camp, because I have not yet
>>> written
>>> > any code to detect whether a transaction being committed has a DDL in
>>> it or
>>> > not, and stopping all transactions from committing looks too
>>> restrictive to
>>> > me.
>>> >
>>> > Do you have some other meaning of an ongoing DDL?
>>> >
>>> > I agree that we should have discussed this on the right thread. Lets
>>> > continue this discussion on that thread.
>>>
>>> Continued on the other thread.
>>>
>>> >
>>> >>
>>> >>
>>> >>>
>>> >>>
>>> >>>  4) Connect to any of the existing coordinators and take backup of
>>> the
>>> >>> database
>>> >>>         ./pg_dump -p 5432 -C -s
>>> >>>
>>> --file=/home/edb/Desktop/NodeAddition/dumps/101_all_objects_coord.sql test
>>> >>>
>>> >>>  5) Start the new coordinator specify --restoremode while starting
>>> the
>>> >>> coordinator
>>> >>>         ./postgres --restoremode -D ../data_cord3 -p 5455
>>> >>>
>>> >>>  6) connect to the new coordinator directly
>>> >>>         ./psql postgres -p 5455
>>> >>>
>>> >>>  7) create all the datanodes and the rest of the coordinators on the
>>> new
>>> >>> coordiantor & reload configuration
>>> >>>         CREATE NODE DATA_NODE_1 WITH (HOST = 'localhost', type =
>>> >>> 'datanode', PORT = 15432, PRIMARY);
>>> >>>         CREATE NODE DATA_NODE_2 WITH (HOST = 'localhost', type =
>>> >>> 'datanode', PORT = 25432);
>>> >>>
>>> >>>         CREATE NODE COORD_1 WITH (HOST = 'localhost', type =
>>> >>> 'coordinator', PORT = 5432);
>>> >>>         CREATE NODE COORD_2 WITH (HOST = 'localhost', type =
>>> >>> 'coordinator', PORT = 5433);
>>> >>>
>>> >>>         SELECT pgxc_pool_reload();
>>> >>>
>>> >>>  8) quit psql
>>> >>>
>>> >>>  9) Create the new database on the new coordinator
>>> >>>         ./createdb test -p 5455
>>> >>>
>>> >>> 10) create the roles and table spaces manually, the dump does not
>>> contain
>>> >>> roles or table spaces
>>> >>>         ./psql test -p 5455
>>> >>>         CREATE ROLE admin WITH LOGIN CREATEDB CREATEROLE;
>>> >>>         CREATE TABLESPACE my_space LOCATION
>>> >>> '/usr/local/pgsql/my_space_location';
>>> >>>         \q
>>> >>>
>>> >>
>>> >> Will pg_dumpall help ? It dumps roles also.
>>> >
>>> >
>>> > Yah , but I am giving example of pg_dump so this step has to be there.
>>> >
>>> >>
>>> >>
>>> >>
>>> >>>
>>> >>> 11) Restore the backup that was taken from an existing coordinator by
>>> >>> connecting to the new coordinator directly
>>> >>>          ./psql -d test -f
>>> >>> /home/edb/Desktop/NodeAddition/dumps/101_all_objects_coord.sql -p
>>> 5455
>>> >>>
>>> >>> 11) Quit the new coordinator
>>> >>>
>>> >>> 12) Connect to any of the existing coordinators & unlock the cluster
>>> >>>         ./psql postgres -p 5432
>>> >>>         SET xc_lock_for_backup=no;
>>> >>>         \q
>>> >>>
>>> >>
>>> >> Unlocking the cluster has to be done *after* the node is added into
>>> the
>>> >> cluster.
>>> >
>>> >
>>> > Very true. I stand corrected. This means CREATE NODE has to be allowed
>>> when
>>> > xc_lock_for_backup is set.
>>> >
>>> >>
>>> >>
>>> >>
>>> >>>
>>> >>> 13) Start the new coordinator as a by specifying --coordinator
>>> >>>         ./postgres --coordinator -D ../data_cord3 -p 5455
>>> >>>
>>> >>> 14) Create the new coordinator on rest of the coordinators and reload
>>> >>> configuration
>>> >>>         CREATE NODE COORD_3 WITH (HOST = 'localhost', type =
>>> >>> 'coordinator', PORT = 5455);
>>> >>>         SELECT pgxc_pool_reload();
>>> >>>
>>> >>> 15) The new coordinator is now ready
>>> >>>         ./psql test -p 5455
>>> >>>         create table test_new_coord(a int, b int);
>>> >>>         \q
>>> >>>         ./psql test -p 5432
>>> >>>         select * from test_new_coord;
>>> >>>
>>> >>>
>>> >>> Here are the steps to add a new datanode
>>> >>>
>>> >>>
>>> >>>  1) Initdb new datanode
>>> >>>         /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data3
>>> --nodename
>>> >>> data_node_3
>>> >>>
>>> >>>  2) Make necessary changes in its postgresql.conf, in particular
>>> specify
>>> >>> new datanode name
>>> >>>
>>> >>>  3) Connect to any of the existing coordinators & lock the cluster
>>> for
>>> >>> backup
>>> >>>         ./psql postgres -p 5432
>>> >>>         SET xc_lock_for_backup=yes;
>>> >>>         \q
>>> >>>
>>> >>>  4) Connect to any of the existing datanodes and take backup of the
>>> >>> database
>>> >>>         ./pg_dump -p 15432 -C -s
>>> >>> --file=/home/edb/Desktop/NodeAddition/dumps/102_all_objects_dn1.sql
>>> test
>>> >>>
>>> >>>  5) Start the new datanode specify --restoremode while starting the
>>> it
>>> >>>         ./postgres --restoremode -D ../data3 -p 35432
>>> >>>
>>> >>>  6) Create the new database on the new datanode
>>> >>>         ./createdb test -p 35432
>>> >>>
>>> >>>  7) create the roles and table spaces manually, the dump does not
>>> contain
>>> >>> roles or table spaces
>>> >>>         ./psql test -p 35432
>>> >>>         CREATE ROLE admin WITH LOGIN CREATEDB CREATEROLE;
>>> >>>         CREATE TABLESPACE my_space LOCATION
>>> >>> '/usr/local/pgsql/my_space_location';
>>> >>>         \q
>>> >>>
>>> >>>  8) Restore the backup that was taken from an existing datanode by
>>> >>> connecting to the new datanode directly
>>> >>>          ./psql -d test -f
>>> >>> /home/edb/Desktop/NodeAddition/dumps/102_all_objects_dn1.sql -p 35432
>>> >>>
>>> >>>  9) Quit the new datanode
>>> >>>
>>> >>> 10) Connect to any of the existing coordinators & unlock the cluster
>>> >>>         ./psql postgres -p 5432
>>> >>>         SET xc_lock_for_backup=no;
>>> >>>         \q
>>> >>>
>>> >>> 11) Start the new datanode as a datanode by specifying --datanode
>>> >>>         ./postgres --datanode -D ../data3 -p 35432
>>> >>>
>>> >>> 12) Create the new datanode on all the coordinators and reload
>>> >>> configuration
>>> >>>         CREATE NODE DATA_NODE_3 WITH (HOST = 'localhost', type =
>>> >>> 'datanode', PORT = 35432);
>>> >>>         SELECT pgxc_pool_reload();
>>> >>>
>>> >>> 13) Redistribute data by using ALTER TABLE REDISTRIBUTE
>>> >>>
>>> >>> 14) The new daatnode is now ready
>>> >>>         ./psql test
>>> >>>         create table test_new_dn(a int, b int) distribute by
>>> replication;
>>> >>>         insert into test_new_dn values(1,2);
>>> >>>         EXECUTE DIRECT ON (data_node_1) 'SELECT * from test_new_dn';
>>> >>>         EXECUTE DIRECT ON (data_node_2) 'SELECT * from test_new_dn';
>>> >>>         EXECUTE DIRECT ON (data_node_3) 'SELECT * from test_new_dn';
>>> >>>
>>> >>> Please note that the steps assume that the patch sent earlier
>>> >>> 1_lock_cluster.patch in mail subject [Patch to lock cluster] is
>>> applied.
>>> >>>
>>> >>> I have also attached test database scripts, that would help in patch
>>> >>> review.
>>> >>>
>>> >>> Comments are welcome.
>>> >>>
>>> >>> --
>>> >>> Abbas
>>> >>> Architect
>>> >>> EnterpriseDB Corporation
>>> >>> The Enterprise PostgreSQL Company
>>> >>>
>>> >>> Phone: 92-334-5100153
>>> >>>
>>> >>> Website: www.enterprisedb.com
>>> >>> EnterpriseDB Blog: http://blogs.enterprisedb.com/
>>> >>> Follow us on Twitter: http://www.twitter.com/enterprisedb
>>> >>>
>>> >>> This e-mail message (and any attachment) is intended for the use of
>>> >>> the individual or entity to whom it is addressed. This message
>>> >>> contains information from EnterpriseDB Corporation that may be
>>> >>> privileged, confidential, or exempt from disclosure under applicable
>>> >>> law. If you are not the intended recipient or authorized to receive
>>> >>> this for the intended recipient, any use, dissemination,
>>> distribution,
>>> >>> retention, archiving, or copying of this communication is strictly
>>> >>> prohibited. If you have received this e-mail in error, please notify
>>> >>> the sender immediately by reply e-mail and delete this message.
>>> >>>
>>> >>>
>>> ------------------------------------------------------------------------------
>>> >>> Everyone hates slow websites. So do we.
>>> >>> Make your web apps faster with AppDynamics
>>> >>> Download AppDynamics Lite for free today:
>>> >>> http://p.sf.net/sfu/appdyn_d2d_feb
>>> >>> _______________________________________________
>>> >>> Postgres-xc-developers mailing list
>>> >>> Pos...@li...
>>> >>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>> >>>
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > --
>>> > Abbas
>>> > Architect
>>> > EnterpriseDB Corporation
>>> > The Enterprise PostgreSQL Company
>>> >
>>> > Phone: 92-334-5100153
>>> >
>>> > Website: www.enterprisedb.com
>>> > EnterpriseDB Blog: http://blogs.enterprisedb.com/
>>> > Follow us on Twitter: http://www.twitter.com/enterprisedb
>>> >
>>> > This e-mail message (and any attachment) is intended for the use of
>>> > the individual or entity to whom it is addressed. This message
>>> > contains information from EnterpriseDB Corporation that may be
>>> > privileged, confidential, or exempt from disclosure under applicable
>>> > law. If you are not the intended recipient or authorized to receive
>>> > this for the intended recipient, any use, dissemination, distribution,
>>> > retention, archiving, or copying of this communication is strictly
>>> > prohibited. If you have received this e-mail in error, please notify
>>> > the sender immediately by reply e-mail and delete this message.
>>>
>>
>>
>>
>> --
>> --
>> Abbas
>> Architect
>> EnterpriseDB Corporation
>> The Enterprise PostgreSQL Company
>>
>> Phone: 92-334-5100153
>>
>> Website: www.enterprisedb.com
>> EnterpriseDB Blog: http://blogs.enterprisedb.com/
>> Follow us on Twitter: http://www.twitter.com/enterprisedb
>>
>> This e-mail message (and any attachment) is intended for the use of
>> the individual or entity to whom it is addressed. This message
>> contains information from EnterpriseDB Corporation that may be
>> privileged, confidential, or exempt from disclosure under applicable
>> law. If you are not the intended recipient or authorized to receive
>> this for the intended recipient, any use, dissemination, distribution,
>> retention, archiving, or copying of this communication is strictly
>> prohibited. If you have received this e-mail in error, please notify
>> the sender immediately by reply e-mail and delete this message.
>
>
>
>
> --
> --
> Abbas
> Architect
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>
> Phone: 92-334-5100153
>
> Website: www.enterprisedb.com
> EnterpriseDB Blog: http://blogs.enterprisedb.com/
> Follow us on Twitter: http://www.twitter.com/enterprisedb
>
> This e-mail message (and any attachment) is intended for the use of
> the individual or entity to whom it is addressed. This message
> contains information from EnterpriseDB Corporation that may be
> privileged, confidential, or exempt from disclosure under applicable
> law. If you are not the intended recipient or authorized to receive
> this for the intended recipient, any use, dissemination, distribution,
> retention, archiving, or copying of this communication is strictly
> prohibited. If you have received this e-mail in error, please notify
> the sender immediately by reply e-mail and delete this message.
>

Re: [Postgres-xc-developers] Using remote sorting for merge-join

From: Koichi S. <koi...@gm...> - 2013-04-01 06:02:25

Thanks.  Then 90% improvement means about 53% of the duration, while 50%
means 67% of it.   Number of queries in a given duration is 190 vs. 150,
difference is 40.

Considering the needed resource, it may be okay to begin with
materialization.

Any other inputs?
----------
Koichi Suzuki


2013/4/1 Ashutosh Bapat <ash...@en...>

>
>
> On Mon, Apr 1, 2013 at 10:59 AM, Koichi Suzuki <koi...@gm...>wrote:
>
>> I understand materialize everything makes code clearer and implementation
>> becomes simpler and better structured.
>>
>> What do you mean by x% improvement?   Does 90% improvement mean the total
>> duration is 10% of the original?
>>
> x% improvement means, duration reduces to 100/(100+x) as compared to the
> non-pushdown scenario. Or in simpler words, we see (100+x) queries being
> completed by pushdown approach in the same time in which nonpushdown
> approach completes 100 queries.
>
>> ----------
>> Koichi Suzuki
>>
>>
>> 2013/3/29 Ashutosh Bapat <ash...@en...>
>>
>>> Hi All,
>>> I measured the scale up for both approaches - a. using datanode
>>> connections as tapes (existing one) b. materialising result on tapes before
>>> merging (the approach I proposed). For 1M rows, 5 coordinators I have found
>>> that approach (a) gives 90% improvement whereas approach (b) gives 50%
>>> improvement. Although the difference is significant, I feel that approach
>>> (b) is much cleaner than approach (a) and doesn't have large footprint
>>> compared to PG code and it takes care of all the cases like 1.
>>> materialising sorted result, 2. takes care of any number of datanode
>>> connections without memory overrun. It's possible to improve it further if
>>> we avoid materialisation of datanode result in tuplestore.
>>>
>>> Patch attached for reference.
>>>
>>> On Tue, Mar 26, 2013 at 10:38 AM, Ashutosh Bapat <
>>> ash...@en...> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Mar 26, 2013 at 10:19 AM, Koichi Suzuki <
>>>> koi...@gm...> wrote:
>>>>
>>>>> On thing we should think for option 1 is:
>>>>>
>>>>> When a number of the result is huge, applications has to wait long
>>>>> time until they get the first row.  Because this option may need disk
>>>>> write, total resource consumption will be larger.
>>>>>
>>>>>
>>>> Yes, I am aware of this fact. Please read the next paragraph and you
>>>> will see that the current situation is no better.
>>>>
>>>>
>>>>> I'm wondering if we can use "cursor" at database so that we can read
>>>>> each tape more simply, I mean, to leave each query node open and read
>>>>> next row from any query node.
>>>>>
>>>>>
>>>> We do that right now. But because of such a simulated cursor (it's not
>>>> cursor per say, but we just fetch the required result from connection as
>>>> the demand arises in merging runs), we observer following things
>>>>
>>>> If the plan has multiple remote query nodes (as there will be in case
>>>> of merge join), we assign the same connection to these nodes. Before this
>>>> assignment, the result from the previous connection is materialised at the
>>>> coordinator. This means that, when we will get huge result from the
>>>> datanode, it will be materialised (which will have the more cost as
>>>> materialising it on tape, as this materialisation happens in a linked list,
>>>> which is not optimized). We need to share connection between more than one
>>>> RemoteQuery node because same transaction can not work on two connections
>>>> to same server. Not only performance, but the code has become ugly because
>>>> of this approach. At various places in executor, we have special handling
>>>> for sorting, which needs to be maintained.
>>>>
>>>> Instead if we materialise all the result on tape and then proceed with
>>>> step D5 in Knuth's algorithm for polyphase merge sort, the code will be
>>>> much simpler and we won't loose much performance. In fact, we might be able
>>>> to leverage fetching bulk data on connection which can be materialised on
>>>> tape in bulk.
>>>>
>>>>
>>>>> Regards;
>>>>> ----------
>>>>> Koichi Suzuki
>>>>>
>>>>>
>>>>> 2013/3/25 Ashutosh Bapat <ash...@en...>:
>>>>> > Hi All,
>>>>> > I am working on using remote sorting for merge joins. The idea is
>>>>> while
>>>>> > using merge join at the coordinator, get the data sorted from the
>>>>> datanodes;
>>>>> > for replicated relations, we can get all the rows sorted and for
>>>>> distributed
>>>>> > tables we have to get sorted runs which can be merged at the
>>>>> coordinator.
>>>>> > For merge join the sorted inner relation needs to be randomly
>>>>> accessible.
>>>>> > For replicated relations this can be achieved by materialising the
>>>>> result.
>>>>> > But for distributed relations, we do not materialise the sorted
>>>>> result at
>>>>> > coordinator but compute the sorted result by merging the sorted
>>>>> results from
>>>>> > individual nodes on the fly. For distributed relations, the
>>>>> connection to
>>>>> > the datanodes themselves are used as logical tapes (which provide
>>>>> the sorted
>>>>> > runs). The final result is computed on the fly by choosing the
>>>>> smallest or
>>>>> > greatest row (as required) from the connections.
>>>>> >
>>>>> > For a Sort node the materialised result can reside in memory (if it
>>>>> fits
>>>>> > there) or on one of the logical tapes used for merge sort. So, in
>>>>> order to
>>>>> > provide random access to the sorted result, we need to materialise
>>>>> the
>>>>> > result either in the memory or on the logical tape. In-memory
>>>>> > materialisation is not easily possible since we have already
>>>>> resorted for
>>>>> > tape based sort, in case of distributed relations and to materialise
>>>>> the
>>>>> > result on tape, there is no logical tape available in current
>>>>> algorithm. To
>>>>> > make it work, there are following possible ways
>>>>> >
>>>>> > 1. When random access is required, materialise the sorted runs from
>>>>> > individual nodes onto tapes (one tape for each node) and then merge
>>>>> them on
>>>>> > one extra tape, which can be used for materialisation.
>>>>> > 2. Use a mix of connections and logical tape in the same tape set.
>>>>> Merge the
>>>>> > sorted runs from connections on a logical tape in the same logical
>>>>> tape set.
>>>>> >
>>>>> > While the second one looks attractive from performance perspective
>>>>> (it saves
>>>>> > writing and reading from the tape), it would make the merge code
>>>>> ugly by
>>>>> > using mixed tapes. The read calls for connection and logical tape are
>>>>> > different and we will need both on the logical tape where the final
>>>>> result
>>>>> > is materialized. So, I am thinking of going with 1, in fact, to have
>>>>> same
>>>>> > code to handle remote sort, use 1 in all cases (whether or not
>>>>> > materialization is required).
>>>>> >
>>>>> > Had original authors of remote sort code thought about this
>>>>> materialization?
>>>>> > Anything they can share on this topic?
>>>>> > Any comment?
>>>>> > --
>>>>> > Best Wishes,
>>>>> > Ashutosh Bapat
>>>>> > EntepriseDB Corporation
>>>>> > The Enterprise Postgres Company
>>>>> >
>>>>> >
>>>>> ------------------------------------------------------------------------------
>>>>> > Everyone hates slow websites. So do we.
>>>>> > Make your web apps faster with AppDynamics
>>>>> > Download AppDynamics Lite for free today:
>>>>> > http://p.sf.net/sfu/appdyn_d2d_mar
>>>>> > _______________________________________________
>>>>> > Postgres-xc-developers mailing list
>>>>> > Pos...@li...
>>>>> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>>>> >
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Best Wishes,
>>>> Ashutosh Bapat
>>>> EntepriseDB Corporation
>>>> The Enterprise Postgres Company
>>>>
>>>
>>>
>>>
>>> --
>>> Best Wishes,
>>> Ashutosh Bapat
>>> EntepriseDB Corporation
>>> The Enterprise Postgres Company
>>>
>>
>>
>
>
> --
> Best Wishes,
> Ashutosh Bapat
> EntepriseDB Corporation
> The Enterprise Postgres Company
>

Re: [Postgres-xc-developers] Using remote sorting for merge-join

From: Ashutosh B. <ash...@en...> - 2013-04-01 05:48:28

On Mon, Apr 1, 2013 at 10:59 AM, Koichi Suzuki <koi...@gm...>wrote:

> I understand materialize everything makes code clearer and implementation
> becomes simpler and better structured.
>
> What do you mean by x% improvement?   Does 90% improvement mean the total
> duration is 10% of the original?
>
x% improvement means, duration reduces to 100/(100+x) as compared to the
non-pushdown scenario. Or in simpler words, we see (100+x) queries being
completed by pushdown approach in the same time in which nonpushdown
approach completes 100 queries.

> ----------
> Koichi Suzuki
>
>
> 2013/3/29 Ashutosh Bapat <ash...@en...>
>
>> Hi All,
>> I measured the scale up for both approaches - a. using datanode
>> connections as tapes (existing one) b. materialising result on tapes before
>> merging (the approach I proposed). For 1M rows, 5 coordinators I have found
>> that approach (a) gives 90% improvement whereas approach (b) gives 50%
>> improvement. Although the difference is significant, I feel that approach
>> (b) is much cleaner than approach (a) and doesn't have large footprint
>> compared to PG code and it takes care of all the cases like 1.
>> materialising sorted result, 2. takes care of any number of datanode
>> connections without memory overrun. It's possible to improve it further if
>> we avoid materialisation of datanode result in tuplestore.
>>
>> Patch attached for reference.
>>
>> On Tue, Mar 26, 2013 at 10:38 AM, Ashutosh Bapat <
>> ash...@en...> wrote:
>>
>>>
>>>
>>> On Tue, Mar 26, 2013 at 10:19 AM, Koichi Suzuki <
>>> koi...@gm...> wrote:
>>>
>>>> On thing we should think for option 1 is:
>>>>
>>>> When a number of the result is huge, applications has to wait long
>>>> time until they get the first row.  Because this option may need disk
>>>> write, total resource consumption will be larger.
>>>>
>>>>
>>> Yes, I am aware of this fact. Please read the next paragraph and you
>>> will see that the current situation is no better.
>>>
>>>
>>>> I'm wondering if we can use "cursor" at database so that we can read
>>>> each tape more simply, I mean, to leave each query node open and read
>>>> next row from any query node.
>>>>
>>>>
>>> We do that right now. But because of such a simulated cursor (it's not
>>> cursor per say, but we just fetch the required result from connection as
>>> the demand arises in merging runs), we observer following things
>>>
>>> If the plan has multiple remote query nodes (as there will be in case of
>>> merge join), we assign the same connection to these nodes. Before this
>>> assignment, the result from the previous connection is materialised at the
>>> coordinator. This means that, when we will get huge result from the
>>> datanode, it will be materialised (which will have the more cost as
>>> materialising it on tape, as this materialisation happens in a linked list,
>>> which is not optimized). We need to share connection between more than one
>>> RemoteQuery node because same transaction can not work on two connections
>>> to same server. Not only performance, but the code has become ugly because
>>> of this approach. At various places in executor, we have special handling
>>> for sorting, which needs to be maintained.
>>>
>>> Instead if we materialise all the result on tape and then proceed with
>>> step D5 in Knuth's algorithm for polyphase merge sort, the code will be
>>> much simpler and we won't loose much performance. In fact, we might be able
>>> to leverage fetching bulk data on connection which can be materialised on
>>> tape in bulk.
>>>
>>>
>>>> Regards;
>>>> ----------
>>>> Koichi Suzuki
>>>>
>>>>
>>>> 2013/3/25 Ashutosh Bapat <ash...@en...>:
>>>> > Hi All,
>>>> > I am working on using remote sorting for merge joins. The idea is
>>>> while
>>>> > using merge join at the coordinator, get the data sorted from the
>>>> datanodes;
>>>> > for replicated relations, we can get all the rows sorted and for
>>>> distributed
>>>> > tables we have to get sorted runs which can be merged at the
>>>> coordinator.
>>>> > For merge join the sorted inner relation needs to be randomly
>>>> accessible.
>>>> > For replicated relations this can be achieved by materialising the
>>>> result.
>>>> > But for distributed relations, we do not materialise the sorted
>>>> result at
>>>> > coordinator but compute the sorted result by merging the sorted
>>>> results from
>>>> > individual nodes on the fly. For distributed relations, the
>>>> connection to
>>>> > the datanodes themselves are used as logical tapes (which provide the
>>>> sorted
>>>> > runs). The final result is computed on the fly by choosing the
>>>> smallest or
>>>> > greatest row (as required) from the connections.
>>>> >
>>>> > For a Sort node the materialised result can reside in memory (if it
>>>> fits
>>>> > there) or on one of the logical tapes used for merge sort. So, in
>>>> order to
>>>> > provide random access to the sorted result, we need to materialise the
>>>> > result either in the memory or on the logical tape. In-memory
>>>> > materialisation is not easily possible since we have already resorted
>>>> for
>>>> > tape based sort, in case of distributed relations and to materialise
>>>> the
>>>> > result on tape, there is no logical tape available in current
>>>> algorithm. To
>>>> > make it work, there are following possible ways
>>>> >
>>>> > 1. When random access is required, materialise the sorted runs from
>>>> > individual nodes onto tapes (one tape for each node) and then merge
>>>> them on
>>>> > one extra tape, which can be used for materialisation.
>>>> > 2. Use a mix of connections and logical tape in the same tape set.
>>>> Merge the
>>>> > sorted runs from connections on a logical tape in the same logical
>>>> tape set.
>>>> >
>>>> > While the second one looks attractive from performance perspective
>>>> (it saves
>>>> > writing and reading from the tape), it would make the merge code ugly
>>>> by
>>>> > using mixed tapes. The read calls for connection and logical tape are
>>>> > different and we will need both on the logical tape where the final
>>>> result
>>>> > is materialized. So, I am thinking of going with 1, in fact, to have
>>>> same
>>>> > code to handle remote sort, use 1 in all cases (whether or not
>>>> > materialization is required).
>>>> >
>>>> > Had original authors of remote sort code thought about this
>>>> materialization?
>>>> > Anything they can share on this topic?
>>>> > Any comment?
>>>> > --
>>>> > Best Wishes,
>>>> > Ashutosh Bapat
>>>> > EntepriseDB Corporation
>>>> > The Enterprise Postgres Company
>>>> >
>>>> >
>>>> ------------------------------------------------------------------------------
>>>> > Everyone hates slow websites. So do we.
>>>> > Make your web apps faster with AppDynamics
>>>> > Download AppDynamics Lite for free today:
>>>> > http://p.sf.net/sfu/appdyn_d2d_mar
>>>> > _______________________________________________
>>>> > Postgres-xc-developers mailing list
>>>> > Pos...@li...
>>>> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>>> >
>>>>
>>>
>>>
>>>
>>> --
>>> Best Wishes,
>>> Ashutosh Bapat
>>> EntepriseDB Corporation
>>> The Enterprise Postgres Company
>>>
>>
>>
>>
>> --
>> Best Wishes,
>> Ashutosh Bapat
>> EntepriseDB Corporation
>> The Enterprise Postgres Company
>>
>
>


-- 
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Enterprise Postgres Company

Re: [Postgres-xc-developers] Using remote sorting for merge-join

From: Koichi S. <koi...@gm...> - 2013-04-01 05:29:42

I understand materialize everything makes code clearer and implementation
becomes simpler and better structured.

What do you mean by x% improvement?   Does 90% improvement mean the total
duration is 10% of the original?
----------
Koichi Suzuki


2013/3/29 Ashutosh Bapat <ash...@en...>

> Hi All,
> I measured the scale up for both approaches - a. using datanode
> connections as tapes (existing one) b. materialising result on tapes before
> merging (the approach I proposed). For 1M rows, 5 coordinators I have found
> that approach (a) gives 90% improvement whereas approach (b) gives 50%
> improvement. Although the difference is significant, I feel that approach
> (b) is much cleaner than approach (a) and doesn't have large footprint
> compared to PG code and it takes care of all the cases like 1.
> materialising sorted result, 2. takes care of any number of datanode
> connections without memory overrun. It's possible to improve it further if
> we avoid materialisation of datanode result in tuplestore.
>
> Patch attached for reference.
>
> On Tue, Mar 26, 2013 at 10:38 AM, Ashutosh Bapat <
> ash...@en...> wrote:
>
>>
>>
>> On Tue, Mar 26, 2013 at 10:19 AM, Koichi Suzuki <
>> koi...@gm...> wrote:
>>
>>> On thing we should think for option 1 is:
>>>
>>> When a number of the result is huge, applications has to wait long
>>> time until they get the first row.  Because this option may need disk
>>> write, total resource consumption will be larger.
>>>
>>>
>> Yes, I am aware of this fact. Please read the next paragraph and you will
>> see that the current situation is no better.
>>
>>
>>> I'm wondering if we can use "cursor" at database so that we can read
>>> each tape more simply, I mean, to leave each query node open and read
>>> next row from any query node.
>>>
>>>
>> We do that right now. But because of such a simulated cursor (it's not
>> cursor per say, but we just fetch the required result from connection as
>> the demand arises in merging runs), we observer following things
>>
>> If the plan has multiple remote query nodes (as there will be in case of
>> merge join), we assign the same connection to these nodes. Before this
>> assignment, the result from the previous connection is materialised at the
>> coordinator. This means that, when we will get huge result from the
>> datanode, it will be materialised (which will have the more cost as
>> materialising it on tape, as this materialisation happens in a linked list,
>> which is not optimized). We need to share connection between more than one
>> RemoteQuery node because same transaction can not work on two connections
>> to same server. Not only performance, but the code has become ugly because
>> of this approach. At various places in executor, we have special handling
>> for sorting, which needs to be maintained.
>>
>> Instead if we materialise all the result on tape and then proceed with
>> step D5 in Knuth's algorithm for polyphase merge sort, the code will be
>> much simpler and we won't loose much performance. In fact, we might be able
>> to leverage fetching bulk data on connection which can be materialised on
>> tape in bulk.
>>
>>
>>> Regards;
>>> ----------
>>> Koichi Suzuki
>>>
>>>
>>> 2013/3/25 Ashutosh Bapat <ash...@en...>:
>>> > Hi All,
>>> > I am working on using remote sorting for merge joins. The idea is while
>>> > using merge join at the coordinator, get the data sorted from the
>>> datanodes;
>>> > for replicated relations, we can get all the rows sorted and for
>>> distributed
>>> > tables we have to get sorted runs which can be merged at the
>>> coordinator.
>>> > For merge join the sorted inner relation needs to be randomly
>>> accessible.
>>> > For replicated relations this can be achieved by materialising the
>>> result.
>>> > But for distributed relations, we do not materialise the sorted result
>>> at
>>> > coordinator but compute the sorted result by merging the sorted
>>> results from
>>> > individual nodes on the fly. For distributed relations, the connection
>>> to
>>> > the datanodes themselves are used as logical tapes (which provide the
>>> sorted
>>> > runs). The final result is computed on the fly by choosing the
>>> smallest or
>>> > greatest row (as required) from the connections.
>>> >
>>> > For a Sort node the materialised result can reside in memory (if it
>>> fits
>>> > there) or on one of the logical tapes used for merge sort. So, in
>>> order to
>>> > provide random access to the sorted result, we need to materialise the
>>> > result either in the memory or on the logical tape. In-memory
>>> > materialisation is not easily possible since we have already resorted
>>> for
>>> > tape based sort, in case of distributed relations and to materialise
>>> the
>>> > result on tape, there is no logical tape available in current
>>> algorithm. To
>>> > make it work, there are following possible ways
>>> >
>>> > 1. When random access is required, materialise the sorted runs from
>>> > individual nodes onto tapes (one tape for each node) and then merge
>>> them on
>>> > one extra tape, which can be used for materialisation.
>>> > 2. Use a mix of connections and logical tape in the same tape set.
>>> Merge the
>>> > sorted runs from connections on a logical tape in the same logical
>>> tape set.
>>> >
>>> > While the second one looks attractive from performance perspective (it
>>> saves
>>> > writing and reading from the tape), it would make the merge code ugly
>>> by
>>> > using mixed tapes. The read calls for connection and logical tape are
>>> > different and we will need both on the logical tape where the final
>>> result
>>> > is materialized. So, I am thinking of going with 1, in fact, to have
>>> same
>>> > code to handle remote sort, use 1 in all cases (whether or not
>>> > materialization is required).
>>> >
>>> > Had original authors of remote sort code thought about this
>>> materialization?
>>> > Anything they can share on this topic?
>>> > Any comment?
>>> > --
>>> > Best Wishes,
>>> > Ashutosh Bapat
>>> > EntepriseDB Corporation
>>> > The Enterprise Postgres Company
>>> >
>>> >
>>> ------------------------------------------------------------------------------
>>> > Everyone hates slow websites. So do we.
>>> > Make your web apps faster with AppDynamics
>>> > Download AppDynamics Lite for free today:
>>> > http://p.sf.net/sfu/appdyn_d2d_mar
>>> > _______________________________________________
>>> > Postgres-xc-developers mailing list
>>> > Pos...@li...
>>> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>> >
>>>
>>
>>
>>
>> --
>> Best Wishes,
>> Ashutosh Bapat
>> EntepriseDB Corporation
>> The Enterprise Postgres Company
>>
>
>
>
> --
> Best Wishes,
> Ashutosh Bapat
> EntepriseDB Corporation
> The Enterprise Postgres Company
>

Re: [Postgres-xc-developers] Patch for fetching only required columns in case of DMLs - Bug 3609665

From: Ashutosh B. <ash...@en...> - 2013-04-01 05:20:03

Hi Amit,
Does this have any relation with triggers, if so, can you please explain
this relation?

On Mon, Apr 1, 2013 at 9:57 AM, Amit Khandekar <
ami...@en...> wrote:

> Attached is a patch dml-targetlist.patch which prevents unnecessary
> columns from being fetched from the ModifyTable SELECT subplans (bug
> 3609665<https://sourceforge.net/tracker/?func=detail&aid=3609665&group_id=311227&atid=1310232>
> )
>
> -------
>
> In expand_targetlist(), currently Vars of all the table attributes that
> are absent are added into the target list.  The patch adds NULL constants
> instead of Var nodes, so that those attributes will not be fetched from the
> SELECT. We still want to keep NULL constants because we need the complete
> tuple. We don't want to change the PG way of having all the attributes of
> the NEW ROW tuple descriptor of the source plan slot. Also other PG things
> like constraint expressions and triggers assume this format, so it is
> convenient to not modify this design.
>
> Using NULL constants will cause the remote SELECT statement to completely
> exclude all the unnecessary columns. It does not even have NULLs.
>
>
> Below is how the new plan will look now:
>
> # explain verbose update newtab *set name5 = f(name5) where name4 =
> f(name4)*;
>  Update on public.newtab  (cost=0.00..250.00 rows=1000 width=234)
>    Node/s: data_node_1, data_node_2
>    Node expr: newtab.id
>    Remote query: UPDATE ONLY public.newtab SET name5 = $6 WHERE ctid = $7
> AND xc_node_id = $9
>    ->  Data Node Scan on newtab "_REMOTE_TABLE_QUERY_"  (cost=0.00..250.00
> rows=1000 width=234)
>          Output: newtab.id, NULL::character varying, NULL::character
> varying, NULL::character varying, NULL::character varying, f(newtab.name5),
> newtab.ctid, newtab.*, newtab.xc_node_id
>          Node/s: data_node_1, data_node_2
>          Remote query: *SELECT id, name5, ctid, newtab.*::newtab,
> xc_node_id, name4 *FROM ONLY newtab WHERE true
>          Coordinator quals: ((newtab.name4)::text =
> (f(newtab.name4))::text)
>
>
> Notice the following:
>
> 1. Remote query now does not have any other columns namely name1, name2,
> name3.
>
> 2. The column id is still present. This is because it is a distribution
> column, and currently we use this column to decide the target node at
> execution time. Unfortunately I could not exclude the distribution column
> from the SELECT subplan target list. Doing this required nodeid-based
> target datanode determination for updates/deletes. Currently update/deletes
> use distribution column even if we fetch xc_node_id. I had almost come up
> with this nodeid-based implementation except that EXPLAIN did not correctly
> show the en_expr expression if that expression is a xc_node_id Var. Have
> attached another patch (working_nodeid_based.patch) that includes both :
> selecting reqd columns, plus node_id based target datanode determination.
> Due to the explain-of-en_expr issue, we need to first checkin
> dml-targetlist.patch, because that optimization is important for trigger
> implementation. The nodeid-based implementation does not compliment any of
> the changes in this patch.
>
> -------
>
> In rewriteTargetListUD(), we had added into the parse->targetlist the
> attributes need for quals; the RemoteQuery->base_tlist anyways does this
> job of getting the attributes added up in the base_tlist so it is not
> required to do this in rewrite. I have removed it.
>
> -------
>
> Since now there are only required columns in the SELECT subplan, we cannot
> fire check and not-null constraints on coordinator. For this, in the patch,
> we now include those columns that are referred in the constraint
> expressions. Also, we keep the not-null constraint check deferred for the
> datanode. We anyways need the final values for the constraints to get
> fired, and on datanode we know these are the final values. So in general
> when possible we should not fire constraints on coordinator. When all the
> constraints are shippable, they should be fired on datanode, else they
> should be fired on coordinator. In fact, we don't currently have a check
> whether the constraint is shippable, so this is a bug. Have added this
> constraint-shippablity check also.
>
> ------
>
> In the future, we could also consider optimization in the BIND data row.
> Currently it includes NULL values for parameters that are not used in the
> remote statement. We could just skip those parameters instead and thus
> reduce the data row to some extent. But this requires tweaking the
> parameter numbers ($1, $2 etc) generated in the deparsed statement.
>
> ------
>
>
> Tests
> ======
>
> I have used xc_constraints test to add new check-constraint related tests.
> Otherwise, the existing tests are more than enough to test this patch. See
> the other regression test changes in the patch to get an idea of how this
> patch affects the plan target list.
>
> In xc_FQS.sql, the target list of FQSed queries are printed. I suspect
> that the target list does not reflect correct columns currently. I suspect
> set_plan_references changes the varattnos but they do not end up  correctly
> reflecting the actual target list. The diff in the test file is because we
> have lesser columns, but the column names anyway do not show the correct
> columns.
>
>
>
>
> ------------------------------------------------------------------------------
> Own the Future-Intel&reg; Level Up Game Demo Contest 2013
> Rise to greatness in Intel's independent game demo contest.
> Compete for recognition, cash, and the chance to get your game
> on Steam. $5K grand prize plus 10 genre and skill prizes.
> Submit your demo by 6/6/13. http://p.sf.net/sfu/intel_levelupd2d
> _______________________________________________
> Postgres-xc-developers mailing list
> Pos...@li...
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>
>


-- 
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Enterprise Postgres Company

Re: [Postgres-xc-developers] Manually Table Partitioning

From: Ashutosh B. <ash...@en...> - 2013-04-01 04:24:37

Kaiji,
EXECUTE DIRECT is only for debugging and troubleshooting, it shouldn't be
used in applications or tools based on XC, (except for the purposes
stated). We may change or deprecate the EXECUTE DIRECT usage in future.

On Sat, Mar 30, 2013 at 2:39 PM, Kaiji Chen <ch...@im...> wrote:

>  Thanks for your reply!
>
>  It seems ok if I use EXECUTE DIRECT and manually maintain the data
> concurrency and a global index in my middleware. But it looks like I've
> skipped the PostgresXC coordinator, it will not be the best choice.
>
>  I just come up a idea applying external data partitioning design to the
> PostgresXC. As stated in the document XC can distribute tables to data
> nodes using hash function. Then can I manipulate the original table and add
> a new column as my partition decision and let the table distributed by this
> column. Then we add this column to the compound primary key of the table
> and let the coordinator deal with the query planning work. I think this can
> be done if for different tables, the same hash value will be partitioned to
> the same data node if there is no modification to the set of data nodes.
>
>
>
>
> Yours,
> Kaiji Chen
> PhD Candidate <ch...@im...>
> IMADA， Southern Denmark University
> Email: ch...@im...
>   ------------------------------
> *From:* Michael Paquier [mic...@gm...]
> *Sent:* Saturday, March 30, 2013 5:55 AM
> *To:* Kaiji Chen
> *Cc:* pos...@li...
> *Subject:* Re: [Postgres-xc-developers] Manually Table Partitioning
>
>
>
>
> On Fri, Mar 29, 2013 at 7:19 PM, Kaiji Chen <ch...@im...> wrote:
>
>> Hi,
>> I'm working on a data partitioning project on PostgreSQL by adding a
>> middleware between the database cluster interface and applications that
>> modify the SQL statement to specific data nodes. I just find that
>> PostgresXC has a nice GTM that can help me do the distributed transaction
>> management works, I considered to transfer my project on it.
>> It seems the sliders (
>> http://wiki.postgresql.org/images/f/f6/PGXC_Scalability_PGOpen2012.pdf)
>> intend that user defined table distribution is not available, but the
>> coordinator can choose specific data node when processing the queries, and
>> the table will be distributed to by default if DISTRIBUTED BY is not
>> specified. Then I wonder if I can specify a data node in each query and
>> stop the default auto distributing process.
>>
> For SELECT queries, you can use EXECUTE DIRECT:
> http://postgres-xc.sourceforge.net/docs/1_0_2/sql-executedirect.html
>  The results you get might not be exact as not global query planning is
> not done and the query string is sent as-is.
>
>  Note that you cannot use EXECUTE DIRECT with DML or the whole cluster
> consistency would be broken.
> --
>  Michael
>
>
> ------------------------------------------------------------------------------
> Own the Future-Intel(R) Level Up Game Demo Contest 2013
> Rise to greatness in Intel's independent game demo contest. Compete
> for recognition, cash, and the chance to get your game on Steam.
> $5K grand prize plus 10 genre and skill prizes. Submit your demo
> by 6/6/13. http://altfarm.mediaplex.com/ad/ck/12124-176961-30367-2
> _______________________________________________
> Postgres-xc-developers mailing list
> Pos...@li...
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>
>


-- 
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Enterprise Postgres Company

Re: [Postgres-xc-developers] Add a new node to the cluster [Feature ID 3608379]

From: Koichi S. <koi...@gm...> - 2013-04-01 03:21:56

Thank you very much for the summary.   It helps much.

I have a couple of questions on this.
----------
Koichi Suzuki


2013/3/31 Abbas Butt <abb...@en...>

> Hi,
> Attached please find the revised patch for restore mode. This patch has to
> be applied on top of the patches I sent earlier for
>  3608377,
>  3608376 &
>  3608375.
>
> I have also attached some scripts and a C file useful for testing the
> whole procedure. It is a database that has many objects in it.
>
> Here are the revised instructions for adding new nodes to the cluster.
>
> ======================================
>
> Here are the steps to add a new coordinator
>
>  1) Initdb new coordinator
>   /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data_cord3 --nodename
> coord_3
>
>  2) Make necessary changes in its postgresql.conf, in particular specify
> new coordinator name and pooler port
>
>  3) Connect to any of the existing coordinators & lock the cluster for
> backup, do not close this session
>     ./psql postgres -p 5432
>     select pgxc_lock_for_backup();
>

>  4) Connect to any of the existing coordinators and take backup of the
> database
>     ./pg_dumpall -p 5432 -s --include-nodes --dump-nodes
> --file=/home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1100_all_objects_coord.sql
>


Here, because only the default database is available, if the user is not
"postgres" or default database specified in initdb, we need to specify the
database name (and the owner name, if necessary).   Similar comments to
others.

>
>  5) Start the new coordinator specify --restoremode while starting the
> coordinator
>     ./postgres --restoremode -D ../data_cord3 -p 5455
>

>  6) Create the new database on the new coordinator - optional
>     ./createdb test -p 5455
>

I believe that pg_dumpall copies the definition of existing databases and
this operation is usually unnecessary. If new database is needed, then this
should be created after the new coordinator is up and registered to all the
other coordinators.


>  7) Restore the backup that was taken from an existing coordinator by
> connecting to the new coordinator directly
>     ./psql -d test -f
> /home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1100_all_objects_coord.sql
> -p 5455
>

>  8) Quit the new coordinator
>
>  9) Start the new coordinator as a by specifying --coordinator
>     ./postgres --coordinator -D ../data_cord3 -p 5455
>
> 10) Create the new coordinator on rest of the coordinators and reload
> configuration
>     CREATE NODE COORD_3 WITH (HOST = 'localhost', type = 'coordinator',
> PORT = 5455);
>     SELECT pgxc_pool_reload();
>
> 11) Quit the session of step 3, this will unlock the cluster
>
> 12) The new coordinator is now ready
>     ./psql test -p 5455
>     create table test_new_coord(a int, b int);
>     \q
>     ./psql test -p 5432
>     select * from test_new_coord;
>
> *======================================*
> *======================================*
>
> Here are the steps to add a new datanode
>
>
>  1) Initdb new datanode
>     /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data3 --nodename
> data_node_3
>
>  2) Make necessary changes in its postgresql.conf, in particular specify
> new datanode name
>
>  3) Connect to any of the existing coordinators & lock the cluster for
> backup, do not close this session
>     ./psql postgres -p 5432
>     select pgxc_lock_for_backup();
>
>  4) Connect to any of the existing datanodes and take backup of the
> database
>     ./pg_dumpall -p 15432 -s --include-nodes
> --file=/home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1122_all_objects_dn1.sql
>
>  5) Start the new datanode specify --restoremode while starting the it
>     ./postgres --restoremode -D ../data3 -p 35432
>
>  6) Restore the backup that was taken from an existing datanode by
> connecting to the new datanode directly
>     ./psql -d postgres -f
> /home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1122_all_objects_dn1.sql
> -p 35432
>
>  7) Quit the new datanode
>

Same comment as the coordinator.

>
>  8) Start the new datanode as a datanode by specifying --datanode
>     ./postgres --datanode -D ../data3 -p 35432
>
>  9) Create the new datanode on all the coordinators and reload
> configuration
>     CREATE NODE DATA_NODE_3 WITH (HOST = 'localhost', type = 'datanode',
> PORT = 35432);
>     SELECT pgxc_pool_reload();
>
> 10) Quit the session of step 3, this will unlock the cluster
>
> 11) Redistribute data by using ALTER TABLE REDISTRIBUTE
>
> 12) The new daatnode is now ready
>     ./psql test
>     create table test_new_dn(a int, b int) distribute by replication;
>     insert into test_new_dn values(1,2);
>     EXECUTE DIRECT ON (data_node_1) 'SELECT * from test_new_dn';
>     EXECUTE DIRECT ON (data_node_2) 'SELECT * from test_new_dn';
>     EXECUTE DIRECT ON (data_node_3) 'SELECT * from test_new_dn';
>
> ======================================
>
> On Wed, Mar 27, 2013 at 5:02 PM, Abbas Butt <abb...@en...>wrote:
>
>> Feature ID 3608379
>>
>> On Fri, Mar 1, 2013 at 5:48 PM, Amit Khandekar <
>> ami...@en...> wrote:
>>
>>> On 1 March 2013 01:30, Abbas Butt <abb...@en...> wrote:
>>> >
>>> >
>>> > On Thu, Feb 28, 2013 at 12:44 PM, Amit Khandekar
>>> > <ami...@en...> wrote:
>>> >>
>>> >>
>>> >>
>>> >> On 28 February 2013 10:23, Abbas Butt <abb...@en...>
>>> wrote:
>>> >>>
>>> >>> Hi All,
>>> >>>
>>> >>> Attached please find a patch that provides a new command line
>>> argument
>>> >>> for postgres called --restoremode.
>>> >>>
>>> >>> While adding a new node to the cluster we need to restore the schema
>>> of
>>> >>> existing database to the new node.
>>> >>> If the new node is a datanode and we connect directly to it, it does
>>> not
>>> >>> allow DDL, because it is in read only mode &
>>> >>> If the new node is a coordinator, it will send DDLs to all the other
>>> >>> coordinators which we do not want it to do.
>>> >>
>>> >>
>>> >> What if we allow writes in standalone mode, so that we would
>>> initialize
>>> >> the new node using standalone mode instead of --restoremode ?
>>> >
>>> >
>>> > Please take a look at the patch, I am using --restoremode in place of
>>> > --coordinator & --datanode. I am not sure how would stand alone mode
>>> fit in
>>> > here.
>>>
>>> I was trying to see if we can avoid adding a new mode, instead, use
>>> standalone mode for all the purposes for which restoremode is used.
>>> Actually I checked the documentation, it says this mode is used only
>>> for debugging or recovery purposes, so now I myself am a bit hesitent
>>> about this mode for the purpose of restoring.
>>>
>>> >
>>> >>
>>> >>
>>> >>>
>>> >>> To provide ability to restore on the new node a new command line
>>> argument
>>> >>> is provided.
>>> >>> It is to be provided in place of --coordinator OR --datanode.
>>> >>> In restore mode both coordinator and datanode are internally treated
>>> as a
>>> >>> datanode.
>>> >>> For more details see patch comments.
>>> >>>
>>> >>> After this patch one can add a new node to the cluster.
>>> >>>
>>> >>> Here are the steps to add a new coordinator
>>> >>>
>>> >>>
>>> >>>  1) Initdb new coordinator
>>> >>>         /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data_cord3
>>> >>> --nodename coord_3
>>> >>>
>>> >>>  2) Make necessary changes in its postgresql.conf, in particular
>>> specify
>>> >>> new coordinator name and pooler port
>>> >>>
>>> >>>  3) Connect to any of the existing coordinators & lock the cluster
>>> for
>>> >>> backup
>>> >>>         ./psql postgres -p 5432
>>> >>>         SET xc_lock_for_backup=yes;
>>> >>>         \q
>>> >>
>>> >>
>>> >> I haven't given a thought on the earlier patch you sent for cluster
>>> lock
>>> >> implementation; may be we can discuss this on that thread, but just a
>>> quick
>>> >> question:
>>> >>
>>> >> Does the cluster-lock command wait for the ongoing DDL commands to
>>> finish
>>> >> ? If not, we have problems. The subsequent pg_dump would not contain
>>> objects
>>> >> created by these particular DDLs.
>>> >
>>> >
>>> > Suppose you have a two coordinator cluster. Assume one client
>>> connected to
>>> > each. Suppose one client issues a lock cluster command and the other
>>> issues
>>> > a DDL. Is this what you mean by an ongoing DDL? If true then answer to
>>> your
>>> > question is Yes.
>>> >
>>> > Suppose you have a prepared transaction that has a DDL in it, again if
>>> this
>>> > can be considered an on going DDL, then again answer to your question
>>> is
>>> > Yes.
>>> >
>>> > Suppose you have a two coordinator cluster. Assume one client
>>> connected to
>>> > each. One client starts a transaction and issues a DDL, the second
>>> client
>>> > issues a lock cluster command, the first commits the transaction. If
>>> this is
>>> > an ongoing DDL, then the answer to your question is No. But its a
>>> matter of
>>> > deciding which camp are we going to put COMMIT in, the allow camp, or
>>> the
>>> > deny camp. I decided to put it in allow camp, because I have not yet
>>> written
>>> > any code to detect whether a transaction being committed has a DDL in
>>> it or
>>> > not, and stopping all transactions from committing looks too
>>> restrictive to
>>> > me.
>>> >
>>> > Do you have some other meaning of an ongoing DDL?
>>> >
>>> > I agree that we should have discussed this on the right thread. Lets
>>> > continue this discussion on that thread.
>>>
>>> Continued on the other thread.
>>>
>>> >
>>> >>
>>> >>
>>> >>>
>>> >>>
>>> >>>  4) Connect to any of the existing coordinators and take backup of
>>> the
>>> >>> database
>>> >>>         ./pg_dump -p 5432 -C -s
>>> >>>
>>> --file=/home/edb/Desktop/NodeAddition/dumps/101_all_objects_coord.sql test
>>> >>>
>>> >>>  5) Start the new coordinator specify --restoremode while starting
>>> the
>>> >>> coordinator
>>> >>>         ./postgres --restoremode -D ../data_cord3 -p 5455
>>> >>>
>>> >>>  6) connect to the new coordinator directly
>>> >>>         ./psql postgres -p 5455
>>> >>>
>>> >>>  7) create all the datanodes and the rest of the coordinators on the
>>> new
>>> >>> coordiantor & reload configuration
>>> >>>         CREATE NODE DATA_NODE_1 WITH (HOST = 'localhost', type =
>>> >>> 'datanode', PORT = 15432, PRIMARY);
>>> >>>         CREATE NODE DATA_NODE_2 WITH (HOST = 'localhost', type =
>>> >>> 'datanode', PORT = 25432);
>>> >>>
>>> >>>         CREATE NODE COORD_1 WITH (HOST = 'localhost', type =
>>> >>> 'coordinator', PORT = 5432);
>>> >>>         CREATE NODE COORD_2 WITH (HOST = 'localhost', type =
>>> >>> 'coordinator', PORT = 5433);
>>> >>>
>>> >>>         SELECT pgxc_pool_reload();
>>> >>>
>>> >>>  8) quit psql
>>> >>>
>>> >>>  9) Create the new database on the new coordinator
>>> >>>         ./createdb test -p 5455
>>> >>>
>>> >>> 10) create the roles and table spaces manually, the dump does not
>>> contain
>>> >>> roles or table spaces
>>> >>>         ./psql test -p 5455
>>> >>>         CREATE ROLE admin WITH LOGIN CREATEDB CREATEROLE;
>>> >>>         CREATE TABLESPACE my_space LOCATION
>>> >>> '/usr/local/pgsql/my_space_location';
>>> >>>         \q
>>> >>>
>>> >>
>>> >> Will pg_dumpall help ? It dumps roles also.
>>> >
>>> >
>>> > Yah , but I am giving example of pg_dump so this step has to be there.
>>> >
>>> >>
>>> >>
>>> >>
>>> >>>
>>> >>> 11) Restore the backup that was taken from an existing coordinator by
>>> >>> connecting to the new coordinator directly
>>> >>>          ./psql -d test -f
>>> >>> /home/edb/Desktop/NodeAddition/dumps/101_all_objects_coord.sql -p
>>> 5455
>>> >>>
>>> >>> 11) Quit the new coordinator
>>> >>>
>>> >>> 12) Connect to any of the existing coordinators & unlock the cluster
>>> >>>         ./psql postgres -p 5432
>>> >>>         SET xc_lock_for_backup=no;
>>> >>>         \q
>>> >>>
>>> >>
>>> >> Unlocking the cluster has to be done *after* the node is added into
>>> the
>>> >> cluster.
>>> >
>>> >
>>> > Very true. I stand corrected. This means CREATE NODE has to be allowed
>>> when
>>> > xc_lock_for_backup is set.
>>> >
>>> >>
>>> >>
>>> >>
>>> >>>
>>> >>> 13) Start the new coordinator as a by specifying --coordinator
>>> >>>         ./postgres --coordinator -D ../data_cord3 -p 5455
>>> >>>
>>> >>> 14) Create the new coordinator on rest of the coordinators and reload
>>> >>> configuration
>>> >>>         CREATE NODE COORD_3 WITH (HOST = 'localhost', type =
>>> >>> 'coordinator', PORT = 5455);
>>> >>>         SELECT pgxc_pool_reload();
>>> >>>
>>> >>> 15) The new coordinator is now ready
>>> >>>         ./psql test -p 5455
>>> >>>         create table test_new_coord(a int, b int);
>>> >>>         \q
>>> >>>         ./psql test -p 5432
>>> >>>         select * from test_new_coord;
>>> >>>
>>> >>>
>>> >>> Here are the steps to add a new datanode
>>> >>>
>>> >>>
>>> >>>  1) Initdb new datanode
>>> >>>         /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data3
>>> --nodename
>>> >>> data_node_3
>>> >>>
>>> >>>  2) Make necessary changes in its postgresql.conf, in particular
>>> specify
>>> >>> new datanode name
>>> >>>
>>> >>>  3) Connect to any of the existing coordinators & lock the cluster
>>> for
>>> >>> backup
>>> >>>         ./psql postgres -p 5432
>>> >>>         SET xc_lock_for_backup=yes;
>>> >>>         \q
>>> >>>
>>> >>>  4) Connect to any of the existing datanodes and take backup of the
>>> >>> database
>>> >>>         ./pg_dump -p 15432 -C -s
>>> >>> --file=/home/edb/Desktop/NodeAddition/dumps/102_all_objects_dn1.sql
>>> test
>>> >>>
>>> >>>  5) Start the new datanode specify --restoremode while starting the
>>> it
>>> >>>         ./postgres --restoremode -D ../data3 -p 35432
>>> >>>
>>> >>>  6) Create the new database on the new datanode
>>> >>>         ./createdb test -p 35432
>>> >>>
>>> >>>  7) create the roles and table spaces manually, the dump does not
>>> contain
>>> >>> roles or table spaces
>>> >>>         ./psql test -p 35432
>>> >>>         CREATE ROLE admin WITH LOGIN CREATEDB CREATEROLE;
>>> >>>         CREATE TABLESPACE my_space LOCATION
>>> >>> '/usr/local/pgsql/my_space_location';
>>> >>>         \q
>>> >>>
>>> >>>  8) Restore the backup that was taken from an existing datanode by
>>> >>> connecting to the new datanode directly
>>> >>>          ./psql -d test -f
>>> >>> /home/edb/Desktop/NodeAddition/dumps/102_all_objects_dn1.sql -p 35432
>>> >>>
>>> >>>  9) Quit the new datanode
>>> >>>
>>> >>> 10) Connect to any of the existing coordinators & unlock the cluster
>>> >>>         ./psql postgres -p 5432
>>> >>>         SET xc_lock_for_backup=no;
>>> >>>         \q
>>> >>>
>>> >>> 11) Start the new datanode as a datanode by specifying --datanode
>>> >>>         ./postgres --datanode -D ../data3 -p 35432
>>> >>>
>>> >>> 12) Create the new datanode on all the coordinators and reload
>>> >>> configuration
>>> >>>         CREATE NODE DATA_NODE_3 WITH (HOST = 'localhost', type =
>>> >>> 'datanode', PORT = 35432);
>>> >>>         SELECT pgxc_pool_reload();
>>> >>>
>>> >>> 13) Redistribute data by using ALTER TABLE REDISTRIBUTE
>>> >>>
>>> >>> 14) The new daatnode is now ready
>>> >>>         ./psql test
>>> >>>         create table test_new_dn(a int, b int) distribute by
>>> replication;
>>> >>>         insert into test_new_dn values(1,2);
>>> >>>         EXECUTE DIRECT ON (data_node_1) 'SELECT * from test_new_dn';
>>> >>>         EXECUTE DIRECT ON (data_node_2) 'SELECT * from test_new_dn';
>>> >>>         EXECUTE DIRECT ON (data_node_3) 'SELECT * from test_new_dn';
>>> >>>
>>> >>> Please note that the steps assume that the patch sent earlier
>>> >>> 1_lock_cluster.patch in mail subject [Patch to lock cluster] is
>>> applied.
>>> >>>
>>> >>> I have also attached test database scripts, that would help in patch
>>> >>> review.
>>> >>>
>>> >>> Comments are welcome.
>>> >>>
>>> >>> --
>>> >>> Abbas
>>> >>> Architect
>>> >>> EnterpriseDB Corporation
>>> >>> The Enterprise PostgreSQL Company
>>> >>>
>>> >>> Phone: 92-334-5100153
>>> >>>
>>> >>> Website: www.enterprisedb.com
>>> >>> EnterpriseDB Blog: http://blogs.enterprisedb.com/
>>> >>> Follow us on Twitter: http://www.twitter.com/enterprisedb
>>> >>>
>>> >>> This e-mail message (and any attachment) is intended for the use of
>>> >>> the individual or entity to whom it is addressed. This message
>>> >>> contains information from EnterpriseDB Corporation that may be
>>> >>> privileged, confidential, or exempt from disclosure under applicable
>>> >>> law. If you are not the intended recipient or authorized to receive
>>> >>> this for the intended recipient, any use, dissemination,
>>> distribution,
>>> >>> retention, archiving, or copying of this communication is strictly
>>> >>> prohibited. If you have received this e-mail in error, please notify
>>> >>> the sender immediately by reply e-mail and delete this message.
>>> >>>
>>> >>>
>>> ------------------------------------------------------------------------------
>>> >>> Everyone hates slow websites. So do we.
>>> >>> Make your web apps faster with AppDynamics
>>> >>> Download AppDynamics Lite for free today:
>>> >>> http://p.sf.net/sfu/appdyn_d2d_feb
>>> >>> _______________________________________________
>>> >>> Postgres-xc-developers mailing list
>>> >>> Pos...@li...
>>> >>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>> >>>
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > --
>>> > Abbas
>>> > Architect
>>> > EnterpriseDB Corporation
>>> > The Enterprise PostgreSQL Company
>>> >
>>> > Phone: 92-334-5100153
>>> >
>>> > Website: www.enterprisedb.com
>>> > EnterpriseDB Blog: http://blogs.enterprisedb.com/
>>> > Follow us on Twitter: http://www.twitter.com/enterprisedb
>>> >
>>> > This e-mail message (and any attachment) is intended for the use of
>>> > the individual or entity to whom it is addressed. This message
>>> > contains information from EnterpriseDB Corporation that may be
>>> > privileged, confidential, or exempt from disclosure under applicable
>>> > law. If you are not the intended recipient or authorized to receive
>>> > this for the intended recipient, any use, dissemination, distribution,
>>> > retention, archiving, or copying of this communication is strictly
>>> > prohibited. If you have received this e-mail in error, please notify
>>> > the sender immediately by reply e-mail and delete this message.
>>>
>>
>>
>>
>> --
>> --
>> Abbas
>> Architect
>> EnterpriseDB Corporation
>> The Enterprise PostgreSQL Company
>>
>> Phone: 92-334-5100153
>>
>> Website: www.enterprisedb.com
>> EnterpriseDB Blog: http://blogs.enterprisedb.com/
>> Follow us on Twitter: http://www.twitter.com/enterprisedb
>>
>> This e-mail message (and any attachment) is intended for the use of
>> the individual or entity to whom it is addressed. This message
>> contains information from EnterpriseDB Corporation that may be
>> privileged, confidential, or exempt from disclosure under applicable
>> law. If you are not the intended recipient or authorized to receive
>> this for the intended recipient, any use, dissemination, distribution,
>> retention, archiving, or copying of this communication is strictly
>> prohibited. If you have received this e-mail in error, please notify
>> the sender immediately by reply e-mail and delete this message.
>
>
>
>
> --
> --
> Abbas
> Architect
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>
> Phone: 92-334-5100153
>
> Website: www.enterprisedb.com
> EnterpriseDB Blog: http://blogs.enterprisedb.com/
> Follow us on Twitter: http://www.twitter.com/enterprisedb
>
> This e-mail message (and any attachment) is intended for the use of
> the individual or entity to whom it is addressed. This message
> contains information from EnterpriseDB Corporation that may be
> privileged, confidential, or exempt from disclosure under applicable
> law. If you are not the intended recipient or authorized to receive
> this for the intended recipient, any use, dissemination, distribution,
> retention, archiving, or copying of this communication is strictly
> prohibited. If you have received this e-mail in error, please notify
> the sender immediately by reply e-mail and delete this message.
>
>
> ------------------------------------------------------------------------------
> Own the Future-Intel(R) Level Up Game Demo Contest 2013
> Rise to greatness in Intel's independent game demo contest. Compete
> for recognition, cash, and the chance to get your game on Steam.
> $5K grand prize plus 10 genre and skill prizes. Submit your demo
> by 6/6/13. http://altfarm.mediaplex.com/ad/ck/12124-176961-30367-2
> _______________________________________________
> Postgres-xc-developers mailing list
> Pos...@li...
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>
>

Re: [Postgres-xc-developers] Add a new node to the cluster [Feature ID 3608379]

From: Abbas B. <abb...@en...> - 2013-03-31 08:37:13

Attachments: 3_restore-mode.patch test_all_objects.sql all_objects_no_tests.sql complex_fns.c

Hi,
Attached please find the revised patch for restore mode. This patch has to
be applied on top of the patches I sent earlier for
 3608377,
 3608376 &
 3608375.

I have also attached some scripts and a C file useful for testing the whole
procedure. It is a database that has many objects in it.

Here are the revised instructions for adding new nodes to the cluster.

======================================

Here are the steps to add a new coordinator

 1) Initdb new coordinator
  /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data_cord3 --nodename
coord_3

 2) Make necessary changes in its postgresql.conf, in particular specify
new coordinator name and pooler port

 3) Connect to any of the existing coordinators & lock the cluster for
backup, do not close this session
    ./psql postgres -p 5432
    select pgxc_lock_for_backup();

 4) Connect to any of the existing coordinators and take backup of the
database
    ./pg_dumpall -p 5432 -s --include-nodes --dump-nodes
--file=/home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1100_all_objects_coord.sql

 5) Start the new coordinator specify --restoremode while starting the
coordinator
    ./postgres --restoremode -D ../data_cord3 -p 5455

 6) Create the new database on the new coordinator - optional
    ./createdb test -p 5455

 7) Restore the backup that was taken from an existing coordinator by
connecting to the new coordinator directly
    ./psql -d test -f
/home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1100_all_objects_coord.sql
-p 5455

 8) Quit the new coordinator

 9) Start the new coordinator as a by specifying --coordinator
    ./postgres --coordinator -D ../data_cord3 -p 5455

10) Create the new coordinator on rest of the coordinators and reload
configuration
    CREATE NODE COORD_3 WITH (HOST = 'localhost', type = 'coordinator',
PORT = 5455);
    SELECT pgxc_pool_reload();

11) Quit the session of step 3, this will unlock the cluster

12) The new coordinator is now ready
    ./psql test -p 5455
    create table test_new_coord(a int, b int);
    \q
    ./psql test -p 5432
    select * from test_new_coord;

*======================================*
*======================================*

Here are the steps to add a new datanode


 1) Initdb new datanode
    /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data3 --nodename
data_node_3

 2) Make necessary changes in its postgresql.conf, in particular specify
new datanode name

 3) Connect to any of the existing coordinators & lock the cluster for
backup, do not close this session
    ./psql postgres -p 5432
    select pgxc_lock_for_backup();

 4) Connect to any of the existing datanodes and take backup of the database
    ./pg_dumpall -p 15432 -s --include-nodes
--file=/home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1122_all_objects_dn1.sql

 5) Start the new datanode specify --restoremode while starting the it
    ./postgres --restoremode -D ../data3 -p 35432

 6) Restore the backup that was taken from an existing datanode by
connecting to the new datanode directly
    ./psql -d postgres -f
/home/edb/Desktop/NodeAddition/revised_patches/misc_dumps/1122_all_objects_dn1.sql
-p 35432

 7) Quit the new datanode

 8) Start the new datanode as a datanode by specifying --datanode
    ./postgres --datanode -D ../data3 -p 35432

 9) Create the new datanode on all the coordinators and reload configuration
    CREATE NODE DATA_NODE_3 WITH (HOST = 'localhost', type = 'datanode',
PORT = 35432);
    SELECT pgxc_pool_reload();

10) Quit the session of step 3, this will unlock the cluster

11) Redistribute data by using ALTER TABLE REDISTRIBUTE

12) The new daatnode is now ready
    ./psql test
    create table test_new_dn(a int, b int) distribute by replication;
    insert into test_new_dn values(1,2);
    EXECUTE DIRECT ON (data_node_1) 'SELECT * from test_new_dn';
    EXECUTE DIRECT ON (data_node_2) 'SELECT * from test_new_dn';
    EXECUTE DIRECT ON (data_node_3) 'SELECT * from test_new_dn';

======================================

On Wed, Mar 27, 2013 at 5:02 PM, Abbas Butt <abb...@en...>wrote:

> Feature ID 3608379
>
> On Fri, Mar 1, 2013 at 5:48 PM, Amit Khandekar <
> ami...@en...> wrote:
>
>> On 1 March 2013 01:30, Abbas Butt <abb...@en...> wrote:
>> >
>> >
>> > On Thu, Feb 28, 2013 at 12:44 PM, Amit Khandekar
>> > <ami...@en...> wrote:
>> >>
>> >>
>> >>
>> >> On 28 February 2013 10:23, Abbas Butt <abb...@en...>
>> wrote:
>> >>>
>> >>> Hi All,
>> >>>
>> >>> Attached please find a patch that provides a new command line argument
>> >>> for postgres called --restoremode.
>> >>>
>> >>> While adding a new node to the cluster we need to restore the schema
>> of
>> >>> existing database to the new node.
>> >>> If the new node is a datanode and we connect directly to it, it does
>> not
>> >>> allow DDL, because it is in read only mode &
>> >>> If the new node is a coordinator, it will send DDLs to all the other
>> >>> coordinators which we do not want it to do.
>> >>
>> >>
>> >> What if we allow writes in standalone mode, so that we would initialize
>> >> the new node using standalone mode instead of --restoremode ?
>> >
>> >
>> > Please take a look at the patch, I am using --restoremode in place of
>> > --coordinator & --datanode. I am not sure how would stand alone mode
>> fit in
>> > here.
>>
>> I was trying to see if we can avoid adding a new mode, instead, use
>> standalone mode for all the purposes for which restoremode is used.
>> Actually I checked the documentation, it says this mode is used only
>> for debugging or recovery purposes, so now I myself am a bit hesitent
>> about this mode for the purpose of restoring.
>>
>> >
>> >>
>> >>
>> >>>
>> >>> To provide ability to restore on the new node a new command line
>> argument
>> >>> is provided.
>> >>> It is to be provided in place of --coordinator OR --datanode.
>> >>> In restore mode both coordinator and datanode are internally treated
>> as a
>> >>> datanode.
>> >>> For more details see patch comments.
>> >>>
>> >>> After this patch one can add a new node to the cluster.
>> >>>
>> >>> Here are the steps to add a new coordinator
>> >>>
>> >>>
>> >>>  1) Initdb new coordinator
>> >>>         /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data_cord3
>> >>> --nodename coord_3
>> >>>
>> >>>  2) Make necessary changes in its postgresql.conf, in particular
>> specify
>> >>> new coordinator name and pooler port
>> >>>
>> >>>  3) Connect to any of the existing coordinators & lock the cluster for
>> >>> backup
>> >>>         ./psql postgres -p 5432
>> >>>         SET xc_lock_for_backup=yes;
>> >>>         \q
>> >>
>> >>
>> >> I haven't given a thought on the earlier patch you sent for cluster
>> lock
>> >> implementation; may be we can discuss this on that thread, but just a
>> quick
>> >> question:
>> >>
>> >> Does the cluster-lock command wait for the ongoing DDL commands to
>> finish
>> >> ? If not, we have problems. The subsequent pg_dump would not contain
>> objects
>> >> created by these particular DDLs.
>> >
>> >
>> > Suppose you have a two coordinator cluster. Assume one client connected
>> to
>> > each. Suppose one client issues a lock cluster command and the other
>> issues
>> > a DDL. Is this what you mean by an ongoing DDL? If true then answer to
>> your
>> > question is Yes.
>> >
>> > Suppose you have a prepared transaction that has a DDL in it, again if
>> this
>> > can be considered an on going DDL, then again answer to your question is
>> > Yes.
>> >
>> > Suppose you have a two coordinator cluster. Assume one client connected
>> to
>> > each. One client starts a transaction and issues a DDL, the second
>> client
>> > issues a lock cluster command, the first commits the transaction. If
>> this is
>> > an ongoing DDL, then the answer to your question is No. But its a
>> matter of
>> > deciding which camp are we going to put COMMIT in, the allow camp, or
>> the
>> > deny camp. I decided to put it in allow camp, because I have not yet
>> written
>> > any code to detect whether a transaction being committed has a DDL in
>> it or
>> > not, and stopping all transactions from committing looks too
>> restrictive to
>> > me.
>> >
>> > Do you have some other meaning of an ongoing DDL?
>> >
>> > I agree that we should have discussed this on the right thread. Lets
>> > continue this discussion on that thread.
>>
>> Continued on the other thread.
>>
>> >
>> >>
>> >>
>> >>>
>> >>>
>> >>>  4) Connect to any of the existing coordinators and take backup of the
>> >>> database
>> >>>         ./pg_dump -p 5432 -C -s
>> >>> --file=/home/edb/Desktop/NodeAddition/dumps/101_all_objects_coord.sql
>> test
>> >>>
>> >>>  5) Start the new coordinator specify --restoremode while starting the
>> >>> coordinator
>> >>>         ./postgres --restoremode -D ../data_cord3 -p 5455
>> >>>
>> >>>  6) connect to the new coordinator directly
>> >>>         ./psql postgres -p 5455
>> >>>
>> >>>  7) create all the datanodes and the rest of the coordinators on the
>> new
>> >>> coordiantor & reload configuration
>> >>>         CREATE NODE DATA_NODE_1 WITH (HOST = 'localhost', type =
>> >>> 'datanode', PORT = 15432, PRIMARY);
>> >>>         CREATE NODE DATA_NODE_2 WITH (HOST = 'localhost', type =
>> >>> 'datanode', PORT = 25432);
>> >>>
>> >>>         CREATE NODE COORD_1 WITH (HOST = 'localhost', type =
>> >>> 'coordinator', PORT = 5432);
>> >>>         CREATE NODE COORD_2 WITH (HOST = 'localhost', type =
>> >>> 'coordinator', PORT = 5433);
>> >>>
>> >>>         SELECT pgxc_pool_reload();
>> >>>
>> >>>  8) quit psql
>> >>>
>> >>>  9) Create the new database on the new coordinator
>> >>>         ./createdb test -p 5455
>> >>>
>> >>> 10) create the roles and table spaces manually, the dump does not
>> contain
>> >>> roles or table spaces
>> >>>         ./psql test -p 5455
>> >>>         CREATE ROLE admin WITH LOGIN CREATEDB CREATEROLE;
>> >>>         CREATE TABLESPACE my_space LOCATION
>> >>> '/usr/local/pgsql/my_space_location';
>> >>>         \q
>> >>>
>> >>
>> >> Will pg_dumpall help ? It dumps roles also.
>> >
>> >
>> > Yah , but I am giving example of pg_dump so this step has to be there.
>> >
>> >>
>> >>
>> >>
>> >>>
>> >>> 11) Restore the backup that was taken from an existing coordinator by
>> >>> connecting to the new coordinator directly
>> >>>          ./psql -d test -f
>> >>> /home/edb/Desktop/NodeAddition/dumps/101_all_objects_coord.sql -p 5455
>> >>>
>> >>> 11) Quit the new coordinator
>> >>>
>> >>> 12) Connect to any of the existing coordinators & unlock the cluster
>> >>>         ./psql postgres -p 5432
>> >>>         SET xc_lock_for_backup=no;
>> >>>         \q
>> >>>
>> >>
>> >> Unlocking the cluster has to be done *after* the node is added into the
>> >> cluster.
>> >
>> >
>> > Very true. I stand corrected. This means CREATE NODE has to be allowed
>> when
>> > xc_lock_for_backup is set.
>> >
>> >>
>> >>
>> >>
>> >>>
>> >>> 13) Start the new coordinator as a by specifying --coordinator
>> >>>         ./postgres --coordinator -D ../data_cord3 -p 5455
>> >>>
>> >>> 14) Create the new coordinator on rest of the coordinators and reload
>> >>> configuration
>> >>>         CREATE NODE COORD_3 WITH (HOST = 'localhost', type =
>> >>> 'coordinator', PORT = 5455);
>> >>>         SELECT pgxc_pool_reload();
>> >>>
>> >>> 15) The new coordinator is now ready
>> >>>         ./psql test -p 5455
>> >>>         create table test_new_coord(a int, b int);
>> >>>         \q
>> >>>         ./psql test -p 5432
>> >>>         select * from test_new_coord;
>> >>>
>> >>>
>> >>> Here are the steps to add a new datanode
>> >>>
>> >>>
>> >>>  1) Initdb new datanode
>> >>>         /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data3
>> --nodename
>> >>> data_node_3
>> >>>
>> >>>  2) Make necessary changes in its postgresql.conf, in particular
>> specify
>> >>> new datanode name
>> >>>
>> >>>  3) Connect to any of the existing coordinators & lock the cluster for
>> >>> backup
>> >>>         ./psql postgres -p 5432
>> >>>         SET xc_lock_for_backup=yes;
>> >>>         \q
>> >>>
>> >>>  4) Connect to any of the existing datanodes and take backup of the
>> >>> database
>> >>>         ./pg_dump -p 15432 -C -s
>> >>> --file=/home/edb/Desktop/NodeAddition/dumps/102_all_objects_dn1.sql
>> test
>> >>>
>> >>>  5) Start the new datanode specify --restoremode while starting the it
>> >>>         ./postgres --restoremode -D ../data3 -p 35432
>> >>>
>> >>>  6) Create the new database on the new datanode
>> >>>         ./createdb test -p 35432
>> >>>
>> >>>  7) create the roles and table spaces manually, the dump does not
>> contain
>> >>> roles or table spaces
>> >>>         ./psql test -p 35432
>> >>>         CREATE ROLE admin WITH LOGIN CREATEDB CREATEROLE;
>> >>>         CREATE TABLESPACE my_space LOCATION
>> >>> '/usr/local/pgsql/my_space_location';
>> >>>         \q
>> >>>
>> >>>  8) Restore the backup that was taken from an existing datanode by
>> >>> connecting to the new datanode directly
>> >>>          ./psql -d test -f
>> >>> /home/edb/Desktop/NodeAddition/dumps/102_all_objects_dn1.sql -p 35432
>> >>>
>> >>>  9) Quit the new datanode
>> >>>
>> >>> 10) Connect to any of the existing coordinators & unlock the cluster
>> >>>         ./psql postgres -p 5432
>> >>>         SET xc_lock_for_backup=no;
>> >>>         \q
>> >>>
>> >>> 11) Start the new datanode as a datanode by specifying --datanode
>> >>>         ./postgres --datanode -D ../data3 -p 35432
>> >>>
>> >>> 12) Create the new datanode on all the coordinators and reload
>> >>> configuration
>> >>>         CREATE NODE DATA_NODE_3 WITH (HOST = 'localhost', type =
>> >>> 'datanode', PORT = 35432);
>> >>>         SELECT pgxc_pool_reload();
>> >>>
>> >>> 13) Redistribute data by using ALTER TABLE REDISTRIBUTE
>> >>>
>> >>> 14) The new daatnode is now ready
>> >>>         ./psql test
>> >>>         create table test_new_dn(a int, b int) distribute by
>> replication;
>> >>>         insert into test_new_dn values(1,2);
>> >>>         EXECUTE DIRECT ON (data_node_1) 'SELECT * from test_new_dn';
>> >>>         EXECUTE DIRECT ON (data_node_2) 'SELECT * from test_new_dn';
>> >>>         EXECUTE DIRECT ON (data_node_3) 'SELECT * from test_new_dn';
>> >>>
>> >>> Please note that the steps assume that the patch sent earlier
>> >>> 1_lock_cluster.patch in mail subject [Patch to lock cluster] is
>> applied.
>> >>>
>> >>> I have also attached test database scripts, that would help in patch
>> >>> review.
>> >>>
>> >>> Comments are welcome.
>> >>>
>> >>> --
>> >>> Abbas
>> >>> Architect
>> >>> EnterpriseDB Corporation
>> >>> The Enterprise PostgreSQL Company
>> >>>
>> >>> Phone: 92-334-5100153
>> >>>
>> >>> Website: www.enterprisedb.com
>> >>> EnterpriseDB Blog: http://blogs.enterprisedb.com/
>> >>> Follow us on Twitter: http://www.twitter.com/enterprisedb
>> >>>
>> >>> This e-mail message (and any attachment) is intended for the use of
>> >>> the individual or entity to whom it is addressed. This message
>> >>> contains information from EnterpriseDB Corporation that may be
>> >>> privileged, confidential, or exempt from disclosure under applicable
>> >>> law. If you are not the intended recipient or authorized to receive
>> >>> this for the intended recipient, any use, dissemination, distribution,
>> >>> retention, archiving, or copying of this communication is strictly
>> >>> prohibited. If you have received this e-mail in error, please notify
>> >>> the sender immediately by reply e-mail and delete this message.
>> >>>
>> >>>
>> ------------------------------------------------------------------------------
>> >>> Everyone hates slow websites. So do we.
>> >>> Make your web apps faster with AppDynamics
>> >>> Download AppDynamics Lite for free today:
>> >>> http://p.sf.net/sfu/appdyn_d2d_feb
>> >>> _______________________________________________
>> >>> Postgres-xc-developers mailing list
>> >>> Pos...@li...
>> >>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>> >>>
>> >>
>> >
>> >
>> >
>> > --
>> > --
>> > Abbas
>> > Architect
>> > EnterpriseDB Corporation
>> > The Enterprise PostgreSQL Company
>> >
>> > Phone: 92-334-5100153
>> >
>> > Website: www.enterprisedb.com
>> > EnterpriseDB Blog: http://blogs.enterprisedb.com/
>> > Follow us on Twitter: http://www.twitter.com/enterprisedb
>> >
>> > This e-mail message (and any attachment) is intended for the use of
>> > the individual or entity to whom it is addressed. This message
>> > contains information from EnterpriseDB Corporation that may be
>> > privileged, confidential, or exempt from disclosure under applicable
>> > law. If you are not the intended recipient or authorized to receive
>> > this for the intended recipient, any use, dissemination, distribution,
>> > retention, archiving, or copying of this communication is strictly
>> > prohibited. If you have received this e-mail in error, please notify
>> > the sender immediately by reply e-mail and delete this message.
>>
>
>
>
> --
> --
> Abbas
> Architect
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>
> Phone: 92-334-5100153
>
> Website: www.enterprisedb.com
> EnterpriseDB Blog: http://blogs.enterprisedb.com/
> Follow us on Twitter: http://www.twitter.com/enterprisedb
>
> This e-mail message (and any attachment) is intended for the use of
> the individual or entity to whom it is addressed. This message
> contains information from EnterpriseDB Corporation that may be
> privileged, confidential, or exempt from disclosure under applicable
> law. If you are not the intended recipient or authorized to receive
> this for the intended recipient, any use, dissemination, distribution,
> retention, archiving, or copying of this communication is strictly
> prohibited. If you have received this e-mail in error, please notify
> the sender immediately by reply e-mail and delete this message.




-- 
--
Abbas
Architect
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: 92-334-5100153

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of
the individual or entity to whom it is addressed. This message
contains information from EnterpriseDB Corporation that may be
privileged, confidential, or exempt from disclosure under applicable
law. If you are not the intended recipient or authorized to receive
this for the intended recipient, any use, dissemination, distribution,
retention, archiving, or copying of this communication is strictly
prohibited. If you have received this e-mail in error, please notify
the sender immediately by reply e-mail and delete this message.

Re: [Postgres-xc-developers] Manually Table Partitioning

From: Kaiji C. <ch...@im...> - 2013-03-30 12:35:26

Thanks for your help, I'll use this solution in my project.

On Mar 30, 2013, at 12:40 PM, "Abbas Butt" <abb...@en...<mailto:abb...@en...>> wrote:

On Fri, Mar 29, 2013 at 3:19 PM, Kaiji Chen <ch...@im...<mailto:ch...@im...>> wrote:
Hi,
I'm working on a data partitioning project on PostgreSQL by adding a middleware between the database cluster interface and applications that modify the SQL statement to specific data nodes. I just find that PostgresXC has a nice GTM that can help me do the distributed transaction management works, I considered to transfer my project on it.
It seems the sliders (http://wiki.postgresql.org/images/f/f6/PGXC_Scalability_PGOpen2012.pdf) intend that user defined table distribution is not available, but the coordinator can choose specific data node when processing the queries, and the table will be distributed to by default if DISTRIBUTED BY is not specified. Then I wonder if I can specify a data node in each query and stop the default auto distributing process.

Here is what you can do. Add a column of type int in the table and distribute the table by modulo of the added column. Now if you want to specify in your query that the insert should go to first data node use value 0 for the added column, for second data node use 1 and so on.

Off course a better way would be a add support for a user defined function for computing target data node in XC, but the above idea is valid for the current implementation.

------------------------------------------------------------------------------
Own the Future-Intel(R) Level Up Game Demo Contest 2013
Rise to greatness in Intel's independent game demo contest. Compete
for recognition, cash, and the chance to get your game on Steam.
$5K grand prize plus 10 genre and skill prizes. Submit your demo
by 6/6/13. http://altfarm.mediaplex.com/ad/ck/12124-176961-30367-2
_______________________________________________
Postgres-xc-developers mailing list
Pos...@li...<mailto:Pos...@li...>
https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers

--
--
Abbas
Architect
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: 92-334-5100153

Website: www.enterprisedb.com<http://www.enterprisedb.com/>
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of
the individual or entity to whom it is addressed. This message
contains information from EnterpriseDB Corporation that may be
privileged, confidential, or exempt from disclosure under applicable
law. If you are not the intended recipient or authorized to receive
this for the intended recipient, any use, dissemination, distribution,
retention, archiving, or copying of this communication is strictly
prohibited. If you have received this e-mail in error, please notify
the sender immediately by reply e-mail and delete this message.

Re: [Postgres-xc-developers] Manually Table Partitioning

From: Abbas B. <abb...@en...> - 2013-03-30 11:40:33

On Fri, Mar 29, 2013 at 3:19 PM, Kaiji Chen <ch...@im...> wrote:

>  Hi,
> I'm working on a data partitioning project on PostgreSQL by adding a
> middleware between the database cluster interface and applications that
> modify the SQL statement to specific data nodes. I just find that
> PostgresXC has a nice GTM that can help me do the distributed transaction
> management works, I considered to transfer my project on it.
> It seems the sliders (
> http://wiki.postgresql.org/images/f/f6/PGXC_Scalability_PGOpen2012.pdf)
> intend that user defined table distribution is not available, but the
> coordinator can choose specific data node when processing the queries, and
> the table will be distributed to by default if DISTRIBUTED BY is not
> specified. Then I wonder if I can specify a data node in each query and
> stop the default auto distributing process.
>

Here is what you can do. Add a column of type int in the table and
distribute the table by modulo of the added column. Now if you want to
specify in your query that the insert should go to first data node use
value 0 for the added column, for second data node use 1 and so on.

Off course a better way would be a add support for a user defined function
for computing target data node in XC, but the above idea is valid for the
current implementation.

>
>
> ------------------------------------------------------------------------------
> Own the Future-Intel(R) Level Up Game Demo Contest 2013
> Rise to greatness in Intel's independent game demo contest. Compete
> for recognition, cash, and the chance to get your game on Steam.
> $5K grand prize plus 10 genre and skill prizes. Submit your demo
> by 6/6/13. http://altfarm.mediaplex.com/ad/ck/12124-176961-30367-2
> _______________________________________________
> Postgres-xc-developers mailing list
> Pos...@li...
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>
>

-- 
--
Abbas
Architect
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: 92-334-5100153

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of
the individual or entity to whom it is addressed. This message
contains information from EnterpriseDB Corporation that may be
privileged, confidential, or exempt from disclosure under applicable
law. If you are not the intended recipient or authorized to receive
this for the intended recipient, any use, dissemination, distribution,
retention, archiving, or copying of this communication is strictly
prohibited. If you have received this e-mail in error, please notify
the sender immediately by reply e-mail and delete this message.

Re: [Postgres-xc-developers] Manually Table Partitioning

From: Kaiji C. <ch...@im...> - 2013-03-30 09:12:24

Thanks for your reply!

It seems ok if I use EXECUTE DIRECT and manually maintain the data concurrency and a global index in my middleware. But it looks like I've skipped the PostgresXC coordinator, it will not be the best choice.

I just come up a idea applying external data partitioning design to the PostgresXC. As stated in the document XC can distribute tables to data nodes using hash function. Then can I manipulate the original table and add a new column as my partition decision and let the table distributed by this column. Then we add this column to the compound primary key of the table and let the coordinator deal with the query planning work. I think this can be done if for different tables, the same hash value will be partitioned to the same data node if there is no modification to the set of data nodes.




Yours,
Kaiji Chen
PhD Candidate<mailto:ch...@im...>
IMADA， Southern Denmark University
Email: ch...@im...<mailto:ch...@im...>
________________________________
From: Michael Paquier [mic...@gm...]
Sent: Saturday, March 30, 2013 5:55 AM
To: Kaiji Chen
Cc: pos...@li...
Subject: Re: [Postgres-xc-developers] Manually Table Partitioning




On Fri, Mar 29, 2013 at 7:19 PM, Kaiji Chen <ch...@im...<mailto:ch...@im...>> wrote:
Hi,
I'm working on a data partitioning project on PostgreSQL by adding a middleware between the database cluster interface and applications that modify the SQL statement to specific data nodes. I just find that PostgresXC has a nice GTM that can help me do the distributed transaction management works, I considered to transfer my project on it.
It seems the sliders (http://wiki.postgresql.org/images/f/f6/PGXC_Scalability_PGOpen2012.pdf) intend that user defined table distribution is not available, but the coordinator can choose specific data node when processing the queries, and the table will be distributed to by default if DISTRIBUTED BY is not specified. Then I wonder if I can specify a data node in each query and stop the default auto distributing process.
For SELECT queries, you can use EXECUTE DIRECT:
http://postgres-xc.sourceforge.net/docs/1_0_2/sql-executedirect.html
The results you get might not be exact as not global query planning is not done and the query string is sent as-is.

Note that you cannot use EXECUTE DIRECT with DML or the whole cluster consistency would be broken.
--
Michael

261 messages has been excluded from this view by a project administrator.

Flat | Threaded

<< < 1 .. 43 44 45 46 47 .. 139 > >> (Page 45 of 139)