From: Nikhil S. <ni...@st...> - 2013-03-07 12:34:43
|
Hi, PFA, patch which fixes an obnoxious crash in GTM Standby. This one was a tough nut to crack down. The crash is as below Program terminated with signal 11, Segmentation fault. #0 0x00000000004253c9 in gtm_lappend () Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6_3.6.x86_64 libgcc-4.4.6-4.el6.x86_64 (gdb) bt #0 0x00000000004253c9 in gtm_lappend () #1 0x000000000040ad77 in GTM_BkupBeginTransactionGetGXIDMulti.clone.0 () #2 0x000000000040aedb in ProcessBkupBeginTransactionGetGXIDCommand () #3 0x000000000040417c in GTM_ThreadMain () IMHO, using TopMemoryContext to mean the top context of each thread is pretty confusing. Bad choice of name for the memory context according to me. Maybe we could have avoided this crash if we had used a different name for the context. This "TopMemoryContext" goes away when that thread goes away. So ain't nothing TOP about it. The GTMTransactions.gt_open_transactions list was being appended to using this memory context. So later if another thread came in (and the earlier appending thread had been cleaned up), it will find garbage in this list and this was causing the crash. I always saw a couple of threads being cleaned up in the gtm standby logs just prior to the crash. The fix is to use TopMostMemoryContext. If it were to me I would re-haul this TopMemoryContext naming business in GTM. Am sure people will get confused in the future too when they write code.. Regards, Nikhils -- StormDB - http://www.stormdb.com The Database Cloud Postgres-XC Support and Service |
From: Koichi S. <koi...@gm...> - 2013-03-08 02:31:46
|
Yes, memory context usage of this part is not correct and it leaves garbage. I will commit it if no further input is given. Regards; ---------- Koichi Suzuki 2013/3/7 Nikhil Sontakke <ni...@st...>: > Hi, > > PFA, patch which fixes an obnoxious crash in GTM Standby. This one was > a tough nut to crack down. The crash is as below > > Program terminated with signal 11, Segmentation fault. > #0 0x00000000004253c9 in gtm_lappend () > Missing separate debuginfos, use: debuginfo-install > glibc-2.12-1.80.el6_3.6.x86_64 libgcc-4.4.6-4.el6.x86_64 > (gdb) bt > #0 0x00000000004253c9 in gtm_lappend () > #1 0x000000000040ad77 in GTM_BkupBeginTransactionGetGXIDMulti.clone.0 () > #2 0x000000000040aedb in ProcessBkupBeginTransactionGetGXIDCommand () > #3 0x000000000040417c in GTM_ThreadMain () > > > > IMHO, using TopMemoryContext to mean the top context of each thread is > pretty confusing. Bad choice of name for the memory context according > to me. Maybe we could have avoided this crash if we had used a > different name for the context. > > This "TopMemoryContext" goes away when that thread goes away. So ain't > nothing TOP about it. The GTMTransactions.gt_open_transactions list > was being appended to using this memory context. So later if another > thread came in (and the earlier appending thread had been cleaned up), > it will find garbage in this list and this was causing the crash. > > I always saw a couple of threads being cleaned up in the gtm standby > logs just prior to the crash. The fix is to use TopMostMemoryContext. > If it were to me I would re-haul this TopMemoryContext naming business > in GTM. Am sure people will get confused in the future too when they > write code.. > > Regards, > Nikhils > -- > StormDB - http://www.stormdb.com > The Database Cloud > Postgres-XC Support and Service > > ------------------------------------------------------------------------------ > Symantec Endpoint Protection 12 positioned as A LEADER in The Forrester > Wave(TM): Endpoint Security, Q1 2013 and "remains a good choice" in the > endpoint security space. For insight on selecting the right partner to > tackle endpoint security challenges, access the full report. > http://p.sf.net/sfu/symantec-dev2dev > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > |
From: Pavan D. <pav...@gm...> - 2013-03-08 04:46:34
|
On Thu, Mar 7, 2013 at 5:57 PM, Nikhil Sontakke <ni...@st...> wrote: > Hi, > > PFA, patch which fixes an obnoxious crash in GTM Standby. This one was > a tough nut to crack down. The crash is as below > > Program terminated with signal 11, Segmentation fault. > #0 0x00000000004253c9 in gtm_lappend () > Missing separate debuginfos, use: debuginfo-install > glibc-2.12-1.80.el6_3.6.x86_64 libgcc-4.4.6-4.el6.x86_64 > (gdb) bt > #0 0x00000000004253c9 in gtm_lappend () > #1 0x000000000040ad77 in GTM_BkupBeginTransactionGetGXIDMulti.clone.0 () > #2 0x000000000040aedb in ProcessBkupBeginTransactionGetGXIDCommand () > #3 0x000000000040417c in GTM_ThreadMain () > > > > IMHO, using TopMemoryContext to mean the top context of each thread is > pretty confusing. Bad choice of name for the memory context according > to me. Maybe we could have avoided this crash if we had used a > different name for the context. > > This "TopMemoryContext" goes away when that thread goes away. So ain't > nothing TOP about it. Well, let me at least try and defend because that's my baby :-) I think I chose name TopTransactionContext because I wanted to give a thread in GTM as much the same treatment as a process gets in Postgres. So I stick to the same names, but invented TopMost to mean the context which is global to the GTM process. My idea was and still is that we should avoid using TopMost as much as we can because that memory leaks will be hard to plug-in. Remember, we expect GTM to run as long as any one component of the cluster is running.. which pretty much means forever because while any one component of the cluster can go down, but not the entire cluster. But if its causing confusion, I won't mind adding a code commentary to explain the difference. Clearly my fault. Thanks, Pavan -- Pavan Deolasee http://www.linkedin.com/in/pavandeolasee |
From: Nikhil S. <ni...@st...> - 2013-03-08 06:39:21
|
>> IMHO, using TopMemoryContext to mean the top context of each thread is >> pretty confusing. Bad choice of name for the memory context according >> to me. Maybe we could have avoided this crash if we had used a >> different name for the context. >> >> This "TopMemoryContext" goes away when that thread goes away. So ain't >> nothing TOP about it. > > Well, let me at least try and defend because that's my baby :-) I > think I chose name TopTransactionContext because I wanted to give a > thread in GTM as much the same treatment as a process gets in > Postgres. So I stick to the same names, but invented TopMost to mean > the context which is global to the GTM process. My idea was and still > is that we should avoid using TopMost as much as we can because that > memory leaks will be hard to plug-in. Remember, we expect GTM to run > as long as any one component of the cluster is running.. which pretty > much means forever because while any one component of the cluster can > go down, but not the entire cluster. > > But if its causing confusion, I won't mind adding a code commentary to > explain the difference. Clearly my fault. > Thanks for the explanation Pavan. I come from the Postgres source code background and when I started looking at this problem I looked at the memory context and thought everything is fine because it said "TopMemoryContext". That confused me a little bit. Had it said "ThreadTopContext" it would have been much more readable IMHO. Regards, Nikhils -- StormDB - http://www.stormdb.com The Database Cloud Postgres-XC Support and Service |
From: Koichi S. <koi...@gm...> - 2013-03-08 07:16:54
|
Thank you Pavan. I think I added the lines in question. Because you are the original author if gtm, it's wonderful if you take a look at it. Regards; ---------- Koichi Suzuki 2013/3/8 Pavan Deolasee <pav...@gm...>: > On Thu, Mar 7, 2013 at 5:57 PM, Nikhil Sontakke <ni...@st...> wrote: >> Hi, >> >> PFA, patch which fixes an obnoxious crash in GTM Standby. This one was >> a tough nut to crack down. The crash is as below >> >> Program terminated with signal 11, Segmentation fault. >> #0 0x00000000004253c9 in gtm_lappend () >> Missing separate debuginfos, use: debuginfo-install >> glibc-2.12-1.80.el6_3.6.x86_64 libgcc-4.4.6-4.el6.x86_64 >> (gdb) bt >> #0 0x00000000004253c9 in gtm_lappend () >> #1 0x000000000040ad77 in GTM_BkupBeginTransactionGetGXIDMulti.clone.0 () >> #2 0x000000000040aedb in ProcessBkupBeginTransactionGetGXIDCommand () >> #3 0x000000000040417c in GTM_ThreadMain () >> >> >> >> IMHO, using TopMemoryContext to mean the top context of each thread is >> pretty confusing. Bad choice of name for the memory context according >> to me. Maybe we could have avoided this crash if we had used a >> different name for the context. >> >> This "TopMemoryContext" goes away when that thread goes away. So ain't >> nothing TOP about it. > > Well, let me at least try and defend because that's my baby :-) I > think I chose name TopTransactionContext because I wanted to give a > thread in GTM as much the same treatment as a process gets in > Postgres. So I stick to the same names, but invented TopMost to mean > the context which is global to the GTM process. My idea was and still > is that we should avoid using TopMost as much as we can because that > memory leaks will be hard to plug-in. Remember, we expect GTM to run > as long as any one component of the cluster is running.. which pretty > much means forever because while any one component of the cluster can > go down, but not the entire cluster. > > But if its causing confusion, I won't mind adding a code commentary to > explain the difference. Clearly my fault. > > Thanks, > Pavan > > -- > Pavan Deolasee > http://www.linkedin.com/in/pavandeolasee > > ------------------------------------------------------------------------------ > Symantec Endpoint Protection 12 positioned as A LEADER in The Forrester > Wave(TM): Endpoint Security, Q1 2013 and "remains a good choice" in the > endpoint security space. For insight on selecting the right partner to > tackle endpoint security challenges, access the full report. > http://p.sf.net/sfu/symantec-dev2dev > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers |
From: Pavan D. <pav...@gm...> - 2013-03-08 08:21:17
|
On Fri, Mar 8, 2013 at 12:46 PM, Koichi Suzuki <koi...@gm...> wrote: > Thank you Pavan. > > I think I added the lines in question. Because you are the original > author if gtm, it's wonderful if you take a look at it. > Looks good to me. The list under question is global to the process and hence the list cells must be allocated in the process-level top context i.e. TopMostMemoryContext. I checked and there are other places in the code where its explained why its necessary to allocate them in the said context. Thanks, Pavan -- Pavan Deolasee http://www.linkedin.com/in/pavandeolasee |
From: Nikhil S. <ni...@st...> - 2013-04-03 09:48:20
|
>> I think I added the lines in question. Because you are the original >> author if gtm, it's wonderful if you take a look at it. >> > > Looks good to me. The list under question is global to the process and > hence the list cells must be allocated in the process-level top > context i.e. TopMostMemoryContext. I checked and there are other > places in the code where its explained why its necessary to allocate > them in the said context. > This commit is still pending. It's an important fix for GTM standby. Regards, Nikhils > Thanks, > Pavan > > -- > Pavan Deolasee > http://www.linkedin.com/in/pavandeolasee -- StormDB - http://www.stormdb.com The Database Cloud |
From: Koichi S. <koi...@gm...> - 2013-04-04 06:42:22
|
Sorry taking too long. I will commit it soon. Kind Regards; ---------- Koichi Suzuki 2013/4/3 Nikhil Sontakke <ni...@st...> > >> I think I added the lines in question. Because you are the original > >> author if gtm, it's wonderful if you take a look at it. > >> > > > > Looks good to me. The list under question is global to the process and > > hence the list cells must be allocated in the process-level top > > context i.e. TopMostMemoryContext. I checked and there are other > > places in the code where its explained why its necessary to allocate > > them in the said context. > > > > This commit is still pending. It's an important fix for GTM standby. > > Regards, > Nikhils > > > Thanks, > > Pavan > > > > -- > > Pavan Deolasee > > http://www.linkedin.com/in/pavandeolasee > > > > -- > StormDB - http://www.stormdb.com > The Database Cloud > |