Re: [Postgres-xc-general] failed to find proc in ProcArray

Sorry I didn't respond for a while. I took a look at your
configuration and didn't find anything special.

Yes, the connection between coordinator and datanode can stay open. It
is owned by the pooler process, which manages coordinator and datanode
connections. Even when a coordinator backend no longer needs a
connection to a datanode, the pooler can keep it open for subsequent
use, to save connection overhead.
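
To illustrate the idea only, here is a minimal Python sketch of a pool
that keeps released connections open for reuse. It is not the
Postgres-XC pooler code: the names Pooler, FakeConnection, acquire and
release are hypothetical, and no real sockets are involved.

```python
from collections import defaultdict


class FakeConnection:
    """Stand-in for a coordinator-to-datanode connection (hypothetical)."""

    def __init__(self, node):
        self.node = node
        self.is_open = True


class Pooler:
    """Toy pooler: hands out connections and keeps released ones open."""

    def __init__(self):
        self.idle = defaultdict(list)  # node name -> idle, still-open connections
        self.opened = 0                # number of connections ever created

    def acquire(self, node):
        # Reuse an idle connection when one exists; otherwise open a new one.
        if self.idle[node]:
            return self.idle[node].pop()
        self.opened += 1
        return FakeConnection(node)

    def release(self, conn):
        # The backend is done, but the connection is not closed: it goes
        # back to the idle list for the next backend that needs this node.
        self.idle[conn.node].append(conn)


pooler = Pooler()
first = pooler.acquire("data")
pooler.release(first)             # e.g. the psql session ends here
second = pooler.acquire("data")   # a later session gets the same connection
print(second is first, second.is_open, pooler.opened)  # prints: True True 1
```

In this sketch the connection outlives the session that first used it,
which matches what you observed: the link to the datanode stays open
after psql disconnects.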

Regards;
---
Koichi Suzuki

2014-02-16 17:19 GMT+09:00 Rishi Ramraj <the...@gm...>:
> FYI: I've been running some more tests on the REL1_2_STABLE branch to see if
> I can find more information on what's wrong with my system. To keep my setup
> simple, I have reverted to using initdb and initgtm to create the cluster
> and am running only the gtm, coordinator and datanode. They are all hosted
> on the same machine and communicate using TCP sockets on localhost.
>
> TL;DR there is a stack trace at the end of the email.
>
> I've added some more debug logging at
> src/backend/storage/ipc/procarray.c:436:
>
>   elog(LOG, "failed to find proc %p in ProcArray", proc);
> +   elog(LOG, "pid %d", proc->pid);
> +   elog(LOG, "pgprocno %d", proc->pgprocno);
>
> The logs only seem to start happening after I connect using psql. Here's
> what they look like on the coordinator without any psql commands issued:
>
> SID 5300263f.163d LOG:  00000: database system was shut down at 2014-02-15 19:13:59 EST
> SID 5300263f.163d LOCATION:  StartupXLOG, xlog.c:4959
> SID 5300263f.163b LOG:  00000: database system is ready to accept connections
> SID 5300263f.163b LOCATION:  reaper, postmaster.c:2768
> SID 5300263f.1642 LOG:  00000: autovacuum launcher started
> SID 5300263f.1642 LOCATION:  AutoVacLauncherMain, autovacuum.c:417
> SID 5300276c.16c3 LOG:  00000: failed to find proc 0x7f777d08ab80 in ProcArray
> SID 5300276c.16c3 LOCATION:  ProcArrayRemove, procarray.c:436
> SID 5300276c.16c3 LOG:  00000: pid 5827
> SID 5300276c.16c3 LOCATION:  ProcArrayRemove, procarray.c:437
> SID 5300276c.16c3 LOG:  00000: pgprocno 102
> SID 5300276c.16c3 LOCATION:  ProcArrayRemove, procarray.c:438
> SID 530027a8.16f9 LOG:  00000: failed to find proc 0x7f777d08ab80 in ProcArray
> SID 530027a8.16f9 LOCATION:  ProcArrayRemove, procarray.c:436
> SID 530027a8.16f9 LOG:  00000: pid 5881
> SID 530027a8.16f9 LOCATION:  ProcArrayRemove, procarray.c:437
> SID 530027a8.16f9 LOG:  00000: pgprocno 102
> SID 530027a8.16f9 LOCATION:  ProcArrayRemove, procarray.c:438
>
> Here is what the logs in the coordinator look like after I issued drop
> database test;
>
> SID 53002923.1764 LOG:  00000: statement: drop database test;
> SID 53002923.1764 LOCATION:  exec_simple_query, postgres.c:966
> SID 53002923.1764 LOG:  00000: failed to find proc 0x7f777d08d980 in ProcArray
> SID 53002923.1764 LOCATION:  ProcArrayRemove, procarray.c:436
> SID 53002923.1764 STATEMENT:  drop database test;
> SID 53002923.1764 LOG:  00000: pid 0
> SID 53002923.1764 LOCATION:  ProcArrayRemove, procarray.c:437
> SID 53002923.1764 STATEMENT:  drop database test;
> SID 53002923.1764 LOG:  00000: pgprocno 118
> SID 53002923.1764 LOCATION:  ProcArrayRemove, procarray.c:438
> SID 53002923.1764 STATEMENT:  drop database test;
> SID 5300294c.176a LOG:  00000: failed to find proc 0x7f777d08ab80 in ProcArray
> SID 5300294c.176a LOCATION:  ProcArrayRemove, procarray.c:436
> SID 5300294c.176a LOG:  00000: pid 5994
> SID 5300294c.176a LOCATION:  ProcArrayRemove, procarray.c:437
> SID 5300294c.176a LOG:  00000: pgprocno 102
> SID 5300294c.176a LOCATION:  ProcArrayRemove, procarray.c:438
> SID 53002988.1784 LOG:  00000: failed to find proc 0x7f777d08ab80 in ProcArray
> SID 53002988.1784 LOCATION:  ProcArrayRemove, procarray.c:436
> SID 53002988.1784 LOG:  00000: pid 6020
> SID 53002988.1784 LOCATION:  ProcArrayRemove, procarray.c:437
> SID 53002988.1784 LOG:  00000: pgprocno 102
> SID 53002988.1784 LOCATION:  ProcArrayRemove, procarray.c:438
> SID 530029c4.1791 LOG:  00000: failed to find proc 0x7f777d08ab80 in ProcArray
> SID 530029c4.1791 LOCATION:  ProcArrayRemove, procarray.c:436
> SID 530029c4.1791 LOG:  00000: pid 6033
> SID 530029c4.1791 LOCATION:  ProcArrayRemove, procarray.c:437
> SID 530029c4.1791 LOG:  00000: pgprocno 102
> SID 530029c4.1791 LOCATION:  ProcArrayRemove, procarray.c:438
> SID 53002a00.17c6 LOG:  00000: failed to find proc 0x7f777d08ab80 in ProcArray
> SID 53002a00.17c6 LOCATION:  ProcArrayRemove, procarray.c:436
> SID 53002a00.17c6 LOG:  00000: pid 6086
> SID 53002a00.17c6 LOCATION:  ProcArrayRemove, procarray.c:437
> SID 53002a00.17c6 LOG:  00000: pgprocno 102
> SID 53002a00.17c6 LOCATION:  ProcArrayRemove, procarray.c:438
>
> At this point, these logs start appearing in the datanode's log:
>
> SID 53002927.1766 LOG:  00000: failed to find proc 0x7f46c031e980 in ProcArray
> SID 53002927.1766 LOCATION:  ProcArrayRemove, procarray.c:436
> SID 53002927.1766 STATEMENT:  COMMIT PREPARED 'T10226'
> SID 53002927.1766 LOG:  00000: pid 0
> SID 53002927.1766 LOCATION:  ProcArrayRemove, procarray.c:437
> SID 53002927.1766 STATEMENT:  COMMIT PREPARED 'T10226'
> SID 53002927.1766 LOG:  00000: pgprocno 118
> SID 53002927.1766 LOCATION:  ProcArrayRemove, procarray.c:438
> SID 53002927.1766 STATEMENT:  COMMIT PREPARED 'T10226'
> SID 5300293d.1767 LOG:  00000: failed to find proc 0x7f46c031bb80 in ProcArray
> SID 5300293d.1767 LOCATION:  ProcArrayRemove, procarray.c:436
> SID 5300293d.1767 LOG:  00000: pid 5991
> SID 5300293d.1767 LOCATION:  ProcArrayRemove, procarray.c:437
> SID 5300293d.1767 LOG:  00000: pgprocno 102
> SID 5300293d.1767 LOCATION:  ProcArrayRemove, procarray.c:438
> SID 53002979.1781 LOG:  00000: failed to find proc 0x7f46c031bb80 in ProcArray
> SID 53002979.1781 LOCATION:  ProcArrayRemove, procarray.c:436
> SID 53002979.1781 LOG:  00000: pid 6017
> SID 53002979.1781 LOCATION:  ProcArrayRemove, procarray.c:437
> SID 53002979.1781 LOG:  00000: pgprocno 102
> SID 53002979.1781 LOCATION:  ProcArrayRemove, procarray.c:438
> SID 530029b5.178e LOG:  00000: failed to find proc 0x7f46c031bb80 in ProcArray
> ...
>
> When I disconnect psql, the logs stop in the coordinator but continue in the
> datanode (there's an open connection between the coordinator and the
> datanode). After I set autovacuum = off, the logs with pgprocno 102
> disappeared in both the datanode and coordinator.
>
> I originally set out to get stack traces for both proc 118 and 102.
> Unfortunately, it seems that the autovacuum launcher spawns multiple
> processes making it difficult to intercept the process with gdb. I was able
> to get a stack trace for pgprocno 118 from the open connection between the
> datanode and coordinator:
>
> #0  ProcArrayRemove (proc=proc@entry=0x7ffcb25f9980,
>     latestXid=latestXid@entry=10437) at procarray.c:436
> #1  0x00000000004b69ed in FinishPreparedTransaction (gid=<optimized out>,
>     isCommit=<optimized out>) at twophase.c:1368
> #2  0x0000000000674551 in standard_ProcessUtility (parsetree=0x29d4110,
>     queryString=0x29d3730 "COMMIT PREPARED 'T10437'",
>     context=PROCESS_UTILITY_TOPLEVEL, params=0x0,
>     dest=<optimized out>, sentToRemote=<optimized out>,
>     completionTag=0x7fff2c308910 "") at utility.c:574
> #3  0x0000000000670d02 in PortalRunUtility (portal=portal@entry=0x29d99c0,
>     utilityStmt=utilityStmt@entry=0x29d4110,
>     isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x29d4450,
>     completionTag=completionTag@entry=0x7fff2c308910 "") at pquery.c:1285
> #4  0x00000000006718cd in PortalRunMulti (portal=portal@entry=0x29d99c0,
>     isTopLevel=isTopLevel@entry=1 '\001',
>     dest=dest@entry=0x29d4450, altdest=altdest@entry=0x29d4450,
>     completionTag=completionTag@entry=0x7fff2c308910 "")
>     at pquery.c:1432
> #5  0x00000000006723e9 in PortalRun (portal=portal@entry=0x29d99c0,
>     count=count@entry=9223372036854775807,
>     isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x29d4450,
>     altdest=altdest@entry=0x29d4450,
>     completionTag=completionTag@entry=0x7fff2c308910 "") at pquery.c:882
> #6  0x00000000006702d2 in exec_simple_query (
>     query_string=0x29d3730 "COMMIT PREPARED 'T10437'") at postgres.c:1140
> #7  PostgresMain (argc=<optimized out>, argv=argv@entry=0x29bb360,
>     dbname=0x29bb288 "postgres",
>     username=<optimized out>) at postgres.c:4251
> #8  0x00000000004621c6 in BackendRun (port=0x29dd9a0) at postmaster.c:4205
> #9  BackendStartup (port=0x29dd9a0) at postmaster.c:3894
> #10 ServerLoop () at postmaster.c:1705
> #11 0x000000000062f768 in PostmasterMain (argc=argc@entry=4,
>     argv=argv@entry=0x29b9340) at postmaster.c:1374
> #12 0x0000000000462b47 in main (argc=4, argv=0x29b9340) at main.c:196
>
> I'm not familiar with the codebase, so I'm not entirely sure what's
> happening. As far as I can tell, FinishPreparedTransaction is trying to end
> a two phase commit, which it cannot find. I did a select * from
> pg_prepared_xacts on both the coordinator and the datanode, but both tables
> were empty. The commits seem to complete successfully despite the log. I'll
> keep digging to see what I can find.
>
>
> On Sat, Feb 15, 2014 at 2:44 AM, Rishi Ramraj <the...@gm...>
> wrote:
>>
>> I just recompiled XC using the REL1_2_STABLE branch. I used pgxc_ctl to
>> configure and create the cluster. The physical configuration is still the
>> same, except now there's a GTM proxy on the machine as well.
>>
>> The database seems to be functioning correctly, but I'm still getting the
>> ProcArray error. Here's the log from the coordinator:
>>
>> 2014-02-15 02:26:28 EST SID 52ff16a4.3cb0 XID 0LOG:  00000: database system was shut down at 2014-02-15 02:23:33 EST
>> 2014-02-15 02:26:28 EST SID 52ff16a4.3cb0 XID 0LOCATION:  StartupXLOG, xlog.c:4959
>> 2014-02-15 02:26:28 EST SID 52ff16a4.3ca5 XID 0LOG:  00000: database system is ready to accept connections
>> 2014-02-15 02:26:28 EST SID 52ff16a4.3ca5 XID 0LOCATION:  reaper, postmaster.c:2768
>> 2014-02-15 02:26:28 EST SID 52ff16a4.3cb5 XID 0LOG:  00000: autovacuum launcher started
>> 2014-02-15 02:26:28 EST SID 52ff16a4.3cb5 XID 0LOCATION:  AutoVacLauncherMain, autovacuum.c:417
>> 2014-02-15 02:28:53 EST SID 52ff1727.3cd2 XID 0ERROR:  42601: syntax error at or near "asdf" at character 1
>> 2014-02-15 02:28:53 EST SID 52ff1727.3cd2 XID 0LOCATION:  scanner_yyerror, scan.l:1044
>> 2014-02-15 02:28:53 EST SID 52ff1727.3cd2 XID 0STATEMENT:  asdf
>> ;
>> 2014-02-15 02:29:30 EST SID 52ff1759.3cd3 XID 0LOG:  00000: failed to find proc 0x7f3c12ff5b80 in ProcArray
>> 2014-02-15 02:29:30 EST SID 52ff1759.3cd3 XID 0LOCATION:  ProcArrayRemove, procarray.c:436
>> 2014-02-15 02:30:30 EST SID 52ff1795.3cd5 XID 0LOG:  00000: failed to find proc 0x7f3c12ff5b80 in ProcArray
>> 2014-02-15 02:30:30 EST SID 52ff1795.3cd5 XID 0LOCATION:  ProcArrayRemove, procarray.c:436
>> 2014-02-15 02:31:01 EST SID 52ff1727.3cd2 XID 0LOG:  00000: statement: create node data with (type='datanode', host='localhost', port=20008);
>> 2014-02-15 02:31:01 EST SID 52ff1727.3cd2 XID 0LOCATION:  exec_simple_query, postgres.c:966
>> 2014-02-15 02:31:08 EST SID 52ff1727.3cd2 XID 0ERROR:  42601: syntax error at or near "exit" at character 1
>> 2014-02-15 02:31:08 EST SID 52ff1727.3cd2 XID 0LOCATION:  scanner_yyerror, scan.l:1044
>> 2014-02-15 02:31:08 EST SID 52ff1727.3cd2 XID 0STATEMENT:  exit
>> ;
>> 2014-02-15 02:32:29 EST SID 52ff1808.3d1e XID 0LOG:  00000: statement: create database test;
>> 2014-02-15 02:32:29 EST SID 52ff1808.3d1e XID 0LOCATION:  exec_simple_query, postgres.c:966
>> 2014-02-15 02:32:29 EST SID 52ff180d.3d20 XID 0LOG:  00000: failed to find proc 0x7f3c12ff5b80 in ProcArray
>> 2014-02-15 02:32:29 EST SID 52ff180d.3d20 XID 0LOCATION:  ProcArrayRemove, procarray.c:436
>> 2014-02-15 02:32:31 EST SID 52ff1808.3d1e XID 2054LOG:  00000: failed to find proc 0x7f3c12ff8980 in ProcArray
>> 2014-02-15 02:32:31 EST SID 52ff1808.3d1e XID 2054LOCATION:  ProcArrayRemove, procarray.c:436
>> 2014-02-15 02:32:31 EST SID 52ff1808.3d1e XID 2054STATEMENT:  create database test;
>> 2014-02-15 02:33:30 EST SID 52ff1849.3e12 XID 0LOG:  00000: failed to find proc 0x7f3c12ff5b80 in ProcArray
>> 2014-02-15 02:33:30 EST SID 52ff1849.3e12 XID 0LOCATION:  ProcArrayRemove, procarray.c:436
>> 2014-02-15 02:34:30 EST SID 52ff1885.3e2e XID 0LOG:  00000: failed to find proc 0x7f3c12ff5b80 in ProcArray
>> 2014-02-15 02:34:30 EST SID 52ff1885.3e2e XID 0LOCATION:  ProcArrayRemove, procarray.c:436
>> 2014-02-15 02:35:30 EST SID 52ff18c1.3e30 XID 0LOG:  00000: failed to find proc 0x7f3c12ff5b80 in ProcArray
>> 2014-02-15 02:35:30 EST SID 52ff18c1.3e30 XID 0LOCATION:  ProcArrayRemove, procarray.c:436
>>
>> Here's the log from the datanode:
>>
>> LOG:  database system was shut down at 2014-02-15 02:23:35 EST
>> LOG:  database system is ready to accept connections
>> LOG:  autovacuum launcher started
>> ERROR:  PGXC Node data: object already defined
>> STATEMENT:  create node data with (type='coordinator', host='localhost', port=20004);
>> LOG:  failed to find proc 0x7f5a2cc55b80 in ProcArray
>> LOG:  failed to find proc 0x7f5a2cc58980 in ProcArray
>> STATEMENT:  COMMIT PREPARED 'T2051'
>> LOG:  failed to find proc 0x7f5a2cc55b80 in ProcArray
>> LOG:  failed to find proc 0x7f5a2cc55b80 in ProcArray
>> LOG:  failed to find proc 0x7f5a2cc55b80 in ProcArray
>> LOG:  failed to find proc 0x7f5a2cc55b80 in ProcArray
>> LOG:  failed to find proc 0x7f5a2cc55b80 in ProcArray
>>
>> I've also attached the pgxcConf file I used to generate the cluster. I'm
>> going to try to run it under gdb and get a stack trace. Let me know if there
>> are any other diagnostics you would like me to run. That being said, the
>> database in its current condition is fine for my testing :)
>>
>>
>> On Thu, Feb 13, 2014 at 11:14 PM, Rishi Ramraj
>> <the...@gm...> wrote:
>>>
>>> Will give it a shot. Thanks for the help!
>>>
>>>
>>> On Thu, Feb 13, 2014 at 11:05 PM, Koichi Suzuki <koi...@gm...>
>>> wrote:
>>>>
>>>> I'm afraid something is wrong internally, but sorry, I couldn't locate
>>>> what it is. This series of error messages should come from COMMIT,
>>>> ABORT, COMMIT PREPARED or ABORT PREPARED.
>>>>
>>>> I'd advise recreating the cluster with pgxc_ctl. I hope that makes it
>>>> easier to track what is going on. You will find materials for this
>>>> on the PGXC wiki page. Try www.postgres-xc.org.
>>>>
>>>> Regards;
>>>> ---
>>>> Koichi Suzuki
>>>>
>>>>
>>>> 2014-02-14 12:57 GMT+09:00 Rishi Ramraj <the...@gm...>:
>>>> > Apparently I left the datanode running, and after a while the logs were
>>>> > filled with the following:
>>>> >
>>>> > FATAL:  sorry, too many clients already
>>>> >
>>>> > Here are the logs with the increased log levels. First on the coordinator:
>>>> >
>>>> > 2014-02-13 22:24:31 EST SID 52fd8c6f.386 XID 0LOG:  database system was shut down at 2014-02-13 22:07:11 EST
>>>> > 2014-02-13 22:24:31 EST SID 52fd8c6e.382 XID 0LOG:  database system is ready to accept connections
>>>> > 2014-02-13 22:24:31 EST SID 52fd8c6f.38b XID 0LOG:  autovacuum launcher started
>>>> > 2014-02-13 22:45:18 EST SID 52fd914a.59f XID 0LOG:  statement: drop database test;
>>>> > 2014-02-13 22:45:18 EST SID 52fd914a.59f XID 11571ERROR:  database "test" does not exist
>>>> > 2014-02-13 22:45:18 EST SID 52fd914a.59f XID 11571STATEMENT:  drop database test;
>>>> > 2014-02-13 22:45:32 EST SID 52fd915c.5a2 XID 0LOG:  failed to find proc 0x7f0297db9c60 in ProcArray
>>>> > 2014-02-13 22:45:33 EST SID 52fd914a.59f XID 0LOG:  statement: create database test;
>>>> > 2014-02-13 22:45:35 EST SID 52fd914a.59f XID 11575LOG:  failed to find proc 0x7f0297dbca60 in ProcArray
>>>> > 2014-02-13 22:45:35 EST SID 52fd914a.59f XID 11575STATEMENT:  create database test;
>>>> > 2014-02-13 22:46:32 EST SID 52fd9198.5b1 XID 0LOG:  failed to find proc 0x7f0297db9c60 in ProcArray
>>>> > 2014-02-13 22:47:02 EST SID 52fd91b6.5b6 XID 0LOG:  failed to find proc 0x7f0297db9c60 in ProcArray
>>>> > 2014-02-13 22:47:32 EST SID 52fd91d4.5bb XID 0LOG:  failed to find proc 0x7f0297db9c60 in ProcArray
>>>> > 2014-02-13 22:48:02 EST SID 52fd91f2.5be XID 0LOG:  failed to find proc 0x7f0297db9c60 in ProcArray
>>>> > 2014-02-13 22:48:32 EST SID 52fd9210.5c7 XID 0LOG:  failed to find proc 0x7f0297db9c60 in ProcArray
>>>> > 2014-02-13 22:49:02 EST SID 52fd922e.5d1 XID 0LOG:  failed to find proc 0x7f0297db9c60 in ProcArray
>>>> > 2014-02-13 22:49:32 EST SID 52fd924c.5d6 XID 0LOG:  failed to find proc 0x7f0297db9c60 in ProcArray
>>>> > 2014-02-13 22:50:02 EST SID 52fd926a.5da XID 0LOG:  failed to find proc 0x7f0297db9c60 in ProcArray
>>>> > 2014-02-13 22:50:32 EST SID 52fd9288.5e0 XID 0LOG:  failed to find proc 0x7f0297db9c60 in ProcArray
>>>> > 2014-02-13 22:51:02 EST SID 52fd92a6.5e3 XID 0LOG:  failed to find proc 0x7f0297db9c60 in ProcArray
>>>> > 2014-02-13 22:51:32 EST SID 52fd92c4.5ee XID 0LOG:  failed to find proc 0x7f0297db9c60 in ProcArray
>>>> > 2014-02-13 22:52:02 EST SID 52fd92e2.5f1 XID 0LOG:  failed to find proc 0x7f0297db9c60 in ProcArray
>>>> >
>>>> > Next, on the datanode:
>>>> >
>>>> > 2014-02-13 22:24:21 EST SID 52fd8c65.33a XID 0LOG:  database system was shut down at 2014-02-13 22:23:35 EST
>>>> > 2014-02-13 22:24:21 EST SID 52fd8c64.336 XID 0LOG:  database system is ready to accept connections
>>>> > 2014-02-13 22:24:21 EST SID 52fd8c65.33e XID 0LOG:  autovacuum launcher started
>>>> > 2014-02-13 22:45:34 EST SID 52fd915e.5a4 XID 0LOG:  statement: START TRANSACTION ISOLATION LEVEL read committed READ WRITE
>>>> > 2014-02-13 22:45:34 EST SID 52fd915e.5a4 XID 0LOG:  statement: create database test;
>>>> > 2014-02-13 22:45:35 EST SID 52fd915e.5a4 XID 11574LOG:  statement: PREPARE TRANSACTION 'T11574'
>>>> > 2014-02-13 22:45:35 EST SID 52fd915e.5a4 XID 0LOG:  statement: COMMIT PREPARED 'T11574'
>>>> > 2014-02-13 22:45:35 EST SID 52fd915e.5a4 XID 11575LOG:  failed to find proc 0x7f1c3de46a60 in ProcArray
>>>> > 2014-02-13 22:45:35 EST SID 52fd915e.5a4 XID 11575STATEMENT:  COMMIT PREPARED 'T11574'
>>>> > 2014-02-13 22:46:22 EST SID 52fd918e.5ae XID 0LOG:  failed to find proc 0x7f1c3de43c60 in ProcArray
>>>> > 2014-02-13 22:47:22 EST SID 52fd91ca.5b8 XID 0LOG:  failed to find proc 0x7f1c3de43c60 in ProcArray
>>>> > 2014-02-13 22:48:22 EST SID 52fd9206.5c2 XID 0LOG:  failed to find proc 0x7f1c3de43c60 in ProcArray
>>>> > 2014-02-13 22:49:22 EST SID 52fd9242.5d3 XID 0LOG:  failed to find proc 0x7f1c3de43c60 in ProcArray
>>>> > 2014-02-13 22:50:22 EST SID 52fd927e.5dc XID 0LOG:  failed to find proc 0x7f1c3de43c60 in ProcArray
>>>> > 2014-02-13 22:51:22 EST SID 52fd92ba.5e5 XID 0LOG:  failed to find proc 0x7f1c3de43c60 in ProcArray
>>>> > 2014-02-13 22:52:22 EST SID 52fd92f6.5f5 XID 0LOG:  failed to find proc 0x7f1c3de43c60 in ProcArray
>>>> > 2014-02-13 22:53:22 EST SID 52fd9332.5fd XID 0LOG:  failed to find proc 0x7f1c3de43c60 in ProcArray
>>>> > 2014-02-13 22:54:22 EST SID 52fd936e.60d XID 0LOG:  failed to find proc 0x7f1c3de43c60 in ProcArray
>>>> >
>>>> > Would you like me to keep experimenting on this installation, or should I
>>>> > recreate the cluster?
>>>> >
>>>> >
>>>> > On Thu, Feb 13, 2014 at 9:49 PM, Koichi Suzuki <koi...@gm...>
>>>> > wrote:
>>>> >>
>>>> >> I don't see anything strange in the configuration file.
>>>> >>
>>>> >> I found that the regression test sets the gtm port explicitly
>>>> >> (although the value is the default, 6666). Could you try configuring
>>>> >> your cluster with pgxc_ctl? It is far simpler and leaves much less
>>>> >> chance of errors. This utility comes with the built-in command
>>>> >> sequences needed to configure and operate your cluster.
>>>> >>
>>>> >> Or, could you re-run the same test with a different log_min_messages
>>>> >> level?
>>>> >>
>>>> >> Regards;
>>>> >> ---
>>>> >> Koichi Suzuki
>>>> >>
>>>> >>
>>>> >> 2014-02-14 11:16 GMT+09:00 Rishi Ramraj
>>>> >> <the...@gm...>:
>>>> >> > Missed the coordinator.conf file. Find attached.
>>>> >> >
>>>> >> >
>>>> >> > On Thu, Feb 13, 2014 at 9:15 PM, Rishi Ramraj
>>>> >> > <the...@gm...>
>>>> >> > wrote:
>>>> >> >>
>>>> >> >> Find all of the files attached. I have the GTM, coordinator and
>>>> >> >> datanode services all hosted on the same Linux Mint machine for
>>>> >> >> testing. They communicate using the loopback interface.
>>>> >> >>
>>>> >> >> To initialize the cluster, I ran the following:
>>>> >> >>
>>>> >> >> /var/lib/postgres-xc/gtm $ initgtm -Z
>>>> >> >> /var/lib/postgres-xc/data $ initdb -D . --nodename data
>>>> >> >> /var/lib/postgres-xc/coord $ initdb -D . --nodename coord
>>>> >> >>
>>>> >> >> To start the cluster, I use upstart:
>>>> >> >>
>>>> >> >> /usr/local/pgsql/bin/gtm -D /var/lib/postgres-xc/gtm
>>>> >> >> /usr/local/pgsql/bin/postgres --datanode -D /var/lib/postgres-xc/data
>>>> >> >> /usr/local/pgsql/bin/postgres --coordinator -D /var/lib/postgres-xc/coord
>>>> >> >>
>>>> >> >> I will increase the log level and try to reproduce the errors.
>>>> >> >>
>>>> >> >>
>>>> >> >> On Thu, Feb 13, 2014 at 8:39 PM, 鈴木 幸市 <ko...@in...>
>>>> >> >> wrote:
>>>> >> >>>
>>>> >> >>> Hello;
>>>> >> >>>
>>>> >> >>> You need to set the log_statement GUC to an appropriate value. The
>>>> >> >>> default is “none”; “all” logs every statement received. It is also
>>>> >> >>> a good idea to include the session ID in the log_line_prefix GUC.
>>>> >> >>> Your postgresql.conf has some information on how to set this. I
>>>> >> >>> tested REL1_1_STABLE with four coordinators, four datanodes and
>>>> >> >>> four gtm_proxy instances and did not get any additional log
>>>> >> >>> messages on any coordinator or datanode. It would help to know
>>>> >> >>> what statements you issued.
>>>> >> >>>
>>>> >> >>> Here’s what I got in REL1_1_STABLE (I configured the cluster using
>>>> >> >>> pgxc_ctl).
>>>> >> >>>
>>>> >> >>> Coordinator:
>>>> >> >>> LOG:  database system was shut down at 2014-02-14 10:2sd
>>>> >> >>> LOG:  autovacuum launcher started
>>>> >> >>>
>>>> >> >>> Datanode:
>>>> >> >>> LOG:  database system was shut down at 2014-02-14 10:21:08 JST
>>>> >> >>> LOG:  database system is ready to accept connections
>>>> >> >>> LOG:  autovacuum launcher started
>>>> >> >>>
>>>> >> >>> I’d like to see your configuration, each postgresql.conf file, and
>>>> >> >>> the steps you took to initialize, start and run your application.
>>>> >> >>>
>>>> >> >>> Best;
>>>> >> >>> ---
>>>> >> >>> Koichi Suzuki
>>>> >> >>>
>>>> >> >>> 2014/02/14 0:38, Rishi Ramraj <the...@gm...> wrote:
>>>> >> >>>
>>>> >> >>> I just tried with REL1_1_STABLE. Here are my logs from the
>>>> >> >>> coordinator:
>>>> >> >>>
>>>> >> >>> LOG:  database system was shut down at 2014-02-13 10:14:16 EST
>>>> >> >>> LOG:  database system is ready to accept connections
>>>> >> >>> LOG:  autovacuum launcher started
>>>> >> >>> ERROR:  syntax error at or near "asdf" at character 1
>>>> >> >>> STATEMENT:  asdf;
>>>> >> >>> LOG:  failed to find proc 0x7f7d4a79ea60 in ProcArray
>>>> >> >>> STATEMENT:  create database test;
>>>> >> >>> LOG:  failed to find proc 0x7f7d4a79bc60 in ProcArray
>>>> >> >>> ERROR:  cannot drop the currently open database
>>>> >> >>> STATEMENT:  drop database test;
>>>> >> >>> LOG:  failed to find proc 0x7f7d4a79ea60 in ProcArray
>>>> >> >>> STATEMENT:  drop database test;
>>>> >> >>> ERROR:  syntax error at or near "blah" at character 1
>>>> >> >>> STATEMENT:  blah blah blah;
>>>> >> >>> LOG:  failed to find proc 0x7f7d4a79bc60 in ProcArray
>>>> >> >>> LOG:  failed to find proc 0x7f7d4a79bc60 in ProcArray
>>>> >> >>> LOG:  failed to find proc 0x7f7d4a79bc60 in ProcArray
>>>> >> >>>
>>>> >> >>> The logs from the datanode:
>>>> >> >>>
>>>> >> >>> LOG:  database system was shut down at 2014-02-13 10:19:33 EST
>>>> >> >>> LOG:  database system is ready to accept connections
>>>> >> >>> LOG:  autovacuum launcher started
>>>> >> >>> LOG:  failed to find proc 0x7f06bee5ea60 in ProcArray
>>>> >> >>> STATEMENT:  COMMIT PREPARED 'T10014'
>>>> >> >>> LOG:  failed to find proc 0x7f06bee5bc60 in ProcArray
>>>> >> >>> LOG:  failed to find proc 0x7f06bee5ea60 in ProcArray
>>>> >> >>> STATEMENT:  COMMIT PREPARED 'T10023'
>>>> >> >>> LOG:  failed to find proc 0x7f06bee5bc60 in ProcArray
>>>> >> >>> LOG:  failed to find proc 0x7f06bee5bc60 in ProcArray
>>>> >> >>> LOG:  failed to find proc 0x7f06bee5bc60 in ProcArray
>>>> >> >>> LOG:  failed to find proc 0x7f06bee5bc60 in ProcArray
>>>> >> >>> LOG:  failed to find proc 0x7f06bee5bc60 in ProcArray
>>>> >> >>> LOG:  failed to find proc 0x7f06bee5bc60 in ProcArray
>>>> >> >>>
>>>> >> >>> The datanode seems to continuously produce logs but the coordinator
>>>> >> >>> does not. I issued CREATE NODE commands to both services but they
>>>> >> >>> don't seem to show up in the log.
>>>> >> >>>
>>>> >> >>>
>>>> >> >>> On Thu, Feb 13, 2014 at 9:49 AM, Rishi Ramraj
>>>> >> >>> <the...@gm...> wrote:
>>>> >> >>>>
>>>> >> >>>> I haven't been issuing commits or aborts every 30 seconds. So far,
>>>> >> >>>> I've only issued five commands to the cluster using psql, but I
>>>> >> >>>> have over 100 logs. I will try 1.1 today and let you know.
>>>> >> >>>>
>>>> >> >>>> Do you use a specific branching methodology, like gitflow? Do you
>>>> >> >>>> have a bug tracking system? If you need any help with release
>>>> >> >>>> engineering, let me know; I don't mind volunteering some time.
>>>> >> >>>>
>>>> >> >>>>
>>>> >> >>>> On Thu, Feb 13, 2014 at 1:41 AM, 鈴木 幸市
>>>> >> >>>> <ko...@in...>
>>>> >> >>>> wrote:
>>>> >> >>>>>
>>>> >> >>>>> The GTM message is just a report. When GTM starts, it checks
>>>> >> >>>>> whether any slave was connected in the previous run, and it
>>>> >> >>>>> didn’t find one.
>>>> >> >>>>>
>>>> >> >>>>> The first message does not look harmful, but it could indicate a
>>>> >> >>>>> potential issue. Please let me look into it. This message can
>>>> >> >>>>> appear at the initialization of each datanode/coordinator, or at
>>>> >> >>>>> COMMIT, ABORT, COMMIT PREPARED or ABORT PREPARED.
>>>> >> >>>>>
>>>> >> >>>>> Could you be issuing any of these every 30 seconds?
>>>> >> >>>>>
>>>> >> >>>>> If possible, could you try release 1.1 and see if you have the
>>>> >> >>>>> same issue (first message)? I think release 1.1 is the better
>>>> >> >>>>> choice, because master is the development branch anyway.
>>>> >> >>>>>
>>>> >> >>>>> Best;
>>>> >> >>>>> ---
>>>> >> >>>>> Koichi Suzuki
>>>> >> >>>>>
>>>> >> >>>>> 2014/02/13 14:51, Rishi Ramraj <the...@gm...> wrote:
>>>> >> >>>>>
>>>> >> >>>>> > Hello All,
>>>> >> >>>>> >
>>>> >> >>>>> > I just installed postgres-xc from the git master branch on a test
>>>> >> >>>>> > machine. All processes are running on the same box. On both the
>>>> >> >>>>> > coordinator and data processes, I'm getting the following logs
>>>> >> >>>>> > about every 30 seconds:
>>>> >> >>>>> >
>>>> >> >>>>> > LOG:  failed to find proc 0x7fd9ee703f80 in ProcArray
>>>> >> >>>>> >
>>>> >> >>>>> > On the GTM process, I'm getting the following logs at about the
>>>> >> >>>>> > same frequency:
>>>> >> >>>>> >
>>>> >> >>>>> > LOG:  Any GTM standby node not found in registered node(s).
>>>> >> >>>>> > LOCATION:  gtm_standby_connect_to_standby_int, gtm_standby.c:381
>>>> >> >>>>> >
>>>> >> >>>>> > The cluster seems to be working properly. I was able to create a
>>>> >> >>>>> > new database and a table within that database without any
>>>> >> >>>>> > problem. I restarted all services and the data was persisted, but
>>>> >> >>>>> > the logs persist. Any idea what's causing these logs?
>>>> >> >>>>> >
>>>> >> >>>>> > Thanks,
>>>> >> >>>>> > - Rishi
>>>> >> >>>>> >
>>>> >> >>>>> >
>>>> >> >>>>> >
>>>> >> >>>>> > _______________________________________________
>>>> >> >>>>> > Postgres-xc-general mailing list
>>>> >> >>>>> > Pos...@li...
>>>> >> >>>>> >
>>>> >> >>>>> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general
>>>> >> >>>>>
>>>> >> >>>>
>>>> >> >>>>
>>>> >> >>>>
>>>> >> >>>> --
>>>> >> >>>> Cheers,
>>>> >> >>>> - Rishi