|
From: Aaron J. <aja...@re...> - 2014-06-02 21:11:16
|
I tried to create a table as follows ...
CREATE TABLE Schema.TableFoo(
SomeId serial NOT NULL,
ForeignId int NOT NULL,
...
) WITH (OIDS = FALSE);
The server returned the following...
ERROR: GTM error, could not create sequence
I looked at the server logs for the gtm_proxy and found nothing, so I went to the gtm.
LOCATION: pq_copymsgbytes, pqformat.c:554
1:140488486782720:2014-06-02 21:10:58.870 UTC -WARNING: No transaction handle for gxid: 0
LOCATION: GTM_GXIDToHandle, gtm_txn.c:163
1:140488486782720:2014-06-02 21:10:58.870 UTC -WARNING: Invalid transaction handle: -1
LOCATION: GTM_HandleToTransactionInfo, gtm_txn.c:213
1:140488486782720:2014-06-02 21:10:58.870 UTC -ERROR: Failed to get a snapshot
LOCATION: ProcessGetSnapshotCommandMulti, gtm_snap.c:420
1:140488478390016:2014-06-02 21:10:58.871 UTC -ERROR: insufficient data left in message
LOCATION: pq_copymsgbytes, pqformat.c:554
1:140488486782720:2014-06-02 21:10:58.871 UTC -ERROR: insufficient data left in message
LOCATION: pq_copymsgbytes, pqformat.c:554
I'm definitely confused here. This cluster had been running fine for several days, and now the GTM is failing. I restarted the gtm and the proxies (using gtm_ctl to stop and restart each instance). Nothing changed: the GTM continues to fail and will not create the sequence.
Any ideas?
Aaron
|
|
From: Aaron J. <aja...@re...> - 2014-06-03 07:20:38
|
I've worked my way backwards through the problem and found the underlying cause. When a data coordinator is paired with a GTM proxy, it sends its message to the proxy, which adds some data to the payload and forwards it to the GTM. Here is what I saw when looking at the wire.
The message captured between the data coordinator and the GTM proxy was as follows:
430000003d000000270000001064656d6f2e7075626c69632e666f6f0001000000000000000100000000000000ffffffffffffff7f010000000000000000
The message captured between the GTM proxy and the GTM was as follows:
430000000a00000000002746000000080000003b
Definitely a horrible truncation of the payload. The problem is in GTMProxy_ProxyCommand, specifically in the two calls to pq_getmsgunreadlen() that appear as arguments within a single gtmpqPutnchar() call. The code assumes that both are evaluated before anything else. Unfortunately, the Intel compiler calls pq_getmsgbytes() first and only then makes the second call to pq_getmsgunreadlen(). By that point the message has already been read, so the second call returns zero and the payload is truncated. I've attached a patch to fix the issue.
--- postgres-xc-1.2.1-orig/src/gtm/proxy/proxy_main.c 2014-04-03 05:18:38.000000000 +0000
+++ postgres-xc-1.2.1/src/gtm/proxy/proxy_main.c 2014-06-03 07:14:58.451411000 +0000
@@ -2390,6 +2390,7 @@
     GTMProxy_CommandInfo *cmdinfo;
     GTMProxy_ThreadInfo *thrinfo = GetMyThreadInfo;
     GTM_ProxyMsgHeader proxyhdr;
+    size_t msgunreadlen = pq_getmsgunreadlen(message);

     proxyhdr.ph_conid = conninfo->con_id;

@@ -2397,8 +2398,8 @@
     if (gtmpqPutMsgStart('C', true, gtm_conn) ||
         gtmpqPutnchar((char *)&proxyhdr, sizeof (GTM_ProxyMsgHeader), gtm_conn) ||
         gtmpqPutInt(mtype, sizeof (GTM_MessageType), gtm_conn) ||
-        gtmpqPutnchar(pq_getmsgbytes(message, pq_getmsgunreadlen(message)),
-                      pq_getmsgunreadlen(message), gtm_conn))
+        gtmpqPutnchar(pq_getmsgbytes(message, msgunreadlen),
+                      msgunreadlen, gtm_conn))
         elog(ERROR, "Error proxing data");

     /*
Aaron
________________________________
From: Aaron Jackson [aja...@re...]
Sent: Monday, June 02, 2014 4:11 PM
To: pos...@li...
Subject: [Postgres-xc-general] Unable to create sequences
|
|
From: Masataka S. <pg...@gm...> - 2014-06-03 08:31:14
|
Hi, Aaron
I think your analysis is right.
The C Standard leaves the order of evaluation of sub-expressions, and
the order in which their side effects take place, unspecified in many
cases. As you saw, the order in which the arguments to a function are
evaluated is one such case.
The patch looks good, and I think we must back-patch it to all the
versions after 1.0.
Regards.
> ------------------------------------------------------------------------------
> Learn Graph Databases - Download FREE O'Reilly Book
> "Graph Databases" is the definitive new guide to graph databases and their
> applications. Written by three acclaimed leaders in the field,
> this first edition is now available. Download your free book today!
> http://p.sf.net/sfu/NeoTech
> _______________________________________________
> Postgres-xc-general mailing list
> Pos...@li...
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-general
>
|
|
From: 鈴木 幸市 <ko...@in...> - 2014-06-05 05:18:23
|
Hi,
I reviewed the patch.
+1 to go to master and 1.x.
Regards;
---
Koichi Suzuki
|