#481 CREATE EXTENSION Failure

1.1 Dev Q
open
nobody
5
2014-02-22
2014-02-20
David E. Wheeler
No

Copy the two attached files to $PGXC_INSTALL_BASE/share/extension on each data node. Then connect to a coordinator and try to create it. When I do it, I get errors:

# CREATE EXTENSION nodeid;
WARNING:  unexpected EOF on datanode connection
WARNING:  unexpected EOF on datanode connection
ERROR:  type "node_id" already exists

Try again and find you've been disconnected:

# CREATE EXTENSION nodeid;
You are currently not connected to a database.

So reconnect and try again:

# \c dwheeler
# CREATE EXTENSION nodeid;
ERROR:  type "node_id" already exists

No EOF warnings this time, but still, where is this data type defined? I can't find it on this coordinator:

# \dT+ node.*
                                List of data types
 Schema | Name | Internal name | Size | Elements | Access privileges | Description 
--------+------+---------------+------+----------+-------------------+-------------
(0 rows)

But I connect to a data node, and now I can see it:

=# \dT+ node*
                                  List of data types
 Schema |  Name   | Internal name | Size  | Elements | Access privileges | Description 
--------+---------+---------------+-------+----------+-------------------+-------------
 public | node_id | node_id       | tuple |          |                   | 
(1 row)

I have four coordinators and four data nodes. The data type appears to be present on all of the data nodes and all of the coordinators except for the one I actually connected to.

There is no way for me to clean this up, either, short of dropping the database in which I attempted to create the extension (or the schema with CASCADE).

I have two questions about this issue:

  1. What causes the error such that the coordinator I'm connected to thinks that the node_id type is not created when it manages to be created on the other nodes?

  2. How could it be present on any other nodes at all? If there was an error, shouldn't thinks have been rolled back to prevent their creation? Could there be a two-phase commit issue here?

Thanks!

2 Attachments

Discussion

  • Log output. From the coordinator on which I ran the command:

    WARNING:  Do not have a GTM snapshot available
    WARNING:  Do not have a GTM snapshot available
    WARNING:  Do not have a GTM snapshot available
    WARNING:  Do not have a GTM snapshot available
    2014-02-21 16:31:37 PST dwheeler [local] 18075 5307efe9.469b WARNING:  unexpected EOF on datanode connection
    2014-02-21 16:31:37 PST dwheeler [local] 18075 5307efe9.469b ERROR:  type "node_id" already exists
    2014-02-21 16:31:37 PST dwheeler [local] 18075 5307efe9.469b STATEMENT:  CREATE EXTENSION nodeid;
    2014-02-21 16:31:37 PST dwheeler [local] 18075 5307efe9.469b WARNING:  unexpected EOF on datanode connection
    2014-02-21 16:31:37 PST dwheeler [local] 18075 5307efe9.469b LOG:  Failed to ABORT at node 16404
        Detail: unexpected EOF on datanode connection
    2014-02-21 16:31:37 PST dwheeler [local] 18075 5307efe9.469b LOG:  Failed to ABORT an implicitly PREPARED transaction status - 7
    

    From the primary data node:

    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(45626) 18077 5307efe9.469d WARNING:  Do not have a GTM snapshot available
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(45626) 18077 5307efe9.469d LOG:  Falling back to local Xid. Was = 0, now is = 53146
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(45626) 18077 5307efe9.469d STATEMENT:  SET client_min_messages = warning;
    
        CREATE TYPE node_id AS (
            customer_id INTEGER,
            username    TEXT,
            type        BIGINT
        );
    
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(45626) 18077 5307efe9.469d ERROR:  type "node_id" already exists
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(45626) 18077 5307efe9.469d STATEMENT:  CREATE EXTENSION nodeid;
    

    From another coordinator:

    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(51612) 21621 5307efe9.5475 LOG:  Falling back to local Xid. Was = 0, now is = 53157
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(51612) 21621 5307efe9.5475 STATEMENT:  SET client_min_messages = warning;
    
        CREATE TYPE node_id AS (
            customer_id INTEGER,
            username    TEXT,
            type        BIGINT
        );
    
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(51612) 21621 5307efe9.5475 ERROR:  type "node_id" already exists
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(51612) 21621 5307efe9.5475 STATEMENT:  CREATE EXTENSION nodeid;
    

    And another data node:

    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(49818) 21622 5307efe9.5476 WARNING:  Do not have a GTM snapshot available
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(49818) 21622 5307efe9.5476 LOG:  Falling back to local Xid. Was = 0, now is = 53155
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(49818) 21622 5307efe9.5476 STATEMENT:  SET client_min_messages = warning;
    
        CREATE TYPE node_id AS (
            customer_id INTEGER,
            username    TEXT,
            type        BIGINT
        );
    
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(49818) 21622 5307efe9.5476 ERROR:  type "node_id" already exists
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(49818) 21622 5307efe9.5476 STATEMENT:  CREATE EXTENSION nodeid;
    

    A third data node looks pretty ugly:

    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(51883) 26847 5307efe9.68df WARNING:  Do not have a GTM snapshot available
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(51883) 26847 5307efe9.68df LOG:  Falling back to local Xid. Was = 0, now is = 53160
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(51883) 26847 5307efe9.68df STATEMENT:  SET client_min_messages = warning;
    
        CREATE TYPE node_id AS (
            customer_id INTEGER,
            username    TEXT,
            type        BIGINT
        );
    
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(51883) 26847 5307efe9.68df ERROR:  type "node_id" already exists
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(51883) 26847 5307efe9.68df STATEMENT:  CREATE EXTENSION nodeid;
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(51883) 26847 5307efe9.68df PANIC:  cannot abort transaction 53160, it was already committed
    2014-02-21 16:31:37 PST   5162 53065e9f.142a LOG:  server process (PID 26847) was terminated by signal 6: Aborted
    2014-02-21 16:31:37 PST   5162 53065e9f.142a DETAIL:  Failed process was running: CREATE EXTENSION nodeid;
    2014-02-21 16:31:37 PST   5162 53065e9f.142a LOG:  terminating any other active server processes
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(51844) 26776 5307ee22.6898 WARNING:  terminating connection because of crash of another server process
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(51844) 26776 5307ee22.6898 DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
    2014-02-21 16:31:37 PST dwheeler 10.4.34.1(51844) 26776 5307ee22.6898 HINT:  In a moment you should be able to reconnect to the database and repeat your command.
    2014-02-21 16:31:37 PST [unknown] 10.4.34.1(51885) 26848 5307efe9.68e0 LOG:  PID 26847 in cancel request did not match any process
    2014-02-21 16:31:37 PST   5169 53065e9f.1431 WARNING:  terminating connection because of crash of another server process
    2014-02-21 16:31:37 PST   5169 53065e9f.1431 DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
    2014-02-21 16:31:37 PST   5169 53065e9f.1431 HINT:  In a moment you should be able to reconnect to the database and repeat your command.
    2014-02-21 16:31:37 PST   5162 53065e9f.142a LOG:  all server processes terminated; reinitializing
    2014-02-21 16:31:37 PST   26849 5307efe9.68e1 LOG:  database system was interrupted; last known up at 2014-02-21 16:29:09 PST
    2014-02-21 16:31:37 PST   26849 5307efe9.68e1 LOG:  database system was not properly shut down; automatic recovery in progress
    2014-02-21 16:31:37 PST   26849 5307efe9.68e1 LOG:  redo starts at C/5BBA2688
    2014-02-21 16:31:37 PST   26849 5307efe9.68e1 LOG:  record with zero length at C/5BBB53D0
    2014-02-21 16:31:37 PST   26849 5307efe9.68e1 LOG:  redo done at C/5BBB53A0
    2014-02-21 16:31:37 PST   26849 5307efe9.68e1 LOG:  last completed transaction was at log time 2014-02-22 00:31:37.34089+00
    2014-02-21 16:31:38 PST   5162 53065e9f.142a LOG:  database system is ready to accept connections
    2014-02-21 16:31:38 PST   26853 5307efea.68e5 LOG:  autovacuum launcher started
    

    Curious that I have problems with no other extensions or loads. I will dig into the GTM warnings, though.