From: Pavan D. <pav...@gm...> - 2013-07-11 10:07:18
|
Hello All, I wonder if we should add appropriate dependencies in pg_depend so that a user should not be able to DROP a NODE without first changing the distribution information of the tables that are currently using that NODE. I could very quickly try out a case where I deleted a NODE and the queries on the table started returning wrong results or INSERTs started to a wrong node. test=# SELECT * from pgxc_node; node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred | node_id -----------+-----------+-----------+-----------+----------------+------------------+------------ c2 | C | 6433 | localhost | f | f | 558734100 d2 | D | 7433 | localhost | f | f | 823431008 C1 | C | 6432 | localhost | f | f | -551723744 d1 | D | 7432 | localhost | f | f | 342786568 (4 rows) test=# CREATE TABLE test_nodedel(a int) DISTRIBUTE BY HASH(a); CREATE TABLE test=# INSERT INTO test_nodedel VALUES (1), (2), (3); INSERT 0 3 test=# SELECT *, xc_node_id FROM test_nodedel ; a | xc_node_id ---+------------ 1 | 342786568 2 | 342786568 3 | 823431008 (3 rows) test=# DROP NODE d1; DROP NODE test=# SELECT pgxc_pool_reload(); pgxc_pool_reload ------------------ t (1 row) test=# SELECT *, xc_node_id FROM test_nodedel WHERE a = 1; a | xc_node_id ---+------------ (0 rows) test=# SELECT *, xc_node_id FROM test_nodedel WHERE a = 2; a | xc_node_id ---+------------ (0 rows) Since both these rows are on datanode d1 and I just dropped that node, may be we are not getting those rows back. Fair enough. test=# SELECT *, xc_node_id FROM test_nodedel WHERE a = 3; ERROR: Invalid Datanode number Here I would have expected to get the row because d2 is still running and is configured on the coordinator. At the least, we should give a better error. test=# INSERT INTO test_nodedel VALUES (1), (2); INSERT 0 2 This INSERT works OK though and the rows get inserted without any error. test=# CREATE NODE d1 WITH (type = datanode, port = 7432); CREATE NODE Now I recreated the node with the same information. test=# SELECT pgxc_pool_reload(); pgxc_pool_reload ------------------ t (1 row) test=# SELECT *, xc_node_id FROM test_nodedel; a | xc_node_id ---+------------ 1 | 342786568 2 | 342786568 3 | 823431008 1 | 823431008 2 | 823431008 (5 rows) But now we have a problem. The last two tuples inserted went into a wrong node. Note that the xc_node_id value is different from the same value of column 'a' which is the hash distribution column. I haven't looked at the code to see if this can be fixed easily, but looks like a bug to me. I am not sure if the node management code is smart enough to deal with scenarios like these. Or do we expect the DBAs to take care of such issues i.e. expect them to take extreme care while dealing with node management ? Can adding appropriate dependencies between objects could solve this ? Thanks, Pavan -- Pavan Deolasee http://www.linkedin.com/in/pavandeolasee |