From: Wang D. <dia...@gm...> - 2014-05-07 10:46:20
I wrote an extension, 'xc2pctest', to debug and test the internal 2PC in Postgres-XC. It needs a small patch to the master code; the patch and the source of the extension are attached. The patch adds a hook at the remote commit stage on the coordinator when it does a 2PC commit. With the hook installed, xc2pctest makes one node fail to commit while the other node succeeds, so the whole cluster goes into a partial commit state.

The steps to reproduce the MVCC bug are as follows:

1. Patch the git master code with the attached patch, recompile and reinstall the whole of Postgres-XC, then compile and install the extension xc2pctest.

2. Create a cluster with 1 GTM, 1 Coordinator named co1, and 2 Datanodes named dn1 and dn2, and start it.

3. Execute the following SQL statements to create the tables, load the extension, and run some updates that put the cluster into a partial commit state:

    create table t1(a int, b int) to node (dn1);
    create table t2(a int, b int) to node (dn2);
    create extension xc2pctest;
    insert into t2 values(2,2);
    insert into t1 values(1,1);
    select install_xc2pctest_fail_hook();
    begin;
    update t1 set b = 11;
    update t2 set b = 22;
    commit;
    execute direct on (dn1) 'select * from pg_prepared_xact()';
    execute direct on (dn2) 'select * from pg_prepared_xact()';

4. In step 3 we created table 't1' on Datanode 'dn1' and table 't2' on Datanode 'dn2', *inserted just 1 row* into each table, and then updated both rows. The transaction containing the updates fails because the extension's hook is enabled. The last 2 statements in step 3 tell us which node (and therefore which table) committed successfully; on my machine it is table 't1'.

5. Now update the table whose update committed successfully ('t1' in my test) and show the result. The output on my machine is:

    postgres=# update t1 set b = 111 where a = 1;
    UPDATE 1
    postgres=# select xmin, xmax, * from t1;
     xmin  | xmax  | a |  b
    -------+-------+---+-----
     10018 | 10020 | 1 |   1
     10096 | 10096 | 1 | 111
    (2 rows)

6. You can see 2 rows in step 5. There should be just 1 row, and updates should be blocked under the read committed isolation level.

I think this is a serious MVCC bug in Postgres-XC.
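
For readers who do not have a patched cluster at hand, the partial commit state described above can be pictured with a small toy model. This is only a sketch in Python, not Postgres-XC code: the `Node` class, the `fail_on_commit` flag (standing in for the injected failure hook), the `two_phase_commit` function, and the transaction id "xact-1" are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


class CommitFailure(Exception):
    """Raised by a node whose second-phase commit is made to fail."""


@dataclass
class Node:
    name: str
    fail_on_commit: bool = False  # hypothetical stand-in for the failure hook
    prepared: set = field(default_factory=set)
    committed: set = field(default_factory=set)

    def prepare(self, gid):
        # Phase 1: the node records the transaction as prepared.
        self.prepared.add(gid)

    def commit_prepared(self, gid):
        # Phase 2: either finish the prepared transaction or fail,
        # in which case it stays in the prepared set.
        if self.fail_on_commit:
            raise CommitFailure(f"{self.name}: commit of {gid} failed")
        self.prepared.remove(gid)
        self.committed.add(gid)


def two_phase_commit(nodes, gid):
    """Run both phases and return the names of nodes that failed to commit."""
    for node in nodes:
        node.prepare(gid)
    failed = []
    for node in nodes:
        try:
            node.commit_prepared(gid)
        except CommitFailure:
            failed.append(node.name)
    return failed


dn1 = Node("dn1")
dn2 = Node("dn2", fail_on_commit=True)
failed = two_phase_commit([dn1, dn2], "xact-1")

for node in (dn1, dn2):
    print(node.name, "prepared:", sorted(node.prepared),
          "committed:", sorted(node.committed))
print("nodes that failed to commit:", failed)
```

In this model dn1 ends up with the transaction committed while dn2 still holds it as prepared, which is the kind of split state that the two `execute direct` queries in step 3 are meant to reveal. The model says nothing about how visibility is decided afterwards; that is where the reported bug lies.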