From: Koichi S. <koi...@gm...> - 2010-12-14 01:15:11
|
Hi, please see inline... ---------- Koichi Suzuki 2010/12/13 Mason Sharp <mas...@en...>: > On 12/12/10 9:28 PM, Michael Paquier wrote: >> >> I reviewed, and I thought it looked good, except for a possible issue with >> committing. >> >> I wanted to test what happened with implicit transactions when there was a >> failure. >> >> I executed this in one session: >> >> mds1=# begin; >> BEGIN >> mds1=# insert into mds1 values (1,1); >> INSERT 0 1 >> mds1=# insert into mds1 values (2,2); >> INSERT 0 1 >> mds1=# commit; >> >> Before committing, I fired up gdb for a coordinator session and a data >> node session. >> >> On one of the data nodes, when the COMMIT PREPARED was received, I killed >> the backend to see what would happen. On the Coordinator I saw this: >> >> >> WARNING: unexpected EOF on datanode connection >> WARNING: Connection to Datanode 1 has unexpected state 1 and will be >> dropped >> WARNING: Connection to Datanode 2 has unexpected state 1 and will be >> dropped >> >> ERROR: Could not commit prepared transaction implicitely >> PANIC: cannot abort transaction 10312, it was already committed >> server closed the connection unexpectedly >> This probably means the server terminated abnormally >> before or while processing the request. >> The connection to the server was lost. Attempting reset: Failed. >> >> I am not sure we should be aborting 10312, since it was committed on one >> of the nodes. It corresponds to the original prepared transaction. We also >> do not want a panic to happen. > > This has to be corrected. > If a PANIC happens on Coordinators each time a Datanode crashes, a simple > node crash would mess up the whole cluster. > It is a real problem I think. > > Yes. > > >> >> Next, I started a new coordinator session: >> >> mds1=# select * from mds1; >> col1 | col2 >> ------+------ >> 2 | 2 >> (1 row) >> >> >> I only see one of the rows. I thought, well, ok, we cannot undo a commit, >> and the other one must commit eventually. I was able to continue working >> normally: >> >> mds1=# insert into mds1 values (3,3); >> INSERT 0 1 >> mds1=# insert into mds1 values (4,4); >> INSERT 0 1 >> mds1=# insert into mds1 values (5,5); >> INSERT 0 1 >> mds1=# insert into mds1 values (6,6); >> INSERT 0 1 Are these statements run as a transaction block or did they run as "autocommit" statements? >> >> mds1=# select xmin,* from mds1; >> xmin | col1 | col2 >> -------+------+------ >> 10420 | 4 | 4 >> 10422 | 6 | 6 >> 10312 | 2 | 2 >> 10415 | 3 | 3 >> 10421 | 5 | 5 >> (5 rows) >> >> >> Note xmin keeps increasing because we closed the transaction on GTM at the >> "finish:" label. This may or may not be ok. > > This should be OK, no? If the above statements ran in "autocommit" mode, each statement ran as separate transaction. Xmin just indicates GXID which "created" the row. To determine if it is visible or not, we have to visit CLOG (if GXID is not "frozen") and the list of live transactions to see if it is running, committed or aborted. Then we can determine if a given row should be visible or not. Therefore, if the creator transaction is left just "PREPARED", the creator transaction information will remain in PgProc and is regarded "running", thus it should be regarded "invisible" from other transactions. Similar consideration should be made to see "xmac" value of the row, in the case of "update" or "delete" statement. Hope it helps. --- Koichi Suzuki > > Not necessarily. > > >> >> Meanwhile, on the failed data node: >> >> mds1=# select * from pg_prepared_xacts; >> WARNING: Do not have a GTM snapshot available >> WARNING: Do not have a GTM snapshot available >> transaction | gid | prepared | owner | >> database >> >> -------------+--------+-------------------------------+------------+---------- >> 10312 | T10312 | 2010-12-12 12:04:30.946287-05 | xxxxxx | mds1 >> (1 row) >> >> The transaction id is 10312. Normally this would still appear in >> snapshots, but we close it on GTM. >> >> What should we do? >> >> - We could leave as is. We may in the future have an XC monitoring process >> look for possible 2PC anomalies occasionally and send an alert so that they >> could be resolved by a DBA. > > I was thinking about an external utility that could clean up partially > committed or prepared transactions when a node crash happens. > This is a part of HA, so I think the only thing that should be corrected now > is the way errors are managed in the case of a partially committed prepared > transaction on nodes. > A PANIC is not acceptable for this case. > >> >> - We could instead choose not close out the transaction on GTM, so that >> the xid is still in snapshots. We could test if the rows are viewable or >> not. This could result in other side effects, but without further testing, I >> am guessing this may be similar to when an existing statement is running and >> cannot see a previously committed transaction that is open in its snapshot. >> So, I am thinking this is probably the preferable option (keeping it open on >> GTM until committed on all nodes), but we should test it. In any event, we >> should also fix the panic. > > If we let it open the transaction open on GTM, how do we know the GXID that > has been used for Commit (different from the one that has been used for > PREPARE as I recall)? > > We can test the behavior to see if it is ok to close this one out, > otherwise, we have more work to do... > > If we do a Commit prepare on the remaining node that crashed, we have to > commit the former PREPARE GXID, the former COMMIT PREPARED GXID and also the > GXID that is used to issue the new COMMIT PREPARED on the remaining node. > > It is easy to get the GXID used for former PREPARE and new COMMIT PREPARED. > But there is no real way yet to get back the GXID used for the former COMMIT > PREPARE. > I would see two ways to correct that: > 1) Save the former COMMIT PREPARED GXID in GTM, but this would really impact > performance. > 2) Save the COMMIT PREPARED GXID on Coordinator and let the GXACT open on > Coordinator (would be the best solution, but the transaction has already > been committed on Coordinator). > > I think we need to research the effects of this and see how the system > behaves if the partially failed commit prepared GXID is closed. I suppose it > could cause a problem with viewing pg_prepared_xacts. We don't want the > hint bits to get updated.... well, the first XID will be lower, so the lower > open xmin should keep this from having the tuple frozen. > > That's why I think the transaction should be to close the transaction on > GTM, and a monitoring agent would be in charge to commit on the remaining > nodes that crashed if a partial COMMIT has been done. > > From above, the node is still active and the query after the transaction is > returning partial results. It should be an all or nothing operation. If we > close the transaction on GTM, then it means that Postgres-XC is not atomic. > I think it is important to be ACID compliant. > > I think we should fix the panic, then test how the system behaves if, even > though the transaction is committed on one node, if we keep the transaction > open. The XID will appear in all the snapshots and the row should not be > viewable, and we can make sure that vacuum is also ok (should be). If it > works ok, then I think we should keep the transaction open on GTM until all > components have committed. > > > Btw, it is a complicated point, so other's opinion is completely welcome. > > Yes. > > Thanks, > > Mason > > Regards, > > -- > Michael Paquier > http://michaelpq.users.sourceforge.net > > > > -- > Mason Sharp > EnterpriseDB Corporation > The Enterprise Postgres Company > > > This e-mail message (and any attachment) is intended for the use of > the individual or entity to whom it is addressed. This message > contains information from EnterpriseDB Corporation that may be > privileged, confidential, or exempt from disclosure under applicable > law. If you are not the intended recipient or authorized to receive > this for the intended recipient, any use, dissemination, distribution, > retention, archiving, or copying of this communication is strictly > prohibited. If you have received this e-mail in error, please notify > the sender immediately by reply e-mail and delete this message. > > ------------------------------------------------------------------------------ > Oracle to DB2 Conversion Guide: Learn learn about native support for PL/SQL, > new data types, scalar functions, improved concurrency, built-in packages, > OCI, SQL*Plus, data movement tools, best practices and more. > http://p.sf.net/sfu/oracle-sfdev2dev > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > > |