From: Koichi S. <koi...@gm...> - 2014-02-14 02:07:21
I misunderstood the implication. Anyway, the additional wait is separate
from your suggestion. Disconnecting the connection as you suggested will
bring another problem, for example with TEMPORARY objects in subsequent
queries. We do not support TEMPORARY objects, but I believe we should be
consistent on this for future releases.

Thoughts?
---
Koichi Suzuki

2014-02-14 2:30 GMT+09:00 Andrei Martsinchyk <and...@gm...>:
> Hello,
>
> Postgres establishes a separate connection to deliver the Cancel command
> to the target session. On a heavily loaded node this may take fairly long.
> A longer sleep would help, but it means longer recovery after an error.
> A better solution is to remove the canceled connection from the pool and
> therefore not use it to handle subsequent queries.
>
> 2014-02-13 11:10 GMT+02:00 Koichi Suzuki <koi...@gm...>:
>>
>> I think it hits the point. I tested this patch several times and it
>> seems to work fine. The delay time (at present 10ms) is short enough
>> and it is applied only when we need to cancel a statement.
>>
>> We should check this into the master and all STABLE branches, replacing
>> the magic number with a meaningfully named constant.
>>
>> Any thoughts?
>> ---
>> Koichi Suzuki
>>
>> 2014-01-24 18:25 GMT+09:00 Masataka Saito <pg...@gm...>:
>> > Hello,
>> >
>> > As I've been exasperated by random failures, I'd like to track down
>> > the cause of the issue.
>> >
>> > This issue is related to the cancel of a failed query.
>> > When a datanode reports an error for a query, the coordinator sends a
>> > cancel request to the non-idle nodes, waits for them to become ready,
>> > and then asks the nodes to roll back the transaction.
>> >
>> > Where's the problem? Consider the following case.
>> > 1. Datanode A (PID 1) reports an error to coordinator A. ([1] 'E' message)
>> > 2. Coordinator A receives [1] and reports an error to the frontend. ([2] 'E' message)
>> > 3. Coordinator A starts the abort process and thinks datanode A (PID 1) is not idle.
>> > 4. Coordinator A sends a cancel request for PID 1 to datanode A (PID 2). ([3] cancel message)
>> > 5. Datanode A (PID 1) reports ready to coordinator A. ([4] 'Z' message)
>> > 6. Coordinator A receives [4] and sends "ROLLBACK TRANSACTION" immediately. ([5] 'Q' message)
>> > 7. Datanode A (PID 1) receives [5] and starts processing the query.
>> > 8. Datanode A (PID 2) receives [3].
>> > 9. Datanode A (PID 2) notifies PID 1 of [3].
>> > 10. Datanode A (PID 1) cancels processing of [5] and reports an error to coordinator A. ([6] 'E' message)
>> > 11. Coordinator A receives [6] and reports an error to the frontend. ([7] 'E' message)
>> >
>> > [7] produces unexpected output and a test fails.
>> >
>> > In an extreme case, the query following [5] could be cancelled by [3].
>> >
>> > As far as I know, there's no way to know when the cancel request gets
>> > processed, so I think we cannot avoid waiting for an empirically chosen
>> > duration after cancelling, as in the attached patch.
>> >
>> > Does anyone have another cool idea to solve this issue?
>> >
>> > Regards.
>> >
>> > _______________________________________________
>> > Postgres-xc-developers mailing list
>> > Pos...@li...
>> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
> --
> Andrei Martsinchyk
>
> StormDB - http://www.stormdb.com
> The Database Cloud
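
The ordering problem in steps 1 to 11 of the quoted message can be sketched
as a small deterministic simulation. This is only a toy model under stated
assumptions, not Postgres-XC code: time is counted in abstract ticks, and
the names Session, abort_transaction and replace_canceled_connection are
hypothetical. It shows that the late cancel [3] hits ROLLBACK [5] whenever
it arrives while [5] is running, that a fixed wait only helps while the
cancel latency stays below it, and that replacing the connection avoids the
race at the cost of session-local state such as TEMPORARY objects.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

ROLLBACK = "ROLLBACK TRANSACTION"


@dataclass
class Session:
    """Toy model of one datanode backend (PID 1 in the quoted steps)."""
    running: Optional[str] = None
    errors: List[str] = field(default_factory=list)
    temp_objects: Set[str] = field(default_factory=set)

    def deliver_cancel(self) -> None:
        # A cancel only interrupts whatever happens to be running on arrival.
        if self.running is not None:
            self.errors.append(f"canceled: {self.running}")
            self.running = None


def abort_transaction(cancel_latency: int, wait_after_cancel: int,
                      rollback_ticks: int = 2) -> List[str]:
    """Return the extra errors the frontend would see after the first one.

    The cancel is sent at tick 0 and reaches the backend at `cancel_latency`.
    The coordinator sends ROLLBACK at `wait_after_cancel`, and the backend
    needs `rollback_ticks` (assumed > 0) to run it.
    """
    session = Session()
    # (tick, tie-break rank, event). On a tie the rollback finishes first,
    # and a cancel arriving exactly at rollback start hits an idle backend.
    events = [
        (wait_after_cancel + rollback_ticks, 0, "rollback_end"),
        (cancel_latency, 1, "cancel"),
        (wait_after_cancel, 2, "rollback_start"),
    ]
    for _, _, event in sorted(events):
        if event == "rollback_start":
            session.running = ROLLBACK
        elif event == "rollback_end":
            session.running = None
        else:
            session.deliver_cancel()
    return session.errors


def replace_canceled_connection(old: Session) -> Session:
    """Alternative from the thread: never reuse a canceled connection.

    The old session is simply dropped, so a late cancel can no longer hit a
    query the coordinator cares about, but the replacement starts without
    any session-local state.
    """
    return Session()


if __name__ == "__main__":
    # No wait: the cancel arrives while ROLLBACK is running (steps 6 to 10).
    print(abort_transaction(cancel_latency=1, wait_after_cancel=0))
    # Wait longer than the cancel latency: the cancel hits an idle backend.
    print(abort_transaction(cancel_latency=1, wait_after_cancel=2))
    # Loaded node: the latency exceeds the fixed wait and the race is back.
    print(abort_transaction(cancel_latency=3, wait_after_cancel=2))
```

In this model the wait works only when it is longer than the cancel
latency, which matches the concern that a heavily loaded node may need a
longer sleep. The replacement function returns a session with an empty
temp_objects set, which is the consistency issue raised in the reply.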