From: Andrei M. <and...@gm...> - 2014-02-14 06:41:35
It is not an issue of PG itself, but of the way XC uses the feature. The usage differs somewhat from what was intended.

On 14.02.2014 7:21, "Koichi Suzuki" <koi...@gm...> wrote:

> It seems to be an issue of PG itself, doesn't it?
> ---
> Koichi Suzuki
>
>
> 2014-02-14 14:06 GMT+09:00 Andrei Martsinchyk <and...@gm...>:
> > You are right, the temp objects are a problem.
> > On the one hand, if we run a long query and there is an error on one node,
> > we want to cancel it on the others to avoid unnecessary waiting. On the
> > other hand, the query may be near its natural end, and the cancel may
> > arrive late and hit the next query.
> > Just throwing out ideas:
> > - Make Cancel more selective so that it affects only a specific query.
> > That means introducing an ID for each query, which must be known to the
> > client, and a way to deliver it.
> > - Introduce a procedure for changing the backend key. An old cancel won't
> > affect such a backend.
> > - Before starting a new query, check whether there is a pending cancel and
> > remove it. "Cancel the cancel" sounds ridiculous, but it may work if
> > queries and cancels are issued synchronously from a single source.
> >
> > On 14.02.2014 4:07, "Koichi Suzuki" <koi...@gm...> wrote:
> >
> >> I misunderstood the implication. Anyway, the additional wait is separate
> >> from your suggestion.
> >>
> >> Disconnecting the connection as you suggested will bring another
> >> problem, such as TEMPORARY objects in the subsequent queries. We do
> >> not support TEMPORARY objects, but I believe we should be consistent on
> >> this for future releases.
> >>
> >> Thoughts?
> >> ---
> >> Koichi Suzuki
> >>
> >>
> >> 2014-02-14 2:30 GMT+09:00 Andrei Martsinchyk
> >> <and...@gm...>:
> >> > Hello,
> >> >
> >> > Postgres establishes a separate connection to deliver the Cancel
> >> > command to the target session.
> >> > On a heavily loaded node this may take fairly long. A longer sleep
> >> > would help, but it means longer recovery after an error.
> >> > A better solution is to remove the canceled connection from the pool
> >> > and therefore not use it to handle subsequent queries.
> >> >
> >> >
> >> > 2014-02-13 11:10 GMT+02:00 Koichi Suzuki <koi...@gm...>:
> >> >>
> >> >> I think it hits the point. I tested this patch several times and it
> >> >> seems to work fine. The delay time (at present 10ms) is short enough
> >> >> and it is applied only when we need to cancel a statement.
> >> >>
> >> >> We should check this into master and all the STABLE branches,
> >> >> replacing the magic number with a meaningful name.
> >> >>
> >> >> Any thoughts?
> >> >> ---
> >> >> Koichi Suzuki
> >> >>
> >> >>
> >> >> 2014-01-24 18:25 GMT+09:00 Masataka Saito <pg...@gm...>:
> >> >> > Hello,
> >> >> >
> >> >> > As I've been exasperated by random failures, I'm willing to track
> >> >> > down the cause of the issue.
> >> >> >
> >> >> > This issue is related to the cancel of the failed query.
> >> >> > When a datanode reports an error for a query, the coordinator sends
> >> >> > a cancel request to the non-idle nodes, waits for them to become
> >> >> > ready, and requests the nodes to roll back the transaction.
> >> >> >
> >> >> > Where's the problem? Consider the following case.
> >> >> > 1. Datanode A (PID 1) reports an error to coordinator A. ([1] 'E' message)
> >> >> > 2. Coordinator A receives [1] and reports an error to the frontend. ([2] 'E' message)
> >> >> > 3. Coordinator A starts the abort process and thinks datanode A (PID 1) is not idle.
> >> >> > 4. Coordinator A sends a cancel request about PID 1 to datanode A (PID 2). ([3] cancel message)
> >> >> > 5. Datanode A (PID 1) reports ready to coordinator A. ([4] 'Z' message)
> >> >> > 6. Coordinator A receives [4] and sends "ROLLBACK TRANSACTION" immediately. ([5] 'Q' message)
> >> >> > 7. Datanode A (PID 1) receives [5] and starts processing the query.
> >> >> > 8. Datanode A (PID 2) receives [3].
> >> >> > 9. Datanode A (PID 2) notifies PID 1 of [3].
> >> >> > 10. Datanode A (PID 1) cancels processing of [5] and reports an error to coordinator A. ([6] 'E' message)
> >> >> > 11. Coordinator A receives [6] and reports an error to the frontend. ([7] 'E' message)
> >> >> >
> >> >> > [7] produces unexpected output and a test fails.
> >> >> >
> >> >> > In the extreme case, the query following [5] could be cancelled by [3].
> >> >> >
> >> >> > As far as I know, there's no way to know when the cancel request
> >> >> > gets processed, so I think we have no choice but to wait for an
> >> >> > empirically chosen duration after cancelling, as in the attached
> >> >> > patch.
> >> >> >
> >> >> > Does anyone have another cool idea to solve this issue?
> >> >> >
> >> >> > Regards.
> >> >> >
> >> >> > _______________________________________________
> >> >> > Postgres-xc-developers mailing list
> >> >> > Pos...@li...
> >> >> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
> >> >> >
> >> >>
> >> >
> >> > --
> >> > Andrei Martsinchyk
> >> >
> >> > StormDB - http://www.stormdb.com
> >> > The Database Cloud
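
To make the race in steps 1-11 and the first idea above (a cancel that targets a specific query) concrete, here is a minimal toy simulation. It is only a sketch under stated assumptions: the `Backend` class, the integer query IDs, and the `replay` helper are all hypothetical names invented for illustration. None of this is Postgres-XC or PostgreSQL code, and the real cancel path has no per-query ID today.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Backend:
    """Toy model of one datanode session (hypothetical, not Postgres-XC code)."""
    pid: int
    running: Optional[int] = None          # id of the query in progress, if any
    log: List[str] = field(default_factory=list)

    def start(self, query_id: int, sql: str) -> None:
        self.running = query_id
        self.log.append(f"start {query_id}: {sql}")

    def finish(self) -> None:
        # The query ends (successfully or with an error) and the session is ready.
        self.log.append(f"done {self.running}")
        self.running = None

    def cancel(self, target_query: Optional[int] = None) -> bool:
        """Deliver a cancel. target_query=None models a cancel keyed by PID only."""
        if self.running is None:
            return False                   # session is idle, nothing to cancel
        if target_query is not None and target_query != self.running:
            return False                   # stale cancel for an older query: ignore
        self.log.append(f"cancelled {self.running}")
        self.running = None
        return True


def replay(selective: bool) -> List[str]:
    """Replay the interleaving from the thread: the cancel lands after ROLLBACK starts."""
    node = Backend(pid=1)
    node.start(1, "failing query")
    node.finish()                          # steps 1 and 5: error reported, then ready
    node.start(2, "ROLLBACK TRANSACTION")  # steps 6-7
    node.cancel(1 if selective else None)  # steps 8-10: late cancel meant for query 1
    if node.running is not None:
        node.finish()
    return node.log
```

With `replay(False)` the late cancel hits the ROLLBACK, which is the spurious error described in steps 10-11. With `replay(True)` the stale cancel is ignored and the ROLLBACK completes. The sketch says nothing about how a query ID would be delivered to the node, which is the hard part raised in the thread.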