From: Nikhil S. <ni...@st...> - 2013-08-16 12:20:29
|
> Additionally, ISTM, that the tuplestore should only be used for inner > nodes. There's no need to store outer nodes in tuplestores. Also if it's a > single SELECT query, then there's no need to use tuplestore at all as well. > > PFA, a patch which tries to avoid using the tuplestore in the above two cases. During planning we decide if a tuplestore should be used for the RemoteQuery. The default is true, and we set it to false for the above two cases for now. I ran regression test cases with and without the patch and got the exact same set of failures (and more importantly same diffs). To be clear this patch is not specific to COPY TO, but it's a generic change to avoid using tuplestore in certain simple scenarios thereby reducing the memory footprint of the remote query execution. Note that it also does not solve Hitoshi-san's COPY FROM issues. Will submit a separate patch for that. Regards, Nikhils > Looks like if we can pass hints during plan creation as to whether the > remote scan is part of a join (and is inner node) or not, then accordingly > decision can be taken to materialize into the tuplestore. > > Regards, > Nikhils > > On Thu, Aug 15, 2013 at 10:43 PM, Nikhil Sontakke <ni...@st...>wrote: > >> Looks like my theory was wrong, make installcheck is giving more errors >> with this patch applied. Will have to look at a different solution.. >> >> Regards, >> Nikhils >> >> >> On Thu, Aug 15, 2013 at 2:11 PM, Nikhil Sontakke <ni...@st...>wrote: >> >>> So, I looked at this code carefully and ISTM, that because of the way we >>> fetch the data from the connections and return it immediately inside >>> RemoteQueryNext, storing it in the tuplestore using tuplestore_puttupleslot >>> is NOT required at all. >>> >>> So, I have removed the call to tuplestore_puttupleslot and things seem >>> to be ok for me. I guess we should do a FULL test run with this patch just >>> to ensure that it does not cause issues in any scenarios. >>> >>> A careful look by new set of eyes will help here. I think, if there are >>> no issues, this plugs a major leak in the RemoteQuery code path which is >>> almost always used in our case. >>> >>> Regards, >>> Nikhils >>> >>> >>> On Wed, Aug 14, 2013 at 7:05 PM, Nikhil Sontakke <ni...@st...>wrote: >>> >>>> Using a tuplestore for data coming from RemoteQuery is kinda wrong and >>>> that's what has introduced this issue. Looks like just changing the memory >>>> context will not work as it interferes with the other functioning of the >>>> tuplestore :-| >>>> >>>> Regards, >>>> Nikhils >>>> >>>> >>>> >>>> On Wed, Aug 14, 2013 at 3:07 PM, Ashutosh Bapat < >>>> ash...@en...> wrote: >>>> >>>>> Yes, that's correct. >>>>> >>>>> My patch was not intended to fix this. This was added while fixing a >>>>> bug for parameterised quals on RemoteQuery I think. Check commits by Amit >>>>> in this area. >>>>> >>>>> >>>>> On Wed, Aug 14, 2013 at 3:03 PM, Nikhil Sontakke <ni...@st...>wrote: >>>>> >>>>>> Yeah, but AFAICS, even 1.1 (and head) *still* has a leak in it. >>>>>> >>>>>> Here's the snippet from RemoteQueryNext: >>>>>> >>>>>> if (tuplestorestate && !TupIsNull(scanslot)) >>>>>> tuplestore_puttupleslot(tuplestorestate, >>>>>> scanslot); >>>>>> >>>>>> I printed the current memory context inside this function, it is >>>>>> ""ExecutorState". This means that the tuple will stay around till the query >>>>>> is executing in its entirety! For large COPY queries this is bound to cause >>>>>> issues as is also reported by Hitoshi san on another thread. >>>>>> >>>>>> I propose that in RemoteQueryNext, before calling the >>>>>> tuplestore_puttupleslot we switch into >>>>>> scan_node->ps.ps_ExprContext's ecxt_per_tuple_memory context. It will >>>>>> get reset, when the next tuple has to be returned to the caller and the >>>>>> leak will be curtailed. Thoughts? >>>>>> >>>>>> Regards, >>>>>> Nikhils >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Wed, Aug 14, 2013 at 11:33 AM, Ashutosh Bapat < >>>>>> ash...@en...> wrote: >>>>>> >>>>>>> There has been an overhaul in the planner (and corresponding parts >>>>>>> of executor) in 1.1, so it would be better if they move to 1.1 after GA. >>>>>>> >>>>>>> >>>>>>> On Wed, Aug 14, 2013 at 10:54 AM, Nikhil Sontakke < >>>>>>> ni...@st...> wrote: >>>>>>> >>>>>>>> Ah, I see. >>>>>>>> >>>>>>>> I was looking at REL_1_0 sources. >>>>>>>> >>>>>>>> There are people out there using REL_1_0 as well. >>>>>>>> >>>>>>>> Regards, >>>>>>>> Nikhils >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Aug 13, 2013 at 9:52 AM, Ashutosh Bapat < >>>>>>>> ash...@en...> wrote: >>>>>>>> >>>>>>>>> It should be part of 1.1 as well. It was done to support >>>>>>>>> projection out of RemoteQuery node. >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Aug 13, 2013 at 7:17 AM, Nikhil Sontakke < >>>>>>>>> ni...@st...> wrote: >>>>>>>>> >>>>>>>>>> Hi Ashutosh, >>>>>>>>>> >>>>>>>>>> I guess you have changed it in pgxc head? I was looking at 103 >>>>>>>>>> and 11 branches and saw this. In that even ExecRemoteQuery seems to have an >>>>>>>>>> issue wherein it's not using the appropriate context. >>>>>>>>>> >>>>>>>>>> Regards, >>>>>>>>>> Nikhils >>>>>>>>>> >>>>>>>>>> Sent from my iPhone >>>>>>>>>> >>>>>>>>>> On Aug 12, 2013, at 9:54 AM, Ashutosh Bapat < >>>>>>>>>> ash...@en...> wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Welcome to the mess ;) and enjoy junk food. >>>>>>>>>> >>>>>>>>>> Sometime back, I have changed ExecRemoteQuery to be called in the >>>>>>>>>> same fashion as other Scan nodes. So, you will see ExecRemoteQuery calling >>>>>>>>>> ExecScan with RemoteQueryNext as the iterator. So, I assume your comment >>>>>>>>>> pertains to RemoteQueryNext and its minions and not ExecRemoteQuery per say! >>>>>>>>>> >>>>>>>>>> This code needs a lot of rework, removing duplications, using >>>>>>>>>> proper way of materialisation, central response handler and error handler >>>>>>>>>> etc. If we clean up this code, some improvements in planner (like using >>>>>>>>>> MergeAppend plan) for Sort, will be possible. Regarding materialisation, >>>>>>>>>> the code uses a linked list for materialising the rows from datanodes (in >>>>>>>>>> case the same connection needs to be given to other remote query node), >>>>>>>>>> which must be eating a lot of performance. Instead we should be using some >>>>>>>>>> kind of tuplestore there. We actually use tuplestore (as well) in the >>>>>>>>>> RemoteQuery node; the same method can be used. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Sat, Aug 10, 2013 at 10:48 PM, Nikhil Sontakke < >>>>>>>>>> ni...@st...> wrote: >>>>>>>>>> >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> Have a Query about ExecRemoteQuery. >>>>>>>>>>> >>>>>>>>>>> The logic seems to have been modeled after ExecMaterial. ISTM, >>>>>>>>>>> that it should have been modeled after ExecScan because we fetch tuples, >>>>>>>>>>> and those which match the qual should be sent up. ExecMaterial is for >>>>>>>>>>> materializing and collecting and storing tuples. >>>>>>>>>>> >>>>>>>>>>> Can anyone explain? The reason for asking this is I am >>>>>>>>>>> suspecting a big memory leak in this code path. We are not using any >>>>>>>>>>> expression context nor we are freeing up tuples as we scan for the one >>>>>>>>>>> which qualifies. >>>>>>>>>>> >>>>>>>>>>> Regards, >>>>>>>>>>> Nikhils >>>>>>>>>>> -- >>>>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>>>> The Database Cloud >>>>>>>>>>> >>>>>>>>>>> ------------------------------------------------------------------------------ >>>>>>>>>>> Get 100% visibility into Java/.NET code with AppDynamics Lite! >>>>>>>>>>> It's a free troubleshooting tool designed for production. >>>>>>>>>>> Get down to code-level detail for bottlenecks, with <2% overhead. >>>>>>>>>>> Download for free and get started troubleshooting in minutes. >>>>>>>>>>> >>>>>>>>>>> http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clktrk >>>>>>>>>>> _______________________________________________ >>>>>>>>>>> Postgres-xc-developers mailing list >>>>>>>>>>> Pos...@li... >>>>>>>>>>> >>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Best Wishes, >>>>>>>>>> Ashutosh Bapat >>>>>>>>>> EntepriseDB Corporation >>>>>>>>>> The Postgres Database Company >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Best Wishes, >>>>>>>>> Ashutosh Bapat >>>>>>>>> EntepriseDB Corporation >>>>>>>>> The Postgres Database Company >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> StormDB - http://www.stormdb.com >>>>>>>> The Database Cloud >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Best Wishes, >>>>>>> Ashutosh Bapat >>>>>>> EntepriseDB Corporation >>>>>>> The Postgres Database Company >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> StormDB - http://www.stormdb.com >>>>>> The Database Cloud >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Best Wishes, >>>>> Ashutosh Bapat >>>>> EntepriseDB Corporation >>>>> The Postgres Database Company >>>>> >>>> >>>> >>>> >>>> -- >>>> StormDB - http://www.stormdb.com >>>> The Database Cloud >>>> >>> >>> >>> >>> -- >>> StormDB - http://www.stormdb.com >>> The Database Cloud >>> >> >> >> >> -- >> StormDB - http://www.stormdb.com >> The Database Cloud >> > > > > -- > StormDB - http://www.stormdb.com > The Database Cloud > -- StormDB - http://www.stormdb.com The Database Cloud |