From: Koichi S. <koi...@gm...> - 2013-05-24 13:30:53
|
I didn't realize this idea while listening to Robert's talk. Yes, it will be nice. I think we should test how it works. I understand this is a very good candidate for the problem. Regards; ---------- Koichi Suzuki 2013/5/23 Ashutosh Bapat <ash...@en...> > Hi All, > While I hear Robert's presentation on query planning go wrong, I heard > (first time) about MergeAppend plan. It looks like, we have something > better to do with the way we are pushing the ORDER BY clauses and doing > merge at the coordinator. Right now we are using Sort node and in ExecSort > we start merging the rows. Instead, I think we should be using MergeAppend > node. The only bad thing about this approach is MergeAppend node expects > separate plans for each run of the sorted data. To do this, we need to > replicate the RemoteQuery node as many times as there are nodes and put > them in the list to the MergeAppend plan. I think that's going to solve the > problem with materialisation (see mail thread "Using remote sorting for > merge-join") and thus improve performance. It also takes away the need to > have xc_node_id in the tupleslot of tuplestore. Obviously, simplifying the > code. > > Any comments? > > Created 3613815 > <https://sourceforge.net/tracker/?func=detail&aid=3613815&group_id=311227&atid=1310232>and > assigned to me. > <https://sourceforge.net/tracker/?func=detail&aid=3613815&group_id=311227&atid=1310232> > > On Mon, Dec 3, 2012 at 3:44 AM, Ashutosh Bapat < > ash...@en...> wrote: > >> On Fri, Nov 30, 2012 at 9:37 AM, Michael Paquier >> <mic...@gm...> wrote: >> > Hi Ashutosh, >> > >> > Thanks for the patch. >> > I found for the time being the following problems: >> > 1) The patch has some whitespaces >> > /home/michael/download/xc_sort.patch:16: trailing whitespace. >> > >> > /home/michael/download/xc_sort.patch:34: trailing whitespace. >> > >> > /home/michael/download/xc_sort.patch:307: trailing whitespace. >> > return local_plan_left; >> > /home/michael/download/xc_sort.patch:356: trailing whitespace. >> > >> > /home/michael/download/xc_sort.patch:360: trailing whitespace. >> > >> >> Fixed. >> >> > error: patch failed: src/include/pgxc/execRemote.h:95 >> > 2) the patch does not apply cleanly even by using "patch -p1"-like >> commands >> >> When I sent the patch (more than a week before you replied) it was >> based on the HEAD, but then there were some commits which created the >> conflicts. The one attached should apply. >> >> > 3) No documentation about the new option enable_remotesort. >> > 4) enable_remotesort is not listed in postgresql.conf. >> > Please provide both documentation and reference in >> postgresql.conf.sample >> > for transparency of the code. As an open source project it is important >> to >> > classify all the options available, for both the user and the >> developers. >> >> Added although, I still don't approve of it myself. >> >> > 5) What is cruelly missing with this patch are numbers. By how much this >> > feature improves performance? >> > Simple tests like a ORDER BY on a table with millions of rows would be >> > enough, but as this is a performance feature, you should provide data >> about >> > how much those cases are improved. >> > 6) A complain that I say for each planner-related patch in this >> project.... >> > This patch has a size of 200KB, 65% being updates on the EXPLAIN >> queries of >> > the XC query planning regression tests. This makes the commit tree >> > unreadable and the evolution of the regression tests realy hard to >> follow >> > because they need to be modified each time a planner optimization is >> > introduced. We should move to a set of tests which is far more >> > code-modification proof! >> >> This is planner code, so it's going to affect the plans that it >> creates. In the test files we add EXPLAIN (with proper switches) to >> check 1. whether the plans are coming as expected at that stage of >> development, 2. Whether the clauses expected to be pushed down are >> being indeed pushed down 3. the queries to be sent to the datanodes >> are being crafted correctly. For every planner change, all the three >> things above change, for the operation being planned and hence the >> expected output changes are unavoidable. In fact, if you look at how >> the expected output changing you get a idea of how correctly/wrong the >> planner changes are made. Those diffs if go wrong, are great help to >> me in finding problems earlier. During merge as well, if those tests >> fail because of EXPLAIN outputs, that's a way to unwanted planner >> changes early enough and correct them. Sorry, for the huge change, but >> it's unavoidable. The testcases are intentionally prone to (planner) >> code modification. >> >> > >> > Regards, >> > >> > On Mon, Nov 26, 2012 at 8:42 PM, Ashutosh Bapat >> > <ash...@en...> wrote: >> >> >> >> Hi All, >> >> PFA patch which enables pushing down the ORDER BY clause or sorting to >> the >> >> Datanode/s through standard_planner(). >> >> >> >> This patch leverages the already existing merge sort mechanism for >> >> RemoteQuery but now outside RemoteQuery using Sort plans. >> >> >> >> For a Sort plan it checks if all the expressions on which the result >> needs >> >> to be sorted are shippable and that the underlying tree allows to push >> ORDER >> >> BY to the Datanodes. If so, it includes the ORDER BY clause in the >> query to >> >> be sent to the Datanodes. If the number of Datanodes which participate >> in >> >> RemoteQuery is not 1, it adds a covering Sort node to merge the >> results from >> >> the Datanodes, otherwise, there is no covering Sort plan. >> >> >> >> The merge sort mechanism treats every Datanode as a tape and prefetches >> >> the results from the Datanode corresponding to a dwindling tape. This >> is >> >> enabled by setting the node from where to pull the next row before >> calling >> >> ExecProcNode on RemoteQueryState. >> >> >> >> The same mechanism is being used for Sort plan node under Group or Agg >> >> nodes. >> >> >> >> The patch adds test xc_sort.sql for testing this functionality. It also >> >> removes the now useless members sort and tuplesortstate from >> RemoteQuery and >> >> RemoteQueryState resp. >> >> -- >> >> Best Wishes, >> >> Ashutosh Bapat >> >> EntepriseDB Corporation >> >> The Enterprise Postgres Company >> >> >> >> >> >> >> >> >> ------------------------------------------------------------------------------ >> >> Monitor your physical, virtual and cloud infrastructure from a single >> >> web console. Get in-depth insight into apps, servers, databases, >> vmware, >> >> SAP, cloud infrastructure, etc. Download 30-day Free Trial. >> >> Pricing starts from $795 for 25 servers or applications! >> >> http://p.sf.net/sfu/zoho_dev2dev_nov >> >> _______________________________________________ >> >> Postgres-xc-developers mailing list >> >> Pos...@li... >> >> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >> >> >> > >> > >> > >> > -- >> > Michael Paquier >> > http://michael.otacoo.com >> >> >> >> -- >> Best Wishes, >> Ashutosh Bapat >> EntepriseDB Corporation >> The Enterprise Postgres Company >> > > > > -- > Best Wishes, > Ashutosh Bapat > EntepriseDB Corporation > The Postgres Database Company > > > ------------------------------------------------------------------------------ > Try New Relic Now & We'll Send You this Cool Shirt > New Relic is the only SaaS-based application performance monitoring service > that delivers powerful full stack analytics. Optimize and monitor your > browser, app, & servers with just a few lines of code. Try New Relic > and get this awesome Nerd Life shirt! http://p.sf.net/sfu/newrelic_d2d_may > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > > |