From: Ashutosh B. <ash...@en...> - 2013-05-28 12:50:39
|
Can you please explain what this fix does? It would help to have an elaborate explanation with code snippets. On Sun, May 26, 2013 at 10:18 PM, Abbas Butt <abb...@en...>wrote: > > > On Fri, May 24, 2013 at 7:04 PM, Ashutosh Bapat < > ash...@en...> wrote: > >> >> >> >> On Fri, May 24, 2013 at 9:01 AM, Abbas Butt <abb...@en...>wrote: >> >>> >>> >>> On Fri, May 24, 2013 at 7:22 AM, Ashutosh Bapat < >>> ash...@en...> wrote: >>> >>>> >>>> >>>> >>>> On Thu, May 23, 2013 at 9:21 PM, Abbas Butt < >>>> abb...@en...> wrote: >>>> >>>>> Hi, >>>>> >>>>> While working on test case plancache it was brought up as a review >>>>> comment that solving bug id 3607975 should solve the problem of the test >>>>> case. >>>>> However there is some confusion in the statement of bug id 3607975. >>>>> >>>>> "When a user does and PREPARE and then EXECUTEs multiple times, the >>>>> coordinator keeps on preparing and executing the query on datanode al >>>>> times, as against preparing once and executing multiple times. This is >>>>> because somehow the remote query is being prepared as an unnamed statement." >>>>> >>>>> Consider this test case >>>>> >>>>> A. create table abc(a int, b int); >>>>> B. insert into abc values(11, 22); >>>>> C. prepare p1 as select * from abc; >>>>> D. execute p1; >>>>> E. execute p1; >>>>> F. execute p1; >>>>> >>>>> Here are the confusions >>>>> >>>>> 1. The coordinator never prepares on datanode in response to a prepare >>>>> issued by a user. >>>>> In fact step C does nothing on the datanodes. >>>>> Step D simply sends "SELECT a, b FROM abc" to all datanodes. >>>>> >>>>> 2. In step D, ExecuteQuery calls BuildCachedPlan to build a new >>>>> generic plan, >>>>> and steps E and F use the already built generic plan. >>>>> For details see function GetCachedPlan. >>>>> This means that executing a prepared statement again and again >>>>> does use cached plans >>>>> and does not prepare again and again every time we issue an >>>>> execute. >>>>> >>>>> >>>> The problem is not here. The problem is in do_query() where somehow the >>>> name of prepared statement gets wiped out and we keep on preparing unnamed >>>> statements at the datanode. >>>> >>> >>> We never prepare any named/unnamed statements on the datanode. I spent >>> time looking at the code written in do_query and functions called from with >>> in do_query to handle prepared statements but the code written in >>> pgxc_start_command_on_connection to handle statements prepared on datanodes >>> is dead as of now. It is never called during the complete regression run. >>> The function ActivateDatanodeStatementOnNode is never called. The way >>> prepared statements are being handled now is the same as I described >>> earlier in the mail chain with the help of an example. >>> The code that is dead was originally added by Mason through commit >>> d6d2d3d925f571b0b58ff6b4f6504d88e96bb342, back in December 2010. This code >>> has been changed a lot over the last two years. This commit does not >>> contain any test cases so I am not sure how did it use to work back then. >>> >>> >> >> This code wasn't dead, when I worked on prepared statements. So, >> something has gone wrong in-between. That's what we need to find out and >> fix. Not preparing statements on the datanode is not good for performance >> either. >> > > I was able to find the reason why the code was dead and the attached patch > (WIP) fixes the problem. This would now ensure that statements are prepared > on datanodes whenever required. However there is a problem in the way > prepared statements are handled. The problem is that unless a prepared > statement is executed it is never prepared on datanodes, hence changing the > path before executing the statement gives us incorrect results. For Example > > create schema s1 create table abc (f1 int) distribute by replication; > create schema s2 create table abc (f1 int) distribute by replication; > > insert into s1.abc values(123); > insert into s2.abc values(456); > set search_path = s2; > prepare p1 as select f1 from abc; > set search_path = s1; > execute p1; > > The last execute results in 123, where as it should have resulted in 456. > I can finalize the attached patch by fixing any regression issues that may > result and that would fix 3607975 and improve performance however the above > test case would still fail. > > >> >> >>> >>>> >>>>> My conclusion is that the bug ID 3607975 is not reproducible. >>>>> >>>>> >>>> Did you verify it under the debugger? If that would have been the case, >>>> we would not have seen this problem if search_path changed in between steps >>>> D and E. >>>> >>> >>> If search path is changed between steps D & E, the problem occurs >>> because when the remote query node is created, schema qualification is not >>> added in the sql statement to be sent to the datanode, but changes in >>> search path do get communicated to the datanode. The sql statement is built >>> when execute is issued for the first time and is reused on subsequent >>> executes. The datanode is totally unaware that the select that it just >>> received is due to an execute of a prepared statement that was prepared >>> when search path was some thing else. >>> >>> >> Fixing the prepared statements the way I suggested, would fix the >> problem, since the statement will get prepared at the datanode, with the >> same search path settings, as it would on the coordinator. >> >> >>> >>> >>>> >>>> >>>>> Comments are welcome. >>>>> >>>>> -- >>>>> *Abbas* >>>>> Architect >>>>> >>>>> Ph: 92.334.5100153 >>>>> Skype ID: gabbasb >>>>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >>>>> * >>>>> Follow us on Twitter* >>>>> @EnterpriseDB >>>>> >>>>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >>>>> >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> Try New Relic Now & We'll Send You this Cool Shirt >>>>> New Relic is the only SaaS-based application performance monitoring >>>>> service >>>>> that delivers powerful full stack analytics. Optimize and monitor your >>>>> browser, app, & servers with just a few lines of code. Try New Relic >>>>> and get this awesome Nerd Life shirt! >>>>> http://p.sf.net/sfu/newrelic_d2d_may >>>>> _______________________________________________ >>>>> Postgres-xc-developers mailing list >>>>> Pos...@li... >>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>>>> >>>>> >>>> >>>> >>>> -- >>>> Best Wishes, >>>> Ashutosh Bapat >>>> EntepriseDB Corporation >>>> The Postgres Database Company >>>> >>> >>> >>> >>> -- >>> -- >>> *Abbas* >>> Architect >>> >>> Ph: 92.334.5100153 >>> Skype ID: gabbasb >>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >>> * >>> Follow us on Twitter* >>> @EnterpriseDB >>> >>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >>> >> >> >> >> -- >> Best Wishes, >> Ashutosh Bapat >> EntepriseDB Corporation >> The Postgres Database Company >> > > > > -- > -- > *Abbas* > Architect > > Ph: 92.334.5100153 > Skype ID: gabbasb > www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> > * > Follow us on Twitter* > @EnterpriseDB > > Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> > -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Postgres Database Company |