From: Ashutosh B. <ash...@en...> - 2013-05-28 14:17:22
|
Hi Abbas, I think the fix is on the right track. There are couple of improvements that we need to do here (but you may not do those if the time doesn't permit). 1. We should have a status in RemoteQuery node, as to whether the query in the node should use extended protocol or not, rather than relying on the presence of statement name and parameters etc. Amit has already added a status with that effect. We need to leverage it. On Tue, May 28, 2013 at 9:04 AM, Abbas Butt <abb...@en...>wrote: > The patch fixes the dead code issue, that I described earlier. The code > was dead because of two issues: > > 1. The function CompleteCachedPlan was wrongly setting stmt_name to NULL > and this was the main reason ActivateDatanodeStatementOnNode was not being > called in the function pgxc_start_command_on_connection. > 2. The function SetRemoteStatementName was wrongly assuming that a > prepared statement must have some parameters. > > Fixing these two issues makes sure that the function > ActivateDatanodeStatementOnNode is now called and statements get prepared > on the datanode. > This patch would fix bug 3607975. It would however not fix the test case I > described in my previous email because of reasons I described. > > > On Tue, May 28, 2013 at 5:50 PM, Ashutosh Bapat < > ash...@en...> wrote: > >> Can you please explain what this fix does? It would help to have an >> elaborate explanation with code snippets. >> >> >> On Sun, May 26, 2013 at 10:18 PM, Abbas Butt <abb...@en... >> > wrote: >> >>> >>> >>> On Fri, May 24, 2013 at 7:04 PM, Ashutosh Bapat < >>> ash...@en...> wrote: >>> >>>> >>>> >>>> >>>> On Fri, May 24, 2013 at 9:01 AM, Abbas Butt < >>>> abb...@en...> wrote: >>>> >>>>> >>>>> >>>>> On Fri, May 24, 2013 at 7:22 AM, Ashutosh Bapat < >>>>> ash...@en...> wrote: >>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Thu, May 23, 2013 at 9:21 PM, Abbas Butt < >>>>>> abb...@en...> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> While working on test case plancache it was brought up as a review >>>>>>> comment that solving bug id 3607975 should solve the problem of the test >>>>>>> case. >>>>>>> However there is some confusion in the statement of bug id 3607975. >>>>>>> >>>>>>> "When a user does and PREPARE and then EXECUTEs multiple times, the >>>>>>> coordinator keeps on preparing and executing the query on datanode al >>>>>>> times, as against preparing once and executing multiple times. This is >>>>>>> because somehow the remote query is being prepared as an unnamed statement." >>>>>>> >>>>>>> Consider this test case >>>>>>> >>>>>>> A. create table abc(a int, b int); >>>>>>> B. insert into abc values(11, 22); >>>>>>> C. prepare p1 as select * from abc; >>>>>>> D. execute p1; >>>>>>> E. execute p1; >>>>>>> F. execute p1; >>>>>>> >>>>>>> Here are the confusions >>>>>>> >>>>>>> 1. The coordinator never prepares on datanode in response to a >>>>>>> prepare issued by a user. >>>>>>> In fact step C does nothing on the datanodes. >>>>>>> Step D simply sends "SELECT a, b FROM abc" to all datanodes. >>>>>>> >>>>>>> 2. In step D, ExecuteQuery calls BuildCachedPlan to build a new >>>>>>> generic plan, >>>>>>> and steps E and F use the already built generic plan. >>>>>>> For details see function GetCachedPlan. >>>>>>> This means that executing a prepared statement again and again >>>>>>> does use cached plans >>>>>>> and does not prepare again and again every time we issue an >>>>>>> execute. >>>>>>> >>>>>>> >>>>>> The problem is not here. The problem is in do_query() where somehow >>>>>> the name of prepared statement gets wiped out and we keep on preparing >>>>>> unnamed statements at the datanode. >>>>>> >>>>> >>>>> We never prepare any named/unnamed statements on the datanode. I spent >>>>> time looking at the code written in do_query and functions called from with >>>>> in do_query to handle prepared statements but the code written in >>>>> pgxc_start_command_on_connection to handle statements prepared on datanodes >>>>> is dead as of now. It is never called during the complete regression run. >>>>> The function ActivateDatanodeStatementOnNode is never called. The way >>>>> prepared statements are being handled now is the same as I described >>>>> earlier in the mail chain with the help of an example. >>>>> The code that is dead was originally added by Mason through commit >>>>> d6d2d3d925f571b0b58ff6b4f6504d88e96bb342, back in December 2010. This code >>>>> has been changed a lot over the last two years. This commit does not >>>>> contain any test cases so I am not sure how did it use to work back then. >>>>> >>>>> >>>> >>>> This code wasn't dead, when I worked on prepared statements. So, >>>> something has gone wrong in-between. That's what we need to find out and >>>> fix. Not preparing statements on the datanode is not good for performance >>>> either. >>>> >>> >>> I was able to find the reason why the code was dead and the attached >>> patch (WIP) fixes the problem. This would now ensure that statements are >>> prepared on datanodes whenever required. However there is a problem in the >>> way prepared statements are handled. The problem is that unless a prepared >>> statement is executed it is never prepared on datanodes, hence changing the >>> path before executing the statement gives us incorrect results. For Example >>> >>> create schema s1 create table abc (f1 int) distribute by replication; >>> create schema s2 create table abc (f1 int) distribute by replication; >>> >>> insert into s1.abc values(123); >>> insert into s2.abc values(456); >>> set search_path = s2; >>> prepare p1 as select f1 from abc; >>> set search_path = s1; >>> execute p1; >>> >>> The last execute results in 123, where as it should have resulted in 456. >>> I can finalize the attached patch by fixing any regression issues that >>> may result and that would fix 3607975 and improve performance however the >>> above test case would still fail. >>> >>> >>>> >>>> >>>>> >>>>>> >>>>>>> My conclusion is that the bug ID 3607975 is not reproducible. >>>>>>> >>>>>>> >>>>>> Did you verify it under the debugger? If that would have been the >>>>>> case, we would not have seen this problem if search_path changed in between >>>>>> steps D and E. >>>>>> >>>>> >>>>> If search path is changed between steps D & E, the problem occurs >>>>> because when the remote query node is created, schema qualification is not >>>>> added in the sql statement to be sent to the datanode, but changes in >>>>> search path do get communicated to the datanode. The sql statement is built >>>>> when execute is issued for the first time and is reused on subsequent >>>>> executes. The datanode is totally unaware that the select that it just >>>>> received is due to an execute of a prepared statement that was prepared >>>>> when search path was some thing else. >>>>> >>>>> >>>> Fixing the prepared statements the way I suggested, would fix the >>>> problem, since the statement will get prepared at the datanode, with the >>>> same search path settings, as it would on the coordinator. >>>> >>>> >>>>> >>>>> >>>>>> >>>>>> >>>>>>> Comments are welcome. >>>>>>> >>>>>>> -- >>>>>>> *Abbas* >>>>>>> Architect >>>>>>> >>>>>>> Ph: 92.334.5100153 >>>>>>> Skype ID: gabbasb >>>>>>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >>>>>>> * >>>>>>> Follow us on Twitter* >>>>>>> @EnterpriseDB >>>>>>> >>>>>>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >>>>>>> >>>>>>> >>>>>>> ------------------------------------------------------------------------------ >>>>>>> Try New Relic Now & We'll Send You this Cool Shirt >>>>>>> New Relic is the only SaaS-based application performance monitoring >>>>>>> service >>>>>>> that delivers powerful full stack analytics. Optimize and monitor >>>>>>> your >>>>>>> browser, app, & servers with just a few lines of code. Try New Relic >>>>>>> and get this awesome Nerd Life shirt! >>>>>>> http://p.sf.net/sfu/newrelic_d2d_may >>>>>>> _______________________________________________ >>>>>>> Postgres-xc-developers mailing list >>>>>>> Pos...@li... >>>>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Best Wishes, >>>>>> Ashutosh Bapat >>>>>> EntepriseDB Corporation >>>>>> The Postgres Database Company >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> -- >>>>> *Abbas* >>>>> Architect >>>>> >>>>> Ph: 92.334.5100153 >>>>> Skype ID: gabbasb >>>>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >>>>> * >>>>> Follow us on Twitter* >>>>> @EnterpriseDB >>>>> >>>>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >>>>> >>>> >>>> >>>> >>>> -- >>>> Best Wishes, >>>> Ashutosh Bapat >>>> EntepriseDB Corporation >>>> The Postgres Database Company >>>> >>> >>> >>> >>> -- >>> -- >>> *Abbas* >>> Architect >>> >>> Ph: 92.334.5100153 >>> Skype ID: gabbasb >>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >>> * >>> Follow us on Twitter* >>> @EnterpriseDB >>> >>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >>> >> >> >> >> -- >> Best Wishes, >> Ashutosh Bapat >> EntepriseDB Corporation >> The Postgres Database Company >> > > > > -- > -- > *Abbas* > Architect > > Ph: 92.334.5100153 > Skype ID: gabbasb > www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> > * > Follow us on Twitter* > @EnterpriseDB > > Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> > -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Postgres Database Company |