From: Abbas B. <abb...@en...> - 2013-06-10 12:31:57
|
Hi, Attached please find a WIP patch that provides the functionality of preparing the statement at the datanodes as soon as it is prepared on the coordinator. This is to take care of a test case in plancache that makes sure that change of search_path is ignored by replans. While the patch fixes this replan test case and the regression works fine there are still these two problems I have to take care of. 1. This test case fails CREATE TABLE xc_alter_table_3 (a int, b varchar(10)) DISTRIBUTE BY HASH(a); INSERT INTO xc_alter_table_3 VALUES (1, 'a'); PREPARE d3 AS DELETE FROM xc_alter_table_3 WHERE a = $1; -- fails test=# explain verbose DELETE FROM xc_alter_table_3 WHERE a = 1; QUERY PLAN ------------------------------------------------------------------- Delete on public.xc_alter_table_3 (cost=0.00..0.00 rows=1000 width=14) Node/s: data_node_1, data_node_2, data_node_3, data_node_4 Remote query: DELETE FROM ONLY xc_alter_table_3 WHERE ((xc_alter_table_3.ctid = $1) AND (xc_alter_table_3.xc_node_id = $2)) -> Data Node Scan on xc_alter_table_3 "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=1000 width=14) Output: xc_alter_table_3.a, xc_alter_table_3.ctid, xc_alter_table_3.xc_node_id Node/s: data_node_3 Remote query: SELECT a, ctid, xc_node_id FROM ONLY xc_alter_table_3 WHERE (a = 1) (7 rows) The reason of the failure is that the select query is selecting 3 items, the first of which is an int, whereas the delete query is comparing $1 with a ctid. I am not sure how this works without prepare, but it fails when used with prepare. The reason of this planning is this section of code in function pgxc_build_dml_statement else if (cmdtype == CMD_DELETE) { /* * Since there is no data to update, the first param is going to be * ctid. */ ctid_param_num = 1; } Amit/Ashutosh can you suggest a fix for this problem? There are a number of possibilities. a) The select should not have selected column a. b) The DELETE should have referred to $2 and $3 for ctid and xc_node_id respectively. c) Since the query works without PREPARE, we should make PREPARE work the same way. 2. This test case in plancache fails. -- Try it with a view, which isn't directly used in the resulting plan -- but should trigger invalidation anyway create table tab33 (a int, b int); insert into tab33 values(1,2); CREATE VIEW v_tab33 AS SELECT * FROM tab33; PREPARE vprep AS SELECT * FROM v_tab33; EXECUTE vprep; CREATE OR REPLACE VIEW v_tab33 AS SELECT a, b/2 AS q2 FROM tab33; -- does not cause plan invalidation because views are never created on datanodes EXECUTE vprep; and the reason of the failure is that views are never created on the datanodes hence plan invalidation is not triggered. This can be documented as an XC limitation. 3. I still have to add comments in the patch and some ifdefs may be missing too. In addition to the patch I have also attached some example Java programs that test the some basic functionality through JDBC. I found that these programs are working fine after my patch. 1. Prepared.java : Issues parameterized delete, insert and update through JDBC. These are un-named prepared statements and works fine. 2. NamedPrepared.java : Issues two named prepared statements through JDBC and works fine. 3. Retrieve.java : Runs a simple select to verify results. The comments on top of the files explain their usage. Comments are welcome. Thanks Regards On Mon, Jun 3, 2013 at 10:54 AM, Ashutosh Bapat < ash...@en...> wrote: > > > > On Mon, Jun 3, 2013 at 10:51 AM, Abbas Butt <abb...@en...>wrote: > >> >> >> On Mon, Jun 3, 2013 at 8:43 AM, Ashutosh Bapat < >> ash...@en...> wrote: >> >>> >>> >>> >>> On Mon, Jun 3, 2013 at 7:40 AM, Abbas Butt <abb...@en...>wrote: >>> >>>> Attached please find updated patch to fix the bug. The patch takes care >>>> of the bug and the regression issues resulting from the changes done in the >>>> patch. Please note that the issue in test case plancache still stands >>>> unsolved because of the following test case (simplified but taken from >>>> plancache.sql) >>>> >>>> create schema s1 create table abc (f1 int); >>>> create schema s2 create table abc (f1 int); >>>> >>>> >>>> insert into s1.abc values(123); >>>> insert into s2.abc values(456); >>>> >>>> set search_path = s1; >>>> >>>> prepare p1 as select f1 from abc; >>>> execute p1; -- works fine, results in 123 >>>> >>>> set search_path = s2; >>>> execute p1; -- works fine after the patch, results in 123 >>>> >>>> alter table s1.abc add column f2 float8; -- force replan >>>> execute p1; -- fails >>>> >>>> >>> Huh! The beast bit us. >>> >>> I think the right solution here is either of two >>> 1. Take your previous patch to always use qualified names (but you need >>> to improve it not to affect the view dumps) >>> 2. Prepare the statements at the datanode at the time of prepare. >>> >>> >>> Is this test added new in 9.2? >>> >> >> No, it was added by commit 547b6e537aa8bbae83a8a4c4d0d7f216390bdb9c in >> March 2007. >> >> >>> Why didn't we see this issue the first time prepare was implemented? I >>> don't remember (but it was two years back). >>> >> >> I was unable to locate the exact reason but since statements were not >> being prepared on datanodes due to a merge issue this issue just surfaced >> up. >> >> > > Well, even though statements were not getting prepared (actually prepared > statements were not being used again and again) on datanodes, we never > prepared them on datanode at the time of preparing the statement. So, this > bug should have shown itself long back. > > >> >>> >>>> The last execute should result in 123, whereas it results in 456. The >>>> reason is that the search path has already been changed at the datanode and >>>> a replan would mean select from abc in s2. >>>> >>>> >>>> >>>> >>>> On Tue, May 28, 2013 at 7:17 PM, Ashutosh Bapat < >>>> ash...@en...> wrote: >>>> >>>>> Hi Abbas, >>>>> I think the fix is on the right track. There are couple of >>>>> improvements that we need to do here (but you may not do those if the time >>>>> doesn't permit). >>>>> >>>>> 1. We should have a status in RemoteQuery node, as to whether the >>>>> query in the node should use extended protocol or not, rather than relying >>>>> on the presence of statement name and parameters etc. Amit has already >>>>> added a status with that effect. We need to leverage it. >>>>> >>>>> >>>>> On Tue, May 28, 2013 at 9:04 AM, Abbas Butt < >>>>> abb...@en...> wrote: >>>>> >>>>>> The patch fixes the dead code issue, that I described earlier. The >>>>>> code was dead because of two issues: >>>>>> >>>>>> 1. The function CompleteCachedPlan was wrongly setting stmt_name to >>>>>> NULL and this was the main reason ActivateDatanodeStatementOnNode was not >>>>>> being called in the function pgxc_start_command_on_connection. >>>>>> 2. The function SetRemoteStatementName was wrongly assuming that a >>>>>> prepared statement must have some parameters. >>>>>> >>>>>> Fixing these two issues makes sure that the function >>>>>> ActivateDatanodeStatementOnNode is now called and statements get prepared >>>>>> on the datanode. >>>>>> This patch would fix bug 3607975. It would however not fix the test >>>>>> case I described in my previous email because of reasons I described. >>>>>> >>>>>> >>>>>> On Tue, May 28, 2013 at 5:50 PM, Ashutosh Bapat < >>>>>> ash...@en...> wrote: >>>>>> >>>>>>> Can you please explain what this fix does? It would help to have an >>>>>>> elaborate explanation with code snippets. >>>>>>> >>>>>>> >>>>>>> On Sun, May 26, 2013 at 10:18 PM, Abbas Butt < >>>>>>> abb...@en...> wrote: >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Fri, May 24, 2013 at 7:04 PM, Ashutosh Bapat < >>>>>>>> ash...@en...> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Fri, May 24, 2013 at 9:01 AM, Abbas Butt < >>>>>>>>> abb...@en...> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Fri, May 24, 2013 at 7:22 AM, Ashutosh Bapat < >>>>>>>>>> ash...@en...> wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, May 23, 2013 at 9:21 PM, Abbas Butt < >>>>>>>>>>> abb...@en...> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>> While working on test case plancache it was brought up as a >>>>>>>>>>>> review comment that solving bug id 3607975 should solve the problem of the >>>>>>>>>>>> test case. >>>>>>>>>>>> However there is some confusion in the statement of bug id >>>>>>>>>>>> 3607975. >>>>>>>>>>>> >>>>>>>>>>>> "When a user does and PREPARE and then EXECUTEs multiple times, >>>>>>>>>>>> the coordinator keeps on preparing and executing the query on datanode al >>>>>>>>>>>> times, as against preparing once and executing multiple times. This is >>>>>>>>>>>> because somehow the remote query is being prepared as an unnamed statement." >>>>>>>>>>>> >>>>>>>>>>>> Consider this test case >>>>>>>>>>>> >>>>>>>>>>>> A. create table abc(a int, b int); >>>>>>>>>>>> B. insert into abc values(11, 22); >>>>>>>>>>>> C. prepare p1 as select * from abc; >>>>>>>>>>>> D. execute p1; >>>>>>>>>>>> E. execute p1; >>>>>>>>>>>> F. execute p1; >>>>>>>>>>>> >>>>>>>>>>>> Here are the confusions >>>>>>>>>>>> >>>>>>>>>>>> 1. The coordinator never prepares on datanode in response to a >>>>>>>>>>>> prepare issued by a user. >>>>>>>>>>>> In fact step C does nothing on the datanodes. >>>>>>>>>>>> Step D simply sends "SELECT a, b FROM abc" to all >>>>>>>>>>>> datanodes. >>>>>>>>>>>> >>>>>>>>>>>> 2. In step D, ExecuteQuery calls BuildCachedPlan to build a new >>>>>>>>>>>> generic plan, >>>>>>>>>>>> and steps E and F use the already built generic plan. >>>>>>>>>>>> For details see function GetCachedPlan. >>>>>>>>>>>> This means that executing a prepared statement again and >>>>>>>>>>>> again does use cached plans >>>>>>>>>>>> and does not prepare again and again every time we issue an >>>>>>>>>>>> execute. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> The problem is not here. The problem is in do_query() where >>>>>>>>>>> somehow the name of prepared statement gets wiped out and we keep on >>>>>>>>>>> preparing unnamed statements at the datanode. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> We never prepare any named/unnamed statements on the datanode. I >>>>>>>>>> spent time looking at the code written in do_query and functions called >>>>>>>>>> from with in do_query to handle prepared statements but the code written in >>>>>>>>>> pgxc_start_command_on_connection to handle statements prepared on datanodes >>>>>>>>>> is dead as of now. It is never called during the complete regression run. >>>>>>>>>> The function ActivateDatanodeStatementOnNode is never called. The way >>>>>>>>>> prepared statements are being handled now is the same as I described >>>>>>>>>> earlier in the mail chain with the help of an example. >>>>>>>>>> The code that is dead was originally added by Mason through >>>>>>>>>> commit d6d2d3d925f571b0b58ff6b4f6504d88e96bb342, back in December 2010. >>>>>>>>>> This code has been changed a lot over the last two years. This commit does >>>>>>>>>> not contain any test cases so I am not sure how did it use to work back >>>>>>>>>> then. >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> This code wasn't dead, when I worked on prepared statements. So, >>>>>>>>> something has gone wrong in-between. That's what we need to find out and >>>>>>>>> fix. Not preparing statements on the datanode is not good for performance >>>>>>>>> either. >>>>>>>>> >>>>>>>> >>>>>>>> I was able to find the reason why the code was dead and the >>>>>>>> attached patch (WIP) fixes the problem. This would now ensure that >>>>>>>> statements are prepared on datanodes whenever required. However there is a >>>>>>>> problem in the way prepared statements are handled. The problem is that >>>>>>>> unless a prepared statement is executed it is never prepared on datanodes, >>>>>>>> hence changing the path before executing the statement gives us incorrect >>>>>>>> results. For Example >>>>>>>> >>>>>>>> create schema s1 create table abc (f1 int) distribute by >>>>>>>> replication; >>>>>>>> create schema s2 create table abc (f1 int) distribute by >>>>>>>> replication; >>>>>>>> >>>>>>>> insert into s1.abc values(123); >>>>>>>> insert into s2.abc values(456); >>>>>>>> set search_path = s2; >>>>>>>> prepare p1 as select f1 from abc; >>>>>>>> set search_path = s1; >>>>>>>> execute p1; >>>>>>>> >>>>>>>> The last execute results in 123, where as it should have resulted >>>>>>>> in 456. >>>>>>>> I can finalize the attached patch by fixing any regression issues >>>>>>>> that may result and that would fix 3607975 and improve performance however >>>>>>>> the above test case would still fail. >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> My conclusion is that the bug ID 3607975 is not reproducible. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> Did you verify it under the debugger? If that would have been >>>>>>>>>>> the case, we would not have seen this problem if search_path changed in >>>>>>>>>>> between steps D and E. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> If search path is changed between steps D & E, the problem occurs >>>>>>>>>> because when the remote query node is created, schema qualification is not >>>>>>>>>> added in the sql statement to be sent to the datanode, but changes in >>>>>>>>>> search path do get communicated to the datanode. The sql statement is built >>>>>>>>>> when execute is issued for the first time and is reused on subsequent >>>>>>>>>> executes. The datanode is totally unaware that the select that it just >>>>>>>>>> received is due to an execute of a prepared statement that was prepared >>>>>>>>>> when search path was some thing else. >>>>>>>>>> >>>>>>>>>> >>>>>>>>> Fixing the prepared statements the way I suggested, would fix the >>>>>>>>> problem, since the statement will get prepared at the datanode, with the >>>>>>>>> same search path settings, as it would on the coordinator. >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Comments are welcome. >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> *Abbas* >>>>>>>>>>>> Architect >>>>>>>>>>>> >>>>>>>>>>>> Ph: 92.334.5100153 >>>>>>>>>>>> Skype ID: gabbasb >>>>>>>>>>>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >>>>>>>>>>>> * >>>>>>>>>>>> Follow us on Twitter* >>>>>>>>>>>> @EnterpriseDB >>>>>>>>>>>> >>>>>>>>>>>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> ------------------------------------------------------------------------------ >>>>>>>>>>>> Try New Relic Now & We'll Send You this Cool Shirt >>>>>>>>>>>> New Relic is the only SaaS-based application performance >>>>>>>>>>>> monitoring service >>>>>>>>>>>> that delivers powerful full stack analytics. Optimize and >>>>>>>>>>>> monitor your >>>>>>>>>>>> browser, app, & servers with just a few lines of code. Try New >>>>>>>>>>>> Relic >>>>>>>>>>>> and get this awesome Nerd Life shirt! >>>>>>>>>>>> http://p.sf.net/sfu/newrelic_d2d_may >>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>> Postgres-xc-developers mailing list >>>>>>>>>>>> Pos...@li... >>>>>>>>>>>> >>>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Best Wishes, >>>>>>>>>>> Ashutosh Bapat >>>>>>>>>>> EntepriseDB Corporation >>>>>>>>>>> The Postgres Database Company >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> -- >>>>>>>>>> *Abbas* >>>>>>>>>> Architect >>>>>>>>>> >>>>>>>>>> Ph: 92.334.5100153 >>>>>>>>>> Skype ID: gabbasb >>>>>>>>>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >>>>>>>>>> * >>>>>>>>>> Follow us on Twitter* >>>>>>>>>> @EnterpriseDB >>>>>>>>>> >>>>>>>>>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Best Wishes, >>>>>>>>> Ashutosh Bapat >>>>>>>>> EntepriseDB Corporation >>>>>>>>> The Postgres Database Company >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> -- >>>>>>>> *Abbas* >>>>>>>> Architect >>>>>>>> >>>>>>>> Ph: 92.334.5100153 >>>>>>>> Skype ID: gabbasb >>>>>>>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >>>>>>>> * >>>>>>>> Follow us on Twitter* >>>>>>>> @EnterpriseDB >>>>>>>> >>>>>>>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Best Wishes, >>>>>>> Ashutosh Bapat >>>>>>> EntepriseDB Corporation >>>>>>> The Postgres Database Company >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> -- >>>>>> *Abbas* >>>>>> Architect >>>>>> >>>>>> Ph: 92.334.5100153 >>>>>> Skype ID: gabbasb >>>>>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >>>>>> * >>>>>> Follow us on Twitter* >>>>>> @EnterpriseDB >>>>>> >>>>>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Best Wishes, >>>>> Ashutosh Bapat >>>>> EntepriseDB Corporation >>>>> The Postgres Database Company >>>>> >>>> >>>> >>>> >>>> -- >>>> -- >>>> *Abbas* >>>> Architect >>>> >>>> Ph: 92.334.5100153 >>>> Skype ID: gabbasb >>>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >>>> * >>>> Follow us on Twitter* >>>> @EnterpriseDB >>>> >>>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >>>> >>> >>> >>> >>> -- >>> Best Wishes, >>> Ashutosh Bapat >>> EntepriseDB Corporation >>> The Postgres Database Company >>> >> >> >> >> -- >> -- >> *Abbas* >> Architect >> >> Ph: 92.334.5100153 >> Skype ID: gabbasb >> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >> * >> Follow us on Twitter* >> @EnterpriseDB >> >> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >> > > > > -- > Best Wishes, > Ashutosh Bapat > EntepriseDB Corporation > The Postgres Database Company > -- -- *Abbas* Architect Ph: 92.334.5100153 Skype ID: gabbasb www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> * Follow us on Twitter* @EnterpriseDB Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> |