From: Abbas B. <abb...@en...> - 2013-06-24 02:43:00
|
Hi, As discussed in the last F2F meeting, here is an updated patch that provides schema qualification of the following objects: Tables, Views, Functions, Types and Domains in case of remote queries. Sequence functions are never concerned with datanodes hence, schema qualification is not required in case of sequences. This solves plancache test case failure issue and does not introduce any more failures. I have also attached some tests with results to aid in review. Comments are welcome. Regards On Mon, Jun 10, 2013 at 5:31 PM, Abbas Butt <abb...@en...>wrote: > Hi, > Attached please find a WIP patch that provides the functionality of > preparing the statement at the datanodes as soon as it is prepared > on the coordinator. > This is to take care of a test case in plancache that makes sure that > change of search_path is ignored by replans. > While the patch fixes this replan test case and the regression works fine > there are still these two problems I have to take care of. > > 1. This test case fails > > CREATE TABLE xc_alter_table_3 (a int, b varchar(10)) DISTRIBUTE BY > HASH(a); > INSERT INTO xc_alter_table_3 VALUES (1, 'a'); > PREPARE d3 AS DELETE FROM xc_alter_table_3 WHERE a = $1; -- fails > > test=# explain verbose DELETE FROM xc_alter_table_3 WHERE a = 1; > QUERY PLAN > ------------------------------------------------------------------- > Delete on public.xc_alter_table_3 (cost=0.00..0.00 rows=1000 > width=14) > Node/s: data_node_1, data_node_2, data_node_3, data_node_4 > Remote query: DELETE FROM ONLY xc_alter_table_3 WHERE > ((xc_alter_table_3.ctid = $1) AND > (xc_alter_table_3.xc_node_id = $2)) > -> Data Node Scan on xc_alter_table_3 "_REMOTE_TABLE_QUERY_" > (cost=0.00..0.00 rows=1000 width=14) > Output: xc_alter_table_3.a, xc_alter_table_3.ctid, > xc_alter_table_3.xc_node_id > Node/s: data_node_3 > Remote query: SELECT a, ctid, xc_node_id FROM ONLY > xc_alter_table_3 WHERE (a = 1) > (7 rows) > > The reason of the failure is that the select query is selecting 3 > items, the first of which is an int, > whereas the delete query is comparing $1 with a ctid. > I am not sure how this works without prepare, but it fails when used > with prepare. > > The reason of this planning is this section of code in function > pgxc_build_dml_statement > else if (cmdtype == CMD_DELETE) > { > /* > * Since there is no data to update, the first param is going to be > * ctid. > */ > ctid_param_num = 1; > } > > Amit/Ashutosh can you suggest a fix for this problem? > There are a number of possibilities. > a) The select should not have selected column a. > b) The DELETE should have referred to $2 and $3 for ctid and xc_node_id > respectively. > c) Since the query works without PREPARE, we should make PREPARE work > the same way. > > > 2. This test case in plancache fails. > > -- Try it with a view, which isn't directly used in the resulting plan > -- but should trigger invalidation anyway > create table tab33 (a int, b int); > insert into tab33 values(1,2); > CREATE VIEW v_tab33 AS SELECT * FROM tab33; > PREPARE vprep AS SELECT * FROM v_tab33; > EXECUTE vprep; > CREATE OR REPLACE VIEW v_tab33 AS SELECT a, b/2 AS q2 FROM tab33; > -- does not cause plan invalidation because views are never created on > datanodes > EXECUTE vprep; > > and the reason of the failure is that views are never created on the > datanodes hence plan invalidation is not triggered. > This can be documented as an XC limitation. > > 3. I still have to add comments in the patch and some ifdefs may be > missing too. > > > In addition to the patch I have also attached some example Java programs > that test the some basic functionality through JDBC. I found that these > programs are working fine after my patch. > > 1. Prepared.java : Issues parameterized delete, insert and update through > JDBC. These are un-named prepared statements and works fine. > 2. NamedPrepared.java : Issues two named prepared statements through JDBC > and works fine. > 3. Retrieve.java : Runs a simple select to verify results. > The comments on top of the files explain their usage. > > Comments are welcome. > > Thanks > Regards > > > > On Mon, Jun 3, 2013 at 10:54 AM, Ashutosh Bapat < > ash...@en...> wrote: > >> >> >> >> On Mon, Jun 3, 2013 at 10:51 AM, Abbas Butt <abb...@en...>wrote: >> >>> >>> >>> On Mon, Jun 3, 2013 at 8:43 AM, Ashutosh Bapat < >>> ash...@en...> wrote: >>> >>>> >>>> >>>> >>>> On Mon, Jun 3, 2013 at 7:40 AM, Abbas Butt <abb...@en... >>>> > wrote: >>>> >>>>> Attached please find updated patch to fix the bug. The patch takes >>>>> care of the bug and the regression issues resulting from the changes done >>>>> in the patch. Please note that the issue in test case plancache still >>>>> stands unsolved because of the following test case (simplified but taken >>>>> from plancache.sql) >>>>> >>>>> create schema s1 create table abc (f1 int); >>>>> create schema s2 create table abc (f1 int); >>>>> >>>>> >>>>> insert into s1.abc values(123); >>>>> insert into s2.abc values(456); >>>>> >>>>> set search_path = s1; >>>>> >>>>> prepare p1 as select f1 from abc; >>>>> execute p1; -- works fine, results in 123 >>>>> >>>>> set search_path = s2; >>>>> execute p1; -- works fine after the patch, results in 123 >>>>> >>>>> alter table s1.abc add column f2 float8; -- force replan >>>>> execute p1; -- fails >>>>> >>>>> >>>> Huh! The beast bit us. >>>> >>>> I think the right solution here is either of two >>>> 1. Take your previous patch to always use qualified names (but you need >>>> to improve it not to affect the view dumps) >>>> 2. Prepare the statements at the datanode at the time of prepare. >>>> >>>> >>>> Is this test added new in 9.2? >>>> >>> >>> No, it was added by commit 547b6e537aa8bbae83a8a4c4d0d7f216390bdb9c in >>> March 2007. >>> >>> >>>> Why didn't we see this issue the first time prepare was implemented? I >>>> don't remember (but it was two years back). >>>> >>> >>> I was unable to locate the exact reason but since statements were not >>> being prepared on datanodes due to a merge issue this issue just surfaced >>> up. >>> >>> >> >> Well, even though statements were not getting prepared (actually prepared >> statements were not being used again and again) on datanodes, we never >> prepared them on datanode at the time of preparing the statement. So, this >> bug should have shown itself long back. >> >> >>> >>>> >>>>> The last execute should result in 123, whereas it results in 456. The >>>>> reason is that the search path has already been changed at the datanode and >>>>> a replan would mean select from abc in s2. >>>>> >>>>> >>>>> >>>>> >>>>> On Tue, May 28, 2013 at 7:17 PM, Ashutosh Bapat < >>>>> ash...@en...> wrote: >>>>> >>>>>> Hi Abbas, >>>>>> I think the fix is on the right track. There are couple of >>>>>> improvements that we need to do here (but you may not do those if the time >>>>>> doesn't permit). >>>>>> >>>>>> 1. We should have a status in RemoteQuery node, as to whether the >>>>>> query in the node should use extended protocol or not, rather than relying >>>>>> on the presence of statement name and parameters etc. Amit has already >>>>>> added a status with that effect. We need to leverage it. >>>>>> >>>>>> >>>>>> On Tue, May 28, 2013 at 9:04 AM, Abbas Butt < >>>>>> abb...@en...> wrote: >>>>>> >>>>>>> The patch fixes the dead code issue, that I described earlier. The >>>>>>> code was dead because of two issues: >>>>>>> >>>>>>> 1. The function CompleteCachedPlan was wrongly setting stmt_name to >>>>>>> NULL and this was the main reason ActivateDatanodeStatementOnNode was not >>>>>>> being called in the function pgxc_start_command_on_connection. >>>>>>> 2. The function SetRemoteStatementName was wrongly assuming that a >>>>>>> prepared statement must have some parameters. >>>>>>> >>>>>>> Fixing these two issues makes sure that the function >>>>>>> ActivateDatanodeStatementOnNode is now called and statements get prepared >>>>>>> on the datanode. >>>>>>> This patch would fix bug 3607975. It would however not fix the test >>>>>>> case I described in my previous email because of reasons I described. >>>>>>> >>>>>>> >>>>>>> On Tue, May 28, 2013 at 5:50 PM, Ashutosh Bapat < >>>>>>> ash...@en...> wrote: >>>>>>> >>>>>>>> Can you please explain what this fix does? It would help to have an >>>>>>>> elaborate explanation with code snippets. >>>>>>>> >>>>>>>> >>>>>>>> On Sun, May 26, 2013 at 10:18 PM, Abbas Butt < >>>>>>>> abb...@en...> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Fri, May 24, 2013 at 7:04 PM, Ashutosh Bapat < >>>>>>>>> ash...@en...> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Fri, May 24, 2013 at 9:01 AM, Abbas Butt < >>>>>>>>>> abb...@en...> wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Fri, May 24, 2013 at 7:22 AM, Ashutosh Bapat < >>>>>>>>>>> ash...@en...> wrote: >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Thu, May 23, 2013 at 9:21 PM, Abbas Butt < >>>>>>>>>>>> abb...@en...> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi, >>>>>>>>>>>>> >>>>>>>>>>>>> While working on test case plancache it was brought up as a >>>>>>>>>>>>> review comment that solving bug id 3607975 should solve the problem of the >>>>>>>>>>>>> test case. >>>>>>>>>>>>> However there is some confusion in the statement of bug id >>>>>>>>>>>>> 3607975. >>>>>>>>>>>>> >>>>>>>>>>>>> "When a user does and PREPARE and then EXECUTEs multiple >>>>>>>>>>>>> times, the coordinator keeps on preparing and executing the query on >>>>>>>>>>>>> datanode al times, as against preparing once and executing multiple times. >>>>>>>>>>>>> This is because somehow the remote query is being prepared as an unnamed >>>>>>>>>>>>> statement." >>>>>>>>>>>>> >>>>>>>>>>>>> Consider this test case >>>>>>>>>>>>> >>>>>>>>>>>>> A. create table abc(a int, b int); >>>>>>>>>>>>> B. insert into abc values(11, 22); >>>>>>>>>>>>> C. prepare p1 as select * from abc; >>>>>>>>>>>>> D. execute p1; >>>>>>>>>>>>> E. execute p1; >>>>>>>>>>>>> F. execute p1; >>>>>>>>>>>>> >>>>>>>>>>>>> Here are the confusions >>>>>>>>>>>>> >>>>>>>>>>>>> 1. The coordinator never prepares on datanode in response to a >>>>>>>>>>>>> prepare issued by a user. >>>>>>>>>>>>> In fact step C does nothing on the datanodes. >>>>>>>>>>>>> Step D simply sends "SELECT a, b FROM abc" to all >>>>>>>>>>>>> datanodes. >>>>>>>>>>>>> >>>>>>>>>>>>> 2. In step D, ExecuteQuery calls BuildCachedPlan to build a >>>>>>>>>>>>> new generic plan, >>>>>>>>>>>>> and steps E and F use the already built generic plan. >>>>>>>>>>>>> For details see function GetCachedPlan. >>>>>>>>>>>>> This means that executing a prepared statement again and >>>>>>>>>>>>> again does use cached plans >>>>>>>>>>>>> and does not prepare again and again every time we issue >>>>>>>>>>>>> an execute. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> The problem is not here. The problem is in do_query() where >>>>>>>>>>>> somehow the name of prepared statement gets wiped out and we keep on >>>>>>>>>>>> preparing unnamed statements at the datanode. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> We never prepare any named/unnamed statements on the datanode. I >>>>>>>>>>> spent time looking at the code written in do_query and functions called >>>>>>>>>>> from with in do_query to handle prepared statements but the code written in >>>>>>>>>>> pgxc_start_command_on_connection to handle statements prepared on datanodes >>>>>>>>>>> is dead as of now. It is never called during the complete regression run. >>>>>>>>>>> The function ActivateDatanodeStatementOnNode is never called. The way >>>>>>>>>>> prepared statements are being handled now is the same as I described >>>>>>>>>>> earlier in the mail chain with the help of an example. >>>>>>>>>>> The code that is dead was originally added by Mason through >>>>>>>>>>> commit d6d2d3d925f571b0b58ff6b4f6504d88e96bb342, back in December 2010. >>>>>>>>>>> This code has been changed a lot over the last two years. This commit does >>>>>>>>>>> not contain any test cases so I am not sure how did it use to work back >>>>>>>>>>> then. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> This code wasn't dead, when I worked on prepared statements. So, >>>>>>>>>> something has gone wrong in-between. That's what we need to find out and >>>>>>>>>> fix. Not preparing statements on the datanode is not good for performance >>>>>>>>>> either. >>>>>>>>>> >>>>>>>>> >>>>>>>>> I was able to find the reason why the code was dead and the >>>>>>>>> attached patch (WIP) fixes the problem. This would now ensure that >>>>>>>>> statements are prepared on datanodes whenever required. However there is a >>>>>>>>> problem in the way prepared statements are handled. The problem is that >>>>>>>>> unless a prepared statement is executed it is never prepared on datanodes, >>>>>>>>> hence changing the path before executing the statement gives us incorrect >>>>>>>>> results. For Example >>>>>>>>> >>>>>>>>> create schema s1 create table abc (f1 int) distribute by >>>>>>>>> replication; >>>>>>>>> create schema s2 create table abc (f1 int) distribute by >>>>>>>>> replication; >>>>>>>>> >>>>>>>>> insert into s1.abc values(123); >>>>>>>>> insert into s2.abc values(456); >>>>>>>>> set search_path = s2; >>>>>>>>> prepare p1 as select f1 from abc; >>>>>>>>> set search_path = s1; >>>>>>>>> execute p1; >>>>>>>>> >>>>>>>>> The last execute results in 123, where as it should have resulted >>>>>>>>> in 456. >>>>>>>>> I can finalize the attached patch by fixing any regression issues >>>>>>>>> that may result and that would fix 3607975 and improve performance however >>>>>>>>> the above test case would still fail. >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> My conclusion is that the bug ID 3607975 is not reproducible. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> Did you verify it under the debugger? If that would have been >>>>>>>>>>>> the case, we would not have seen this problem if search_path changed in >>>>>>>>>>>> between steps D and E. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> If search path is changed between steps D & E, the problem >>>>>>>>>>> occurs because when the remote query node is created, schema qualification >>>>>>>>>>> is not added in the sql statement to be sent to the datanode, but changes >>>>>>>>>>> in search path do get communicated to the datanode. The sql statement is >>>>>>>>>>> built when execute is issued for the first time and is reused on subsequent >>>>>>>>>>> executes. The datanode is totally unaware that the select that it just >>>>>>>>>>> received is due to an execute of a prepared statement that was prepared >>>>>>>>>>> when search path was some thing else. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> Fixing the prepared statements the way I suggested, would fix the >>>>>>>>>> problem, since the statement will get prepared at the datanode, with the >>>>>>>>>> same search path settings, as it would on the coordinator. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Comments are welcome. >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> *Abbas* >>>>>>>>>>>>> Architect >>>>>>>>>>>>> >>>>>>>>>>>>> Ph: 92.334.5100153 >>>>>>>>>>>>> Skype ID: gabbasb >>>>>>>>>>>>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >>>>>>>>>>>>> * >>>>>>>>>>>>> Follow us on Twitter* >>>>>>>>>>>>> @EnterpriseDB >>>>>>>>>>>>> >>>>>>>>>>>>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> ------------------------------------------------------------------------------ >>>>>>>>>>>>> Try New Relic Now & We'll Send You this Cool Shirt >>>>>>>>>>>>> New Relic is the only SaaS-based application performance >>>>>>>>>>>>> monitoring service >>>>>>>>>>>>> that delivers powerful full stack analytics. Optimize and >>>>>>>>>>>>> monitor your >>>>>>>>>>>>> browser, app, & servers with just a few lines of code. Try New >>>>>>>>>>>>> Relic >>>>>>>>>>>>> and get this awesome Nerd Life shirt! >>>>>>>>>>>>> http://p.sf.net/sfu/newrelic_d2d_may >>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>> Postgres-xc-developers mailing list >>>>>>>>>>>>> Pos...@li... >>>>>>>>>>>>> >>>>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Best Wishes, >>>>>>>>>>>> Ashutosh Bapat >>>>>>>>>>>> EntepriseDB Corporation >>>>>>>>>>>> The Postgres Database Company >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> -- >>>>>>>>>>> *Abbas* >>>>>>>>>>> Architect >>>>>>>>>>> >>>>>>>>>>> Ph: 92.334.5100153 >>>>>>>>>>> Skype ID: gabbasb >>>>>>>>>>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >>>>>>>>>>> * >>>>>>>>>>> Follow us on Twitter* >>>>>>>>>>> @EnterpriseDB >>>>>>>>>>> >>>>>>>>>>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Best Wishes, >>>>>>>>>> Ashutosh Bapat >>>>>>>>>> EntepriseDB Corporation >>>>>>>>>> The Postgres Database Company >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> -- >>>>>>>>> *Abbas* >>>>>>>>> Architect >>>>>>>>> >>>>>>>>> Ph: 92.334.5100153 >>>>>>>>> Skype ID: gabbasb >>>>>>>>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >>>>>>>>> * >>>>>>>>> Follow us on Twitter* >>>>>>>>> @EnterpriseDB >>>>>>>>> >>>>>>>>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Best Wishes, >>>>>>>> Ashutosh Bapat >>>>>>>> EntepriseDB Corporation >>>>>>>> The Postgres Database Company >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> -- >>>>>>> *Abbas* >>>>>>> Architect >>>>>>> >>>>>>> Ph: 92.334.5100153 >>>>>>> Skype ID: gabbasb >>>>>>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >>>>>>> * >>>>>>> Follow us on Twitter* >>>>>>> @EnterpriseDB >>>>>>> >>>>>>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Best Wishes, >>>>>> Ashutosh Bapat >>>>>> EntepriseDB Corporation >>>>>> The Postgres Database Company >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> -- >>>>> *Abbas* >>>>> Architect >>>>> >>>>> Ph: 92.334.5100153 >>>>> Skype ID: gabbasb >>>>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >>>>> * >>>>> Follow us on Twitter* >>>>> @EnterpriseDB >>>>> >>>>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >>>>> >>>> >>>> >>>> >>>> -- >>>> Best Wishes, >>>> Ashutosh Bapat >>>> EntepriseDB Corporation >>>> The Postgres Database Company >>>> >>> >>> >>> >>> -- >>> -- >>> *Abbas* >>> Architect >>> >>> Ph: 92.334.5100153 >>> Skype ID: gabbasb >>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >>> * >>> Follow us on Twitter* >>> @EnterpriseDB >>> >>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >>> >> >> >> >> -- >> Best Wishes, >> Ashutosh Bapat >> EntepriseDB Corporation >> The Postgres Database Company >> > > > > -- > -- > *Abbas* > Architect > > Ph: 92.334.5100153 > Skype ID: gabbasb > www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> > * > Follow us on Twitter* > @EnterpriseDB > > Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> > -- -- *Abbas* Architect Ph: 92.334.5100153 Skype ID: gabbasb www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> * Follow us on Twitter* @EnterpriseDB Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> |