Re: [Postgres-xc-developers] Regarding bug id 3607975

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

Can you please explain what this fix does? It would help to have an
elaborate explanation with code snippets.

On Sun, May 26, 2013 at 10:18 PM, Abbas Butt <abb...@en...>wrote:

>
>
> On Fri, May 24, 2013 at 7:04 PM, Ashutosh Bapat <
> ash...@en...> wrote:
>
>>
>>
>>
>> On Fri, May 24, 2013 at 9:01 AM, Abbas Butt <abb...@en...>wrote:
>>
>>>
>>>
>>> On Fri, May 24, 2013 at 7:22 AM, Ashutosh Bapat <
>>> ash...@en...> wrote:
>>>
>>>>
>>>>
>>>>
>>>> On Thu, May 23, 2013 at 9:21 PM, Abbas Butt <
>>>> abb...@en...> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> While working on test case plancache it was brought up as a review
>>>>> comment that solving bug id 3607975 should solve the problem of the test
>>>>> case.
>>>>> However there is some confusion in the statement of bug id 3607975.
>>>>>
>>>>> "When a user does and PREPARE and then EXECUTEs multiple times, the
>>>>> coordinator keeps on preparing and executing the query on datanode al
>>>>> times, as against preparing once and executing multiple times. This is
>>>>> because somehow the remote query is being prepared as an unnamed statement."
>>>>>
>>>>> Consider this test case
>>>>>
>>>>> A. create table abc(a int, b int);
>>>>> B. insert into abc values(11, 22);
>>>>> C. prepare p1 as select * from abc;
>>>>> D. execute p1;
>>>>> E. execute p1;
>>>>> F. execute p1;
>>>>>
>>>>> Here are the confusions
>>>>>
>>>>> 1. The coordinator never prepares on datanode in response to a prepare
>>>>> issued by a user.
>>>>>      In fact step C does nothing on the datanodes.
>>>>>      Step D simply sends "SELECT a, b FROM abc" to all datanodes.
>>>>>
>>>>> 2. In step D, ExecuteQuery calls BuildCachedPlan to build a new
>>>>> generic plan,
>>>>>     and steps E and F use the already built generic plan.
>>>>>     For details see function GetCachedPlan.
>>>>>     This means that executing a prepared statement again and again
>>>>> does use cached plans
>>>>>     and does not prepare again and again every time we issue an
>>>>> execute.
>>>>>
>>>>>
>>>> The problem is not here. The problem is in do_query() where somehow the
>>>> name of prepared statement gets wiped out and we keep on preparing unnamed
>>>> statements at the datanode.
>>>>
>>>
>>> We never prepare any named/unnamed statements on the datanode. I spent
>>> time looking at the code written in do_query and functions called from with
>>> in do_query to handle prepared statements but the code written in
>>> pgxc_start_command_on_connection to handle statements prepared on datanodes
>>> is dead as of now. It is never called during the complete regression run.
>>> The function ActivateDatanodeStatementOnNode is never called. The way
>>> prepared statements are being handled now is the same as I described
>>> earlier in the mail chain with the help of an example.
>>> The code that is dead was originally added by Mason through commit
>>> d6d2d3d925f571b0b58ff6b4f6504d88e96bb342, back in December 2010. This code
>>> has been changed a lot over the last two years. This commit does not
>>> contain any test cases so I am not sure how did it use to work back then.
>>>
>>>
>>
>> This code wasn't dead, when I worked on prepared statements. So,
>> something has gone wrong in-between. That's what we need to find out and
>> fix. Not preparing statements on the datanode is not good for performance
>> either.
>>
>
> I was able to find the reason why the code was dead and the attached patch
> (WIP) fixes the problem. This would now ensure that statements are prepared
> on datanodes whenever required. However there is a problem in the way
> prepared statements are handled. The problem is that unless a prepared
> statement is executed it is never prepared on datanodes, hence changing the
> path before executing the statement gives us incorrect results. For Example
>
>   create schema s1 create table abc (f1 int) distribute by replication;
>   create schema s2 create table abc (f1 int) distribute by replication;
>
>   insert into s1.abc values(123);
>   insert into s2.abc values(456);
>   set search_path = s2;
>   prepare p1 as select f1 from abc;
>   set search_path = s1;
>   execute p1;
>
> The last execute results in 123, where as it should have resulted in 456.
> I can finalize the attached patch by fixing any regression issues that may
> result and that would fix 3607975 and improve performance however the above
> test case would still fail.
>
>
>>
>>
>>>
>>>>
>>>>> My conclusion is that the bug ID 3607975 is not reproducible.
>>>>>
>>>>>
>>>> Did you verify it under the debugger? If that would have been the case,
>>>> we would not have seen this problem if search_path changed in between steps
>>>> D and E.
>>>>
>>>
>>> If search path is changed between steps D & E, the problem occurs
>>> because when the remote query node is created, schema qualification is not
>>> added in the sql statement to be sent to the datanode, but changes in
>>> search path do get communicated to the datanode. The sql statement is built
>>> when execute is issued for the first time and is reused on subsequent
>>> executes. The datanode is totally unaware that the select that it just
>>> received is due to an execute of a prepared statement that was prepared
>>> when search path was some thing else.
>>>
>>>
>> Fixing the prepared statements the way I suggested, would fix the
>> problem, since the statement will get prepared at the datanode, with the
>> same search path settings, as it would on the coordinator.
>>
>>
>>>
>>>
>>>>
>>>>
>>>>>  Comments are welcome.
>>>>>
>>>>> --
>>>>> *Abbas*
>>>>>  Architect
>>>>>
>>>>> Ph: 92.334.5100153
>>>>>  Skype ID: gabbasb
>>>>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/>
>>>>> *
>>>>> Follow us on Twitter*
>>>>> @EnterpriseDB
>>>>>
>>>>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community>
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>> Try New Relic Now & We'll Send You this Cool Shirt
>>>>> New Relic is the only SaaS-based application performance monitoring
>>>>> service
>>>>> that delivers powerful full stack analytics. Optimize and monitor your
>>>>> browser, app, & servers with just a few lines of code. Try New Relic
>>>>> and get this awesome Nerd Life shirt!
>>>>> http://p.sf.net/sfu/newrelic_d2d_may
>>>>> _______________________________________________
>>>>> Postgres-xc-developers mailing list
>>>>> Pos...@li...
>>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best Wishes,
>>>> Ashutosh Bapat
>>>> EntepriseDB Corporation
>>>> The Postgres Database Company
>>>>
>>>
>>>
>>>
>>> --
>>> --
>>> *Abbas*
>>>  Architect
>>>
>>> Ph: 92.334.5100153
>>>  Skype ID: gabbasb
>>> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/>
>>> *
>>> Follow us on Twitter*
>>> @EnterpriseDB
>>>
>>> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community>
>>>
>>
>>
>>
>> --
>> Best Wishes,
>> Ashutosh Bapat
>> EntepriseDB Corporation
>> The Postgres Database Company
>>
>
>
>
> --
> --
> *Abbas*
>  Architect
>
> Ph: 92.334.5100153
>  Skype ID: gabbasb
> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/>
> *
> Follow us on Twitter*
> @EnterpriseDB
>
> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community>
>

-- 
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Postgres Database Company