Menu

#403 PREPARE and EXECUTE do not work as expected

1.1 Dev Q
open
8
2013-06-11
2013-03-14
No

When a user does and PREPARE and then EXECUTEs multiple times, the coordinator keeps on preparing and executing the query on datanode al times, as against preparing once and executing multiple times. This is because somehow the remote query is being prepared as an unnamed statement.

Discussion

  • Koichi Suzuki

    Koichi Suzuki - 2013-06-11
    • assigned_to: amitdkhan --> gabbasb
    • priority: 5 --> 8
     
  • Koichi Suzuki

    Koichi Suzuki - 2013-06-11

    This is related to plancache regression test fix.

     
  • Abbas Butt

    Abbas Butt - 2013-06-27

    The problem is that the code in do_query to handle prepared statements is dead.
    The function ActivateDatanodeStatementOnNode is never called and hence we never
    prepare statements on datanode.

    The code was dead because of two issues:

    1. The function CompleteCachedPlan was wrongly setting stmt_name to NULL and this was the main reason ActivateDatanodeStatementOnNode was not being called in the function pgxc_start_command_on_connection.
    This was because of a merge issue, caused by c1dd6cb5fdea86bbddfb471b1da56bb54b604c45.

    @@ -318,9 +318,6 @@ CompleteCachedPlan(CachedPlanSource *plansource,
    plansource->parserSetupArg = parserSetupArg;
    plansource->cursor_options = cursor_options;
    plansource->fixed_result = fixed_result;
    -#ifdef PGXC
    - plansource->stmt_name = NULL;
    -#endif
    plansource->resultDesc = PlanCacheComputeResultDesc(querytree_list);

    2. The function SetRemoteStatementName was wrongly assuming that a prepared statement must have some parameters.

    @@ -491,10 +491,6 @@ SetRemoteStatementName(Plan *plan, const char *stmt_name, int num_params,
    if (!plan)
    return 0;

    - /* Leave if no parameters */
    - if (num_params == 0 || !param_types)
    - return 0;
    -
    if (IsA(plan, RemoteQuery))
    {

    Fixing these two issues makes sure that the function ActivateDatanodeStatementOnNode is now called and statements get prepared on the datanode.
    However trying JDBC programs crashes the server and for that we ned this small fix

    @@ -506,7 +502,7 @@ SetRemoteStatementName(Plan *plan, const char *stmt_name, int num_params,
    if (remotequery->rq_num_params != 0)
    return 0;

    - if (stmt_name)
    + if (stmt_name && stmt_name[0] != '\0')
    {
    strcpy(name, stmt_name);

    and after these fixes another issues surfaces up.

    Consider this test case

    (A) set enable_fast_query_shipping=false;
    (B) CREATE TABLE xc_alter_table_3 (a int, b varchar(10)) DISTRIBUTE BY HASH(a);
    (C) INSERT INTO xc_alter_table_3 VALUES (1, 'a');
    (D) PREPARE xc_alter_table_delete AS DELETE FROM xc_alter_table_3 WHERE a = $1;
    (E) EXECUTE xc_alter_table_delete(11);
    (F) ALTER TABLE xc_alter_table_3 DISTRIBUTE BY HASH(b);
    (G) EXECUTE xc_alter_table_delete(1);
    (H) SELECT b FROM xc_alter_table_3 WHERE a = 1;

    The execute in step (G) fails because of the following reason:
    Step (E) prepares the statement on datanodes.
    The select step of the delete looks like

    SELECT a, ctid, xc_node_id FROM ONLY public.xc_alter_table_3 WHERE (a = 11)

    Note that it should have been

    SELECT a, ctid, xc_node_id FROM ONLY public.xc_alter_table_3 WHERE (a = $1)

    The execute of step (G) sends only a bind to the datanode which is ignored by datanode
    since the select that was sent to it did not have any parameter. Hence the delete fails.

    To fix this issue we have to skip parameter substitution in eval_const_expressions_mutator
    using this

    @@ -2250,7 +2252,26 @@ eval_const_expressions_mutator(Node *node,
    case T_Param:
    {
    Param *param = (Param *) node;
    -
    +#ifdef PGXC
    + /*
    + * This parameter substitution should only happen on datanodes
    + * The reason is as follows:
    + * set enable_fast_query_shipping=false;
    + * CREATE TABLE tab_3 (a int, b varchar(10))
    + * DISTRIBUTE BY HASH(a);
    + * INSERT INTO tab_3 VALUES (1, 'a');
    + * PREPARE d3 AS DELETE FROM tab_3 WHERE a = $1;
    + * EXECUTE d3(1);
    + * The execute first sends a select to the datanode,
    + * that select should be
    + * SELECT a, ctid, xc_node_id FROM ONLY tab_3 WHERE (a = $1)
    + * If this substitution happens on coordinator it becomes
    + * SELECT a, ctid, xc_node_id FROM ONLY tab_3 WHERE (a = 1)
    + * which causes errors in subsequent use of the statement.
    + */
    + if (IS_PGXC_DATANODE)
    + {
    +#endif
    /* Look to see if we've been given a value for this Param */
    if (param->paramkind == PARAM_EXTERN &&
    context->boundParams != NULL &&

    Having fixed all these issues we are only left with one more issue to tackle.

    The problem is that unless a prepared statement is executed it is never prepared on datanodes,
    hence changing the path before executing the statement gives us incorrect results. For Example

    create schema s1 create table abc (f1 int) distribute by replication;
    create schema s2 create table abc (f1 int) distribute by replication;

    insert into s1.abc values(123);
    insert into s2.abc values(456);
    set search_path = s2;
    prepare p1 as select f1 from abc;
    set search_path = s1;
    execute p1;

    The last execute results in 123, where as it should have resulted in 456.

     

Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.