Re: [Postgres-xc-developers] Patch for fetching only required columns in case of DMLs - Bug 3609665

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

On Mon, Apr 1, 2013 at 12:14 PM, Amit Khandekar <
ami...@en...> wrote:

> It will not block or affect the trigger support as far as correctness is
> concerned. But for triggers, we need to fetch the complete row, so we don't
> want to include both the OLD row and  all the unnecessary columns in the
> SELECT list.
>
>
If it's so, I think, we will leave it aside for now, complete minimal
trigger work and then get back to it. Is that fine?


> Currently the SELECT is like this :
> SELECT col1, col2, col3 .... , ctid, xc_node_id from tab1
> For triggers, it would be :
> SELECT col1, col2, col3 .... , , tab1.* ctid, xc_node_id from tab1
>
> What we want is :
> Without triggers:
> SELECT ctid, xc_node_id from tab1
> With triggers:
> SELECT ctid, xc_node_id, tab1.*  from tab1
>
>
>
>
> On 1 April 2013 10:49, Ashutosh Bapat <ash...@en...>wrote:
>
>> Hi Amit,
>> Does this have any relation with triggers, if so, can you please explain
>> this relation?
>>
>> On Mon, Apr 1, 2013 at 9:57 AM, Amit Khandekar <
>> ami...@en...> wrote:
>>
>>> Attached is a patch dml-targetlist.patch which prevents unnecessary
>>> columns from being fetched from the ModifyTable SELECT subplans (bug
>>> 3609665<https://sourceforge.net/tracker/?func=detail&aid=3609665&group_id=311227&atid=1310232>
>>> )
>>>
>>> -------
>>>
>>> In expand_targetlist(), currently Vars of all the table attributes that
>>> are absent are added into the target list.  The patch adds NULL constants
>>> instead of Var nodes, so that those attributes will not be fetched from the
>>> SELECT. We still want to keep NULL constants because we need the complete
>>> tuple. We don't want to change the PG way of having all the attributes of
>>> the NEW ROW tuple descriptor of the source plan slot. Also other PG things
>>> like constraint expressions and triggers assume this format, so it is
>>> convenient to not modify this design.
>>>
>>> Using NULL constants will cause the remote SELECT statement to
>>> completely exclude all the unnecessary columns. It does not even have NULLs.
>>>
>>>
>>> Below is how the new plan will look now:
>>>
>>> # explain verbose update newtab *set name5 = f(name5) where name4 =
>>> f(name4)*;
>>>  Update on public.newtab  (cost=0.00..250.00 rows=1000 width=234)
>>>    Node/s: data_node_1, data_node_2
>>>    Node expr: newtab.id
>>>    Remote query: UPDATE ONLY public.newtab SET name5 = $6 WHERE ctid =
>>> $7 AND xc_node_id = $9
>>>    ->  Data Node Scan on newtab "_REMOTE_TABLE_QUERY_"
>>> (cost=0.00..250.00 rows=1000 width=234)
>>>          Output: newtab.id, NULL::character varying, NULL::character
>>> varying, NULL::character varying, NULL::character varying, f(newtab.name5),
>>> newtab.ctid, newtab.*, newtab.xc_node_id
>>>          Node/s: data_node_1, data_node_2
>>>          Remote query: *SELECT id, name5, ctid, newtab.*::newtab,
>>> xc_node_id, name4 *FROM ONLY newtab WHERE true
>>>          Coordinator quals: ((newtab.name4)::text =
>>> (f(newtab.name4))::text)
>>>
>>>
>>> Notice the following:
>>>
>>> 1. Remote query now does not have any other columns namely name1,
>>> name2, name3.
>>>
>>> 2. The column id is still present. This is because it is a distribution
>>> column, and currently we use this column to decide the target node at
>>> execution time. Unfortunately I could not exclude the distribution column
>>> from the SELECT subplan target list. Doing this required nodeid-based
>>> target datanode determination for updates/deletes. Currently update/deletes
>>> use distribution column even if we fetch xc_node_id. I had almost come up
>>> with this nodeid-based implementation except that EXPLAIN did not correctly
>>> show the en_expr expression if that expression is a xc_node_id Var. Have
>>> attached another patch (working_nodeid_based.patch) that includes both :
>>> selecting reqd columns, plus node_id based target datanode determination.
>>> Due to the explain-of-en_expr issue, we need to first checkin
>>> dml-targetlist.patch, because that optimization is important for trigger
>>> implementation. The nodeid-based implementation does not compliment any of
>>> the changes in this patch.
>>>
>>> -------
>>>
>>> In rewriteTargetListUD(), we had added into the parse->targetlist the
>>> attributes need for quals; the RemoteQuery->base_tlist anyways does this
>>> job of getting the attributes added up in the base_tlist so it is not
>>> required to do this in rewrite. I have removed it.
>>>
>>> -------
>>>
>>> Since now there are only required columns in the SELECT subplan, we
>>> cannot fire check and not-null constraints on coordinator. For this, in the
>>> patch, we now include those columns that are referred in the constraint
>>> expressions. Also, we keep the not-null constraint check deferred for the
>>> datanode. We anyways need the final values for the constraints to get
>>> fired, and on datanode we know these are the final values. So in general
>>> when possible we should not fire constraints on coordinator. When all the
>>> constraints are shippable, they should be fired on datanode, else they
>>> should be fired on coordinator. In fact, we don't currently have a check
>>> whether the constraint is shippable, so this is a bug. Have added this
>>> constraint-shippablity check also.
>>>
>>> ------
>>>
>>> In the future, we could also consider optimization in the BIND data row.
>>> Currently it includes NULL values for parameters that are not used in the
>>> remote statement. We could just skip those parameters instead and thus
>>> reduce the data row to some extent. But this requires tweaking the
>>> parameter numbers ($1, $2 etc) generated in the deparsed statement.
>>>
>>> ------
>>>
>>>
>>> Tests
>>> ======
>>>
>>> I have used xc_constraints test to add new check-constraint related
>>> tests. Otherwise, the existing tests are more than enough to test this
>>> patch. See the other regression test changes in the patch to get an idea of
>>> how this patch affects the plan target list.
>>>
>>> In xc_FQS.sql, the target list of FQSed queries are printed. I suspect
>>> that the target list does not reflect correct columns currently. I suspect
>>> set_plan_references changes the varattnos but they do not end up  correctly
>>> reflecting the actual target list. The diff in the test file is because we
>>> have lesser columns, but the column names anyway do not show the correct
>>> columns.
>>>
>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Own the Future-Intel&reg; Level Up Game Demo Contest 2013
>>> Rise to greatness in Intel's independent game demo contest.
>>> Compete for recognition, cash, and the chance to get your game
>>> on Steam. $5K grand prize plus 10 genre and skill prizes.
>>> Submit your demo by 6/6/13. http://p.sf.net/sfu/intel_levelupd2d
>>> _______________________________________________
>>> Postgres-xc-developers mailing list
>>> Pos...@li...
>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>>
>>>
>>
>>
>> --
>> Best Wishes,
>> Ashutosh Bapat
>> EntepriseDB Corporation
>> The Enterprise Postgres Company
>>
>
>


-- 
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Enterprise Postgres Company