Re: [Postgres-xc-general] XC Performance with Subquery

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

It might help, if you can send us the EXPLAIN VERBOSE output of this query.

On Thu, Jun 6, 2013 at 2:38 AM, Matt Warner <MW...@xi...> wrote:

> Just to follow up on this, the pre-1.2 version works much better—the query
> actually completes and I do see more CPU resources being used, but
> curiously, not all of them.****
>
> ** **
>
> Still trying to figure out why that would be, so if anyone has any
> suggestions, please let me know.****
>
> ** **
>
> Matt****
>
> ** **
>
> ** **
>
> *From:* Matt Warner
> *Sent:* Wednesday, June 05, 2013 8:52 AM
> *To:* 'Ashutosh Bapat'
> *Cc:* Pos...@li...
> *Subject:* RE: [Postgres-xc-general] XC Performance with Subquery****
>
> ** **
>
> I’m using 1.0.3, but will try out pre-1.2.****
>
> ** **
>
> BTW, I had to make some minor changes to get 1.0.3 to compile correctly on
> Solaris. Is anyone interested in receiving these changes? They’re things
> such as illegal return statement from void functions (which doesn’t
> actually make sense, as far as I know) that the Solaris compiler flags as
> errors.****
>
> ** **
>
> -bash-4.1$ psql****
>
> psql (PGXC 1.0.3, based on PG 9.1.9)****
>
> Type "help" for help.****
>
> ** **
>
> postgres=# explain select count(*) from accn a1 where exists (select null
> from accn_proc a2 where a2.fk_accn_id=a1.pk_accn_id and a2.fk_sta_id=52);*
> ***
>
>                                   QUERY
> PLAN                                  ****
>
>
> ------------------------------------------------------------------------------
> ****
>
> Aggregate  (cost=0.02..0.03 rows=1 width=0)****
>
>    ->  Nested Loop Semi Join  (cost=0.00..0.01 rows=1 width=0)****
>
>          Join Filter: ((a1.pk_accn_id)::text = (a2.fk_accn_id)::text)****
>
>          ->  Data Node Scan on a1  (cost=0.00..0.00 rows=1000 width=98)***
> *
>
>                Node/s: node1, node2, node3, node4, node5, node6, node7,
> node8****
>
>          ->  Data Node Scan on a2  (cost=0.00..0.00 rows=1000 width=98)***
> *
>
>                Node/s: node1, node2, node3, node4, node5, node6, node7,
> node8****
>
> (7 rows)****
>
> ** **
>
> create table accn(pk_accn_id character varying(40),****
>
> <lots of other column definitions deleted for brevity>)****
>
> distribute by hash(pk_accn_id)****
>
> to node node1, node2, node3, node4, node5, node6, node7,node8;****
>
> ** **
>
> create table accn(pk_accn_id character varying(40),****
>
> <lots of other column definitions deleted for brevity>)****
>
> distribute by hash(fk_accn_id)****
>
> to node node1, node2, node3, node4, node5, node6, node7,node8;****
>
> ** **
>
> ** **
>
> *From:* Ashutosh Bapat [mailto:ash...@en...<ash...@en...>]
>
> *Sent:* Tuesday, June 04, 2013 9:15 PM
> *To:* Matt Warner
> *Cc:* Pos...@li...
> *Subject:* Re: [Postgres-xc-general] XC Performance with Subquery****
>
> ** **
>
> Hi Matt,****
>
> Which version of XC are you using? There has been a lot of change in the
> planner since last release. You may try the latest master HEAD (to be
> released as 1.2 in about a month).****
>
> It will help if you can provide all the table definitions and EXPLAIN
> outputs.****
>
> ** **
>
> On Wed, Jun 5, 2013 at 5:40 AM, Matt Warner <MW...@xi...> wrote:****
>
> I need to correct item 3, below. The coordinator and only one of the data
> nodes goes to work. One by one, each of the data nodes appears to spin up
> to process the request and then go back to sleep.****
>
>  ****
>
> ?****
>
>  ****
>
> *From:* Matt Warner
> *Sent:* Tuesday, June 04, 2013 5:00 PM
> *To:* 'Pos...@li...'
> *Subject:* XC Performance with Subquery****
>
>  ****
>
> I’ve been experimenting with XC and see interesting results. I’m hoping
> someone can help explain something I’m seeing.****
>
>  ****
>
> 1.       I created two distributed tables, one with a primary key, one
> with a foreign key, and hashed both tables by that key. I’m expecting this
> to mean that the data for a given key is localized to a single node.****
>
> 2.       When I perform a simple “select count(*) from table1” I see all
> 8 data nodes consuming CPU (plus the coordinator), which I take to be a
> good sign—all nodes are working in parallel.****
>
> 3.       When I perform a join on the distribution key, I see only the
> coordinator go to work instead of all 8 data nodes.****
>
> 4.       I notice that the explain plan appears similar to page 55 of
> this document (
> http://www.pgcon.org/2012/schedule/attachments/224_Postgres-XC_tutorial.pdf
> ).****
>
> 5.       I have indexes on the distribution keys, but that does not seem
> to make any difference.****
>
>  ****
>
> How do I get XC to perform the join on the data nodes? To be verbose, I am
> expecting to see more CPU resources consumed in this query:****
>
>  ****
>
> select count(*) from tablea a1 where exists (select null from tableb a2
> where a2.fk_accn_id=a1.pk_accn_id and a2.fk_sta_id=52);****
>
>  ****
>
> Rewriting this as a simple join does not seem to work any better.****
>
>  ****
>
> What am I missing?****
>
>  ****
>
> TIA,****
>
>  ****
>
> Matt****
>
>
>
> ------------------------------------------------------------------------------
> How ServiceNow helps IT people transform IT departments:
> 1. A cloud service to automate IT design, transition and operations
> 2. Dashboards that offer high-level views of enterprise services
> 3. A single system of record for all IT processes
> http://p.sf.net/sfu/servicenow-d2d-j
> _______________________________________________
> Postgres-xc-general mailing list
> Pos...@li...
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-general****
>
>
>
>
> -- ****
>
> Best Wishes,
> Ashutosh Bapat
> EntepriseDB Corporation
> The Postgres Database Company****
>

-- 
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Postgres Database Company