Re: [Postgres-xc-general] the cluster cost for normalized tables


Thank you for your response.  Can I just give a simple schema example and query resulting from it and see if it would suffer in a cluster solution using the primary keys, which are system generated names (GUIDs)?

Table ORG (ORG_HANDLE VARCHAR(50) NOT NULL, ORG_NAME VARCHAR2(150) NOT NULL);
primary key: ORG_HANDLE

Table POC (POC_HANDLE VARCHAR(50), FIRST_NAME VARCHAR(50) NOT NULL, LAST_NAME VARCHAR(50) NOT NULL);
primary key: POC_HANDLE

Table ORG_POC_LINK (ORG_HANDLE VARCHAR(50) NOT NULL, POC_FUNCTION VARCHAR(2) NOT NULL, POC_HANDLE VARCHAR(50) NOT NULL);
primary key: ORG_HANDLE,POC_FUNCTION,POC_HANDLE

Query:
select POC.FIRST_NAME, POC.LAST_NAME, ORG.ORG_HANDLE, OPL.POC_FUNCTION FROM POC POC, ORG ORG, ORG_POC_LINK OPL
WHERE ORG.ORG_NAME = 'whatever' and ORG.ORG_HANDLE = OPL.ORG_HANDLE and OPL.POC_HANDLE = POC.POC_HANDLE
________________________________
From: Ashutosh Bapat [ash...@en...]
Sent: Thursday, April 19, 2012 8:05 AM
To: Michael Vitale
Cc: pos...@li...; pos...@li...
Subject: Re: [Postgres-xc-general] the cluster cost for normalized tables

Hi Michael,
The distribution of data depends upon the distribution strategy used. In Postgres-XC, we distribute data based on the hash/modulo of the given column. It's usually advisable to choose the same distribution for the tables which have equi-joins on their distribution columns.

Choosing the right distribution for the tables involved is an art. We need to know the table definitions and the set of queries to decide the exact distribution. If the queries join on collocated data, that is, if the matching rows of the joined tables live on the same node, performance is greatly improved.
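
As a rough illustration of the collocation point, here is a minimal Python sketch that simulates hash/modulo placement of rows across nodes. Everything in it is an assumption made for illustration: the node count, the sample rows, and the use of crc32 as a stand-in hash (it is not the hash function Postgres-XC uses internally). It only shows why distributing ORG and ORG_POC_LINK on the same column, ORG_HANDLE, keeps the equi-join on that column local to one node.

```python
import zlib

NODES = 4  # hypothetical number of datanodes

# Hypothetical sample rows shaped like the ORG and ORG_POC_LINK tables.
ORG_ROWS = [("ORG-1", "whatever"), ("ORG-2", "other")]
LINK_ROWS = [
    ("ORG-1", "AD", "POC-7"),
    ("ORG-1", "TE", "POC-9"),
    ("ORG-2", "AD", "POC-7"),
]


def node_for(key: str, nodes: int = NODES) -> int:
    """Pick a node by hash-modulo. crc32 is a stand-in hash, chosen only
    because it is deterministic across runs."""
    return zlib.crc32(key.encode("utf-8")) % nodes


def place(rows, key_index):
    """Group rows by the node their distribution column hashes to."""
    placement = {}
    for row in rows:
        placement.setdefault(node_for(row[key_index]), []).append(row)
    return placement


def join_is_local(link_rows, link_key_index):
    """True if every link row sits on the same node as its ORG row.

    ORG is assumed to be distributed on ORG_HANDLE (column 0 of the link
    row), while the link table is distributed on column link_key_index.
    """
    return all(
        node_for(link[link_key_index]) == node_for(link[0])
        for link in link_rows
    )


if __name__ == "__main__":
    print("ORG placement:", place(ORG_ROWS, 0))
    print("LINK placement by ORG_HANDLE:", place(LINK_ROWS, 0))
    # Same distribution column on both sides: always local.
    print("join local (link on ORG_HANDLE):", join_is_local(LINK_ROWS, 0))
    # Different distribution column: local only if the hashes happen to agree.
    print("join local (link on POC_HANDLE):", join_is_local(LINK_ROWS, 2))
```

When both tables are placed by ORG_HANDLE, the check is true by construction, since each side hashes the same value. When the link table is placed by POC_HANDLE instead, matching rows generally land on different nodes and the join needs data to move between them.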

On Thu, Apr 19, 2012 at 4:56 PM, Michael Vitale <mic...@ar...> wrote:
Hi you most honorable cluster folks!

Our company is moving from Oracle to PostgreSQL.  We initially thought we would be moving to MySQL Cluster, but an investigation of how clustering works in MySQL Cluster revealed that performance would suffer substantially since it is predicated on keys that segregate SQL-requested data to specific nodes and not to all or most of the nodes.  A highly normalized database would suffer in this situation where a result set would normally consist of rows gathered from most, if not all, of the back-end nodes.

Do you all have the same problem with Clustered PostgreSQL (Postgres-XC)?

Respectfully Yours,

Michael Vitale
ARIN DBA
mic...@ar...
703-227-9885


--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company