From: Mason S. <ms...@tr...> - 2014-06-02 09:26:18
On Thu, May 29, 2014 at 7:27 AM, Alan Bryant <ala...@hd...> wrote:

> > > - Can we shard on schema? (i.e. put some schemas in one location and
> > > other schemas in different locations, sharded the same way rows are?) Or if
> > > not, can we shard one table, and have functions execute on the same node as
> > > some controlling row?
> >
> > No.
>
> Interesting. This leaves us with the "one big table" approach, just
> spread across the data nodes. Aren't there performance issues with having
> massive individual tables? What techniques exist to help with this?

The individual tables can be replicated or sharded, for example by a hash
on one of the columns in the table.

A parent-child type of relation can be sharded on the primary key of the
parent and on the foreign key referencing that table in the child. The
planner recognizes that this data will co-exist on the same node and will
"push down" the joins.

Other tables can be replicated, with an exact copy on each node if
desired. Here joins can always be pushed down. If a query only involves
replicated tables, it will just be sent to one single node. If a
coordinator and data node are on the same server, one can configure the
system to prefer to get this data from the local data node.

> Partitioning is one option I guess, using a different (but compatible)
> scheme than the sharding key, perhaps a different bitmask on the same key.

At the moment users cannot specify an arbitrary expression like a bitmask.
A modulo option exists, however. I think this is an item that should be
relatively high priority on the to-do list.

> Another question is upgrade. We use schemas primarily to make logical
> upgrades possible while online, but what about Postgres-XC itself... what
> is the upgrade model for Postgres-XC? (say, between major PostgreSQL
> versions)

This is another area that can be improved, leveraging pg_upgrade. For now,
dump and restore...

> Thanks!
> -Alan

Regards,

--
Mason Sharp

TransLattice - http://www.translattice.com
Distributed and Clustered Database Solutions
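
P.S. To illustrate the co-location point above (parent sharded on its primary key, child sharded on the matching foreign key), here is a toy sketch in Python. It is only a model of the idea, not how Postgres-XC is implemented: the modulo placement in `node_for`, the node count, and the `customers`/`orders` sample tables are all assumptions made up for the example. It shows that when both tables are placed by the same key, joining separately on each node yields the same rows as one global join, which is why such a join can be pushed down.

```python
from collections import defaultdict

NODE_COUNT = 3  # hypothetical number of data nodes


def node_for(key: int, node_count: int = NODE_COUNT) -> int:
    """Toy modulo placement; NOT Postgres-XC's actual distribution function."""
    return key % node_count


def distribute(rows, key_index, node_count=NODE_COUNT):
    """Place each row on the node chosen by the value at key_index."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row[key_index], node_count)].append(row)
    return nodes


def local_joins(parents_by_node, children_by_node, node_count=NODE_COUNT):
    """Join parent.id = child.parent_id separately on every node."""
    joined = []
    for node in range(node_count):
        parents = {p[0]: p for p in parents_by_node[node]}
        for child in children_by_node[node]:
            parent = parents.get(child[1])
            if parent is not None:
                joined.append((parent, child))
    return joined


# Hypothetical sample data: customers(id, name), orders(id, customer_id, total)
customers = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
orders = [(10, 1, 5.0), (11, 4, 7.5), (12, 2, 1.0), (13, 4, 2.5)]

customers_by_node = distribute(customers, key_index=0)  # shard on primary key
orders_by_node = distribute(orders, key_index=1)        # shard on foreign key

# Per-node joins versus a single join over all rows.
pushed_down = local_joins(customers_by_node, orders_by_node)
global_join = [(c, o) for c in customers for o in orders if c[0] == o[1]]
```

If `orders` were instead distributed on its own `id` (`key_index=0`), the per-node joins would miss matches whose parent row lives elsewhere, and rows would have to be moved between nodes to complete the join.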