From: Koichi S. <koi...@gm...> - 2013-10-04 02:28:07
Thanks Ashutosh for the quick and nice work. Please find my comments inline.

Regards;
---
Koichi Suzuki


2013/10/3 Ashutosh Bapat <ash...@en...>

> Hi All,
> I am working on supporting MVs in XC. Here are some notes I have written.
> Please provide your comments. I am leaning towards creating MVs at the
> coordinators.
>
> For any materialized view (called MV henceforth) implementation, we need
> to implement the following functionality:
> 1. Creation of MV (bare creation without population)
> 2. Population of MV
> 3. Querying MV
> 4. Dropping MV
> 5. Altering MV
> 6. Creating indexes on MV
>
> The following write-up discusses various approaches for implementing MVs
> in XC.
>
> 1. Create the materialized view only on the coordinators
>    While creating the materialized view, the CREATE MATERIALIZED VIEW
>    statement is turned into a CTAS query and the new table is created as
>    a materialized view table. In XC, CTAS is implemented by rewriting the
>    statement as CREATE TABLE ...; INSERT INTO ...; hence we will need to
>    stop the MV's CTAS from being rewritten.
>
>    A. Creation of MV
>       At the time of creation of the MV table, we will need to create the
>       table locally at the coordinator. The subsequent heap_insert is
>       expected to populate this local table automatically. By propagating
>       the CREATE MV statement to each of the coordinators (and not to the
>       datanode/s) we will get the same effect. More details below.
>       We will need to set the distribution info of the MV to NULL to
>       indicate that it's a local table.
>       Given that the catalog tables are created locally on each of the
>       coordinators, we should be able to CREATE the MV locally on the
>       coordinators.
>
>    B. Population of the materialized view (REFRESH MV command)
>       There are two approaches here.
>       i. The REFRESH MV command would be sent to each of the
>          coordinators, and each coordinator would populate its MV table
>          separately. This means that each of the coordinators would fire
>          the same query and get the same result, which is a waste of
>          resources. In a sane MV implementation we don't expect MVs to be
>          refreshed frequently, and thus having this wastage once in a
>          while would not be an overhead. Given that this approach needs
>          very little effort, it might be acceptable in version 1. The
>          planner code is sprinkled with !IsConnFromCoord() checks where
>          we do not create RemoteQueryPaths for remote relations. For
>          population of the MV, we need to lift this restriction. I am not
>          sure how to distinguish normal scans from scans for refreshing
>          an MV.

Especially in DDL handling, IsConnFromCoord() is widely used from place to
place for specific needs, so in the REFRESH MV command handling you can
control what to do in such a manner. utility.c contains such things.
Because master_pg93_merge is under way and utility.c may need some more
correction, it may be better to look at REL1_1_STABLE or master.
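Just to make the idea concrete, the gating in utility.c might look roughly
like the sketch below. The local ExecRefreshMatView() call is what vanilla
9.3 does for this statement; the ExecUtilityStmtOnNodes() argument list and
the EXEC_ON_COORDS routing are only my assumptions following the
surrounding DDL handling, so please take it as a sketch, not tested code.

        case T_RefreshMatViewStmt:
            /* Refresh the local copy, as in vanilla PostgreSQL 9.3 */
            ExecRefreshMatView((RefreshMatViewStmt *) parsetree,
                               queryString, params, completionTag);
#ifdef PGXC
            /*
             * Approach i.: every coordinator keeps its own local copy of
             * the MV and refreshes it by running the view's query itself.
             * Only the coordinator that received the command from the
             * client forwards it to the other coordinators; a coordinator
             * reached from another coordinator refreshes its local copy
             * only.
             */
            if (IS_PGXC_COORDINATOR && !IsConnFromCoord())
                ExecUtilityStmtOnNodes(queryString, NULL, sentToRemote,
                                       false, EXEC_ON_COORDS, false);
#endif
            break;

With approach i. the order of local refresh versus forwarding should not
matter much, since every coordinator runs the view's query independently
anyway.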
>       ii. REFRESH MV on the coordinator where the REFRESH command is
>           issued, and send the data using COPY or some bulk protocol to
>           the other coordinators. This approach needs some extra effort
>           for constructing the COPY command and the corresponding data
>           file to be sent over to the other coordinators. Also, it needs
>           some changes to REFRESH MATERIALIZED VIEW handling so that such
>           data can be sent along with it. We may try this approach in
>           version 2.

I agree on this.

>    C. Querying MV
>       Nothing more needs to be done here. In most places we have treated
>       tables with no distribution info as local tables, hence this won't
>       be difficult. There might be corner cases; testing will be required
>       for such corner cases.

OK. I think this is a very important step toward coordinator/datanode
integration.

>    D. ALTER MV
>       I don't think there is any change required here. We will need to
>       test it to make sure that things are working as expected. We should
>       propagate the ALTER MV command to all the coordinators and not to
>       the datanodes.

I'm afraid the native code for ALTER MV does not allow it to run inside 2PC
or a transaction block, as with ALTER TYPE ... ADD. If so, we should apply
this check only at the initiating node, as done in utility.c and
typecmds.c, such as:

#ifdef PGXC
    /*
     * We disallow this in transaction blocks, because we can't cope
     * with enum OID values getting into indexes and then having their
     * defining pg_enum entries go away.
     */
    /* Allow this to be run inside transaction block on remote nodes */
    if (IS_PGXC_COORDINATOR && !IsConnFromCoord())
#endif
        PreventTransactionChain(isTopLevel, "ALTER TYPE ... ADD");

>    E. Creating indexes on MV
>       Given that we can create indexes on the catalog tables, it should
>       be possible to create indexes on the MV tables as well. Testing
>       will need to be done to make sure that this works. The CREATE INDEX
>       command will need to be propagated to all the coordinators and not
>       to the datanodes.

Yes, we need this change.
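For E., the routing could be keyed off the relation's relkind. A rough
sketch of what the IndexStmt handling in utility.c might do is below, under
the same assumptions about ExecUtilityStmtOnNodes() and EXEC_ON_COORDS as
above (i.e. not verified against the actual propagation code):

#ifdef PGXC
        /* Inside the T_IndexStmt handling, before propagating remotely */
        {
            IndexStmt  *stmt = (IndexStmt *) parsetree;
            Oid         relid = RangeVarGetRelid(stmt->relation, NoLock,
                                                 false);
            RemoteQueryExecType exec_type = EXEC_ON_DATANODES;

            /*
             * An index on a materialized view has to live where the MV
             * data lives.  With approach 1 the data is only on the
             * coordinators, so send CREATE INDEX to the other
             * coordinators and skip the datanodes.
             */
            if (get_rel_relkind(relid) == RELKIND_MATVIEW)
                exec_type = EXEC_ON_COORDS;

            if (IS_PGXC_COORDINATOR && !IsConnFromCoord())
                ExecUtilityStmtOnNodes(queryString, NULL, sentToRemote,
                                       false, exec_type, false);
        }
#endif

DROP INDEX and DROP MATERIALIZED VIEW would presumably need the same
relkind-based routing so that the drop reaches the same set of nodes.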
> Materialized views are a query-caching technique. Usually a query would
> be served by a single materialized view or by several of them. So, having
> them created on the coordinator/s saves one trip to the datanodes. This
> will also allow us to take a step towards the coordinator - datanode
> merge.

At present, this means we cannot ship joins with an MVIEW. I hope it is
acceptable as the first implementation.

> 2. Create the MV definition on both the coordinators and the datanodes
>    but keep the data on the datanode/s
>    While creating the materialized view, the CREATE MATERIALIZED VIEW
>    statement is turned into a CTAS query and the new table is created as
>    a materialized view table. In XC, CTAS is implemented by rewriting the
>    statement as CREATE TABLE ...; INSERT INTO .... We need to filter MVs
>    out of this rewriting.
>
>    A. Creating MV
>       Pass the MV definition to the datanodes and all the coordinators.
>       This would be done in two phases. First we create just the
>       definition on all the nodes. In the second phase, the MV table
>       created on the datanodes will be populated using the method below.
>
>    B. Populating an MV
>       We need to devise a special INSERT INTO protocol which would run
>       the associated query on the initiating coordinator and insert the
>       results into the MV table at the datanodes. REFRESH MV would be
>       turned into INSERT INTO at the initiating coordinator, and each
>       other coordinator would get the REFRESH MV command so that the MV
>       would be marked scannable without causing the query to be fired.
>
>    C. Querying MV
>       Like normal tables, we will have to create a corresponding
>       REMOTE_QUERY_PATH for MV relations.
>
>    D. ALTER MV
>       Passing down ALTER MV to the datanodes should work, but it needs
>       some testing to make sure that ALTER MV is working fine.

Please see the comment above.

>    E. Creating indexes on MV
>       This looks straightforward, since we can create indexes on the
>       datanodes. Needs some testing.

OK.

>    If we create MVs on the datanodes and do not allow a distribution to
>    be specified, those MVs need to be replicated by default (since they
>    are stable by definition), as against the default distribution by
>    hash. Given that materialized views are a query-caching technique, we
>    expect that a query would be entirely served by one materialized view
>    or by several of them. In such a case, having materialized views on
>    the datanode/s would create unnecessary hops from the coordinator to
>    the datanodes.
>
> Distributing MV
> ---------------
> If we need to create a materialized view only on selected nodes
> (coordinators/datanodes), we will need to
> a. extend the CREATE/ALTER MV syntax to specify the nodes where the MV
>    is to be created, and
> b. disallow any REFRESH on the nodes where the MV does not exist.
> We may take up this extension in version 2.

I think it's okay as the first implementation. There are two options
internally: i) maintain the MVIEW in pgxc_class or another catalog, with
full distribution information, or ii) not have any pgxc-specific catalog
entries and hard-code the version-1-specific assumption. The choice could
be determined by the resources required. Anyway, we will need option i) to
control the distribution in the next improvement.

> What happens to an MV when a node is added or dropped?

Ouch! Yes, this is a very important point to consider. Assuming option ii)
above, because all the catalogs are copied when a node is added, we can
just REFRESH it; when nodes are dropped, maybe we don't have to do
anything. If we take option i) above, we need to provide ALTER MVIEW ...
ADD NODE and ALTER MVIEW ... DELETE NODE to maintain the distribution, and
we should redistribute the MVIEW to the new set of nodes. This is similar
to the ALTER TABLE approach.

> We need to test. Having the MV at the coordinator helps here, since the
> addition or removal of a datanode doesn't affect the MV. While adding a
> coordinator, the catalog replication itself would create the MV. Dropping
> a node wouldn't be a problem. We need to test the feature.
>
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company