From: Ashutosh B. <ash...@en...> - 2013-10-08 08:55:48
It looks like having multiple connections to the datanodes from different
coordinators could be a problem if there are writes involved. I am checking
whether 1.B.ii below is feasible. I remember that Michael had implemented a
way to send rows back and forth between the coordinator and the datanode for
ALTER TABLE DISTRIBUTE BY. It might be possible to use the same protocol.
Michael, can you please point out the relevant code in as much detail as
possible?

> B. Population of materialized view (REFRESH MV command)
> There are two approaches here.
>
> i. The REFRESH MV command would be sent to each of the coordinators, and
> each coordinator would populate its MV table separately. This means that
> each of the coordinators would fire the same query and would get the same
> result, which is a waste of resources. In a sane MV implementation we
> won't expect MVs to be refreshed frequently, so having this wastage once
> in a while would not be an overhead. Given that this approach needs very
> little effort, it might be acceptable in version 1. The planner code is
> sprinkled with !IsConnFromCoord() checks where we do not create
> RemoteQueryPaths for remote relations. For population of the MV, we need
> to lift this restriction. I am not sure how to distinguish normal scans
> from scans for refreshing an MV.
>
> ii. Run REFRESH MV on the coordinator where the REFRESH command is issued
> and send the data to the other coordinators using COPY or some other bulk
> protocol. This approach needs some extra effort for constructing the COPY
> command and the corresponding data file to be sent over to the other
> coordinators. It also needs some changes to REFRESH MATERIALIZED VIEW
> handling so that such data can be sent along with it. We may try this
> approach in version 2.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company