From: Andrea A. <and...@ge...> - 2011-07-29 20:31:42
On Fri, Jul 29, 2011 at 3:10 PM, Alessandro Ferrucci <ale...@gm...> wrote:
> Hello,
> I have many many points (roughly 100-200 millions) all throughout the
> united states.
> I am loading that data into a set of postGIS servers partitioned by
> states or counties (partitioning by state is easier because it
> requires less instances, but partitioning by county will provide more
> "targeted" lookups).
>
> What I want to do is for a getMap request (that will be done per tile)
> I want to quickly target the database to fetch the points from by
> first doing an intersection lookup by state boundaries (if I
> partitioned) or by county boundaries (if I partitioned by county).
>
> so let's assume I partitioned by state, what I would do is receive a
> getMap request, fetch all the states that intersect that bounding box,
> run through each state I come up with and paint all the points from
> that state on the tile raster.
>
> What is the standard method with which people use geoserver to
> "target" a particular sets of partitions to grab features from to
> paint on a particular raster without having to run through all the
> possible partitions?

We don't have such a thing at the moment.

200 million points is big, but not an outrageous number: we recently played with datasets of 150 million polygons stored in an Oracle RAC (which easily amounts to 10-15 times as many coordinates as your dataset).

Did you try putting all of them into a single PostGIS instance, marking the geometry column as NOT NULL, and then clustering the table along the spatial index?
http://postgis.refractions.net/documentation/manual-1.5/ch06.html#id2635907

This should significantly decrease the amount of I/O needed to fetch a small number of points that are all close to each other.

Also, I'm assuming you are going to display the points only when very zoomed in, close to the ground, so that no more than a few thousand show up in the map, right?

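For reference, the single-instance suggestion above comes down to a handful of SQL statements. The sketch below only builds those statements as strings, so they can be reviewed before being run with psql or any database driver. The table name `points`, the column name `the_geom`, and the index naming scheme are assumptions made for illustration, and the `CLUSTER ... USING` form assumes PostgreSQL 8.3 or later.

```python
import re
from typing import List

# Accept only plain SQL identifiers so the generated statements stay safe to run.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def clustering_statements(table: str, geom_column: str = "the_geom") -> List[str]:
    """Build the SQL that clusters a point table along its spatial index.

    `table` and `geom_column` are hypothetical names; adapt them to your schema.
    """
    for name in (table, geom_column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a plain SQL identifier: {name!r}")

    index = f"{table}_{geom_column}_gist"  # assumed index naming scheme
    return [
        # Mark the geometry as non null before clustering on the GiST index.
        f"ALTER TABLE {table} ALTER COLUMN {geom_column} SET NOT NULL;",
        # The spatial index the table will be physically ordered by.
        f"CREATE INDEX {index} ON {table} USING GIST ({geom_column});",
        # Rewrite the table so that nearby points share disk pages.
        f"CLUSTER {table} USING {index};",
        # Refresh planner statistics after the rewrite.
        f"ANALYZE {table};",
    ]


if __name__ == "__main__":
    for statement in clustering_statements("points"):
        print(statement)
```

Note that CLUSTER is a one-time reordering: rows loaded afterwards are not kept in index order, so the command would need to be re-run after large loads.
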
In case the above does not work, I'm building an aggregating data store that can load data in parallel from many stores that share the same structure. It's meant to aggregate several WFS stores with the same feature type structure, but it could work the same way in your case, and could be configured to issue all 50 queries in parallel to minimize the total query time.

The store would not know anything about the spatial partitioning, though it could be improved to use spatial partitioning information so that only the relevant child stores are queried.

The store should be ready within the next month. If you want to play with it, let me know and I'll ping you when it's ready.

Cheers
Andrea

--
-------------------------------------------------------
Ing. Andrea Aime
GeoSolutions S.A.S.
Tech lead

Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy

phone: +39 0584 962313
fax: +39 0584 962313

http://www.geo-solutions.it
http://geo-solutions.blogspot.com/
http://www.youtube.com/user/GeoSolutionsIT
http://www.linkedin.com/in/andreaaime
http://twitter.com/geowolf
-------------------------------------------------------
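
As a rough illustration of the partition-aware variant mentioned in the reply (query only the child stores whose extent intersects the requested tile, and query those in parallel), here is a minimal Python sketch. It is not the aggregating store itself and does not use any GeoServer API: the partition names, their extents, and the `fetch` callback are all hypothetical, and each partition is approximated by its bounding box, which is coarser than a real state or county boundary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List, Tuple

BBox = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def intersects(a: BBox, b: BBox) -> bool:
    """True when two boxes overlap or touch on both axes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def relevant_partitions(tile: BBox, extents: Dict[str, BBox]) -> List[str]:
    """Names of the partitions whose extent intersects the tile, sorted."""
    return sorted(name for name, ext in extents.items() if intersects(tile, ext))


def query_in_parallel(
    tile: BBox,
    extents: Dict[str, BBox],
    fetch: Callable[[str, BBox], Iterable],  # hypothetical per-partition query
    max_workers: int = 8,
) -> list:
    """Run `fetch` only on the relevant partitions, concurrently, and merge."""
    names = relevant_partitions(tile, extents)
    if not names:
        return []
    with ThreadPoolExecutor(max_workers=min(max_workers, len(names))) as pool:
        # map() yields results in the order of `names`, not completion order.
        parts = pool.map(lambda name: list(fetch(name, tile)), names)
        return [feature for part in parts for feature in part]


if __name__ == "__main__":
    # Made-up extents for three made-up partitions.
    extents = {"A": (0, 0, 10, 10), "B": (10, 0, 20, 10), "C": (30, 0, 40, 10)}
    fake_fetch = lambda name, tile: [(name, tile)]
    print(query_in_parallel((8, 2, 12, 4), extents, fake_fetch))
```

A thread pool is used here because the per-partition work is expected to be I/O-bound database queries; the bounding-box test only prunes candidates, so the per-partition query still has to apply the tile filter itself.
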