[Geoserver-users] standard method of partitioning WMS data and retrieval

On Fri, Jul 29, 2011 at 3:10 PM, Alessandro Ferrucci
<ale...@gm...> wrote:
> Hello,
> I have a great many points (roughly 100-200 million) throughout the
> United States.
> I am loading that data into a set of PostGIS servers partitioned by
> state or county (partitioning by state is easier because it
> requires fewer instances, but partitioning by county will provide more
> "targeted" lookups).
>
> For each GetMap request (which will be done per tile), I want to
> quickly target the database to fetch the points from by first doing
> an intersection lookup against state boundaries (if I partitioned
> by state) or county boundaries (if I partitioned by county).
>
> So let's assume I partitioned by state: I would receive a
> GetMap request, fetch all the states that intersect that bounding box,
> then run through each of those states and paint all the points from
> that state on the tile raster.
>
> What is the standard way people use GeoServer to "target" a
> particular set of partitions to grab features from and paint on a
> given raster, without having to run through all the possible
> partitions?

We don't have such a thing at the moment.
200 million points is big, but not an outrageous number: we recently
worked with datasets of 150 million polygons stored in an Oracle RAC
(which easily makes for 10-15 times as many coordinates as your dataset).
Did you try putting all of them into a single PostGIS instance, marking
the geometry as non-null, and then clustering on the spatial index?
http://postgis.refractions.net/documentation/manual-1.5/ch06.html#id2635907
This should significantly decrease the amount of I/O needed to
fetch a small amount of points all close to each other.
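
A minimal sketch of the clustering steps suggested above, written as a
Python helper that only builds the SQL so it can be reviewed before being
run. The table, column, and index names ("points", "the_geom",
"points_geom_idx") are placeholders and not taken from the original setup;
the CLUSTER ... USING form assumes PostgreSQL 8.3 or later.

```python
def clustering_statements(table, geom_column, index_name):
    """Build the SQL for clustering a table on its spatial index.

    All names passed in are hypothetical placeholders; adapt them to the
    real schema before running the statements.
    """
    return [
        # Mark the geometry column as non-null, as suggested above.
        f"ALTER TABLE {table} ALTER COLUMN {geom_column} SET NOT NULL;",
        # Create the GiST spatial index that the table will be clustered on.
        f"CREATE INDEX {index_name} ON {table} USING GIST ({geom_column});",
        # Physically reorder the rows on disk to follow the spatial index.
        f"CLUSTER {table} USING {index_name};",
        # Refresh planner statistics after the rewrite.
        f"ANALYZE {table};",
    ]


for statement in clustering_statements("points", "the_geom", "points_geom_idx"):
    print(statement)
```

CLUSTER is a one-off reordering, so it would need to be re-run after
large batches of inserts to keep nearby points close together on disk.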

Also, I'm assuming you are going to display the points only when
zoomed in very close to the ground, so that no more than
a few thousand show up in the map, right?

In case the above does not work, I'm building an aggregating
data store that can load data in parallel from many stores having
the same structure. It's meant to aggregate several WFS stores
with the same type structure, but it could work the same way in your
case, and could be configured to issue all 50 queries in parallel
to minimize the total time to query them all.
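
The idea of querying all child stores in parallel can be illustrated with
a short Python sketch. This is not the aggregating store itself (which is
a GeoServer/GeoTools component and not shown here); the "stores" below are
hypothetical in-memory callables standing in for the per-state databases,
and query_all and make_store are names invented for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def make_store(points):
    """Hypothetical child store: returns its points that fall inside bbox."""
    def store(bbox):
        minx, miny, maxx, maxy = bbox
        return [(x, y) for (x, y) in points
                if minx <= x <= maxx and miny <= y <= maxy]
    return store


def query_all(stores, bbox, max_workers=50):
    """Send the same bbox query to every store in parallel and merge results."""
    if not stores:
        return []
    workers = min(max_workers, len(stores))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps results in the same order as the stores list.
        per_store = pool.map(lambda store: store(bbox), stores)
        return [feature for features in per_store for feature in features]


stores = [make_store([(0, 0), (5, 5)]), make_store([(1, 1), (9, 9)])]
print(query_all(stores, (0, 0, 2, 2)))
```

With real databases the total time would be roughly that of the slowest
child query rather than the sum of all of them, at the cost of hitting
every partition on every request.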

The store would not know anything about the spatial partitioning,
though it could be improved to leverage spatial partitioning
information so that only the relevant child stores are queried.
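
As a rough illustration of what leveraging partitioning information could
mean, the Python sketch below keeps one bounding box per partition and
selects only the partitions whose box intersects the request bounding box.
The partition names and extents are made-up values for illustration, and
this is only an assumption about how such pruning might work, not a
description of the planned store.

```python
def intersects(a, b):
    """True if two (minx, miny, maxx, maxy) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def relevant_partitions(extents, bbox):
    """Return the names of the partitions whose extent intersects bbox."""
    return [name for name, extent in extents.items()
            if intersects(extent, bbox)]


# Made-up partition extents, purely for illustration.
extents = {
    "partition_a": (0, 0, 10, 10),
    "partition_b": (10, 0, 20, 10),
    "partition_c": (30, 0, 40, 10),
}

# A tile straddling the border between partition_a and partition_b.
print(relevant_partitions(extents, (8, 2, 12, 4)))
```

A bounding-box test is conservative: it may select a partition whose
actual boundary does not reach the tile, but it will not miss one that
does, so the worst case is an extra query that returns no points.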

The store should be ready within the next month. If you want
to play with it, let me know and I'll ping you when it's ready.

Cheers
Andrea

--
-------------------------------------------------------
Ing. Andrea Aime
GeoSolutions S.A.S.
Tech lead

Via Poggio alle Viti 1187
55054  Massarosa (LU)
Italy

phone: +39 0584 962313
fax:      +39 0584 962313

http://www.geo-solutions.it
http://geo-solutions.blogspot.com/
http://www.youtube.com/user/GeoSolutionsIT
http://www.linkedin.com/in/andreaaime
http://twitter.com/geowolf

-------------------------------------------------------
