Forgive my curiosity: what is the need for an SQL server for an "encyclopaedia" data set? Those postcodes, city names, etc. are very stable, so it might be good to do the data collection once into a text file, build the indexes, and then forget about it; it is done...
Right now there are two countries, UK and US, with two different CSV file formats (and the code to handle them). They need to be saved out into a uniform CSV layout for portability; different implementations of the library can use whatever format they like. The (very rudimentary) Perl implementation takes about 10s to load the US file in, which is too long for a standalone cgi-bin. I would like a cgi-bin version running from SourceForge for deployed use, and need the time and Perl skills to do that.
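For illustration, a uniform layout could be sketched along these lines in Python; the column set, field names, and sample rows here are assumptions for the sketch, not the project's actual format:

```python
import csv
import io

# Hypothetical uniform layout: one row per postcode, the same columns
# for every country. These field names are an assumption for the sketch.
UNIFORM_FIELDS = ["country", "postcode", "lat", "long", "place"]

def write_uniform(rows, fh):
    """Write a list of dicts out as uniform CSV, header first."""
    w = csv.DictWriter(fh, fieldnames=UNIFORM_FIELDS)
    w.writeheader()
    w.writerows(rows)

def read_uniform(fh):
    """Read the uniform CSV back into dicts, parsing the coordinates."""
    out = []
    for row in csv.DictReader(fh):
        row["lat"] = float(row["lat"])
        row["long"] = float(row["long"])
        out.append(row)
    return out

# Round-trip example with one UK and one US entry (made-up data).
buf = io.StringIO()
write_uniform([
    {"country": "UK", "postcode": "SW1A 1AA", "lat": 51.501, "long": -0.142, "place": "London"},
    {"country": "US", "postcode": "10001", "lat": 40.748, "long": -73.997, "place": "New York"},
], buf)
buf.seek(0)
rows = read_uniform(buf)
print(len(rows), rows[0]["postcode"])
```

Each country-specific loader would then only need to emit rows in this shape once, and every implementation reads the same file.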
The .NET version loads it into memory and does some lookups using moderately naive geography for (lat, long) -> postcode, and string/substring matching for the other direction. The fact that each country has a different match policy makes things more *entertaining*.
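Both lookup directions can be sketched roughly like this; the sample data, the distance approximation, and the single match policy are illustrative assumptions, not the actual .NET code:

```python
import math

# Tiny in-memory index, same shape as a uniform postcode row (made-up data).
ROWS = [
    {"country": "UK", "postcode": "SW1A 1AA", "lat": 51.501, "long": -0.142, "place": "London"},
    {"country": "UK", "postcode": "M1 1AE",   "lat": 53.478, "long": -2.242, "place": "Manchester"},
    {"country": "US", "postcode": "10001",    "lat": 40.748, "long": -73.997, "place": "New York"},
]

def nearest_postcode(lat, lon):
    """Moderately naive geography: linear scan for the row minimising an
    equirectangular approximation of distance (adequate at postcode scale)."""
    def dist2(row):
        dlat = math.radians(row["lat"] - lat)
        dlon = math.radians(row["long"] - lon) * math.cos(math.radians(lat))
        return dlat * dlat + dlon * dlon
    return min(ROWS, key=dist2)["postcode"]

def postcodes_for_place(text):
    """The other direction: case-insensitive substring match on place names.
    A real per-country policy would vary; this is one uniform placeholder."""
    text = text.lower()
    return [r["postcode"] for r in ROWS if text in r["place"].lower()]

print(nearest_postcode(53.5, -2.2))   # nearest entry to a point near Manchester
print(postcodes_for_place("york"))
```

A per-country match policy would slot in by dispatching on the "country" column instead of using one substring rule for everything.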
I think a database is the way to go for any production deployment, and I will explore all the options. But for early development (when we don't know what postcode support in other countries will look like), raw code and data keeps all our options open :-) . It also makes deployment somewhat simpler.
-steve
How about using Hypersonic SQL (http://sourceforge.net/projects/hsql/) to store the data? I've found it fairly fast and usable.
- brill
> Forgive my curiosity: what is the need for an SQL server for an "encyclopaedia" data set? Those postcodes, city names, etc. are very stable, so it might be good to do the data collection once into a text file, build the indexes, and then forget about it; it is done...
Yes, the rate of change is near zero. But you could collect more dynamic stuff (hit counts, where people are) into a database.
And if the db is persistent, then applets, servlets, and the like can avoid the startup delay of the read.
Seem reasonable?
...and looking at HSQL, it looks like the ideal choice for a Java implementation. Thanks for the pointer.
-steve