Thread: [Opentrep-svn] SF.net SVN: opentrep:[119] trunk/opentrep
From: <den...@us...> - 2009-07-12 12:59:06
Revision: 119
          http://opentrep.svn.sourceforge.net/opentrep/?rev=119&view=rev
Author:   denis_arnaud
Date:     2009-07-12 12:59:01 +0000 (Sun, 12 Jul 2009)

Log Message:
-----------
[Structure] Re-organised a little bit the hierarchy of directories, so as to add core, config and batches.

Modified Paths:
--------------
    trunk/opentrep/config/soci.m4
    trunk/opentrep/configure.ac
    trunk/opentrep/opentrep/Makefile.am
    trunk/opentrep/opentrep/dbadaptor/Makefile.am
    trunk/opentrep/opentrep/sources.mk
    trunk/opentrep/po/POTFILES.in
    trunk/opentrep/test/testIndexer.sh
    trunk/opentrep/test/testSearcher.sh

Added Paths:
-----------
    trunk/opentrep/opentrep/batches/
    trunk/opentrep/opentrep/batches/Makefile.am
    trunk/opentrep/opentrep/batches/indexer.cpp
    trunk/opentrep/opentrep/batches/searcher.cpp
    trunk/opentrep/opentrep/batches/sources.mk
    trunk/opentrep/opentrep/config/
    trunk/opentrep/opentrep/config/Makefile.am
    trunk/opentrep/opentrep/core/
    trunk/opentrep/opentrep/core/Makefile.am
    trunk/opentrep/opentrep/core/sources.mk

Removed Paths:
-------------
    trunk/opentrep/opentrep/indexer.cpp
    trunk/opentrep/opentrep/searcher.cpp

Property Changed:
----------------
    trunk/opentrep/config/
    trunk/opentrep/opentrep/

Property changes on: trunk/opentrep/config
___________________________________________________________________
Modified: svn:ignore
   - install-sh missing depcomp mdate-sh texinfo.tex ltmain.sh lib-ld.m4 lib-link.m4 lib-prefix.m4 config.rpath config.sub config.guess mkinstalldirs printf-posix.m4 uintmax_t.m4 signed.m4 iconv.m4 longlong.m4 inttypes.m4 glibc21.m4 codeset.m4 inttypes_h.m4 longdouble.m4 nls.m4 po.m4 intmax.m4 xsize.m4 lcmessage.m4 wint_t.m4 ulonglong.m4 progtest.m4 inttypes-pri.m4 stdint_h.m4 intdiv0.m4 isc-posix.m4 size_max.m4 gettext.m4 wchar_t.m4
   + install-sh missing depcomp mdate-sh texinfo.tex ltmain.sh libtool.m4 lt~obsolete.m4 ltsugar.m4 ltversion.m4 ltoptions.m4 lib-ld.m4 lib-link.m4 lib-prefix.m4 config.rpath config.sub config.guess mkinstalldirs printf-posix.m4 uintmax_t.m4
signed.m4 iconv.m4 longlong.m4 inttypes.m4 glibc21.m4 codeset.m4 inttypes_h.m4 longdouble.m4 nls.m4 po.m4 intmax.m4 xsize.m4 lcmessage.m4 wint_t.m4 ulonglong.m4 progtest.m4 inttypes-pri.m4 stdint_h.m4 intdiv0.m4 isc-posix.m4 size_max.m4 gettext.m4 wchar_t.m4 Modified: trunk/opentrep/config/soci.m4 =================================================================== --- trunk/opentrep/config/soci.m4 2009-05-30 16:41:07 UTC (rev 118) +++ trunk/opentrep/config/soci.m4 2009-07-12 12:59:01 UTC (rev 119) @@ -4,115 +4,137 @@ dnl dnl We define the following configure script flags: dnl -dnl --with-soci: Give prefix for both library and headers, and try -dnl to guess subdirectory names for each. (e.g. add /lib and -dnl /include onto given dir names, and other common schemes.) +dnl --with-soci: Give prefix for both library and headers, and try +dnl to guess subdirectory names for each. (e.g. Tack /lib and +dnl /include onto given dir name, and other common schemes.) +dnl --with-soci-lib: Similar to --with-soci, but for library only. +dnl --with-soci-include: Similar to --with-soci, but for headers +dnl only. 
dnl -dnl @version 1.3, 2009/05/02 -dnl @author Denis Arnaud <den...@us...> -dnl dnl @version 1.2, 2007/02/20 dnl @author Warren Young <so...@et...> AC_DEFUN([AX_SOCI], [ -# -# Set up configure script macros -# -AC_ARG_WITH(soci, - [--with-soci=<path> root directory path of Soci installation], - [SOCI_lib_check="$with_soci/lib64/soci $with_soci/lib/soci $with_soci/lib64 $with_soci/lib" - SOCI_inc_check="$with_soci/include $with_soci/include/soci"], - [SOCI_lib_check="/usr/lib64 /usr/lib /usr/lib64/soci /usr/lib/soci /usr/local/lib64 /usr/local/lib /opt/soci/lib64 /opt/soci/lib /usr/local/lib64/soci /usr/local/lib/soci /usr/local/soci/lib64 /usr/local/soci/lib /opt/soci/lib64/soci /opt/soci/lib/soci" - SOCI_inc_check="/usr/include /usr/include/soci /usr/local/include /opt/soci/include /usr/local/include/soci /usr/local/soci/include /usr/local/soci/include/soci /opt/soci/include/soci"]) + # + # Set up configure script macros + # + AC_ARG_WITH(soci, + [ --with-soci=<path> root directory path of Soci installation], + [SOCI_lib_check="$with_soci/lib64/soci $with_soci/lib/soci $with_soci/lib64 $with_soci/lib" + SOCI_inc_check="$with_soci/include $with_soci/include/soci"], + [SOCI_lib_check="/usr/lib64 /usr/lib /usr/lib64/soci /usr/lib/soci /usr/local/lib64 /usr/local/lib /usr/local/lib/soci /usr/local/soci/lib /usr/local/soci/lib/soci /opt/soci/lib /opt/soci/lib/soci" + SOCI_inc_check="/usr/include /usr/include/soci /usr/local/include/soci /usr/local/soci/include /usr/local/soci/include/soci /opt/soci/include /opt/soci/include/soci"]) -# SOCI library -SOCI_CORE_LIB=soci_core -SOCI_MYSQL_LIB=soci_mysql + AC_ARG_WITH(soci-lib, + [ --with-soci-lib=<path> directory path of Soci library installation], + [SOCI_lib_check="$with_soci_lib $with_soci_lib/lib64 $with_soci_lib/lib $with_soci_lib/lib64/soci $with_soci_lib/lib/soci"]) -# -# Look for Soci Core API library -# -AC_MSG_CHECKING([for Soci library directory]) -SOCI_libdir= -for m in $SOCI_lib_check -do - if test -d "$m" && \ 
- (test -f "$m/lib$SOCI_CORE_LIB.so" || test -f "$m/lib$SOCI_CORE_LIB.a") + AC_ARG_WITH(soci-include, + [ --with-soci-include=<path> directory path of Soci header installation], + [SOCI_inc_check="$with_soci_include $with_soci_include/include $with_soci_include/include/soci"]) + + # SOCI library + SOCI_CORE_LIB=soci_core + SOCI_MYSQL_LIB=soci_mysql + SOCI_LIB_SUFFIX=gcc-3_0 + + # + # Look for Soci Core API library + # + AC_MSG_CHECKING([for Soci library directory]) + SOCI_libdir= + for m in $SOCI_lib_check + do + if test -d "$m" + then + for socilib in "$SOCI_CORE_LIB $SOCI_CORE_LIB-${SOCI_LIB_SUFFIX}" + do + if (test -f "$m/lib$SOCI_CORE_LIB.so" || test -f "$m/lib$SOCI_CORE_LIB.a") + then + SOCI_libdir=$m + fi + if (test -f "$m/lib${SOCI_CORE_LIB}-${SOCI_LIB_SUFFIX}.so" \ + || test -f "$m/lib${SOCI_CORE_LIB}-${SOCI_LIB_SUFFIX}.a") + then + SOCI_CORE_LIB=${SOCI_CORE_LIB}-${SOCI_LIB_SUFFIX} + SOCI_MYSQL_LIB=${SOCI_MYSQL_LIB}-${SOCI_LIB_SUFFIX} + SOCI_libdir=$m + fi + done + break + fi + done + + if test -z "$SOCI_libdir" then - SOCI_libdir=$m - break + AC_MSG_ERROR([Didn't find $SOCI_CORE_LIB library in '$SOCI_lib_check']) fi -done -if test -z "$SOCI_libdir" -then - AC_MSG_ERROR([Didn't find $SOCI_CORE_LIB library in '$SOCI_lib_check']) -fi + case "$SOCI_libdir" in + /* ) ;; + * ) AC_MSG_ERROR([The Soci library directory ($SOCI_libdir) must be an absolute path.]) ;; + esac -case "$SOCI_libdir" in - /* ) ;; - * ) AC_MSG_ERROR([The Soci library directory ($SOCI_libdir) must be an absolute path.]) ;; -esac + AC_MSG_RESULT([$SOCI_libdir]) -AC_MSG_RESULT([$SOCI_libdir]) + case "$SOCI_libdir" in + /usr/lib) ;; + *) LDFLAGS="$LDFLAGS -L${SOCI_libdir}" ;; + esac -case "$SOCI_libdir" in - /usr/lib64) ;; - /usr/lib) ;; - *) SOCI_LIBS="-L${SOCI_libdir}" ;; -esac -LDFLAGS="$LDFLAGS ${SOCI_LIBS}" + # + # Look for Soci Core API headers + # + AC_MSG_CHECKING([for Soci include directory]) + SOCI_incdir= + for m in $SOCI_inc_check + do + if test -d "$m" && (test -f 
"$m/soci/core/soci.h" || test -f "$m/soci/soci.h") + then + SOCI_incdir=$m + break + fi + done -# -# Look for Soci Core API headers -# -AC_MSG_CHECKING([for Soci include directory]) -SOCI_incdir= -for m in $SOCI_inc_check -do - if test -d "$m" && test -f "$m/soci/core/soci.h" + if test -z "$SOCI_incdir" then - SOCI_incdir=$m - break + AC_MSG_ERROR([Didn't find the Soci include dir in '$SOCI_inc_check']) fi -done -if test -z "$SOCI_incdir" -then - AC_MSG_ERROR([Didn't find the Soci include dir in '$SOCI_inc_check']) -fi + case "$SOCI_incdir" in + /* ) ;; + * ) AC_MSG_ERROR([The Soci include directory ($SOCI_incdir) must be an absolute path.]) ;; + esac -case "$SOCI_incdir" in - /* ) ;; - * ) AC_MSG_ERROR([The Soci include directory ($SOCI_incdir) must be an absolute path.]) ;; -esac + AC_MSG_RESULT([$SOCI_incdir]) -AC_MSG_RESULT([$SOCI_incdir]) + if test "$SOCI_incdir" != "/usr/include" + then + SOCI_CFLAGS="-I${SOCI_incdir}" + fi + if test "$SOCI_libdir" != "/usr/lib" -a "$SOCI_libdir" != "/usr/lib64" + then + SOCI_LIBS="-L${SOCI_libdir}" + fi + SOCI_CFLAGS="-DSOCI_HEADERS_BURIED -DSOCI_MYSQL_HEADERS_BURIED $SOCI_CFLAGS" + SOCI_LIBS="$SOCI_LIBS -l${SOCI_CORE_LIB} -l${SOCI_MYSQL_LIB} -ldl" + AC_SUBST(SOCI_CFLAGS) + AC_SUBST(SOCI_LIBS) -case "$SOCI_incdir" in - /usr/include) ;; - *) SOCI_CFLAGS="-I${SOCI_incdir}" ;; -esac - -SOCI_LIBS="${SOCI_LIBS} -l${SOCI_CORE_LIB} -l${SOCI_MYSQL_LIB} -ldl" - -AC_SUBST(SOCI_CFLAGS) -AC_SUBST(SOCI_LIBS) - # Test linking with soci (note that it needs MySQL client to have been defined # before) -save_LIBS="$LIBS" -if test -z "$MYSQL_LIBS" -then - MYSQL_LIBS="-L/usr/lib64/mysql -L/usr/lib/mysql -lmysqlclient" -fi -LIBS="$LIBS $MYSQL_LIBS $SOCI_LIBS" -AC_CHECK_LIB($SOCI_CORE_LIB, soci_begin, - [], - [AC_MSG_ERROR([Could not find working Soci client library!])] - ) -LIBS="$save_LIBS" -AC_SUBST(SOCI_CORE_LIB) + save_LIBS="$LIBS" + if test -z "$MYSQL_LIBS" + then + MYSQL_LIBS="-L/usr/lib64/mysql -L/usr/lib/mysql -lmysqlclient" + fi + 
  LIBS="$LIBS $MYSQL_LIBS $SOCI_LIBS"
+  AC_CHECK_LIB($SOCI_CORE_LIB, soci_begin,
+               [],
+               [AC_MSG_ERROR([Could not find working Soci client library!])]
+              )
+  LIBS="$save_LIBS"
+  AC_SUBST(SOCI_CORE_LIB)
 ]) dnl AX_SOCI

Modified: trunk/opentrep/configure.ac
===================================================================
--- trunk/opentrep/configure.ac	2009-05-30 16:41:07 UTC (rev 118)
+++ trunk/opentrep/configure.ac	2009-07-12 12:59:01 UTC (rev 119)
@@ -218,6 +218,9 @@
   opentrep/dbadaptor/Makefile
   opentrep/command/Makefile
   opentrep/service/Makefile
+  opentrep/config/Makefile
+  opentrep/core/Makefile
+  opentrep/batches/Makefile
   man/Makefile
   info/Makefile
   doc/Makefile

Property changes on: trunk/opentrep/opentrep
___________________________________________________________________
Modified: svn:ignore
   - .libs .deps stamp-h1 config.h config.h.in Makefile Makefile.in opentrep-paths.hpp opentrep_indexer opentrep_searcher
   + .libs .deps stamp-h1 config.h config.h.in Makefile Makefile.in

Modified: trunk/opentrep/opentrep/Makefile.am
===================================================================
--- trunk/opentrep/opentrep/Makefile.am	2009-05-30 16:41:07 UTC (rev 118)
+++ trunk/opentrep/opentrep/Makefile.am	2009-07-12 12:59:01 UTC (rev 119)
@@ -7,7 +7,7 @@
 MAINTAINERCLEANFILES = Makefile.in
-SUBDIRS = basic bom factory dbadaptor command service
+SUBDIRS = basic bom factory dbadaptor command service core config batches
 EXTRA_DIST = config_msvc.h
@@ -24,33 +24,6 @@
 	$(top_builddir)/@PACKAGE@/service/libsvc.la
 lib@PACKAGE@_la_LDFLAGS = -version-info $(GENERIC_LIBRARY_VERSION)
-# Binaries
-bin_PROGRAMS = opentrep_indexer opentrep_searcher
-
-opentrep_indexer_SOURCES = $(bin_idx_h_sources) $(bin_idx_cc_sources)
-opentrep_indexer_CXXFLAGS = $(BOOST_CFLAGS)
-opentrep_indexer_LDFLAGS = $(BOOST_PROGRAM_OPTIONS_LIB) $(SOCI_LIBS)
-opentrep_indexer_LDADD = lib@PACKAGE@.la
-
-opentrep_searcher_SOURCES = $(bin_srh_h_sources) $(bin_srh_cc_sources)
-opentrep_searcher_CXXFLAGS = $(BOOST_CFLAGS)
-opentrep_searcher_LDFLAGS = $(BOOST_PROGRAM_OPTIONS_LIB) $(SOCI_LIBS)
-opentrep_searcher_LDADD = lib@PACKAGE@.la
-
 # Header files
 nobase_pkginclude_HEADERS = $(service_h_sources)
 nobase_nodist_pkginclude_HEADERS = $(top_builddir)/@PACKAGE@/config.h
-
-
-# Targets
-all-local: @PACKAGE@-paths.hpp
-
-@PACKAGE@-paths.hpp: Makefile
-	@echo '#ifndef __OPENTREP_PATHS_HPP' > $@
-	@echo '#define __OPENTREP_PATHS_HPP' >> $@
-	@echo '#define PREFIXDIR "$(prefix)"' >> $@
-	@echo '#define BINDIR "$(bindir)"' >> $@
-	@echo '#define LIBEXECDIR "$(libexecdir)"' >> $@
-	@echo '#define DATADIR "$(datadir)"' >> $@
-	@echo '#define DOCDIR "$(docdir)"' >> $@
-	@echo '#endif // __OPENTREP_PATHS_HPP' >> $@

Property changes on: trunk/opentrep/opentrep/batches
___________________________________________________________________
Added: svn:ignore
   + .deps .libs Makefile Makefile.in opentrep_indexer opentrep_searcher

Added: trunk/opentrep/opentrep/batches/Makefile.am
===================================================================
--- trunk/opentrep/opentrep/batches/Makefile.am	(rev 0)
+++ trunk/opentrep/opentrep/batches/Makefile.am	2009-07-12 12:59:01 UTC (rev 119)
@@ -0,0 +1,23 @@
+include $(top_srcdir)/Makefile.common
+include $(srcdir)/sources.mk
+
+## Source directory
+
+MAINTAINERCLEANFILES = Makefile.in
+
+
+# Binaries (batches)
+bin_PROGRAMS = opentrep_indexer opentrep_searcher
+
+opentrep_indexer_SOURCES = $(batches_idx_h_sources) $(batches_idx_cc_sources)
+opentrep_indexer_CXXFLAGS = $(BOOST_CFLAGS)
+#opentrep_indexer_LDADD =
+opentrep_indexer_LDFLAGS = $(BOOST_PROGRAM_OPTIONS_LIB) $(SOCI_LIBS) \
+	$(top_builddir)/@PACKAGE@/core/lib@PACKAGE@.la
+
+
+opentrep_searcher_SOURCES = $(batches_srh_h_sources) $(batches_srh_cc_sources)
+opentrep_searcher_CXXFLAGS = $(BOOST_CFLAGS)
+#opentrep_searcher_LDADD =
+opentrep_searcher_LDFLAGS = $(BOOST_PROGRAM_OPTIONS_LIB) $(SOCI_LIBS) \
+	$(top_builddir)/@PACKAGE@/core/lib@PACKAGE@.la

Copied:
trunk/opentrep/opentrep/batches/indexer.cpp (from rev 114, trunk/opentrep/opentrep/indexer.cpp) =================================================================== --- trunk/opentrep/opentrep/batches/indexer.cpp (rev 0) +++ trunk/opentrep/opentrep/batches/indexer.cpp 2009-07-12 12:59:01 UTC (rev 119) @@ -0,0 +1,153 @@ +// C +#include <assert.h> +// STL +#include <iostream> +#include <sstream> +#include <fstream> +#include <map> +#include <vector> +// Boost (Extended STL) +#include <boost/date_time/posix_time/posix_time.hpp> +#include <boost/date_time/gregorian/gregorian.hpp> +#include <boost/program_options.hpp> +// OPENTREP +#include <opentrep/OPENTREP_Service.hpp> + +// ///////// Parsing of Options & Configuration ///////// +// A helper function to simplify the main part. +template<class T> std::ostream& operator<< (std::ostream& os, + const std::vector<T>& v) { + std::copy (v.begin(), v.end(), std::ostream_iterator<T> (std::cout, " ")); + return os; +} + +int readConfiguration (int argc, char* argv[]) { + int opt; + + // Declare a group of options that will be + // allowed only on command line + boost::program_options::options_description generic("Generic options"); + generic.add_options() + ("version,v", "print version string") + ("help,h", "produce help message"); + + // Declare a group of options that will be allowed both on command line and in + // config file + boost::program_options::options_description config("Configuration"); + config.add_options() + ("optimization", + boost::program_options::value<int>(&opt)->default_value(10), + "optimization level") + ("include-path,I", + boost::program_options::value< std::vector<std::string> >()->composing(), + "include path"); + + // Hidden options, will be allowed both on command line and + // in config file, but will not be shown to the user. 
+ boost::program_options::options_description hidden("Hidden options"); + hidden.add_options() + ("input-file", + boost::program_options::value< std::vector<std::string> >(), + "input file"); + + boost::program_options::options_description cmdline_options; + cmdline_options.add(generic).add(config).add(hidden); + + boost::program_options::options_description config_file_options; + config_file_options.add(config).add(hidden); + + boost::program_options::options_description visible("Allowed options"); + visible.add(generic).add(config); + + boost::program_options::positional_options_description p; + p.add("input-file", -1); + + boost::program_options::variables_map vm; + boost::program_options:: + store (boost::program_options::command_line_parser(argc, argv). + options (cmdline_options).positional(p).run(), vm); + + std::ifstream ifs ("request_parser.cfg"); + boost::program_options::store (parse_config_file (ifs, config_file_options), + vm); + boost::program_options::notify (vm); + + if (vm.count ("help")) { + std::cout << visible << std::endl; + return 0; + } + + if (vm.count ("version")) { + std::cout << "Open Travel Request Parser, version 1.0" << std::endl; + return 0; + } + + if (vm.count ("include-path")) { + std::cout << "Include paths are: " + << vm["include-path"].as< std::vector<std::string> >() + << std::endl; + } + + if (vm.count ("input-file")) { + std::cout << "Input files are: " + << vm["input-file"].as< std::vector<std::string> >() + << std::endl; + } + + std::cout << "Optimization level is " << opt << std::endl; + + return 0; +} + + +// /////////////// M A I N ///////////////// +int main (int argc, char* argv[]) { + try { + + // Output log File + std::string lLogFilename ("indexer.log"); + + // Xapian database name (directory of the index) + OPENTREP::TravelDatabaseName_T lXapianDatabaseName ("traveldb"); + + if (argc >= 1 && argv[1] != NULL) { + std::istringstream istr (argv[1]); + istr >> lLogFilename; + } + + if (argc >= 2 && argv[2] != NULL) { + 
std::istringstream istr (argv[2]); + istr >> lXapianDatabaseName; + } + + // Set the log parameters + std::ofstream logOutputFile; + // open and clean the log outputfile + logOutputFile.open (lLogFilename.c_str()); + logOutputFile.clear(); + + // Initialise the context + OPENTREP::OPENTREP_Service opentrepService; + opentrepService.init (logOutputFile, lXapianDatabaseName); + + // Launch the indexation + opentrepService.buildSearchIndex(); + + // Close the Log outputFile + logOutputFile.close(); + + + } catch (const OPENTREP::RootException& otexp) { + std::cerr << "Standard exception: " << otexp.what() << std::endl; + return -1; + + } catch (const std::exception& stde) { + std::cerr << "Standard exception: " << stde.what() << std::endl; + return -1; + + } catch (...) { + return -1; + } + + return 0; +} Copied: trunk/opentrep/opentrep/batches/searcher.cpp (from rev 114, trunk/opentrep/opentrep/searcher.cpp) =================================================================== --- trunk/opentrep/opentrep/batches/searcher.cpp (rev 0) +++ trunk/opentrep/opentrep/batches/searcher.cpp 2009-07-12 12:59:01 UTC (rev 119) @@ -0,0 +1,160 @@ +// C +#include <assert.h> +// STL +#include <iostream> +#include <sstream> +#include <fstream> +#include <map> +#include <vector> +// Boost (Extended STL) +#include <boost/date_time/posix_time/posix_time.hpp> +#include <boost/date_time/gregorian/gregorian.hpp> +#include <boost/program_options.hpp> +// OPENTREP +#include <opentrep/OPENTREP_Service.hpp> + +// ///////// Parsing of Options & Configuration ///////// +// A helper function to simplify the main part. 
+template<class T> std::ostream& operator<< (std::ostream& os, + const std::vector<T>& v) { + std::copy (v.begin(), v.end(), std::ostream_iterator<T> (std::cout, " ")); + return os; +} + +int readConfiguration (int argc, char* argv[]) { + int opt; + + // Declare a group of options that will be + // allowed only on command line + boost::program_options::options_description generic("Generic options"); + generic.add_options() + ("version,v", "print version string") + ("help,h", "produce help message"); + + // Declare a group of options that will be allowed both on command line and in + // config file + boost::program_options::options_description config("Configuration"); + config.add_options() + ("optimization", + boost::program_options::value<int>(&opt)->default_value(10), + "optimization level") + ("include-path,I", + boost::program_options::value< std::vector<std::string> >()->composing(), + "include path"); + + // Hidden options, will be allowed both on command line and + // in config file, but will not be shown to the user. + boost::program_options::options_description hidden("Hidden options"); + hidden.add_options() + ("input-file", + boost::program_options::value< std::vector<std::string> >(), + "input file"); + + boost::program_options::options_description cmdline_options; + cmdline_options.add(generic).add(config).add(hidden); + + boost::program_options::options_description config_file_options; + config_file_options.add(config).add(hidden); + + boost::program_options::options_description visible("Allowed options"); + visible.add(generic).add(config); + + boost::program_options::positional_options_description p; + p.add("input-file", -1); + + boost::program_options::variables_map vm; + boost::program_options:: + store (boost::program_options::command_line_parser(argc, argv). 
+ options (cmdline_options).positional(p).run(), vm); + + std::ifstream ifs ("request_parser.cfg"); + boost::program_options::store (parse_config_file (ifs, config_file_options), + vm); + boost::program_options::notify (vm); + + if (vm.count ("help")) { + std::cout << visible << std::endl; + return 0; + } + + if (vm.count ("version")) { + std::cout << "Open Travel Request Parser, version 1.0" << std::endl; + return 0; + } + + if (vm.count ("include-path")) { + std::cout << "Include paths are: " + << vm["include-path"].as< std::vector<std::string> >() + << std::endl; + } + + if (vm.count ("input-file")) { + std::cout << "Input files are: " + << vm["input-file"].as< std::vector<std::string> >() + << std::endl; + } + + std::cout << "Optimization level is " << opt << std::endl; + + return 0; +} + + +// /////////////// M A I N ///////////////// +int main (int argc, char* argv[]) { + try { + + // Travel query + OPENTREP::TravelQuery_T lTravelQuery ("cdg"); + + // Output log File + std::string lLogFilename ("searcher.log"); + + // Xapian database name (directory of the index) + OPENTREP::TravelDatabaseName_T lXapianDatabaseName ("traveldb"); + + if (argc >= 1 && argv[1] != NULL) { + std::istringstream istr (argv[1]); + istr >> lTravelQuery; + } + + if (argc >= 2 && argv[2] != NULL) { + std::istringstream istr (argv[2]); + istr >> lLogFilename; + } + + if (argc >= 3 && argv[3] != NULL) { + std::istringstream istr (argv[3]); + istr >> lXapianDatabaseName; + } + + // Set the log parameters + std::ofstream logOutputFile; + // open and clean the log outputfile + logOutputFile.open (lLogFilename.c_str()); + logOutputFile.clear(); + + // Initialise the context + OPENTREP::OPENTREP_Service opentrepService; + opentrepService.init (logOutputFile, lXapianDatabaseName); + + // Query the Xapian database (index) + opentrepService.interpretTravelRequest (lTravelQuery); + + // Close the Log outputFile + logOutputFile.close(); + + } catch (const OPENTREP::RootException& otexp) { + 
    std::cerr << "Standard exception: " << otexp.what() << std::endl;
+    return -1;
+
+  } catch (const std::exception& stde) {
+    std::cerr << "Standard exception: " << stde.what() << std::endl;
+    return -1;
+
+  } catch (...) {
+    return -1;
+  }
+
+  return 0;
+}

Added: trunk/opentrep/opentrep/batches/sources.mk
===================================================================
--- trunk/opentrep/opentrep/batches/sources.mk	(rev 0)
+++ trunk/opentrep/opentrep/batches/sources.mk	2009-07-12 12:59:01 UTC (rev 119)
@@ -0,0 +1,4 @@
+batches_idx_h_sources =
+batches_idx_cc_sources = $(top_srcdir)/opentrep/batches/indexer.cpp
+batches_srh_h_sources =
+batches_srh_cc_sources = $(top_srcdir)/opentrep/batches/searcher.cpp

Property changes on: trunk/opentrep/opentrep/config
___________________________________________________________________
Added: svn:ignore
   + .deps .libs Makefile Makefile.in opentrep-paths.hpp

Added: trunk/opentrep/opentrep/config/Makefile.am
===================================================================
--- trunk/opentrep/opentrep/config/Makefile.am	(rev 0)
+++ trunk/opentrep/opentrep/config/Makefile.am	2009-07-12 12:59:01 UTC (rev 119)
@@ -0,0 +1,25 @@
+include $(top_srcdir)/Makefile.common
+
+## Source directory
+
+DISTCLEANFILES = @PACKAGE@-paths.hpp
+
+MAINTAINERCLEANFILES = Makefile.in
+
+EXTRA_DIST = @PACKAGE@-paths.hpp
+
+# Targets
+all-local: @PACKAGE@-paths.hpp
+
+@PACKAGE@-paths.hpp: Makefile
+	@echo '#ifndef __@PACKAGE_NAME@_PATHS_HPP' > $@
+	@echo '#define __@PACKAGE_NAME@_PATHS_HPP' >> $@
+	@echo '#define PACKAGE "@PACKAGE@"' >> $@
+	@echo '#define PACKAGE_NAME "@PACKAGE_NAME@"' >> $@
+	@echo '#define PACKAGE_VERSION "@VERSION@"' >> $@
+	@echo '#define PREFIXDIR "$(prefix)"' >> $@
+	@echo '#define BINDIR "$(bindir)"' >> $@
+	@echo '#define LIBEXECDIR "$(libexecdir)"' >> $@
+	@echo '#define DATADIR "$(datadir)"' >> $@
+	@echo '#define DOCDIR "$(docdir)"' >> $@
+	@echo '#endif // __@PACKAGE_NAME@_PATHS_HPP' >> $@

Property changes on: trunk/opentrep/opentrep/core
___________________________________________________________________
Added: svn:ignore
   + .deps .libs Makefile Makefile.in

Added: trunk/opentrep/opentrep/core/Makefile.am
===================================================================
--- trunk/opentrep/opentrep/core/Makefile.am	(rev 0)
+++ trunk/opentrep/opentrep/core/Makefile.am	2009-07-12 12:59:01 UTC (rev 119)
@@ -0,0 +1,24 @@
+include $(top_srcdir)/Makefile.common
+include $(srcdir)/sources.mk
+
+## Source directory
+
+MAINTAINERCLEANFILES = Makefile.in
+
+SUBDIRS =
+
+
+# Library
+lib_LTLIBRARIES = lib@PACKAGE@.la
+
+lib@PACKAGE@_la_SOURCES = $(service_h_sources) $(service_cc_sources)
+lib@PACKAGE@_la_LIBADD = \
+	$(top_builddir)/@PACKAGE@/basic/libbas.la \
+	$(top_builddir)/@PACKAGE@/bom/libbom.la \
+	$(top_builddir)/@PACKAGE@/factory/libfac.la \
+	$(top_builddir)/@PACKAGE@/dbadaptor/libdba.la \
+	$(top_builddir)/@PACKAGE@/command/libcmd.la \
+	$(top_builddir)/@PACKAGE@/service/libsvc.la
+lib@PACKAGE@_la_LDFLAGS = \
+	$(BOOST_DATE_TIME_LIB) $(BOOST_PROGRAM_OPTIONS_LIB) \
+	$(SOCI_LIBS) -version-info $(GENERIC_LIBRARY_VERSION)

Added: trunk/opentrep/opentrep/core/sources.mk
===================================================================
--- trunk/opentrep/opentrep/core/sources.mk	(rev 0)
+++ trunk/opentrep/opentrep/core/sources.mk	2009-07-12 12:59:01 UTC (rev 119)
@@ -0,0 +1,3 @@
+service_h_sources = $(top_srcdir)/opentrep/OPENTREP_Types.hpp \
+	$(top_srcdir)/opentrep/OPENTREP_Service.hpp
+service_cc_sources =

Modified: trunk/opentrep/opentrep/dbadaptor/Makefile.am
===================================================================
--- trunk/opentrep/opentrep/dbadaptor/Makefile.am	2009-05-30 16:41:07 UTC (rev 118)
+++ trunk/opentrep/opentrep/dbadaptor/Makefile.am	2009-07-12 12:59:01 UTC (rev 119)
@@ -3,16 +3,10 @@
 include $(srcdir)/sources.mk
 noinst_LTLIBRARIES= libdba.la
-if ENABLE_DEBUG
-noinst_LTLIBRARIES += libdba_debug.la
-endif
+
 libdba_la_SOURCES= $(dba_h_sources) $(dba_cc_sources)
 libdba_la_CXXFLAGS = $(CXXFLAGS_OPT) $(SOCI_CFLAGS)
 libdba_la_LIBADD = $(SOCI_LIBS)
-libdba_debug_la_SOURCES = $(dba_h_sources) $(dba_cc_sources)
-libdba_debug_la_CXXFLAGS = $(CXXFLAGS_DEBUG) $(SOCI_CFLAGS)
-libdba_debug_la_LIBADD = $(SOCI_LIBS)
-
 #pkgincludedir = $(includedir)/@PACKAGE@/dba
 #pkginclude_HEADERS = $(dba_h_sources)

Deleted: trunk/opentrep/opentrep/indexer.cpp
===================================================================
--- trunk/opentrep/opentrep/indexer.cpp	2009-05-30 16:41:07 UTC (rev 118)
+++ trunk/opentrep/opentrep/indexer.cpp	2009-07-12 12:59:01 UTC (rev 119)
@@ -1,153 +0,0 @@
-// C
-#include <assert.h>
-// STL
-#include <iostream>
-#include <sstream>
-#include <fstream>
-#include <map>
-#include <vector>
-// Boost (Extended STL)
-#include <boost/date_time/posix_time/posix_time.hpp>
-#include <boost/date_time/gregorian/gregorian.hpp>
-#include <boost/program_options.hpp>
-// OPENTREP
-#include <opentrep/OPENTREP_Service.hpp>
-
-// ///////// Parsing of Options & Configuration /////////
-// A helper function to simplify the main part.
-template<class T> std::ostream& operator<< (std::ostream& os, - const std::vector<T>& v) { - std::copy (v.begin(), v.end(), std::ostream_iterator<T> (std::cout, " ")); - return os; -} - -int readConfiguration (int argc, char* argv[]) { - int opt; - - // Declare a group of options that will be - // allowed only on command line - boost::program_options::options_description generic("Generic options"); - generic.add_options() - ("version,v", "print version string") - ("help,h", "produce help message"); - - // Declare a group of options that will be allowed both on command line and in - // config file - boost::program_options::options_description config("Configuration"); - config.add_options() - ("optimization", - boost::program_options::value<int>(&opt)->default_value(10), - "optimization level") - ("include-path,I", - boost::program_options::value< std::vector<std::string> >()->composing(), - "include path"); - - // Hidden options, will be allowed both on command line and - // in config file, but will not be shown to the user. - boost::program_options::options_description hidden("Hidden options"); - hidden.add_options() - ("input-file", - boost::program_options::value< std::vector<std::string> >(), - "input file"); - - boost::program_options::options_description cmdline_options; - cmdline_options.add(generic).add(config).add(hidden); - - boost::program_options::options_description config_file_options; - config_file_options.add(config).add(hidden); - - boost::program_options::options_description visible("Allowed options"); - visible.add(generic).add(config); - - boost::program_options::positional_options_description p; - p.add("input-file", -1); - - boost::program_options::variables_map vm; - boost::program_options:: - store (boost::program_options::command_line_parser(argc, argv). 
- options (cmdline_options).positional(p).run(), vm); - - std::ifstream ifs ("request_parser.cfg"); - boost::program_options::store (parse_config_file (ifs, config_file_options), - vm); - boost::program_options::notify (vm); - - if (vm.count ("help")) { - std::cout << visible << std::endl; - return 0; - } - - if (vm.count ("version")) { - std::cout << "Open Travel Request Parser, version 1.0" << std::endl; - return 0; - } - - if (vm.count ("include-path")) { - std::cout << "Include paths are: " - << vm["include-path"].as< std::vector<std::string> >() - << std::endl; - } - - if (vm.count ("input-file")) { - std::cout << "Input files are: " - << vm["input-file"].as< std::vector<std::string> >() - << std::endl; - } - - std::cout << "Optimization level is " << opt << std::endl; - - return 0; -} - - -// /////////////// M A I N ///////////////// -int main (int argc, char* argv[]) { - try { - - // Output log File - std::string lLogFilename ("indexer.log"); - - // Xapian database name (directory of the index) - OPENTREP::TravelDatabaseName_T lXapianDatabaseName ("traveldb"); - - if (argc >= 1 && argv[1] != NULL) { - std::istringstream istr (argv[1]); - istr >> lLogFilename; - } - - if (argc >= 2 && argv[2] != NULL) { - std::istringstream istr (argv[2]); - istr >> lXapianDatabaseName; - } - - // Set the log parameters - std::ofstream logOutputFile; - // open and clean the log outputfile - logOutputFile.open (lLogFilename.c_str()); - logOutputFile.clear(); - - // Initialise the context - OPENTREP::OPENTREP_Service opentrepService; - opentrepService.init (logOutputFile, lXapianDatabaseName); - - // Launch the indexation - opentrepService.buildSearchIndex(); - - // Close the Log outputFile - logOutputFile.close(); - - - } catch (const OPENTREP::RootException& otexp) { - std::cerr << "Standard exception: " << otexp.what() << std::endl; - return -1; - - } catch (const std::exception& stde) { - std::cerr << "Standard exception: " << stde.what() << std::endl; - return -1; - - } 
catch (...) { - return -1; - } - - return 0; -} Deleted: trunk/opentrep/opentrep/searcher.cpp =================================================================== --- trunk/opentrep/opentrep/searcher.cpp 2009-05-30 16:41:07 UTC (rev 118) +++ trunk/opentrep/opentrep/searcher.cpp 2009-07-12 12:59:01 UTC (rev 119) @@ -1,160 +0,0 @@ -// C -#include <assert.h> -// STL -#include <iostream> -#include <sstream> -#include <fstream> -#include <map> -#include <vector> -// Boost (Extended STL) -#include <boost/date_time/posix_time/posix_time.hpp> -#include <boost/date_time/gregorian/gregorian.hpp> -#include <boost/program_options.hpp> -// OPENTREP -#include <opentrep/OPENTREP_Service.hpp> - -// ///////// Parsing of Options & Configuration ///////// -// A helper function to simplify the main part. -template<class T> std::ostream& operator<< (std::ostream& os, - const std::vector<T>& v) { - std::copy (v.begin(), v.end(), std::ostream_iterator<T> (std::cout, " ")); - return os; -} - -int readConfiguration (int argc, char* argv[]) { - int opt; - - // Declare a group of options that will be - // allowed only on command line - boost::program_options::options_description generic("Generic options"); - generic.add_options() - ("version,v", "print version string") - ("help,h", "produce help message"); - - // Declare a group of options that will be allowed both on command line and in - // config file - boost::program_options::options_description config("Configuration"); - config.add_options() - ("optimization", - boost::program_options::value<int>(&opt)->default_value(10), - "optimization level") - ("include-path,I", - boost::program_options::value< std::vector<std::string> >()->composing(), - "include path"); - - // Hidden options, will be allowed both on command line and - // in config file, but will not be shown to the user. 
- boost::program_options::options_description hidden("Hidden options"); - hidden.add_options() - ("input-file", - boost::program_options::value< std::vector<std::string> >(), - "input file"); - - boost::program_options::options_description cmdline_options; - cmdline_options.add(generic).add(config).add(hidden); - - boost::program_options::options_description config_file_options; - config_file_options.add(config).add(hidden); - - boost::program_options::options_description visible("Allowed options"); - visible.add(generic).add(config); - - boost::program_options::positional_options_description p; - p.add("input-file", -1); - - boost::program_options::variables_map vm; - boost::program_options:: - store (boost::program_options::command_line_parser(argc, argv). - options (cmdline_options).positional(p).run(), vm); - - std::ifstream ifs ("request_parser.cfg"); - boost::program_options::store (parse_config_file (ifs, config_file_options), - vm); - boost::program_options::notify (vm); - - if (vm.count ("help")) { - std::cout << visible << std::endl; - return 0; - } - - if (vm.count ("version")) { - std::cout << "Open Travel Request Parser, version 1.0" << std::endl; - return 0; - } - - if (vm.count ("include-path")) { - std::cout << "Include paths are: " - << vm["include-path"].as< std::vector<std::string> >() - << std::endl; - } - - if (vm.count ("input-file")) { - std::cout << "Input files are: " - << vm["input-file"].as< std::vector<std::string> >() - << std::endl; - } - - std::cout << "Optimization level is " << opt << std::endl; - - return 0; -} - - -// /////////////// M A I N ///////////////// -int main (int argc, char* argv[]) { - try { - - // Travel query - OPENTREP::TravelQuery_T lTravelQuery ("cdg"); - - // Output log File - std::string lLogFilename ("searcher.log"); - - // Xapian database name (directory of the index) - OPENTREP::TravelDatabaseName_T lXapianDatabaseName ("traveldb"); - - if (argc >= 1 && argv[1] != NULL) { - std::istringstream istr (argv[1]); 
- istr >> lTravelQuery; - } - - if (argc >= 2 && argv[2] != NULL) { - std::istringstream istr (argv[2]); - istr >> lLogFilename; - } - - if (argc >= 3 && argv[3] != NULL) { - std::istringstream istr (argv[3]); - istr >> lXapianDatabaseName; - } - - // Set the log parameters - std::ofstream logOutputFile; - // open and clean the log outputfile - logOutputFile.open (lLogFilename.c_str()); - logOutputFile.clear(); - - // Initialise the context - OPENTREP::OPENTREP_Service opentrepService; - opentrepService.init (logOutputFile, lXapianDatabaseName); - - // Query the Xapian database (index) - opentrepService.interpretTravelRequest (lTravelQuery); - - // Close the Log outputFile - logOutputFile.close(); - - } catch (const OPENTREP::RootException& otexp) { - std::cerr << "Standard exception: " << otexp.what() << std::endl; - return -1; - - } catch (const std::exception& stde) { - std::cerr << "Standard exception: " << stde.what() << std::endl; - return -1; - - } catch (...) { - return -1; - } - - return 0; -} Modified: trunk/opentrep/opentrep/sources.mk =================================================================== --- trunk/opentrep/opentrep/sources.mk 2009-05-30 16:41:07 UTC (rev 118) +++ trunk/opentrep/opentrep/sources.mk 2009-07-12 12:59:01 UTC (rev 119) @@ -1,7 +1,4 @@ -service_h_sources = $(top_srcdir)/opentrep/OPENTREP_Types.hpp \ +service_h_sources = \ + $(top_srcdir)/opentrep/OPENTREP_Types.hpp \ $(top_srcdir)/opentrep/OPENTREP_Service.hpp service_cc_sources = -bin_idx_h_sources = -bin_idx_cc_sources = $(top_srcdir)/opentrep/indexer.cpp -bin_srh_h_sources = -bin_srh_cc_sources = $(top_srcdir)/opentrep/searcher.cpp Modified: trunk/opentrep/po/POTFILES.in =================================================================== --- trunk/opentrep/po/POTFILES.in 2009-05-30 16:41:07 UTC (rev 118) +++ trunk/opentrep/po/POTFILES.in 2009-07-12 12:59:01 UTC (rev 119) @@ -7,8 +7,8 @@ opentrep/service/Logger.hpp opentrep/service/OPENTREP_ServiceContext.cpp 
opentrep/service/OPENTREP_ServiceContext.hpp -opentrep/indexer.cpp -opentrep/searcher.cpp +opentrep/batches/indexer.cpp +opentrep/batches/searcher.cpp opentrep/dbadaptor/DbaPlace.hpp opentrep/dbadaptor/DbaAbstract.cpp opentrep/dbadaptor/DbaPlace.cpp Modified: trunk/opentrep/test/testIndexer.sh =================================================================== --- trunk/opentrep/test/testIndexer.sh 2009-05-30 16:41:07 UTC (rev 118) +++ trunk/opentrep/test/testIndexer.sh 2009-07-12 12:59:01 UTC (rev 119) @@ -1,7 +1,7 @@ #!/bin/sh INSTALL_DIR=`grep "^prefix =" ../Makefile | cut -d"=" -d" " -f3` -TST_PROG=../opentrep/opentrep_indexer +TST_PROG=../opentrep/batches/opentrep_indexer OPENTREP=`grep "^PACKAGE_VERSION =" ../Makefile | cut -d"=" -d" " -f3` OPENTREP_LIBRARY_NAME=`grep "^PACKAGE =" ../Makefile | cut -d"=" -d" " -f3` OPENTREP_LIB=lib${OPENTREP_LIBRARY_NAME}-${OPENTREP_API_VERSION}.so Modified: trunk/opentrep/test/testSearcher.sh =================================================================== --- trunk/opentrep/test/testSearcher.sh 2009-05-30 16:41:07 UTC (rev 118) +++ trunk/opentrep/test/testSearcher.sh 2009-07-12 12:59:01 UTC (rev 119) @@ -1,7 +1,7 @@ #!/bin/sh INSTALL_DIR=`grep "^prefix =" ../Makefile | cut -d"=" -d" " -f3` -TST_PROG=../opentrep/opentrep_searcher +TST_PROG=../opentrep/batches/opentrep_searcher OPENTREP=`grep "^PACKAGE_VERSION =" ../Makefile | cut -d"=" -d" " -f3` OPENTREP_LIBRARY_NAME=`grep "^PACKAGE =" ../Makefile | cut -d"=" -d" " -f3` OPENTREP_LIB=lib${OPENTREP_LIBRARY_NAME}-${OPENTREP_API_VERSION}.so This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
From: <den...@us...> - 2009-07-12 13:36:46
|
Revision: 120 http://opentrep.svn.sourceforge.net/opentrep/?rev=120&view=rev Author: denis_arnaud Date: 2009-07-12 13:36:35 +0000 (Sun, 12 Jul 2009) Log Message: ----------- [Test] Added some tests on parsers. Modified Paths: -------------- trunk/opentrep/configure.ac trunk/opentrep/test/Makefile.am Added Paths: ----------- trunk/opentrep/test/parsers/ trunk/opentrep/test/parsers/Makefile.am trunk/opentrep/test/parsers/full_calculator.cpp trunk/opentrep/test/parsers/levenshtein.cpp trunk/opentrep/test/parsers/parameter_parser.cpp trunk/opentrep/test/parsers/schedule_parser.cpp trunk/opentrep/test/parsers/test_full_calculator.sh trunk/opentrep/test/parsers/test_parameter_parser.sh trunk/opentrep/test/parsers/test_schedule_parser.sh trunk/opentrep/test/parsers/world_schedule.csv Modified: trunk/opentrep/configure.ac =================================================================== --- trunk/opentrep/configure.ac 2009-07-12 12:59:01 UTC (rev 119) +++ trunk/opentrep/configure.ac 2009-07-12 13:36:35 UTC (rev 120) @@ -233,6 +233,7 @@ doc/sourceforge/howto_release_opentrep.html po/Makefile.in test/com/Makefile + test/parsers/Makefile test/Makefile win32/Makefile) AC_OUTPUT Modified: trunk/opentrep/test/Makefile.am =================================================================== --- trunk/opentrep/test/Makefile.am 2009-07-12 12:59:01 UTC (rev 119) +++ trunk/opentrep/test/Makefile.am 2009-07-12 13:36:35 UTC (rev 120) @@ -4,7 +4,7 @@ MAINTAINERCLEANFILES = Makefile.in ## -SUBDIRS = com +SUBDIRS = com parsers ## check_PROGRAMS = IndexBuildingTestSuite Property changes on: trunk/opentrep/test/parsers ___________________________________________________________________ Added: svn:ignore + .libs .deps Makefile Makefile.in parameter_parser* full_calculator* schedule_parser* levenshtein* Added: trunk/opentrep/test/parsers/Makefile.am =================================================================== --- trunk/opentrep/test/parsers/Makefile.am (rev 0) +++ 
trunk/opentrep/test/parsers/Makefile.am 2009-07-12 13:36:35 UTC (rev 120) @@ -0,0 +1,24 @@ +## command sub-directory +include $(top_srcdir)/Makefile.common + +MAINTAINERCLEANFILES = Makefile.in + +check_PROGRAMS = full_calculator parameter_parser schedule_parser levenshtein + +full_calculator_SOURCES = full_calculator.cpp +full_calculator_CXXFLAGS = $(BOOST_CFLAGS) +full_calculator_LDADD = $(BOOST_LIB) + +parameter_parser_SOURCES = parameter_parser.cpp +parameter_parser_CXXFLAGS = $(BOOST_CFLAGS) +parameter_parser_LDADD = $(BOOST_LIB) + +schedule_parser_SOURCES = schedule_parser.cpp +schedule_parser_CXXFLAGS = $(BOOST_CFLAGS) +schedule_parser_LDADD = $(BOOST_LIBS) $(BOOST_DATE_TIME_LIB) + +levenshtein_SOURCES = levenshtein.cpp +levenshtein_LDADD = + +EXTRA_DIST = test_full_calculator.sh test_parameter_parser.sh \ + test_schedule_parser.sh Added: trunk/opentrep/test/parsers/full_calculator.cpp =================================================================== --- trunk/opentrep/test/parsers/full_calculator.cpp (rev 0) +++ trunk/opentrep/test/parsers/full_calculator.cpp 2009-07-12 13:36:35 UTC (rev 120) @@ -0,0 +1,131 @@ +/*============================================================================= + Copyright (c) 2002-2003 Joel de Guzman + http://spirit.sourceforge.net/ + + Use, modification and distribution is subject to the Boost Software + License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at + http://www.boost.org/LICENSE_1_0.txt) + =============================================================================*/ +//////////////////////////////////////////////////////////////////////////// +// +// Full calculator example demonstrating Phoenix +// This is discussed in the "Closures" chapter in the Spirit User's Guide. 
+// +// [ JDG 6/29/2002 ] +// +//////////////////////////////////////////////////////////////////////////// +#include <boost/spirit/core.hpp> +#include <boost/spirit/attribute.hpp> +#include <iostream> +#include <string> + +//////////////////////////////////////////////////////////////////////////// +using namespace std; +using namespace boost::spirit; +using namespace phoenix; + +//////////////////////////////////////////////////////////////////////////// +// +// Our calculator grammar using phoenix to do the semantics +// +// Note: The top rule propagates the expression result (value) upwards +// to the calculator grammar self.val closure member which is +// then visible outside the grammar (i.e. since self.val is the +// member1 of the closure, it becomes the attribute passed by +// the calculator to an attached semantic action. See the +// driver code that uses the calculator below). +// +//////////////////////////////////////////////////////////////////////////// +struct calc_closure : boost::spirit::closure<calc_closure, double> +{ + member1 val; +}; + +struct calculator : public grammar<calculator, calc_closure::context_t> +{ + template <typename ScannerT> + struct definition + { + definition(calculator const& self) + { + top = expression[self.val = arg1]; + + expression + = term[expression.val = arg1] + >> *( ('+' >> term[expression.val += arg1]) + | ('-' >> term[expression.val -= arg1]) + ) + ; + + term + = factor[term.val = arg1] + >> *( ('*' >> factor[term.val *= arg1]) + | ('/' >> factor[term.val /= arg1]) + ) + ; + + factor + = ureal_p[factor.val = arg1] + | '(' >> expression[factor.val = arg1] >> ')' + | ('-' >> factor[factor.val = -arg1]) + | ('+' >> factor[factor.val = arg1]) + ; + } + + typedef rule<ScannerT, calc_closure::context_t> rule_t; + rule_t expression, term, factor; + rule<ScannerT> top; + + rule<ScannerT> const& + start() const { return top; } + }; +}; +
+//////////////////////////////////////////////////////////////////////////// +// +// Main program +// +//////////////////////////////////////////////////////////////////////////// +int +main() +{ + cout << "/////////////////////////////////////////////////////////\n\n"; + cout << "\t\tExpression parser using Phoenix...\n\n"; + cout << "/////////////////////////////////////////////////////////\n\n"; + cout << "Type an expression...or [q or Q] to quit\n\n"; + + calculator calc; // Our parser + + string str; + while (getline(cin, str)) + { + if (str.empty() || str[0] == 'q' || str[0] == 'Q') + break; + + double n = 0; + parse_info<> info = parse(str.c_str(), calc[var(n) = arg1], space_p); + + // calc[var(n) = arg1] invokes the calculator and extracts + // the result of the computation. See calculator grammar + // note above. + + if (info.full) + { + cout << "-------------------------\n"; + cout << "Parsing succeeded\n"; + cout << "result = " << n << endl; + cout << "-------------------------\n"; + } + else + { + cout << "-------------------------\n"; + cout << "Parsing failed\n"; + cout << "stopped at: \": " << info.stop << "\"\n"; + cout << "-------------------------\n"; + } + } + + cout << "Bye... 
:-) \n\n"; + return 0; +} Added: trunk/opentrep/test/parsers/levenshtein.cpp =================================================================== --- trunk/opentrep/test/parsers/levenshtein.cpp (rev 0) +++ trunk/opentrep/test/parsers/levenshtein.cpp 2009-07-12 13:36:35 UTC (rev 120) @@ -0,0 +1,131 @@ +// Levenshtein Distance Algorithm: C++ Implementation by Anders Sewerin Johansen +// STL +#include <iostream> +#include <string> +#include <vector> + +// ////////////////////////////////////////////////////////////////// +int getLevenshteinDistance (const std::string& source, + const std::string& target) { + + // Step 1 + + const int n = source.length(); + const int m = target.length(); + if (n == 0) { + return m; + } + if (m == 0) { + return n; + } + + // Definition of Matrix Type + typedef std::vector< std::vector<int> > Matrix_T; + + Matrix_T matrix (n+1); + + // Size the vectors in the 2.nd dimension. Unfortunately C++ doesn't + // allow for allocation on declaration of 2.nd dimension of vec of vec + + for (int i = 0; i <= n; i++) { + matrix[i].resize(m+1); + } + + // Step 2 + + for (int i = 0; i <= n; i++) { + matrix[i][0]=i; + } + + for (int j = 0; j <= m; j++) { + matrix[0][j]=j; + } + + // Step 3 + + for (int i = 1; i <= n; i++) { + + const char s_i = source[i-1]; + + // Step 4 + + for (int j = 1; j <= m; j++) { + + const char t_j = target[j-1]; + + // Step 5 + + int cost; + if (s_i == t_j) { + cost = 0; + } + else { + cost = 1; + } + + // Step 6 + + const int above = matrix[i-1][j]; + const int left = matrix[i][j-1]; + const int diag = matrix[i-1][j-1]; + int cell = std::min ( above + 1, std::min (left + 1, diag + cost)); + + // Step 6A: Cover transposition, in addition to deletion, + // insertion and substitution. 
This step is taken from: + // Berghel, Hal ; Roach, David : "An Extension of Ukkonen's + // Enhanced Dynamic Programming ASM Algorithm" + // (http://www.acm.org/~hlb/publications/asm/asm.html) + + if (i>2 && j>2) { + int trans = matrix[i-2][j-2] + 1; + if (source[i-2] != t_j) { + trans++; + } + if (s_i != target[j-2]) { + trans++; + } + if (cell > trans) { + cell = trans; + } + } + + matrix[i][j] = cell; + } + } + + // Step 7 + + return matrix[n][m]; +} + + +// /////////// M A I N //////////////// +int main (int argc, char* argv[]) { + + const std::string lLax1Str = "los angeles"; + const std::string lLax2Str = "lso angeles"; + const std::string lRio1Str = "rio de janeiro"; + const std::string lRio2Str = "rio de janero"; + const std::string lRek1Str = "reikjavik"; + const std::string lRek2Str = "rekyavik"; + const std::string lSfoRio1Str = "san francisco rio de janeiro"; + const std::string lSfoRio2Str = "san francicso rio de janero"; + + std::cout << "Distance between '" << lLax1Str + << "' and '" << lLax2Str << "' is: " + << getLevenshteinDistance (lLax1Str, lLax2Str) << std::endl; + + std::cout << "Distance between '" << lRio1Str + << "' and '" << lRio2Str << "' is: " + << getLevenshteinDistance (lRio1Str, lRio2Str) << std::endl; + + std::cout << "Distance between '" << lRek1Str + << "' and '" << lRek2Str << "' is: " + << getLevenshteinDistance (lRek1Str, lRek2Str) << std::endl; + + std::cout << "Distance between '" << lSfoRio1Str + << "' and '" << lSfoRio2Str << "' is: " + << getLevenshteinDistance (lSfoRio1Str, lSfoRio2Str) << std::endl; + + return 0; +} Added: trunk/opentrep/test/parsers/parameter_parser.cpp =================================================================== --- trunk/opentrep/test/parsers/parameter_parser.cpp (rev 0) +++ trunk/opentrep/test/parsers/parameter_parser.cpp 2009-07-12 13:36:35 UTC (rev 120) @@ -0,0 +1,216 @@ +/*============================================================================= + Copyright (c) 2001-2003 Hartmut Kaiser + 
http://spirit.sourceforge.net/ + + Use, modification and distribution is subject to the Boost Software + License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at + http://www.boost.org/LICENSE_1_0.txt) + =============================================================================*/ +/////////////////////////////////////////////////////////////////////////////// +// This sample show the usage of parser parameters. +// +// Parser parameters are used to pass some values from the outer parsing scope +// to the next inner scope. They can be imagined as the opposite to the return +// value paradigm, which returns some value from the inner to the next outer +// scope. See the "Closures" chapter in the User's Guide. + +#include <string> +#include <iostream> +#include <cassert> + +#if defined(_MSC_VER) /*&& !defined(__COMO__)*/ +#pragma warning(disable: 4244) +#pragma warning(disable: 4355) +#endif // defined(_MSC_VER) && !defined(__COMO__) + +#include <boost/spirit/core.hpp> +#include <boost/spirit/symbols/symbols.hpp> + +#include <boost/spirit/phoenix/tuples.hpp> +#include <boost/spirit/phoenix/tuple_helpers.hpp> +#include <boost/spirit/phoenix/primitives.hpp> +#include <boost/spirit/attribute/closure.hpp> + +/////////////////////////////////////////////////////////////////////////////// +// used namespaces +using namespace boost::spirit; +using namespace phoenix; +using namespace std; + +/////////////////////////////////////////////////////////////////////////////// +// Helper class for encapsulation of the type for the parsed variable names +class declaration_type +{ +public: + enum vartype { + vartype_unknown = 0, // unknown variable type + vartype_int = 1, // 'int' + vartype_real = 2 // 'real' + }; + + declaration_type() : type(vartype_unknown) + { + } + template <typename ItT> + declaration_type(ItT const &first, ItT const &last) + { + init(string(first, last-first-1)); + } + declaration_type(declaration_type const &type_) : type(type_.type) + { + } + 
declaration_type(string const &type_) : type(vartype_unknown) + { + init(type_); + } + + // access to the variable type + operator vartype const &() const { return type; } + operator string () + { + switch(type) { + default: + case vartype_unknown: break; + case vartype_int: return string("int"); + case vartype_real: return string("real"); + } + return string ("unknown"); + } + + void swap(declaration_type &s) { std::swap(type, s.type); } + +protected: + void init (string const &type_) + { + if (type_ == "int") + type = vartype_int; + else if (type_ == "real") + type = vartype_real; + else + type = vartype_unknown; + } + +private: + vartype type; +}; + +/////////////////////////////////////////////////////////////////////////////// +// +// used closure type +// +/////////////////////////////////////////////////////////////////////////////// +struct var_decl_closure : boost::spirit::closure<var_decl_closure, declaration_type> +{ + member1 val; +}; + +/////////////////////////////////////////////////////////////////////////////// +// +// symbols_with_data +// +// Helper class for inserting an item with data into a symbol table +// +/////////////////////////////////////////////////////////////////////////////// +template <typename T, typename InitT> +class symbols_with_data +{ +public: + typedef + symbol_inserter<T, boost::spirit::impl::tst<T, char> > + symbol_inserter_t; + + symbols_with_data(symbol_inserter_t const &add_, InitT const &data_) : + add(add_), data(as_actor<InitT>::convert(data_)) + { + } + + template <typename IteratorT> + symbol_inserter_t const & + operator()(IteratorT const &first_, IteratorT const &last) const + { + IteratorT first = first_; + return add(first, last, data()); + } + +private: + symbol_inserter_t const &add; + typename as_actor<InitT>::type data; +}; + +template <typename T, typename CharT, typename InitT> +inline +symbols_with_data<T, InitT> +symbols_gen(symbol_inserter<T, boost::spirit::impl::tst<T, CharT> > const &add_, + InitT 
const &data_) +{ + return symbols_with_data<T, InitT>(add_, data_); +} + +/////////////////////////////////////////////////////////////////////////////// +// The var_decl_list grammar parses variable declaration list + +struct var_decl_list : + public grammar<var_decl_list, var_decl_closure::context_t> +{ + template <typename ScannerT> + struct definition + { + definition(var_decl_list const &self) + { + // pass variable type returned from 'type' to list closure member 0 + decl = type[self.val = arg1] >> +space_p >> list(self.val); + + // m0 to access arg 0 of list --> passing variable type down to ident + list = ident(list.val) >> *(',' >> ident(list.val)); + + // store identifier and type into the symbol table + ident = (*alnum_p)[symbols_gen(symtab.add, ident.val)]; + + // the type of the decl is returned in type's closure member 0 + type = + str_p("int")[type.val = construct_<string>(arg1, arg2)] + | str_p("real")[type.val = construct_<string>(arg1, arg2)] + ; + + BOOST_SPIRIT_DEBUG_RULE(decl); + BOOST_SPIRIT_DEBUG_RULE(list); + BOOST_SPIRIT_DEBUG_RULE(ident); + BOOST_SPIRIT_DEBUG_RULE(type); + } + + rule<ScannerT> const& + start() const { return decl; } + + private: + typedef rule<ScannerT, var_decl_closure::context_t> rule_t; + rule_t type; + rule_t list; + rule_t ident; + symbols<declaration_type> symtab; + + rule<ScannerT> decl; // start rule + }; +}; + +/////////////////////////////////////////////////////////////////////////////// +// main entry point +int main() +{ + var_decl_list decl; + declaration_type type; + char const *pbegin = "int var1"; + + if (parse (pbegin, decl[assign(type)]).full) { + cout << endl + << "Parsed variable declarations successfully!" << endl + << "Detected type: " << declaration_type::vartype(type) + << " (" << string(type) << ")" + << endl; + } else { + cout << endl + << "Parsing the input stream failed!" 
+ << endl; + } + return 0; +} + Added: trunk/opentrep/test/parsers/schedule_parser.cpp =================================================================== --- trunk/opentrep/test/parsers/schedule_parser.cpp (rev 0) +++ trunk/opentrep/test/parsers/schedule_parser.cpp 2009-07-12 13:36:35 UTC (rev 120) @@ -0,0 +1,911 @@ +// C +#include <assert.h> +// STL +#include <iostream> +#include <sstream> +#include <fstream> +#include <string> +#include <map> +#include <set> +#include <vector> +// Boost (Extended STL) +#include <boost/date_time/posix_time/posix_time.hpp> +#include <boost/date_time/gregorian/gregorian.hpp> +// Boost Spirit (Parsing) +//#define BOOST_SPIRIT_DEBUG +#include <boost/spirit/core.hpp> +#include <boost/spirit/attribute.hpp> +#include <boost/spirit/utility/functor_parser.hpp> +#include <boost/spirit/utility/loops.hpp> +#include <boost/spirit/utility/chset.hpp> +#include <boost/spirit/utility/confix.hpp> +#include <boost/spirit/iterator/file_iterator.hpp> +#include <boost/spirit/actor/push_back_actor.hpp> +#include <boost/spirit/actor/assign_actor.hpp> + +// Type definitions +typedef char char_t; +//typedef char const* iterator_t; +typedef boost::spirit::file_iterator<char_t> iterator_t; +typedef boost::spirit::scanner<iterator_t> scanner_t; +typedef boost::spirit::rule<scanner_t> rule_t; + +/** LegCabin-Details. */ +struct Cabin_T { + // Attributes + std::string _cabinCode; + double _capacity; + + void display() const { + std::cout << " " << _cabinCode << " " << _capacity << ", "; + } +}; + +/** List of Cabin-Detail strucutres. */ +typedef std::vector<Cabin_T> CabinList_T; + +/** Leg. */ +struct Leg_T { + // Attributes + std::string _boardPoint; + boost::posix_time::time_duration _boardTime; + boost::gregorian::date_duration _boardDateOffSet; + std::string _offPoint; + boost::posix_time::time_duration _offTime; + boost::gregorian::date_duration _offDateOffSet; + boost::posix_time::time_duration _elapsed; + CabinList_T _cabinList; + + /** Constructor. 
*/ + Leg_T () : _boardDateOffSet (0), _offDateOffSet (0) {} + + void display() const { + std::cout << " " << _boardPoint << " / " + << boost::posix_time::to_simple_string (_boardTime) + << " -- " << _offPoint << " / " + << boost::posix_time::to_simple_string (_offTime) + << " --> " << boost::posix_time::to_simple_string (_elapsed) + << std::endl; + for (CabinList_T::const_iterator itCabin = _cabinList.begin(); + itCabin != _cabinList.end(); itCabin++) { + const Cabin_T& lCabin = *itCabin; + lCabin.display(); + } + std::cout << std::endl; + } +}; + +/** List of Leg strucutres. */ +typedef std::vector<Leg_T> LegList_T; + +/** SegmentCabin-Details. */ +struct SegmentCabin_T { + // Attributes + std::string _cabinCode; + std::string _classes; + + void display() const { + std::cout << " " << _cabinCode << " " << _classes << ", "; + } +}; + +/** List of SegmentCabin-Detail strucutres. */ +typedef std::vector<SegmentCabin_T> SegmentCabinList_T; + +/** Segment. */ +struct Segment_T { + // Attributes + std::string _boardPoint; + boost::posix_time::time_duration _boardTime; + boost::gregorian::date_duration _boardDateOffSet; + std::string _offPoint; + boost::posix_time::time_duration _offTime; + boost::gregorian::date_duration _offDateOffSet; + boost::posix_time::time_duration _elapsed; + SegmentCabinList_T _cabinList; + + /** Constructor. */ + Segment_T () : _boardDateOffSet (0), _offDateOffSet (0) {} + + void display() const { + std::cout << " " << _boardPoint << " / " + << boost::posix_time::to_simple_string (_boardTime) + << " -- " << _offPoint << " / " + << boost::posix_time::to_simple_string (_offTime) + << " --> " << boost::posix_time::to_simple_string (_elapsed) + << std::endl; + for (SegmentCabinList_T::const_iterator itCabin = _cabinList.begin(); + itCabin != _cabinList.end(); itCabin++) { + const SegmentCabin_T& lCabin = *itCabin; + lCabin.display(); + } + std::cout << std::endl; + } +}; + +/** List of Segment strucutres. 
*/ +typedef std::vector<Segment_T> SegmentList_T; + +/** Flight-Period. */ +struct FlightPeriod_T { + // Attributes + std::string _airlineCode; + unsigned int _flightNumber; + boost::gregorian::date _dateRangeStart; + boost::gregorian::date _dateRangeEnd; + std::string _dow; + LegList_T _legList; + SegmentList_T _segmentList; + + /** Constructor. */ + FlightPeriod_T () : _legAlreadyDefined (false), _itSeconds (0) {} + + /** Set the date from the staging details. */ + boost::gregorian::date getDate() const { + return boost::gregorian::date (_itYear, _itMonth, _itDay); + } + + /** Set the time from the staging details. */ + boost::posix_time::time_duration getTime() const { + return boost::posix_time::hours (_itHours) + + boost::posix_time::minutes (_itMinutes) + + boost::posix_time::seconds (_itSeconds); + } + + void display() const { + std::cout << _airlineCode << _flightNumber + << ", " << boost::gregorian::to_simple_string (_dateRangeStart) + << " - " << boost::gregorian::to_simple_string (_dateRangeEnd) + << " - " << _dow + << std::endl; + + for (LegList_T::const_iterator itLeg = _legList.begin(); + itLeg != _legList.end(); itLeg++) { + const Leg_T& lLeg = *itLeg; + lLeg.display(); + } + + for (SegmentList_T::const_iterator itSegment = _segmentList.begin(); + itSegment != _segmentList.end(); itSegment++) { + const Segment_T& lSegment = *itSegment; + lSegment.display(); + } + + //std::cout << "[Debug] - Staging Leg: "; + //_itLeg.display(); + //std::cout << "[Debug] - Staging Cabin: "; + //_itCabin.display(); + //std::cout << "[Debug] - Staging Segment: "; + //_itSegment.display(); + } + + /** Add the given airport to the internal lists (if not already existing). 
*/ + void addAirport (const std::string& iAirport) { + std::set<std::string>::const_iterator itAirport = + _airportList.find (iAirport); + if (itAirport == _airportList.end()) { + // Add the airport code to the airport set + const bool insertSuccessful = _airportList.insert (iAirport).second; + + if (insertSuccessful == false) { + // TODO: throw an exception + } + // Add the airport code to the airport vector + _airportOrderedList.push_back (iAirport); + } + } + + /** Build the routing (segments). */ + void buildSegments () { + // The list of airports encompasses all the airports on which + // the flight takes off or lands. Moreover, that list is + // time-ordered: the first airport is the initial departure of + // the flight, and the last airport is the eventual point of + // rest of the flight. + // Be l the size of the ordered list of airports. + // We want to generate all the segment combinations from the legs + // and, hence, from all the possible (time-ordered) airport pairs. + // Thus, we both iterator on i=0...l-1 and j=i+1...l + assert (_airportOrderedList.size() >= 2); + + _segmentList.clear(); + for (std::vector<std::string>::const_iterator itAirport_i = + _airportOrderedList.begin(); + itAirport_i != _airportOrderedList.end()-1; ++itAirport_i) { + for (std::vector<std::string>::const_iterator itAirport_j = + itAirport_i + 1; + itAirport_j != _airportOrderedList.end(); ++itAirport_j) { + Segment_T lSegment; + lSegment._boardPoint = *itAirport_i; + lSegment._offPoint = *itAirport_j; + + _segmentList.push_back (lSegment); + } + } + + // Clear the lists of airports, so that it is ready for the next flight + _airportList.clear(); + _airportOrderedList.clear(); + } + + /** Add, to the Segment whose key corresponds to the + given (board point, off point) pair, the specific segment cabin + details (mainly, the list of the class codes). 
+ <br>Note that the Segment structure is retrieved from the internal + list, already filled by a previous step (the buildSegments() + method). */ + void addSegmentCabin (const Segment_T& iSegment, + const SegmentCabin_T& iCabin) { + // Retrieve the Segment structure corresponding to the (board, off) point + // pair. + SegmentList_T::iterator itSegment = _segmentList.begin(); + for ( ; itSegment != _segmentList.end(); ++itSegment) { + const Segment_T& lSegment = *itSegment; + + const std::string& lBoardPoint = iSegment._boardPoint; + const std::string& lOffPoint = iSegment._offPoint; + if (lSegment._boardPoint == lBoardPoint + && lSegment._offPoint == lOffPoint) { + break; + } + } + + // If the segment key (airport pair) given in the schedule input file + // does not correspond to the leg (board, off) points, throw an exception + // so that the user knows the schedule input file is corrupted. + if (itSegment == _segmentList.end()) { + std::cerr << "Within the schedule input file, there is a flight for which the airports of segments and those of the legs do not correspond."; + throw std::exception(); + } + + // Add the Cabin structure to the Segment Cabin structure. + assert (itSegment != _segmentList.end()); + Segment_T& lSegment = *itSegment; + lSegment._cabinList.push_back (iCabin); + } + + /** Add, to all the Segment, the general segment cabin details + (mainly, the list of the class codes). + <br>Note that the Segment structures are stored within the internal + list, already filled by a previous step (the buildSegments() + method). */ + void addSegmentCabin (const SegmentCabin_T& iCabin) { + // Iterate on all the Segment (as they get the same cabin definitions) + for (SegmentList_T::iterator itSegment = _segmentList.begin(); + itSegment != _segmentList.end(); ++itSegment) { + Segment_T& lSegment = *itSegment; + lSegment._cabinList.push_back (iCabin); + } + } + + /** Staging Leg (resp. Cabin) structure, gathering the result of the iteration + on one leg (resp. 
cabin). */ + bool _legAlreadyDefined; + Leg_T _itLeg; + Cabin_T _itCabin; + + /** Staging Date. */ + unsigned int _itYear; + unsigned int _itMonth; + unsigned int _itDay; + + /** Staging Time. */ + long _itHours; + long _itMinutes; + long _itSeconds; + int _dateOffSet; + + /** Staging Airport List (helper to derive the list of Segment + structures). */ + std::set<std::string> _airportList; + std::vector<std::string> _airportOrderedList; + + /** Staging Segment-related attributes. */ + bool _areSegmentDefinitionsSpecific; + Segment_T _itSegment; + SegmentCabin_T _itSegmentCabin; +}; + +/////////////////////////////////////////////////////////////////////////////// +// +// Semantic actions +// +/////////////////////////////////////////////////////////////////////////////// +namespace { + + /** Store the parsed airline code. */ + struct store_airline_code { + store_airline_code (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) {} + + void operator() (iterator_t iStr, iterator_t iStrEnd) const { + std::string lAirlineCode (iStr, iStrEnd); + _flightPeriod._airlineCode = lAirlineCode; + // std::cout << "Airline code: " << lAirlineCode << std::endl; + + // As that's the beginning of a new flight, the list of legs must be reset + _flightPeriod._legList.clear(); + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Store the parsed flight number. */ + struct store_flight_number { + store_flight_number (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) {} + + void operator() (unsigned int iNumber) const { + _flightPeriod._flightNumber = iNumber; + // std::cout << "Flight number: " << iNumber << std::endl; + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Store the start of the date range. 
*/ + struct store_date_range_start { + store_date_range_start (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) {} + + void operator() (iterator_t iStr, iterator_t iStrEnd) const { + _flightPeriod._dateRangeStart = _flightPeriod.getDate(); + // std::cout << "Date Range Start: " + // << _flightPeriod._dateRangeStart << std::endl; + + // Reset the number of seconds + _flightPeriod._itSeconds = 0; + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Store the end of the date range. */ + struct store_date_range_end { + store_date_range_end (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) {} + + void operator() (iterator_t iStr, iterator_t iStrEnd) const { + _flightPeriod._dateRangeEnd = _flightPeriod.getDate(); + // std::cout << "Date Range End: " + // << _flightPeriod._dateRangeEnd << std::endl; + + // Reset the number of seconds + _flightPeriod._itSeconds = 0; + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Store the DOW (day of the Week). */ + struct store_dow { + store_dow (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) {} + + void operator() (iterator_t iStr, iterator_t iStrEnd) const { + std::string lDow (iStr, iStrEnd); + _flightPeriod._dow = lDow; + // std::cout << "DOW: " << lDow << std::endl; + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Store the parsed board point. 
*/ + struct store_board_point { + store_board_point (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) {} + + void operator() (iterator_t iStr, iterator_t iStrEnd) const { + std::string lBoardPoint (iStr, iStrEnd); + // std::cout << "Board point: " << lBoardPoint << std::endl; + + // If a leg has already been parsed, add it to the FlightPeriod + if (_flightPeriod._legAlreadyDefined == true) { + _flightPeriod._legList.push_back (_flightPeriod._itLeg); + } else { + _flightPeriod._legAlreadyDefined = true; + } + + // Set the (new) board point + _flightPeriod._itLeg._boardPoint = lBoardPoint; + + // As that's the beginning of a new leg, the list of cabins must be reset + _flightPeriod._itLeg._cabinList.clear(); + + // Add the airport code if it is not already stored in the airport lists + _flightPeriod.addAirport (lBoardPoint); + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Store the parsed off point. */ + struct store_off_point { + store_off_point (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) {} + + void operator() (iterator_t iStr, iterator_t iStrEnd) const { + std::string lOffPoint (iStr, iStrEnd); + _flightPeriod._itLeg._offPoint = lOffPoint; + // std::cout << "Off point: " << lOffPoint << std::endl; + + // Add the airport code if it is not already stored in the airport lists + _flightPeriod.addAirport (lOffPoint); + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Store the board time. */ + struct store_board_time { + store_board_time (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) {} + + void operator() (iterator_t iStr, iterator_t iStrEnd) const { + _flightPeriod._itLeg._boardTime = _flightPeriod.getTime(); + + // Reset the number of seconds + _flightPeriod._itSeconds = 0; + + // Reset the date off-set + _flightPeriod._dateOffSet = 0; + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Store the off time. 
*/ + struct store_off_time { + store_off_time (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) {} + + void operator() (iterator_t iStr, iterator_t iStrEnd) const { + _flightPeriod._itLeg._offTime = _flightPeriod.getTime(); + + // Reset the number of seconds + _flightPeriod._itSeconds = 0; + + // As the board date off set is optional, it can be set only afterwards, + // based on the staging date off-set value (_flightPeriod._dateOffSet). + const boost::gregorian::date_duration lDateOffSet (_flightPeriod._dateOffSet); + _flightPeriod._itLeg._boardDateOffSet = lDateOffSet; + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Store the elapsed time. */ + struct store_elapsed_time { + store_elapsed_time (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) {} + + void operator() (iterator_t iStr, iterator_t iStrEnd) const { + _flightPeriod._itLeg._elapsed = _flightPeriod.getTime(); + + // Reset the number of seconds + _flightPeriod._itSeconds = 0; + + // As the board date off set is optional, it can be set only afterwards, + // based on the staging date off-set value (_flightPeriod._dateOffSet). + const boost::gregorian::date_duration lDateOffSet (_flightPeriod._dateOffSet); + _flightPeriod._itLeg._offDateOffSet = lDateOffSet; + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Store the parsed cabin code. */ + struct store_cabin_code { + store_cabin_code (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) {} + + void operator() (char iChar) const { + _flightPeriod._itCabin._cabinCode = iChar; + // std::cout << "Cabin code: " << iChar << std::endl; + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Store the parsed capacity. 
*/ + struct store_capacity { + store_capacity (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) {} + + void operator() (double iReal) const { + _flightPeriod._itCabin._capacity = iReal; + // std::cout << "Capacity: " << iReal << std::endl; + + // The capacity is the last (according to arrival order) detail + // of the cabin. Hence, when a capacity is parsed, it means that + // the full cabin details have already been parsed as well: the + // cabin can thus be added to the leg. + _flightPeriod._itLeg._cabinList.push_back (_flightPeriod._itCabin); + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Store whether or not all the segments are the same. */ + struct store_segment_specificity { + store_segment_specificity (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) { + } + + void operator() (char iChar) const { + if (iChar == '0') { + _flightPeriod._areSegmentDefinitionsSpecific = false; + } else { + _flightPeriod._areSegmentDefinitionsSpecific = true; + } + + // Do a few sanity checks: the two lists should get exactly the same + // content (in terms of airport codes). The only difference is that one + // is a STL set, and the other a STL vector. + assert (_flightPeriod._airportList.size() + == _flightPeriod._airportOrderedList.size()); + assert (_flightPeriod._airportList.size() >= 2); + + // Since all the legs have now been parsed, we get all the airports + // and the segments may be built. + _flightPeriod.buildSegments(); + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Store the board point of the segment. 
*/ + struct store_segment_board_point { + store_segment_board_point (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) { + } + + void operator() (iterator_t iStr, iterator_t iStrEnd) const { + std::string lBoardPoint (iStr, iStrEnd); + _flightPeriod._itSegment._boardPoint = lBoardPoint; + // std::cout << "Board point: " << lBoardPoint << std::endl; + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Store the off point of the segment. */ + struct store_segment_off_point { + store_segment_off_point (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) { + } + + void operator() (iterator_t iStr, iterator_t iStrEnd) const { + std::string lOffPoint (iStr, iStrEnd); + _flightPeriod._itSegment._offPoint = lOffPoint; + // std::cout << "Off point: " << lOffPoint << std::endl; + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Store the cabin code of the segment. */ + struct store_segment_cabin_code { + store_segment_cabin_code (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) { + } + + void operator() (char iChar) const { + _flightPeriod._itSegmentCabin._cabinCode = iChar; + // std::cout << "Cabin code: " << iChar << std::endl; + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Store the classes of the segment-cabin. */ + struct store_classes { + store_classes (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) { + } + + void operator() (iterator_t iStr, iterator_t iStrEnd) const { + std::string lClasses (iStr, iStrEnd); + _flightPeriod._itSegmentCabin._classes = lClasses; + // std::cout << "Classes: " << lClasses << std::endl; + + // The list of classes is the last (according to the arrival order + // within the schedule input file) detail of the segment cabin. Hence, + // when a list of classes is parsed, it means that the full segment + // cabin details have already been parsed as well: the segment cabin + // can thus be added to the segment. 
+ if (_flightPeriod._areSegmentDefinitionsSpecific == true) { + _flightPeriod.addSegmentCabin (_flightPeriod._itSegment, + _flightPeriod._itSegmentCabin); + } else { + _flightPeriod.addSegmentCabin (_flightPeriod._itSegmentCabin); + } + } + + FlightPeriod_T& _flightPeriod; + }; + + /** Mark the end of the flight-period parsing. */ + struct do_end_flight { + do_end_flight (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) {} + + // void operator() (char iChar) const { + void operator() (iterator_t iStr, iterator_t iStrEnd) const { + // std::cout << "End of Flight-Period " << std::endl; + + assert (_flightPeriod._legAlreadyDefined == true); + _flightPeriod._legList.push_back (_flightPeriod._itLeg); + + // The lists of legs and cabins must be reset + _flightPeriod._legAlreadyDefined = false; + _flightPeriod._itLeg._cabinList.clear(); + + // Display the result + _flightPeriod.display(); + } + + FlightPeriod_T& _flightPeriod; + }; +} + +// /////////// Utilities ///////////// +/** 1-digit-integer parser */ +boost::spirit::int_parser<unsigned int, 10, 1, 1> int1_p; +/** 2-digit-integer parser */ +boost::spirit::uint_parser<int, 10, 2, 2> uint2_p; +/** 4-digit-integer parser */ +boost::spirit::uint_parser<int, 10, 4, 4> uint4_p; +/** Up-to-4-digit-integer parser */ +boost::spirit::uint_parser<int, 10, 1, 4> uint1_4_p; + +/////////////////////////////////////////////////////////////////////////////// +// +// Our calculator grammar (using subrules) +// +/////////////////////////////////////////////////////////////////////////////// + /** + AirlineCode; FlightNumber; DateRangeStart; DateRangeEnd; DOW; + (list) BoardPoint; OffPoint; BoardTime; DateOffSet; OffTime; + ElapsedTime; + (list) CabinCode; Capacity; + SegmentSpecificty (0 or 1); + (list) (optional BoardPoint; OffPoint); CabinCode; Classes + + BA; 9; 2007-04-20; 2007-06-30; 0000011; + LHR; BKK; 22:00; 15:15 / +1; 11:15; F; 5; J; 12; W; 20; Y; 300; + BKK; SYD; 18:10 / +1; 06:05 / +2; 08:55; F; 5; J; 12; 
W; 20; Y; 300; + 0; F; FA; J; JCDI; W; WT; Y; YBHKMLSQ; + BA; 9; 2007-04-20; 2007-06-30; 1111100; + LHR; BKK; 22:00; 15:15 / +1; 11:15; F; 5; J; 12; W; 20; Y; 300; + BKK; SYD; 18:10 / +1; 06:05 / +2; 08:55; F; 5; J; 12; W; 20; Y; 300; + 1; LHR; BKK; F; FA; J; JCDI; W; WT; Y; YBHKMLSQ; + BKK; SYD; F; FA; J; JCDI; W; WT; Y; YBHKMLSQ; + LHR; SYD; F; FA; J; JCDI; W; WT; Y; YBHKMLSQ; + + Grammar: + DOW ::= int + FlightKey ::= AirlineCode ';' FlightNumber + ';' DateRangeStart ';' DateRangeEnd ';' DOW + LegKey ::= BoardPoint ';' OffPoint + LegDetails ::= BoardTime ['/' BoardDateOffSet] + ';' OffTime ['/' OffDateOffSet] + ';' Elapsed + LegCabinDetails ::= CabinCode ';' Capacity + Leg ::= LegKey ';' LegDetails (';' LegCabinDetails)+ + SegmentKey ::= BoardPoint ';' OffPoint + SegmentCabinDetails ::= CabinCode ';' Classes + FullSegmentCabinDetails::= (';' SegmentCabinDetails)+ + GeneralSegments ::= '0' (';' SegmentCabinDetails)+ + SpecificSegments ::= '1' (';' SegmentKey + FullSegmentCabinDetails)+ + Segment ::= GeneralSegments | SpecificSegments + FlightPeriod ::= FlightKey (';' Leg)+ + (';' Segment)+ EndOfFlight + EndOfFlight ::= ';' + */ + +using namespace boost::spirit; + +/** Grammar for the Flight-Period parser. 
*/ +struct FlightPeriodParser : + public boost::spirit::grammar<FlightPeriodParser> { + + FlightPeriodParser (FlightPeriod_T& ioFlightPeriod) + : _flightPeriod (ioFlightPeriod) { + } + + template <typename ScannerT> + struct definition { + definition (FlightPeriodParser const& self) { + + flight_period_list = *( boost::spirit::comment_p("//") + | boost::spirit::comment_p("/*", "*/") + | flight_period ) + ; + + flight_period = flight_key + >> +( ';' >> leg ) + >> +( ';' >> segment ) + >> flight_period_end[do_end_flight(self._flightPeriod)] + ; + + flight_period_end = + boost::spirit::ch_p(';') + ; + + flight_key = airline_code + >> ';' >> flight_number + >> ';' >> date[store_date_range_start(self._flightPeriod)] + >> ';' >> date[store_date_range_end(self._flightPeriod)] + >> ';' >> dow[store_dow(self._flightPeriod)] + ; + + airline_code = + lexeme_d[ (repeat_p(2,3)[chset_p("0-9A-Z")])[store_airline_code(self._flightPeriod)] ] + ; + + flight_number = + lexeme_d[ limit_d(0u, 9999u)[uint1_4_p][store_flight_number(self._flightPeriod)] ] + ; + + date = + lexeme_d[ limit_d(2000u,2099u)[uint4_p][assign_a(self._flightPeriod._itYear)] + >> '-' >> limit_d(1u,12u)[uint2_p][assign_a(self._flightPeriod._itMonth)] + >> '-' >> limit_d(1u,31u)[uint2_p][assign_a(self._flightPeriod._itDay)] ] + ; + + dow = + lexeme_d[ repeat_p(7)[chset_p("0-1")] ] + ; + + leg = leg_key >> ';' >> leg_details >> +( ';' >> cabin_details ) + ; + + leg_key = + (repeat_p(3)[chset_p("0-9A-Z")])[store_board_point(self._flightPeriod)] + >> ';' + >> (repeat_p(3)[chset_p("0-9A-Z")])[store_off_point(self._flightPeriod)] + ; + + leg_details = + time[store_board_time(self._flightPeriod)] + >> !(date_offset) + >> ';' + >> time[store_off_time(self._flightPeriod)] + >> !(date_offset) + >> ';' + >> time[store_elapsed_time(self._flightPeriod)] + ; + + time = lexeme_d[ limit_d(0u,23u)[uint2_p][assign_a(self._flightPeriod._itHours)] + >> ':' >> limit_d(0u,59u)[uint2_p][assign_a(self._flightPeriod._itMinutes)] + >> !(':' 
>> limit_d(0u,59u)[uint2_p][assign_a(self._flightPeriod._itSeconds)]) ] + ; + + date_offset = + boost::spirit::ch_p('/') + >> (int1_p)[boost::spirit::assign_a(self._flightPeriod._dateOffSet)] + ; + + cabin_details = (chset_p("A-Z"))[store_cabin_code(self._flightPeriod)] + >> ';' >> (boost::spirit::ureal_p)[store_capacity(self._flightPeriod)] + ; + + segment_key = + (repeat_p(3)[chset_p("0-9A-Z")])[store_segment_board_point(self._flightPeriod)] + >> ';' + >> (repeat_p(3)[chset_p("0-9A-Z")])[store_segment_off_point(self._flightPeriod)] + ; + + segment = + general_segments | specific_segments + ; + + general_segments = + boost::spirit::ch_p('0')[store_segment_specificity(self._flightPeriod)] + >> +(';' >> segment_cabin_details) + ; + + specific_segments = + boost::spirit::ch_p('1')[store_segment_specificity(self._flightPeriod)] + >> +(';' >> segment_key >> full_segment_cabin_details) + ; + + full_segment_cabin_details = + +(';' >> segment_cabin_details) + ; + + segment_cabin_details = + (chset_p("A-Z"))[store_segment_cabin_code(self._flightPeriod)] + >> ';' >> (repeat_p(1,26)[chset_p("A-Z")])[store_classes(self._flightPeriod)] + ; + + BOOST_SPIRIT_DEBUG_NODE (flight_period_list); + BOOST_SPIRIT_DEBUG_NODE (flight_period); + BOOST_SPIRIT_DEBUG_NODE (flight_period_end); + BOOST_SPIRIT_DEBUG_NODE (flight_key); + BOOST_SPIRIT_DEBUG_NODE (airline_code); + BOOST_SPIRIT_DEBUG_NODE (flight_number); + BOOST_SPIRIT_DEBUG_NODE (date); + BOOST_SPIRIT_DEBUG_NODE (dow); + BOOST_SPIRIT_DEBUG_NODE (leg); + BOOST_SPIRIT_DEBUG_NODE (leg_key); + BOOST_SPIRIT_DEBUG_NODE (leg_details); + BOOST_SPIRIT_DEBUG_NODE (time); + BOOST_SPIRIT_DEBUG_NODE (date_offset); + BOOST_SPIRIT_DEBUG_NODE (cabin_details); + BOOST_SPIRIT_DEBUG_NODE (segment); + BOOST_SPIRIT_DEBUG_NODE (segment_key); + BOOST_SPIRIT_DEBUG_NODE (general_segments); + BOOST_SPIRIT_DEBUG_NODE (specific_segments); + BOOST_SPIRIT_DEBUG_NODE (full_segment_cabin_details); + BOOST_SPIRIT_DEBUG_NODE (segment_cabin_details); + } + + 
boost::spirit::rule<ScannerT> flight_period_list, flight_period, + flight_period_end, flight_key, airline_code, flight_number, + date, dow, leg, leg_key, leg_details, time, date_offset, cabin_details, + segment, segment_key, general_segments, specific_segments, + full_segment_cabin_details, segment_cabin_details; + + boost::spirit::rule<ScannerT> const& start() const { return flight_period_list; } + }; + + FlightPeriod_T& _flightPeriod; +}; + +// /////////////// M A I N ///////////////// +int main (int argc, char* argv[]) { + try { + + // File to be parsed + std::string lFilename ("world_schedule.csv"); + + // Read the command-line parameters + if (argc >= 2 && argv[1] != NULL) { + std::istringstream istr (argv[1]); + istr >> lFilename; + } + + // Open the file + iterator_t lFileIterator (lFilename); + if (!lFileIterator) { + std::cerr << "The file " << lFilename << " cannot be opened." << std::endl; + return -1; + } + + // Create an EOF iterator + iterator_t lFileIteratorEnd = lFileIterator.make_end(); + + // Instantiate the structure that will hold the result of the parsing. + FlightPeriod_T lFlightPeriod; + FlightPeriodParser lFlightPeriodParser (lFlightPeriod); + boost::spirit::parse_info<iterator_t> info = + boost::spirit::parse (lFileIterator, lFileIteratorEnd, + lFlightPeriodParser, + boost::spirit::space_p); + + // DEBUG + std::cout << "Flight Period:" << std::endl; + lFlightPeriod.display(); + + std::cout << "-------------------------" << std::endl; + if (info.full) { + std::cout << "Parsing succeeded" << std::endl; + + } else { + std::cout << "Parsing failed" << std::endl; + } + std::cout << "-------------------------" << std::endl; + + } catch (const std::exception& stde) { + std::cerr << "Standard exception: " << stde.what() << std::endl; + return -1; + + } catch (...) 
{ + return -1; + } + + return 0; +} Added: trunk/opentrep/test/parsers/test_full_calculator.sh =================================================================== --- trunk/opentrep/test/parsers/test_full_calculator.sh (rev 0) +++ trunk/opentrep/test/parsers/test_full_calculator.sh 2009-07-12 13:36:35 UTC (rev 120) @@ -0,0 +1,25 @@ +#!/bin/sh + +INSTALL_DIR=`grep "^prefix =" ../Makefile | cut -d"=" -d" " -f3` +TST_PROG=./full_calculator +LATUS_API_VERSION=`grep "^LATUS_API_VERSION =" ../Makefile | cut -d"=" -d" " -f3` +LATUS_LIBRARY_NAME=`grep "^LATUS_LIBRARY_NAME =" ../Makefile | cut -d"=" -d" " -f3` +LATUS_LIB=lib${LATUS_LIBRARY_NAME}-${LATUS_API_VERSION}.so + +if [ ! -x ${TST_PROG} ]; +then + echo "The sample program does not seem to have been compiled. Try 'make check' first." + exit 1 +fi + +if [ "$1" = "-h" -o "$1" = "-H" -o "$1" = "--h" -o "$1" = "--help" ]; +then + echo "Usage: $0 [<String to be parsed>]" + echo " The list to be parsed should contain floating point numbers" + echo " separated by commas, and should not contain spaces." + echo " Example: 10.2,5.4" + echo "The program parses a line and fills a flight-period structure." 
+ exit 0 +fi + +${TST_PROG} $1 Property changes on: trunk/opentrep/test/parsers/test_full_calculator.sh ___________________________________________________________________ Added: svn:executable + * Added: trunk/opentrep/test/parsers/test_parameter_parser.sh =================================================================== --- trunk/opentrep/test/parsers/test_parameter_parser.sh (rev 0) +++ trunk/opentrep/test/parsers/test_parameter_parser.sh 2009-07-12 13:36:35 UTC (rev 120) @@ -0,0 +1,25 @@ +#!/bin/sh + +INSTALL_DIR=`grep "^prefix =" ../Makefile | cut -d"=" -d" " -f3` +TST_PROG=./parameter_parser +LATUS_API_VERSION=`grep "^LATUS_API_VERSION =" ../Makefile | cut -d"=" -d" " -f3` +LATUS_LIBRARY_NAME=`grep "^LATUS_LIBRARY_NAME =" ../Makefile | cut -d"=" -d" " -f3` +LATUS_LIB=lib${LATUS_LIBRARY_NAME}-${LATUS_API_VERSION}.so + +if [ ! -x ${TST_PROG} ]; +then + echo "The sample program does not seem to have been compiled. Try 'make check' first." + exit 1 +fi + +if [ "$1" = "-h" -o "$1" = "-H" -o "$1" = "--h" -o "$1" = "--help" ]; +then + echo "Usage: $0 [<String to be parsed>]" + echo " The list to be parsed should contain floating point numbers" + echo " separated by commas, and should not contain spaces." + echo " Example: 10.2,5.4" + echo "The program parses a line and fills a flight-period structure." 
+ exit 0 +fi + +${TST_PROG} $1 Property changes on: trunk/opentrep/test/parsers/test_parameter_parser.sh ___________________________________________________________________ Added: svn:executable + * Added: trunk/opentrep/test/parsers/test_schedule_parser.sh =================================================================== --- trunk/opentrep/test/parsers/test_schedule_parser.sh (rev 0) +++ trunk/opentrep/test/parsers/test_schedule_parser.sh 2009-07-12 13:36:35 UTC (rev 120) @@ -0,0 +1,25 @@ +#!/bin/sh + +INSTALL_DIR=`grep "^prefix =" ../Makefile | cut -d"=" -d" " -f3` +TST_PROG=./schedule_parser +LATUS_API_VERSION=`grep "^LATUS_API_VERSION =" ../Makefile | cut -d"=" -d" " -f3` +LATUS_LIBRARY_NAME=`grep "^LATUS_LIBRARY_NAME =" ../Makefile | cut -d"=" -d" " -f3` +LATUS_LIB=lib${LATUS_LIBRARY_NAME}-${LATUS_API_VERSION}.so + +if [ ! -x ${TST_PROG} ]; +then + echo "The sample program does not seem to have been compiled. Try 'make check' first." + exit 1 +fi + +if [ "$1" = "-h" -o "$1" = "-H" -o "$1" = "--h" -o "$1" = "--help" ]; +then + echo "Usage: $0 [<Schedule input file>]" + echo " The file to be parsed should contain flight-period definitions," + echo " with the fields separated by semicolons." + echo " Example: world_schedule.csv" + echo "The program parses the file and fills a flight-period structure." 
+ exit 0 +fi + +${TST_PROG} $1 Property changes on: trunk/opentrep/test/parsers/test_schedule_parser.sh ___________________________________________________________________ Added: svn:executable + * Added: trunk/opentrep/test/parsers/world_schedule.csv =================================================================== --- trunk/opentrep/test/parsers/world_schedule.csv (rev 0) +++ trunk/opentrep/test/parsers/world_schedule.csv 2009-07-12 13:36:35 UTC (rev 120) @@ -0,0 +1,21 @@ +// Flights: AirlineCode; FlightNumber; Date-Range; ; DOW; Legs; Segments; +// Legs: BoardPoint; OffPoint; BoardTime; ArrivalDateOffSet; ArrivalTime; +// ElapsedTime; LegCabins; +// LegCabins: CabinCode; Capacity; +// Segments: Specific; +BA; 9; 2007-04-20; 2007-06-30; 0000011; LHR; BKK; 22:00; 15:15 / +1; 11:15; F; 5; J; 12; W; 20; Y; 300; BKK; SYD; 18:10 / +1; 06:05 / +2; 08:55; F; 5; J; 12; W; 20; Y; 300; 0; F; FA; J; JCDI; W; WT; Y; YBHKMLSQ; +BA; 9; 2007-04-20; 2007-06-30; 1111100; LHR; BKK; 22:00; 15:15 / +1; 11:15; F; 5; J; 12; W; 20; Y; 300; BKK; SYD; 18:10 / +1; 06:05 / +2; 08:55; F; 5; J; 12; W; 20; Y; 300; 1; LHR; BKK; F; FA; J; JCDI; W; WT; Y; YBHKMLSQ; BKK; SYD; F; FA; J; JCDI; W; WT; Y; YBHKMLSQ; LHR; SYD; F; FA; J; JCDI; W; WT; Y; YBHKMLSQ; +BA; 117; 2007-04-20; 2007-06-30; 1111111; LHR; JFK; 08:20; 11:00; 07:40; F; 5; J; 12; W; 20; Y; 300; 0; F; FA; J; JCDI; W; WT; Y; YBHKM; +BA; 175; 2007-04-20; 2007-06-30; 1111111; LHR; JFK; 10:55; 13:35; 07:40; F; 5; J; 12; W; 20; Y; 300; 0; F; FA; J; JCDI; W; WT; Y; YBHKMRL; +BA; 179; 2007-04-20; 2007-06-30; 1111111; LHR; JFK; 18:05; 20:45; 07:40; F; 5; J; 12; W; 20; Y; 300; 0; F; FA; J; JCDI; W; WT; Y; YBHKMRVNELSQO; +BA; 207; 2007-04-20; 2007-06-30; 1111111; LHR; MIA; 09:40; 14:25; 09:45; F; 5; J; 12; W; 20; Y; 300; 0; F; FA; J; JCDI; W; WT; Y; YBHKMRVNELSQO; +BA; 279; 2007-04-20; 2007-06-30; 1111111; LHR; LAX; 10:05; 13:10; 11:05; F; 5; J; 12; W; 20; Y; 300; 0; F; FA; J; JCDI; W; WT; Y; YBHKMRVNELSQO; +BA; 295; 2007-04-20; 2007-06-30; 
1111111; LHR; ORD; 11:35; 14:00; 08:25; F; 5; J; 12; W; 20; Y; 300; 0; F; FA; J; JCDI; W; WT; Y; YBHKMRVNELSQO; +BA; 341; 2007-04-20; 2007-06-30; 1111111; NCE; LHR; 08:55; 10:05; 02:10; J; 12; Y; 300; 0; J; JCDI; Y; YBHKMRVNEQLSO; +BA; 343; 2007-04-20; 2007-06-30; 1111111; NCE; LHR; 11:00; 12:15; 02:15; J; 12; Y; 300; 0; J; JCDI; Y; YBHKMRVNEQLSO; +BA; 345; 2007-04-20; 2007-06-30; 1111111; NCE; LHR; 16:20; 17:25; 02:05; J; 12; Y; 300; 0; J; JCDI; Y; YBHKMRVNEQLSO; +BA; 347; 2007-04-20; 2007-06-30; 1111111; NCE; LHR; 13:55; 15:00; 02:05; J; 12; Y; 300; 0; J; JCDI; Y; YBHKMRVNEQLSO; +AA; 101; 2007-04-20; 2007-06-30; 1111111; LHR; JFK; 09:55; 12:50; 07:55; G; 300; 0; G; GHQKLMVSOWN; +AA; 117; 2007-04-20; 2007-06-30; 1111111; JFK; LAX; 14:20; 17:25; 06:05; F; 12; J; 20; Y; 300; 0; F; FA; J; JDI; Y; YBGHQKLMVSOWN; +AA; 181; 2007-04-20; 2007-06-30; 1111111; JFK; LAX; 17:00; 20:00; 06:00; F; 12; J; 20; Y; 300; 0; F; FA; J; JDI; Y; YBHKMLWVGSNOQ; +AA; 585; 2007-04-20; 2007-06-30; 1111111; JFK; MIA; 15:40; 18:50; 03:10; F; 12; Y; 300; 0; F; FAP; Y; YBHKMLWVGSONQ; \ No newline at end of file This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
From: <den...@us...> - 2009-07-13 19:55:07
Revision: 125 http://opentrep.svn.sourceforge.net/opentrep/?rev=125&view=rev Author: denis_arnaud Date: 2009-07-13 19:55:04 +0000 (Mon, 13 Jul 2009) Log Message: ----------- [Indexer] The words are no longer indexed with positions within documents. Modified Paths: -------------- trunk/opentrep/opentrep/command/IndexBuilder.cpp trunk/opentrep/test/xapian/string_search.cpp Modified: trunk/opentrep/opentrep/command/IndexBuilder.cpp =================================================================== --- trunk/opentrep/opentrep/command/IndexBuilder.cpp 2009-07-13 18:03:01 UTC (rev 124) +++ trunk/opentrep/opentrep/command/IndexBuilder.cpp 2009-07-13 19:55:04 UTC (rev 125) @@ -48,15 +48,18 @@ (lCityCode.empty())?lPlaceCode:lCityCode; const std::string& lStateCode = ioPlace.getStateCode(); const std::string lDBStateCode = (lStateCode.empty())?"NA":lStateCode; + + // Word index/position within the Xapian document + unsigned short idx = 1; // Add indexing terms - lDocument.add_posting (lPlaceCode, 1); - lDocument.add_posting (lDBCityCode, 2); - lDocument.add_posting (lDBStateCode, 3); - lDocument.add_posting (ioPlace.getCountryCode(), 4); - lDocument.add_posting (ioPlace.getRegionCode(), 5); - lDocument.add_posting (ioPlace.getContinentCode(), 6); - lDocument.add_posting (ioPlace.getTimeZoneGroup(), 7); + lDocument.add_term (lPlaceCode); ++idx; + lDocument.add_term (lDBCityCode); ++idx; + lDocument.add_term (lDBStateCode); ++idx; + lDocument.add_term (ioPlace.getCountryCode()); ++idx; + lDocument.add_term (ioPlace.getRegionCode()); ++idx; + lDocument.add_term (ioPlace.getContinentCode()); ++idx; + lDocument.add_term (ioPlace.getTimeZoneGroup()); ++idx; // Add terms to the spelling dictionnary ioDatabase.add_spelling (lPlaceCode); @@ -65,18 +68,20 @@ ioDatabase.add_spelling (lStateCode); } - // Retrieve the map of name lists - unsigned int i = 1; - const NameMatrix_T& lNameMatrix = ioPlace.getNameMatrix(); + // Retrieve the place names in all the available languages + const 
NameMatrix_T& lNameMatrix = ioPlace.getNameMatrix (); for (NameMatrix_T::const_iterator itNameList = lNameMatrix.begin(); - itNameList != lNameMatrix.end(); ++itNameList, ++i) { + itNameList != lNameMatrix.end(); ++itNameList) { + // Retrieve the language code and locale + const Language::EN_Language& lLanguage = itNameList->first; const Names& lNames = itNameList->second; - // Add the language code (e.g., en_US) - lDocument.add_posting (lNames.describeShortKey(), 7+i); - ++i; + // Add that language code and locale to the Xapian document + lDocument.add_term (Language::getLongLabel (lLanguage)); ++idx; + // For a given language, retrieve the list of place names const NameList_T& lNameList = lNames.getNameList(); + for (NameList_T::const_iterator itName = lNameList.begin(); itName != lNameList.end(); ++itName) { const std::string& lName = *itName; @@ -84,9 +89,9 @@ // Add the place name (it can be the classical one, or // extended, alternate, etc.) if (lName.empty() == false) { - lDocument.add_posting (lName, 8+i); + // OPENTREP_LOG_DEBUG ("Added name: " << lName); + lDocument.add_term (lName); ++idx; ioDatabase.add_spelling (lName); - ++i; } } } Modified: trunk/opentrep/test/xapian/string_search.cpp =================================================================== --- trunk/opentrep/test/xapian/string_search.cpp 2009-07-13 18:03:01 UTC (rev 124) +++ trunk/opentrep/test/xapian/string_search.cpp 2009-07-13 19:55:04 UTC (rev 125) @@ -28,6 +28,7 @@ for (int idx=2; idx != argc; ++idx) { if (idx != 2) { oStr << " "; +// oStr << " AND "; } const std::string lWord (argv[idx]); const std::string lSuggestedWord = This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
From: <den...@us...> - 2009-07-14 10:45:31
Revision: 126 http://opentrep.svn.sourceforge.net/opentrep/?rev=126&view=rev Author: denis_arnaud Date: 2009-07-14 10:45:19 +0000 (Tue, 14 Jul 2009) Log Message: ----------- [Ternary Trees] Added the ternary trees structure. Added Paths: ----------- trunk/opentrep/ternary_tree/ trunk/opentrep/ternary_tree/README trunk/opentrep/ternary_tree/doxygen_input/ trunk/opentrep/ternary_tree/doxygen_input/blather.hpp trunk/opentrep/ternary_tree/doxygen_input/concepts.txt trunk/opentrep/ternary_tree/doxygen_input/doxygen-old.css trunk/opentrep/ternary_tree/doxygen_input/doxygen.css trunk/opentrep/ternary_tree/doxygen_input/external.png trunk/opentrep/ternary_tree/doxygen_input/featuretable.html trunk/opentrep/ternary_tree/doxygen_input/footer_inc.html trunk/opentrep/ternary_tree/doxygen_input/header_inc.html trunk/opentrep/ternary_tree/doxygen_input/performancetable.html trunk/opentrep/ternary_tree/doxygen_input/tree - trie concepts.txt trunk/opentrep/ternary_tree/doxygen_input/usage.hpp trunk/opentrep/ternary_tree/examples/ trunk/opentrep/ternary_tree/examples/examples.vcproj trunk/opentrep/ternary_tree/examples/locale_less.hpp trunk/opentrep/ternary_tree/examples.cpp trunk/opentrep/ternary_tree/fill_dictionary.cpp trunk/opentrep/ternary_tree/full-docs-index.html trunk/opentrep/ternary_tree/html/ trunk/opentrep/ternary_tree/html/annotated.html trunk/opentrep/ternary_tree/html/blather_8hpp.html trunk/opentrep/ternary_tree/html/class_data_t.html trunk/opentrep/ternary_tree/html/class_data_t_01_5.html trunk/opentrep/ternary_tree/html/classcontainers_1_1search__results__list-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1search__results__list.html trunk/opentrep/ternary_tree/html/classcontainers_1_1search__results__list_1_1iterator-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1search__results__list_1_1iterator.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__map-members.html 
trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__map.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__map_1_1value__compare-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__map_1_1value__compare.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__multimap-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__multimap.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__multimap_1_1value__compare-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__multimap_1_1value__compare.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__multiset-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__multiset.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__set-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1structured__set.html trunk/opentrep/ternary_tree/html/classcontainers_1_1ternary__tree-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1ternary__tree.html trunk/opentrep/ternary_tree/html/classcontainers_1_1ternary__tree_1_1key__compare-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1ternary__tree_1_1key__compare.html trunk/opentrep/ternary_tree/html/classcontainers_1_1tst__detail_1_1_base_t.html trunk/opentrep/ternary_tree/html/classcontainers_1_1tst__detail_1_1tst__iterator__base-members.html trunk/opentrep/ternary_tree/html/classcontainers_1_1tst__detail_1_1tst__iterator__base.html trunk/opentrep/ternary_tree/html/classstd_1_1back__insert__iterator.html trunk/opentrep/ternary_tree/html/classstd_1_1binary__function.html trunk/opentrep/ternary_tree/html/dir_0df55976ff011c1ef61da79183e9e28f.html trunk/opentrep/ternary_tree/html/dir_59457a7c227558cb0e28f31428e14f54.html trunk/opentrep/ternary_tree/html/dirs.html trunk/opentrep/ternary_tree/html/doxygen.css trunk/opentrep/ternary_tree/html/doxygen.png 
trunk/opentrep/ternary_tree/html/files.html trunk/opentrep/ternary_tree/html/functions.html trunk/opentrep/ternary_tree/html/functions_0x62.html trunk/opentrep/ternary_tree/html/functions_0x63.html trunk/opentrep/ternary_tree/html/functions_0x64.html trunk/opentrep/ternary_tree/html/functions_0x65.html trunk/opentrep/ternary_tree/html/functions_0x66.html trunk/opentrep/ternary_tree/html/functions_0x67.html trunk/opentrep/ternary_tree/html/functions_0x68.html trunk/opentrep/ternary_tree/html/functions_0x69.html trunk/opentrep/ternary_tree/html/functions_0x6b.html trunk/opentrep/ternary_tree/html/functions_0x6c.html trunk/opentrep/ternary_tree/html/functions_0x6d.html trunk/opentrep/ternary_tree/html/functions_0x6e.html trunk/opentrep/ternary_tree/html/functions_0x6f.html trunk/opentrep/ternary_tree/html/functions_0x70.html trunk/opentrep/ternary_tree/html/functions_0x72.html trunk/opentrep/ternary_tree/html/functions_0x73.html trunk/opentrep/ternary_tree/html/functions_0x74.html trunk/opentrep/ternary_tree/html/functions_0x75.html trunk/opentrep/ternary_tree/html/functions_0x76.html trunk/opentrep/ternary_tree/html/functions_0x77.html trunk/opentrep/ternary_tree/html/functions_0x7e.html trunk/opentrep/ternary_tree/html/functions_enum.html trunk/opentrep/ternary_tree/html/functions_eval.html trunk/opentrep/ternary_tree/html/functions_func.html trunk/opentrep/ternary_tree/html/functions_func_0x62.html trunk/opentrep/ternary_tree/html/functions_func_0x63.html trunk/opentrep/ternary_tree/html/functions_func_0x64.html trunk/opentrep/ternary_tree/html/functions_func_0x65.html trunk/opentrep/ternary_tree/html/functions_func_0x66.html trunk/opentrep/ternary_tree/html/functions_func_0x67.html trunk/opentrep/ternary_tree/html/functions_func_0x68.html trunk/opentrep/ternary_tree/html/functions_func_0x69.html trunk/opentrep/ternary_tree/html/functions_func_0x6b.html trunk/opentrep/ternary_tree/html/functions_func_0x6c.html 
trunk/opentrep/ternary_tree/html/functions_func_0x6d.html trunk/opentrep/ternary_tree/html/functions_func_0x6e.html trunk/opentrep/ternary_tree/html/functions_func_0x6f.html trunk/opentrep/ternary_tree/html/functions_func_0x70.html trunk/opentrep/ternary_tree/html/functions_func_0x72.html trunk/opentrep/ternary_tree/html/functions_func_0x73.html trunk/opentrep/ternary_tree/html/functions_func_0x74.html trunk/opentrep/ternary_tree/html/functions_func_0x75.html trunk/opentrep/ternary_tree/html/functions_func_0x76.html trunk/opentrep/ternary_tree/html/functions_func_0x7e.html trunk/opentrep/ternary_tree/html/functions_rela.html trunk/opentrep/ternary_tree/html/functions_type.html trunk/opentrep/ternary_tree/html/functions_type_0x62.html trunk/opentrep/ternary_tree/html/functions_type_0x63.html trunk/opentrep/ternary_tree/html/functions_type_0x64.html trunk/opentrep/ternary_tree/html/functions_type_0x66.html trunk/opentrep/ternary_tree/html/functions_type_0x68.html trunk/opentrep/ternary_tree/html/functions_type_0x69.html trunk/opentrep/ternary_tree/html/functions_type_0x6b.html trunk/opentrep/ternary_tree/html/functions_type_0x6c.html trunk/opentrep/ternary_tree/html/functions_type_0x6d.html trunk/opentrep/ternary_tree/html/functions_type_0x6e.html trunk/opentrep/ternary_tree/html/functions_type_0x70.html trunk/opentrep/ternary_tree/html/functions_type_0x72.html trunk/opentrep/ternary_tree/html/functions_type_0x73.html trunk/opentrep/ternary_tree/html/functions_type_0x74.html trunk/opentrep/ternary_tree/html/functions_type_0x76.html trunk/opentrep/ternary_tree/html/functions_vars.html trunk/opentrep/ternary_tree/html/globals.html trunk/opentrep/ternary_tree/html/globals_defs.html trunk/opentrep/ternary_tree/html/globals_func.html trunk/opentrep/ternary_tree/html/graph_legend.dot trunk/opentrep/ternary_tree/html/graph_legend.html trunk/opentrep/ternary_tree/html/graph_legend.png trunk/opentrep/ternary_tree/html/hierarchy.html trunk/opentrep/ternary_tree/html/index.html 
trunk/opentrep/ternary_tree/html/iteration__impl_8hpp.html trunk/opentrep/ternary_tree/html/iterator__wrapper_8hpp.html trunk/opentrep/ternary_tree/html/namespacecontainers.html trunk/opentrep/ternary_tree/html/namespacecontainers_1_1smap__detail.html trunk/opentrep/ternary_tree/html/namespacecontainers_1_1sset__detail.html trunk/opentrep/ternary_tree/html/namespacecontainers_1_1tst__detail.html trunk/opentrep/ternary_tree/html/namespacecontainers_1_1tst__detail_1_1mpl__detail.html trunk/opentrep/ternary_tree/html/namespacecontainers_1_1tst__erase__impl__detail.html trunk/opentrep/ternary_tree/html/namespacecontainers_1_1util.html trunk/opentrep/ternary_tree/html/namespaceiterators.html trunk/opentrep/ternary_tree/html/namespacemembers.html trunk/opentrep/ternary_tree/html/namespacemembers_func.html trunk/opentrep/ternary_tree/html/namespaces.html trunk/opentrep/ternary_tree/html/namespacestd.html trunk/opentrep/ternary_tree/html/new__iterator__base_8ipp.html trunk/opentrep/ternary_tree/html/pages.html trunk/opentrep/ternary_tree/html/perf_notes.html trunk/opentrep/ternary_tree/html/structcontainers_1_1smap__detail_1_1multimap__iterator-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1smap__detail_1_1multimap__iterator.html trunk/opentrep/ternary_tree/html/structcontainers_1_1sset__detail_1_1multiset__iterator-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1sset__detail_1_1multiset__iterator.html trunk/opentrep/ternary_tree/html/structcontainers_1_1ternary__tree_1_1find__result-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1ternary__tree_1_1find__result.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1always__heap__node-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1always__heap__node.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1back__push__pop-members.html 
trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1back__push__pop.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1dummy__sequence-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1dummy__sequence.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1heap__node-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1heap__node.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1inorder__seek-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1inorder__seek.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1inplace__node-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1inplace__node.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1iter__method__forward-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1iter__method__forward.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1key__access-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1key__access.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1levenshtein__search__info-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1levenshtein__search__info.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1levenshtein__search__info_1_1search-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1levenshtein__search__info_1_1search.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1mpl__detail_1_1if__c-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1mpl__detail_1_1if__c.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1mpl__detail_1_1if__c_3_01false_00_01_t1_00_01_t2_01_4-members.html 
trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1mpl__detail_1_1if__c_3_01false_00_01_t1_00_01_t2_01_4.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1node__base-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1node__base.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1size__policy__node-members.html trunk/opentrep/ternary_tree/html/structcontainers_1_1tst__detail_1_1size__policy__node.html trunk/opentrep/ternary_tree/html/structiterators_1_1const__traits-members.html trunk/opentrep/ternary_tree/html/structiterators_1_1const__traits.html trunk/opentrep/ternary_tree/html/structiterators_1_1iterator__wrapper-members.html trunk/opentrep/ternary_tree/html/structiterators_1_1iterator__wrapper.html trunk/opentrep/ternary_tree/html/structiterators_1_1nonconst__traits-members.html trunk/opentrep/ternary_tree/html/structiterators_1_1nonconst__traits.html trunk/opentrep/ternary_tree/html/structured__map_8hpp.html trunk/opentrep/ternary_tree/html/structured__set_8hpp.html trunk/opentrep/ternary_tree/html/structured_concept.html trunk/opentrep/ternary_tree/html/tab_b.gif trunk/opentrep/ternary_tree/html/tab_l.gif trunk/opentrep/ternary_tree/html/tab_r.gif trunk/opentrep/ternary_tree/html/tabs.css trunk/opentrep/ternary_tree/html/ternary__tree_8hpp.html trunk/opentrep/ternary_tree/html/todo.html trunk/opentrep/ternary_tree/html/tst__implementation_8ipp.html trunk/opentrep/ternary_tree/html/tst__iterator__base_8ipp.html trunk/opentrep/ternary_tree/html/tst__iterator__facade_8hpp.html trunk/opentrep/ternary_tree/html/tst__node_8hpp.html trunk/opentrep/ternary_tree/html/tst__search__results_8ipp.html trunk/opentrep/ternary_tree/html/tst_impl.html trunk/opentrep/ternary_tree/html/tst_links.html trunk/opentrep/ternary_tree/html/tst_reference.html trunk/opentrep/ternary_tree/html/tst_tests.html trunk/opentrep/ternary_tree/html/tst_usage.html 
trunk/opentrep/ternary_tree/html/usage_8hpp.html trunk/opentrep/ternary_tree/index.html trunk/opentrep/ternary_tree/iterator_compile_test.cpp trunk/opentrep/ternary_tree/iterator_wrapper.hpp trunk/opentrep/ternary_tree/readme.txt trunk/opentrep/ternary_tree/structured_map.hpp trunk/opentrep/ternary_tree/structured_set.hpp trunk/opentrep/ternary_tree/ternary_tree.hpp trunk/opentrep/ternary_tree/test/ trunk/opentrep/ternary_tree/test/basic_insertion_test.hpp trunk/opentrep/ternary_tree/test/check_iteration.hpp trunk/opentrep/ternary_tree/test/copy_test.hpp trunk/opentrep/ternary_tree/test/element_range_test.hpp trunk/opentrep/ternary_tree/test/erase_test.cpp trunk/opentrep/ternary_tree/test/hamming_search_test.cpp trunk/opentrep/ternary_tree/test/iterator_test.cpp trunk/opentrep/ternary_tree/test/localization_test.cpp trunk/opentrep/ternary_tree/test/longest_match_test.cpp trunk/opentrep/ternary_tree/test/mapped_value_test.cpp trunk/opentrep/ternary_tree/test/partial_match_test.cpp trunk/opentrep/ternary_tree/test/prefix_range_test.cpp trunk/opentrep/ternary_tree/test/scrabble_search_test.cpp trunk/opentrep/ternary_tree/test/test.vcproj trunk/opentrep/ternary_tree/test/test_tst.cpp trunk/opentrep/ternary_tree/test/tests_common.hpp trunk/opentrep/ternary_tree/tst.doxy trunk/opentrep/ternary_tree/tst_concept_checks.cpp trunk/opentrep/ternary_tree/tst_detail/ trunk/opentrep/ternary_tree/tst_detail/iteration_impl.hpp trunk/opentrep/ternary_tree/tst_detail/new_iterator_base.ipp trunk/opentrep/ternary_tree/tst_detail/tst_implementation.ipp trunk/opentrep/ternary_tree/tst_detail/tst_iterator_base.ipp trunk/opentrep/ternary_tree/tst_detail/tst_iterator_facade.hpp trunk/opentrep/ternary_tree/tst_detail/tst_node.hpp trunk/opentrep/ternary_tree/tst_detail/tst_search_results.ipp trunk/opentrep/ternary_tree/tst_public.doxy trunk/opentrep/test/ternary/ Added: trunk/opentrep/ternary_tree/README =================================================================== --- 
trunk/opentrep/ternary_tree/README (rev 0) +++ trunk/opentrep/ternary_tree/README 2009-07-14 10:45:19 UTC (rev 126) @@ -0,0 +1,3 @@ + +Source: http://abc.se/~re/code/tst and http://abc.se/~re/code/tst/ternary_tree.zip + Added: trunk/opentrep/ternary_tree/doxygen_input/blather.hpp =================================================================== --- trunk/opentrep/ternary_tree/doxygen_input/blather.hpp (rev 0) +++ trunk/opentrep/ternary_tree/doxygen_input/blather.hpp 2009-07-14 10:45:19 UTC (rev 126) @@ -0,0 +1,558 @@ +/** \mainpage Structured Associative Containers + +Ternary Search Tree containers to replace \c set<string> and \c map<string, Value> </h2> + +<center><table bgcolor="#fbf9e5" style="border: thin dotted #808000;" width="95%" border=0> +<tr> +<td> +<h3>Table of contents</h3> +<dl> + <dt>\ref introduction "Introduction"</dt> + <dt>\ref subkey_search_overview "Advanced searches overview"</dt> + <dt>\ref tst_usage "Tutorial"</dt> + <dt>\ref tst_reference "Reference"</dt> <dd> + <dd>\ref structured_concept "Structured Container concept" \n + Class \ref containers::structured_set "structured_set" \n + Class \ref containers::structured_map "structured_map" \n + Class \ref containers::structured_multiset "structured_multiset" \n + Class \ref containers::structured_multimap "structured_multimap" \n + Implementation class \ref containers::ternary_tree "ternary_tree" + </dd></dt> + <dt>\ref perf_notes "Performance notes"</dt> + <dt>\ref tst_impl "Implementation details"</dt> + <dt>\ref tst_links "Links"</dt> + <dt>\ref tst_tests "Test Suite"</dt> +</dl> +</td> +</tr></table></center> + +Download: Latest version (0.684) http://abc.se/~re/code/tst/ternary_tree.zip\n + +Copyleft: <a href="mailto:rasmus%20point%20ekman%20at%20abc%20point%20se?subject=Structured Containers suck/rule"> +rasmus ekman</a> 2007-2009 \n +Weblink: http://abc.se/~re/code/tst + +\anchor introduction <hr> +<h2>Introduction</h2> +<b>Structured containers</b> are \c map and \c set -like 
containers specialized for strings. +They are commonly used for dictionaries.\n +Structured containers have two major benefits: +- They offer near-match searches (wildcard search, partial match etc) that are hard to implement + with other containers. +- Lookup performance is on a par with hashed containers for many common applications, +and 2-5 times faster than standard maps and sets (with string-like keys). + +Of course there is a price to pay: structured containers use much more memory than +other containers: Around 6-8 bytes <b>per letter</b> inserted (whether \c char or \c wchar_t); +an English 150 k word dictionary uses eg 7.3 MB to store 1.2 MB words (2.4 MB of \c wchar_t words). + +The container classes in this library can be used as drop-in replacements for \c set and \c map +(or \c unordered_set, \c unordered_map): + - \ref containers::structured_set "structured_set": This stores unique keys and allows structured key searches. + - \ref containers::structured_multiset "structured_multiset": This stores non-unique keys. + - \ref containers::structured_map "structured_map": This is a + <a target="sgi" href="http://www.sgi.com/tech/stl/PairAssociativeContainer.html">Pair Associative Container</a>, + as it allows associating a value with each key. + - \ref containers::structured_multimap "structured_multimap": Technically, a + <a target="sgi" href="http://www.sgi.com/tech/stl/MultipleSortedAssociativeContainer.html">Multiple, Sorted, + Pair Associative Container</a> - it allows storing several values with each key. + +While the STL standard associative containers are normally backed by a binary tree structure, +Structured Containers are backed by a Ternary Search Tree, as presented by +\ref note_1 "Jon Bentley and Robert Sedgewick in [1]". + +Class \ref containers::ternary_tree "ternary_tree<Key, Value, Comp, Alloc>" provides the implementation backend. 
+Due to its internals, its interface cannot easily be made to conform with standard STL concepts, +so it is used internally by the structured* wrapper classes (much like STL's internal \c rb_tree class). + +Basically, if you have code using sets or maps, you have code to use structured containers. +And with 1-3 lines of code, you're ready to make advanced imprecise searches in your dictionaries.\n +See \ref tst_usage "the usage section" for examples of how to use these classes. + +<table bgcolor="#f0f0ff" style="border: thin dotted #808000;" border=0> +<tr><th>Library status</th></tr> +<tr><td valign="top" align="right">Compatibility:</td> +<td>Note that the file \b tst_concept_checks.cpp is currently broken. Will investigate.\n +<!-- This used to compile with Mingw GCC 3.4.2 and with MSVC7.1 (with STLport 5). Requires Boost 1.33. +Not sure what happened in Boost 1.36-37 or if I've mangled something. \n +Due to recent changes, ternary tree does not support stateful allocators (earlier versions did this by implication) --> +</td> +<tr><td valign="top" align="right">version 0.684: (Jan 2009)</td> +<td>Fix standard-breakage in multimap/multiset return from <code>insert(const value_type&)</code>.<br> +Added <code>operator-></code> to iterator wrapper for C++0x compatibility. +Thanks to Geoffrey Noel for reports.</td> +</tr> +<tr><td valign="top" align="right">version 0.683: (March 2007)</td> +<td>Fix portability issues for GCC and non-STLport libraries. Fix longest_match.<br> +Thanks to Arjen Wagenaar for several reports, fixes and encouragement. Thanks also to Michel Tourn for reports.</td> +</tr> +<tr><td valign="top" align="right">version 0.68: (Dec 2006)</td> +<td>Implement TST_NODE_COUNT_TYPE macro, which can be used to control node size on 64-bit systems. + See \ref containers::ternary_tree "class ternary_tree"</td> +</tr> +<tr><td valign="top" align="right">version 0.68 (alpha):</td> +<td>Reimplemented node type. 
Do proper management of value type (was inconsistent, partly unimplemented - duh!)</td> +</tr> +<!--tr> +<tr><td valign="top" align="right">version 0.676:</td> +<td>Modified containers to follow C++0x draft standard: \n +Added \c cbegin, \c cend methods returning \c const_iterator, and \c crbegin, \c crend +returning \c const_reverse_iterator, to make it easier to code with const-correctness. \n +\c erase(iterator pos); and \c erase(iterator first, iterator last); methods now return iterators.</td> +<tr><td valign="top" align="right">version 0.675:</td> +<td>All Structured Container classes implemented. Structured search interface TBD. +</td--> +</table> + + +\anchor subkey_search_overview <hr> +<h2>Sub-key, or Structure Searches</h2> +<span style="color:#905050;">(a new interface for these searches will be specified in the future)</span> + +Ternary trees allow searches that match parts of keys and ignores mismatches in other parts.\n +In the current interface we specify a small number of searches facilitated by the tree structure; +the Partial Match and Hamming searches are defined in several other implementations +(showcased in \ref note_1 "Bentley and Sedgewick" code). +The Levenshtein and combinatorial searches are not found in other ternary trees (that I know of). + +<table border="1" cellspacing="0"> + <tr><th bgcolor="#f0f0ff">Name (function name)</th><th bgcolor="#f0f0ff">Description</th></tr> + <tr><th> + Prefix match (\ref containers::ternary_tree::prefix_range "prefix_range")</th><td> + Finds keys sharing a common prefix, returns a pair of iterators.</td></tr> + <tr><th> + Longest match (\ref containers::ternary_tree::longest_match "longest_match")</th><td> + Finds the longest key that matches beginning of search string. 
+ A typical application is to tokenize a string using the ternary tree as dictionary.</td></tr> + <tr><th> + Partial match, or wildcard search (\ref containers::ternary_tree::partial_match_search "partial_match_search")</th><td> + Accepts a search string with wildcard characters that will match any letter, + eg "b?nd" would match "band", "bend", "bind", "bond" in an English dictionary.</td></tr> + <tr><th> + Search allowing \c N mismatches, + (\ref containers::ternary_tree::hamming_search "hamming_search"<span style="font-weight:normal;"></span>)</th><td> + Accepts a search string and an integer \c dist indicating how many non-matching letters are allowed, + then finds keys matching search string that have at most \a dist mismatches. + This works like a partial match search with all combinations of \a dist + wildcards in the search string.\n + \c hamming_search("band", 1) matches the wildcard search plus "bald", "bane" and "wand", etc. \n + The version here, following DDJ code, extends the strict Hamming search by also allowing shorter and longer + strings; a search for "band", \a dist = 1, also finds "ban" and "bandy" etc.\n + See also http://wikipedia.org/wiki/Hamming_distance</td></tr> + <tr><th> + Levenshtein distance search</b> (\ref containers::ternary_tree::levenshtein_search "levenshtein_search" + <span style="font-weight:normal;">- consider descriptive name</span>)</th><td> + + Hamming search matches characters in fixed position, allowing substitution of \a dist chars. + Levenshtein search also allows shifting parts of the search string by insertion or skipping chars (in \a dist places). + So <code>levenshtein_search("band", 1) </code> extends the hamming_search set with "and" and "bland", etc. 
+ A typical application is to match mispelt words.\n + See also http://wikipedia.org/wiki/Levenshtein_distance</td></tr> + <tr><th> + Combinatorial or "scrabble" search (\ref containers::ternary_tree::combinatorial_search "combinatorial_search")</th><td> + Finds all keys using the characters in search string. \c combinatorial_search("band") finds + "ad", "and", "bad", "dab", "nab", etc. A count of wildcards can be added, also allowing + nonmatching characters (use with care, values over 10% of average key length + may cause the algorithm to traverse a large part of the tree).</td></tr> +</table> + +See \ref usage_imprecise_searches "advanced search overview" in the tutorial. + +These searches are defined for all containers in this library. +But they are also marked as deprecated (to be replaced by generic algorithms with same interface). +For a relative performance comparison of imprecise searches, see the second table in \ref perf_notes. + +<h3>Future directions</h3> +The searches currently defined are clearly special cases in a sea of search possibilities. +We have only defined searches that are relatively efficient, compared to other combinations of containers and algorithms. +But there can be many variations on the available searches: increasing Hamming/Levenshtein distance +at the end of words, or matching limited ranges of characters (eg allowing mismatches only in vowels), etc. + +The next step for this project is to support a more flexible low-level interface for +traversing and filtering tree nodes. +The interface for these "structured searches" is open for consideration, but it +will basically define sub-key iterators, conversion of full-key from sub-key iterators, +and a small collection of algorithms operating on these sub-key iterators. 
+ +At least the following operations are needed: + + - sub-key match: matching a part of a key (prefix, or starting from current char position) + - key element range increment: from a sub-key position, match a range of characters + in next position (returns a list of sub-key iterators? - or iterator-like operation?) + - conversion from sub-key iterator to full-key iterator range (nearest and post-furthest + keys in the subtree) + - \c is_key(subkey_iterator pos): true if end-of-key exists at iterator position. + - \c count_elements(subkey_iterator pos): returns number of available key elements at position. + - In all predefined algorithms above, either a specific, or any char is matched, + we would also support arbitrary char sets (possibly with special case for char ranges). + + */ + +/** \page tst_reference Reference +<center><table bgcolor="#fbf9e5" style="border: thin dotted #808000;" width="95%" border=0> +<tr> +<td> +<dl> + <dt>\ref structured_concept "Structured Container concept"</dt> + <dt>\ref ref_sethpp "Header < structured_set.hpp >"</dt> + <dt>\ref ref_maphpp "Header < structured_map.hpp >"</dt> + <dt>\ref ref_tsthpp "Header < ternary_tree.hpp >"</dt> + <dt>\ref ref_iterhpp "Header < iterator_wrapper.hpp >"</dt> +</dl> +</td> +</tr></table></center> + +<hr> + +\anchor ref_sethpp +<h2>Header < <a href="../structured_set.hpp">%structured_set.hpp</a> > synopsis</h2> +<pre> +\b namespace containers { + \b template <\b class Key, + \b class Comp = std::less<\b typename Key::value_type>, + \b class Alloc = std::allocator<Key> > + \b class \ref containers::structured_set "structured_set"; + + \b template <\b class Key, + \b class Comp = std::less<\b typename Key::value_type>, + \b class Alloc = std::allocator<Key> > + \b class \ref containers::structured_multiset "structured_multiset"; +} +</pre> + +\anchor ref_maphpp +<h2>Header < <a href="../structured_map.hpp">%structured_map.hpp</a> > synopsis</h2> +<pre> +\b namespace containers { + \b template <\b class 
Key, + \b class T, + \b class Comp = std::less<\b typename Key::value_type>, + \b class Alloc = std::allocator<std::pair<\b const Key, T> > > + \b class \ref containers::structured_map "structured_map"; + + \b template <\b class Key, + \b class T, + \b class Comp = std::less<\b typename Key::value_type>, + \b class Alloc = std::allocator<std::pair<\b const Key, T> > > + \b class \ref containers::structured_multimap "structured_multimap"; +} +</pre> + +<hr> +Supplementary header files needed to support structured_set and -map classes. + + +\anchor ref_tsthpp +<h2>Header < <a href="../ternary_tree.hpp">%ternary_tree.hpp</a> > synopsis</h2> +<pre> +\b namespace containers { + + \b template <\b class Key, + \b class T, + \b class Comp = std::less<\b typename Key::value_type>, + \b class Alloc = std::allocator<std::pair<\b const Key, T> > > + \b class \ref containers::ternary_tree "ternary_tree"; + + \b template <\b class TreeT, \b class IteratorT> + \b class \ref containers::search_results_list "search_results_list"; + +} +</pre> + + +\anchor ref_iterhpp +<h2>Header < <a href="../iterator_wrapper.hpp">%iterator_wrapper.hpp</a> > synopsis</h2> +<pre> +\b namespace iterators { + + \b template <\b class T> \b struct const_traits; + \b template <\b class T> \b struct nonconst_traits; + + \b template <\b class BaseIterT, + \b class TraitsT, // either const_traits<T> or nonconst_traits<T> + \b class IterCatT = std::bidirectional_iterator_tag > + \b class \ref iterators::iterator_wrapper "iterator_wrapper"; +} +</pre> + +*/ + +/** +\page structured_concept Structured Associative Container Concept + +<span style="color:#905050;">(a preliminary sketch of the formal technical concept description)</span> + +A Structured Associative Container is a specialization of the C++ 98 standard concept +<a target="sgi" href="http://www.sgi.com/tech/stl/SortedAssociativeContainer.html">Sorted Associative Container</a>, +with extended interface. 
+ +The template parameters are similar to that of the Associated Containers: + +<code> structured_set<Key, Comp, Alloc>; </code>\n +<code> structured_map<Key, Value, Comp, Alloc>; </code>\n + +where: + - \c <b>Key</b> type is itself a container (eg a \c std::string or \c std::wstring) + - \c <b>Comp</b> is a comparison operator that imposes a sort order on \c Key::value_type elements \n + (so if \c Key is string, \c Comp compares \c char, if \c Key is \c wstring, \c Comp applies to \c wchar_t). + - \c <b>Value</b> can be any Assignable type + - \c <b>Alloc</b> is an allocator that manages all memory allocation for the container. + +The \c Comp and the \c Alloc types have default template arguments. + +In other words Structured containers are like Sorted Associative Containers, BUT + - add the requirement on Key template type to be a + <a target="sgi" href="http://www.sgi.com/tech/stl/ForwardContainer.html">Forward Container</a>.\n + For example, \c std::basic_string<CharT> is compatible with this requirement. + - change the requirement on the \c Comp (comparator) template argument to operate on + \c key_type::value_type elements (rather than on \c key_type itself). + Like Sorted Associative comparator, the \c Comp type shall define a less-like comparison, a + <a target="sgi" href="http://www.sgi.com/tech/stl/StrictWeakOrdering.html">Strict Weak Ordering</a> + of key-elements. + +<b>Associated types</b> + - \b char_compare: less-like comparison of key elements (establishing a Strict Weak Ordering). + The <a target="sgi" href="http://www.sgi.com/tech/stl/AssociativeContainer.html">Associative Container</a> + \c key_compare type is also provided, but is defined in terms of \c char_compare. \n + - \b subkey_iterator: Used in structure searches. Convertible to iterator (TBD). + +In consequence it allows searches involving subparts of keys, ie with shared prefix and/or +with shared middle parts. 
+ +<hr> +<h3>Deprecated search interface</h3> + +In the first iteration, additional searches are provided as methods on the containers. +This will be changed to use free functions operating on \c subkey_iterator. +The deprecated search methods will still be provided as convenience functions; +to migrate your code from present version to the new interface, will mean moving +the object name to the first argument, but also to respecify the search_results_list type. +(This sloppy-hackish type is by itself reason not to keep the method interface) + +See \ref subkey_search_overview "Structured search overview" +and \ref tst_structsearch "ternary_tree Structure search section". +*/ + +/* + +\b Notation \n +<table border=0> +<tr><td>\c X <td>A type that is a model of Associative Container </td></tr> +<tr><td>\c a <td>Object of type \c X </tr> +<tr><td>\c k <td>Object of type \c X::key_type </tr> +<tr><td>\c p, \c q <td>Object of type \c X::char_iterator </tr> +<tr><td>\c c <td>Object of type \c X::char_type </tr> +<tr><td>\c o <td>Object modelling output iterator </tr> +<tr><td>\c i <td>Object of type \c X::size_type </tr> +</dl> + +<table border=1> +<tr><th>Name</th><th>Expression</th><th>Return value</th> +<tr><td>Prefix match</td><td><code>a.prefix_range(k)</code></td><td> + \c std::pair<iterator, iterator> if \c a is mutable, otherwise <br>\c std::pair<const_iterator, const_iterator></td></tr> +<tr><td>Longest match</td><td><code>a.longest_match(p, q)</code></td><td> + \c iterator if \c a is mutable, otherwise \c const_iterator</td></tr> +<tr><td>Partial match, or <br>wildcard search</td><td><code>a.partial_match_search(k, o, c)</code></td><td> + The output iterator \c o</td></tr> +<tr><td>Hamming search</td><td><code>a.hamming_search(k, o, i)</code></td><td> + The output iterator \c o</td></tr> +<tr><td>Levenshtein search</td><td><code>a.levenshtein_search(k, o, i)</code></td><td> + The output iterator \c o</td></tr> +<tr><td>Combinatorial or <br>"scrabble" 
search</td><td><code>a.combinatorial_search(k, o, i)</code></td><td> + The output iterator \c o</td></tr> +</table> + +*/ + +/** \page tst_impl Implementation Details + * (In the following, "original" and "DDJ" code refers to the article by Bentley/Sedgewick + * published in Dr Dobb's Journal, and the accompanying C source code - see \ref tst_links) + * + * In most implementations, a ternary tree node has the following members: \code + * struct node { + * char splitchar; // key letter, or 0 at end of key + * node *hikid; // subtree of keys with higher value than splitchar + * node *eqkid; // subtree matching splitchar (pointer to mapped value at end-of-key node) + * node *lokid; // subtree less than splitchar + * node *parent; // necessary for iteration (not needed for insert/find) + * }; \endcode + * + * This means that each node is 1 char plus three or four pointers size. + * On many systems, struct member alignment makes the char member consume size of one pointer + * as well, so we have 4 (or 5) x sizeof(pointer) per node in the tree. + * With several kinds of dictionaries, the node count ends up at around 0.3-0.5 times + * total key length (since keys share nodes). + * This is even more expensive on 64-bit machines. + * + * There are several variation points in the node class: + * -# the DDJ C code designates an invalid value of zero to indicate end-of-string. We want to + * allow any string as key, so the end-of-string representation should change. + * We note that on many platforms, C/C++ struct member alignment leaves a "hole" + * in the binary representation of the node, between the char and the first pointer ("hikid"). + * On such systems there is no space cost to use another char-sized value to indicate end node. + * This also works for \c wchar_t strings on 32- or 64-bit systems. + * -# The original code stores a value for each string in the terminal node's "equal" pointer. + * The value in DDJ code is always a pointer to the terminated string. 
This is used to make + * advanced searches work (they return an array of pointers to strings stored in end-nodes). + * In reality this means that strings may need to be copied on insertion (not reflected in DDJ timings). + * -# Original DDJ code does not support iterating over strings in the tree. + * Idiomatic STL-like container style strongly suggests that iteration should be supported. + * This is fairly simple to implement if a parent pointer is added to the node struct: + * Because when an end-node is reached, the iterator must backtrack to find the previous + * branch point. + * + * The parent pointer also makes it possible to recover the inserted string by walking nodes + * backward from a terminal node to the root. Complexity is key length, plus log(tree.size), + * but it means inserted keys do not \b have to be copied to the end node. + * We opt to cache keys in iterators, at no measurable extra cost in iteration. + + * Instead of the key, an arbitrary value can be associated with endnodes. + * However, it should not be allowed to increase node size, since most nodes in the tree are not endnodes. + * In this library we store the mapped value directly in end-node if it is <tt> <= \c sizeof(void*). </tt> + * Larger objects are allocated on the heap, and a pointer to the copy is stored in end-node + * (the copy is managed by the tree). + * + * <h4>Now for some optimization</h4> + * We use a \c vector<node> as pool allocator, and record eq-hi-lo links as vector index instead of pointers. + * The pool allocation essentially follows original C code insert2() principle. + * For us, it also simplifies reallocation, since pointers do not have to be rebound; + * the indices are always valid. + * This has the following consequences: + * - allow the option of 4-byte indices also on 64-bit systems (with obvious resulting tree size limit) + * - When a new key is inserted, the last part (unique to the key) is always allocated in a batch. 
+ * This means that one node member, \c "eqkid", becomes redundant, as it is always the next index + * (except after terminal nodes of course). + * - in DDJ code the end-node value is stored in union with the eqkid. We note that the \c lokid node index + * is also unused by end-nodes (as no char should be lower than zero), so all endnode children + * are linked to the hi node. + * + * (In our binary-cognizant version where zero is a regular char value, this still holds, + * we just change the end-node test accordingly) + * + * In the final cut, our node struct data members appear roughly like this: \code + * struct node { + * CharType splitchar; // key letter, or 0 at end of key (to make sure lokid is never allowed) + * CharType endflag; // zero on normal nodes, 1 at end nodes, 2 at erased nodes. + * node_index hikid; // subtree of keys with higher value than splitchar + * node_index lokid; // subtree less than splitchar + * node_index parent; // necessary for iteration (not needed for insert/find) + * }; \endcode + * + * where \c CharType is defined by template \c Key::value_type, and treated as an unsigned type + * (so 0 is the lowest value); and \c node_index is a \c size_t -like type used by the node + * storage backend (currently \c std::vector). + * + * This optimization could also be applied to C version, trimming space requirement in DDJ code + * to 3-word nodes. + */ + +/** \page perf_notes Performance Notes + * + * <h3>Space considerations</h3> + * + * Ternary trees are notably larger than hash maps or most binary tree designs. + * Each node holds only one character (instead of a whole key), and use 3-5 pointers. + * Our nodes consist of 4 \c size_t values (16 bytes) regardless of platform pointer size, + * or char type (if at most 2-byte like \c wchar_t). 
+ * + * The shared parts of strings save space: In a typical English dictionary, + * each key shares over half its nodes with other keys, so the allocated space is about half + * of total key-length times 16. In a scrabble dictionary like the one reported below, + * which contains all valid word endings, most nodes are shared, so its storage cost is "only" + * total key length times 0.35 times 16, or less than 6 bytes per char. + * With \c wchar_t type, the storage cost cannot be considered overly large. + * + * See also \ref tst_impl + * + * <h3>Lookup speed</h3> + * + * The complexity of ternary tree operations is basically the same as for binary trees, + * (logarithmic in tree size) but with quite different constant factors. See \ref note_1 "[1]". + * + * Overall lookup and iteration speed depends on application factors - ie + * whether strings are inserted in random order or not, etc. + * + * Rough speed estimates (compared to Stlport hash_map and map). + * - insertion is a bit slower (>30% to 0%) than hash_map, ~30% faster than map. + * - finding a key is ~0-50% slower than hash_map (equal on failure, with short keys). + * - finding a key is 1.5-3 times faster than map (again with short keys). + * + * Compared to C versions (DDJ and libtst), + * - find and insert are slower, by factors ranging from 1.5-4. + * - partial_match and neighbour searches are 5-20% faster than published DDJ code - + * the code is essentially the same, but our implementation rolls out some recursion. + * This is easily back-ported, so in effect they should be considered to run at same speed. + * This by itself is good news though, since eg single-key lookup is always slower. + * + * Since each character in a key is at a separate node in the internal tree, + * iterating over values is a little slower than for other tree-based containers. + * + * For detailed test, see performance table below. 
+ * + * <hr style="height: 3px; border-top: 0px; background-color: #e09060;"> + * \htmlinclude performancetable.html + */ + + +/** \page tst_links Links + * ternary_tree by rasmus ekman, see http://abc.se/~re/code/tst <br> + * Download: http://abc.se/~re/code/tst/ternary_tree.zip + * + * Some other TST implementations. + * - <b>DDJ code:</b> Original C implementation by Jon Bentley and Robert Sedgewick. + * Article in Dr Dobb's Journal, 1998 #4: http://www.ddj.com/documents/s=921/ddj9804a/9804a.htm \anchor note_1 \n + * See http://www.cs.princeton.edu/~rs/strings/ for C code and article on TST complexity. + * - \b libtst: Worked-out version of DDJ code by Peter A. Friend 2002. Version 1.3. \n + * See http://www.octavian.org/cs/software.html \anchor note_2 \n + * - \b Boost.Spirit version: C++ reimplementation by Joel de Guzman. \anchor note_3 \n + * See http://spirit.sourceforge.net/ internal file ./boost/spirit/symbols/impl/tst.h + * - <b>Hartmut Kaiser version:</b> C++ reimplementation intended for generalization of tst. + * Currently abandoned, available in Spirit CVS. (interesting for interface design) \n + * See http://lists.boost.org/Archives/boost/2005/09/93316.php \n + * and http://article.gmane.org/gmane.comp.parsers.spirit.general/6959 + * - \b pytst: C++ version by Nicolas Lehuen, with SWIG wrappers for use from other languages. Version 0.97. \n + * See http://nicolas.lehuen.com/download/pytst/ + * (not yet tested) \anchor note_4 + * + * <h2>Feature chart</h2> + * All versions have insert and plain search, other features available as tabled below: + * \htmlinclude featuretable.html + */ + +/** \page tst_tests Test Suite + +All tests require the <a href="http://boost.org">Boost library</a> to compile. + +<h3>Concept checks</h3> + +The file <a href="../tst_concept_checks.cpp">tst_concept_checks.cpp</a> +performs a compile-time test of structured containers. 
\n +A class \c StructuredAssociativeContainer is defined, which contains +prototypes of all required methods for structured containers (also class ternary_tree). +Relevant concepts from \c boost/concept_check.hpp are used to check the structured set/map +containers. + +<h3>Correctness tests</h3> + +The subdirectory \b test in the distribution contains a bunch of files hacked up during development. +All these tests are performed by a single main test file <a href="../test/test_tst.cpp">test_tst.cpp</a>. +This file includes individual .cpp files, since we use a simplified (hacked) version +of the Boost.Test harness. + +Each test prints a single line to \c std::cerr saying whether the test was "OK" or "FAIL". +A line is added if an exception was thrown. + +These are runtime tests, several which require a file name to a dictionary-type file, +a plain-text file with one word per line. +The file \c fill_dictionary.cpp must be compiled with test projects, +it reads a dictionary file and fills a std::vector with strings. + +Dictionary files can be found by an internet search (try eg "dictionary file"). + +<h3>To do</h3> + +Proper organization and cleanup of this part of our library will be required before 1.0 release. + +*/ + + Added: trunk/opentrep/ternary_tree/doxygen_input/concepts.txt =================================================================== --- trunk/opentrep/ternary_tree/doxygen_input/concepts.txt (rev 0) +++ trunk/opentrep/ternary_tree/doxygen_input/concepts.txt 2009-07-14 10:45:19 UTC (rev 126) @@ -0,0 +1,122 @@ +Tutorial + +In programming, a Concept is a set of formal requirements on input/output of a subsystem, or pre/post-conditions of +an operation, of complexity constraints and exceptional behaviour. +Note ye well the "complexity" bit. Since the specification of C++ STL, the complexity of operations on +a type have been introduced as a proper feature of its concept, a full-citizenship part of type specification. 
+(This diverges from the mathematical roots of programming, which define types/concepts in purely structural terms
+-- ie as long as an operation does not transgress countability or infinity boundaries, it doesn't matter a damn bit whether
+it requires zero overhead or would enrol half the atoms in the universe to encode intermediate information.
+Maths is not about bean counting.)
+
+Here we will discuss tree concepts in the common sense.
+The following are some definitions of terms as used in documentation of the Structured Containers library.
+The definitions given are stipulative, in that they do not purely document an existing usage, but unless an expert
+tells me otherwise, I believe they should be adopted when discussing trees and tries.
+
+
+Tree =df a directed acyclic graph of single-parented, multi-childed Nodes. Usually single-rooted, but this is not essential.
+  Stipulative: tree nodes have a fixed maximum number of children.
+  This is an implementation constraint that has become ingrained in most programmers' understanding of the concept.
+  All trees can be reduced to (easily and naturally implemented by) a binary tree.
+
+We assume common terminology for the parts of Trees:
+  Root - a node designated as start point, from which other nodes are reachable as children, or children of children etc.
+  Single-rooted Tree - a Tree where all nodes are reachable from a single Root node.
+  Multi-rooted tree, or Forest - a Tree where several start nodes are designated.
+  Level N - the set of nodes that are at the same distance from Root. Every node at level 3 is reachable
+    from the/a Root node by following exactly 3 child-node links.
+  Sibling - relation between any two nodes on the same Level.
+  Leaf - a node without children.
+  Fanout - the number of children that a node can have. This defines the maximum number of nodes at each level.
+
+Trie =df A Tree where the nodes have an "alphabet"-sized (max) number of children, for some alphabet.
+ Typical alphabets are the English letters, Unicode, or the ACGT genetic bases. + +Tree nodes represent a full "key" of any [less-comparable] type. +Tries store string-like keys; a Trie node does not store a full key, only a part of it. +A full key is represented by a leaf node and its path back to the/a root node. +The reason for using Tries is that access to string-like keys is very fast - in principle linear in key length. +Binary Trees over the same key is O(log n) where n is count of keys in the tree, with average key length +as a constant factor. + + +Ternary Search Tree (TST) is a space optimization for Tries. + +Because each path to a child of a Tree node takes up memory space, Trie nodes are very expensive +if the alphabet is large. From the 3rd or 4th level on, most child-node links are empty. +A TST constructs exactly the number of links existing at each level of a Trie. + +Graphics: +Tree + root==node==node==node + \\node==node \\node==node + +Trie (6-letter alphabet: 123456) + root + ______________||______________ + || || || || || || + node1 (empty) node3 node4 node5 (empty) + ______________||_____________ + || || || || || || + node1 node2 node3 node4 node5 node6 + +Here we see the root node with 4/6 child links populated. Each child has 6 empty links, except the 5th child. +The 5th child has 6/6 links filled, and each of its children has 0/6 children. +In all there are 4+6 = 10 nodes, and 10*6 = 60 links. +Since a Trie only stores part of a key, the substantial information in each trie node is small, and the +structure overhead - the links - is very large. + +This cries out for optimization. Several kinds of variable-sized nodes have been tried, but they usually +end up with complex code to use and maintain, and thus squander the search speed which was the rationale +for constructing Tries in the first place. + + +TST is one such optimization. 
Here each trie node is constructed from much smaller nodes, but the +code to use and maintain nodes is still fairly simple, so search speed is not badly compromised. +Let's see the structure of the above tree: +The exact runtime layout depends on insert order. If child 3 is inserted before child 1, child1 may become a +"lower-child" child of child 2. +Here we assume insertion order 4, 3, 5, 1 for the root + + //node1 + //node3 +root=node4 + \\node5==(level2) + +level 2: assume insertion order 3,4,5,1,2,6 + + //node1 + // \\node2 +level2.root=node3 + \\node4 + \\node5 + \\node6 + + +Here we see 10 nodes, each with 3 child links. This means 30 links, ie half the link count of Trie. +The space savings are of course even better for larger alphabets. +(And in the Structured Containers implementation, the middle child link is omitted, so only 2*10=20 links are needed.) +Given English alphabet Trie with the above sparse population, there would be 26*10 = 260 links in +the Trie (each Trie node has 26 child links), and still the same count of TST node-links (30, or 20). +A Unicode Trie would have 10*2^16 links= 10*65536= 655 thousand links. A TST for Unicode again uses 30 (20) links only. + +Important points wrt TSTs and TST nodes. +A. TST nodes have two different kinds of child links: + (1) Two same-level sibling links + (2) One next-level "proper" child link. + +B. TSTs generalizes Trie implementations. + Tries with any alphabet can be implemented by the same TST node type - no new type needs to be defined for new alphabets. + (However a specialization is still often needed, since there must be a comparison function for the alphabet letters) + +C. TSTs are a hybrid of (binary) Tree and Trie. + In consequence of (A), TSTs combine the features of binary trees with Tries. + TST nodes can be viewed as binary nodes with associated data, where the data is a link to a next-level binary tree + (that implements a trie node). 
+ - Against this view one may note that the binary treelets have absolute size constraints (defined by + count of letters in the implemented alphabet) - this is an "unnatural" constraint on a tree type. + - In support of the view one may note that search complexity is more like binary trees than pure Trie implementation. + + + Added: trunk/opentrep/ternary_tree/doxygen_input/doxygen-old.css =================================================================== --- trunk/opentrep/ternary_tree/doxygen_input/doxygen-old.css (rev 0) +++ trunk/opentrep/ternary_tree/doxygen_input/doxygen-old.css 2009-07-14 10:45:19 UTC (rev 126) @@ -0,0 +1,311 @@ +BODY,H1,H2,H3,H4,H5,H6,P,CENTER,TD,TH,UL,DL,DIV { + font-family: Geneva, Arial, Helvetica, sans-serif; +} +/*BODY,TD { font-size: 90%; } +H1 { + font-size: 150%; + background-color: #eeeeff; + width: 100%; + border: 1px solid #b00000; + margin: 2px; + padding: 2px; +} +H2 { font-size: 140%; } +H3 { font-size: 100%; } */ + +CAPTION { font-weight: bold } +DIV.qindex { + width: 100%; + background-color: #eeeeff; + border: 1px solid #b0b0b0; + text-align: center; + margin: 2px; + padding: 2px; + line-height: 140%; +} +DIV.nav { + width: 100%; + background-color: #eeeeff; + border: 1px solid #b0b0b0; + text-align: center; + margin: 2px; + padding: 2px; + line-height: 140%; +} +DIV.navtab { + background-color: #eeeeff; + border: 1px solid #b0b0b0; + text-align: center; + margin: 2px; + margin-right: 15px; + padding: 2px; +} +TD.navtab { + font-size: 90%; +} +A.qindex { + text-decoration: none; + font-weight: bold; + color: #1A419D; +} +A.qindex:visited { + text-decoration: none; + font-weight: bold; + color: #1A419D +} +A.qindex:hover { + text-decoration: none; + background-color: #ddddff; +} +A.qindexHL { + text-decoration: none; + font-weight: bold; + background-color: #6666cc; + color: #ffffff; + border: 1px double #9295C2; +} +A.qindexHL:hover { + text-decoration: none; + background-color: #6666cc; + color: #ffffff; +} 
+A.qindexHL:visited { text-decoration: none; background-color: #6666cc; color: #ffffff } +A.el { text-decoration: none; font-weight: bold } +A.elRef { font-weight: bold } +A.code:link { text-decoration: none; font-weight: normal; color: #0000a0 } +A.code:visited { text-decoration: none; font-weight: normal; color: #0000a0 } +A.codeRef:link { font-weight: normal; color: #0000FF} +A.codeRef:visited { font-weight: normal; color: #0000FF} +A:hover { text-decoration: none; background-color: #f2f2ff } +DL.el { margin-left: -1cm } +.fragment { + font-family: Fixed, monospace + font-size: 100%; +} +PRE.fragment { + font-size: normal; + border: 1px solid #CCCCCC; + background-color: #f5f5f5; + margin-top: 4px; + margin-bottom: 4px; + margin-left: 2px; + margin-right: 8px; + padding-left: 6px; + padding-right: 6px; + padding-top: 4px; + padding-bottom: 4px; +} +DIV.ah { background-color: black; font-weight: bold; color: #ffffff; margin-bottom: 3px; margin-top: 3px } +TD.md { background-color: #F4F4FB; font-weight: bold; } +TD.mdPrefix { + background-color: #F4F4FB; + color: #606060; + font-size: 90%; +} +TD.mdname1 { background-color: #F4F4FB; font-weight: bold; color: #602020; } +TD.mdname { background-color: #F4F4FB; font-weight: bold; color: #602020; width: 600px; } +DIV.groupHeader { + margin-left: 16px; + margin-top: 12px; + margin-bottom: 6px; + padding: 3px; + font-weight: bold; + font-size: 110%; + background-color: #d0d0ff; +} +DIV.groupText { margin-left: 16px; font-style: italic; font-size: 90%; } +BODY { + background: white; + color: black; + margin-right: 20px; + margin-left: 20px; +} +TD.indexkey { + background-color: #eeeeff; + font-weight: bold; + padding-right : 10px; + padding-top : 2px; + padding-left : 10px; + padding-bottom : 2px; + margin-left : 0px; + margin-right : 0px; + margin-top : 2px; + margin-bottom : 2px; + border: 1px solid #CCCCCC; +} +TD.indexvalue { + background-color: #eeeeff; + font-style: italic; + padding-right : 10px; + padding-top : 
2px; + padding-left : 10px; + padding-bottom : 2px; + margin-left : 0px; + margin-right : 0px; + margin-top : 2px; + margin-bottom : 2px; + border: 1px solid #CCCCCC; +} +TR.memlist { + background-color: #f0f0f0; +} +P.formulaDsp { text-align: center; } +IMG.formulaDsp { } +IMG.formulaInl { vertical-align: middle; } +SPAN.keyword { color: #0000ff } +SPAN.keywordtype { color: #0000ff } +SPAN.keywordflow { color: #0000ff } +SPAN.comment { color: #008000 } +SPAN.preprocessor { color: #806020 } +SPAN.stringliteral { color: #800000 } +SPAN.charliteral { color: #800000 } +.mdTable { + border: 1px solid #868686; + background-color: #F4F4FB; +} +.mdRow { + padding: 8px 10px; +} +.mdescLeft { + padding: 0px 8px 4px 8px; + font-size: 12px; + font-style: italic; + background-color: #FAFAFA; + border-top: 1px none #E0E0E0; + border-right: 1px none #E0E0E0; + border-bottom: 1px none #E0E0E0; + border-left: 1px none #E0E0E0; + margin: 0px; +} +.mdescRight { + padding: 0px 8px 4px 8px; + font-size: 12px; + font-style: italic; + background-color: #FAFAFA; + border-top: 1px none #E0E0E0; + border-right: 1px none #E0E0E0; + border-bottom: 1px none #E0E0E0; + border-left: 1px none #E0E0E0; + margin: 0px; +} +.memItemLeft { + padding: 1px 0px 0px 8px; + margin: 4px; + border-top-width: 1px; + border-right-width: 1px; + border-bottom-width: 1px; + border-left-width: 1px; + border-top-color: #E0E0E0; + border-right-color: #E0E0E0; + border-bottom-color: #E0E0E0; + border-left-color: #E0E0E0; + border-top-style: solid; + border-right-style: none; + border-bottom-style: none; + border-left-style: none; + background-color: #FAFAFA; + font-size: 90%; +} +.memItemRight { + padding: 1px 8px 0px 8px; + margin: 4px; + border-top-width: 1px; + border-right-width: 1px; + border-bottom-width: 1px; + border-left-width: 1px; + border-top-color: #E0E0E0; + border-right-color: #E0E0E0; + border-bottom-color: #E0E0E0; + border-left-color: #E0E0E0; + border-top-style: solid; + border-right-style: none; 
+ border-bottom-style: none; + border-left-style: none; + background-color: #FAFAFA; + font-size: 100%; +} +.memTemplItemLeft { + padding: 1px 0px 0px 8px; + margin: 4px; + border-top-width: 1px; + border-right-width: 1px; + border-bottom-width: 1px; + border-left-width: 1px; + border-top-color: #E0E0E0; + border-right-color: #E0E0E0; + border-bottom-color: #E0E0E0; + border-left-color: #E0E0E0; + border-top-style: none; + border-right-style: none; + border-bottom-style: none; + border-left-style: none; + background-color: #FAFAFA; + font-size: 90%; +} +.memTemplItemRight { + padding: 1px 8px 0px 8px; + margin: 4px; + border-top-width: 1px; + border-right-width: 1px; + border-bottom-width: 1px; + borde... [truncated message content] |
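Editorial sketch (not part of the commit): the tst_impl page added in this revision describes keeping nodes in a vector<node> pool and recording links as vector indices instead of pointers, so that links stay valid when the pool reallocates. The following is a minimal C++ illustration of that idea under stated assumptions. It is not the ternary_tree code: the names IndexTst, Node, npos and make_node are hypothetical, the explicit eqkid link is kept for simplicity (the library is described as dropping it), and parent links, erase and mapped values are not modelled.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical node layout: links are 32-bit indices into a pool,
// not pointers, so they survive reallocation of the pool.
static const std::uint32_t npos = 0xFFFFFFFFu;  // "no child" marker (assumption)

struct Node {
    unsigned char splitchar;  // key letter stored at this node
    bool          is_end;     // true if a key ends at this node
    std::uint32_t lokid;      // subtree of letters lower than splitchar
    std::uint32_t eqkid;      // next letter of keys matching splitchar
    std::uint32_t hikid;      // subtree of letters higher than splitchar
};

class IndexTst {
public:
    // Returns true if the key was newly inserted, false if empty or present.
    bool insert(const std::string& key) {
        if (key.empty()) return false;
        if (root_ == npos) root_ = make_node(static_cast<unsigned char>(key[0]));
        std::uint32_t cur = root_;
        std::size_t i = 0;
        for (;;) {
            const unsigned char c = static_cast<unsigned char>(key[i]);
            if (c < pool_[cur].splitchar) {
                if (pool_[cur].lokid == npos) {
                    const std::uint32_t n = make_node(c);  // may reallocate pool_
                    pool_[cur].lokid = n;                  // index cur is still valid
                }
                cur = pool_[cur].lokid;
            } else if (c > pool_[cur].splitchar) {
                if (pool_[cur].hikid == npos) {
                    const std::uint32_t n = make_node(c);
                    pool_[cur].hikid = n;
                }
                cur = pool_[cur].hikid;
            } else {
                if (++i == key.size()) {           // last letter matched
                    const bool fresh = !pool_[cur].is_end;
                    pool_[cur].is_end = true;
                    return fresh;
                }
                if (pool_[cur].eqkid == npos) {
                    const std::uint32_t n =
                        make_node(static_cast<unsigned char>(key[i]));
                    pool_[cur].eqkid = n;
                }
                cur = pool_[cur].eqkid;
            }
        }
    }

    bool contains(const std::string& key) const {
        if (key.empty()) return false;
        std::uint32_t cur = root_;
        std::size_t i = 0;
        while (cur != npos) {
            const Node& n = pool_[cur];
            const unsigned char c = static_cast<unsigned char>(key[i]);
            if (c < n.splitchar)      cur = n.lokid;
            else if (c > n.splitchar) cur = n.hikid;
            else {
                if (++i == key.size()) return n.is_end;
                cur = n.eqkid;
            }
        }
        return false;
    }

    std::size_t node_count() const { return pool_.size(); }

private:
    // Appends a node to the pool and returns its index.
    std::uint32_t make_node(unsigned char c) {
        pool_.push_back(Node{c, false, npos, npos, npos});
        return static_cast<std::uint32_t>(pool_.size() - 1);
    }

    std::vector<Node> pool_;
    std::uint32_t root_ = npos;
};
```

With the keys used by basic() in examples.cpp ("apps", "applets", "banana"), this sketch allocates 14 nodes for 17 key characters, because "apps" and "applets" share the three nodes of "app". On common ABIs each such Node occupies 16 bytes, which matches the per-node figure quoted in the performance notes, although the member set here differs from the library's.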
From: <den...@us...> - 2009-07-14 14:07:33
Revision: 127 http://opentrep.svn.sourceforge.net/opentrep/?rev=127&view=rev Author: denis_arnaud Date: 2009-07-14 14:07:29 +0000 (Tue, 14 Jul 2009) Log Message: ----------- [TST] Updated the Ternary Structure Tree (TST). It still does not compile. Modified Paths: -------------- trunk/opentrep/configure.ac trunk/opentrep/opentrep/Makefile.am trunk/opentrep/ternary_tree/examples.cpp trunk/opentrep/ternary_tree/iterator_compile_test.cpp trunk/opentrep/ternary_tree/iterator_wrapper.hpp trunk/opentrep/ternary_tree/readme.txt trunk/opentrep/ternary_tree/structured_map.hpp trunk/opentrep/ternary_tree/structured_set.hpp trunk/opentrep/ternary_tree/ternary_tree.hpp trunk/opentrep/ternary_tree/tst_concept_checks.cpp trunk/opentrep/ternary_tree/tst_detail/iteration_impl.hpp trunk/opentrep/ternary_tree/tst_detail/new_iterator_base.ipp trunk/opentrep/ternary_tree/tst_detail/tst_implementation.ipp trunk/opentrep/ternary_tree/tst_detail/tst_iterator_base.ipp trunk/opentrep/ternary_tree/tst_detail/tst_iterator_facade.hpp trunk/opentrep/ternary_tree/tst_detail/tst_node.hpp trunk/opentrep/ternary_tree/tst_detail/tst_search_results.ipp Added Paths: ----------- trunk/opentrep/ternary_tree/Makefile.am trunk/opentrep/ternary_tree/fill_dictionary.hpp trunk/opentrep/ternary_tree/simple_tst.cpp trunk/opentrep/ternary_tree/sources.mk Removed Paths: ------------- trunk/opentrep/ternary_tree/fill_dictionary.cpp Property Changed: ---------------- trunk/opentrep/ternary_tree/ Modified: trunk/opentrep/configure.ac =================================================================== --- trunk/opentrep/configure.ac 2009-07-14 10:45:19 UTC (rev 126) +++ trunk/opentrep/configure.ac 2009-07-14 14:07:29 UTC (rev 127) @@ -211,6 +211,7 @@ opentrep.pc opentrep.spec opentrep.m4 + ternary_tree/Makefile opentrep/Makefile opentrep/basic/Makefile opentrep/bom/Makefile Modified: trunk/opentrep/opentrep/Makefile.am =================================================================== --- 
trunk/opentrep/opentrep/Makefile.am 2009-07-14 10:45:19 UTC (rev 126) +++ trunk/opentrep/opentrep/Makefile.am 2009-07-14 14:07:29 UTC (rev 127) @@ -3,8 +3,6 @@ ## Source directory -DISTCLEANFILES = @PACKAGE@-paths.h - MAINTAINERCLEANFILES = Makefile.in SUBDIRS = basic bom factory dbadaptor command service core config batches Property changes on: trunk/opentrep/ternary_tree ___________________________________________________________________ Added: svn:ignore + .libs .deps Makefile Makefile.in Added: trunk/opentrep/ternary_tree/Makefile.am =================================================================== --- trunk/opentrep/ternary_tree/Makefile.am (rev 0) +++ trunk/opentrep/ternary_tree/Makefile.am 2009-07-14 14:07:29 UTC (rev 127) @@ -0,0 +1,46 @@ +include $(top_srcdir)/Makefile.common +include $(srcdir)/sources.mk + +## +# Source directory + +DISTCLEANFILES = + +MAINTAINERCLEANFILES = Makefile.in + +SUBDIRS = + +EXTRA_DIST = + + +## +# Library +lib_LTLIBRARIES = libtst.la + +libtst_la_SOURCES = $(tst_h_sources) $(tst_cc_sources) +#libtst_la_LIBADD = +libtst_la_LDFLAGS = -version-info $(GENERIC_LIBRARY_VERSION) + +# Header files +nobase_pkginclude_HEADERS = $(ttree_h_sources) +#nobase_nodist_pkginclude_HEADERS = $(top_builddir)/@PACKAGE@/config.h + + +## +# Binaries (batches) +bin_PROGRAMS = simple_tst + +simple_tst_SOURCES = simple_tst.cpp +#simple_tst_CXXFLAGS = +#simple_tst_LDADD = +#simple_tst_LDFLAGS = + +## +# Test binaries +#check_PROGRAMS = iterator_compile_test tst_concept_checks + +#iterator_compile_test_SOURCES = iterator_compile_test.cpp +#iterator_compile_test_LDFLAGS = + +#tst_concept_checks_SOURCES = tst_concept_checks.cpp +#tst_concept_checks_LDFLAGS = Modified: trunk/opentrep/ternary_tree/examples.cpp =================================================================== --- trunk/opentrep/ternary_tree/examples.cpp 2009-07-14 10:45:19 UTC (rev 126) +++ trunk/opentrep/ternary_tree/examples.cpp 2009-07-14 14:07:29 UTC (rev 127) @@ -1,231 +1,231 @@ -/** 
\file - * Usage examples for Structured Containers. - */ - -#include <iostream> -#include "structured_set.hpp" -#include "structured_map.hpp" - -#include <iostream> -#include <string> -#include <set> -#include <functional> -//#include <boost/scoped_ptr.hpp> - -// -// Basic use of structured_set -// - -void basic() -{ - typedef containers::structured_set<std::string> Set; - typedef Set::iterator SetIter; - typedef std::pair<SetIter, SetIter> IterPair; - - Set names; - names.insert("apps"); - names.insert("applets"); - names.insert("banana"); - - std::cout << "The set contains\n\t"; - for (SetIter it = names.begin(); it != names.end(); ++it) - std::cout << *it << ", "; - - IterPair p = names.prefix_range("app"); - std::cout << "\nprefix_range(\"app\") returns:\n\t"; - while (p.first != p.second) { - std::cout << *p.first++ << ", "; - } - std::cout << "\np.second points to " << *p.second; - - std::cout << "\nequal_range(\"app\") returns:\n\t"; - p = names.equal_range("app"); - if (p.first == p.second) - std::cout << "empty range"; - std::cout << "\np.second points to " << *p.second; -} - - -//############################################################################# -// -// prefix_range example (compile-only, does not run as is) -// -typedef containers::structured_set<std::string> SymbolSet; -SymbolSet symbols; -bool is_defined_in_scope(std::string scope, std::string name) -{ - typedef std::pair<SymbolSet::iterator, SymbolSet::iterator> Range; - Range r = symbols.prefix_range(scope + "::"); - SymbolSet::iterator n = symbols.find(name); - return n != symbols.end() && *n >= *r.first && *n < *r.second; -} - - -//############################################################################# -// -// Case-insensitive structured containers -// -template<class CharT> -struct nocase_less : public std::binary_function<CharT, CharT, bool> -{ - bool operator()(CharT a, CharT b) const { return tolower(a) < tolower(b); } -}; - -void caseless_set() -{ - typedef 
containers::structured_multiset<std::string, nocase_less<char> > CaselessSet; - typedef containers::structured_multimap<std::string, double, nocase_less<char> > CaselessMap; - - CaselessMap uncased; - uncased.insert(std::make_pair("NoCase", 0.1)); - CaselessSet caseless; - caseless.insert("NoCase"); - caseless.insert("nocase"); - caseless.insert("noCase"); - caseless.insert("NOCASE"); - - std::cout << "nocase = " << (int)caseless.count("nocase"); - - CaselessSet::const_iterator endit = caseless.end(); - for(CaselessSet::const_iterator it = caseless.begin(); it != endit; ++it) { - std::cout << ", " << *it; - } - -} - -//############################################################################# -// -// Localization comparator -// - -#include "examples/locale_less.hpp" - -void localized_comparator() -{ - typedef containers::structured_set<std::string, utility::locale_less<char> > LocalSet; - - typedef containers::structured_set<std::string> DefaultSet; - - if (utility::swedish_locale_name == "C") - std::cout << "No locale to test\n"; - else - std::cout << "Attempt to set Swedish locale \"" << utility::swedish_locale_name << "\"\n"; - - try { - // use comparator constructor, create Swedish locale - LocalSet se_names(utility::locale_less<char>::locale_less(utility::swedish_locale_name)); - DefaultSet anynames; - - se_names.insert("\xC4ska"); - se_names.insert("\xC5m\xE5l"); - se_names.insert("\xD6dla"); - se_names.insert("Adam"); - - anynames.insert("\xC4ska"); - anynames.insert("\xC5m\xE5l"); - anynames.insert("\xD6dla"); - anynames.insert("Adam"); - - - for(LocalSet::iterator sit = se_names.begin(); sit != se_names.end(); ++sit) { - std::cout << *sit << ", "; - } - std::cout << "not:\n"; - for(DefaultSet::iterator dit = anynames.begin(); dit != anynames.end(); ++dit) { - std::cout << *dit << ", "; - } - } catch(std::exception& x) { - std::cout << "...failed - skip test\n" << x.what() << "\n"; - } -} - 
-//############################################################################# -// -// longest_match example -// -#include <fstream> - -typedef containers::structured_map<std::string, int, nocase_less<char> > Vocabulary; - -void fill_wordlist(const char* filename, Vocabulary& wordlist) -{ - std::ifstream wordstream(filename); - if (!wordstream.is_open()) { - std::cerr << "Could not open dictionary " << filename << "\n"; - return; - } - char buf[300]; - int linecount = 0; - while(wordstream.getline(buf, 300, '\n').good()) - wordlist[buf] = ++linecount; -} - -std::streamsize get_filesize(std::ifstream& str) -{ - std::streamsize pos = str.tellg(); - str.seekg(0, std::ios_base::end); - std::streamsize result = str.tellg(); - str.seekg(pos, std::ios_base::beg); - return result; -} - -namespace { - template<class T> - struct scoped_array - { - scoped_array(size_t count) : buf(new T[count]) {} - ~scoped_array() { delete[] buf; } - T* get() { return buf; } - private: - T* buf; - }; -} - -void longest_match_example(const char* dictfile, const char* parsefile) -{ - Vocabulary english; - // Read dictionary from disk - fill_wordlist(dictfile, english); - if (english.empty()) - return; - - std::ifstream infile(parsefile); - if (!infile.is_open()) - return; - - // longest_match does not work with istream_iterator, so must fill buffer - size_t filesize = (size_t)get_filesize(infile); - // instead of boost::scoped_array - scoped_array<char> bytes(filesize); - infile.read(bytes.get(), filesize); - - const char *first = bytes.get(); - const char *last = first + infile.gcount(); - - while (first != last) - { - Vocabulary::iterator word = english.longest_match(first, last); - if (word != english.end()) - std::cout << (*word).first << " "; //= " << (*word).second << "\n"; - else { - // No key; try next char - ++first; - } - } -} - -//############################################################################# - -int main() -{ - std::cout << "*** basic usage ***\n"; - basic(); - 
std::cout << "\n\n*** custom comparator ***\n"; - caseless_set(); - std::cout << "\n\n*** locale comparator ***\n"; - localized_comparator(); - std::cout << "\n\n*** longest_match ***\n"; - // You need to supply files, not included in ternary_tree distribution - longest_match_example("../english-150k.txt", "../shakequotes.txt"); - return 0; -} +/** \file + * Usage examples for Structured Containers. + */ + +#include <iostream> +#include "structured_set.hpp" +#include "structured_map.hpp" + +#include <iostream> +#include <string> +#include <set> +#include <functional> +//#include <boost/scoped_ptr.hpp> + +// +// Basic use of structured_set +// + +void basic() +{ + typedef containers::structured_set<std::string> Set; + typedef Set::iterator SetIter; + typedef std::pair<SetIter, SetIter> IterPair; + + Set names; + names.insert("apps"); + names.insert("applets"); + names.insert("banana"); + + std::cout << "The set contains\n\t"; + for (SetIter it = names.begin(); it != names.end(); ++it) + std::cout << *it << ", "; + + IterPair p = names.prefix_range("app"); + std::cout << "\nprefix_range(\"app\") returns:\n\t"; + while (p.first != p.second) { + std::cout << *p.first++ << ", "; + } + std::cout << "\np.second points to " << *p.second; + + std::cout << "\nequal_range(\"app\") returns:\n\t"; + p = names.equal_range("app"); + if (p.first == p.second) + std::cout << "empty range"; + std::cout << "\np.second points to " << *p.second; +} + + +//############################################################################# +// +// prefix_range example (compile-only, does not run as is) +// +typedef containers::structured_set<std::string> SymbolSet; +SymbolSet symbols; +bool is_defined_in_scope(std::string scope, std::string name) +{ + typedef std::pair<SymbolSet::iterator, SymbolSet::iterator> Range; + Range r = symbols.prefix_range(scope + "::"); + SymbolSet::iterator n = symbols.find(name); + return n != symbols.end() && *n >= *r.first && *n < *r.second; +} + + 
+//############################################################################# +// +// Case-insensitive structured containers +// +template<class CharT> +struct nocase_less : public std::binary_function<CharT, CharT, bool> +{ + bool operator()(CharT a, CharT b) const { return tolower(a) < tolower(b); } +}; + +void caseless_set() +{ + typedef containers::structured_multiset<std::string, nocase_less<char> > CaselessSet; + typedef containers::structured_multimap<std::string, double, nocase_less<char> > CaselessMap; + + CaselessMap uncased; + uncased.insert(std::make_pair("NoCase", 0.1)); + CaselessSet caseless; + caseless.insert("NoCase"); + caseless.insert("nocase"); + caseless.insert("noCase"); + caseless.insert("NOCASE"); + + std::cout << "nocase = " << (int)caseless.count("nocase"); + + CaselessSet::const_iterator endit = caseless.end(); + for(CaselessSet::const_iterator it = caseless.begin(); it != endit; ++it) { + std::cout << ", " << *it; + } + +} + +//############################################################################# +// +// Localization comparator +// + +#include "examples/locale_less.hpp" + +void localized_comparator() +{ + typedef containers::structured_set<std::string, utility::locale_less<char> > LocalSet; + + typedef containers::structured_set<std::string> DefaultSet; + + if (utility::swedish_locale_name == "C") + std::cout << "No locale to test\n"; + else + std::cout << "Attempt to set Swedish locale \"" << utility::swedish_locale_name << "\"\n"; + + try { + // use comparator constructor, create Swedish locale + LocalSet se_names(utility::locale_less<char>::locale_less(utility::swedish_locale_name)); + DefaultSet anynames; + + se_names.insert("\xC4ska"); + se_names.insert("\xC5m\xE5l"); + se_names.insert("\xD6dla"); + se_names.insert("Adam"); + + anynames.insert("\xC4ska"); + anynames.insert("\xC5m\xE5l"); + anynames.insert("\xD6dla"); + anynames.insert("Adam"); + + + for(LocalSet::iterator sit = se_names.begin(); sit != se_names.end(); 
++sit) { + std::cout << *sit << ", "; + } + std::cout << "not:\n"; + for(DefaultSet::iterator dit = anynames.begin(); dit != anynames.end(); ++dit) { + std::cout << *dit << ", "; + } + } catch(std::exception& x) { + std::cout << "...failed - skip test\n" << x.what() << "\n"; + } +} + +//############################################################################# +// +// longest_match example +// +#include <fstream> + +typedef containers::structured_map<std::string, int, nocase_less<char> > Vocabulary; + +void fill_wordlist(const char* filename, Vocabulary& wordlist) +{ + std::ifstream wordstream(filename); + if (!wordstream.is_open()) { + std::cerr << "Could not open dictionary " << filename << "\n"; + return; + } + char buf[300]; + int linecount = 0; + while(wordstream.getline(buf, 300, '\n').good()) + wordlist[buf] = ++linecount; +} + +std::streamsize get_filesize(std::ifstream& str) +{ + std::streamsize pos = str.tellg(); + str.seekg(0, std::ios_base::end); + std::streamsize result = str.tellg(); + str.seekg(pos, std::ios_base::beg); + return result; +} + +namespace { + template<class T> + struct scoped_array + { + scoped_array(size_t count) : buf(new T[count]) {} + ~scoped_array() { delete[] buf; } + T* get() { return buf; } + private: + T* buf; + }; +} + +void longest_match_example(const char* dictfile, const char* parsefile) +{ + Vocabulary english; + // Read dictionary from disk + fill_wordlist(dictfile, english); + if (english.empty()) + return; + + std::ifstream infile(parsefile); + if (!infile.is_open()) + return; + + // longest_match does not work with istream_iterator, so must fill buffer + size_t filesize = (size_t)get_filesize(infile); + // instead of boost::scoped_array + scoped_array<char> bytes(filesize); + infile.read(bytes.get(), filesize); + + const char *first = bytes.get(); + const char *last = first + infile.gcount(); + + while (first != last) + { + Vocabulary::iterator word = english.longest_match(first, last); + if (word != english.end()) 
+ std::cout << (*word).first << " "; //= " << (*word).second << "\n"; + else { + // No key; try next char + ++first; + } + } +} + +//############################################################################# + +int main() +{ + std::cout << "*** basic usage ***\n"; + basic(); + std::cout << "\n\n*** custom comparator ***\n"; + caseless_set(); + std::cout << "\n\n*** locale comparator ***\n"; + localized_comparator(); + std::cout << "\n\n*** longest_match ***\n"; + // You need to supply files, not included in ternary_tree distribution + longest_match_example("../english-150k.txt", "../shakequotes.txt"); + return 0; +} Deleted: trunk/opentrep/ternary_tree/fill_dictionary.cpp =================================================================== --- trunk/opentrep/ternary_tree/fill_dictionary.cpp 2009-07-14 10:45:19 UTC (rev 126) +++ trunk/opentrep/ternary_tree/fill_dictionary.cpp 2009-07-14 14:07:29 UTC (rev 127) @@ -1,66 +0,0 @@ -#include <vector> -#include <string> -#include <iostream> -#include <fstream> -#include <algorithm> -#include <stdexcept> - -typedef std::vector<std::string> Dictionary; - -//template<class Container> -size_t fill_dictionary(const char* filename, Dictionary& dictionary, size_t maxsize, size_t line_length = 0) -{ - std::ifstream input(filename); - size_t longest_in_file = 0; - size_t linecount = 0; - if (!input.is_open()) { - std::cerr << filename << ": file open fail\n"; - throw std::runtime_error("fill_dictionary failed"); - } - if (input.is_open() && !line_length) - { - std::vector<char> next; - next.resize(std::max(line_length, size_t(300))); - while(input.good() && linecount < maxsize) { - input.getline(&next[0], next.capacity()); - std::string s(next.begin(), next.begin() + (size_t)input.gcount()); - if (!s.empty()) { - dictionary.push_back(s.c_str()); - ++linecount; - if (s.size() > longest_in_file) - longest_in_file = s.size(); - //std::cerr << s.c_str() << "\n"; - } - //next.clear(); - } - //std::cerr << "Read " << dictionary.size() 
<< " lines from wordlist.txt\n"; - } - - // If file not long enough, fill up with some random alphabetic strings - if (line_length || (dictionary.size() < maxsize && (maxsize < size_t(-1)) ) ) - { - std::string next; - if (!line_length) { - line_length = linecount? longest_in_file : 10; - std::cerr << "zero-length line, we'll have trouble"; - } - next.reserve(line_length + 1); - for (size_t i = dictionary.size(); i < maxsize; ++i) - { - size_t length = 1 + (rand() % line_length); - next.resize(length--); - while(length--) { - next[length] = rand() % (127-' ') + ' '; - } - //std::cerr << next << '\n'; - dictionary.push_back(next.c_str()); - } - if (line_length > longest_in_file) - longest_in_file = line_length; - } - return longest_in_file; -} - - - - Copied: trunk/opentrep/ternary_tree/fill_dictionary.hpp (from rev 126, trunk/opentrep/ternary_tree/fill_dictionary.cpp) =================================================================== --- trunk/opentrep/ternary_tree/fill_dictionary.hpp (rev 0) +++ trunk/opentrep/ternary_tree/fill_dictionary.hpp 2009-07-14 14:07:29 UTC (rev 127) @@ -0,0 +1,66 @@ +#include <vector> +#include <string> +#include <iostream> +#include <fstream> +#include <algorithm> +#include <stdexcept> + +typedef std::vector<std::string> Dictionary; + +//template<class Container> +size_t fill_dictionary(const char* filename, Dictionary& dictionary, size_t maxsize, size_t line_length = 0) +{ + std::ifstream input(filename); + size_t longest_in_file = 0; + size_t linecount = 0; + if (!input.is_open()) { + std::cerr << filename << ": file open fail\n"; + throw std::runtime_error("fill_dictionary failed"); + } + if (input.is_open() && !line_length) + { + std::vector<char> next; + next.resize(std::max(line_length, size_t(300))); + while(input.good() && linecount < maxsize) { + input.getline(&next[0], next.capacity()); + std::string s(next.begin(), next.begin() + (size_t)input.gcount()); + if (!s.empty()) { + dictionary.push_back(s.c_str()); + ++linecount; + 
if (s.size() > longest_in_file) + longest_in_file = s.size(); + //std::cerr << s.c_str() << "\n"; + } + //next.clear(); + } + //std::cerr << "Read " << dictionary.size() << " lines from wordlist.txt\n"; + } + + // If file not long enough, fill up with some random alphabetic strings + if (line_length || (dictionary.size() < maxsize && (maxsize < size_t(-1)) ) ) + { + std::string next; + if (!line_length) { + line_length = linecount? longest_in_file : 10; + std::cerr << "zero-length line, we'll have trouble"; + } + next.reserve(line_length + 1); + for (size_t i = dictionary.size(); i < maxsize; ++i) + { + size_t length = 1 + (rand() % line_length); + next.resize(length--); + while(length--) { + next[length] = rand() % (127-' ') + ' '; + } + //std::cerr << next << '\n'; + dictionary.push_back(next.c_str()); + } + if (line_length > longest_in_file) + longest_in_file = line_length; + } + return longest_in_file; +} + + + + Modified: trunk/opentrep/ternary_tree/iterator_compile_test.cpp =================================================================== --- trunk/opentrep/ternary_tree/iterator_compile_test.cpp 2009-07-14 10:45:19 UTC (rev 126) +++ trunk/opentrep/ternary_tree/iterator_compile_test.cpp 2009-07-14 14:07:29 UTC (rev 127) @@ -1,163 +1,163 @@ -/** Pure compilation/header test - * \file - * This file checks interoperability requirements for iterator_wrapper.hpp - * The problem cases are those that should fail: they cannot be checked - * automatically by compiler. - * So to use, you must define TEST_COMPILATION_FAILURE or - * CHECK_SPECIAL_COMP_FAILURE below, and then inspect compiler warnings - * to see that you get an error for every SHOULD_FAIL line (#1-F, #20-21). - * To simplify this, uncomment one statement at a time in the function - * iterator_interop_checks_main() - * at the end of this file, then try to compile. 
- * - * Construction/assignment to reverse_iterator from const_reverse_iterator - * generates more complicated error messages from compilers, - * so were broken out to allow separate runs. - * Define the macro CHECK_SPECIAL_COMP_FAILURE and look for - * SPECIAL_FAIL_1 and _2 in the compiler output. - * - * This (un)works as required with MSVC and Comeau online tryitout compiler. - */ - -#ifdef _MSC_VER -# pragma warning(disable: 4245 4127 4189 4700) -#endif - - -//#define TEST_COMPILATION_FAILURE -// Two cases must be inspected in references on MSVC -//#define CHECK_SPECIAL_COMP_FAILURE - -#ifdef TEST_COMPILATION_FAILURE -# define SHOULD_FAIL( Pred ) Pred -#else -# define SHOULD_FAIL( Pred ) -#endif - -#ifdef CHECK_SPECIAL_COMP_FAILURE -# define SPECIAL_FAIL_1( Pred ) Pred -# define SPECIAL_FAIL_2( Pred ) Pred -#else -# define SPECIAL_FAIL_1( Pred ) -# define SPECIAL_FAIL_2( Pred ) -#endif - - - -template<class Container> -void iterator_interop_checks() -{ - typedef Container C; - typedef typename C::const_iterator c_t; - typedef typename C::iterator i_t; - typedef typename C::const_reverse_iterator cr_t; - typedef typename C::reverse_iterator r_t; - c_t c; - i_t i; - cr_t cr; - r_t r; -///// COPY-CONSTRUCTORS - // Should work - c_t i1( i ); - c_t i2( cr.base() ); - c_t i3( r.base() ); - i_t i4( r.base() ); - cr_t i5( r ); - cr_t i6( c ); - cr_t i7( i ); - r_t i8( i ); - - SHOULD_FAIL( c_t i101( cr ); ) // #1 - SHOULD_FAIL( c_t i102( r ); ) // #2 - SHOULD_FAIL( i_t i103( c ); ) // #3 - SHOULD_FAIL( i_t i104( cr ); ) // #4 - SHOULD_FAIL( i_t i105( r ); ) // #5 - SHOULD_FAIL( r_t i106( c ); ) // #6 - -///// ASSIGNMENT - // Should work - c = i; - c = cr.base(); - c = r.base(); - i = r.base(); - cr = r; - - SHOULD_FAIL( c = cr; ) // #7 - SHOULD_FAIL( c = r; ) // #8 - SHOULD_FAIL( i = c; ) // #9 - SHOULD_FAIL( i = cr; ) // #A - SHOULD_FAIL( i = r; ) // #B - SHOULD_FAIL( cr = c; ) // #C - SHOULD_FAIL( cr = i; ) // #D - SHOULD_FAIL( r = c; ) // #E - SHOULD_FAIL( r = i; ) 
// #F - -// these fail in 2nd pass or something, compile separately - SPECIAL_FAIL_1( r_t i107( cr ); ) // #10 - SPECIAL_FAIL_2( r = cr; ) // #11 - -///// ADVANCE - ++c; --c; - ++i; --i; - ++cr; --cr; - ++r; --r; - c++; c--; - i++; i--; - cr++; cr--; - r++; r--; - -///// COMPARE - if (c == i && i == c) c = c; - if (c == cr.base() && cr.base() == c) c = c; - if (i == r.base() && r.base() == i) c = c; - // Should this fail? - Dinkum nor StlPort don't seem to prevent it - //if (r == cr && cr == r) c = c; - if (r.base() == cr.base() && cr.base() == r.base()) c = c; - -///// DEREFERENCE - typedef typename C::value_type val_t; - typedef typename C::pointer ptr_t; - typedef typename C::reference ref_t; - typedef typename C::const_reference cref_t; - - val_t val1 = *c; - val_t val2 = *i; - val_t val3 = *cr; - val_t val4 = *r; - - ref_t ref2 = *i; - ref_t ref4 = *r; - - cref_t cref1 = *c; - cref_t cref2 = *i; - cref_t cref3 = *cr; - cref_t cref4 = *r; - - SHOULD_FAIL( ref_t ref1 = *c ); // #20 - SHOULD_FAIL( ref_t ref3 = *cr ); // #21 - -} - -#include <vector> -#include "ternary_tree.hpp" -#include "structured_set.hpp" -#include "structured_map.hpp" - -void iterator_interop_checks_main() -{ - typedef std::vector<int> Cont; -// iterator_interop_checks<Cont>(); -/* typedef containers::ternary_tree<std::string, int> Tst; - iterator_interop_checks<Tst>(); - typedef containers::structured_set<std::string> StrucSet; - iterator_interop_checks<StrucSet>(); -*/ typedef containers::structured_multiset<std::string> MStrucSet; - iterator_interop_checks<MStrucSet>(); -/* typedef containers::structured_map<std::string, int> StrucMap; - iterator_interop_checks<StrucMap>(); - typedef containers::structured_multimap<std::string, int> MStrucMap; - iterator_interop_checks<MStrucMap>(); */ -} - - - +/** Pure compilation/header test + * \file + * This file checks interoperability requirements for iterator_wrapper.hpp + * The problem cases are those that should fail: they cannot be checked + * 
automatically by compiler. + * So to use, you must define TEST_COMPILATION_FAILURE or + * CHECK_SPECIAL_COMP_FAILURE below, and then inspect compiler warnings + * to see that you get an error for every SHOULD_FAIL line (#1-F, #20-21). + * To simplify this, uncomment one statement at a time in the function + * iterator_interop_checks_main() + * at the end of this file, then try to compile. + * + * Construction/assignment to reverse_iterator from const_reverse_iterator + * generates more complicated error messages from compilers, + * so were broken out to allow separate runs. + * Define the macro CHECK_SPECIAL_COMP_FAILURE and look for + * SPECIAL_FAIL_1 and _2 in the compiler output. + * + * This (un)works as required with MSVC and Comeau online tryitout compiler. + */ + +#ifdef _MSC_VER +# pragma warning(disable: 4245 4127 4189 4700) +#endif + + +//#define TEST_COMPILATION_FAILURE +// Two cases must be inspected in references on MSVC +//#define CHECK_SPECIAL_COMP_FAILURE + +#ifdef TEST_COMPILATION_FAILURE +# define SHOULD_FAIL( Pred ) Pred +#else +# define SHOULD_FAIL( Pred ) +#endif + +#ifdef CHECK_SPECIAL_COMP_FAILURE +# define SPECIAL_FAIL_1( Pred ) Pred +# define SPECIAL_FAIL_2( Pred ) Pred +#else +# define SPECIAL_FAIL_1( Pred ) +# define SPECIAL_FAIL_2( Pred ) +#endif + + + +template<class Container> +void iterator_interop_checks() +{ + typedef Container C; + typedef typename C::const_iterator c_t; + typedef typename C::iterator i_t; + typedef typename C::const_reverse_iterator cr_t; + typedef typename C::reverse_iterator r_t; + c_t c; + i_t i; + cr_t cr; + r_t r; +///// COPY-CONSTRUCTORS + // Should work + c_t i1( i ); + c_t i2( cr.base() ); + c_t i3( r.base() ); + i_t i4( r.base() ); + cr_t i5( r ); + cr_t i6( c ); + cr_t i7( i ); + r_t i8( i ); + + SHOULD_FAIL( c_t i101( cr ); ) // #1 + SHOULD_FAIL( c_t i102( r ); ) // #2 + SHOULD_FAIL( i_t i103( c ); ) // #3 + SHOULD_FAIL( i_t i104( cr ); ) // #4 + SHOULD_FAIL( i_t i105( r ); ) // #5 + SHOULD_FAIL( r_t 
i106( c ); ) // #6 + +///// ASSIGNMENT + // Should work + c = i; + c = cr.base(); + c = r.base(); + i = r.base(); + cr = r; + + SHOULD_FAIL( c = cr; ) // #7 + SHOULD_FAIL( c = r; ) // #8 + SHOULD_FAIL( i = c; ) // #9 + SHOULD_FAIL( i = cr; ) // #A + SHOULD_FAIL( i = r; ) // #B + SHOULD_FAIL( cr = c; ) // #C + SHOULD_FAIL( cr = i; ) // #D + SHOULD_FAIL( r = c; ) // #E + SHOULD_FAIL( r = i; ) // #F + +// these fail in 2nd pass or something, compile separately + SPECIAL_FAIL_1( r_t i107( cr ); ) // #10 + SPECIAL_FAIL_2( r = cr; ) // #11 + +///// ADVANCE + ++c; --c; + ++i; --i; + ++cr; --cr; + ++r; --r; + c++; c--; + i++; i--; + cr++; cr--; + r++; r--; + +///// COMPARE + if (c == i && i == c) c = c; + if (c == cr.base() && cr.base() == c) c = c; + if (i == r.base() && r.base() == i) c = c; + // Should this fail? - Dinkum nor StlPort don't seem to prevent it + //if (r == cr && cr == r) c = c; + if (r.base() == cr.base() && cr.base() == r.base()) c = c; + +///// DEREFERENCE + typedef typename C::value_type val_t; + typedef typename C::pointer ptr_t; + typedef typename C::reference ref_t; + typedef typename C::const_reference cref_t; + + val_t val1 = *c; + val_t val2 = *i; + val_t val3 = *cr; + val_t val4 = *r; + + ref_t ref2 = *i; + ref_t ref4 = *r; + + cref_t cref1 = *c; + cref_t cref2 = *i; + cref_t cref3 = *cr; + cref_t cref4 = *r; + + SHOULD_FAIL( ref_t ref1 = *c ); // #20 + SHOULD_FAIL( ref_t ref3 = *cr ); // #21 + +} + +#include <vector> +#include "ternary_tree.hpp" +#include "structured_set.hpp" +#include "structured_map.hpp" + +void iterator_interop_checks_main() +{ + typedef std::vector<int> Cont; +// iterator_interop_checks<Cont>(); +/* typedef containers::ternary_tree<std::string, int> Tst; + iterator_interop_checks<Tst>(); + typedef containers::structured_set<std::string> StrucSet; + iterator_interop_checks<StrucSet>(); +*/ typedef containers::structured_multiset<std::string> MStrucSet; + iterator_interop_checks<MStrucSet>(); +/* typedef 
containers::structured_map<std::string, int> StrucMap; + iterator_interop_checks<StrucMap>(); + typedef containers::structured_multimap<std::string, int> MStrucMap; + iterator_interop_checks<MStrucMap>(); */ +} + + + Modified: trunk/opentrep/ternary_tree/iterator_wrapper.hpp =================================================================== --- trunk/opentrep/ternary_tree/iterator_wrapper.hpp 2009-07-14 10:45:19 UTC (rev 126) +++ trunk/opentrep/ternary_tree/iterator_wrapper.hpp 2009-07-14 14:07:29 UTC (rev 127) @@ -1,233 +1,233 @@ -// Created Mon Feb 06 13:20:01 2006 -#ifndef ITERATOR_WRAPPER_HPP_INCLUDED -#define ITERATOR_WRAPPER_HPP_INCLUDED - -#include <iterator> - -namespace iterators { - - // This is mostly a lame ripoff from Boost.Iterator, to avoid the dependency... - - //! Standard type traits for const_iterators. \see iterator_wrapper - template <class T> - struct const_traits { - typedef T value_type; - typedef const T* pointer; - typedef const T* const_pointer; - typedef const T& reference; - typedef const T& const_reference; - }; - - //! Standard type traits for (non-const) iterators. \see iterator_wrapper - template <class T> - struct nonconst_traits { - typedef T value_type; - typedef T* pointer; - typedef const T* const_pointer; - typedef T& reference; - typedef const T& const_reference; - }; - - /** Creates a bidirectional iterator from a base implementation, - * which is required to supply the interface \code - * struct iter_impl_sample - * { - * typedef /impl-defined/ reference; - * iter_impl_sample(); - * iter_impl_sample(/args/); - * void increment(); - * void decrement(); - * reference dereference() const; - * template<class OtherIter> bool equal(const OtherIter& rhs); - * void swap(this_type& rhs); - * }; \endcode - * (This class is meant for iterators you control - if you need to adapt an existing iterator - * with different interface, something like boost::iterator_facade is needed.) 
- * - * The first template parameter is the iterator implementation class. - * iterator_wrapper does not inherit from this. The second parameter is either const_traits <T> - * or nonconst_traits <T>, which provide the basic value_type related definitions. - * - * Note that Boost.Iterator will do the same job better, this was provided to avoid the dependency. - * Future versions may move to Boost instead. - * - * \ingroup utilities - */ - template< class BaseIterT - , class TraitsT - , class IterCatT = std::bidirectional_iterator_tag - > - struct iterator_wrapper - { - typedef BaseIterT base_iter; - typedef TraitsT traits_type; - typedef iterator_wrapper<BaseIterT, TraitsT, IterCatT> this_type; - - typedef typename TraitsT::value_type value_type; - typedef typename TraitsT::pointer pointer; - typedef typename TraitsT::reference reference; - typedef typename TraitsT::const_reference const_reference; - - typedef IterCatT iterator_category; - typedef ptrdiff_t difference_type; - typedef size_t size_type; - - iterator_wrapper() {} - - //! Copy constructor for iterator and constructor from (non-const) iterator for const_iterator - template<class SameBase> - iterator_wrapper(const iterator_wrapper<SameBase, nonconst_traits<value_type>, IterCatT>& it) - : m_iter(it.iter_base()) - {} - - iterator_wrapper(const iterator_wrapper& it) : m_iter(it.iter_base()) {} - - iterator_wrapper(const base_iter& it) : m_iter(it) {} - - reference operator*() const { return m_iter.dereference(); } - - pointer operator->() const { return &m_iter.dereference(); } - - this_type& operator++() { m_iter.increment(); return *this; } - - this_type operator++(int) { - this_type tmp(*this); - this->operator++(); - return tmp; - } - - this_type& operator--() { m_iter.decrement(); return *this; } - - this_type operator--(int) { - this_type tmp(*this); - this->operator--(); - return tmp; - } - - //! 
Assignment from non-const to const_iterator - template<class SameBase> - this_type& operator=(const iterator_wrapper<SameBase, nonconst_traits<value_type> >& rhs) { - this_type(rhs).swap(*this); - return *this; - } - - this_type& operator=(const iterator_wrapper<BaseIterT, TraitsT>& rhs) { - this_type(rhs).swap(*this); - return *this; - } - - template<class Base, class Constness> - void swap(iterator_wrapper<Base, Constness>& other) { - iter_base().swap(other.iter_base()); - } - - base_iter& iter_base() { return m_iter; } - const base_iter& iter_base() const { return m_iter; } - - private: - base_iter m_iter; - }; - - // \relates iterator_wrapper - template<class Base, class Val, class Val2, class Cat> - bool operator== (const iterator_wrapper<Base, Val, Cat>& lhs, const iterator_wrapper<Base, Val2, Cat>& rhs) { - return lhs.iter_base().equal(rhs.iter_base()); - } - - // \relates iterator_wrapper - template<class Base, class Val, class Val2, class Cat> - bool operator!= (const iterator_wrapper<Base, Val, Cat>& lhs, const iterator_wrapper<Base, Val2, Cat>& rhs) { - return ! (lhs == rhs); - } - -//! 
\def provide equality operator for reverse const/nonconst iterators \relates iterator_wrapper -#define INTEROPERABLE_REVERSE_ITERATOR_WRAPPERS(ConstTraits, NonConstTraits) \ - template<class Base, class Val, class Cat> \ - bool operator== (const std::reverse_iterator<iterator_wrapper<Base, ConstTraits<Val>, Cat> >& lhs, \ - const std::reverse_iterator<iterator_wrapper<Base, NonConstTraits<Val>, Cat> >& rhs) { \ - return lhs.base() == rhs.base(); \ - } \ - template<class Base, class Val, class Cat> \ - bool operator!= (const std::reverse_iterator<iterator_wrapper<Base, NonConstTraits<Val>, Cat> >& lhs, \ - const std::reverse_iterator<iterator_wrapper<Base, ConstTraits<Val>, Cat> >& rhs) { \ - return !(lhs.base() == rhs.base()); \ - } - -INTEROPERABLE_REVERSE_ITERATOR_WRAPPERS(const_traits, nonconst_traits) -INTEROPERABLE_REVERSE_ITERATOR_WRAPPERS(nonconst_traits, const_traits) - -#undef INTEROPERABLE_REVERSE_ITERATOR_WRAPPERS - - -/** \relates iterator_wrapper - * @{ - */ -template<class Base, class Val, class Val2, class Cat> -bool operator< (const iterator_wrapper<Base, Val, Cat>& lhs, const iterator_wrapper<Base, Val2, Cat>& rhs) { - return lhs.iter_base().less(rhs.iter_base()); -} - - -template<class Base, class Val, class Val2, class Cat> -bool operator> (const iterator_wrapper<Base, Val, Cat>& lhs, const iterator_wrapper<Base, Val2, Cat>& rhs) { - return rhs < lhs; -} - - -template<class Base, class Val, class Val2, class Cat> -bool operator>= (const iterator_wrapper<Base, Val, Cat>& lhs, const iterator_wrapper<Base, Val2, Cat>& rhs) { - return ! (lhs < rhs); -} - -template<class Base, class Val, class Val2, class Cat> -bool operator<= (const iterator_wrapper<Base, Val, Cat>& lhs, const iterator_wrapper<Base, Val2, Cat>& rhs) { - return ! 
(rhs > lhs); -} - -// random access iter operations (+= -= etc) - -template<class Base, class Val, class Dist> -iterator_wrapper<Base, Val, std::random_access_iterator_tag>& -operator+= (iterator_wrapper<Base, Val, std::random_access_iterator_tag>& it, Dist n) { - it.iter_base().advance(n); - return it; -} - -template<class Base, class Val, class Dist> -iterator_wrapper<Base, Val, std::random_access_iterator_tag>& -operator-= (iterator_wrapper<Base, Val, std::random_access_iterator_tag>& it, Dist n) { - it.iter_base().advance(-n); - return it; -} - -template<class Base, class Val, class Dist> -iterator_wrapper<Base, Val, std::random_access_iterator_tag> -operator+ (const iterator_wrapper<Base, Val, std::random_access_iterator_tag>& it, Dist n) { - iterator_wrapper<Base, Val, std::random_access_iterator_tag> tmp(it); - tmp += n; - return tmp; -} - -template<class Base, class Val, class Dist> -iterator_wrapper<Base, Val, std::random_access_iterator_tag> -operator- (const iterator_wrapper<Base, Val, std::random_access_iterator_tag>& it, Dist n) { - iterator_wrapper<Base, Val, std::random_access_iterator_tag> tmp(it); - tmp -= n; - return tmp; -} - -template<class Base, class Val, class Dist> -Dist -operator- (const iterator_wrapper<Base, Val, std::random_access_iterator_tag>& lhs, - const iterator_wrapper<Base, Val, std::random_access_iterator_tag>& rhs) { - return lhs - rhs; -} - - -/*@}*/ - -} // namespace iterators - - - -#endif // ITERATOR_WRAPPER_HPP_INCLUDED +// Created Mon Feb 06 13:20:01 2006 +#ifndef ITERATOR_WRAPPER_HPP_INCLUDED +#define ITERATOR_WRAPPER_HPP_INCLUDED + +#include <iterator> + +namespace iterators { + + // This is mostly a lame ripoff from Boost.Iterator, to avoid the dependency... + + //! Standard type traits for const_iterators. 
\see iterator_wrapper + template <class T> + struct const_traits { + typedef T value_type; + typedef const T* pointer; + typedef const T* const_pointer; + typedef const T& reference; + typedef const T& const_reference; + }; + + //! Standard type traits for (non-const) iterators. \see iterator_wrapper + template <class T> + struct nonconst_traits { + typedef T value_type; + typedef T* pointer; + typedef const T* const_pointer; + typedef T& reference; + typedef const T& const_reference; + }; + + /** Creates a bidirectional iterator from a base implementation, + * which is required to supply the interface \code + * struct iter_impl_sample + * { + * typedef /impl-defined/ reference; + * iter_impl_sample(); + * iter_impl_sample(/args/); + * void increment(); + * void decrement(); + * reference dereference() const; + * template<class OtherIter> bool equal(const OtherIter& rhs); + * void swap(this_type& rhs); + * }; \endcode + * (This class is meant for iterators you control - if you need to adapt an existing iterator + * with different interface, something like boost::iterator_facade is needed.) + * + * The first template parameter is the iterator implementation class. + * iterator_wrapper does not inherit from this. The second parameter is either const_traits <T> + * or nonconst_traits <T>, which provide the basic value_type related definitions. + * + * Note that Boost.Iterator will do the same job better, this was provided to avoid the dependency. + * Future versions may move to Boost instead. 
+ * + * \ingroup utilities + */ + template< class BaseIterT + , class TraitsT + , class IterCatT = std::bidirectional_iterator_tag + > + struct iterator_wrapper + { + typedef BaseIterT base_iter; + typedef TraitsT traits_type; + typedef iterator_wrapper<BaseIterT, TraitsT, IterCatT> this_type; + + typedef typename TraitsT::value_type value_type; + typedef typename TraitsT::pointer pointer; + typedef typename TraitsT::reference reference; + typedef typename TraitsT::const_reference const_reference; + + typedef IterCatT iterator_category; + typedef ptrdiff_t difference_type; + typedef size_t size_type; + + iterator_wrapper() {} + + //! Copy constructor for iterator and constructor from (non-const) iterator for const_iterator + template<class SameBase> + iterator_wrapper(const iterator_wrapper<SameBase, nonconst_traits<value_type>, IterCatT>& it) + : m_iter(it.iter_base()) + {} + + iterator_wrapper(const iterator_wrapper& it) : m_iter(it.iter_base()) {} + + iterator_wrapper(const base_iter& it) : m_iter(it) {} + + reference operator*() const { return m_iter.dereference(); } + + pointer operator->() const { return &m_iter.dereference(); } + + this_type& operator++() { m_iter.increment(); return *this; } + + this_type operator++(int) { + this_type tmp(*this); + this->operator++(); + return tmp; + } + + this_type& operator--() { m_iter.decrement(); return *this; } + + this_type operator--(int) { + this_type tmp(*this); + this->operator--(); + return tmp; + } + + //! 
Assignment from non-const to const_iterator + template<class SameBase> + this_type& operator=(const iterator_wrapper<SameBase, nonconst_traits<value_type> >& rhs) { + this_type(rhs).swap(*this); + return *this; + } + + this_type& operator=(const iterator_wrapper<BaseIterT, TraitsT>& rhs) { + this_type(rhs).swap(*this); + return *this; + } + + template<class Base, class Constness> + void swap(iterator_wrapper<Base, Constness>& other) { + iter_base().swap(other.iter_base()); + } + + base_iter& iter_base() { return m_iter; } + const base_iter& iter_base() const { return m_iter; } + + private: + base_iter m_iter; + }; + + // \relates iterator_wrapper + template<class Base, class Val, class Val2, class Cat> + bool operator== (const iterator_wrapper<Base, Val, Cat>& lhs, const iterator_wrapper<Base, Val2, Cat>& rhs) { + return lhs.iter_base().equal(rhs.iter_base()); + } + + // \relates iterator_wrapper + template<class Base, class Val, class Val2, class Cat> + bool operator!= (const iterator_wrapper<Base, Val, Cat>& lhs, const iterator_wrapper<Base, Val2, Cat>& rhs) { + return ! (lhs == rhs); + } + +//! 
\def provide equality operator for reverse const/nonconst iterators \relates iterator_wrapper +#define INTEROPERABLE_REVERSE_ITERATOR_WRAPPERS(ConstTraits, NonConstTraits) \ + template<class Base, class Val, class Cat> \ + bool operator== (const std::reverse_iterator<iterator_wrapper<Base, ConstTraits<Val>, Cat> >& lhs, \ + const std::reverse_iterator<iterator_wrapper<Base, NonConstTraits<Val>, Cat> >& rhs) { \ + return lhs.base() == rhs.base(); \ + } \ + template<class Base, class Val, class Cat> \ + bool operator!= (const std::reverse_iterator<iterator_wrapper<Base, NonConstTraits<Val>, Cat> >& lhs, \ + const std::reverse_iterator<iterator_wrapper<Base, ConstTraits<Val>, Cat> >& rhs) { \ + return !(lhs.base() == rhs.base()); \ + } + +INTEROPERABLE_REVERSE_ITERATOR_WRAPPERS(const_traits, nonconst_traits) +INTEROPERABLE_REVERSE_ITERATOR_WRAPPERS(nonconst_traits, const_traits) + +#undef INTEROPERABLE_REVERSE_ITERATOR_WRAPPERS + + +/** \relates iterator_wrapper + * @{ + */ +template<class Base, class Val, class Val2, class Cat> +bool operator< (const iterator_wrapper<Base, Val, Cat>& lhs, const iterator_wrapper<Base, Val2, Cat>& rhs) { + return lhs.iter_base().less(rhs.iter_base()); +} + + +template<class Base, class Val, class Val2, class Cat> +bool operator> (const iterator_wrapper<Base, Val, Cat>& lhs, const iterator_wrapper<Base, Val2, Cat>& rhs) { + return rhs < lhs; +} + + +template<class Base, class Val, class Val2, class Cat> +bool operator>= (const iterator_wrapper<Base, Val, Cat>& lhs, const iterator_wrapper<Base, Val2, Cat>& rhs) { + return ! (lhs < rhs); +} + +template<class Base, class Val, class Val2, class Cat> +bool operator<= (const iterator_wrapper<Base, Val, Cat>& lhs, const iterator_wrapper<Base, Val2, Cat>& rhs) { + return ! 
(rhs > lhs); +} + +// random access iter operations (+= -= etc) + +template<class Base, class Val, class Dist> +iterator_wrapper<Base, Val, std::random_access_iterator_tag>& +operator+= (iterator_wrapper<Base, Val, std::random_access_iterator_tag>& it, Dist n) { + it.iter_base().advance(n); + return it; +} + +template<class Base, class Val, class Dist> +iterator_wrapper<Base, Val, std::random_access_iterator_tag>& +operator-= (iterator_wrapper<Base, Val, std::random_access_iterator_tag>& it, Dist n) { + it.iter_base().advance(-n); + return it; +} + +template<class Base, class Val, class Dist> +iterator_wrapper<Base, Val, std::random_access_iterator_tag> +operator+ (const iterator_wrapper<Base, Val, std::random_access_iterator_tag>& it, Dist n) { + iterator_wrapper<Base, Val, std::random_access_iterator_tag> tmp(it); + tmp += n; + return tmp; +} + +template<class Base, class Val, class Dist> +iterator_wrapper<Base, Val, std::random_access_iterator_tag> +operator- (const iterator_wrapper<Base, Val, std::random_access_iterator_tag>& it, Dist n) { + iterator_wrapper<Base, Val, std::random_access_iterator_tag> tmp(it); + tmp -= n; + return tmp; +} + +template<class Base, class Val, class Dist> +Dist +operator- (const iterator_wrapper<Base, Val, std::random_access_iterator_tag>& lhs, + const iterator_wrapper<Base, Val, std::random_access_iterator_tag>& rhs) { + return lhs - rhs; +} + + +/*@}*/ + +} // namespace iterators + + + +#endif // ITERATOR_WRAPPER_HPP_INCLUDED Modified: trunk/opentrep/ternary_tree/readme.txt =================================================================== --- trunk/opentrep/ternary_tree/readme.txt 2009-07-14 10:45:19 UTC (rev 126) +++ trunk/opentrep/ternary_tree/readme.txt 2009-07-14 14:07:29 UTC (rev 127) @@ -1,71 +1,71 @@ -LIBRARY -Ternary Search Tree C++ implementation by rasmus ekman -A header-only library of fast string containers with advanced search features. 
- -version 0.67, 14 May 2006 - -Please send bug reports, suggestions or questions to ras...@ab... -Get latest version at http://abc.se/~re/code/tst/ - -REQUIREMENTS -Library files tested with g++ 3.4.3 and MSVC 7.1 (Visual Studio 2003). -Visual Studio 6 will not work, but may only need moving the template -methods of ternary_tree and structured_* classes inline. -To generate documentation, you need Doxygen. See http://doxygen.org. - - -USAGE -Container classes: - structured_set<Key [, Comp, Alloc]> - structured_multiset<Key [, Comp, Alloc]> - structured_map<Key, Value [, Comp, Alloc]> - structured_multimap<Key, Value [, Comp, Alloc]> -- Key is a std::string-like type (a Forward Container), -- Value is any type, -- Comp is a less-like sort operation on Key::value_type (eg char/wchar_t) -- Alloc is std::allocator<Key [, Value]> or has same interface. - -These containers can be used as nearly drop-in replacments for std::set, -multiset, map, multimap or unordered_* containers on string-like types. - -There is one difference in interface: -If you used non-default comparator template argument with a set or map type, -it must be changed to operate on character type, not string. - -To use standard set, map features: See documentation of these classes. -See included documentation for information about the advanced key search -facilities in all structured_* containers. - - -FILES ---- l i b r a r y c o d e --- -structured_map.hpp - classes structured_map and -multimap. -structured_set.hpp - classes structured_set and multiset. -ternary_tree.hpp - implementation backend class. -./tst_detail/.* ternary_tree implementation files -iterator_wrapper.hpp - iterator interface, included by all containers. - ---- d o c s --- -tst_public.doxy - doxygen config file, generates public interface of library. -tst.doxy - doxygen config file, generates public and private interface docs. -./doxygen_input/* - extra documentation sources used by Doxygen. 
- -./html/* - generated public and private documentation -full-docs-index.html - redirects to html directory -index.html - redirects to tst_docs directory, only useful if doxygen is used - with tst_public.doxy - ---- t e s t s --- -tst_concept_checks.cpp - requires Boost concept_check header, portable. -iterator_compile_test.cpp - checks iterator_wrapper interoperability, portable. - -test_tst.cpp - test suite; relies on non-portable wstring.hpp (but see below) -fill_dictionary.cpp - (sloppy old support file for test_tst) - fills a vector with strings from file. -wstring.hpp - string/wstring conversion, uses Windows API - if the MultiByteToWideChar and WideCharToMultiByte API calls are replaced, - test_tst may run on your platform... - ---- -rasmus ekman -May 14, 2006 +LIBRARY +Ternary Search Tree C++ implementation by rasmus ekman +A header-only library of fast string containers with advanced search features. + +version 0.67, 14 May 2006 + +Please send bug reports, suggestions or questions to ras...@ab... +Get latest version at http://abc.se/~re/code/tst/ + +REQUIREMENTS +Library files tested with g++ 3.4.3 and MSVC 7.1 (Visual Studio 2003). +Visual Studio 6 will not work, but may only need moving the template +methods of ternary_tree and structured_* classes inline. +To generate documentation, you need Doxygen. See http://doxygen.org. + + +USAGE +Container classes: + structured_set<Key [, Comp, Alloc]> + structured_multiset<Key [, Comp, Alloc]> + structured_map<Key, Value [, Comp, Alloc]> + structured_multimap<Key, Value [, Comp, Alloc]> +- Key is a std::string-like type (a Forward Container), +- Value is any type, +- Comp is a less-like sort operation on Key::value_type (eg char/wchar_t) +- Alloc is std::allocator<Key [, Value]> or has same interface. + +These containers can be used as nearly drop-in replacments for std::set, +multiset, map, multimap or unordered_* containers on string-like types. 
+ +There is one difference in interface: +If you used non-default comparator template argument with a set or map type, +it must be changed to operate on character type, not string. + +To use standard set, map features: See documentation of these classes. +See included documentation for information about the advanced key search +facilities in all structured_* containers. + + +FILES +--- l i b r a r y c o d e --- +structured_map.hpp - classes structured_map and -multimap. +structured_set.hpp - classes structured_set and multiset. +ternary_tree.hpp - implementation backend class. +./tst_detail/.* ternary_tree implementation files +iterator_wrapper.hpp - iterator interface, included by all containers. + +--- d o c s --- +tst_public.doxy - doxygen config file, generates public interface of library. +tst.doxy - doxygen config file, generates public and private interface docs. +./doxygen_input/* - extra documentation sources used by Doxygen. + +./html/* - generated public and private documentation +full-docs-index.html - redirects to html directory +index.html - redirects to tst_docs directory, only useful if doxygen is used + with tst_public.doxy + +--- t e s t s --- +tst_concept_checks.cpp - requires Boost concept_check header, portable. +iterator_compile_test.cpp - checks iterator_wrapper interoperability, portable. + +test_tst.cpp - test suite; relies on non-portable wstring.hpp (but see below) +fill_dictionary.cpp - (sloppy old support file for test_tst) + fills a vector with strings from file. +wstring.hpp - string/wstring conversion, uses Windows API + if the MultiByteToWideChar and WideCharToMultiByte API calls are replaced, + test_tst may run on your platform... 
+ +--- +rasmus ekman +May 14, 2006 Added: trunk/opentrep/ternary_tree/simple_tst.cpp =================================================================== --- trunk/opentrep/ternary_tree/simple_tst.cpp (rev 0) +++ trunk/opentrep/ternary_tree/simple_tst.cpp 2009-07-14 14:07:29 UTC (rev 127) @@ -0,0 +1,15 @@ +// C +#include <cassert> +// STL +#include <iostream> +// Ternary Tree Structure (TST) +//#include <structured_set.hpp> + +// /////////////// M A I N /////////////// +int main (int argc, char* argv[]) { + + std::cout << "Hello TST!" << std::endl; + + return 0; +} + Added: trunk/opentrep/ternary_tree/sources.mk =================================================================== --- trunk/opentrep/ternary_tree/sources.mk (rev 0) +++ trunk/opentrep/ternary_tree/sources.mk 2009-07-14 14:07:29 UTC (rev 127) @@ -0,0 +1,6 @@ +tst_h_sources = \ + $(top_srcdir)/ternary_tree/ternary_tree.hpp \ + $(top_srcdir)/ternary_tree/structured_set.hpp \ + $(top_srcdir)/ternary_tree/structured_map.hpp \ + $(top_srcdir)/ternary_tree/iterator_wrapper.hpp +tst_cc_sources = Modified: trunk/opentrep/ternary_tree/structured_map.hpp =================================================================== --- trunk/opentrep/ternary_tree/structured_map.hpp 2009-07-14 10:45:19 UTC (rev 126) +++ trunk/opentrep/ternary_tree/structured_map.hpp 2009-07-14 14:07:29 UTC (rev 127) @@ -1,938 +1,938 @@ -#ifndef STRUCTURED_MAP_HPP_INCLUDED -#define STRUCTURED_MAP_HPP_INCLUDED - -#define TST_NO_STANDALONE_ITERATOR_FACADE -#include "ternary_tree.hpp" -#undef TST_NO_STANDALONE_ITERATOR_FACADE -// note: also #include <list> in structured_multimap section - -namespace containers { - - -/** Structured Map is a Sorted Associative Container that stores objects of type pair<Key, Data>. 
- * Structured Map is a Structured Container, meaning that its key type is required to be - * a Forward Container, and that the map uses a comparator to establish - * a strict weak ordering among key::value_type elements (rather than on whole keys). - * This allows searches in the set involving parts of keys, ie with shared prefix - * or with shared middle parts. - * - * Structured Map is a Pair Associative Container, meaning that its value type - * is pair<const Key, T>. - * It is also a Unique Associative Container, meaning that no two elements are the same. - * - * A std::map is normally backed by a binary tree. A structured map is instead backed - * by a ternary_tree, which manages structured ordering of keys. - * For string-like keys, a ternary tree is typically as fast as an unordered_map, - * and several times faster than most std::map implementations. - * \ingroup containers - */ -template<class KeyT, class DataT, class CompT = std::less<typename KeyT::value_type>, - class AllocT = std::allocator<std::pair<const KeyT, DataT> > > -class structured_map -{ -public: - typedef KeyT key_type; - typedef DataT mapped_type; - typedef std::pair<const KeyT, DataT> value_type; - typedef typename KeyT::value_type char_type; - typedef CompT char_compare; - - typedef AllocT allocator_type; - typedef typename AllocT::difference_type difference_type; - typedef typename AllocT::size_type size_type; - typedef typename AllocT::pointer pointer; - typedef typename AllocT::const_pointer const_pointer; - typedef typename AllocT::reference reference; - typedef typename AllocT::const_reference const_reference; - -private: - // Internal value type is pair<Key, Value> (non-const Key). 
- typedef ternary_tree< KeyT, value_type, CompT, - typename AllocT::template rebind<value_type>::other - > ternary_tree; - typedef typename ternary_tree::iterator tst_iterator; - typedef typename ternary_tree::iterator::base_iter tst_iterator_base; - typedef typename ternary_tree::const_iterator tst_const_iterator; - - enum { invalid_index = size_type(-1) }; - -public: - - typedef typename ternary_tree::key_compare key_compare; - - - typedef iterators::iterator_wrapper < tst_iterator_base - , iterators::nonconst_traits<value_type> - > iterator; - - typedef iterators::iterator_wrapper < tst_iterator_base - , iterators::const_traits<value_type> - > const_iterator; - - typedef std::reverse_iterator<iterator> reverse_iterator; - typedef std::reverse_iterator<const_iterator> const_reverse_iterator; - - - /** \name Construct, copy, destroy - * @{ - */ - structured_map() : m_tree(char_compare(), allocator_type()) {} - - explicit structured_map(const char_compare& comp) - : m_tree(comp, allocator_type()) - {} - - structured_map(const char_compare& comp, const allocator_type& alloc) - : m_tree(comp, alloc) - {} - - template<class InputIterator> - structured_map( InputIterator first, InputIterator last, - const char_compare& comp = char_compare(), - const allocator_type& alloc = allocator_type()) - : m_tree(comp, alloc) - { - insert(first, last); - } - - structured_map(const structured_map& other) - : m_tree(other.m_tree) - {} - - ~structured_map() {} - - structured_map& operator= (const structured_map& other) { - structured_map(other).swap(*this); - return *this; - } - - allocator_type get_allocator() const { return m_tree.get_allocator(); } - /* @} */ - - /** \name Iterators - * Includes C++0x methods cbegin, cend, crbegin, crend to make it easier - * to access const iterators. 
- * @{ - */ - iterator begin() { return iterator(m_tree.begin()); } - const_iterator begin() const { return const_iterator(m_tree.begin()); } - iterator end() { return iterator(m_tree.end()); } - const_iterator end() const { return const_iterator(m_tree.end()); } - - reverse_iterator rbegin() { return reverse_iterator(end()); } - const_reverse_iterator rbegin() const { return const_reverse_iterator(end()); } - reverse_iterator rend() { return reverse_iterator(begin()); } - const_reverse_iterator rend() const { return const_reverse_iterator(begin()); } - - // C++0x additions - const_iterator cbegin() const { return const_iterator(m_tree.begin()); } - const_iterator cend() const { return const_iterator(m_... [truncated message content] |
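The relational operators of iterator_wrapper in this revision are all derived from two primitives of the base iterator, equal() and less(). As a minimal standalone sketch of that derivation (Pos is a hypothetical stand-in type, not part of the ternary_tree library), note in particular that a <= b is the negation of b < a, and a >= b is the negation of a < b:

```cpp
#include <cassert>

// Hypothetical stand-in for a base iterator (an assumption for illustration):
// only equal() and less() are primitive, as in iterator_wrapper's base_iter.
struct Pos {
  int index;
  bool equal (const Pos& iOther) const { return index == iOther.index; }
  bool less (const Pos& iOther) const { return index < iOther.index; }
};

// Every other comparison is expressed through the two primitives.
inline bool operator== (const Pos& lhs, const Pos& rhs) { return lhs.equal (rhs); }
inline bool operator!= (const Pos& lhs, const Pos& rhs) { return !(lhs == rhs); }
inline bool operator<  (const Pos& lhs, const Pos& rhs) { return lhs.less (rhs); }
inline bool operator>  (const Pos& lhs, const Pos& rhs) { return rhs < lhs; }
// lhs <= rhs holds exactly when rhs is not strictly before lhs.
inline bool operator<= (const Pos& lhs, const Pos& rhs) { return !(rhs < lhs); }
// lhs >= rhs holds exactly when lhs is not strictly before rhs.
inline bool operator>= (const Pos& lhs, const Pos& rhs) { return !(lhs < rhs); }
```

Keeping a single less() primitive means the six operators cannot disagree with each other, which is presumably why the wrapper is organised this way.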
From: <den...@us...> - 2009-07-14 22:34:18
|
Revision: 128 http://opentrep.svn.sourceforge.net/opentrep/?rev=128&view=rev Author: denis_arnaud Date: 2009-07-14 22:34:14 +0000 (Tue, 14 Jul 2009) Log Message: ----------- [Test] Tested a few variations for Xapian string search. Modified Paths: -------------- trunk/opentrep/test/xapian/string_search.cpp Property Changed: ---------------- trunk/opentrep/ternary_tree/ Property changes on: trunk/opentrep/ternary_tree ___________________________________________________________________ Modified: svn:ignore - .libs .deps Makefile Makefile.in + .libs .deps Makefile Makefile.in simple_tst Modified: trunk/opentrep/test/xapian/string_search.cpp =================================================================== --- trunk/opentrep/test/xapian/string_search.cpp 2009-07-14 14:07:29 UTC (rev 127) +++ trunk/opentrep/test/xapian/string_search.cpp 2009-07-14 22:34:14 UTC (rev 128) @@ -28,20 +28,20 @@ for (int idx=2; idx != argc; ++idx) { if (idx != 2) { oStr << " "; -// oStr << " AND "; } const std::string lWord (argv[idx]); - const std::string lSuggestedWord = - db.get_spelling_suggestion (lWord, 3); + const std::string lSuggestedWord = db.get_spelling_suggestion(lWord, 3); std::cout << "Word `" << lWord << "' ==> Suggested word `" << lSuggestedWord << "'" << std::endl; oStr << lWord; } const std::string lQueryString = oStr.str(); + std::cout << "QueryString `" << lQueryString << "'" << std::endl; // Build the query object Xapian::QueryParser lQueryParser; lQueryParser.set_database (db); + lQueryParser.set_default_op (Xapian::Query::OP_NEAR); std::cout << "Query parser `" << lQueryParser.get_description() << "'" << std::endl; @@ -53,10 +53,11 @@ | Xapian::QueryParser::FLAG_LOVEHATE | Xapian::QueryParser::FLAG_SPELLING_CORRECTION); Xapian::Query lCorrectedQuery = lQueryParser.get_corrected_query_string(); - + std::cout << "Query `" << lQuery.get_description() << "', Corrected query `" << lCorrectedQuery.get_description() - << "'" << std::endl; + << "' " + << std::endl; // Give 
the query object to the enquire session enquire.set_query (lQuery); @@ -65,12 +66,19 @@ Xapian::MSet matches = enquire.get_mset (0, 10); // Display the results - int nbMatches = matches.size(); + const int nbMatches = matches.size(); std::cout << nbMatches << " results found" << std::endl; + + // if (true) { if (nbMatches == 0) { enquire.set_query (lCorrectedQuery); matches = enquire.get_mset (0, 10); + //const Xapian::MSet matchesAll = enquire.get_mset (); + if (matches.size() == matches.max_size()) { + std::cout << "Corrected string matches all the documents" + << std::endl; + } } const Xapian::Query& lActualQuery = enquire.get_query(); This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
From: <den...@us...> - 2009-07-15 14:45:49
|
Revision: 129 http://opentrep.svn.sourceforge.net/opentrep/?rev=129&view=rev Author: denis_arnaud Date: 2009-07-15 14:45:43 +0000 (Wed, 15 Jul 2009) Log Message: ----------- [Indexer] Fixed a bug in the indexer (where terms were inserted with spaces). Modified Paths: -------------- trunk/opentrep/opentrep/command/IndexBuilder.cpp trunk/opentrep/refdata/data/ref_place_names.csv trunk/opentrep/test/xapian/simple_search.cpp trunk/opentrep/test/xapian/string_search.cpp Modified: trunk/opentrep/opentrep/command/IndexBuilder.cpp =================================================================== --- trunk/opentrep/opentrep/command/IndexBuilder.cpp 2009-07-14 22:34:14 UTC (rev 128) +++ trunk/opentrep/opentrep/command/IndexBuilder.cpp 2009-07-15 14:45:43 UTC (rev 129) @@ -7,6 +7,8 @@ #include <string> #include <vector> #include <exception> +// Boost +#include <boost/tokenizer.hpp> // OpenTrep #include <opentrep/bom/World.hpp> #include <opentrep/bom/Place.hpp> @@ -21,6 +23,30 @@ #include <xapian.h> namespace OPENTREP { + + // ////////////////////////////////////////////////////////////////////// + void tokeniseAndAddToDocument (const std::string& iPhrase, + Xapian::Document& ioDocument, + Xapian::WritableDatabase& ioDatabase) { + + // Boost Tokeniser + typedef boost::tokenizer<boost::char_separator<char> > Tokeniser_T; + + // Define the separators + boost::char_separator<char> lSepatorList(" .,;:|+-*/_=!@#$%`~^&(){}[]?'<>\""); + + // Initialise the phrase to be tokenised + Tokeniser_T lTokens (iPhrase, lSepatorList); + for (Tokeniser_T::const_iterator tok_iter = lTokens.begin(); + tok_iter != lTokens.end(); ++tok_iter) { + const std::string& lTerm = *tok_iter; + + ioDatabase.add_spelling (lTerm); + ioDocument.add_term (lTerm); + + OPENTREP_LOG_DEBUG ("Added term: " << lTerm); + } + } // ////////////////////////////////////////////////////////////////////// void IndexBuilder:: @@ -90,8 +116,9 @@ // extended, alternate, etc.) 
if (lName.empty() == false) { // OPENTREP_LOG_DEBUG ("Added name: " << lName); - lDocument.add_term (lName); ++idx; - ioDatabase.add_spelling (lName); + // lDocument.add_term (lName); ++idx; + // ioDatabase.add_spelling (lName); + tokeniseAndAddToDocument (lName, lDocument, ioDatabase); } } } Modified: trunk/opentrep/refdata/data/ref_place_names.csv =================================================================== --- trunk/opentrep/refdata/data/ref_place_names.csv 2009-07-14 22:34:14 UTC (rev 128) +++ trunk/opentrep/refdata/data/ref_place_names.csv 2009-07-15 14:45:43 UTC (rev 129) @@ -1826,7 +1826,7 @@ en,jbt,bethel jbt,bethel jbt,bethel/ak/us:city landing en,jca,cannes jca,cannes jca,cannes/fr:croisette hpt en,jcb,joacaba,joacaba,joacaba/sc/br -en,jcc,sanfrancisco jcc,sanfrancisco jc,san francisco/ca/us:china hpt +en,jcc,san francisco jcc,san francisco jc,san francisco/ca/us:china hpt en,jcd,st croix is jcd,st croix is jcd,st croix is/vi:downtown hpt en,jce,convention,convention,convention/ca/us:heliport en,jch,qasigiannguit,qasigiannguit,qasigiannguit/gl Modified: trunk/opentrep/test/xapian/simple_search.cpp =================================================================== --- trunk/opentrep/test/xapian/simple_search.cpp 2009-07-14 22:34:14 UTC (rev 128) +++ trunk/opentrep/test/xapian/simple_search.cpp 2009-07-15 14:45:43 UTC (rev 129) @@ -1,5 +1,6 @@ // STL #include <iostream> +#include <string> // Xapian #include <xapian.h> @@ -7,46 +8,59 @@ int main (int argc, char* argv[]) { // Simplest possible options parsing: we just require two or more - // parameters. - if (argc < 3) { - std::cout << "Usage: " << argv[0] - << " <path to database> <search terms>" << std::endl; - return -1; - } + // parameters. 
+ if (argc < 3) { + std::cout << "Usage: " << argv[0] + << " <path to database> <search terms>" << std::endl; + return -1; + } - // Catch any Xapian::Error exceptions thrown - try { + // Catch any Xapian::Error exceptions thrown + try { - // Make the database - Xapian::Database db (argv[1]); + // Open the database for searching. + Xapian::Database db (argv[1]); - // Start an enquire session - Xapian::Enquire enquire (db); + // Start an enquire session + Xapian::Enquire enquire (db); - // Build the query object - Xapian::Query query (Xapian::Query::OP_AND, argv + 2, argv + argc); - std::cout << "Performing query `" << query.get_description() << "'" - << std::endl; - - // Give the query object to the enquire session - enquire.set_query (query); + // Combine the rest of the command line arguments with spaces between + // them, so that simple queries don't have to be quoted at the shell + // level. + std::string query_string (argv[2]); + argv += 3; + while (*argv) { + query_string += ' '; + query_string += *argv++; + } - // Get the top 10 results of the query - Xapian::MSet matches = enquire.get_mset (0, 10); + // Parse the query string to produce a Xapian::Query object. + Xapian::QueryParser qp; + Xapian::Stem stemmer ("english"); + qp.set_stemmer (stemmer); + qp.set_database (db); + qp.set_stemming_strategy (Xapian::QueryParser::STEM_SOME); + Xapian::Query query = qp.parse_query (query_string); + std::cout << "Parsed query is: " << query.get_description() << std::endl; - // Display the results - std::cout << matches.size() << " results found" << std::endl; + // Find the top 10 results for the query. + enquire.set_query (query); + Xapian::MSet matches = enquire.get_mset(0, 10); - for (Xapian::MSetIterator i = matches.begin(); i != matches.end(); ++i) { - Xapian::Document doc = i.get_document(); - std::cout << "Document ID " << *i << "\t" << - i.get_percent() << "% [" << - doc.get_data() << "]" << std::endl; - } + // Display the results. 
+ std::cout << matches.get_matches_estimated() << " results found." + << std::endl; + std::cout << "Matches 1-" << matches.size() << ":" << std::endl << std::endl; + + for (Xapian::MSetIterator i = matches.begin(); i != matches.end(); ++i) { + std::cout << i.get_rank() + 1 << ": " << i.get_percent() << "% docid=" + << *i << " [" << i.get_document().get_data() << "]" + << std::endl << std::endl; + } - } catch (const Xapian::Error& error) { - std::cerr << "Exception: " << error.get_msg() << std::endl; - } + } catch (const Xapian::Error& error) { + std::cerr << "Exception: " << error.get_msg() << std::endl; + } - return 0; + return 0; } Modified: trunk/opentrep/test/xapian/string_search.cpp =================================================================== --- trunk/opentrep/test/xapian/string_search.cpp 2009-07-14 22:34:14 UTC (rev 128) +++ trunk/opentrep/test/xapian/string_search.cpp 2009-07-15 14:45:43 UTC (rev 129) @@ -41,7 +41,11 @@ // Build the query object Xapian::QueryParser lQueryParser; lQueryParser.set_database (db); - lQueryParser.set_default_op (Xapian::Query::OP_NEAR); + // As explained in http://www.xapian.org/docs/queryparser.html, + // Xapian::Query::OP_ADJ is better than Xapian::Query::OP_PHRASE, + // but only available from version 1.0.13 of Xapian + // lQueryParser.set_default_op (Xapian::Query::OP_ADJ); + lQueryParser.set_default_op (Xapian::Query::OP_PHRASE); std::cout << "Query parser `" << lQueryParser.get_description() << "'" << std::endl; |
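The tokenisation step added to IndexBuilder.cpp in revision 129 can be illustrated without Boost or Xapian. The following is only a sketch under stated assumptions: splitIntoTerms is a hypothetical helper (it does not exist in OpenTrep), it reuses the separator set of lSepatorList, and it drops empty tokens, which is the default behaviour of boost::char_separator<char> when only dropped delimiters are given.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of OpenTrep): splits a phrase into terms on
// the same separator characters as lSepatorList, skipping empty tokens.
std::vector<std::string> splitIntoTerms (const std::string& iPhrase) {
  static const std::string lSeparators (" .,;:|+-*/_=!@#$%`~^&(){}[]?'<>\"");

  std::vector<std::string> oTerms;
  // Position of the first character that is not a separator
  std::string::size_type lStart = iPhrase.find_first_not_of (lSeparators);
  while (lStart != std::string::npos) {
    // End of the current term (npos when the term runs to the end)
    const std::string::size_type lEnd =
      iPhrase.find_first_of (lSeparators, lStart);
    oTerms.push_back (iPhrase.substr (lStart, lEnd - lStart));
    // Skip the run of separators following the term
    lStart = iPhrase.find_first_not_of (lSeparators, lEnd);
  }
  return oTerms;
}
```

With such a helper, a name like "san francisco" yields the two terms "san" and "francisco", each of which would then be passed to add_term() and add_spelling() as in the diff above.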
From: <den...@us...> - 2009-07-15 16:59:26
|
Revision: 130 http://opentrep.svn.sourceforge.net/opentrep/?rev=130&view=rev Author: denis_arnaud Date: 2009-07-15 16:59:19 +0000 (Wed, 15 Jul 2009) Log Message: ----------- [TREP] Improved the string search (command line utility). Modified Paths: -------------- trunk/opentrep/opentrep/command/IndexBuilder.cpp trunk/opentrep/test/xapian/string_search.cpp Modified: trunk/opentrep/opentrep/command/IndexBuilder.cpp =================================================================== --- trunk/opentrep/opentrep/command/IndexBuilder.cpp 2009-07-15 14:45:43 UTC (rev 129) +++ trunk/opentrep/opentrep/command/IndexBuilder.cpp 2009-07-15 16:59:19 UTC (rev 130) @@ -33,7 +33,7 @@ typedef boost::tokenizer<boost::char_separator<char> > Tokeniser_T; // Define the separators - boost::char_separator<char> lSepatorList(" .,;:|+-*/_=!@#$%`~^&(){}[]?'<>\""); + const boost::char_separator<char> lSepatorList(" .,;:|+-*/_=!@#$%`~^&(){}[]?'<>\""); // Initialise the phrase to be tokenised Tokeniser_T lTokens (iPhrase, lSepatorList); @@ -44,7 +44,7 @@ ioDatabase.add_spelling (lTerm); ioDocument.add_term (lTerm); - OPENTREP_LOG_DEBUG ("Added term: " << lTerm); + // OPENTREP_LOG_DEBUG ("Added term: " << lTerm); } } @@ -115,10 +115,14 @@ // Add the place name (it can be the classical one, or // extended, alternate, etc.) if (lName.empty() == false) { + // Add the full name (potentially containing spaces, e.g., + // 'san francisco'), as well as each word + // within it (with the example above, 'san' and 'francisco'). 
+ lDocument.add_term (lName); ++idx; + ioDatabase.add_spelling (lName); + tokeniseAndAddToDocument (lName, lDocument, ioDatabase); + // OPENTREP_LOG_DEBUG ("Added name: " << lName); - // lDocument.add_term (lName); ++idx; - // ioDatabase.add_spelling (lName); - tokeniseAndAddToDocument (lName, lDocument, ioDatabase); } } } Modified: trunk/opentrep/test/xapian/string_search.cpp =================================================================== --- trunk/opentrep/test/xapian/string_search.cpp 2009-07-15 14:45:43 UTC (rev 129) +++ trunk/opentrep/test/xapian/string_search.cpp 2009-07-15 16:59:19 UTC (rev 130) @@ -24,26 +24,43 @@ // Start an enquire session Xapian::Enquire enquire (db); - std::ostringstream oStr; + std::ostringstream oOriginalStr; + std::ostringstream oCorrectedStr; for (int idx=2; idx != argc; ++idx) { if (idx != 2) { - oStr << " "; + oOriginalStr << " "; + oCorrectedStr << " "; } const std::string lWord (argv[idx]); const std::string lSuggestedWord = db.get_spelling_suggestion(lWord, 3); std::cout << "Word `" << lWord << "' ==> Suggested word `" << lSuggestedWord << "'" << std::endl; - oStr << lWord; + oOriginalStr << lWord; + + if (lSuggestedWord.empty() == true) { + oCorrectedStr << lWord; + + } else { + oCorrectedStr << lSuggestedWord; + } } - const std::string lQueryString = oStr.str(); - std::cout << "QueryString `" << lQueryString << "'" << std::endl; + + const std::string lOriginalQueryString = oOriginalStr.str(); + const std::string lCorrectedQueryString = oCorrectedStr.str(); + const std::string lFullWordCorrectedString = + db.get_spelling_suggestion (lOriginalQueryString, 4); + + std::cout << "Query string `" << lOriginalQueryString + << "' ==> corrected query string: `" << lCorrectedQueryString + << "' and correction for the full query string: `" + << lFullWordCorrectedString << "'" << std::endl; // Build the query object Xapian::QueryParser lQueryParser; lQueryParser.set_database (db); // As explained in 
http://www.xapian.org/docs/queryparser.html, // Xapian::Query::OP_ADJ is better than Xapian::Query::OP_PHRASE, - // but only available from version 1.0.13 of Xapian + // but only available from version 1.0.13 of Xapian. // lQueryParser.set_default_op (Xapian::Query::OP_ADJ); lQueryParser.set_default_op (Xapian::Query::OP_PHRASE); @@ -51,17 +68,28 @@ << std::endl; Xapian::Query lQuery = - lQueryParser.parse_query (lQueryString, + lQueryParser.parse_query (lOriginalQueryString, Xapian::QueryParser::FLAG_BOOLEAN | Xapian::QueryParser::FLAG_PHRASE | Xapian::QueryParser::FLAG_LOVEHATE | Xapian::QueryParser::FLAG_SPELLING_CORRECTION); - Xapian::Query lCorrectedQuery = lQueryParser.get_corrected_query_string(); + //Xapian::Query lCorrectedQuery= lQueryParser.get_corrected_query_string(); + Xapian::Query lCorrectedQuery = + lQueryParser.parse_query (lCorrectedQueryString, + Xapian::QueryParser::FLAG_BOOLEAN + | Xapian::QueryParser::FLAG_PHRASE + | Xapian::QueryParser::FLAG_LOVEHATE); + Xapian::Query lFullQueryCorrected = + lQueryParser.parse_query (lFullWordCorrectedString, + Xapian::QueryParser::FLAG_BOOLEAN + | Xapian::QueryParser::FLAG_PHRASE + | Xapian::QueryParser::FLAG_LOVEHATE); + std::cout << "Query `" << lQuery.get_description() - << "', Corrected query `" << lCorrectedQuery.get_description() - << "' " - << std::endl; + << "', corrected query `" << lCorrectedQuery.get_description() + << "' and corrected for full query `" + << lFullQueryCorrected.get_description() << "' " << std::endl; // Give the query object to the enquire session enquire.set_query (lQuery); @@ -70,17 +98,25 @@ Xapian::MSet matches = enquire.get_mset (0, 10); // Display the results - const int nbMatches = matches.size(); + int nbMatches = matches.size(); std::cout << nbMatches << " results found" << std::endl; - - // if (true) { if (nbMatches == 0) { enquire.set_query (lCorrectedQuery); matches = enquire.get_mset (0, 10); - //const Xapian::MSet matchesAll = enquire.get_mset (); - if 
(matches.size() == matches.max_size()) { - std::cout << "Corrected string matches all the documents" + + // Display the results + nbMatches = matches.size(); + std::cout << nbMatches << " results found on corrected string" + << std::endl; + + if (nbMatches == 0) { + enquire.set_query (lFullQueryCorrected); + matches = enquire.get_mset (0, 10); + + // Display the results + nbMatches = matches.size(); + std::cout << nbMatches << " results found on corrected full string" << std::endl; } } |
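The search strategy of revision 130 has two parts: a per-word correction (keep the original word when no spelling suggestion comes back) and a cascade that tries the original, then the word-by-word corrected, then the full-string corrected query until one returns matches. Below is a rough, Xapian-free sketch of that control flow. Both function names and the callback parameters are assumptions made for illustration; in string_search.cpp the callbacks would presumably wrap db.get_spelling_suggestion() and enquire.get_mset().

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper (not OpenTrep code): rebuilds a query string word by
// word, keeping the original word whenever the suggester returns an empty
// string, i.e. when it has no correction to offer.
std::string correctWordByWord (
    const std::vector<std::string>& iWords,
    const std::function<std::string (const std::string&)>& iSuggest) {
  std::string oCorrected;
  for (std::size_t idx = 0; idx != iWords.size(); ++idx) {
    if (idx != 0) {
      oCorrected += ' ';
    }
    const std::string lSuggestion = iSuggest (iWords[idx]);
    oCorrected += lSuggestion.empty() ? iWords[idx] : lSuggestion;
  }
  return oCorrected;
}

// Hypothetical helper: returns the index of the first candidate query string
// for which iCountMatches reports at least one match, or iCandidates.size()
// when every candidate comes back empty.
std::size_t firstMatchingCandidate (
    const std::vector<std::string>& iCandidates,
    const std::function<std::size_t (const std::string&)>& iCountMatches) {
  for (std::size_t idx = 0; idx != iCandidates.size(); ++idx) {
    if (iCountMatches (iCandidates[idx]) != 0) {
      return idx;
    }
  }
  return iCandidates.size();
}
```

Expressing the cascade as an ordered list of candidates avoids the nested if (nbMatches == 0) blocks and makes it straightforward to append further fallbacks later; whether that fits the actual OpenTrep design is left open here.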
From: <den...@us...> - 2009-07-17 17:19:16
|
Revision: 134 http://opentrep.svn.sourceforge.net/opentrep/?rev=134&view=rev Author: denis_arnaud Date: 2009-07-17 17:19:13 +0000 (Fri, 17 Jul 2009) Log Message: ----------- [Dev] Added a few factory classes. Modified Paths: -------------- trunk/opentrep/opentrep/OPENTREP_Service.hpp trunk/opentrep/opentrep/batches/indexer.cpp trunk/opentrep/opentrep/batches/searcher.cpp trunk/opentrep/opentrep/bom/Names.cpp trunk/opentrep/opentrep/bom/Names.hpp trunk/opentrep/opentrep/bom/Place.cpp trunk/opentrep/opentrep/bom/Place.hpp trunk/opentrep/opentrep/bom/PlaceList.hpp trunk/opentrep/opentrep/bom/Result.cpp trunk/opentrep/opentrep/bom/Result.hpp trunk/opentrep/opentrep/bom/ResultHolder.cpp trunk/opentrep/opentrep/bom/ResultHolder.hpp trunk/opentrep/opentrep/bom/StringMatcher.cpp trunk/opentrep/opentrep/bom/World.cpp trunk/opentrep/opentrep/bom/World.hpp trunk/opentrep/opentrep/bom/sources.mk trunk/opentrep/opentrep/command/DBManager.cpp trunk/opentrep/opentrep/command/RequestInterpreter.cpp trunk/opentrep/opentrep/command/RequestInterpreter.hpp trunk/opentrep/opentrep/factory/FacPlace.cpp trunk/opentrep/opentrep/factory/FacPlace.hpp trunk/opentrep/opentrep/factory/FacWorld.cpp trunk/opentrep/opentrep/factory/sources.mk trunk/opentrep/opentrep/service/OPENTREP_Service.cpp trunk/opentrep/refdata/data/ref_place_names.csv trunk/opentrep/test/parsers/search_string_parser.cpp Added Paths: ----------- trunk/opentrep/opentrep/bom/PlaceHolder.cpp trunk/opentrep/opentrep/bom/PlaceHolder.hpp trunk/opentrep/opentrep/factory/FacPlaceHolder.cpp trunk/opentrep/opentrep/factory/FacPlaceHolder.hpp trunk/opentrep/opentrep/factory/FacResult.cpp trunk/opentrep/opentrep/factory/FacResult.hpp trunk/opentrep/opentrep/factory/FacResultHolder.cpp trunk/opentrep/opentrep/factory/FacResultHolder.hpp Modified: trunk/opentrep/opentrep/OPENTREP_Service.hpp =================================================================== --- trunk/opentrep/opentrep/OPENTREP_Service.hpp 2009-07-17 00:10:38 UTC (rev 
133) +++ trunk/opentrep/opentrep/OPENTREP_Service.hpp 2009-07-17 17:19:13 UTC (rev 134) @@ -6,6 +6,7 @@ // ////////////////////////////////////////////////////////////////////// // STL #include <ostream> +#include <string> // OPENTREP #include <opentrep/OPENTREP_Types.hpp> @@ -18,14 +19,12 @@ class OPENTREP_Service { public: /** Constructor. */ - OPENTREP_Service (); + OPENTREP_Service (std::ostream& ioLogStream, + const std::string& iXapianDatabaseFilepath); + /** Destructor. */ ~OPENTREP_Service(); - /** Initialise. */ - void init (std::ostream& ioLogStream, - const std::string& iTravelDatabaseName); - /** Build the Xapian database (index) on the BOM held in memory. */ void buildSearchIndex (); @@ -35,9 +34,15 @@ private: // /////// Construction and Destruction helper methods /////// - /** Default Constructor. */ + /** Default constructor. */ + OPENTREP_Service (); + /** Default copy constructor. */ OPENTREP_Service (const OPENTREP_Service&); + /** Initialise. */ + void init (std::ostream& ioLogStream, + const std::string& iXapianDatabaseFilepath); + /** Initilise the log. 
*/ void logInit (const LOG::EN_LogLevel iLogLevel, std::ostream& ioLogStream); Modified: trunk/opentrep/opentrep/batches/indexer.cpp =================================================================== --- trunk/opentrep/opentrep/batches/indexer.cpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/batches/indexer.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -127,8 +127,8 @@ logOutputFile.clear(); // Initialise the context - OPENTREP::OPENTREP_Service opentrepService; - opentrepService.init (logOutputFile, lXapianDatabaseName); + OPENTREP::OPENTREP_Service opentrepService (logOutputFile, + lXapianDatabaseName); // Launch the indexation opentrepService.buildSearchIndex(); Modified: trunk/opentrep/opentrep/batches/searcher.cpp =================================================================== --- trunk/opentrep/opentrep/batches/searcher.cpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/batches/searcher.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -4,15 +4,73 @@ #include <iostream> #include <sstream> #include <fstream> -#include <map> #include <vector> +#include <string> // Boost (Extended STL) #include <boost/date_time/posix_time/posix_time.hpp> #include <boost/date_time/gregorian/gregorian.hpp> +#include <boost/tokenizer.hpp> #include <boost/program_options.hpp> // OPENTREP #include <opentrep/OPENTREP_Service.hpp> +#include <opentrep/config/opentrep-paths.hpp> +// //////// Type definitions /////// +typedef std::vector<std::string> WordList_T; + +// //////// Constants ////// +/** Default name and location for the log file. */ +const std::string K_OPENTREP_DEFAULT_LOG_FILENAME ("opentrep_searcher.log"); + +/** Default name and location for the Xapian database. */ +const std::string K_OPENTREP_DEFAULT_DATABSE_FILEPATH ("/tmp/opentrep/traveldb"); + +/** Default travel query string, to be seached against the Xapian database. 
*/ +const std::string K_OPENTREP_DEFAULT_QUERY_STRING ("sna francicso rio de janero lso anglese reykyavki"); + +/** Default error distance for spelling corrections. */ +const unsigned short K_OPENTREP_DEFAULT_SPELLING_ERROR_DISTANCE = 3; + +// ////////////////////////////////////////////////////////////////////// +void tokeniseStringIntoWordList (const std::string& iPhrase, + WordList_T& ioWordList) { + // Empty the word list + ioWordList.clear(); + + // Boost Tokeniser + typedef boost::tokenizer<boost::char_separator<char> > Tokeniser_T; + + // Define the separators + const boost::char_separator<char> lSepatorList(" .,;:|+-*/_=!@#$%`~^&(){}[]?'<>\""); + + // Initialise the phrase to be tokenised + Tokeniser_T lTokens (iPhrase, lSepatorList); + for (Tokeniser_T::const_iterator tok_iter = lTokens.begin(); + tok_iter != lTokens.end(); ++tok_iter) { + const std::string& lTerm = *tok_iter; + ioWordList.push_back (lTerm); + } + +} + +// ////////////////////////////////////////////////////////////////////// +std::string createStringFromWordList (const WordList_T& iWordList) { + std::ostringstream oStr; + + unsigned short idx = iWordList.size(); + for (WordList_T::const_iterator itWord = iWordList.begin(); + itWord != iWordList.end(); ++itWord, --idx) { + const std::string& lWord = *itWord; + oStr << lWord; + if (idx > 1) { + oStr << " "; + } + } + + return oStr.str(); +} + + // ///////// Parsing of Options & Configuration ///////// // A helper function to simplify the main part. template<class T> std::ostream& operator<< (std::ostream& os, @@ -21,34 +79,57 @@ return os; } -int readConfiguration (int argc, char* argv[]) { - int opt; - - // Declare a group of options that will be - // allowed only on command line - boost::program_options::options_description generic("Generic options"); +/** Early return status (so that it can be differentiated from an error). */ +const int K_OPENTREP_EARLY_RETURN_STATUS = 99; + +/** Read and parse the command line options. 
*/ +int readConfiguration (int argc, char* argv[], + unsigned short& ioSpellingErrorDistance, + std::string& ioQueryString, + std::string& ioDatabaseFilepath, + std::string& ioLogFilename) { + + // Initialise the travel query string, if that one is empty + if (ioQueryString.empty() == true) { + ioQueryString = K_OPENTREP_DEFAULT_QUERY_STRING; + } + + // Transform the query string into a list of words (STL strings) + WordList_T lWordList; + tokeniseStringIntoWordList (ioQueryString, lWordList); + + // Declare a group of options that will be allowed only on command line + boost::program_options::options_description generic ("Generic options"); generic.add_options() + ("prefix", "print installation prefix") ("version,v", "print version string") ("help,h", "produce help message"); - // Declare a group of options that will be allowed both on command line and in - // config file - boost::program_options::options_description config("Configuration"); + // Declare a group of options that will be allowed both on command + // line and in config file + boost::program_options::options_description config ("Configuration"); config.add_options() - ("optimization", - boost::program_options::value<int>(&opt)->default_value(10), - "optimization level") - ("include-path,I", - boost::program_options::value< std::vector<std::string> >()->composing(), - "include path"); + ("error,e", + boost::program_options::value< unsigned short >(&ioSpellingErrorDistance)->default_value(K_OPENTREP_DEFAULT_SPELLING_ERROR_DISTANCE), + "Spelling error distance (e.g., 3)") + ("query,q", + boost::program_options::value< WordList_T >(&lWordList)->multitoken(), + "Traval query word list (e.g. 
sna francicso rio de janero lso anglese reykyavki") + ("database,d", + boost::program_options::value< std::string >(&ioDatabaseFilepath)->default_value(K_OPENTREP_DEFAULT_DATABSE_FILEPATH), + "Xapian database filepath (e.g., /tmp/opentrep/traveldb)") + ("log,l", + boost::program_options::value< std::string >(&ioLogFilename)->default_value(K_OPENTREP_DEFAULT_LOG_FILENAME), + "Filepath for the logs") + ; // Hidden options, will be allowed both on command line and // in config file, but will not be shown to the user. - boost::program_options::options_description hidden("Hidden options"); + boost::program_options::options_description hidden ("Hidden options"); hidden.add_options() - ("input-file", + ("copyright", boost::program_options::value< std::vector<std::string> >(), - "input file"); + "Show the copyright (license)"); boost::program_options::options_description cmdline_options; cmdline_options.add(generic).add(config).add(hidden); @@ -56,45 +137,53 @@ boost::program_options::options_description config_file_options; config_file_options.add(config).add(hidden); - boost::program_options::options_description visible("Allowed options"); + boost::program_options::options_description visible ("Allowed options"); visible.add(generic).add(config); boost::program_options::positional_options_description p; - p.add("input-file", -1); + p.add ("copyright", -1); boost::program_options::variables_map vm; boost::program_options:: - store (boost::program_options::command_line_parser(argc, argv). - options (cmdline_options).positional(p).run(), vm); + store (boost::program_options::command_line_parser (argc, argv). 
+ options (cmdline_options).positional(p).run(), vm); - std::ifstream ifs ("request_parser.cfg"); + std::ifstream ifs ("opentrep_searcher.cfg"); boost::program_options::store (parse_config_file (ifs, config_file_options), vm); boost::program_options::notify (vm); if (vm.count ("help")) { std::cout << visible << std::endl; - return 0; + return K_OPENTREP_EARLY_RETURN_STATUS; } if (vm.count ("version")) { - std::cout << "Open Travel Request Parser, version 1.0" << std::endl; - return 0; + std::cout << PACKAGE_NAME << ", version " << PACKAGE_VERSION << std::endl; + return K_OPENTREP_EARLY_RETURN_STATUS; } - if (vm.count ("include-path")) { - std::cout << "Include paths are: " - << vm["include-path"].as< std::vector<std::string> >() - << std::endl; + if (vm.count ("prefix")) { + std::cout << "Installation prefix: " << PREFIXDIR << std::endl; + return K_OPENTREP_EARLY_RETURN_STATUS; } - if (vm.count ("input-file")) { - std::cout << "Input files are: " - << vm["input-file"].as< std::vector<std::string> >() + if (vm.count ("database")) { + ioDatabaseFilepath = vm["database"].as< std::string >(); + std::cout << "Xapian database filepath is: " << ioDatabaseFilepath << std::endl; } - std::cout << "Optimization level is " << opt << std::endl; + if (vm.count ("log")) { + ioLogFilename = vm["log"].as< std::string >(); + std::cout << "Log filename is: " << ioLogFilename << std::endl; + } + + std::cout << "The spelling error distance is: " << ioSpellingErrorDistance + << std::endl; + + ioQueryString = createStringFromWordList (lWordList); + std::cout << "The travel query string is: " << ioQueryString << std::endl; return 0; } @@ -105,29 +194,26 @@ try { // Travel query - OPENTREP::TravelQuery_T lTravelQuery ("sna francisco rio de janero lso angeles"); + OPENTREP::TravelQuery_T lTravelQuery; // Output log File - std::string lLogFilename ("searcher.log"); + std::string lLogFilename; // Xapian database name (directory of the index) - OPENTREP::TravelDatabaseName_T 
lXapianDatabaseName ("traveldb"); + OPENTREP::TravelDatabaseName_T lXapianDatabaseName; - if (argc >= 1 && argv[1] != NULL) { - std::istringstream istr (argv[1]); - istr >> lTravelQuery; - } + // Xapian spelling error distance + unsigned short lSpellingErrorDistance; - if (argc >= 2 && argv[2] != NULL) { - std::istringstream istr (argv[2]); - istr >> lLogFilename; + // Call the command-line option parser + const int lOptionParserStatus = + readConfiguration (argc, argv, lSpellingErrorDistance, lTravelQuery, + lXapianDatabaseName, lLogFilename); + + if (lOptionParserStatus == K_OPENTREP_EARLY_RETURN_STATUS) { + return 0; } - if (argc >= 3 && argv[3] != NULL) { - std::istringstream istr (argv[3]); - istr >> lXapianDatabaseName; - } - // Set the log parameters std::ofstream logOutputFile; // open and clean the log outputfile @@ -135,8 +221,8 @@ logOutputFile.clear(); // Initialise the context - OPENTREP::OPENTREP_Service opentrepService; - opentrepService.init (logOutputFile, lXapianDatabaseName); + OPENTREP::OPENTREP_Service opentrepService (logOutputFile, + lXapianDatabaseName); // Query the Xapian database (index) opentrepService.interpretTravelRequest (lTravelQuery); Modified: trunk/opentrep/opentrep/bom/Names.cpp =================================================================== --- trunk/opentrep/opentrep/bom/Names.cpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/bom/Names.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -28,6 +28,17 @@ // ////////////////////////////////////////////////////////////////////// Names::~Names() { } + + // ////////////////////////////////////////////////////////////////////// + std::string Names::getFirstName() const { + if (_nameList.empty() == true) { + return ""; + } + NameList_T::const_iterator itName = _nameList.begin(); + assert (itName != _nameList.end()); + const std::string& lName = *itName; + return lName; + } // ////////////////////////////////////////////////////////////////////// const std::string 
Names::describeShortKey() const { Modified: trunk/opentrep/opentrep/bom/Names.hpp =================================================================== --- trunk/opentrep/opentrep/bom/Names.hpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/bom/Names.hpp 2009-07-17 17:19:13 UTC (rev 134) @@ -46,6 +46,11 @@ return _nameList; } + /** Get the first name of the list. + <br>Note that it can be empty (when the list is itself empty). */ + std::string getFirstName() const; + + // /////////// Setters /////////////// /** Set the language code. */ void setLanguageCode (const Language::EN_Language& iLanguageCode) { Modified: trunk/opentrep/opentrep/bom/Place.cpp =================================================================== --- trunk/opentrep/opentrep/bom/Place.cpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/bom/Place.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -2,7 +2,7 @@ // Import section // ////////////////////////////////////////////////////////////////////// // C -#include <assert.h> +#include <cassert> // OpenTrep BOM #include <opentrep/bom/Place.hpp> #include <opentrep/service/Logger.hpp> @@ -10,12 +10,12 @@ namespace OPENTREP { // ////////////////////////////////////////////////////////////////////// - Place::Place () : _world (NULL) { + Place::Place () : _world (NULL), _placeHolder (NULL) { } // ////////////////////////////////////////////////////////////////////// Place::Place (const Place& iPlace) : - _world (iPlace._world), + _world (iPlace._world), _placeHolder (iPlace._placeHolder), _placeCode (iPlace._placeCode), _cityCode (iPlace._cityCode), _stateCode (iPlace._stateCode), _countryCode (iPlace._countryCode), _regionCode (iPlace._regionCode), _continentCode (iPlace._continentCode), @@ -67,6 +67,33 @@ } // ////////////////////////////////////////////////////////////////////// + std::string Place::toShortString() const { + /* When the city code is empty, it means that the place is a city and + not an airport. 
The city code is thus the same as the place code + itself. */ + std::ostringstream oStr; + oStr << describeShortKey() << ", "; + if (_cityCode.empty()) { + oStr << _placeCode << ", "; + } else { + oStr << _cityCode << ", "; + } + oStr << _stateCode + << ", " << _countryCode << ", " << _regionCode + << ", " << _continentCode << ", " << _timeZoneGroup + << ", " << _longitude << ", " << _latitude << ", " << _docID; + + NameMatrix_T::const_iterator itNameHolder = _nameMatrix.begin(); + const Names& lNameHolder = itNameHolder->second; + const std::string& lFirstName = lNameHolder.getFirstName(); + if (lFirstName.empty() == false) { + oStr << ", " << lFirstName << "."; + } + + return oStr.str(); + } + + // ////////////////////////////////////////////////////////////////////// void Place::toStream (std::ostream& ioOut) const { ioOut << toString(); } Modified: trunk/opentrep/opentrep/bom/Place.hpp =================================================================== --- trunk/opentrep/opentrep/bom/Place.hpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/bom/Place.hpp 2009-07-17 17:19:13 UTC (rev 134) @@ -18,11 +18,13 @@ // Forward declarations class World; + class PlaceHolder; /** Structure modelling an place. */ class Place : public BomAbstract { friend class FacWorld; friend class FacPlace; + friend class FacPlaceHolder; friend class DbaPlace; public: // ///////// Getters //////// @@ -141,6 +143,7 @@ /** Reset the map of name lists. */ void resetMatrix(); + public: // ///////// Display methods //////// @@ -155,6 +158,9 @@ /** Get the serialised version of the Place object. */ std::string toString() const; + /** Get a short display of the Business Object. */ + std::string toShortString() const; + /** Get a string describing the whole key (differentiating two objects at any level). */ const std::string describeKey() const; @@ -182,6 +188,9 @@ /** Parent World. */ World* _world; + /** Parent PlaceHolder. 
*/ + PlaceHolder* _placeHolder; + private: // /////// Attributes ///////// /** Place code. */ Added: trunk/opentrep/opentrep/bom/PlaceHolder.cpp =================================================================== --- trunk/opentrep/opentrep/bom/PlaceHolder.cpp (rev 0) +++ trunk/opentrep/opentrep/bom/PlaceHolder.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -0,0 +1,81 @@ +// ////////////////////////////////////////////////////////////////////// +// Import section +// ////////////////////////////////////////////////////////////////////// +// C +#include <cassert> +#include <sstream> +// OpenTREP +#include <opentrep/bom/Place.hpp> +#include <opentrep/bom/PlaceHolder.hpp> +#include <opentrep/service/Logger.hpp> + +namespace OPENTREP { + + // ////////////////////////////////////////////////////////////////////// + PlaceHolder::PlaceHolder () { + init(); + } + + // ////////////////////////////////////////////////////////////////////// + PlaceHolder::~PlaceHolder () { + } + + // ////////////////////////////////////////////////////////////////////// + void PlaceHolder::init () { + _placeList.clear(); + _placeOrderedList.clear(); + } + + // ////////////////////////////////////////////////////////////////////// + const std::string PlaceHolder::describeShortKey() const { + std::ostringstream oStr; + return oStr.str(); + } + + // ////////////////////////////////////////////////////////////////////// + const std::string PlaceHolder::describeKey() const { + return describeShortKey(); + } + + // ////////////////////////////////////////////////////////////////////// + std::string PlaceHolder::toString() const { + std::ostringstream oStr; + oStr << describeShortKey() << std::endl; + + for (PlaceOrderedList_T::const_iterator itPlace = _placeOrderedList.begin(); + itPlace != _placeOrderedList.end(); ++itPlace) { + const Place* lPlace_ptr = *itPlace; + assert (lPlace_ptr != NULL); + + oStr << lPlace_ptr->toString() << std::endl; + } + + return oStr.str(); + } + + // 
////////////////////////////////////////////////////////////////////// + std::string PlaceHolder::toShortString() const { + std::ostringstream oStr; + oStr << describeShortKey() << std::endl; + + for (PlaceOrderedList_T::const_iterator itPlace = _placeOrderedList.begin(); + itPlace != _placeOrderedList.end(); ++itPlace) { + const Place* lPlace_ptr = *itPlace; + assert (lPlace_ptr != NULL); + + oStr << lPlace_ptr->toShortString() << std::endl; + } + + return oStr.str(); + } + + // ////////////////////////////////////////////////////////////////////// + void PlaceHolder::toStream (std::ostream& ioOut) const { + ioOut << toString(); + } + + // ////////////////////////////////////////////////////////////////////// + void PlaceHolder::fromStream (std::istream& ioIn) { + } + +} Added: trunk/opentrep/opentrep/bom/PlaceHolder.hpp =================================================================== --- trunk/opentrep/opentrep/bom/PlaceHolder.hpp (rev 0) +++ trunk/opentrep/opentrep/bom/PlaceHolder.hpp 2009-07-17 17:19:13 UTC (rev 134) @@ -0,0 +1,80 @@ +#ifndef __OPENTREP_BOM_PLACEHOLDER_HPP +#define __OPENTREP_BOM_PLACEHOLDER_HPP + +// ////////////////////////////////////////////////////////////////////// +// Import section +// ////////////////////////////////////////////////////////////////////// +// OpenTREP +#include <opentrep/OPENTREP_Types.hpp> +#include <opentrep/bom/BomAbstract.hpp> +#include <opentrep/bom/PlaceList.hpp> + +namespace OPENTREP { + + /** Class wrapping functions on a list of Place objects. */ + class PlaceHolder : public BomAbstract { + friend class FacPlaceHolder; + public: + // ////////////// Getters ///////////// + /** Retrieve the list of place objects. */ + const PlaceList_T& getPlaceList() const { + return _placeList; + } + + + // ////////////// Setters ///////////// + + + public: + // /////////// Business methods ///////// + + + public: + // /////////// Display support methods ///////// + /** Dump a Business Object into an output stream. 
+ @param ostream& the output stream. */ + void toStream (std::ostream& ioOut) const; + + /** Read a Business Object from an input stream. + @param istream& the input stream. */ + void fromStream (std::istream& ioIn); + + /** Get the serialised version of the Business Object. */ + std::string toString() const; + + /** Get a short display of the Business Object. */ + std::string toShortString() const; + + /** Get a string describing the whole key (differentiating two objects + at any level). */ + const std::string describeKey() const; + + /** Get a string describing the short key (differentiating two objects + at the same level). */ + const std::string describeShortKey() const; + + + private: + // ////////////// Constructors and Destructors ///////////// + /** Default constructor. */ + PlaceHolder (); + /** Default copy constructor. */ + PlaceHolder (const PlaceHolder&); + /** Destructor. */ + ~PlaceHolder (); + /** Initialise (reset the list of documents). */ + void init (); + + + private: + // /////////////// Attributes //////////////// + /** List of place objects, sorted by Place ID. */ + PlaceList_T _placeList; + + /** List of place objects, the sort order corresponding to their + insertion order. 
*/ + PlaceOrderedList_T _placeOrderedList; + }; + +} +#endif // __OPENTREP_BOM_PLACEHOLDER_HPP Modified: trunk/opentrep/opentrep/bom/PlaceList.hpp =================================================================== --- trunk/opentrep/opentrep/bom/PlaceList.hpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/bom/PlaceList.hpp 2009-07-17 17:19:13 UTC (rev 134) @@ -15,8 +15,9 @@ // ///////////// Type definitions //////////////////// typedef std::size_t PlaceID_T; - typedef std::map<PlaceID_T, Place*> PlaceList_T; - typedef std::list<Place*> SimplePlaceList_T; + // typedef std::map<PlaceID_T, Place*> PlaceDirectList_T; + typedef std::map<std::string, Place*> PlaceList_T; + typedef std::list<Place*> PlaceOrderedList_T; } #endif // __OPENTREP_BOM_PLACELIST_HPP Modified: trunk/opentrep/opentrep/bom/Result.cpp =================================================================== --- trunk/opentrep/opentrep/bom/Result.cpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/bom/Result.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -14,7 +14,7 @@ // ////////////////////////////////////////////////////////////////////// Result::Result (const Xapian::Database& iDatabase) - : _database (iDatabase) { + : _resultHolder (NULL), _database (iDatabase) { init(); } Modified: trunk/opentrep/opentrep/bom/Result.hpp =================================================================== --- trunk/opentrep/opentrep/bom/Result.hpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/bom/Result.hpp 2009-07-17 17:19:13 UTC (rev 134) @@ -11,10 +11,13 @@ namespace OPENTREP { + // Forward declarations + class ResultHolder; + /** Class wrapping functions on a list of Xapian Document objects. */ class Result : public BomAbstract { + friend class FacResultHolder; friend class FacResult; - friend class ResultHolder; public: // ////////////// Getters ///////////// /** Get the query string. 
*/ @@ -86,6 +89,9 @@ private: // /////////////// Attributes //////////////// + /** Parent ResultHolder. */ + ResultHolder* _resultHolder; + /** Query string having generated the list of document. */ TravelQuery_T _queryString; Modified: trunk/opentrep/opentrep/bom/ResultHolder.cpp =================================================================== --- trunk/opentrep/opentrep/bom/ResultHolder.cpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/bom/ResultHolder.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -10,6 +10,9 @@ #include <opentrep/bom/StringMatcher.hpp> #include <opentrep/bom/Result.hpp> #include <opentrep/bom/ResultHolder.hpp> +// TODO: move that out of the BOM layer +#include <opentrep/factory/FacResultHolder.hpp> +#include <opentrep/factory/FacResult.hpp> #include <opentrep/service/Logger.hpp> namespace OPENTREP { @@ -89,18 +92,19 @@ resulting string gets empty. */ DocumentList_T lDocumentList; - Result* lResult_ptr = new Result (_database); - assert (lResult_ptr != NULL); + // TODO: move that out of the BOM layer + Result& lResult = FacResult::instance().create (_database); std::string lQueryString (lRemainingQueryString); // - lResult_ptr->setQueryString (lQueryString); - lResult_ptr->searchString (); + lResult.setQueryString (lQueryString); + lResult.searchString (); // Add the Result object (holding the list of matching // documents) to the dedicated list. - _resultList.push_back (lResult_ptr); + // TODO: move that out of the BOM layer + FacResultHolder::initLinkWithResult (*this, lResult); /** Remove, from the lRemainingQueryString string, the part which @@ -113,7 +117,7 @@ 'rio de janeiro'. So, the already parsed part, namely 'sna francisco', must be subtracted from the initial query string. 
*/ - lQueryString = lResult_ptr->getQueryString(); + lQueryString = lResult.getQueryString(); StringMatcher::subtractParsedToRemaining (lQueryString, lRemainingQueryString); Modified: trunk/opentrep/opentrep/bom/ResultHolder.hpp =================================================================== --- trunk/opentrep/opentrep/bom/ResultHolder.hpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/bom/ResultHolder.hpp 2009-07-17 17:19:13 UTC (rev 134) @@ -19,7 +19,6 @@ /** Class wrapping functions on a list of Result objects. */ class ResultHolder : public BomAbstract { friend class FacResultHolder; - friend class RequestInterpreter; public: // ////////////// Getters ///////////// /** Get the query string. */ Modified: trunk/opentrep/opentrep/bom/StringMatcher.cpp =================================================================== --- trunk/opentrep/opentrep/bom/StringMatcher.cpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/bom/StringMatcher.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -20,13 +20,31 @@ namespace OPENTREP { + // ///////////// Type definitions ////////// + /** Edit distance (e.g., 2 or 3). */ + typedef unsigned int EditDistance_T; + + /** Given the size of the phrase, determine the allowed edit distance for + spelling purpose. For instance, an edit distance of 1 will be allowed + on a 4-letter word, while an edit distance of 3 will be allowed on an + 11-letter word. */ + // ////////////////////////////////////////////////////////////////////// + static unsigned int calculateEditDistance (const std::string& iPhrase) { + EditDistance_T oEditDistance = 2; + + const EditDistance_T lQueryStringSize = iPhrase.size(); + + oEditDistance = lQueryStringSize / 3; + return oEditDistance; + } + /** For each of the word in the given list, perform spelling corrections. If the word is correctly spelled, it is copied as is. Otherwise, a corrected version is stored. 
*/ // ////////////////////////////////////////////////////////////////////// static void createCorrectedWordList (const WordList_T& iOriginalWordList, - WordList_T& ioCorrectedWordList, - const Xapian::Database& iDatabase) { + WordList_T& ioCorrectedWordList, + const Xapian::Database& iDatabase) { // Empty the target list ioCorrectedWordList.clear(); @@ -36,8 +54,9 @@ for (WordList_T::const_iterator itWord = iOriginalWordList.begin(); itWord != iOriginalWordList.end(); ++itWord) { const std::string& lOriginalWord = *itWord; + const EditDistance_T lEditDistance= calculateEditDistance(lOriginalWord); const std::string& lSuggestedWord = - iDatabase.get_spelling_suggestion (lOriginalWord, 3); + iDatabase.get_spelling_suggestion (lOriginalWord, lEditDistance); if (lSuggestedWord.empty() == true) { ioCorrectedWordList.push_back (lOriginalWord); @@ -110,8 +129,10 @@ phrase/string. With the above example, 'sna francisco' yields the suggestion 'san francisco'. */ + const EditDistance_T lEditDistance = + calculateEditDistance (lOriginalQueryString); const std::string lFullWordCorrectedString = - ioDatabase.get_spelling_suggestion (lOriginalQueryString, 3); + ioDatabase.get_spelling_suggestion (lOriginalQueryString, lEditDistance); // DEBUG /* Modified: trunk/opentrep/opentrep/bom/World.cpp =================================================================== --- trunk/opentrep/opentrep/bom/World.cpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/bom/World.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -62,8 +62,8 @@ oStr << shortDisplay(); - for (SimplePlaceList_T::const_iterator itPlace = _simplePlaceList.begin(); - itPlace != _simplePlaceList.end(); ++itPlace) { + for (PlaceOrderedList_T::const_iterator itPlace = _placeOrderedList.begin(); + itPlace != _placeOrderedList.end(); ++itPlace) { const Place* lPlace_ptr = *itPlace; assert (lPlace_ptr != NULL); Modified: trunk/opentrep/opentrep/bom/World.hpp 
=================================================================== --- trunk/opentrep/opentrep/bom/World.hpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/bom/World.hpp 2009-07-17 17:19:13 UTC (rev 134) @@ -23,14 +23,14 @@ public: // //////////// Getters ///////////// /** Get the list of Place objects. - const PlaceList_T& getPlaceList () const { + const PlaceDirectList_T& getPlaceList () const { return _placeList; } */ /** Get the list of Place objects. */ - const SimplePlaceList_T& getSimplePlaceList () const { - return _simplePlaceList; + const PlaceOrderedList_T& getSimplePlaceList () const { + return _placeOrderedList; } // //////////// Setters ///////////// @@ -80,12 +80,12 @@ /** List of Place objects. <br>That list is actually a STL map, indexed on the Xapian document ID. */ - // PlaceList_T _placeList; + // PlaceDirectList_T _placeList; /** List of Place objects. <br>That list is actually a STL list, to store temporarily Place objects when indexing the Xapian database. 
*/ - SimplePlaceList_T _simplePlaceList; + PlaceOrderedList_T _placeOrderedList; }; // ///////////// Type definitions //////////////////// Modified: trunk/opentrep/opentrep/bom/sources.mk =================================================================== --- trunk/opentrep/opentrep/bom/sources.mk 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/bom/sources.mk 2009-07-17 17:19:13 UTC (rev 134) @@ -3,11 +3,12 @@ $(top_srcdir)/opentrep/bom/Language.hpp \ $(top_srcdir)/opentrep/bom/GenericBom.hpp \ $(top_srcdir)/opentrep/bom/World.hpp \ + $(top_srcdir)/opentrep/bom/WordList.hpp \ + $(top_srcdir)/opentrep/bom/WordHolder.hpp \ $(top_srcdir)/opentrep/bom/Names.hpp \ $(top_srcdir)/opentrep/bom/Place.hpp \ $(top_srcdir)/opentrep/bom/PlaceList.hpp \ - $(top_srcdir)/opentrep/bom/WordList.hpp \ - $(top_srcdir)/opentrep/bom/WordHolder.hpp \ + $(top_srcdir)/opentrep/bom/PlaceHolder.hpp \ $(top_srcdir)/opentrep/bom/DocumentList.hpp \ $(top_srcdir)/opentrep/bom/Result.hpp \ $(top_srcdir)/opentrep/bom/ResultList.hpp \ @@ -17,9 +18,10 @@ $(top_srcdir)/opentrep/bom/BomType.cpp \ $(top_srcdir)/opentrep/bom/Language.cpp \ $(top_srcdir)/opentrep/bom/World.cpp \ + $(top_srcdir)/opentrep/bom/WordHolder.cpp \ $(top_srcdir)/opentrep/bom/Names.cpp \ $(top_srcdir)/opentrep/bom/Place.cpp \ - $(top_srcdir)/opentrep/bom/WordHolder.cpp \ + $(top_srcdir)/opentrep/bom/PlaceHolder.cpp \ $(top_srcdir)/opentrep/bom/Result.cpp \ $(top_srcdir)/opentrep/bom/ResultHolder.cpp \ $(top_srcdir)/opentrep/bom/StringMatcher.cpp Modified: trunk/opentrep/opentrep/command/DBManager.cpp =================================================================== --- trunk/opentrep/opentrep/command/DBManager.cpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/command/DBManager.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -185,7 +185,7 @@ hasStillData = iterateOnStatement (lSelectStatement, ioPlace, shouldNotDoReset); if (hasStillData == true) { - throw new MultipleRowsForASingleDocIDException(); + 
throw MultipleRowsForASingleDocIDException(); } // Debug Modified: trunk/opentrep/opentrep/command/RequestInterpreter.cpp =================================================================== --- trunk/opentrep/opentrep/command/RequestInterpreter.cpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/command/RequestInterpreter.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -11,7 +11,10 @@ #include <opentrep/bom/Place.hpp> #include <opentrep/bom/ResultHolder.hpp> #include <opentrep/bom/Result.hpp> +#include <opentrep/bom/PlaceHolder.hpp> +#include <opentrep/factory/FacPlaceHolder.hpp> #include <opentrep/factory/FacPlace.hpp> +#include <opentrep/factory/FacResultHolder.hpp> #include <opentrep/command/DBManager.hpp> #include <opentrep/command/RequestInterpreter.hpp> #include <opentrep/service/Logger.hpp> @@ -24,40 +27,31 @@ void RequestInterpreter:: interpretTravelRequest (soci::session& ioSociSession, const TravelDatabaseName_T& iTravelDatabaseName, - const TravelQuery_T& iTravelQuery) { + const TravelQuery_T& iTravelQuery, + PlaceHolder& ioPlaceHolder) { try { // Make the database Xapian::Database lXapianDatabase (iTravelDatabaseName); - // TODO: Use FacResultHolder for the following - ResultHolder* lResultHolder_ptr = new ResultHolder (iTravelQuery, - lXapianDatabase); - assert (lResultHolder_ptr != NULL); + // Create a ResultHolder object + ResultHolder& lResultHolder = + FacResultHolder::instance().create (iTravelQuery, lXapianDatabase); // - lResultHolder_ptr->searchString(); + lResultHolder.searchString(); // DEBUG OPENTREP_LOG_DEBUG (std::endl - << "_________________________________________" - << std::endl << "=========================================" - << std::endl - << "-----------------------------------------" - << std::endl - << "Matching list: " << std::endl - << lResultHolder_ptr->toString() - << "_________________________________________" - << std::endl + << std::endl << "Matching list: " << std::endl + << lResultHolder.toString() << 
"=========================================" - << std::endl - << "-----------------------------------------" << std::endl << std::endl); // Browse the list of result objects - const ResultList_T& lResultList = lResultHolder_ptr->getResultList(); + const ResultList_T& lResultList = lResultHolder.getResultList(); for (ResultList_T::const_iterator itResult = lResultList.begin(); itResult != lResultList.end(); ++itResult) { // Retrieve the result object @@ -86,8 +80,14 @@ DBManager::retrievePlace (ioSociSession, lDocID, lPlace); if (hasRetrievedPlace == true) { + // Insert the Place object within the PlaceHolder object + FacPlaceHolder::initLinkWithPlace (ioPlaceHolder, lPlace); + + // DEBUG OPENTREP_LOG_DEBUG ("Retrieved Document: " << lPlace.toString()); + } else { + // DEBUG OPENTREP_LOG_DEBUG ("No retrieved Document for ID = " << lDocID); } } Modified: trunk/opentrep/opentrep/command/RequestInterpreter.hpp =================================================================== --- trunk/opentrep/opentrep/command/RequestInterpreter.hpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/command/RequestInterpreter.hpp 2009-07-17 17:19:13 UTC (rev 134) @@ -16,6 +16,9 @@ namespace OPENTREP { + // Forward declarations + class PlaceHolder; + /** Command wrapping the travel request process. */ class RequestInterpreter { friend class OPENTREP_Service; @@ -23,7 +26,7 @@ /** Interpret a search query. */ static void interpretTravelRequest (soci::session&, const TravelDatabaseName_T&, - const TravelQuery_T&); + const TravelQuery_T&, PlaceHolder&); private: /** Constructors. 
*/ Modified: trunk/opentrep/opentrep/factory/FacPlace.cpp =================================================================== --- trunk/opentrep/opentrep/factory/FacPlace.cpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/factory/FacPlace.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -2,7 +2,7 @@ // Import section // ////////////////////////////////////////////////////////////////////// // C -#include <assert.h> +#include <cassert> // OPENTREP #include <opentrep/bom/Place.hpp> #include <opentrep/factory/FacSupervisor.hpp> @@ -13,6 +13,14 @@ FacPlace* FacPlace::_instance = NULL; // ////////////////////////////////////////////////////////////////////// + FacPlace::FacPlace () { + } + + // ////////////////////////////////////////////////////////////////////// + FacPlace::FacPlace (const FacPlace&) { + } + + // ////////////////////////////////////////////////////////////////////// FacPlace::~FacPlace () { _instance = NULL; } Modified: trunk/opentrep/opentrep/factory/FacPlace.hpp =================================================================== --- trunk/opentrep/opentrep/factory/FacPlace.hpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/factory/FacPlace.hpp 2009-07-17 17:19:13 UTC (rev 134) @@ -40,8 +40,8 @@ /** Default Constructor. 
<br>This constructor is private in order to ensure the singleton pattern.*/ - FacPlace () {} - FacPlace (const FacPlace&) {} + FacPlace (); + FacPlace (const FacPlace&); private: /** The unique instance.*/ Added: trunk/opentrep/opentrep/factory/FacPlaceHolder.cpp =================================================================== --- trunk/opentrep/opentrep/factory/FacPlaceHolder.cpp (rev 0) +++ trunk/opentrep/opentrep/factory/FacPlaceHolder.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -0,0 +1,76 @@ +// ////////////////////////////////////////////////////////////////////// +// Import section +// ////////////////////////////////////////////////////////////////////// +// C +#include <cassert> +// OPENTREP +#include <opentrep/bom/PlaceHolder.hpp> +#include <opentrep/bom/Place.hpp> +#include <opentrep/factory/FacSupervisor.hpp> +#include <opentrep/factory/FacPlaceHolder.hpp> +#include <opentrep/service/Logger.hpp> + +namespace OPENTREP { + + FacPlaceHolder* FacPlaceHolder::_instance = NULL; + + // ////////////////////////////////////////////////////////////////////// + FacPlaceHolder::FacPlaceHolder () { + } + + // ////////////////////////////////////////////////////////////////////// + FacPlaceHolder::FacPlaceHolder (const FacPlaceHolder&) { + } + + // ////////////////////////////////////////////////////////////////////// + FacPlaceHolder::~FacPlaceHolder () { + _instance = NULL; + } + + // ////////////////////////////////////////////////////////////////////// + FacPlaceHolder& FacPlaceHolder::instance () { + + if (_instance == NULL) { + _instance = new FacPlaceHolder(); + assert (_instance != NULL); + + FacSupervisor::instance().registerBomFactory (_instance); + } + return *_instance; + } + + // ////////////////////////////////////////////////////////////////////// + PlaceHolder& FacPlaceHolder::create () { + PlaceHolder* oPlaceHolder_ptr = NULL; + + oPlaceHolder_ptr = new PlaceHolder (); + assert (oPlaceHolder_ptr != NULL); + + // The new object is added to the Bom pool 
+ _pool.push_back (oPlaceHolder_ptr); + + return *oPlaceHolder_ptr; + } + + // ////////////////////////////////////////////////////////////////////// + void FacPlaceHolder::initLinkWithPlace (PlaceHolder& ioPlaceHolder, + Place& ioPlace) { + // Link the PlaceHolder to the Place, and vice versa + ioPlace._placeHolder = &ioPlaceHolder; + + // Add the Place to the PlaceHolder internal map (of Place objects) + const bool insertSucceeded = ioPlaceHolder._placeList. + insert (PlaceList_T::value_type (ioPlace.describeShortKey(), + &ioPlace)).second; + if (insertSucceeded == false) { + OPENTREP_LOG_ERROR ("Insertion failed for " + << ioPlaceHolder.describeKey() + << " and " << ioPlace.describeShortKey()); + assert (insertSucceeded == true); + } + + // Add the Place to the PlaceHolder internal list (of Place objects) + ioPlaceHolder._placeOrderedList.push_back (&ioPlace); + } + +} Added: trunk/opentrep/opentrep/factory/FacPlaceHolder.hpp =================================================================== --- trunk/opentrep/opentrep/factory/FacPlaceHolder.hpp (rev 0) +++ trunk/opentrep/opentrep/factory/FacPlaceHolder.hpp 2009-07-17 17:19:13 UTC (rev 134) @@ -0,0 +1,56 @@ +#ifndef __OPENTREP_FAC_FACPLACEHOLDER_HPP +#define __OPENTREP_FAC_FACPLACEHOLDER_HPP + +// ////////////////////////////////////////////////////////////////////// +// Import section +// ////////////////////////////////////////////////////////////////////// +// OPENTREP +#include <opentrep/factory/FacBomAbstract.hpp> + +namespace OPENTREP { + + /** Forward declarations. */ + class PlaceHolder; + class Place; + + /** Factory for Place. */ + class FacPlaceHolder : public FacBomAbstract { + public: + + /** Provide the unique instance. + <br> The singleton is instantiated when first used + @return FacPlaceHolder& */ + static FacPlaceHolder& instance(); + + /** Destructor. 
+ <br> The Destruction put the _instance to NULL + in order to be clean for the next FacPlaceHolder::instance() */ + virtual ~FacPlaceHolder(); + + /** Create a new PlaceHolder object. + <br>This new object is added to the list of instantiated objects. + @return PlaceHolder& The newly created object. */ + PlaceHolder& create (); + + /** Initialise the link between a PlaceHolder and a Place. + @param PlaceHolder& + @param Place& + @exception FacExceptionNullPointer + @exception FacException.*/ + static void initLinkWithPlace (PlaceHolder&, Place&); + + + private: + /** Default Constructor. + <br>This constructor is private in order to ensure the singleton + pattern.*/ + FacPlaceHolder (); + FacPlaceHolder (const FacPlaceHolder&); + + private: + /** The unique instance.*/ + static FacPlaceHolder* _instance; + + }; +} +#endif // __OPENTREP_FAC_FACPLACEHOLDER_HPP Added: trunk/opentrep/opentrep/factory/FacResult.cpp =================================================================== --- trunk/opentrep/opentrep/factory/FacResult.cpp (rev 0) +++ trunk/opentrep/opentrep/factory/FacResult.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -0,0 +1,66 @@ +// ////////////////////////////////////////////////////////////////////// +// Import section +// ////////////////////////////////////////////////////////////////////// +// C +#include <cassert> +// OPENTREP +#include <opentrep/bom/Result.hpp> +#include <opentrep/factory/FacSupervisor.hpp> +#include <opentrep/factory/FacResult.hpp> + +namespace OPENTREP { + + FacResult* FacResult::_instance = NULL; + + // ////////////////////////////////////////////////////////////////////// + FacResult::FacResult () { + } + + // ////////////////////////////////////////////////////////////////////// + FacResult::FacResult (const FacResult&) { + } + + // ////////////////////////////////////////////////////////////////////// + FacResult::~FacResult () { + _instance = NULL; + } + + // ////////////////////////////////////////////////////////////////////// 
+ FacResult& FacResult::instance () { + + if (_instance == NULL) { + _instance = new FacResult(); + assert (_instance != NULL); + + FacSupervisor::instance().registerBomFactory (_instance); + } + return *_instance; + } + + // ////////////////////////////////////////////////////////////////////// + Result& FacResult::create (const Xapian::Database& iXapianDatabase) { + Result* oResult_ptr = NULL; + + oResult_ptr = new Result (iXapianDatabase); + assert (oResult_ptr != NULL); + + // The new object is added to the Bom pool + _pool.push_back (oResult_ptr); + + return *oResult_ptr; + } + + // ////////////////////////////////////////////////////////////////////// + Result& FacResult::clone (const Result& iResult) { + Result* oResult_ptr = NULL; + + oResult_ptr = new Result (iResult); + assert (oResult_ptr != NULL); + + // The new object is added to the Bom pool + _pool.push_back (oResult_ptr); + + return *oResult_ptr; + } + +} Added: trunk/opentrep/opentrep/factory/FacResult.hpp =================================================================== --- trunk/opentrep/opentrep/factory/FacResult.hpp (rev 0) +++ trunk/opentrep/opentrep/factory/FacResult.hpp 2009-07-17 17:19:13 UTC (rev 134) @@ -0,0 +1,52 @@ +#ifndef __OPENTREP_FAC_FACRESULT_HPP +#define __OPENTREP_FAC_FACRESULT_HPP + +// ////////////////////////////////////////////////////////////////////// +// Import section +// ////////////////////////////////////////////////////////////////////// +// OPENTREP +#include <opentrep/factory/FacBomAbstract.hpp> + +namespace OPENTREP { + + /** Forward declarations. */ + class Result; + + /** Factory for Result. */ + class FacResult : public FacBomAbstract { + public: + + /** Provide the unique instance. + <br> The singleton is instantiated when first used + @return FacResult& */ + static FacResult& instance(); + + /** Destructor. 
+ <br> The Destruction put the _instance to NULL + in order to be clean for the next FacResult::instance() */ + virtual ~FacResult(); + + /** Create a new Result object. + <br>This new object is added to the list of instantiated objects. + @return Result& The newly created object. */ + Result& create (const Xapian::Database&); + + /** Create a copy of a Result object. + <br>This new object is added to the list of instantiated objects. + @return Result& The newly created object. */ + Result& clone (const Result&); + + private: + /** Default Constructor. + <br>This constructor is private in order to ensure the singleton + pattern.*/ + FacResult (); + FacResult (const FacResult&); + + private: + /** The unique instance.*/ + static FacResult* _instance; + + }; +} +#endif // __OPENTREP_FAC_FACRESULT_HPP Added: trunk/opentrep/opentrep/factory/FacResultHolder.cpp =================================================================== --- trunk/opentrep/opentrep/factory/FacResultHolder.cpp (rev 0) +++ trunk/opentrep/opentrep/factory/FacResultHolder.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -0,0 +1,66 @@ +// ////////////////////////////////////////////////////////////////////// +// Import section +// ////////////////////////////////////////////////////////////////////// +// C +#include <cassert> +// OPENTREP +#include <opentrep/bom/ResultHolder.hpp> +#include <opentrep/bom/Result.hpp> +#include <opentrep/factory/FacSupervisor.hpp> +#include <opentrep/factory/FacResultHolder.hpp> +#include <opentrep/service/Logger.hpp> + +namespace OPENTREP { + + FacResultHolder* FacResultHolder::_instance = NULL; + + // ////////////////////////////////////////////////////////////////////// + FacResultHolder::FacResultHolder () { + } + + // ////////////////////////////////////////////////////////////////////// + FacResultHolder::FacResultHolder (const FacResultHolder&) { + } + + // ////////////////////////////////////////////////////////////////////// + FacResultHolder::~FacResultHolder () { + 
_instance = NULL; + } + + // ////////////////////////////////////////////////////////////////////// + FacResultHolder& FacResultHolder::instance () { + + if (_instance == NULL) { + _instance = new FacResultHolder(); + assert (_instance != NULL); + + FacSupervisor::instance().registerBomFactory (_instance); + } + return *_instance; + } + + // ////////////////////////////////////////////////////////////////////// + ResultHolder& FacResultHolder::create (const TravelQuery_T& iQueryString, + const Xapian::Database& iDatabase) { + ResultHolder* oResultHolder_ptr = NULL; + + oResultHolder_ptr = new ResultHolder (iQueryString, iDatabase); + assert (oResultHolder_ptr != NULL); + + // The new object is added to the Bom pool + _pool.push_back (oResultHolder_ptr); + + return *oResultHolder_ptr; + } + + // ////////////////////////////////////////////////////////////////////// + void FacResultHolder::initLinkWithResult (ResultHolder& ioResultHolder, + Result& ioResult) { + // Link the ResultHolder to the Result, and vice versa + ioResult._resultHolder = &ioResultHolder; + + // Add the Result to the ResultHolder internal list (of Result objects) + ioResultHolder._resultList.push_back (&ioResult); + } + +} Added: trunk/opentrep/opentrep/factory/FacResultHolder.hpp =================================================================== --- trunk/opentrep/opentrep/factory/FacResultHolder.hpp (rev 0) +++ trunk/opentrep/opentrep/factory/FacResultHolder.hpp 2009-07-17 17:19:13 UTC (rev 134) @@ -0,0 +1,63 @@ +#ifndef __OPENTREP_FAC_FACRESULTHOLDER_HPP +#define __OPENTREP_FAC_FACRESULTHOLDER_HPP + +// ////////////////////////////////////////////////////////////////////// +// Import section +// ////////////////////////////////////////////////////////////////////// +// OpenTREP +#include <opentrep/factory/FacBomAbstract.hpp> +#include <opentrep/OPENTREP_Types.hpp> + +// Forward declarations +namespace Xapian { + class Database; +} + +namespace OPENTREP { + + /** Forward declarations. 
*/ + class ResultHolder; + class Result; + + /** Factory for Result. */ + class FacResultHolder : public FacBomAbstract { + public: + + /** Provide the unique instance. + <br> The singleton is instantiated when first used + @return FacResultHolder& */ + static FacResultHolder& instance(); + + /** Destructor. + <br> The Destruction put the _instance to NULL + in order to be clean for the next FacResultHolder::instance() */ + virtual ~FacResultHolder(); + + /** Create a new ResultHolder object. + <br>This new object is added to the list of instantiated objects. + @return ResultHolder& The newly created object. */ + ResultHolder& create (const TravelQuery_T& iQueryString, + const Xapian::Database& iDatabase); + + /** Initialise the link between a ResultHolder and a Result. + @param ResultHolder& + @param Result& + @exception FacExceptionNullPointer + @exception FacException.*/ + static void initLinkWithResult (ResultHolder&, Result&); + + + private: + /** Default Constructor. + <br>This constructor is private in order to ensure the singleton + pattern.*/ + FacResultHolder (); + FacResultHolder (const FacResultHolder&); + + private: + /** The unique instance.*/ + static FacResultHolder* _instance; + + }; +} +#endif // __OPENTREP_FAC_FACRESULTHOLDER_HPP Modified: trunk/opentrep/opentrep/factory/FacWorld.cpp =================================================================== --- trunk/opentrep/opentrep/factory/FacWorld.cpp 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/factory/FacWorld.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -71,8 +71,8 @@ // Add the Place pointer to the dedicated list within the World object /* - const bool insertSucceeded2 = ioWorld._placeList. - insert (PlaceList_T::value_type (lPlaceID, &ioPlace)).second; + const bool insertSucceeded2 = ioWorld._placeDirectList. 
+ insert (PlaceDirectList_T::value_type (lPlaceID, &ioPlace)).second; if (insertSucceeded2 == false) { OPENTREP_LOG_ERROR ("Insertion failed for " << ioWorld.describeKey() << " and " << ioPlace.describeShortKey()); @@ -83,7 +83,7 @@ } // Add the Place pointer to the dedicated list within the World object - ioWorld._simplePlaceList.push_back (&ioPlace); + ioWorld._placeOrderedList.push_back (&ioPlace); } // ////////////////////////////////////////////////////////////////////// Modified: trunk/opentrep/opentrep/factory/sources.mk =================================================================== --- trunk/opentrep/opentrep/factory/sources.mk 2009-07-17 00:10:38 UTC (rev 133) +++ trunk/opentrep/opentrep/factory/sources.mk 2009-07-17 17:19:13 UTC (rev 134) @@ -3,10 +3,16 @@ $(top_srcdir)/opentrep/factory/FacSupervisor.hpp \ $(top_srcdir)/opentrep/factory/FacOpenTrepServiceContext.hpp \ $(top_srcdir)/opentrep/factory/FacWorld.hpp \ - $(top_srcdir)/opentrep/factory/FacPlace.hpp + $(top_srcdir)/opentrep/factory/FacPlaceHolder.hpp \ + $(top_srcdir)/opentrep/factory/FacPlace.hpp \ + $(top_srcdir)/opentrep/factory/FacResultHolder.hpp \ + $(top_srcdir)/opentrep/factory/FacResult.hpp fac_cc_sources = $(top_srcdir)/opentrep/factory/FacBomAbstract.cpp \ $(top_srcdir)/opentrep/factory/FacServiceAbstract.cpp \ $(top_srcdir)/opentrep/factory/FacSupervisor.cpp \ $(top_srcdir)/opentrep/factory/FacOpenTrepServiceContext.cpp \ $(top_srcdir)/opentrep/factory/FacWorld.cpp \ - $(top_srcdir)/opentrep/factory/FacPlace.cpp + $(top_srcdir)/opentrep/factory/FacPlaceHolder.cpp \ + $(top_srcdir)/opentrep/factory/FacPlace.cpp \ + $(top_srcdir)/opentrep/factory/FacResultHolder.cpp \ + $(top_srcdir)/opentrep/factory/FacResult.cpp Modified: trunk/opentrep/opentrep/service/OPENTREP_Service.cpp =================================================================== --- trunk/opentrep/opentrep/service/OPENTREP_Service.cpp 2009-07-17 00:10:38 UTC (rev 133) +++ 
trunk/opentrep/opentrep/service/OPENTREP_Service.cpp 2009-07-17 17:19:13 UTC (rev 134) @@ -2,11 +2,13 @@ // Import section // ////////////////////////////////////////////////////////////////////// // C -#include <assert.h> +#include <cassert> // OpenTrep #include <opentrep/basic/BasConst_OPENTREP_Service.hpp> #include <opentrep/basic/BasChronometer.hpp> +#include <opentrep/bom/PlaceHolder.hpp> #include <opentrep/factory/FacWorld.hpp> +#include <opentrep/factory/FacPlaceHolder.hpp> #include <opentrep/command/SociSessionManager.hpp> #include <opentrep/command/DBManager.hpp> #include <opentrep/command/IndexBuilder.hpp> @@ -19,12 +21,21 @@ namespace OPENTREP { // ////////////////////////////////////////////////////////////////////// + OPENTREP_Service::OPENTREP_Service (std::ostream& ioLogStream, + const std::string& iXapianDatabaseFilepath) + : _opentrepSer... [truncated message content] |
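Note on the factory pattern introduced in rev 134: the new FacPlaceHolder, FacResult and FacResultHolder classes all follow the same shape (a lazily created singleton, a pool holding every object the factory creates, and a static initLinkWith...() helper that wires a child object to its holder). The following is a minimal, self-contained sketch of that shape, not the committed code. Every name ending in "Sketch" is a hypothetical stand-in, the pool is kept in the factory itself rather than in FacBomAbstract, the registration with FacSupervisor is left out, and the linking helper returns a bool where the commit logs an error and asserts.

```cpp
// Hypothetical, simplified stand-ins for the classes touched in rev 134.
// This is an illustration of the pattern only, not the OpenTREP sources.
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <utility>
#include <vector>

class PlaceHolderSketch;

// Minimal BOM object: it knows its key and the holder it is linked to.
class PlaceSketch {
public:
  explicit PlaceSketch (const std::string& iKey)
    : _key (iKey), _placeHolder (NULL) {}

  const std::string& describeShortKey() const { return _key; }

  std::string _key;
  PlaceHolderSketch* _placeHolder;
};

// Holder keeping both a keyed map and an insertion-ordered list,
// mirroring the _placeList / _placeOrderedList pair seen in the diff.
class PlaceHolderSketch {
public:
  std::map<std::string, PlaceSketch*> _placeList;
  std::list<PlaceSketch*> _placeOrderedList;
};

// Singleton factory that records every object it creates in a pool.
class FacPlaceHolderSketch {
public:
  // Provide the unique instance, created on first use.
  static FacPlaceHolderSketch& instance() {
    if (_instance == NULL) {
      _instance = new FacPlaceHolderSketch();
    }
    return *_instance;
  }

  // Assumption: the factory owns the pooled objects and frees them here.
  ~FacPlaceHolderSketch() {
    for (std::size_t i = 0; i != _pool.size(); ++i) {
      delete _pool[i];
    }
    _instance = NULL;
  }

  // Create a new holder and add it to the pool.
  PlaceHolderSketch& create() {
    PlaceHolderSketch* oPlaceHolder_ptr = new PlaceHolderSketch();
    _pool.push_back (oPlaceHolder_ptr);
    return *oPlaceHolder_ptr;
  }

  // Link a place to a holder. Returns false, and changes nothing,
  // when a place with the same key is already registered.
  static bool initLinkWithPlace (PlaceHolderSketch& ioPlaceHolder,
                                 PlaceSketch& ioPlace) {
    const bool insertSucceeded = ioPlaceHolder._placeList.
      insert (std::make_pair (ioPlace.describeShortKey(), &ioPlace)).second;
    if (insertSucceeded == false) {
      return false;
    }
    ioPlace._placeHolder = &ioPlaceHolder;
    ioPlaceHolder._placeOrderedList.push_back (&ioPlace);
    return true;
  }

  std::size_t poolSize() const { return _pool.size(); }

private:
  // Private constructors, in order to ensure the singleton pattern.
  FacPlaceHolderSketch() {}
  FacPlaceHolderSketch (const FacPlaceHolderSketch&);

  std::vector<PlaceHolderSketch*> _pool;
  static FacPlaceHolderSketch* _instance;
};

FacPlaceHolderSketch* FacPlaceHolderSketch::_instance = NULL;
```

A caller would obtain the factory with FacPlaceHolderSketch::instance(), call create() to get a holder by reference, and then pass that holder together with each retrieved place to initLinkWithPlace(). Returning references from create() is what allows a command such as RequestInterpreter to drop the raw "new ... / assert (ptr != NULL)" sequence that the diff removes.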
From: <den...@us...> - 2009-07-18 00:50:11
Revision: 135 http://opentrep.svn.sourceforge.net/opentrep/?rev=135&view=rev Author: denis_arnaud Date: 2009-07-18 00:50:00 +0000 (Sat, 18 Jul 2009) Log Message: ----------- 1. Re-worked the searcher, so that the Result objects be created by the command (and no longer by the BOM itself). 2. Added building makefiles for database handling. Modified Paths: -------------- trunk/opentrep/Makefile.am trunk/opentrep/configure.ac trunk/opentrep/opentrep/OPENTREP_Types.hpp trunk/opentrep/opentrep/batches/indexer.cpp trunk/opentrep/opentrep/batches/searcher.cpp trunk/opentrep/opentrep/bom/DocumentList.hpp trunk/opentrep/opentrep/bom/Result.cpp trunk/opentrep/opentrep/bom/Result.hpp trunk/opentrep/opentrep/bom/ResultHolder.cpp trunk/opentrep/opentrep/bom/ResultHolder.hpp trunk/opentrep/opentrep/bom/StringMatcher.cpp trunk/opentrep/opentrep/bom/StringMatcher.hpp trunk/opentrep/opentrep/command/RequestInterpreter.cpp trunk/opentrep/opentrep/command/RequestInterpreter.hpp trunk/opentrep/opentrep/factory/FacResult.cpp trunk/opentrep/opentrep/factory/FacResult.hpp trunk/opentrep/opentrep/service/OPENTREP_Service.cpp Added Paths: ----------- trunk/opentrep/db/ trunk/opentrep/db/Makefile.am trunk/opentrep/db/admin/ trunk/opentrep/db/admin/Makefile.am trunk/opentrep/db/admin/create_opentrep_user.sh trunk/opentrep/db/admin/create_opentrep_user.sql trunk/opentrep/db/data/Makefile.am trunk/opentrep/db/data/sources.mk trunk/opentrep/db/maintenance/ trunk/opentrep/db/maintenance/Makefile.am trunk/opentrep/db/maintenance/create_and_fill_mysql_db.sh trunk/opentrep/db/maintenance/drop_tables_from_mysql_db.sh trunk/opentrep/db/maintenance/tables/ trunk/opentrep/db/maintenance/tables/Makefile.am trunk/opentrep/db/maintenance/tables/create_and_fill_mysql_db.sql trunk/opentrep/db/maintenance/tables/ref_city.sql trunk/opentrep/db/maintenance/tables/ref_db.sql trunk/opentrep/db/maintenance/tables/sources.mk Removed Paths: ------------- trunk/opentrep/db/data_structure/ 
trunk/opentrep/db/maintenance/ref_city.sql trunk/opentrep/db/maintenance/ref_db.sql trunk/opentrep/db/mysql/create_and_fill_mysql_db.sh trunk/opentrep/db/mysql/create_and_fill_mysql_db.sql trunk/opentrep/db/mysql/create_opentrep_user.sh trunk/opentrep/db/mysql/create_opentrep_user.sql trunk/opentrep/db/mysql/drop_tables_from_mysql_db.sh trunk/opentrep/refdata/ Property Changed: ---------------- trunk/opentrep/db/data/ trunk/opentrep/opentrep/batches/ Modified: trunk/opentrep/Makefile.am =================================================================== --- trunk/opentrep/Makefile.am 2009-07-17 17:19:13 UTC (rev 134) +++ trunk/opentrep/Makefile.am 2009-07-18 00:50:00 UTC (rev 135) @@ -24,7 +24,7 @@ EXTRA_DIST = @PACKAGE@.spec @PACKAGE@.m4 @PACKAGE@.pc Makefile.common # Build in these directories: -SUBDIRS = opentrep win32 po man $(INFO_DOC_DIR) $(HTML_DOC_DIR) $(TEST_DIR) +SUBDIRS = opentrep win32 po man $(INFO_DOC_DIR) $(HTML_DOC_DIR) db $(TEST_DIR) # Configuration helpers Modified: trunk/opentrep/configure.ac =================================================================== --- trunk/opentrep/configure.ac 2009-07-17 17:19:13 UTC (rev 134) +++ trunk/opentrep/configure.ac 2009-07-18 00:50:00 UTC (rev 135) @@ -233,6 +233,11 @@ doc/doxygen_html.cfg doc/sourceforge/howto_release_opentrep.html po/Makefile.in + db/Makefile + db/admin/Makefile + db/maintenance/Makefile + db/maintenance/tables/Makefile + db/data/Makefile test/com/Makefile test/parsers/Makefile test/Makefile Property changes on: trunk/opentrep/db ___________________________________________________________________ Added: svn:ignore + Makefile Makefile.in Added: trunk/opentrep/db/Makefile.am =================================================================== --- trunk/opentrep/db/Makefile.am (rev 0) +++ trunk/opentrep/db/Makefile.am 2009-07-18 00:50:00 UTC (rev 135) @@ -0,0 +1,6 @@ +## db sub-directory +include $(top_srcdir)/Makefile.common + +MAINTAINERCLEANFILES = Makefile.in + +SUBDIRS = admin data 
maintenance Property changes on: trunk/opentrep/db/admin ___________________________________________________________________ Added: svn:ignore + Makefile Makefile.in Added: trunk/opentrep/db/admin/Makefile.am =================================================================== --- trunk/opentrep/db/admin/Makefile.am (rev 0) +++ trunk/opentrep/db/admin/Makefile.am 2009-07-18 00:50:00 UTC (rev 135) @@ -0,0 +1,8 @@ +## db sub-directory +include $(top_srcdir)/Makefile.common + +MAINTAINERCLEANFILES = Makefile.in + +MYSQL_ADMIN_FILES = create_opentrep_user.sh create_opentrep_user.sql + +EXTRA_DIST = $(MYSQL_ADMIN_FILES) Copied: trunk/opentrep/db/admin/create_opentrep_user.sh (from rev 134, trunk/opentrep/refdata/mysql/create_opentrep_user.sh) =================================================================== --- trunk/opentrep/db/admin/create_opentrep_user.sh (rev 0) +++ trunk/opentrep/db/admin/create_opentrep_user.sh 2009-07-18 00:50:00 UTC (rev 135) @@ -0,0 +1,52 @@ +#!/bin/sh +# +# Two parameters are required for this script: +# - the administrator username +# - the administrator password +# +# Two parameters are optional: +# - the host server of the database +# - the port of the database +# + +if [ "$1" = "" -o "$2" = "" -o "$1" = "-h" -o "$1" = "--help" ]; +then + echo "Usage: $0 <Admin Username> <Admin password> [<Database Server Hostname> [<Database Server Port>]]" + echo "" + exit -1 +fi + +## +# Database Server Hostname +DB_HOST="localhost" +if [ "$3" != "" ]; +then + DB_HOST="$3" +fi + +# Database Server Port +DB_PORT="3306" +if [ "$4" != "" ]; +then + DB_PORT="$4" +fi + +# Database User +DB_USER="$1" + +# Database Password +DB_PASSWD="$2" + +# Database Name +DB_NAME="mysql" + +function createOpenTrepUser() { + echo "Creating the opentrep user within the database:" + mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} < ${SQL_FILE} + mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} -e "flush privileges" +} 
+ +# Creating the opentrep user +SQL_FILE="create_opentrep_user.sql" +createOpenTrepUser + Copied: trunk/opentrep/db/admin/create_opentrep_user.sql (from rev 134, trunk/opentrep/refdata/mysql/create_opentrep_user.sql) =================================================================== --- trunk/opentrep/db/admin/create_opentrep_user.sql (rev 0) +++ trunk/opentrep/db/admin/create_opentrep_user.sql 2009-07-18 00:50:00 UTC (rev 135) @@ -0,0 +1,7 @@ + +insert into `user` (`Host`, `User`, `Password`, `Select_priv`, `Insert_priv`, `Update_priv`, `Delete_priv`, `Create_priv`, `Drop_priv`, `Reload_priv`, `Shutdown_priv`, `Process_priv`, `File_priv`, `Grant_priv`, `References_priv`, `Index_priv`, `Alter_priv`, `Show_db_priv`, `Super_priv`, `Create_tmp_table_priv`, `Lock_tables_priv`, `Execute_priv`, `Repl_slave_priv`, `Repl_client_priv`, `Create_view_priv`, `Show_view_priv`, `Create_routine_priv`, `Alter_routine_priv`, `Create_user_priv`, `ssl_type`, `ssl_cipher`, `x509_issuer`, `x509_subject`, `max_questions`, `max_updates`, `max_connections`, `max_user_connections`) values +('%', 'opentrep', '*C21B5F0DB6BBABAA20B5496E75D652982A6AC65C', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'N', 'N', 'N', 'Y', 'N', 'N', 'Y', 'Y', 'N', 'N', 'Y', 'N', 'Y', 'N', 'N', 'Y', 'Y', 'Y', 'Y', 'N', '', '', '', '', 0, 0, 0, 0), +('localhost', 'opentrep', '*C21B5F0DB6BBABAA20B5496E75D652982A6AC65C', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'N', 'N', 'N', 'Y', 'N', 'N', 'Y', 'Y', 'N', 'N', 'Y', 'N', 'Y', 'N', 'N', 'Y', 'Y', 'Y', 'Y', 'N', '', '', '', '', 0, 0, 0, 0); + +flush privileges; + Property changes on: trunk/opentrep/db/data ___________________________________________________________________ Added: svn:ignore + Makefile Makefile.in Added: trunk/opentrep/db/data/Makefile.am =================================================================== --- trunk/opentrep/db/data/Makefile.am (rev 0) +++ trunk/opentrep/db/data/Makefile.am 2009-07-18 00:50:00 UTC (rev 135) @@ -0,0 +1,26 @@ +## db/data sub-directory +include 
$(top_srcdir)/Makefile.common +include $(top_srcdir)/db/data/sources.mk + +datadir = @datadir@ +pkgdatadir = $(datadir)/@PACKAGE@ +dbscriptsdir = $(pkgdatadir)/db/mysql/fill_tables + +MAINTAINERCLEANFILES = Makefile.in Makefile + +noinst_DATA = $(data_mysql_sources) +EXTRA_DIST = $(noinst_DATA) + + +# Targets +install-data-local: + $(mkinstalldirs) $(DESTDIR)$(dbscriptsdir); \ + for f in $(data_mysql_sources); do \ + $(INSTALL_DATA) $$f $(DESTDIR)$(dbscriptsdir); \ + done + +uninstall-local: + rm -rf $(DESTDIR)$(dbscriptsdir) + +clean-local: + rm -rf *.log *.tag Added: trunk/opentrep/db/data/sources.mk =================================================================== --- trunk/opentrep/db/data/sources.mk (rev 0) +++ trunk/opentrep/db/data/sources.mk 2009-07-18 00:50:00 UTC (rev 135) @@ -0,0 +1,4 @@ +data_mysql_sources = \ + $(top_srcdir)/db/data/ref_city.csv \ + $(top_srcdir)/db/data/ref_place_details.csv \ + $(top_srcdir)/db/data/ref_place_names.csv Property changes on: trunk/opentrep/db/maintenance ___________________________________________________________________ Added: svn:ignore + Makefile Makefile.in Added: trunk/opentrep/db/maintenance/Makefile.am =================================================================== --- trunk/opentrep/db/maintenance/Makefile.am (rev 0) +++ trunk/opentrep/db/maintenance/Makefile.am 2009-07-18 00:50:00 UTC (rev 135) @@ -0,0 +1,10 @@ +## db/maintenance sub-directory +include $(top_srcdir)/Makefile.common + +MAINTAINERCLEANFILES = Makefile.in + +SUBDIRS = tables + +TABLE_MAINT_FILES = create_and_fill_mysql_db.sh drop_tables_from_mysql_db.sh + +EXTRA_DIST = $(TABLE_MAINT_FILES) Copied: trunk/opentrep/db/maintenance/create_and_fill_mysql_db.sh (from rev 134, trunk/opentrep/refdata/mysql/create_and_fill_mysql_db.sh) =================================================================== --- trunk/opentrep/db/maintenance/create_and_fill_mysql_db.sh (rev 0) +++ trunk/opentrep/db/maintenance/create_and_fill_mysql_db.sh 2009-07-18 
00:50:00 UTC (rev 135)
@@ -0,0 +1,98 @@
+#!/bin/sh
+#
+# One parameter is required for this script:
+# - the username
+#
+# Two parameters are optional:
+# - the host server of the database
+# - the port of the database
+#
+
+if [ "$1" = "" -o "$1" = "-h" -o "$1" = "--help" ];
+then
+    echo "Usage: $0 <Database Username> [<Database Server Hostname> [<Database Server Port>]]"
+    echo ""
+    exit -1
+fi
+
+##
+# Database Server Hostname
+DB_HOST="localhost"
+if [ "$2" != "" ];
+then
+    DB_HOST="$2"
+fi
+
+# Database Server Port
+DB_PORT="3306"
+if [ "$3" != "" ];
+then
+    DB_PORT="$3"
+fi
+
+# Database User
+DB_USER="$1"
+
+# Database Password
+DB_PASSWD="${DB_USER}"
+
+# Database Name
+DB_NAME="opentrep"
+
+# Create the database
+function createDatabase() {
+    echo "The '${DB_NAME}' database will be created:"
+    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} mysql < ${SQL_FILE}
+}
+
+# Scan a SQL script for the names of (database) tables
+function createTable() {
+    echo "The ref_place_details and ref_place_names tables will be created:"
+    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} < ${SQL_FILE}
+}
+
+#
+function loadData() {
+    echo "The ref_place_details and ref_place_names tables will be filled from ../data/*.csv files:"
+    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} < ${SQL_LOADER_FILE}
+    echo "Done"
+}
+
+#
+function trimStateCode() {
+    echo "Triming the spaces from the state_code field of the ${TABLE} table:"
+    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} -e "update ${TABLE} set city_code=NULL where city_code='';"
+    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} -e "update ${TABLE} set state_code=NULL where state_code like '%null%';"
+    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} -e "update ${TABLE} set state_code=NULL where length(state_code)=2;"
+    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} -e "update ${TABLE} set state_code=substring(state_code,2,2) where length(state_code)=4;"
+}
+
+#
+function countRows() {
+    echo "Counting the rows from the ${TABLE} table:"
+    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} -e "select count(*) from ${TABLE};"
+}
+
+# Database
+SQL_FILE="../data_structure/ref_db.sql"
+createDatabase
+
+# Table: Airport and City
+SQL_FILE="../data_structure/ref_city.sql"
+createTable
+
+# Load data into the table
+SQL_LOADER_FILE="create_and_fill_mysql_db.sql"
+loadData
+
+# Trim the spaces from the state_code field of the ref_place_details table
+TABLE=ref_place_details
+trimStateCode
+
+# Count the rows from the ref_place_details table
+TABLE=ref_place_details
+countRows
+
+# Count the rows from the ref_place_names table
+TABLE=ref_place_names
+countRows

Copied: trunk/opentrep/db/maintenance/drop_tables_from_mysql_db.sh (from rev 134, trunk/opentrep/refdata/mysql/drop_tables_from_mysql_db.sh)
===================================================================
--- trunk/opentrep/db/maintenance/drop_tables_from_mysql_db.sh (rev 0)
+++ trunk/opentrep/db/maintenance/drop_tables_from_mysql_db.sh 2009-07-18 00:50:00 UTC (rev 135)
@@ -0,0 +1,54 @@
+#!/bin/sh
+#
+# One parameter is required for this script:
+# - the username
+#
+# Two parameters are optional:
+# - the host server of the database
+# - the port of the database
+#
+
+if [ "$1" = "" -o "$1" = "-h" -o "$1" = "--help" ];
+then
+    echo "Usage: $0 <Database Username> [<Database Server Hostname> [<Database Server Port>]]"
+    echo ""
+    exit -1
+fi
+
+##
+# Database Server Hostname
+DB_HOST="localhost"
+if [ "$2" != "" ];
+then
+    DB_HOST="$2"
+fi
+
+# Database Server Port
+DB_PORT="3306"
+if [ "$3" != "" ];
+then
+    DB_PORT="$3"
+fi
+
+# Database User
+DB_USER="$1"
+
+# Database Password
+DB_PASSWD="${DB_USER}"
+
+# Database Name
+DB_NAME="opentrep"
+
+# Drop a table
+function dropTable() {
+    echo "The ${TABLE} table will be dropped:"
+    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} -e "drop table ${DB_NAME}.${TABLE}"
+}
+
+# Table: drop the ref_place_details table
+TABLE=ref_place_details
+dropTable
+
+# Table: drop the ref_place_names table
+TABLE=ref_place_names
+dropTable

Deleted: trunk/opentrep/db/maintenance/ref_city.sql
===================================================================
--- trunk/opentrep/refdata/data_structure/ref_city.sql 2009-07-17 17:19:13 UTC (rev 134)
+++ trunk/opentrep/db/maintenance/ref_city.sql 2009-07-18 00:50:00 UTC (rev 135)
@@ -1,46 +0,0 @@
---
--- Place details
--- Does not depend on language
---
-create table if not exists ref_place_details (
-  code char(3) collate utf8_unicode_ci not null,
-  city_code char(3) collate utf8_unicode_ci,
-  xapian_docid integer,
-  is_airport char(1) collate utf8_unicode_ci not null,
-  is_city char(1) collate utf8_unicode_ci not null,
-  is_main char(1) collate utf8_unicode_ci not null default 'N',
-  is_commercial char(1) collate utf8_unicode_ci not null,
-  state_code varchar(5) collate utf8_unicode_ci,
-  country_code char(2) collate utf8_unicode_ci not null,
-  region_code varchar(5) collate utf8_unicode_ci not null,
-  continent_code varchar(4) collate utf8_unicode_ci not null,
-  time_zone_grp varchar(5) collate utf8_unicode_ci not null,
-  longitude float(20),
-  latitude float(20),
-  primary key (code),
-  key `geographical codes`(city_code, continent_code, country_code, region_code, time_zone_grp)
-) engine=myisam default charset=utf8 collate=utf8_unicode_ci;
-
---
--- Place names
--- Depends on language
---
-create table if not exists ref_place_names (
-  language_code char(2) collate utf8_unicode_ci not null,
-  code char(3) collate utf8_unicode_ci not null,
-  classical_name varchar(30) collate utf8_unicode_ci not null,
-  classical_name2 varchar(50) collate utf8_unicode_ci not null,
-  extended_name varchar(100) collate utf8_unicode_ci not null,
-  alternate_name1 varchar(60) collate utf8_unicode_ci,
-  alternate_name2 varchar(60) collate utf8_unicode_ci,
-  alternate_name3 varchar(60) collate utf8_unicode_ci,
-  alternate_name4 varchar(60) collate utf8_unicode_ci,
-  alternate_name5 varchar(60) collate utf8_unicode_ci,
-  alternate_name6 varchar(60) collate utf8_unicode_ci,
-  alternate_name7 varchar(60) collate utf8_unicode_ci,
-  alternate_name8 varchar(60) collate utf8_unicode_ci,
-  alternate_name9 varchar(60) collate utf8_unicode_ci,
-  alternate_name10 varchar(60) collate utf8_unicode_ci,
-  primary key (language_code, code)
-) engine=myisam default charset=utf8 collate=utf8_unicode_ci;
-

Deleted: trunk/opentrep/db/maintenance/ref_db.sql
===================================================================
--- trunk/opentrep/refdata/data_structure/ref_db.sql 2009-07-17 17:19:13 UTC (rev 134)
+++ trunk/opentrep/db/maintenance/ref_db.sql 2009-07-18 00:50:00 UTC (rev 135)
@@ -1,6 +0,0 @@
-
---
--- Create the opentrep database
---
-create database if not exists opentrep
-default character set utf8 collate utf8_unicode_ci;

Property changes on: trunk/opentrep/db/maintenance/tables
___________________________________________________________________
Added: svn:ignore
   + Makefile
Makefile.in

Added: trunk/opentrep/db/maintenance/tables/Makefile.am
===================================================================
--- trunk/opentrep/db/maintenance/tables/Makefile.am (rev 0)
+++ trunk/opentrep/db/maintenance/tables/Makefile.am 2009-07-18 00:50:00 UTC (rev 135)
@@ -0,0 +1,26 @@
+## db/maintenance/tables sub-directory
+include $(top_srcdir)/Makefile.common
+include $(top_srcdir)/db/maintenance/tables/sources.mk
+
+datadir = @datadir@
+pkgdatadir = $(datadir)/@PACKAGE@
+dbscriptsdir = $(pkgdatadir)/db/mysql/create_tables
+
+MAINTAINERCLEANFILES = Makefile.in Makefile
+
+noinst_DATA = $(dbscript_mysql_sources)
+EXTRA_DIST = $(noinst_DATA)
+
+
+# Targets
+install-data-local:
+	$(mkinstalldirs) $(DESTDIR)$(dbscriptsdir); \
+	for f in $(dbscript_mysql_sources); do \
+		$(INSTALL_DATA) $$f $(DESTDIR)$(dbscriptsdir); \
+	done
+
+uninstall-local:
+	rm -rf $(DESTDIR)$(dbscriptsdir)
+
+clean-local:
+	rm -rf *.log *.tag

Copied: trunk/opentrep/db/maintenance/tables/create_and_fill_mysql_db.sql (from rev 134, trunk/opentrep/refdata/mysql/create_and_fill_mysql_db.sql)
===================================================================
--- trunk/opentrep/db/maintenance/tables/create_and_fill_mysql_db.sql (rev 0)
+++ trunk/opentrep/db/maintenance/tables/create_and_fill_mysql_db.sql 2009-07-18 00:50:00 UTC (rev 135)
@@ -0,0 +1,16 @@
+
+--
+-- Load the Airport and City geographical details into the MySQL table
+--
+load data local infile '../data/ref_place_details.csv' ignore
+into table ref_place_details
+fields terminated by ',' enclosed by '' escaped by '\\'
+ignore 1 lines;
+
+--
+-- Load the Airport and City names into the MySQL table
+--
+load data local infile '../data/ref_place_names.csv' ignore
+into table ref_place_names
+fields terminated by ',' enclosed by '' escaped by '\\'
+ignore 1 lines;

Copied: trunk/opentrep/db/maintenance/tables/ref_city.sql (from rev 134, trunk/opentrep/refdata/data_structure/ref_city.sql)
===================================================================
--- trunk/opentrep/db/maintenance/tables/ref_city.sql (rev 0)
+++ trunk/opentrep/db/maintenance/tables/ref_city.sql 2009-07-18 00:50:00 UTC (rev 135)
@@ -0,0 +1,46 @@
+--
+-- Place details
+-- Does not depend on language
+--
+create table if not exists ref_place_details (
+  code char(3) collate utf8_unicode_ci not null,
+  city_code char(3) collate utf8_unicode_ci,
+  xapian_docid integer,
+  is_airport char(1) collate utf8_unicode_ci not null,
+  is_city char(1) collate utf8_unicode_ci not null,
+  is_main char(1) collate utf8_unicode_ci not null default 'N',
+  is_commercial char(1) collate utf8_unicode_ci not null,
+  state_code varchar(5) collate utf8_unicode_ci,
+  country_code char(2) collate utf8_unicode_ci not null,
+  region_code varchar(5) collate utf8_unicode_ci not null,
+  continent_code varchar(4) collate utf8_unicode_ci not null,
+  time_zone_grp varchar(5) collate utf8_unicode_ci not null,
+  longitude float(20),
+  latitude float(20),
+  primary key (code),
+  key `geographical codes`(city_code, continent_code, country_code, region_code, time_zone_grp)
+) engine=myisam default charset=utf8 collate=utf8_unicode_ci;
+
+--
+-- Place names
+-- Depends on language
+--
+create table if not exists ref_place_names (
+  language_code char(2) collate utf8_unicode_ci not null,
+  code char(3) collate utf8_unicode_ci not null,
+  classical_name varchar(30) collate utf8_unicode_ci not null,
+  classical_name2 varchar(50) collate utf8_unicode_ci not null,
+  extended_name varchar(100) collate utf8_unicode_ci not null,
+  alternate_name1 varchar(60) collate utf8_unicode_ci,
+  alternate_name2 varchar(60) collate utf8_unicode_ci,
+  alternate_name3 varchar(60) collate utf8_unicode_ci,
+  alternate_name4 varchar(60) collate utf8_unicode_ci,
+  alternate_name5 varchar(60) collate utf8_unicode_ci,
+  alternate_name6 varchar(60) collate utf8_unicode_ci,
+  alternate_name7 varchar(60) collate utf8_unicode_ci,
+  alternate_name8 varchar(60) collate utf8_unicode_ci,
+  alternate_name9 varchar(60) collate utf8_unicode_ci,
+  alternate_name10 varchar(60) collate utf8_unicode_ci,
+  primary key (language_code, code)
+) engine=myisam default charset=utf8 collate=utf8_unicode_ci;
+

Copied: trunk/opentrep/db/maintenance/tables/ref_db.sql (from rev 134, trunk/opentrep/refdata/data_structure/ref_db.sql)
===================================================================
--- trunk/opentrep/db/maintenance/tables/ref_db.sql (rev 0)
+++ trunk/opentrep/db/maintenance/tables/ref_db.sql 2009-07-18 00:50:00 UTC (rev 135)
@@ -0,0 +1,6 @@
+
+--
+-- Create the opentrep database
+--
+create database if not exists opentrep
+default character set utf8 collate utf8_unicode_ci;

Added: trunk/opentrep/db/maintenance/tables/sources.mk
===================================================================
--- trunk/opentrep/db/maintenance/tables/sources.mk (rev 0)
+++ trunk/opentrep/db/maintenance/tables/sources.mk 2009-07-18 00:50:00 UTC (rev 135)
@@ -0,0 +1,4 @@
+dbscript_mysql_sources = \
+	$(top_srcdir)/db/maintenance/tables/create_and_fill_mysql_db.sql \
+	$(top_srcdir)/db/maintenance/tables/ref_db.sql \
+	$(top_srcdir)/db/maintenance/tables/ref_city.sql

Deleted: trunk/opentrep/db/mysql/create_and_fill_mysql_db.sh
===================================================================
--- trunk/opentrep/refdata/mysql/create_and_fill_mysql_db.sh 2009-07-17 17:19:13 UTC (rev 134)
+++ trunk/opentrep/db/mysql/create_and_fill_mysql_db.sh 2009-07-18 00:50:00 UTC (rev 135)
@@ -1,98 +0,0 @@
-#!/bin/sh
-#
-# One parameter is required for this script:
-# - the username
-#
-# Two parameters are optional:
-# - the host server of the database
-# - the port of the database
-#
-
-if [ "$1" = "" -o "$1" = "-h" -o "$1" = "--help" ];
-then
-    echo "Usage: $0 <Database Username> [<Database Server Hostname> [<Database Server Port>]]"
-    echo ""
-    exit -1
-fi
-
-##
-# Database Server Hostname
-DB_HOST="localhost"
-if [ "$2" != "" ];
-then
-    DB_HOST="$2"
-fi
-
-# Database Server Port
-DB_PORT="3306"
-if [ "$3" != "" ];
-then
-    DB_PORT="$3"
-fi
-
-# Database User
-DB_USER="$1"
-
-# Database Password
-DB_PASSWD="${DB_USER}"
-
-# Database Name
-DB_NAME="opentrep"
-
-# Create the database
-function createDatabase() {
-    echo "The '${DB_NAME}' database will be created:"
-    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} mysql < ${SQL_FILE}
-}
-
-# Scan a SQL script for the names of (database) tables
-function createTable() {
-    echo "The ref_place_details and ref_place_names tables will be created:"
-    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} < ${SQL_FILE}
-}
-
-#
-function loadData() {
-    echo "The ref_place_details and ref_place_names tables will be filled from ../data/*.csv files:"
-    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} < ${SQL_LOADER_FILE}
-    echo "Done"
-}
-
-#
-function trimStateCode() {
-    echo "Triming the spaces from the state_code field of the ${TABLE} table:"
-    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} -e "update ${TABLE} set city_code=NULL where city_code='';"
-    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} -e "update ${TABLE} set state_code=NULL where state_code like '%null%';"
-    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} -e "update ${TABLE} set state_code=NULL where length(state_code)=2;"
-    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} -e "update ${TABLE} set state_code=substring(state_code,2,2) where length(state_code)=4;"
-}
-
-#
-function countRows() {
-    echo "Counting the rows from the ${TABLE} table:"
-    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} -e "select count(*) from ${TABLE};"
-}
-
-# Database
-SQL_FILE="../data_structure/ref_db.sql"
-createDatabase
-
-# Table: Airport and City
-SQL_FILE="../data_structure/ref_city.sql"
-createTable
-
-# Load data into the table
-SQL_LOADER_FILE="create_and_fill_mysql_db.sql"
-loadData
-
-# Trim the spaces from the state_code field of the ref_place_details table
-TABLE=ref_place_details
-trimStateCode
-
-# Count the rows from the ref_place_details table
-TABLE=ref_place_details
-countRows
-
-# Count the rows from the ref_place_names table
-TABLE=ref_place_names
-countRows

Deleted: trunk/opentrep/db/mysql/create_and_fill_mysql_db.sql
===================================================================
--- trunk/opentrep/refdata/mysql/create_and_fill_mysql_db.sql 2009-07-17 17:19:13 UTC (rev 134)
+++ trunk/opentrep/db/mysql/create_and_fill_mysql_db.sql 2009-07-18 00:50:00 UTC (rev 135)
@@ -1,16 +0,0 @@
-
---
--- Load the Airport and City geographical details into the MySQL table
---
-load data local infile '../data/ref_place_details.csv' ignore
-into table ref_place_details
-fields terminated by ',' enclosed by '' escaped by '\\'
-ignore 1 lines;
-
---
--- Load the Airport and City names into the MySQL table
---
-load data local infile '../data/ref_place_names.csv' ignore
-into table ref_place_names
-fields terminated by ',' enclosed by '' escaped by '\\'
-ignore 1 lines;

Deleted: trunk/opentrep/db/mysql/create_opentrep_user.sh
===================================================================
--- trunk/opentrep/refdata/mysql/create_opentrep_user.sh 2009-07-17 17:19:13 UTC (rev 134)
+++ trunk/opentrep/db/mysql/create_opentrep_user.sh 2009-07-18 00:50:00 UTC (rev 135)
@@ -1,52 +0,0 @@
-#!/bin/sh
-#
-# Two parameters are required for this script:
-# - the administrator username
-# - the administrator password
-#
-# Two parameters are optional:
-# - the host server of the database
-# - the port of the database
-#
-
-if [ "$1" = "" -o "$2" = "" -o "$1" = "-h" -o "$1" = "--help" ];
-then
-    echo "Usage: $0 <Admin Username> <Admin password> [<Database Server Hostname> [<Database Server Port>]]"
-    echo ""
-    exit -1
-fi
-
-##
-# Database Server Hostname
-DB_HOST="localhost"
-if [ "$3" != "" ];
-then
-    DB_HOST="$3"
-fi
-
-# Database Server Port
-DB_PORT="3306"
-if [ "$4" != "" ];
-then
-    DB_PORT="$4"
-fi
-
-# Database User
-DB_USER="$1"
-
-# Database Password
-DB_PASSWD="$2"
-
-# Database Name
-DB_NAME="mysql"
-
-function createOpenTrepUser() {
-    echo "Creating the opentrep user within the database:"
-    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} < ${SQL_FILE}
-    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} -e "flush privileges"
-}
-
-# Creating the opentrep user
-SQL_FILE="create_opentrep_user.sql"
-createOpenTrepUser
-

Deleted: trunk/opentrep/db/mysql/create_opentrep_user.sql
===================================================================
--- trunk/opentrep/refdata/mysql/create_opentrep_user.sql 2009-07-17 17:19:13 UTC (rev 134)
+++ trunk/opentrep/db/mysql/create_opentrep_user.sql 2009-07-18 00:50:00 UTC (rev 135)
@@ -1,7 +0,0 @@
-
-insert into `user` (`Host`, `User`, `Password`, `Select_priv`, `Insert_priv`, `Update_priv`, `Delete_priv`, `Create_priv`, `Drop_priv`, `Reload_priv`, `Shutdown_priv`, `Process_priv`, `File_priv`, `Grant_priv`, `References_priv`, `Index_priv`, `Alter_priv`, `Show_db_priv`, `Super_priv`, `Create_tmp_table_priv`, `Lock_tables_priv`, `Execute_priv`, `Repl_slave_priv`, `Repl_client_priv`, `Create_view_priv`, `Show_view_priv`, `Create_routine_priv`, `Alter_routine_priv`, `Create_user_priv`, `ssl_type`, `ssl_cipher`, `x509_issuer`, `x509_subject`, `max_questions`, `max_updates`, `max_connections`, `max_user_connections`) values
-('%', 'opentrep', '*C21B5F0DB6BBABAA20B5496E75D652982A6AC65C', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'N', 'N', 'N', 'Y', 'N', 'N', 'Y', 'Y', 'N', 'N', 'Y', 'N', 'Y', 'N', 'N', 'Y', 'Y', 'Y', 'Y', 'N', '', '', '', '', 0, 0, 0, 0),
-('localhost', 'opentrep', '*C21B5F0DB6BBABAA20B5496E75D652982A6AC65C', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'N', 'N', 'N', 'Y', 'N', 'N', 'Y', 'Y', 'N', 'N', 'Y', 'N', 'Y', 'N', 'N', 'Y', 'Y', 'Y', 'Y', 'N', '', '', '', '', 0, 0, 0, 0);
-
-flush privileges;
-

Deleted: trunk/opentrep/db/mysql/drop_tables_from_mysql_db.sh
===================================================================
--- trunk/opentrep/refdata/mysql/drop_tables_from_mysql_db.sh 2009-07-17 17:19:13 UTC (rev 134)
+++ trunk/opentrep/db/mysql/drop_tables_from_mysql_db.sh 2009-07-18 00:50:00 UTC (rev 135)
@@ -1,54 +0,0 @@
-#!/bin/sh
-#
-# One parameter is required for this script:
-# - the username
-#
-# Two parameters are optional:
-# - the host server of the database
-# - the port of the database
-#
-
-if [ "$1" = "" -o "$1" = "-h" -o "$1" = "--help" ];
-then
-    echo "Usage: $0 <Database Username> [<Database Server Hostname> [<Database Server Port>]]"
-    echo ""
-    exit -1
-fi
-
-##
-# Database Server Hostname
-DB_HOST="localhost"
-if [ "$2" != "" ];
-then
-    DB_HOST="$2"
-fi
-
-# Database Server Port
-DB_PORT="3306"
-if [ "$3" != "" ];
-then
-    DB_PORT="$3"
-fi
-
-# Database User
-DB_USER="$1"
-
-# Database Password
-DB_PASSWD="${DB_USER}"
-
-# Database Name
-DB_NAME="opentrep"
-
-# Drop a table
-function dropTable() {
-    echo "The ${TABLE} table will be dropped:"
-    mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} -e "drop table ${DB_NAME}.${TABLE}"
-}
-
-# Table: drop the ref_place_details table
-TABLE=ref_place_details
-dropTable
-
-# Table: drop the ref_place_names table
-TABLE=ref_place_names
-dropTable

Modified: trunk/opentrep/opentrep/OPENTREP_Types.hpp
===================================================================
--- trunk/opentrep/opentrep/OPENTREP_Types.hpp 2009-07-17 17:19:13 UTC (rev 134)
+++ trunk/opentrep/opentrep/OPENTREP_Types.hpp 2009-07-18 00:50:00 UTC (rev 135)
@@ -56,8 +56,11 @@
   /** Xapian document ID. */
   typedef int XapianDocID_T;
-  /** Travel Search Query. */
+  /** Travel search query. */
   typedef std::string TravelQuery_T;
+
+  /** Number of matching documents. */
+  typedef unsigned short NbOfMatches_T;
 }
 #endif // __OPENTREP_OPENTREP_TYPES_HPP

Property changes on: trunk/opentrep/opentrep/batches
___________________________________________________________________
Modified: svn:ignore
   - .deps
.libs
Makefile
Makefile.in
opentrep_indexer
opentrep_searcher
   + .deps
.libs
Makefile
Makefile.in
opentrep_indexer*
opentrep_searcher*

Modified: trunk/opentrep/opentrep/batches/indexer.cpp
===================================================================
--- trunk/opentrep/opentrep/batches/indexer.cpp 2009-07-17 17:19:13 UTC (rev 134)
+++ trunk/opentrep/opentrep/batches/indexer.cpp 2009-07-18 00:50:00 UTC (rev 135)
@@ -12,43 +12,56 @@
 #include <boost/program_options.hpp>
 // OPENTREP
 #include <opentrep/OPENTREP_Service.hpp>
+#include <opentrep/config/opentrep-paths.hpp>
+
+// //////// Type definitions ///////
+typedef std::vector<std::string> WordList_T;
+
+
+// //////// Constants //////
+/** Default name and location for the log file. */
+const std::string K_OPENTREP_DEFAULT_LOG_FILENAME ("opentrep_indexer.log");
+
+/** Default name and location for the Xapian database. */
+const std::string K_OPENTREP_DEFAULT_DATABSE_FILEPATH("/tmp/opentrep/traveldb");
+
+
 // ///////// Parsing of Options & Configuration /////////
-// A helper function to simplify the main part.
-template<class T> std::ostream& operator<< (std::ostream& os,
-                                            const std::vector<T>& v) {
-  std::copy (v.begin(), v.end(), std::ostream_iterator<T> (std::cout, " "));
-  return os;
-}
+/** Early return status (so that it can be differentiated from an error). */
+const int K_OPENTREP_EARLY_RETURN_STATUS = 99;
-int readConfiguration (int argc, char* argv[]) {
-  int opt;
-
-  // Declare a group of options that will be
-  // allowed only on command line
-  boost::program_options::options_description generic("Generic options");
+/** Read and parse the command line options. */
+int readConfiguration (int argc, char* argv[],
+                       std::string& ioDatabaseFilepath,
+                       std::string& ioLogFilename) {
+
+  // Declare a group of options that will be allowed only on command line
+  boost::program_options::options_description generic ("Generic options");
   generic.add_options()
+    ("prefix", "print installation prefix")
     ("version,v", "print version string")
     ("help,h", "produce help message");
-  // Declare a group of options that will be allowed both on command line and in
-  // config file
-  boost::program_options::options_description config("Configuration");
+  // Declare a group of options that will be allowed both on command
+  // line and in config file
+  boost::program_options::options_description config ("Configuration");
   config.add_options()
-    ("optimization",
-     boost::program_options::value<int>(&opt)->default_value(10),
-     "optimization level")
-    ("include-path,I",
-     boost::program_options::value< std::vector<std::string> >()->composing(),
-     "include path");
+    ("database,d",
+     boost::program_options::value< std::string >(&ioDatabaseFilepath)->default_value(K_OPENTREP_DEFAULT_DATABSE_FILEPATH),
+     "Xapian database filepath (e.g., /tmp/opentrep/traveldb)")
+    ("log,l",
+     boost::program_options::value< std::string >(&ioLogFilename)->default_value(K_OPENTREP_DEFAULT_LOG_FILENAME),
+     "Filepath for the logs")
+    ;
   // Hidden options, will be allowed both on command line and
   // in config file, but will not be shown to the user.
-  boost::program_options::options_description hidden("Hidden options");
+  boost::program_options::options_description hidden ("Hidden options");
   hidden.add_options()
-    ("input-file",
+    ("copyright",
      boost::program_options::value< std::vector<std::string> >(),
-     "input file");
+     "Show the copyright (license)");
   boost::program_options::options_description cmdline_options;
   cmdline_options.add(generic).add(config).add(hidden);
@@ -56,46 +69,48 @@
   boost::program_options::options_description config_file_options;
   config_file_options.add(config).add(hidden);
-  boost::program_options::options_description visible("Allowed options");
+  boost::program_options::options_description visible ("Allowed options");
   visible.add(generic).add(config);
   boost::program_options::positional_options_description p;
-  p.add("input-file", -1);
+  p.add ("copyright", -1);
   boost::program_options::variables_map vm;
   boost::program_options::
-    store (boost::program_options::command_line_parser(argc, argv).
-           options (cmdline_options).positional(p).run(), vm);
+    store (boost::program_options::command_line_parser (argc, argv).
+           options (cmdline_options).positional(p).run(), vm);
-  std::ifstream ifs ("request_parser.cfg");
+  std::ifstream ifs ("opentrep_indexer.cfg");
   boost::program_options::store (parse_config_file (ifs, config_file_options),
                                  vm);
   boost::program_options::notify (vm);
   if (vm.count ("help")) {
     std::cout << visible << std::endl;
-    return 0;
+    return K_OPENTREP_EARLY_RETURN_STATUS;
   }
   if (vm.count ("version")) {
-    std::cout << "Open Travel Request Parser, version 1.0" << std::endl;
-    return 0;
+    std::cout << PACKAGE_NAME << ", version " << PACKAGE_VERSION << std::endl;
+    return K_OPENTREP_EARLY_RETURN_STATUS;
   }
-  if (vm.count ("include-path")) {
-    std::cout << "Include paths are: "
-              << vm["include-path"].as< std::vector<std::string> >()
-              << std::endl;
+  if (vm.count ("prefix")) {
+    std::cout << "Installation prefix: " << PREFIXDIR << std::endl;
+    return K_OPENTREP_EARLY_RETURN_STATUS;
   }
-  if (vm.count ("input-file")) {
-    std::cout << "Input files are: "
-              << vm["input-file"].as< std::vector<std::string> >()
+  if (vm.count ("database")) {
+    ioDatabaseFilepath = vm["database"].as< std::string >();
+    std::cout << "Xapian database filepath is: " << ioDatabaseFilepath
               << std::endl;
   }
-  std::cout << "Optimization level is " << opt << std::endl;
-
+  if (vm.count ("log")) {
+    ioLogFilename = vm["log"].as< std::string >();
+    std::cout << "Log filename is: " << ioLogFilename << std::endl;
+  }
+
   return 0;
 }
@@ -105,19 +120,17 @@
   try {
     // Output log File
-    std::string lLogFilename ("indexer.log");
+    std::string lLogFilename;
     // Xapian database name (directory of the index)
-    OPENTREP::TravelDatabaseName_T lXapianDatabaseName ("traveldb");
+    OPENTREP::TravelDatabaseName_T lXapianDatabaseName;
-    if (argc >= 1 && argv[1] != NULL) {
-      std::istringstream istr (argv[1]);
-      istr >> lLogFilename;
-    }
+    // Call the command-line option parser
+    const int lOptionParserStatus =
+      readConfiguration (argc, argv, lXapianDatabaseName, lLogFilename);
-    if (argc >= 2 && argv[2] != NULL) {
-      std::istringstream istr (argv[2]);
-      istr >> lXapianDatabaseName;
+    if (lOptionParserStatus == K_OPENTREP_EARLY_RETURN_STATUS) {
+      return 0;
     }
     // Set the log parameters
@@ -135,7 +148,6 @@
     // Close the Log outputFile
     logOutputFile.close();
-
   } catch (const OPENTREP::RootException& otexp) {
     std::cerr << "Standard exception: " << otexp.what() << std::endl;

Modified: trunk/opentrep/opentrep/batches/searcher.cpp
===================================================================
--- trunk/opentrep/opentrep/batches/searcher.cpp 2009-07-17 17:19:13 UTC (rev 134)
+++ trunk/opentrep/opentrep/batches/searcher.cpp 2009-07-18 00:50:00 UTC (rev 135)
@@ -15,15 +15,17 @@
 #include <opentrep/OPENTREP_Service.hpp>
 #include <opentrep/config/opentrep-paths.hpp>
+
 // //////// Type definitions ///////
 typedef std::vector<std::string> WordList_T;
+
 // //////// Constants //////
 /** Default name and location for the log file. */
 const std::string K_OPENTREP_DEFAULT_LOG_FILENAME ("opentrep_searcher.log");
 /** Default name and location for the Xapian database. */
-const std::string K_OPENTREP_DEFAULT_DATABSE_FILEPATH ("/tmp/opentrep/traveldb");
+const std::string K_OPENTREP_DEFAULT_DATABSE_FILEPATH("/tmp/opentrep/traveldb");
 /** Default travel query string, to be seached against the Xapian database. */
 const std::string K_OPENTREP_DEFAULT_QUERY_STRING ("sna francicso rio de janero lso anglese reykyavki");
@@ -31,6 +33,7 @@
 /** Default error distance for spelling corrections. */
 const unsigned short K_OPENTREP_DEFAULT_SPELLING_ERROR_DISTANCE = 3;
+
 // //////////////////////////////////////////////////////////////////////
 void tokeniseStringIntoWordList (const std::string& iPhrase,
                                  WordList_T& ioWordList) {

Modified: trunk/opentrep/opentrep/bom/DocumentList.hpp
===================================================================
--- trunk/opentrep/opentrep/bom/DocumentList.hpp 2009-07-17 17:19:13 UTC (rev 134)
+++ trunk/opentrep/opentrep/bom/DocumentList.hpp 2009-07-18 00:50:00 UTC (rev 135)
@@ -5,14 +5,23 @@
 // Import section
 // //////////////////////////////////////////////////////////////////////
 // STL
-#include <map>
+#include <list>
+// OpenTREP
+#include <opentrep/OPENTREP_Types.hpp>
 // Xapian
 #include <xapian.h>
 namespace OPENTREP {
-  /** List of Xapian documents. */
-  typedef std::multimap<Xapian::percent, Xapian::Document> DocumentList_T;
+  /** Xapian document and its associated matching percentage. */
+  typedef std::pair<Xapian::percent, Xapian::Document> MatchingDocument_T;
+  /** A matching Xapian document, along with the query string which it
+      matches. */
+  typedef std::pair<TravelQuery_T, MatchingDocument_T> QueryAndDocument_T;
+
+  /** List of matching Xapian documents. */
+  typedef std::list<QueryAndDocument_T> DocumentList_T;
+
 }
 #endif // __OPENTREP_BOM_DOCUMENTLIST_HPP

Modified: trunk/opentrep/opentrep/bom/Result.cpp
===================================================================
--- trunk/opentrep/opentrep/bom/Result.cpp 2009-07-17 17:19:13 UTC (rev 134)
+++ trunk/opentrep/opentrep/bom/Result.cpp 2009-07-18 00:50:00 UTC (rev 135)
@@ -5,7 +5,7 @@
 #include <cassert>
 #include <string>
 #include <sstream>
-// OPENTREP
+// OpenTREP
 #include <opentrep/bom/StringMatcher.hpp>
 #include <opentrep/bom/Result.hpp>
 #include <opentrep/service/Logger.hpp>
@@ -24,7 +24,6 @@
   // //////////////////////////////////////////////////////////////////////
   void Result::init () {
-    _documentList.clear();
   }
   // //////////////////////////////////////////////////////////////////////
@@ -44,14 +43,11 @@
     std::ostringstream oStr;
     oStr << describeShortKey() << std::endl;
-    for (DocumentList_T::const_iterator itDoc = _documentList.begin();
-         itDoc != _documentList.end(); ++itDoc) {
-      const Xapian::percent& lPercent = itDoc->first;
-      const Xapian::Document& lDocument = itDoc->second;
-      const Xapian::docid& lDocID = lDocument.get_docid();
-      oStr << "Document ID " << lDocID << "\t" << lPercent
-           << "% [" << lDocument.get_data() << "]" << std::endl;
-    }
+    const Xapian::percent& lPercentage = _matchingDocument.first;
+    const Xapian::Document& lDocument = _matchingDocument.second;
+    const Xapian::docid& lDocID = lDocument.get_docid();
+    oStr << "Document ID " << lDocID << "\t" << lPercentage
+         << "% [" << lDocument.get_data() << "]" << std::endl;
     return oStr.str();
   }
@@ -65,67 +61,4 @@
   void Result::fromStream (std::istream& ioIn) {
   }
-  // //////////////////////////////////////////////////////////////////////
-  const Xapian::Document& Result::getBestMatchingDocument() const {
-    /**
-       Retrieve the best matching document. As the document list (STL map)
-       is sorted by ascending order of the matching percentage, the best
-       matching one is located at the end (back) of the list (STL map).
-    */
-    DocumentList_T::const_reverse_iterator itDocument = _documentList.rbegin();
-    return itDocument->second;
-  }
-
-  // //////////////////////////////////////////////////////////////////////
-  const Xapian::percent& Result::getBestMatchingPercentage() const {
-    /**
-       Retrieve the best matching document. As the document list (STL map)
-       is sorted by ascending order of the matching percentage, the best
-       matching one is located at the end (back) of the list (STL map).
-    */
-    DocumentList_T::const_reverse_iterator itDocument = _documentList.rbegin();
-    return itDocument->first;
-  }
-
-  // //////////////////////////////////////////////////////////////////////
-  void Result::searchString () {
-
-    // Catch any Xapian::Error exceptions thrown
-    try {
-
-      bool shouldStop = false;
-      while (shouldStop == false) {
-        // DEBUG
-        /*
-        OPENTREP_LOG_DEBUG (std::endl << "--------------------------------"
-                            << std::endl << "Current query string: `" << ioQueryString << "'");
-        */
-
-        // Retrieve the list of documents matching the query string
-        Xapian::MSet lMatchingSet;
-        StringMatcher::searchString (lMatchingSet, _queryString, _database);
-
-        // Create the corresponding list of documents
-        StringMatcher::createDocumentListFromMSet (lMatchingSet, _documentList);
-
-        // Stop if a result is found.
-        if (_documentList.empty() == false) {
-          shouldStop = true;
-          break;
-        }
-
-        // Remove a word from the query string
-        StringMatcher::removeOneWord (_queryString);
-
-        // Stop when the resulting string gets empty.
-        if (_queryString.empty() == true) {
-          shouldStop = true;
-        }
-      }
-
-    } catch (const Xapian::Error& error) {
-      OPENTREP_LOG_ERROR ("Exception: " << error.get_msg());
-    }
-  }
-
 }

Modified: trunk/opentrep/opentrep/bom/Result.hpp
===================================================================
--- trunk/opentrep/opentrep/bom/Result.hpp 2009-07-17 17:19:13 UTC (rev 134)
+++ trunk/opentrep/opentrep/bom/Result.hpp 2009-07-18 00:50:00 UTC (rev 135)
@@ -25,31 +25,43 @@
       return _queryString;
     }
-    /** Get the list of Xapian document objects. */
-    const DocumentList_T& getDocumentList() const {
-      return _documentList;
+    /** Get the Matching Xapian document object, along with its
+        corresponding matching percentage. */
+    const MatchingDocument_T& getMatchingDocument() const {
+      return _matchingDocument;
     }
-    /** Retrieve the best matching Xapian document object. */
-    const Xapian::Document& getBestMatchingDocument() const;
+    /** Retrieve the percentage corresponding to the matching Xapian
+        document object. */
+    const Xapian::percent& getPercentage() const {
+      return _matchingDocument.first;
+    }
+
+    /** Retrieve the matching Xapian document object. */
+    const Xapian::Document& getDocument() const {
+      return _matchingDocument.second;
+    }
-    /** Retrieve the percentage corresponding to the best matching
-        Xapian document object. */
-    const Xapian::percent& getBestMatchingPercentage() const;
-
     // ////////////// Setters /////////////
     /** Set the query string. */
     void setQueryString (const TravelQuery_T& iQueryString) {
       _queryString = iQueryString;
     }
-
-  public:
-    // /////////// Business methods /////////
-    /** Retrieve the list of documents matching the query string. */
-    void searchString ();
+    /** Set the matching Xapian document object and its corresponding
+        matching percentage. */
+    void setMatchingDocument (const MatchingDocument_T& iMatchingDocument) {
+      _matchingDocument = iMatchingDocument;
+    }
+    /** Set the matching Xapian document object and its corresponding
+        matching percentage. */
+    void setQueryAndDocument (const QueryAndDocument_T& iQueryAndDocument) {
+      _queryString = iQueryAndDocument.first;
+      _matchingDocument = iQueryAndDocument.second;
+    }
+
   public:
     // /////////// Display support methods /////////
@@ -98,8 +110,9 @@
     /** Xapian database. */
     const Xapian::Database& _database;
-    /** List of Xapian document objects. */
-    DocumentList_T _documentList;
+    /** Matching Xapian document object, along with its corresponding
+        matching percentage. */
+    MatchingDocument_T _matchingDocument;
   };
 }

Modified: trunk/opentrep/opentrep/bom/ResultHolder.cpp
===================================================================
--- trunk/opentrep/opentrep/bom/ResultHolder.cpp 2009-07-17 17:19:13 UTC (rev 134)
+++ trunk/opentrep/opentrep/bom/ResultHolder.cpp 2009-07-18 00:50:00 UTC (rev 135)
@@ -10,9 +10,6 @@
 #include <opentrep/bom/StringMatcher.hpp>
 #include <opentrep/bom/Result.hpp>
 #include <opentrep/bom/ResultHolder.hpp>
-// TODO: move that out of the BOM layer
-#include <opentrep/factory/FacResultHolder.hpp>
-#include <opentrep/factory/FacResult.hpp>
 #include <opentrep/service/Logger.hpp>
 namespace OPENTREP {
@@ -71,11 +68,59 @@
   }
   // //////////////////////////////////////////////////////////////////////
-  void ResultHolder::searchString () {
+  bool ResultHolder::searchString (TravelQuery_T& ioPartialQueryString,
+                                   MatchingDocument_T& ioMatchingDocument) {
+    bool oFoundDocument = false;
     // Catch any Xapian::Error exceptions thrown
     try {
+      bool shouldStop = false;
+      while (shouldStop == false) {
+        // DEBUG
+        /*
+        OPENTREP_LOG_DEBUG (std::endl << "--------------------------------"
+                            << std::endl << "Current query string: `" << ioPartialQueryString
+                            << "'");
+        */
+
+        // Retrieve the list of documents matching the query string
+        Xapian::MSet lMatchingSet;
+        StringMatcher::searchString (lMatchingSet, ioPartialQueryString,
+                                     _database);
+
+        // Create the corresponding list of documents
+        oFoundDocument = StringMatcher::
+          extractBestMatchingDocumentFromMSet(lMatchingSet, ioMatchingDocument);
+
+        // Stop if a result is found.
+        if (oFoundDocument == true) {
+          shouldStop = true;
+          break;
+        }
+
+        // Remove a word from the query string
+        StringMatcher::removeOneWord (ioPartialQueryString);
+
+        // Stop when the resulting string gets empty.
+        if (ioPartialQueryString.empty() == true) {
+          shouldStop = true;
+        }
+      }
+
+    } catch (const Xapian::Error& error) {
+      OPENTREP_LOG_ERROR ("Exception: " << error.get_msg());
+    }
+
+    return oFoundDocument;
+  }
+
+  // //////////////////////////////////////////////////////////////////////
+  void ResultHolder::searchString (DocumentList_T& ioDocumentList) {
+
+    // Catch any Xapian::Error exceptions thrown
+    try {
+
       std::string lRemainingQueryString (_queryString);
       bool shouldStop = false;
       while (shouldStop == false) {
@@ -91,21 +136,23 @@
            again no result, until either a result is found or the
            resulting string gets empty.
         */
-        DocumentList_T lDocumentList;
-        // TODO: move that out of the BOM layer
-        Result& lResult = FacResult::instance().create (_database);
-
         std::string lQueryString (lRemainingQueryString);
-        //
-        lResult.setQueryString (lQueryString);
-        lResult.searchString ();
+        /**
+           Main algorithm, altering the query string (suppressing the
+           furthest right words, so that the remaining left part be matched
+           against the Xapian database).
+        */
+        MatchingDocument_T lMatchingDocument;
+        const bool hasFoundDocument = searchString (lQueryString,
+                                                    lMatchingDocument);
+
+        if (hasFoundDocument == true) {
+          const QueryAndDocument_T lQueryAndDocument (lQueryString,
+                                                      lMatchingDocument);
+          ioDocumentList.push_back (lQueryAndDocument);
+        }
-        // Add the Result object (holding the list of matching
-        // documents) to the dedicated list.
- // TODO: move that out of the BOM layer - FacResultHolder::initLinkWithResult (*this, lResult); - /** Remove, from the lRemainingQueryString string, the part which has been already successfully parsed. @@ -117,7 +164,6 @@ 'rio de janeiro'. So, the already parsed part, namely 'sna francisco', must be subtracted from the initial query string. */ - lQueryString = lResult.getQueryString(); StringMatcher::subtractParsedToRemaining (lQueryString, lRemainingQueryString); Modified: trunk/opentrep/opentrep/bom/ResultHolder.hpp =================================================================== --- trunk/opentrep/opentrep/bom/ResultHolder.hpp 2009-07-17 17:19:13 UTC (rev 134) +++ trunk/opentrep/opentrep/bom/ResultHolder.hpp 2009-07-18 00:50:00 UTC (rev 135) @@ -7,6 +7,7 @@ // OpenTREP #include <opentrep/OPENTREP_Types.hpp> #include <opentrep/bom/BomAbstract.hpp> +#include <opentrep/bom/DocumentList.hpp> #include <opentrep/bom/ResultList.hpp> // Forward declarations @@ -38,9 +39,17 @@ public: // /////////// Business methods ///////// /** Retrieve the list of documents matching the query string. */ - void searchString (); + void searchString (DocumentList_T&); + private: + /** Retrieve the document best matching the query string. + @param TravelQuery_T& The partial query string. + @param MatchingDocument_T& The best matching Xapian document (if found). + @return bool Whether such a best matching document has been found. */ + bool searchString(TravelQuery_T& ioPartialQueryString, MatchingDocument_T&); + + public: // /////////// Display support methods ///////// /** Dump a Business Object into an output stream. 
Modified: trunk/opentrep/opentrep/bom/StringMatcher.cpp =================================================================== --- trunk/opentrep/opentrep/bom/StringMatcher.cpp 2009-07-17 17:19:13 UTC (rev 134) +++ trunk/opentrep/opentrep/bom/StringMatcher.cpp 2009-07-18 00:50:00 UTC (rev 135) @@ -272,21 +272,30 @@ } // ////////////////////////////////////////////////////////////////////// - void StringMatcher:: - createDocumentListFromMSet (const Xapian::MSet& iMatchingSet, - DocumentList_T& ioDocumentList) { - // Empty the list of documents - ioDocumentList.clear(); + bool StringMatcher:: + extractBestMatchingDocumentFromMSet (const Xapian::MSet& iMatchingSet, + MatchingDocument_T& ioMatchingDocument) { + bool oFoundDocument = false; - for (Xapian::MSetIterator itDoc = iMatchingSet.begin(); - itDoc != iMatchingSet.end(); ++itDoc) { - const Xapian::Document& lDocument = itDoc.get_document(); + if (iMatchingSet.empty() == true) { + return oFoundDocument; + } + oFoundDocument = true; - ioDocumentList.insert (DocumentList_T::value_type (itDoc.get_percent(), - lDocument)); - } + /** + Retrieve the best matching document. If there are several such + best matching documents (for instance, several at, say, 100%), + one is taken randomly (well, we take the first one of the STL + multimap, so it is not exactly randomly, but the result is the + same: it appears random). 
+ */ + Xapian::MSetIterator itDoc = iMatchingSet.begin(); + ioMatchingDocument.first = itDoc.get_percent(); + ioMatchingDocument.second = itDoc.get_document(); + + return oFoundDocument; } - + // ////////////////////////////////////////////////////////////////////// void StringMatcher::removeOneWord (std::string& ioQueryString) { assert (ioQueryString.empty() == false); Modified: trunk/opentrep/opentrep/bom/StringMatcher.hpp =================================================================== --- trunk/opentrep/opentrep/bom/StringMatcher.hpp 2009-07-17 17:19:13 UTC (rev 134) +++ trunk/opentrep/opentrep/bom/StringMatcher.hpp 2009-07-18 00:50:00 UTC (rev 135) @@ -29,10 +29,17 @@ static void searchString (Xapian::MSet&, const std::string& iSearchString, const Xapian::Database&); - /** Copy the Xapian MSet (matching set) object into a document - list object. */ - static void createDocumentListFromMSet (const Xapian::MSet&, - DocumentList_T&); + /** + Extract the best matching Xapian document. + <br>If there are several such best matching documents (for + instance, several at, say, 100%), one is taken randomly. Well, + as we take the first one of the STL multimap, it is not exactly + randomly, but the result is the same: it appears to be random. + @return bool Whether or not there was a matching document. + */ + static bool + extractBestMatchingDocumentFromMSet (const Xapian::MSet&, + MatchingDocument_T&); /** Remove the word furthest at right. 
*/ static void removeOneWord (std::string& ioQueryString); Modified: trunk/opentrep/opentrep/command/RequestInterpreter.cpp =================================================================== --- trunk/opentrep/opentrep/command/RequestInterpreter.cpp 2009-07-17 17:19:13 UTC (rev 134) +++ trunk/opentrep/opentrep/command/RequestInterpreter.cpp 2009-07-18 00:50:00 UTC (rev 135) @@ -15,6 +15,7 @@ #include <opentrep/factory/FacPlaceHolder.hpp> #include <opentrep/factory/FacPlace.hpp> #include <opentrep/factory/FacResultHolder.hpp> +#include <opentrep/factory/FacResult.hpp> #include <opentrep/command/DBManager.hpp> #include <opentrep/command/RequestInterpreter.hpp> #include <opentrep/service/Logger.hpp> @@ -27,8 +28,7 @@ void RequestInterpreter:: interpretTravelRequest (soci::session& ioSociSession, const TravelDatabaseName_T& iTravelDatabaseName, - const TravelQuery_T& iTravelQuery, - PlaceHolder& ioPlaceHolder) { + const TravelQuery_T& iTravelQuery) { try { @@ -39,9 +39,30 @@ ResultHolder& lResultHolder = FacResultHolder::instance().create (iTravelQuery, lXapianDatabase); - // - lResultHolder.searchString(); + // Main algorithm + DocumentList_T lDocumentList; + lResultHolder.searchString (lDocumentList); + // Back-up the (retrieved) matching Xapian documents into still + // to-be-created Result objects. + for (DocumentList_T::const_iterator itDoc = lDocumentList.begin(); + itDoc != lDocumentList.end(); ++itDoc) { + // Retrieve both the Xapian document object and the corresponding + // matching percentage (most of the time, it is 100%) + const QueryAndDocument_T& lQueryAndDocument = *itDoc; + + // Create a Result object + Result& lResult = FacResult::instance().create (lXapianDatabase); + + // Fill the Result object with both the corresponding Document object + // and its associated query string + lResult.setQueryAndDocument (lQueryAndDocument); + + // Add the Result object (holding the list of matching + // documents) to the dedicated list. 
+ FacResultHolder::initLinkWithResult (lResultHolder, lResult); + } + // DEBUG OPENTREP_LOG_DEBUG (std::endl << "=========================================" @@ -50,6 +71,9 @@ << "=========================================" << std::endl << std::endl); + // Create a PlaceHolder object, to collect the matching Place objects + PlaceHolder& lPlaceHolder = FacPlaceHolder::instance().create(); + // Browse the list of result objects const ResultList_T& lResultList = lResultHolder.getResultList(); for (ResultList_T::const_iterator itResult = lResultList.begin(); @@ -59,10 +83,8 @@ assert (lResult_ptr != NULL); // Retrieve the parameters of the best matching document - const Xapian::Document& lDocument = - lResult_ptr->getBestMatchingDocumen... [truncated message content] |
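Note on rev 135: the comments in ResultHolder.cpp describe a greedy parse of the query string. The furthest right word is removed until the remaining left part matches, and the matched part is then subtracted from what is left to parse. The following is a minimal sketch of that loop, not the OpenTREP implementation. It assumes an exact lookup in a std::set in place of the Xapian search (so no spelling correction), and the names Index_T, tokenise, join and parseQuery are hypothetical. It also assumes that, when nothing matches, the leftmost word is skipped so that the loop terminates.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the Xapian index: a set of known place names.
typedef std::set<std::string> Index_T;
typedef std::vector<std::string> Words_T;

// Split a string into words, using whitespace as the separator.
Words_T tokenise (const std::string& iString) {
  std::istringstream iStr (iString);
  Words_T oWords;
  std::string lWord;
  while (iStr >> lWord) {
    oWords.push_back (lWord);
  }
  return oWords;
}

// Join the words in the range [iBegin, iEnd) with single spaces.
std::string join (const Words_T& iWords,
                  const std::size_t iBegin, const std::size_t iEnd) {
  std::string oString;
  for (std::size_t idx = iBegin; idx < iEnd; ++idx) {
    if (idx != iBegin) {
      oString += " ";
    }
    oString += iWords[idx];
  }
  return oString;
}

// Greedy parse: return the phrases of iQuery found in iIndex,
// in their order of appearance.
Words_T parseQuery (const Index_T& iIndex, const std::string& iQuery) {
  const Words_T lWords = tokenise (iQuery);
  Words_T oMatches;

  std::size_t lStart = 0;
  while (lStart < lWords.size()) {
    // Remove the furthest right word until the phrase matches,
    // or until the phrase gets empty.
    std::size_t lEnd = lWords.size();
    while (lEnd > lStart && iIndex.count (join (lWords, lStart, lEnd)) == 0) {
      --lEnd;
    }

    if (lEnd > lStart) {
      // Match found: keep it, and subtract the parsed part from
      // the remaining query.
      oMatches.push_back (join (lWords, lStart, lEnd));
      lStart = lEnd;

    } else {
      // Assumption of this sketch: skip the leftmost word when
      // nothing matches, so that the loop always terminates.
      ++lStart;
    }
  }

  return oMatches;
}
```

With an index holding "san francisco" and "rio de janeiro", calling parseQuery on "san francisco lso rio de janeiro" keeps both place names and skips "lso".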
From: <den...@us...> - 2009-07-18 09:33:57
|
Revision: 136 http://opentrep.svn.sourceforge.net/opentrep/?rev=136&view=rev Author: denis_arnaud Date: 2009-07-18 09:33:50 +0000 (Sat, 18 Jul 2009) Log Message: ----------- [DB] Fixed the issue with location of the CSV files. Modified Paths: -------------- trunk/opentrep/db/data/ref_place_names.csv trunk/opentrep/db/maintenance/create_and_fill_mysql_db.sh trunk/opentrep/db/maintenance/tables/create_and_fill_mysql_db.sql trunk/opentrep/opentrep/batches/searcher.cpp trunk/opentrep/opentrep/bom/StringMatcher.cpp Modified: trunk/opentrep/db/data/ref_place_names.csv =================================================================== --- trunk/opentrep/db/data/ref_place_names.csv 2009-07-18 00:50:00 UTC (rev 135) +++ trunk/opentrep/db/data/ref_place_names.csv 2009-07-18 09:33:50 UTC (rev 136) @@ -5057,7 +5057,7 @@ en,reg,reggio calabria,reggio calabria,reggio calabria/it:t menniti en,reh,rehoboth beach,rehoboth beach,rehoboth beach/de/us en,rei,regina,regina,regina/gf -en,rek,reykjavik,reykjavik,reykjavik/is +en,rek,reykjavik,reykjavik,reykjavik/is,reykjavik main,reykjavik city en,rel,trelew,trelew,trelew/cb/ar en,ren,orenburg,orenburg,orenburg/ru en,reo,rome,rome,rome/or/us:state Modified: trunk/opentrep/db/maintenance/create_and_fill_mysql_db.sh =================================================================== --- trunk/opentrep/db/maintenance/create_and_fill_mysql_db.sh 2009-07-18 00:50:00 UTC (rev 135) +++ trunk/opentrep/db/maintenance/create_and_fill_mysql_db.sh 2009-07-18 09:33:50 UTC (rev 136) @@ -39,20 +39,34 @@ # Database Name DB_NAME="opentrep" +# Check file existence +function checkSQLFile() { + if [ ! 
-r ${SQL_FILE} ]; then + echo + echo "The ${SQL_FILE} SQL file can not be found" + echo + exit -1; + fi +} + # Create the database function createDatabase() { + checkSQLFile echo "The '${DB_NAME}' database will be created:" mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} mysql < ${SQL_FILE} } # Scan a SQL script for the names of (database) tables function createTable() { + checkSQLFile echo "The ref_place_details and ref_place_names tables will be created:" mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} < ${SQL_FILE} } # function loadData() { + SQL_FILE=${SQL_LOADER_FILE} + checkSQLFile echo "The ref_place_details and ref_place_names tables will be filled from ../data/*.csv files:" mysql -u ${DB_USER} --password=${DB_PASSWD} -P ${DB_PORT} -h ${DB_HOST} ${DB_NAME} < ${SQL_LOADER_FILE} echo "Done" @@ -74,15 +88,15 @@ } # Database -SQL_FILE="../data_structure/ref_db.sql" +SQL_FILE="tables/ref_db.sql" createDatabase # Table: Airport and City -SQL_FILE="../data_structure/ref_city.sql" +SQL_FILE="tables/ref_city.sql" createTable # Load data into the table -SQL_LOADER_FILE="create_and_fill_mysql_db.sql" +SQL_LOADER_FILE="tables/create_and_fill_mysql_db.sql" loadData # Trim the spaces from the state_code field of the ref_place_details table Modified: trunk/opentrep/db/maintenance/tables/create_and_fill_mysql_db.sql =================================================================== --- trunk/opentrep/db/maintenance/tables/create_and_fill_mysql_db.sql 2009-07-18 00:50:00 UTC (rev 135) +++ trunk/opentrep/db/maintenance/tables/create_and_fill_mysql_db.sql 2009-07-18 09:33:50 UTC (rev 136) @@ -1,3 +1,8 @@ +-- +-- Note: that file is expected to be launched from the +-- $(top_srcdir)/db/maintenance sub-directory, as the CSV files are +-- to be found in $(top_srcdir)/db/data sub-directory +-- -- -- Load the Airport and City geographical details into the MySQL table Modified: trunk/opentrep/opentrep/batches/searcher.cpp 
=================================================================== --- trunk/opentrep/opentrep/batches/searcher.cpp 2009-07-18 00:50:00 UTC (rev 135) +++ trunk/opentrep/opentrep/batches/searcher.cpp 2009-07-18 09:33:50 UTC (rev 136) @@ -28,7 +28,7 @@ const std::string K_OPENTREP_DEFAULT_DATABSE_FILEPATH("/tmp/opentrep/traveldb"); /** Default travel query string, to be seached against the Xapian database. */ -const std::string K_OPENTREP_DEFAULT_QUERY_STRING ("sna francicso rio de janero lso anglese reykyavki"); +const std::string K_OPENTREP_DEFAULT_QUERY_STRING ("sna francicso rio de janero lso angles reykyavki"); /** Default error distance for spelling corrections. */ const unsigned short K_OPENTREP_DEFAULT_SPELLING_ERROR_DISTANCE = 3; Modified: trunk/opentrep/opentrep/bom/StringMatcher.cpp =================================================================== --- trunk/opentrep/opentrep/bom/StringMatcher.cpp 2009-07-18 00:50:00 UTC (rev 135) +++ trunk/opentrep/opentrep/bom/StringMatcher.cpp 2009-07-18 09:33:50 UTC (rev 136) @@ -34,7 +34,7 @@ const EditDistance_T lQueryStringSize = iPhrase.size(); - oEditDistance = lQueryStringSize / 3; + oEditDistance = lQueryStringSize / 4; return oEditDistance; } This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
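Note on rev 136: the StringMatcher.cpp change above lowers the allowed edit distance from a third to a quarter of the query string length. The following sketch shows the effect of that integer division on two phrases taken from the default query string. It is an illustration only: the helper name allowedEditDistance and its divisor parameter are hypothetical, and EditDistance_T is assumed to be an unsigned integer type, as its definition is not part of the diff.

```cpp
#include <cassert>
#include <string>

// Assumption: EditDistance_T is an unsigned integer type; the real
// definition is not shown in the diff.
typedef unsigned int EditDistance_T;

// Hypothetical helper: allowed edit distance for a phrase, computed as
// the phrase length divided (integer division) by iDivisor.
EditDistance_T allowedEditDistance (const std::string& iPhrase,
                                    const EditDistance_T iDivisor) {
  assert (iDivisor != 0);
  return static_cast<EditDistance_T> (iPhrase.size()) / iDivisor;
}
```

For "reykyavki" (9 characters), the allowed distance goes from 9 / 3 = 3 to 9 / 4 = 2. For "sna francicso" (13 characters, space included), it goes from 4 to 3. The spelling correction is therefore stricter after the change.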
From: <den...@us...> - 2009-07-18 23:30:44
|
Revision: 140 http://opentrep.svn.sourceforge.net/opentrep/?rev=140&view=rev Author: denis_arnaud Date: 2009-07-18 23:30:37 +0000 (Sat, 18 Jul 2009) Log Message: ----------- [Dev] The query parser service now returns the list of unmatched words (along with the list of Location structures). Modified Paths: -------------- trunk/opentrep/db/data/ref_place_names.csv trunk/opentrep/opentrep/OPENTREP_Service.hpp trunk/opentrep/opentrep/OPENTREP_Types.hpp trunk/opentrep/opentrep/batches/searcher.cpp trunk/opentrep/opentrep/bom/Document.cpp trunk/opentrep/opentrep/bom/Document.hpp trunk/opentrep/opentrep/bom/ResultHolder.cpp trunk/opentrep/opentrep/bom/ResultHolder.hpp trunk/opentrep/opentrep/bom/StringMatcher.cpp trunk/opentrep/opentrep/bom/StringMatcher.hpp trunk/opentrep/opentrep/bom/WordHolder.hpp trunk/opentrep/opentrep/bom/sources.mk trunk/opentrep/opentrep/command/RequestInterpreter.cpp trunk/opentrep/opentrep/command/RequestInterpreter.hpp trunk/opentrep/opentrep/service/OPENTREP_Service.cpp Removed Paths: ------------- trunk/opentrep/opentrep/bom/WordList.hpp Modified: trunk/opentrep/db/data/ref_place_names.csv =================================================================== --- trunk/opentrep/db/data/ref_place_names.csv 2009-07-18 20:30:08 UTC (rev 139) +++ trunk/opentrep/db/data/ref_place_names.csv 2009-07-18 23:30:37 UTC (rev 140) @@ -3550,7 +3550,7 @@ en,cdd,cauquira,cauquira,cauquira/hn:cauquira airport en,cde,caledonia,caledonia,caledonia/pa en,cdf,cortina d'ampezzo,cortina d'ampez,cortina d'ampezzo/it:fiames -en,cdg,paris cdg,paris cdg,paris/fr:charles de gaulle +en,cdg,paris cdg,paris cdg,paris/fr:charles de gaulle,cdg,cdg en,cdh,camden,camden,camden/ar/us:harrell fld en,cdi,cachoeiro,cachoeiro,cachoeiro de i/es/br:cachoeiro en,cdj,conceicao do arag,conceicao do ar,conceicao do arag/pa/br Modified: trunk/opentrep/opentrep/OPENTREP_Service.hpp =================================================================== --- 
trunk/opentrep/opentrep/OPENTREP_Service.hpp 2009-07-18 20:30:08 UTC (rev 139) +++ trunk/opentrep/opentrep/OPENTREP_Service.hpp 2009-07-18 23:30:37 UTC (rev 140) @@ -30,9 +30,10 @@ "sna francicso rio de janero lso angles reykyavki nce iev mow"). @param LocationList_T& List of (geographical) locations, if any, matching the given query string. + @param WordList_T& List of non-matched words of the query string. @return NbOfMatches_T Number of matches. */ NbOfMatches_T interpretTravelRequest (const std::string& iTravelQuery, - LocationList_T&); + LocationList_T&, WordList_T&); // ////////// Constructors and destructors ////////// Modified: trunk/opentrep/opentrep/OPENTREP_Types.hpp =================================================================== --- trunk/opentrep/opentrep/OPENTREP_Types.hpp 2009-07-18 20:30:08 UTC (rev 139) +++ trunk/opentrep/opentrep/OPENTREP_Types.hpp 2009-07-18 23:30:37 UTC (rev 140) @@ -7,6 +7,7 @@ // STL #include <exception> #include <string> +#include <list> namespace OPENTREP { @@ -62,6 +63,12 @@ /** Number of matching documents. */ typedef unsigned short NbOfMatches_T; + /** Word, which is the atomic element of a query string. */ + typedef std::string Word_T; + + /** List of words. 
*/ + typedef std::list<Word_T> WordList_T; + } #endif // __OPENTREP_OPENTREP_TYPES_HPP Modified: trunk/opentrep/opentrep/batches/searcher.cpp =================================================================== --- trunk/opentrep/opentrep/batches/searcher.cpp 2009-07-18 20:30:08 UTC (rev 139) +++ trunk/opentrep/opentrep/batches/searcher.cpp 2009-07-18 23:30:37 UTC (rev 140) @@ -228,13 +228,16 @@ lXapianDatabaseName); // Query the Xapian database (index) + OPENTREP::WordList_T lNonMatchedWordList; OPENTREP::LocationList_T lLocationList; const OPENTREP::NbOfMatches_T nbOfMatches = - opentrepService.interpretTravelRequest (lTravelQuery, lLocationList); + opentrepService.interpretTravelRequest (lTravelQuery, lLocationList, + lNonMatchedWordList); - std::cout << nbOfMatches << " (geographical) location(s) have been found " - << "matching your query (`" << lTravelQuery << "´)." - << std::endl; + std::cout << nbOfMatches << " (geographical) location(s) have been found " + << "matching your query (`" << lTravelQuery << "´). " + << lNonMatchedWordList.size() << " words were left unmatched." 
+ << std::endl; if (nbOfMatches != 0) { OPENTREP::NbOfMatches_T idx = 1; @@ -245,6 +248,18 @@ std::cout << " [" << idx << "]: " << lLocation << std::endl; } } + + if (lNonMatchedWordList.empty() == false) { + std::cout << "List of unmatched words:" << std::endl; + + OPENTREP::NbOfMatches_T idx = 1; + for (OPENTREP::WordList_T::const_iterator itWord = + lNonMatchedWordList.begin(); + itWord != lNonMatchedWordList.end(); ++itWord, ++idx) { + const OPENTREP::Word_T& lWord = *itWord; + std::cout << " [" << idx << "]: " << lWord << std::endl; + } + } // Close the Log outputFile logOutputFile.close(); Modified: trunk/opentrep/opentrep/bom/Document.cpp =================================================================== --- trunk/opentrep/opentrep/bom/Document.cpp 2009-07-18 20:30:08 UTC (rev 139) +++ trunk/opentrep/opentrep/bom/Document.cpp 2009-07-18 23:30:37 UTC (rev 140) @@ -9,17 +9,6 @@ namespace OPENTREP { // ////////////////////////////////////////////////////////////////////// - Document::Document () { - } - - // ////////////////////////////////////////////////////////////////////// - Document::Document (const Document& iDocument) - : _queryString (iDocument._queryString), - _document (iDocument._document), - _documentList (iDocument._documentList) { - } - - // ////////////////////////////////////////////////////////////////////// Document::~Document () { } @@ -32,17 +21,42 @@ // ////////////////////////////////////////////////////////////////////// std::string Document::describeKey() const { - return describeShortKey(); + std::ostringstream oStr; + oStr << describeShortKey(); + if (_correctedQueryString.empty() == false) { + oStr << " (corrected into " << _correctedQueryString << ")"; + } + return oStr.str(); } // ////////////////////////////////////////////////////////////////////// std::string Document::toString() const { std::ostringstream oStr; - oStr << describeShortKey() << std::endl; + oStr << describeKey() << std::endl; const Xapian::docid& lDocID = 
_document.get_docid(); oStr << "Document ID " << lDocID << "\t" << _percentage << "% [" << _document.get_data() << "]"; + + if (_documentList.empty() == false) { + oStr << " along with " << _documentList.size() + << " other matching document(s) ("; + + unsigned short idx = 0; + for (XapianDocumentList_T::const_iterator itDoc = _documentList.begin(); + itDoc != _documentList.end(); ++itDoc, ++idx) { + const Xapian::Document& lXapianDoc = *itDoc; + const Xapian::docid& lDocID = lXapianDoc.get_docid(); + if (idx != 0) { + oStr << ", "; + } + oStr << lDocID; + } + oStr << ")." << std::endl; + + } else { + oStr << std::endl; + } return oStr.str(); } Modified: trunk/opentrep/opentrep/bom/Document.hpp =================================================================== --- trunk/opentrep/opentrep/bom/Document.hpp 2009-07-18 20:30:08 UTC (rev 139) +++ trunk/opentrep/opentrep/bom/Document.hpp 2009-07-18 23:30:37 UTC (rev 140) @@ -32,6 +32,12 @@ return _queryString; } + /** Get the corrected query string. + <br>When empty, it means that no correction was necessary. */ + const TravelQuery_T& getCorrectedTravelQuery() { + return _correctedQueryString; + } + /** Get the matching Xapian document. */ const Xapian::Document& getXapianDocument() const { return _document; @@ -49,10 +55,16 @@ // ////////////////// Setters //////////////// + /** Set the query string. */ void setQueryString (const TravelQuery_T& iQueryString) { _queryString = iQueryString; } + /** Set the corrected query string. */ + void setCorrectedQueryString (const TravelQuery_T& iCorrectedQueryString) { + _correctedQueryString = iCorrectedQueryString; + } + /** Set the matching Xapian document. */ void setXapianDocument (const Xapian::Document& iMatchingDocument) { _document = iMatchingDocument; @@ -94,9 +106,9 @@ public: // //////////////// Constructors and Destructors ///////////// /** Default constructor. */ - Document (); + // Document (); /** Default copy constructor. 
*/ - Document (const Document&); + // Document (const Document&); /** Default destructor. */ ~Document (); @@ -106,6 +118,9 @@ /** Query string with which a Xapian full text search is done. */ TravelQuery_T _queryString; + /** Query string with which a Xapian full text search is done. */ + TravelQuery_T _correctedQueryString; + /** Matching percentage, as returned by the Xapian full text search. <br>Generally, that percentage is equal to, or close to, 100%. */ Xapian::percent _percentage; Modified: trunk/opentrep/opentrep/bom/ResultHolder.cpp =================================================================== --- trunk/opentrep/opentrep/bom/ResultHolder.cpp 2009-07-18 20:30:08 UTC (rev 139) +++ trunk/opentrep/opentrep/bom/ResultHolder.cpp 2009-07-18 23:30:37 UTC (rev 140) @@ -68,35 +68,41 @@ } // ////////////////////////////////////////////////////////////////////// - bool ResultHolder::searchString (TravelQuery_T& ioPartialQueryString, - Document& ioMatchingDocument) { - bool oFoundDocument = false; + std::string ResultHolder::searchString (TravelQuery_T& ioPartialQueryString, + Document& ioMatchingDocument) { + std::string oMatchedString; // Catch any Xapian::Error exceptions thrown try { bool shouldStop = false; while (shouldStop == false) { - // DEBUG - OPENTREP_LOG_DEBUG ("Current query string: `" - << ioPartialQueryString << "'"); - // Retrieve the list of documents matching the query string Xapian::MSet lMatchingSet; - StringMatcher::searchString (lMatchingSet, ioPartialQueryString, - _database); + oMatchedString = StringMatcher::searchString (lMatchingSet, + ioPartialQueryString, + _database); - // Create the corresponding list of documents - oFoundDocument = StringMatcher:: - extractBestMatchingDocumentFromMSet(lMatchingSet, ioMatchingDocument); - - // Stop if a result is found. 
- if (oFoundDocument == true) { + // DEBUG + OPENTREP_LOG_DEBUG ("Current initial query string: `" + << ioPartialQueryString << "´ --- Kept query: `" + << oMatchedString << "´ for " + << lMatchingSet.size() << " matches."); + + if (lMatchingSet.empty() == false) { + // Create the corresponding list of documents + StringMatcher:: + extractBestMatchingDocumentFromMSet (lMatchingSet, + ioMatchingDocument); + + // Since a result has been found, the search can be stopped + // for that part of the query. shouldStop = true; break; } - // Remove the furthest right word from the query string + // Since the query, as is, yield no match, the furthest right + // word must be removed from the query string. StringMatcher::removeFurthestRightWord (ioPartialQueryString); // Stop when the resulting string gets empty. @@ -109,11 +115,12 @@ OPENTREP_LOG_ERROR ("Exception: " << error.get_msg()); } - return oFoundDocument; + return oMatchedString; } // ////////////////////////////////////////////////////////////////////// - void ResultHolder::searchString (DocumentList_T& ioDocumentList) { + void ResultHolder::searchString (DocumentList_T& ioDocumentList, + WordList_T& ioWordList) { // Catch any Xapian::Error exceptions thrown try { @@ -140,31 +147,39 @@ against the Xapian database). 
*/ Document lMatchingDocument; - const bool hasFoundDocument = searchString (lQueryString, - lMatchingDocument); + const std::string lMatchedString = searchString (lQueryString, + lMatchingDocument); - if (hasFoundDocument == true) { + if (lMatchedString.empty() == false) { lMatchingDocument.setQueryString (lQueryString); + lMatchingDocument.setCorrectedQueryString (lMatchedString); ioDocumentList.push_back (lMatchingDocument); // DEBUG - OPENTREP_LOG_DEBUG ("==> Matching of the query string: `" - << lQueryString << "'"); + const XapianDocumentList_T& lXapianDocumentList = + lMatchingDocument.getExtraDocumentList(); + const NbOfMatches_T lNbOfMatches = 1 + lXapianDocumentList.size(); + OPENTREP_LOG_DEBUG ("==> " << lNbOfMatches + << " matches for the query string: `" + << lMatchedString << "´ (from `" + << lQueryString << "´)"); /** Remove, from the lRemainingQueryString string, the part - which has been already successfully parsed. <br>For - instance, when 'sna francisco rio de janeiro' is the - initial full clean query string, the searchString() + which has been already successfully parsed. + <br>For instance, when 'sna francisco rio de janeiro' is + the initial full clean query string, the searchString() method first reduce the query string to 'sna francisco', which successfully matches against SFO (San Francisco - airport). <br>Then, the remaining part of the query - string to be parsed is 'rio de janeiro'. So, the already - parsed part, namely 'sna francisco', must be subtracted - from the initial query string. + airport). + <br>Then, the remaining part of the query string to be + parsed is 'rio de janeiro'. So, the already parsed part, + namely 'sna francisco', must be subtracted from the + initial query string. 
*/ StringMatcher::subtractParsedToRemaining (lQueryString, lRemainingQueryString); + } else { // DEBUG OPENTREP_LOG_DEBUG ("==> No matching of the query string: `" @@ -180,8 +195,12 @@ therefore be empty, and the loop will therefore be exited in the next step below. */ - // Remove the furthest right word from the query string - StringMatcher::removeFurthestLeftWord (lRemainingQueryString); + // Remove the furthest right word from the query string... + const Word_T& lNonMatchedWord = + StringMatcher::removeFurthestLeftWord (lRemainingQueryString); + + // ... and add it to the list of unmatched words. + ioWordList.push_back (lNonMatchedWord); } // If there is nothing left to be parsed, we have then finished Modified: trunk/opentrep/opentrep/bom/ResultHolder.hpp =================================================================== --- trunk/opentrep/opentrep/bom/ResultHolder.hpp 2009-07-18 20:30:08 UTC (rev 139) +++ trunk/opentrep/opentrep/bom/ResultHolder.hpp 2009-07-18 23:30:37 UTC (rev 140) @@ -38,8 +38,10 @@ public: // /////////// Business methods ///////// - /** Retrieve the list of documents matching the query string. */ - void searchString (DocumentList_T&); + /** Retrieve the list of documents matching the query string. + @param DocumentList_T& List of matched documents by the query string. + @param WordList_T& List of non-matched words of the query string. */ + void searchString (DocumentList_T&, WordList_T&); private: @@ -47,7 +49,7 @@ @param TravelQuery_T& The partial query string. @param MatchingDocument_T& The best matching Xapian document (if found). @return bool Whether such a best matching document has been found. 
*/ - bool searchString (TravelQuery_T& ioPartialQueryString, Document&); + std::string searchString (TravelQuery_T& ioPartialQueryString, Document&); public: Modified: trunk/opentrep/opentrep/bom/StringMatcher.cpp =================================================================== --- trunk/opentrep/opentrep/bom/StringMatcher.cpp 2009-07-18 20:30:08 UTC (rev 139) +++ trunk/opentrep/opentrep/bom/StringMatcher.cpp 2009-07-18 23:30:37 UTC (rev 140) @@ -79,9 +79,10 @@ } // /////////////////////////////////////////////////////////////////// - void StringMatcher::searchString (Xapian::MSet& ioMatchingSet, - const std::string& iSearchString, - const Xapian::Database& ioDatabase) { + std::string StringMatcher::searchString (Xapian::MSet& ioMatchingSet, + const std::string& iSearchString, + const Xapian::Database& ioDatabase) { + std::string oMatchedString; // Catch any Xapian::Error exceptions thrown try { @@ -108,10 +109,11 @@ WordHolder::tokeniseStringIntoWordList (iSearchString, lOriginalWordList); /** - We rebuild a clean query string from the word list. Indeed, the original - string may have contained a few separators (e.g., '/', ';', etc.), which - have been removed by the tokeniseStringIntoWordList() method. All those - separators are thus replaced by spaces. + We rebuild a clean query string from the word list. Indeed, + the original string may have contained a few separators + (e.g., '/', ';', etc.), which have been removed by the + tokeniseStringIntoWordList() method. All those separators are + thus replaced by spaces. For instance, the 'san francisco, ca, us' initial string would be replaced by 'san francisco ca us'. 
*/ @@ -133,14 +135,15 @@ const EditDistance_T lEditDistance = calculateEditDistance (lOriginalQueryString); const std::string lFullWordCorrectedString = - ioDatabase.get_spelling_suggestion (lOriginalQueryString, lEditDistance); + ioDatabase.get_spelling_suggestion(lOriginalQueryString, lEditDistance); // DEBUG /* - OPENTREP_LOG_DEBUG ("Query string `" << lOriginalQueryString - << "' ==> corrected query string: `" << lCorrectedQueryString - << "' and correction for the full query string: `" - << lFullWordCorrectedString << "'"); + OPENTREP_LOG_DEBUG ("Query string `" << lOriginalQueryString + << "' ==> corrected query string: `" + << lCorrectedQueryString + << "' and correction for the full query string: `" + << lFullWordCorrectedString << "'"); */ // Build the query object @@ -220,9 +223,7 @@ int nbMatches = ioMatchingSet.size(); // DEBUG - /* - OPENTREP_LOG_DEBUG (nbMatches << " results found"); - */ + // OPENTREP_LOG_DEBUG (nbMatches << " results found"); /** When no match is found, we search on the corrected phrase/string @@ -236,9 +237,10 @@ nbMatches = ioMatchingSet.size(); // DEBUG - /* - OPENTREP_LOG_DEBUG(nbMatches << " results found on corrected string"); - */ + //OPENTREP_LOG_DEBUG(nbMatches << " results found on corrected string"); + + } else { + oMatchedString = lOriginalQueryString; } /** @@ -253,12 +255,16 @@ nbMatches = ioMatchingSet.size(); // DEBUG - /* - OPENTREP_LOG_DEBUG (nbMatches - << " results found on corrected full string"); - */ + // OPENTREP_LOG_DEBUG (nbMatches + // << " results found on corrected full string"); + + } else { + oMatchedString = lCorrectedQueryString; } + if (nbMatches != 0 && oMatchedString.empty() == true) { + oMatchedString = lFullWordCorrectedString; + } // DEBUG /* @@ -270,19 +276,16 @@ } catch (const Xapian::Error& error) { OPENTREP_LOG_ERROR ("Exception: " << error.get_msg()); } + + return oMatchedString; } // ////////////////////////////////////////////////////////////////////// - bool StringMatcher:: + void 
StringMatcher:: extractBestMatchingDocumentFromMSet (const Xapian::MSet& iMatchingSet, Document& ioMatchingDocument) { - bool oFoundDocument = false; + assert (iMatchingSet.empty() == false); - if (iMatchingSet.empty() == true) { - return oFoundDocument; - } - oFoundDocument = true; - /** Retrieve the best matching document. If there are several such best matching documents (for instance, several at, say, 100%), @@ -291,14 +294,22 @@ same: it appears random). */ Xapian::MSetIterator itDoc = iMatchingSet.begin(); + assert (itDoc != iMatchingSet.end()); + const Xapian::percent& lBestPercentage = itDoc.get_percent(); ioMatchingDocument.setXapianPercentage (lBestPercentage); ioMatchingDocument.setXapianDocument (itDoc.get_document()); /** Add all the Xapian documents having reached the same matching percentage. */ - for ( ; itDoc != iMatchingSet.end(); ++itDoc) { + NbOfMatches_T idx = 1; + for (++itDoc ; itDoc != iMatchingSet.end(); ++itDoc, ++idx) { const Xapian::percent& lPercentage = itDoc.get_percent(); + + // DEBUG + OPENTREP_LOG_DEBUG ("[" << idx << "] extra. 
Best " << lBestPercentage + << "%, other " << lPercentage << "% (" + << itDoc.get_document().get_docid() << ")"); if (lPercentage == lBestPercentage) { ioMatchingDocument.addExtraDocument (itDoc.get_document()); @@ -307,8 +318,6 @@ break; } } - - return oFoundDocument; } // ////////////////////////////////////////////////////////////////////// @@ -328,19 +337,24 @@ } // ////////////////////////////////////////////////////////////////////// - void StringMatcher::removeFurthestLeftWord (std::string& ioQueryString) { + Word_T StringMatcher::removeFurthestLeftWord (std::string& ioQueryString) { assert (ioQueryString.empty() == false); WordList_T lWordList; WordHolder::tokeniseStringIntoWordList (ioQueryString, lWordList); assert (lWordList.empty() == false); + // Store (a copy of) the furthest left word + const Word_T lFurthestLeftWord = lWordList.front(); + // Remove the furthest left word lWordList.pop_front(); const std::string& lReducedString = WordHolder::createStringFromWordList (lWordList); ioQueryString = lReducedString; + + return lFurthestLeftWord; } // ////////////////////////////////////////////////////////////////////// Modified: trunk/opentrep/opentrep/bom/StringMatcher.hpp =================================================================== --- trunk/opentrep/opentrep/bom/StringMatcher.hpp 2009-07-18 20:30:08 UTC (rev 139) +++ trunk/opentrep/opentrep/bom/StringMatcher.hpp 2009-07-18 23:30:37 UTC (rev 140) @@ -7,8 +7,8 @@ // STL #include <string> // OpenTREP +#include <opentrep/OPENTREP_Types.hpp> #include <opentrep/bom/BomAbstract.hpp> -#include <opentrep/bom/WordList.hpp> #include <opentrep/bom/Document.hpp> // Forward declarations @@ -25,36 +25,36 @@ class StringMatcher : public BomAbstract { public: /** Search, within the Xapian database, for occurrences of the - words of the search string. */ - static void searchString (Xapian::MSet&, const std::string& iSearchString, - const Xapian::Database&); + words of the search string. 
+ @param Xapian::MSet& The Xapian matching set. It can be empty. + @param const std::string& The query string. + @param const Xapian::Database& The Xapian index/database. + @return std::string The query string, potentially corrected, + which has yielded matches. */ + static std::string searchString (Xapian::MSet&, + const std::string& iSearchString, + const Xapian::Database&); - /** - Extract the best matching Xapian document. - <br>If there are several such best matching documents (for - instance, several at, say, 100%), one is taken randomly. Well, - as we take the first one of the STL multimap, it is not exactly - randomly, but the result is the same: it appears to be random. - @return bool Whether or not there was a matching document. - */ - static bool + /** Extract the best matching Xapian document. + <br>If there are several such best matching documents (for + instance, several at, say, 100%), one is taken randomly. Well, + as we take the first one of the STL multimap, it is not exactly + randomly, but the result is the same: it appears to be random. + @param Xapian::MSet& The Xapian matching set. It can be empty. */ + static void extractBestMatchingDocumentFromMSet (const Xapian::MSet&, Document&); /** Remove the word furthest at right. */ static void removeFurthestRightWord (std::string& ioQueryString); /** Remove the word furthest at left. */ - static void removeFurthestLeftWord (std::string& ioQueryString); + static Word_T removeFurthestLeftWord (std::string& ioQueryString); /** Remove, from a string, the part corresponding to the one given as parameter. 
*/ static void subtractParsedToRemaining (const std::string& iAlreadyParsedQueryString, std::string& ioRemainingQueryString); - - - private: - }; } Modified: trunk/opentrep/opentrep/bom/WordHolder.hpp =================================================================== --- trunk/opentrep/opentrep/bom/WordHolder.hpp 2009-07-18 20:30:08 UTC (rev 139) +++ trunk/opentrep/opentrep/bom/WordHolder.hpp 2009-07-18 23:30:37 UTC (rev 140) @@ -5,8 +5,8 @@ // Import section // ////////////////////////////////////////////////////////////////////// // OpenTREP +#include <opentrep/OPENTREP_Types.hpp> #include <opentrep/bom/BomAbstract.hpp> -#include <opentrep/bom/WordList.hpp> namespace OPENTREP { Deleted: trunk/opentrep/opentrep/bom/WordList.hpp =================================================================== --- trunk/opentrep/opentrep/bom/WordList.hpp 2009-07-18 20:30:08 UTC (rev 139) +++ trunk/opentrep/opentrep/bom/WordList.hpp 2009-07-18 23:30:37 UTC (rev 140) @@ -1,18 +0,0 @@ -#ifndef __OPENTREP_BOM_WORDLIST_HPP -#define __OPENTREP_BOM_WORDLIST_HPP - -// ////////////////////////////////////////////////////////////////////// -// Import section -// ////////////////////////////////////////////////////////////////////// -// STL -#include <string> -#include <list> - -namespace OPENTREP { - - /** List of simple words (STL strings). 
*/ - typedef std::list<std::string> WordList_T; - -} -#endif // __OPENTREP_BOM_WORDLIST_HPP - Modified: trunk/opentrep/opentrep/bom/sources.mk =================================================================== --- trunk/opentrep/opentrep/bom/sources.mk 2009-07-18 20:30:08 UTC (rev 139) +++ trunk/opentrep/opentrep/bom/sources.mk 2009-07-18 23:30:37 UTC (rev 140) @@ -3,7 +3,6 @@ $(top_srcdir)/opentrep/bom/Language.hpp \ $(top_srcdir)/opentrep/bom/GenericBom.hpp \ $(top_srcdir)/opentrep/bom/World.hpp \ - $(top_srcdir)/opentrep/bom/WordList.hpp \ $(top_srcdir)/opentrep/bom/WordHolder.hpp \ $(top_srcdir)/opentrep/bom/Names.hpp \ $(top_srcdir)/opentrep/bom/Place.hpp \ Modified: trunk/opentrep/opentrep/command/RequestInterpreter.cpp =================================================================== --- trunk/opentrep/opentrep/command/RequestInterpreter.cpp 2009-07-18 20:30:08 UTC (rev 139) +++ trunk/opentrep/opentrep/command/RequestInterpreter.cpp 2009-07-18 23:30:37 UTC (rev 140) @@ -111,7 +111,8 @@ interpretTravelRequest (soci::session& ioSociSession, const TravelDatabaseName_T& iTravelDatabaseName, const TravelQuery_T& iTravelQuery, - LocationList_T& ioLocationList) { + LocationList_T& ioLocationList, + WordList_T& ioWordList) { NbOfMatches_T oNbOfMatches = 0; // Create a PlaceHolder object, to collect the matching Place objects @@ -132,7 +133,7 @@ // Main algorithm DocumentList_T lDocumentList; - lResultHolder.searchString (lDocumentList); + lResultHolder.searchString (lDocumentList, ioWordList); /** Create the list of Result objects corresponding to the list of documents. 
*/ Modified: trunk/opentrep/opentrep/command/RequestInterpreter.hpp =================================================================== --- trunk/opentrep/opentrep/command/RequestInterpreter.hpp 2009-07-18 20:30:08 UTC (rev 139) +++ trunk/opentrep/opentrep/command/RequestInterpreter.hpp 2009-07-18 23:30:37 UTC (rev 140) @@ -32,11 +32,12 @@ "sna francicso rio de janero lso angles reykyavki nce iev mow"). @param LocationList_T& List of (geographical) locations, if any, matching the given query string. + @param WordList_T& List of non-matched words of the query string. @return NbOfMatches_T Number of matches. */ static NbOfMatches_T interpretTravelRequest (soci::session&, const TravelDatabaseName_T&, const TravelQuery_T&, - LocationList_T&); + LocationList_T&, WordList_T&); private: /** Constructors. */ Modified: trunk/opentrep/opentrep/service/OPENTREP_Service.cpp =================================================================== --- trunk/opentrep/opentrep/service/OPENTREP_Service.cpp 2009-07-18 20:30:08 UTC (rev 139) +++ trunk/opentrep/opentrep/service/OPENTREP_Service.cpp 2009-07-18 23:30:37 UTC (rev 140) @@ -112,7 +112,9 @@ // ////////////////////////////////////////////////////////////////////// NbOfMatches_T OPENTREP_Service:: interpretTravelRequest (const std::string& iTravelQuery, - LocationList_T& ioLocationList) { + LocationList_T& ioLocationList, + WordList_T& ioWordList) { + if (_opentrepServiceContext == NULL) { throw NonInitialisedServiceException(); } @@ -133,7 +135,8 @@ const NbOfMatches_T nbOfMatches = RequestInterpreter::interpretTravelRequest (lSociSession, lTravelDatabaseName, - iTravelQuery, ioLocationList); + iTravelQuery, ioLocationList, + ioWordList); const double lRequestInterpreterMeasure = lRequestInterpreterChronometer.elapsed(); This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
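Note on rev 140: removeFurthestLeftWord() now returns the word it strips, presumably so that callers can collect the non-matched words into the new WordList_T parameter. The following is a minimal, standalone sketch of that behaviour, not the committed code. The helpers tokenise() and join() are hypothetical stand-ins for WordHolder::tokeniseStringIntoWordList() and WordHolder::createStringFromWordList(), and they split on whitespace only, whereas the real tokeniser also drops separators such as '/' and ';'.

```cpp
#include <cassert>
#include <list>
#include <sstream>
#include <string>

typedef std::string Word_T;
typedef std::list<Word_T> WordList_T;

// Hypothetical stand-in for WordHolder::tokeniseStringIntoWordList():
// splits the phrase on whitespace only.
void tokenise (const std::string& iPhrase, WordList_T& ioWordList) {
  std::istringstream lStream (iPhrase);
  Word_T lWord;
  while (lStream >> lWord) {
    ioWordList.push_back (lWord);
  }
}

// Hypothetical stand-in for WordHolder::createStringFromWordList():
// joins the words with single spaces.
std::string join (const WordList_T& iWordList) {
  std::string oString;
  for (WordList_T::const_iterator itWord = iWordList.begin();
       itWord != iWordList.end(); ++itWord) {
    if (itWord != iWordList.begin()) {
      oString += " ";
    }
    oString += *itWord;
  }
  return oString;
}

// Removes the furthest left word from the query string (in place)
// and returns a copy of the removed word.
Word_T removeFurthestLeftWord (std::string& ioQueryString) {
  assert (ioQueryString.empty() == false);

  WordList_T lWordList;
  tokenise (ioQueryString, lWordList);
  assert (lWordList.empty() == false);

  // Store (a copy of) the furthest left word before dropping it
  const Word_T lFurthestLeftWord = lWordList.front();
  lWordList.pop_front();

  ioQueryString = join (lWordList);
  return lFurthestLeftWord;
}
```

With the query string "sna francicso rio", one call returns "sna" and leaves "francicso rio" in the string. Returning a copy rather than a reference matters here, since the word list is local to the function.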
From: <den...@us...> - 2009-07-19 21:48:31
Revision: 143 http://opentrep.svn.sourceforge.net/opentrep/?rev=143&view=rev Author: denis_arnaud Date: 2009-07-19 21:48:27 +0000 (Sun, 19 Jul 2009) Log Message: ----------- [i18n] Added a few simple tests on STL i18n (locale) features. Modified Paths: -------------- trunk/opentrep/configure.ac trunk/opentrep/db/data/ref_place_names.csv trunk/opentrep/opentrep/dbadaptor/DbaPlace.cpp Added Paths: ----------- trunk/opentrep/test/i18n/ trunk/opentrep/test/i18n/Makefile.am trunk/opentrep/test/i18n/boost_string.cpp trunk/opentrep/test/i18n/loc2.cpp trunk/opentrep/test/i18n/stdlocru.cpp Modified: trunk/opentrep/configure.ac =================================================================== --- trunk/opentrep/configure.ac 2009-07-19 13:12:19 UTC (rev 142) +++ trunk/opentrep/configure.ac 2009-07-19 21:48:27 UTC (rev 143) @@ -240,6 +240,7 @@ db/data/Makefile test/com/Makefile test/parsers/Makefile + test/i18n/Makefile test/Makefile win32/Makefile) AC_OUTPUT Modified: trunk/opentrep/db/data/ref_place_names.csv =================================================================== --- trunk/opentrep/db/data/ref_place_names.csv 2009-07-19 13:12:19 UTC (rev 142) +++ trunk/opentrep/db/data/ref_place_names.csv 2009-07-19 21:48:27 UTC (rev 143) @@ -1,7 +1,4 @@ concat('en,', code, ',', ticketing_name, ',', teleticketing_name, ',', extended_name) -de,muc,münchen,muenchen,munich/de:franz j strauss -fr,muc,munique,munique,munich/de:franz j strauss -ru,muc,мюнхен,munich,munich/de:franz j strauss en,cyz,cauayan,cauayan,cauayan/ph en,cza,chichen itza,chichen itza,chichen itza/mx en,czb,cruz alta,cruz alta,cruz alta/rs/br:carlos ruhl @@ -3710,7 +3707,7 @@ en,bbo,berbera,berbera,berbera/so en,bbp,bembridge,bembridge,bembridge/gb en,bbq,barbuda,barbuda,barbuda/ag -en,bbr,basse terre,basse terre,basse terre/gp:baillif +en,bbr,basse terre,basse terre,basse terre/gp:baillif,guadeloupe en,bbs,blackbush,blackbush,blackbush/gb en,bbt,berberati,berberati,berberati/cf en,bbu,bucharest 
baneasa,bucharest banea,bucharest/ro:baneasa @@ -3838,7 +3835,7 @@ en,fct,yakima aaf,yakima aaf,yakima/wa/us:firing center aaf en,fcy,forrest city,forrest city,forrest city/ar/us:municipal en,fde,forde,forde,forde/no:bringeland -en,fdf,fort de france,fort de france,fort de france/mq:lamentin +en,fdf,fort de france,fort de france,fort de france/mq:lamentin,martinique en,fdh,friedrichshafen,friedrichshafen,friedrichshafen/de:friedrichsh en,fdk,frederick,frederick,frederick/md/us:municipal en,fdr,frederick,frederick,frederick/ok/us:municipal @@ -7731,7 +7728,7 @@ en,ptm,palmarito,palmarito,palmarito/ve en,ptn,morgan city,morgan city,morgan city/la/us:mncpl hpt en,pto,pato branco,pato branco,pato branco/pr/br:municipal -en,ptp,pointe a pitre,pointe a pitre,pointe a pitr/gp:pole caraibes +en,ptp,pointe a pitre,pointe a pitre,pointe a pitre/gp:pole caraibes,guadeloupe en,ptq,porto de moz,porto de moz,porto de moz/pa/br en,ptr,pleasant harbour,pleasant harbou,pleasant harbour/ak/us en,pts,pittsburg,pittsburg,pittsburg/ks/us:municipal @@ -8853,7 +8850,7 @@ en,pjz,puerto juarez,puerto juarez,puerto juarez/mx en,pka,napaskiak,napaskiak,napaskiak/ak/us:spb en,pkb,mid ohio valley,mid ohio valley,parkersburg/wv/us:mid ohio val -en,pkc,petropavlovsk kam,petropavlovsk k,petropavlovsk kam/ru +en,pkc,petropavlovsk kamchatka,petropavlovsk kam,petropavlovsk kam/ru,petropavlovsk,kamchatka en,pkd,park rapids,park rapids,park rapids/mn/us en,pke,parkes,parkes,parkes/ns/au en,pkf,park falls,park falls,park falls/wi/us @@ -9144,7 +9141,7 @@ en,rui,ruidoso,ruidoso,ruidoso/nm/us:municipal en,ruk,rukumkot,rukumkot,rukumkot/np en,rum,rumjatar,rumjatar,rumjatar/np -en,run,st denis reunion,st denis reunio,st denis reunion/re:roland gar +en,run,st denis reunion,st denis reunion,st denis reunion/re:roland gar,la reunion,reunion en,rup,rupsi,rupsi,rupsi/in en,rur,rurutu,rurutu,rurutu/pf en,rus,marau sound,marau sound,marau sound/sb Modified: trunk/opentrep/opentrep/dbadaptor/DbaPlace.cpp 
=================================================================== --- trunk/opentrep/opentrep/dbadaptor/DbaPlace.cpp 2009-07-19 13:12:19 UTC (rev 142) +++ trunk/opentrep/opentrep/dbadaptor/DbaPlace.cpp 2009-07-19 21:48:27 UTC (rev 143) @@ -54,7 +54,7 @@ ioPlace.addName (lLanguageCode, lClassicalName); const std::string& lClassicalName2 = - iPlaceValues.get<std::string> ("classical_name", ""); + iPlaceValues.get<std::string> ("classical_name2", ""); ioPlace.addName (lLanguageCode, lClassicalName2); const std::string& lExtendedName = Property changes on: trunk/opentrep/test/i18n ___________________________________________________________________ Added: svn:ignore + .libs .deps Makefile.in Makefile boost_string loc2 stdlocru Added: trunk/opentrep/test/i18n/Makefile.am =================================================================== --- trunk/opentrep/test/i18n/Makefile.am (rev 0) +++ trunk/opentrep/test/i18n/Makefile.am 2009-07-19 21:48:27 UTC (rev 143) @@ -0,0 +1,20 @@ +## command sub-directory +include $(top_srcdir)/Makefile.common + +MAINTAINERCLEANFILES = Makefile.in + +check_PROGRAMS = boost_string loc2 stdlocru + +boost_string_SOURCES = boost_string.cpp +boost_string_CXXFLAGS = $(BOOST_CFLAGS) +boost_string_LDADD = $(BOOST_LIB) + +loc2_SOURCES = loc2.cpp +loc2_CXXFLAGS = $(BOOST_CFLAGS) +loc2_LDADD = $(BOOST_LIB) + +stdlocru_SOURCES = stdlocru.cpp +stdlocru_CXXFLAGS = $(BOOST_CFLAGS) +stdlocru_LDADD = $(BOOST_LIB) + +EXTRA_DIST = Added: trunk/opentrep/test/i18n/boost_string.cpp =================================================================== --- trunk/opentrep/test/i18n/boost_string.cpp (rev 0) +++ trunk/opentrep/test/i18n/boost_string.cpp 2009-07-19 21:48:27 UTC (rev 143) @@ -0,0 +1,16 @@ +// C +#include <iostream> +// Boost String +#include <boost/algorithm/string.hpp> + +// ////////////// M A I N ////////////// +int main (int argc, char* argv[]) { + + std::string str1(" hello world! 
"); + std::cout << "Before: " << str1 << std::endl; + + boost::to_upper (str1); + std::cout << "After: " << str1 << std::endl; + + return 0; +} Added: trunk/opentrep/test/i18n/loc2.cpp =================================================================== --- trunk/opentrep/test/i18n/loc2.cpp (rev 0) +++ trunk/opentrep/test/i18n/loc2.cpp 2009-07-19 21:48:27 UTC (rev 143) @@ -0,0 +1,60 @@ +// C +#include <iostream> +#include <locale> +#include <string> +#include <cstdlib> + + +// ////////////// M A I N ////////////// +int main (int argc, char* argv[]) { + + // Create the default locale from the user's environment + std::locale langLocale (""); + + // and assign it to the standard output channel + std::cout.imbue (langLocale); + + // + std::cout << "User's environment locale: " << langLocale.name() << std::endl; + + bool isFrench = false; + if (langLocale.name() == "fr_FR" + || langLocale.name() == "fr" + || langLocale.name() == "french") { + isFrench = true; + } + + // Read locale for the input + if (isFrench) { + std::cout << "Locale pour l'entrée: " << std::endl; + + } else { + std::cout << "Locale for input: " << std::endl; + } + + std::string tmpString; + std::cin >> tmpString; + + if (!std::cin) { + if (isFrench) { + std::cerr << "Error while reading the locale" << std::endl; + + } else { + std::cerr << "Error while reading the locale" << std::endl; + } + return EXIT_FAILURE; + } + + std::locale cinLocale (tmpString.c_str()); + + // and assign it to the standard input channel + std::cin.imbue (cinLocale); + + // Read and output floating-point values in a loop + double value; + while (std::cin >> value) { + std::cout << value << std::endl; + } + + return 0; +} Added: trunk/opentrep/test/i18n/stdlocru.cpp =================================================================== --- trunk/opentrep/test/i18n/stdlocru.cpp (rev 0) +++ trunk/opentrep/test/i18n/stdlocru.cpp 2009-07-19 21:48:27 UTC (rev 143) @@ -0,0 +1,32 @@ +// C +#include <iostream> +#include <locale> 
+#include <string> +#include <cstdlib> + + +// ////////////// M A I N ////////////// +int main (int argc, char* argv[]) { + + // Create the default locale from the user's environment + std::locale langLocale ("ru_RU.UTF-8"); + + // and assign it to the standard output channel + //std::cout.imbue (langLocale); + + // and assign it to the standard input channel + //std::cin.imbue (langLocale); + + // Display with no processing + std::cout << "de: München" << std::endl; + std::cout << "ru: Мюнхен" << std::endl; + + const std::string mucDE ("München"); + const std::string mucRU ("Мюнхен"); + + // Display the strings + std::cout << "de: " << mucDE << std::endl; + std::cout << "ru: " << mucRU << std::endl; + + return 0; +} This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
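Note on rev 143: both loc2.cpp and stdlocru.cpp construct a std::locale from a name (typed by the user, or "ru_RU.UTF-8"). That constructor throws std::runtime_error when the system does not know the name, so the tests abort on machines where the locale is not installed. A possible guard is sketched below, under the assumption that falling back to the classic "C" locale is acceptable for these tests. The helper name makeLocaleOrClassic() is hypothetical and is not part of OpenTREP.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of OpenTREP): returns the named locale,
// or the classic "C" locale when the name is unknown to the system.
std::locale makeLocaleOrClassic (const std::string& iLocaleName) {
  try {
    // Throws std::runtime_error if the locale name is not supported
    return std::locale (iLocaleName.c_str());

  } catch (const std::runtime_error&) {
    return std::locale::classic();
  }
}
```

In stdlocru.cpp this would be used as std::locale langLocale (makeLocaleOrClassic ("ru_RU.UTF-8")). Whether the Russian locale is actually available depends on the host, so checking langLocale.name() afterwards tells which one was obtained.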
From: <den...@us...> - 2009-07-20 09:59:46
Revision: 145 http://opentrep.svn.sourceforge.net/opentrep/?rev=145&view=rev Author: denis_arnaud Date: 2009-07-20 09:59:43 +0000 (Mon, 20 Jul 2009) Log Message: ----------- [Python] Added a test about Boost.Python Modified Paths: -------------- trunk/opentrep/configure.ac Added Paths: ----------- trunk/opentrep/opentrep/python/ trunk/opentrep/test/python/ trunk/opentrep/test/python/Makefile.am trunk/opentrep/test/python/boost_hello.py trunk/opentrep/test/python/boost_python.cpp Modified: trunk/opentrep/configure.ac =================================================================== --- trunk/opentrep/configure.ac 2009-07-19 22:25:34 UTC (rev 144) +++ trunk/opentrep/configure.ac 2009-07-20 09:59:43 UTC (rev 145) @@ -241,6 +241,7 @@ test/com/Makefile test/parsers/Makefile test/i18n/Makefile + test/python/Makefile test/Makefile win32/Makefile) AC_OUTPUT Property changes on: trunk/opentrep/test/python ___________________________________________________________________ Added: svn:ignore + .libs .deps Makefile Makefile.in Added: trunk/opentrep/test/python/Makefile.am =================================================================== --- trunk/opentrep/test/python/Makefile.am (rev 0) +++ trunk/opentrep/test/python/Makefile.am 2009-07-20 09:59:43 UTC (rev 145) @@ -0,0 +1,15 @@ +## command sub-directory +include $(top_srcdir)/Makefile.common + +MAINTAINERCLEANFILES = Makefile.in + +# Library +lib_LTLIBRARIES = libpy@PACKAGE@.la + +libpy@PACKAGE@_la_SOURCES = boost_python.cpp +libpy@PACKAGE@_la_CXXFLAGS = -I/usr/include/python2.6 +#libpy@PACKAGE@_la_LIBADD = +libpy@PACKAGE@_la_LDFLAGS = -L/usr/lib64 -lboost_python-mt -lpython2.6 \ + -version-info $(GENERIC_LIBRARY_VERSION) + +EXTRA_DIST = Added: trunk/opentrep/test/python/boost_hello.py =================================================================== --- trunk/opentrep/test/python/boost_hello.py (rev 0) +++ trunk/opentrep/test/python/boost_hello.py 2009-07-20 09:59:43 UTC (rev 145) @@ -0,0 +1,14 @@ +#!/usr/bin/env 
python + +import sys +sys.path.append('.libs') +import libpyopentrep +myWorld = libpyopentrep.World() +myWorld.add('Bonjour', 5) +myWorld.add('Hello', 7) +myWorld.add('Gruss Gott', 5) +#MsgList=myWorld.getMsgList() +#IDList=myWorld.getIDList() +#print ', '.join(['%s: %s' % (msg, id) for (msg, id) in zip(MsgList, IDList)]) +a = myWorld.toSimpleString().split(', ') +print '^'.join(a[::-1]) Property changes on: trunk/opentrep/test/python/boost_hello.py ___________________________________________________________________ Added: svn:executable + * Added: trunk/opentrep/test/python/boost_python.cpp =================================================================== --- trunk/opentrep/test/python/boost_python.cpp (rev 0) +++ trunk/opentrep/test/python/boost_python.cpp 2009-07-20 09:59:43 UTC (rev 145) @@ -0,0 +1,98 @@ +// C +#include <sstream> +#include <string> +#include <list> +#include <vector> +// Boost String +#include <boost/python.hpp> + +namespace OPENTREP { + + /** */ + typedef unsigned int OpenTrepID_T; + + /** */ + typedef std::vector<std::string> MsgList_T; + + /** */ + typedef std::vector<OpenTrepID_T> IDList_T; + + /** */ + typedef std::pair<MsgList_T, IDList_T> OpenTrepList_T; + + // ////////////////////////////////////////////// + struct World { + public: + + /** */ + MsgList_T getMsgList () const { + return _msgList; + } + + /** */ + IDList_T getIDList () const { + return _idList; + } + + /** */ + OpenTrepList_T getListPair () const { + return OpenTrepList_T (_msgList, _idList); + } + + /** */ + void add (const std::string& iMessage, const OpenTrepID_T& iID) { + _msgList.push_back (iMessage); + _idList.push_back (iID); + } + + + /** */ + std::string toString() const { + std::ostringstream oStr; + IDList_T::const_iterator itID = _idList.begin(); + OpenTrepID_T idx = 0; + for (MsgList_T::const_iterator itMsg = _msgList.begin(); + itMsg != _msgList.end() && itID != _idList.end(); + ++itMsg, ++itID, ++idx) { + if (idx != 0) { + oStr << ", "; + } + oStr << 
*itID << ": " << *itMsg; + } + return oStr.str(); + } + + /** */ + std::string toSimpleString() const { + std::ostringstream oStr; + IDList_T::const_iterator itID = _idList.begin(); + OpenTrepID_T idx = 0; + for (MsgList_T::const_iterator itMsg = _msgList.begin(); + itMsg != _msgList.end() && itID != _idList.end(); + ++itMsg, ++itID, ++idx) { + if (idx != 0) { + oStr << ", "; + } + oStr << *itMsg; + } + return oStr.str(); + } + + private: + /** */ + MsgList_T _msgList; + IDList_T _idList; + }; + +} + +// ////////////////////////////////////////////// +BOOST_PYTHON_MODULE(libpyopentrep) { + boost::python::class_<OPENTREP::World> ("World") + .def ("getMsgList", &OPENTREP::World::getMsgList) + .def ("getIDList", &OPENTREP::World::getIDList) + .def ("getListPair", &OPENTREP::World::getListPair) + .def ("add", &OPENTREP::World::add) + .def ("toSimpleString", &OPENTREP::World::toSimpleString) + .def ("toString", &OPENTREP::World::toString); +} This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
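Note on rev 145: boost_hello.py calls toSimpleString(), splits the result on ", " and prints the pieces in reverse order joined by '^'. To make the expected result concrete without needing Python or Boost, here is a C++-only sketch of the same round trip. The function joinReversed() is a hypothetical illustration and is not part of the commit.

```cpp
#include <sstream>
#include <string>
#include <vector>

typedef std::vector<std::string> MsgList_T;

// Hypothetical helper: joins the messages in reverse order with the
// given separator, i.e. what '^'.join(a[::-1]) does in the script.
std::string joinReversed (const MsgList_T& iMsgList,
                          const std::string& iSeparator) {
  std::ostringstream oStr;
  for (MsgList_T::const_reverse_iterator itMsg = iMsgList.rbegin();
       itMsg != iMsgList.rend(); ++itMsg) {
    if (itMsg != iMsgList.rbegin()) {
      oStr << iSeparator;
    }
    oStr << *itMsg;
  }
  return oStr.str();
}
```

For the three messages added by the script ('Bonjour', 'Hello', 'Gruss Gott'), the result is "Gruss Gott^Hello^Bonjour". The Python side relies on no message containing the ", " separator itself; exposing getMsgList() as a real Python list would avoid that, but it would need a converter for std::vector registered with Boost.Python.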
From: <den...@us...> - 2009-07-20 13:22:56
Revision: 146 http://opentrep.svn.sourceforge.net/opentrep/?rev=146&view=rev Author: denis_arnaud Date: 2009-07-20 13:22:52 +0000 (Mon, 20 Jul 2009) Log Message: ----------- [Test] Added support for Python wrapper (thanks to Boost Python). Modified Paths: -------------- trunk/opentrep/config/ax_boost.m4 trunk/opentrep/config/ax_mysql.m4 trunk/opentrep/configure.ac trunk/opentrep/opentrep/batches/searcher.cpp trunk/opentrep/test/python/Makefile.am trunk/opentrep/test/python/boost_python.cpp Added Paths: ----------- trunk/opentrep/config/python.m4 trunk/opentrep/test/python/boost_python.py trunk/opentrep/test/python/pyopentrep.cpp trunk/opentrep/test/python/pyopentrep.py Removed Paths: ------------- trunk/opentrep/test/python/boost_hello.py Modified: trunk/opentrep/config/ax_boost.m4 =================================================================== --- trunk/opentrep/config/ax_boost.m4 2009-07-20 09:59:43 UTC (rev 145) +++ trunk/opentrep/config/ax_boost.m4 2009-07-20 13:22:52 UTC (rev 146) @@ -18,6 +18,7 @@ dnl AC_SUBST(BOOST_SIGNALS_LIB) dnl AC_SUBST(BOOST_DATE_TIME_LIB) dnl AC_SUBST(BOOST_REGEX_LIB) +dnl AC_SUBST(BOOST_PYTHON) dnl AC_SUBST(BOOST_UNIT_TEST_FRAMEWORK_LIB) dnl dnl And sets: @@ -31,6 +32,7 @@ dnl HAVE_BOOST_SIGNALS dnl HAVE_BOOST_DATE_TIME dnl HAVE_BOOST_REGEX +dnl HAVE_BOOST_PYTHON dnl HAVE_BOOST_UNIT_TEST_FRAMEWORK dnl dnl @category InstalledPackages @@ -486,6 +488,27 @@ fi fi + AC_CACHE_CHECK(whether the Boost::Python library is available, + ax_cv_boost_python, + [AC_LANG_SAVE + AC_LANG_CPLUSPLUS + + saved_cflags="${CPPFLAGS}" + saved_ldflags="${LDFLAGS}" + CPPFLAGS="${CPPFLAGS} ${PYTHON_CFLAGS}" + LDFLAGS="${LDFLAGS} ${PYTHON_LIBS} ${PYTHON_ADD_LIBS}" + + AC_COMPILE_IFELSE(AC_LANG_PROGRAM([[@%:@include <boost/python.hpp> char const* greet() { return "hello"; } BOOST_PYTHON_MODULE(hello_ext) { boost::python::def("greet", greet); }]]), + ax_cv_boost_python=yes, ax_cv_boost_python=no) + AC_LANG_RESTORE + ]) + + BOOST_PYTHON_LIB="-lboost_python" + 
AC_SUBST(BOOST_PYTHON_LIB) + + CPPFLAGS="${saved_cflags}" + LDFLAGS="${saved_ldflags}" + AC_CACHE_CHECK(whether the Boost::UnitTestFramework library is available, ax_cv_boost_unit_test_framework, [AC_LANG_SAVE Modified: trunk/opentrep/config/ax_mysql.m4 =================================================================== --- trunk/opentrep/config/ax_mysql.m4 2009-07-20 09:59:43 UTC (rev 145) +++ trunk/opentrep/config/ax_mysql.m4 2009-07-20 13:22:52 UTC (rev 146) @@ -33,7 +33,7 @@ dnl dnl $Id: ax_mysql.m4,v 1.3 2007/06/23 01:51:22 mloskot Exp $ dnl -AC_DEFUN([AX_LIB_MYSQL], +AC_DEFUN([AX_MYSQL], [ AC_ARG_WITH([mysql], AC_HELP_STRING([--with-mysql=@<:@ARG@:>@], Added: trunk/opentrep/config/python.m4 =================================================================== --- trunk/opentrep/config/python.m4 (rev 0) +++ trunk/opentrep/config/python.m4 2009-07-20 13:22:52 UTC (rev 146) @@ -0,0 +1,94 @@ +# +# Autoconf macros for configuring the build of Python extension modules +# + +# PGAC_PATH_PYTHON +# ---------------- +# Look for Python and set the output variable 'PYTHON' +# to 'python' if found, empty otherwise. +AC_DEFUN([PGAC_PATH_PYTHON], +[AC_PATH_PROG(PYTHON, python) +if test x"$PYTHON" = x""; then + AC_MSG_ERROR([Python not found]) +fi +]) + + +# _PGAC_CHECK_PYTHON_DIRS +# ----------------------- +# Determine the name of various directories of a given Python installation. 
+AC_DEFUN([_PGAC_CHECK_PYTHON_DIRS], +[AC_REQUIRE([PGAC_PATH_PYTHON]) +AC_MSG_CHECKING([for Python distutils module]) +if "${PYTHON}" 2>&- -c 'import distutils' +then + AC_MSG_RESULT(yes) +else + AC_MSG_RESULT(no) + AC_MSG_ERROR([distutils module not found]) +fi +AC_MSG_CHECKING([Python configuration directory]) +python_version=`${PYTHON} -c "import sys; print sys.version[[:3]]"` +python_configdir=`${PYTHON} -c "from distutils.sysconfig import get_python_lib as f; import os; print os.path.join(f(plat_specific=1,standard_lib=1),'config')"` +PYTHON_CFLAGS=`${PYTHON} -c "import distutils.sysconfig; print '-I'+distutils.sysconfig.get_python_inc()"` + +AC_SUBST(python_version)[]dnl +AC_SUBST(python_configdir)[]dnl +AC_SUBST(PYTHON_CFLAGS)[]dnl +# This should be enough of a message. +AC_MSG_RESULT([$python_configdir]) +])# _PGAC_CHECK_PYTHON_DIRS + + +# PGAC_CHECK_PYTHON_EMBED_SETUP +# ----------------------------- +# +# Note: selecting libpython from python_configdir works in all Python +# releases, but it generally finds a non-shared library, which means +# that we are binding the python interpreter right into libplpython.so. +# In Python 2.3 and up there should be a shared library available in +# the main library location. 
+AC_DEFUN([PGAC_CHECK_PYTHON_EMBED_SETUP], +[AC_REQUIRE([_PGAC_CHECK_PYTHON_DIRS]) +AC_MSG_CHECKING([how to link an embedded Python application]) + +python_libdir=`${PYTHON} -c "import distutils.sysconfig,string; print string.join(filter(None,distutils.sysconfig.get_config_vars('LIBDIR')))"` +python_ldlibrary=`${PYTHON} -c "import distutils.sysconfig,string; print string.join(filter(None,distutils.sysconfig.get_config_vars('LDLIBRARY')))"` +python_so=`${PYTHON} -c "import distutils.sysconfig,string; print string.join(filter(None,distutils.sysconfig.get_config_vars('SO')))"` +ldlibrary=`echo "${python_ldlibrary}" | sed "s/${python_so}$//"` + +if test x"${python_libdir}" != x"" -a x"${python_ldlibrary}" != x"" -a x"${python_ldlibrary}" != x"${ldlibrary}" +then + # New way: use the official shared library + ldlibrary=`echo "${ldlibrary}" | sed "s/^lib//"` + PYTHON_LIBS="-L${python_libdir} -l${ldlibrary}" +else + # Old way: use libpython from python_configdir + python_libdir="${python_configdir}" + PYTHON_LIBS="-L${python_libdir} -lpython${python_version}" +fi + +PYTHON_ADD_LIBS=`${PYTHON} -c "import distutils.sysconfig,string; print string.join(filter(None,distutils.sysconfig.get_config_vars('LIBS','LIBC','LIBM','LOCALMODLIBS','BASEMODLIBS')))"` + +AC_MSG_RESULT([${PYTHON_LIBS} ${PYTHON_ADD_LIBS}]) + +AC_SUBST(python_incdir)[]dnl +AC_SUBST(python_libdir)[]dnl +AC_SUBST(PYTHON_LIBS)[]dnl +AC_SUBST(PYTHON_ADD_LIBS)[]dnl + +# threaded python is not supported on bsd's +AC_MSG_CHECKING(whether Python is compiled with thread support) +pythreads=`${PYTHON} -c "import sys; print int('thread' in sys.builtin_module_names)"` +if test "$pythreads" = "1"; then + AC_MSG_RESULT(yes) + case $host_os in + openbsd*|freebsd*) + AC_MSG_ERROR([threaded Python not supported on this platform]) + ;; + esac +else + AC_MSG_RESULT(no) +fi + +])# PGAC_CHECK_PYTHON_EMBED_SETUP Modified: trunk/opentrep/configure.ac =================================================================== --- 
trunk/opentrep/configure.ac 2009-07-20 09:59:43 UTC (rev 145) +++ trunk/opentrep/configure.ac 2009-07-20 13:22:52 UTC (rev 146) @@ -103,6 +103,15 @@ fi +# ----------------------------------------------------------- +# Python +# ----------------------------------------------------------- +PGAC_CHECK_PYTHON_EMBED_SETUP +AC_SUBST(PYTHON_VERSION) +AC_SUBST(PYTHON_LIBS) +AC_SUBST(PYTHON_CFLAGS) +AC_SUBST(PYTHON_ADD_LIBS) + # -------------------------------------------------------- # Boost (STL Extensions: http://www.boost.org) # -------------------------------------------------------- @@ -112,6 +121,7 @@ AC_SUBST(BOOST_LIBS) AC_SUBST(BOOST_DATE_TIME_LIB) AC_SUBST(BOOST_PROGRAM_OPTIONS_LIB) +AC_SUBST(BOOST_PYTHON_LIB) # -------------------------------------------------------------------- # Support for MySQL (C client API): http://www.mysql.org @@ -287,12 +297,19 @@ - LIBS .............. : ${LIBS} External libraries: + - Python ............. : + o PYTHON_version ... : ${PYTHON_VERSION} + o PYTHON_CFLAGS .... : ${PYTHON_CFLAGS} + o PYTHON_LIBS ...... : ${PYTHON_LIBS} + o PYTHON_ADD_LIBS .. : ${PYTHON_ADD_LIBS} + - Boost ............. : o BOOST_VERSION ... : ${BOOST_VERSION} o BOOST_CFLAGS .... : ${BOOST_CFLAGS} o BOOST_LIBS ...... : ${BOOST_LIBS} o BOOST_DT_LIB .... : ${BOOST_DATE_TIME_LIB} o BOOST_PO_LIB .... : ${BOOST_PROGRAM_OPTIONS_LIB} + o BOOST_PYTH_LIB .. : ${BOOST_PYTHON_LIB} - MySQL ............. : o MYSQL_version ... 
: ${MYSQL_VERSION} Modified: trunk/opentrep/opentrep/batches/searcher.cpp =================================================================== --- trunk/opentrep/opentrep/batches/searcher.cpp 2009-07-20 09:59:43 UTC (rev 145) +++ trunk/opentrep/opentrep/batches/searcher.cpp 2009-07-20 13:22:52 UTC (rev 146) @@ -11,7 +11,7 @@ #include <boost/date_time/gregorian/gregorian.hpp> #include <boost/tokenizer.hpp> #include <boost/program_options.hpp> -// OPENTREP +// OpenTREP #include <opentrep/OPENTREP_Service.hpp> #include <opentrep/config/opentrep-paths.hpp> @@ -235,7 +235,7 @@ lNonMatchedWordList); std::cout << nbOfMatches << " (geographical) location(s) have been found " - << "matching your query (`" << lTravelQuery << "´). " + << "matching your query (`" << lTravelQuery << "\xB4). " << lNonMatchedWordList.size() << " words were left unmatched." << std::endl; Modified: trunk/opentrep/test/python/Makefile.am =================================================================== --- trunk/opentrep/test/python/Makefile.am 2009-07-20 09:59:43 UTC (rev 145) +++ trunk/opentrep/test/python/Makefile.am 2009-07-20 13:22:52 UTC (rev 146) @@ -3,13 +3,19 @@ MAINTAINERCLEANFILES = Makefile.in -# Library -lib_LTLIBRARIES = libpy@PACKAGE@.la +# Libraries +lib_LTLIBRARIES = libpyboost.la libpy@PACKAGE@.la -libpy@PACKAGE@_la_SOURCES = boost_python.cpp -libpy@PACKAGE@_la_CXXFLAGS = -I/usr/include/python2.6 -#libpy@PACKAGE@_la_LIBADD = -libpy@PACKAGE@_la_LDFLAGS = -L/usr/lib64 -lboost_python-mt -lpython2.6 \ - -version-info $(GENERIC_LIBRARY_VERSION) +libpyboost_la_SOURCES = boost_python.cpp +libpyboost_la_CXXFLAGS = ${PYTHON_CFLAGS} +#libpyboost_la_LIBADD = +libpyboost_la_LDFLAGS = ${PYTHON_LIBS} ${PYTHON_ADD_LIBS} \ + ${BOOST_PYTHON_LIB} -version-info $(GENERIC_LIBRARY_VERSION) +libpy@PACKAGE@_la_SOURCES = pyopentrep.cpp +libpy@PACKAGE@_la_CXXFLAGS = ${PYTHON_CFLAGS} ${BOOST_CFLAGS} +libpy@PACKAGE@_la_LIBADD = $(top_builddir)/@PACKAGE@/core/lib@PACKAGE@.la +libpy@PACKAGE@_la_LDFLAGS = 
${PYTHON_LIBS} ${PYTHON_ADD_LIBS} \ + ${BOOST_PYTHON_LIB} -version-info $(GENERIC_LIBRARY_VERSION) + EXTRA_DIST = Deleted: trunk/opentrep/test/python/boost_hello.py =================================================================== --- trunk/opentrep/test/python/boost_hello.py 2009-07-20 09:59:43 UTC (rev 145) +++ trunk/opentrep/test/python/boost_hello.py 2009-07-20 13:22:52 UTC (rev 146) @@ -1,14 +0,0 @@ -#!/usr/bin/env python - -import sys -sys.path.append('.libs') -import libpyopentrep -myWorld = libpyopentrep.World() -myWorld.add('Bonjour', 5) -myWorld.add('Hello', 7) -myWorld.add('Gruss Gott', 5) -#MsgList=myWorld.getMsgList() -#IDList=myWorld.getIDList() -#print ', '.join(['%s: %s' % (msg, id) for (msg, id) in zip(MsgList, IDList)]) -a = myWorld.toSimpleString().split(', ') -print '^'.join(a[::-1]) Modified: trunk/opentrep/test/python/boost_python.cpp =================================================================== --- trunk/opentrep/test/python/boost_python.cpp 2009-07-20 09:59:43 UTC (rev 145) +++ trunk/opentrep/test/python/boost_python.cpp 2009-07-20 13:22:52 UTC (rev 146) @@ -6,6 +6,12 @@ // Boost String #include <boost/python.hpp> +// ////////////////////////////////////////////// +char const* simpleGreet() { + return "hello, world"; +} + +// ////////////////////////////////////////////// namespace OPENTREP { /** */ @@ -19,7 +25,7 @@ /** */ typedef std::pair<MsgList_T, IDList_T> OpenTrepList_T; - + // ////////////////////////////////////////////// struct World { public: @@ -71,7 +77,7 @@ itMsg != _msgList.end() && itID != _idList.end(); ++itMsg, ++itID, ++idx) { if (idx != 0) { - oStr << ", "; + oStr << ","; } oStr << *itMsg; } @@ -87,7 +93,14 @@ } // ////////////////////////////////////////////// -BOOST_PYTHON_MODULE(libpyopentrep) { +// using namespace boost::python; +BOOST_PYTHON_MODULE(hello_ext) { + boost::python::def("greet", simpleGreet); +} + + +// ////////////////////////////////////////////// +BOOST_PYTHON_MODULE(libpyboost) { 
boost::python::class_<OPENTREP::World> ("World") .def ("getMsgList", &OPENTREP::World::getMsgList) .def ("getIDList", &OPENTREP::World::getIDList) Copied: trunk/opentrep/test/python/boost_python.py (from rev 145, trunk/opentrep/test/python/boost_hello.py) =================================================================== --- trunk/opentrep/test/python/boost_python.py (rev 0) +++ trunk/opentrep/test/python/boost_python.py 2009-07-20 13:22:52 UTC (rev 146) @@ -0,0 +1,18 @@ +#!/usr/bin/env python + +import sys +sys.path.append('.libs') +import libpyboost + +myWorld = libpyboost.World() +myWorld.add('Bonjour', 5) +myWorld.add('Hello', 7) +myWorld.add('Gruss Gott', 5) + +#MsgList=myWorld.getMsgList() +#IDList=myWorld.getIDList() + +#print ', '.join(['%s: %s' % (msg, id) for (msg, id) in zip(MsgList, IDList)]) + +a = myWorld.toSimpleString().split(',') +print '^'.join(a[::-1]) Added: trunk/opentrep/test/python/pyopentrep.cpp =================================================================== --- trunk/opentrep/test/python/pyopentrep.cpp (rev 0) +++ trunk/opentrep/test/python/pyopentrep.cpp 2009-07-20 13:22:52 UTC (rev 146) @@ -0,0 +1,69 @@ +// C +#include <sstream> +#include <string> +#include <list> +#include <vector> +// Boost String +#include <boost/python.hpp> +// OpenTREP +#include <opentrep/OPENTREP_Service.hpp> + +namespace OPENTREP { + + struct OpenTrepSearcher { + + /** Wrapper around the search use case. 
*/ + std::string search (const std::string& iTravelQuery, + const std::string& iXapianDatabaseFilepath) { + std::ostringstream oStr; + + // Output stream (for logs) + std::ostringstream logOutputStream; + + // Initialise the context + OPENTREP_Service opentrepService(logOutputStream, iXapianDatabaseFilepath); + + // Query the Xapian database (index) + WordList_T lNonMatchedWordList; + LocationList_T lLocationList; + const NbOfMatches_T nbOfMatches = + opentrepService.interpretTravelRequest (iTravelQuery, lLocationList, + lNonMatchedWordList); + + if (nbOfMatches != 0) { + NbOfMatches_T idx = 0; + for (LocationList_T::const_iterator itLocation = lLocationList.begin(); + itLocation != lLocationList.end(); ++itLocation, ++idx) { + const Location& lLocation = *itLocation; + if (idx != 0) { + oStr << ","; + } + oStr << lLocation.getLocationCode(); + } + } + + if (lNonMatchedWordList.empty() == false) { + oStr << ";"; + NbOfMatches_T idx = 0; + for (WordList_T::const_iterator itWord = lNonMatchedWordList.begin(); + itWord != lNonMatchedWordList.end(); ++itWord, ++idx) { + const Word_T& lWord = *itWord; + if (idx != 0) { + oStr << ","; + } + oStr << lWord; + } + } + + return oStr.str(); + } + + }; + +} + +// ///////////////////////////////////////////////////////////// +BOOST_PYTHON_MODULE(libpyopentrep) { + boost::python::class_<OPENTREP::OpenTrepSearcher> ("OpenTrepSearcher") + .def ("search", &OPENTREP::OpenTrepSearcher::search); +} Added: trunk/opentrep/test/python/pyopentrep.py =================================================================== --- trunk/opentrep/test/python/pyopentrep.py (rev 0) +++ trunk/opentrep/test/python/pyopentrep.py 2009-07-20 13:22:52 UTC (rev 146) @@ -0,0 +1,13 @@ +#!/usr/bin/env python + +import sys +sys.path.append('.libs') +import libpyopentrep + +mySearch = libpyopentrep.OpenTrepSearcher() +a = mySearch.search('sna francicso rio de janero lso angles reykyavki','../traveldb') +print a + +#print ', '.join(['%s: %s' % (msg, id) for (msg, 
id) in zip(MsgList, IDList)]) +#a = myWorld.toSimpleString().split(',') +#print '^'.join(a[::-1]) Property changes on: trunk/opentrep/test/python/pyopentrep.py ___________________________________________________________________ Added: svn:executable + *
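Revision 146 changes the separator written by `World::toSimpleString()` from `", "` to `","`, and the test script's `split(...)` call changes with it. The sketch below uses a plain Python stand-in to show why the two sides have to agree. `to_simple_string` is a hypothetical function, not the Boost.Python binding, and it assumes the messages are emitted in insertion order.

```python
def to_simple_string(messages, separator=","):
    # Hypothetical stand-in for World::toSimpleString(): join the
    # messages in insertion order with the given separator.
    return separator.join(messages)


def reversed_with_caret(simple_string, separator=","):
    # Same operation as the last two lines of boost_python.py:
    # split on the separator, reverse, and join with '^'.
    parts = simple_string.split(separator)
    return "^".join(parts[::-1])


messages = ["Bonjour", "Hello", "Gruss Gott"]

# Writer and reader agree on ",": the items come back unchanged.
matching = reversed_with_caret(to_simple_string(messages, ","), ",")

# Writer still uses ", " while the reader splits on ",":
# every item but the first keeps a leading space.
mismatched = reversed_with_caret(to_simple_string(messages, ", "), ",")
```

With matching separators the items round-trip cleanly. With the old `", "` writer and the new `","` reader, stray leading spaces survive in the output.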
From: <den...@us...> - 2009-07-20 22:12:10
|
Revision: 151 http://opentrep.svn.sourceforge.net/opentrep/?rev=151&view=rev Author: denis_arnaud Date: 2009-07-20 22:11:59 +0000 (Mon, 20 Jul 2009) Log Message: ----------- [Python] Improved the robustness of the library. Modified Paths: -------------- trunk/opentrep/configure.ac trunk/opentrep/opentrep/DBParams.hpp trunk/opentrep/opentrep/OPENTREP_Types.hpp trunk/opentrep/opentrep/command/RequestInterpreter.cpp trunk/opentrep/opentrep/service/OPENTREP_Service.cpp Added Paths: ----------- trunk/opentrep/opentrep/python/Makefile.am trunk/opentrep/opentrep/python/pyopentrep.cpp trunk/opentrep/opentrep/python/pyopentrep.py trunk/opentrep/opentrep/python/sources.mk Property Changed: ---------------- trunk/opentrep/opentrep/python/ Modified: trunk/opentrep/configure.ac =================================================================== --- trunk/opentrep/configure.ac 2009-07-20 21:12:10 UTC (rev 150) +++ trunk/opentrep/configure.ac 2009-07-20 22:11:59 UTC (rev 151) @@ -232,6 +232,7 @@ opentrep/config/Makefile opentrep/core/Makefile opentrep/batches/Makefile + opentrep/python/Makefile man/Makefile info/Makefile doc/Makefile Modified: trunk/opentrep/opentrep/DBParams.hpp =================================================================== --- trunk/opentrep/opentrep/DBParams.hpp 2009-07-20 21:12:10 UTC (rev 150) +++ trunk/opentrep/opentrep/DBParams.hpp 2009-07-20 22:11:59 UTC (rev 151) @@ -77,6 +77,18 @@ public: + // ///////// Busines methods //////// + /** Check that all the parameters are fine. */ + bool check () const { + if (_user.empty() == true || _passwd.empty() == true + || _host.empty() == true || _port.empty() + || _dbname.empty() == true) { + return false; + } + return true; + } + + public: // ///////// Display methods //////// /** Dump a structure into an output stream. @param ostream& the output stream. 
*/ Modified: trunk/opentrep/opentrep/OPENTREP_Types.hpp =================================================================== --- trunk/opentrep/opentrep/OPENTREP_Types.hpp 2009-07-20 21:12:10 UTC (rev 150) +++ trunk/opentrep/opentrep/OPENTREP_Types.hpp 2009-07-20 22:11:59 UTC (rev 151) @@ -30,13 +30,37 @@ class ObjectNotFoundException : public RootException { }; - class DocumentNotFoundException : public RootException { + class XapianException : public RootException { }; + + class DocumentNotFoundException : public XapianException { + }; - class SQLDatabaseConnectionImpossibleException : public RootException { + class XapianDatabaseFailureException : public XapianException { }; - + class XapianTravelDatabaseEmptyException : public XapianException { + }; + + class SQLDatabaseException : public RootException { + }; + + class SQLDatabaseConnectionImpossibleException : public SQLDatabaseException { + }; + + class BuildIndexException : public RootException { + }; + + class InterpreteUseCaseException : public RootException { + }; + + class InterpreteTravelRequestException : public InterpreteUseCaseException { + }; + + class TravelRequestEmptyException : public InterpreteUseCaseException { + }; + + // /////////////// Log ///////////// /** Level of logs. 
*/ namespace LOG { Modified: trunk/opentrep/opentrep/command/RequestInterpreter.cpp =================================================================== --- trunk/opentrep/opentrep/command/RequestInterpreter.cpp 2009-07-20 21:12:10 UTC (rev 150) +++ trunk/opentrep/opentrep/command/RequestInterpreter.cpp 2009-07-20 22:11:59 UTC (rev 151) @@ -115,6 +115,10 @@ WordList_T& ioWordList) { NbOfMatches_T oNbOfMatches = 0; + // Sanity checks + assert (iTravelDatabaseName.empty() == false); + assert (iTravelQuery.empty() == false); + // Create a PlaceHolder object, to collect the matching Place objects PlaceHolder& lPlaceHolder = FacPlaceHolder::instance().create(); @@ -146,6 +150,7 @@ } catch (const Xapian::Error& error) { OPENTREP_LOG_ERROR ("Exception: " << error.get_msg()); + throw XapianDatabaseFailureException(); } // DEBUG Property changes on: trunk/opentrep/opentrep/python ___________________________________________________________________ Added: svn:ignore + .deps .libs Makefile.in Makefile pyopentrep.log Added: trunk/opentrep/opentrep/python/Makefile.am =================================================================== --- trunk/opentrep/opentrep/python/Makefile.am (rev 0) +++ trunk/opentrep/opentrep/python/Makefile.am 2009-07-20 22:11:59 UTC (rev 151) @@ -0,0 +1,15 @@ +## command sub-directory +include $(top_srcdir)/Makefile.common + +MAINTAINERCLEANFILES = Makefile.in + +# Library +lib_LTLIBRARIES = libpy@PACKAGE@.la + +libpy@PACKAGE@_la_SOURCES = pyopentrep.cpp +libpy@PACKAGE@_la_CXXFLAGS = ${PYTHON_CFLAGS} ${BOOST_CFLAGS} +libpy@PACKAGE@_la_LIBADD = $(top_builddir)/@PACKAGE@/core/lib@PACKAGE@.la +libpy@PACKAGE@_la_LDFLAGS = ${PYTHON_LIBS} ${PYTHON_ADD_LIBS} \ + ${BOOST_PYTHON_LIB} -version-info $(GENERIC_LIBRARY_VERSION) + +EXTRA_DIST = pyopentrep.py Added: trunk/opentrep/opentrep/python/pyopentrep.cpp =================================================================== --- trunk/opentrep/opentrep/python/pyopentrep.cpp (rev 0) +++ 
trunk/opentrep/opentrep/python/pyopentrep.cpp 2009-07-20 22:11:59 UTC (rev 151) @@ -0,0 +1,168 @@ +// C +#include <cassert> +// STL +#include <stdexcept> +#include <fstream> +#include <sstream> +#include <string> +#include <list> +#include <vector> +// Boost String +#include <boost/python.hpp> +// OpenTREP +#include <opentrep/OPENTREP_Service.hpp> + +namespace OPENTREP { + + struct OpenTrepSearcher { + public: + + /** Wrapper around the search use case. */ + std::string search (const std::string& iTravelQuery) { + std::ostringstream oStr; + + // Sanity check + if (_logOutputStream == NULL) { + oStr << "The log filepath is not valid." << std::endl; + return oStr.str(); + } + + try { + + // DEBUG + *_logOutputStream << "Python search for '" << iTravelQuery << "'" + << std::endl; + + if (_opentrepService == NULL) { + oStr << "The OpenTREP service has not been initialised, i.e., " + << "the init() method has not been called correctly on the " + << "OpenTrepSearcher object. Please check that all the " + << "parameters are not empty and point to actual files."; + return oStr.str(); + } + assert (_opentrepService != NULL); + + // Query the Xapian database (index) + WordList_T lNonMatchedWordList; + LocationList_T lLocationList; + const NbOfMatches_T nbOfMatches = + _opentrepService->interpretTravelRequest (iTravelQuery, lLocationList, + lNonMatchedWordList); + + // DEBUG + *_logOutputStream << "Python search for '" << iTravelQuery << "' gave " + << nbOfMatches << " matches." 
<< std::endl; + + if (nbOfMatches != 0) { + NbOfMatches_T idx = 0; + for(LocationList_T::const_iterator itLocation = lLocationList.begin(); + itLocation != lLocationList.end(); ++itLocation, ++idx) { + const Location& lLocation = *itLocation; + if (idx != 0) { + oStr << ","; + } + oStr << lLocation.getLocationCode(); + } + } + + if (lNonMatchedWordList.empty() == false) { + oStr << ";"; + NbOfMatches_T idx = 0; + for (WordList_T::const_iterator itWord = lNonMatchedWordList.begin(); + itWord != lNonMatchedWordList.end(); ++itWord, ++idx) { + const Word_T& lWord = *itWord; + if (idx != 0) { + oStr << ","; + } + oStr << lWord; + } + } + + // DEBUG + *_logOutputStream << "Python search for '" << iTravelQuery + << "' yielded:" << std::endl; + + // DEBUG + *_logOutputStream << oStr.str() << std::endl; + + } catch (const std::exception& error) { + *_logOutputStream << "Exception: " << error.what() << std::endl; + } + + return oStr.str(); + } + + public: + /** Default constructor. */ + OpenTrepSearcher() : _opentrepService (NULL), _logOutputStream (NULL) { + } + + /** Default copy constructor. */ + OpenTrepSearcher (const OpenTrepSearcher& iOpenTrepSearcher) + : _opentrepService (iOpenTrepSearcher._opentrepService), + _logOutputStream (iOpenTrepSearcher._logOutputStream) { + } + + /** Default constructor. */ + ~OpenTrepSearcher() { + _opentrepService = NULL; + _logOutputStream = NULL; + } + + /** Wrapper around the search use case. 
*/ + bool init (const std::string& iXapianDatabaseFilepath, + const std::string& iLogFilepath, + const std::string& iDBUser, const std::string& iDBPasswd, + const std::string& iDBHost, const std::string& iDBPort, + const std::string& iDBDBName) { + bool isEverythingOK = true; + + try { + + // TODO: use Boost Filesystem to check the filepaths + if (iXapianDatabaseFilepath.empty() == true + || iLogFilepath.empty() == true) { + isEverythingOK = false; + return isEverythingOK; + } + + // Set the log parameters + _logOutputStream = new std::ofstream; + assert (_logOutputStream != NULL); + + // Open and clean the log outputfile + _logOutputStream->open (iLogFilepath.c_str()); + _logOutputStream->clear(); + + // DEBUG + *_logOutputStream << "Python wrapper initialisation" << std::endl; + + // Initialise the context + DBParams lDBParams (iDBUser, iDBPasswd, iDBHost, iDBPort, iDBDBName); + _opentrepService = new OPENTREP_Service (*_logOutputStream, lDBParams, + iXapianDatabaseFilepath); + + // DEBUG + *_logOutputStream << "Python wrapper initialised" << std::endl; + + } catch (const std::exception& error) { + *_logOutputStream << "Exception: " << error.what() << std::endl; + } + + return isEverythingOK; + } + + private: + /** Handle on the OpenTREP services (API). 
*/ + OPENTREP_Service* _opentrepService; + std::ofstream* _logOutputStream; + }; + +} + +// ///////////////////////////////////////////////////////////// +BOOST_PYTHON_MODULE(libpyopentrep) { + boost::python::class_<OPENTREP::OpenTrepSearcher> ("OpenTrepSearcher") + .def ("search", &OPENTREP::OpenTrepSearcher::search) + .def ("init", &OPENTREP::OpenTrepSearcher::init); +} Added: trunk/opentrep/opentrep/python/pyopentrep.py =================================================================== --- trunk/opentrep/opentrep/python/pyopentrep.py (rev 0) +++ trunk/opentrep/opentrep/python/pyopentrep.py 2009-07-20 22:11:59 UTC (rev 151) @@ -0,0 +1,11 @@ +#!/usr/bin/env python + +import sys +sys.path.append('.libs') +import libpyopentrep + +mySearch = libpyopentrep.OpenTrepSearcher() +mySearch.init('../../test/traveldb', 'pyopentrep.log', 'opentrep', 'opentrep', 'localhost', '3306', 'opentrep') + +a = mySearch.search('sna francicso rio de janero lso angles reykyavki') +print a Property changes on: trunk/opentrep/opentrep/python/pyopentrep.py ___________________________________________________________________ Added: svn:executable + * Added: trunk/opentrep/opentrep/python/sources.mk =================================================================== --- trunk/opentrep/opentrep/python/sources.mk (rev 0) +++ trunk/opentrep/opentrep/python/sources.mk 2009-07-20 22:11:59 UTC (rev 151) @@ -0,0 +1,2 @@ +python_svc_h_sources = $(top_srcdir)/opentrep/python/pyopentrep.cpp +python_svc_sources = Modified: trunk/opentrep/opentrep/service/OPENTREP_Service.cpp =================================================================== --- trunk/opentrep/opentrep/service/OPENTREP_Service.cpp 2009-07-20 21:12:10 UTC (rev 150) +++ trunk/opentrep/opentrep/service/OPENTREP_Service.cpp 2009-07-20 22:11:59 UTC (rev 151) @@ -56,6 +56,20 @@ // Set the log file logInit (LOG::DEBUG, ioLogStream); + // Check that the Xapian travel database is not empty + if (iTravelDatabaseName.empty() == true) { + 
OPENTREP_LOG_ERROR ("The filepath for the Xapian travel database is " + << "empty."); + throw XapianTravelDatabaseEmptyException(); + } + + // Check that the parameters for the SQL database are not empty + if (iDBParams.check() == false) { + OPENTREP_LOG_ERROR ("At least one of the parameters for the SQL " + << "database is empty: " << iDBParams); + throw XapianTravelDatabaseEmptyException(); + } + // Initialise the context OPENTREP_ServiceContext& lOPENTREP_ServiceContext = FacOpenTrepServiceContext::instance().create (iTravelDatabaseName); @@ -88,26 +102,33 @@ } assert (_opentrepServiceContext != NULL); OPENTREP_ServiceContext& lOPENTREP_ServiceContext= *_opentrepServiceContext; - - // Retrieve the SOCI Session - soci::session& lSociSession = - lOPENTREP_ServiceContext.getSociSessionHandler(); - - // Retrieve the Xapian database name (directorty of the index) - const TravelDatabaseName_T& lTravelDatabaseName = - lOPENTREP_ServiceContext.getTravelDatabaseName(); - // Delegate the index building to the dedicated command - BasChronometer lBuildSearchIndexChronometer; - lBuildSearchIndexChronometer.start(); - IndexBuilder::buildSearchIndex (lSociSession, lTravelDatabaseName); - const double lBuildSearchIndexMeasure = - lBuildSearchIndexChronometer.elapsed(); + try { + + // Retrieve the SOCI Session + soci::session& lSociSession = + lOPENTREP_ServiceContext.getSociSessionHandler(); + + // Retrieve the Xapian database name (directorty of the index) + const TravelDatabaseName_T& lTravelDatabaseName = + lOPENTREP_ServiceContext.getTravelDatabaseName(); + + // Delegate the index building to the dedicated command + BasChronometer lBuildSearchIndexChronometer; + lBuildSearchIndexChronometer.start(); + IndexBuilder::buildSearchIndex (lSociSession, lTravelDatabaseName); + const double lBuildSearchIndexMeasure = + lBuildSearchIndexChronometer.elapsed(); + + // DEBUG + OPENTREP_LOG_DEBUG ("Build Xapian database (index): " + << lBuildSearchIndexMeasure << " - " + << 
lOPENTREP_ServiceContext.display()); - // DEBUG - OPENTREP_LOG_DEBUG ("Build Xapian database (index): " - << lBuildSearchIndexMeasure << " - " - << lOPENTREP_ServiceContext.display()); + } catch (const std::exception& error) { + OPENTREP_LOG_ERROR ("Exception: " << error.what()); + throw BuildIndexException(); + } } // ////////////////////////////////////////////////////////////////////// @@ -115,6 +136,7 @@ interpretTravelRequest (const std::string& iTravelQuery, LocationList_T& ioLocationList, WordList_T& ioWordList) { + NbOfMatches_T nbOfMatches = 0; if (_opentrepServiceContext == NULL) { throw NonInitialisedServiceException(); @@ -122,30 +144,45 @@ assert (_opentrepServiceContext != NULL); OPENTREP_ServiceContext& lOPENTREP_ServiceContext= *_opentrepServiceContext; - // Retrieve the SOCI Session - soci::session& lSociSession = - lOPENTREP_ServiceContext.getSociSessionHandler(); + // Check that the travel request is not empty + if (iTravelQuery.empty() == true) { + OPENTREP_LOG_ERROR ("The travel request is empty."); + throw TravelRequestEmptyException(); + } - // Retrieve the Xapian database name (directorty of the index) - const TravelDatabaseName_T& lTravelDatabaseName = - lOPENTREP_ServiceContext.getTravelDatabaseName(); - - // Delegate the query execution to the dedicated command - BasChronometer lRequestInterpreterChronometer; - lRequestInterpreterChronometer.start(); - const NbOfMatches_T nbOfMatches = - RequestInterpreter::interpretTravelRequest (lSociSession, - lTravelDatabaseName, - iTravelQuery, ioLocationList, - ioWordList); - const double lRequestInterpreterMeasure = - lRequestInterpreterChronometer.elapsed(); + try { + + // Retrieve the SOCI Session + soci::session& lSociSession = + lOPENTREP_ServiceContext.getSociSessionHandler(); + + // Retrieve the Xapian database name (directorty of the index) + const TravelDatabaseName_T& lTravelDatabaseName = + lOPENTREP_ServiceContext.getTravelDatabaseName(); + + // Delegate the query execution to the 
dedicated command + BasChronometer lRequestInterpreterChronometer; + lRequestInterpreterChronometer.start(); + nbOfMatches = + RequestInterpreter::interpretTravelRequest (lSociSession, + lTravelDatabaseName, + iTravelQuery, + ioLocationList, + ioWordList); + const double lRequestInterpreterMeasure = + lRequestInterpreterChronometer.elapsed(); - // DEBUG - OPENTREP_LOG_DEBUG ("Match query on Xapian database (index): " - << lRequestInterpreterMeasure << " - " - << lOPENTREP_ServiceContext.display()); - + // DEBUG + OPENTREP_LOG_DEBUG ("Match query on Xapian database (index): " + << lRequestInterpreterMeasure << " - " + << lOPENTREP_ServiceContext.display()); + + + } catch (const std::exception& error) { + OPENTREP_LOG_ERROR ("Exception: " << error.what()); + throw InterpreteTravelRequestException(); + } + return nbOfMatches; }
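The `search()` wrapper added in revision 151 returns a single string. It holds the matched location codes separated by commas and, when some words were left unmatched, a `;` followed by those words, also comma-separated. Below is a minimal sketch of splitting that string back into two lists on the Python side. `parse_search_result` is a hypothetical helper, not part of OpenTREP. It assumes that neither codes nor words contain `,` or `;`, and the codes `AAA` and `BBB` in the usage lines are placeholders, not real query results.

```python
def parse_search_result(result):
    """Split 'CODE1,CODE2;word1,word2' into (codes, unmatched_words)."""
    # Hypothetical helper: the format is inferred from the C++ wrapper,
    # which writes codes, then ';' and the unmatched words if there are any.
    codes_part, _, words_part = result.partition(";")
    codes = [code for code in codes_part.split(",") if code]
    words = [word for word in words_part.split(",") if word]
    return codes, words


# Placeholder input, for illustration only.
codes, words = parse_search_result("AAA,BBB;lso,angles")
```

Note that `search()` can also return a diagnostic sentence (for instance when `init()` has not been called). This sketch does not detect that case, so a caller would need to check for it separately.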
From: <den...@us...> - 2009-07-20 22:27:08
|
Revision: 152 http://opentrep.svn.sourceforge.net/opentrep/?rev=152&view=rev Author: denis_arnaud Date: 2009-07-20 22:27:02 +0000 (Mon, 20 Jul 2009) Log Message: ----------- [DB] Changed the default SQL database name from opentrep to trep_opentrep. Modified Paths: -------------- trunk/opentrep/db/maintenance/create_and_fill_mysql_db.sh trunk/opentrep/db/maintenance/drop_tables_from_mysql_db.sh trunk/opentrep/opentrep/batches/indexer.cpp trunk/opentrep/opentrep/batches/opentrep_indexer.cfg trunk/opentrep/opentrep/batches/opentrep_searcher.cfg trunk/opentrep/opentrep/batches/searcher.cpp trunk/opentrep/opentrep/python/pyopentrep.py Modified: trunk/opentrep/db/maintenance/create_and_fill_mysql_db.sh =================================================================== --- trunk/opentrep/db/maintenance/create_and_fill_mysql_db.sh 2009-07-20 22:11:59 UTC (rev 151) +++ trunk/opentrep/db/maintenance/create_and_fill_mysql_db.sh 2009-07-20 22:27:02 UTC (rev 152) @@ -3,42 +3,59 @@ # One parameter is required for this script: # - the username # -# Two parameters are optional: +# Three parameters are optional: +# - the database name # - the host server of the database # - the port of the database # -if [ "$1" = "" -o "$1" = "-h" -o "$1" = "--help" ]; +if [ "$1" = "-h" -o "$1" = "--h" -o "$1" = "--help" ]; then - echo "Usage: $0 <Database Username> [<Database Server Hostname> [<Database Server Port>]]" + echo "Usage: $0 [<Database Username> [<Database Name> [<Database Server Hostname> [<Database Server Port>]]]]" echo "" + echo "Default values:" + echo "<Database Username> = opentrep" + echo "<Database Name> = trep_opentrep" + echo "<Database Server Hostname> = localhost" + echo "<Database Server Port> = 3306" + echo "" exit -1 fi ## # Database Server Hostname -DB_HOST="localhost" +# Database User +DB_USER="opentrep" +if [ "$1" != "" ]; +then + DB_USER="$1" +fi + +# Database Password +DB_PASSWD="${DB_USER}" + +# Database Name +DB_NAME="trep_${DB_USER}" if [ "$2" != "" ]; then - 
DB_HOST="$2" + DB_NAME="$2" fi +DB_HOST="localhost" +if [ "$3" != "" ]; +then + DB_HOST="$3" +fi + # Database Server Port DB_PORT="3306" -if [ "$3" != "" ]; +if [ "$4" != "" ]; then - DB_PORT="$3" + DB_PORT="$4" fi -# Database User -DB_USER="$1" -# Database Password -DB_PASSWD="${DB_USER}" - -# Database Name -DB_NAME="opentrep" - +## # Check file existence function checkSQLFile() { if [ ! -r ${SQL_FILE} ]; then Modified: trunk/opentrep/db/maintenance/drop_tables_from_mysql_db.sh =================================================================== --- trunk/opentrep/db/maintenance/drop_tables_from_mysql_db.sh 2009-07-20 22:11:59 UTC (rev 151) +++ trunk/opentrep/db/maintenance/drop_tables_from_mysql_db.sh 2009-07-20 22:27:02 UTC (rev 152) @@ -3,42 +3,58 @@ # One parameter is required for this script: # - the username # -# Two parameters are optional: +# Three parameters are optional: +# - the database name # - the host server of the database # - the port of the database # -if [ "$1" = "" -o "$1" = "-h" -o "$1" = "--help" ]; +if [ "$1" = "-h" -o "$1" = "--h" -o "$1" = "--help" ]; then - echo "Usage: $0 <Database Username> [<Database Server Hostname> [<Database Server Port>]]" + echo "Usage: $0 [<Database Username> [<Database Name> [<Database Server Hostname> [<Database Server Port>]]]]" echo "" + echo "Default values:" + echo "<Database Username> = opentrep" + echo "<Database Name> = trep_opentrep" + echo "<Database Server Hostname> = localhost" + echo "<Database Server Port> = 3306" + echo "" exit -1 fi ## # Database Server Hostname -DB_HOST="localhost" +# Database User +DB_USER="opentrep" +if [ "$1" != "" ]; +then + DB_USER="$1" +fi + +# Database Password +DB_PASSWD="${DB_USER}" + +# Database Name +DB_NAME="trep_${DB_USER}" if [ "$2" != "" ]; then - DB_HOST="$2" + DB_NAME="$2" fi +DB_HOST="localhost" +if [ "$3" != "" ]; +then + DB_HOST="$3" +fi + # Database Server Port DB_PORT="3306" -if [ "$3" != "" ]; +if [ "$4" != "" ]; then - DB_PORT="$3" + DB_PORT="$4" fi -# 
Database User -DB_USER="$1" - -# Database Password -DB_PASSWD="${DB_USER}" - -# Database Name -DB_NAME="opentrep" - +## # Drop a table function dropTable() { echo "The ${TABLE} table will be dropped:" Modified: trunk/opentrep/opentrep/batches/indexer.cpp =================================================================== --- trunk/opentrep/opentrep/batches/indexer.cpp 2009-07-20 22:11:59 UTC (rev 151) +++ trunk/opentrep/opentrep/batches/indexer.cpp 2009-07-20 22:27:02 UTC (rev 152) @@ -30,7 +30,7 @@ /** Default name and location for the Xapian database. */ const std::string K_OPENTREP_DEFAULT_DB_USER ("opentrep"); const std::string K_OPENTREP_DEFAULT_DB_PASSWD ("opentrep"); -const std::string K_OPENTREP_DEFAULT_DB_DBNAME ("opentrep"); +const std::string K_OPENTREP_DEFAULT_DB_DBNAME ("trep_opentrep"); const std::string K_OPENTREP_DEFAULT_DB_HOST ("localhost"); const std::string K_OPENTREP_DEFAULT_DB_PORT ("3306"); Modified: trunk/opentrep/opentrep/batches/opentrep_indexer.cfg =================================================================== --- trunk/opentrep/opentrep/batches/opentrep_indexer.cfg 2009-07-20 22:11:59 UTC (rev 151) +++ trunk/opentrep/opentrep/batches/opentrep_indexer.cfg 2009-07-20 22:27:02 UTC (rev 152) @@ -4,4 +4,4 @@ passwd=opentrep host=localhost port=3306 -dbname=opentrep +dbname=trep_opentrep Modified: trunk/opentrep/opentrep/batches/opentrep_searcher.cfg =================================================================== --- trunk/opentrep/opentrep/batches/opentrep_searcher.cfg 2009-07-20 22:11:59 UTC (rev 151) +++ trunk/opentrep/opentrep/batches/opentrep_searcher.cfg 2009-07-20 22:27:02 UTC (rev 152) @@ -4,6 +4,6 @@ passwd=opentrep host=localhost port=3306 -dbname=opentrep +dbname=trep_opentrep error=3 query="sna francicso rio de janero lso angles reykyavki" \ No newline at end of file Modified: trunk/opentrep/opentrep/batches/searcher.cpp =================================================================== --- 
trunk/opentrep/opentrep/batches/searcher.cpp 2009-07-20 22:11:59 UTC (rev 151) +++ trunk/opentrep/opentrep/batches/searcher.cpp 2009-07-20 22:27:02 UTC (rev 152) @@ -34,7 +34,7 @@ /** Default name and location for the Xapian database. */ const std::string K_OPENTREP_DEFAULT_DB_USER ("opentrep"); const std::string K_OPENTREP_DEFAULT_DB_PASSWD ("opentrep"); -const std::string K_OPENTREP_DEFAULT_DB_DBNAME ("opentrep"); +const std::string K_OPENTREP_DEFAULT_DB_DBNAME ("trep_opentrep"); const std::string K_OPENTREP_DEFAULT_DB_HOST ("localhost"); const std::string K_OPENTREP_DEFAULT_DB_PORT ("3306"); Modified: trunk/opentrep/opentrep/python/pyopentrep.py =================================================================== --- trunk/opentrep/opentrep/python/pyopentrep.py 2009-07-20 22:11:59 UTC (rev 151) +++ trunk/opentrep/opentrep/python/pyopentrep.py 2009-07-20 22:27:02 UTC (rev 152) @@ -5,7 +5,7 @@ import libpyopentrep mySearch = libpyopentrep.OpenTrepSearcher() -mySearch.init('../../test/traveldb', 'pyopentrep.log', 'opentrep', 'opentrep', 'localhost', '3306', 'opentrep') +mySearch.init('../../test/traveldb', 'pyopentrep.log', 'opentrep', 'opentrep', 'localhost', '3306', 'trep_opentrep') a = mySearch.search('sna francicso rio de janero lso angles reykyavki') print a
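After revision 152, the two maintenance scripts take four optional positional parameters, and an empty parameter falls back to its default. The defaulting rules can be sketched in Python as follows. `resolve_db_params` is a hypothetical function written for illustration, not code from the repository. It mirrors the shell logic as read from the diff, including the detail that the default database name is derived from the user name.

```python
def resolve_db_params(args):
    """Mirror the positional defaults of the maintenance shell scripts.

    `args` holds the positional parameters after the script name:
    [user, dbname, host, port], any of which may be missing or empty.
    """
    def arg(index):
        # A missing or empty parameter counts as "not given",
        # like the `[ "$N" != "" ]` tests in the scripts.
        return args[index] if len(args) > index and args[index] != "" else None

    user = arg(0) or "opentrep"
    return {
        "user": user,
        "passwd": user,                       # DB_PASSWD="${DB_USER}"
        "dbname": arg(1) or "trep_" + user,   # DB_NAME="trep_${DB_USER}"
        "host": arg(2) or "localhost",
        "port": arg(3) or "3306",
    }


defaults = resolve_db_params([])
custom_user = resolve_db_params(["bob"])
```

One consequence worth noting: the usage text lists `trep_opentrep` as the default database name, but that only holds for the default user. With a different user and no explicit name, the scripts would use `trep_<user>`.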
From: <den...@us...> - 2009-07-20 22:42:05
|
Revision: 153
          http://opentrep.svn.sourceforge.net/opentrep/?rev=153&view=rev
Author:   denis_arnaud
Date:     2009-07-20 22:41:55 +0000 (Mon, 20 Jul 2009)

Log Message:
-----------
[Test] Fixed some bugs with the tests after the interface (API) change.

Modified Paths:
--------------
    trunk/opentrep/opentrep/Makefile.am
    trunk/opentrep/test/IndexBuildingTestSuite.cpp
    trunk/opentrep/test/python/Makefile.am

Removed Paths:
-------------
    trunk/opentrep/test/python/pyopentrep.cpp
    trunk/opentrep/test/python/pyopentrep.py

Modified: trunk/opentrep/opentrep/Makefile.am
===================================================================
--- trunk/opentrep/opentrep/Makefile.am 2009-07-20 22:27:02 UTC (rev 152)
+++ trunk/opentrep/opentrep/Makefile.am 2009-07-20 22:41:55 UTC (rev 153)
@@ -5,7 +5,7 @@

 MAINTAINERCLEANFILES = Makefile.in

-SUBDIRS = basic bom factory dbadaptor command service core config batches
+SUBDIRS = basic bom factory dbadaptor command service core config batches python

 EXTRA_DIST = config_msvc.h


Modified: trunk/opentrep/test/IndexBuildingTestSuite.cpp
===================================================================
--- trunk/opentrep/test/IndexBuildingTestSuite.cpp 2009-07-20 22:27:02 UTC (rev 152)
+++ trunk/opentrep/test/IndexBuildingTestSuite.cpp 2009-07-20 22:41:55 UTC (rev 153)
@@ -24,16 +24,24 @@
     // Open and clean the log outputfile
     logOutputFile.open (lLogFilename.c_str());
     logOutputFile.clear();
+
+    // SQL database parameters
+    OPENTREP::DBParams lDBParams ("opentrep", "opentrep", "localhost", "3306",
+                                  "trep_opentrep");

     // Initialise the context
-    OPENTREP::OPENTREP_Service opentrepService;
-    opentrepService.init (logOutputFile, lXapianDatabaseName);
+    OPENTREP::OPENTREP_Service opentrepService (logOutputFile, lDBParams,
+                                                lXapianDatabaseName);

     // Launch a simulation
     //opentrepService.buildSearchIndex();

     // Query the Xapian database (index)
-    opentrepService.interpretTravelRequest (lTravelQuery);
+    OPENTREP::WordList_T lNonMatchedWordList;
+    OPENTREP::LocationList_T lLocationList;
+    const OPENTREP::NbOfMatches_T nbOfMatches =
+      opentrepService.interpretTravelRequest (lTravelQuery, lLocationList,
+                                              lNonMatchedWordList);

     // Close the Log outputFile
     logOutputFile.close();

Modified: trunk/opentrep/test/python/Makefile.am
===================================================================
--- trunk/opentrep/test/python/Makefile.am 2009-07-20 22:27:02 UTC (rev 152)
+++ trunk/opentrep/test/python/Makefile.am 2009-07-20 22:41:55 UTC (rev 153)
@@ -4,7 +4,7 @@
 MAINTAINERCLEANFILES = Makefile.in

 # Libraries
-lib_LTLIBRARIES = libpyboost.la libpy@PACKAGE@.la
+lib_LTLIBRARIES = libpyboost.la

 libpyboost_la_SOURCES = boost_python.cpp
 libpyboost_la_CXXFLAGS = ${PYTHON_CFLAGS}
@@ -12,10 +12,4 @@
 libpyboost_la_LDFLAGS = ${PYTHON_LIBS} ${PYTHON_ADD_LIBS} \
     ${BOOST_PYTHON_LIB} -version-info $(GENERIC_LIBRARY_VERSION)

-libpy@PACKAGE@_la_SOURCES = pyopentrep.cpp
-libpy@PACKAGE@_la_CXXFLAGS = ${PYTHON_CFLAGS} ${BOOST_CFLAGS}
-libpy@PACKAGE@_la_LIBADD = $(top_builddir)/@PACKAGE@/core/lib@PACKAGE@.la
-libpy@PACKAGE@_la_LDFLAGS = ${PYTHON_LIBS} ${PYTHON_ADD_LIBS} \
-    ${BOOST_PYTHON_LIB} -version-info $(GENERIC_LIBRARY_VERSION)
-
-EXTRA_DIST =
+EXTRA_DIST = boost_python.py

Deleted: trunk/opentrep/test/python/pyopentrep.cpp
===================================================================
--- trunk/opentrep/test/python/pyopentrep.cpp 2009-07-20 22:27:02 UTC (rev 152)
+++ trunk/opentrep/test/python/pyopentrep.cpp 2009-07-20 22:41:55 UTC (rev 153)
@@ -1,110 +0,0 @@
-// C
-#include <cassert>
-// STL
-#include <sstream>
-#include <string>
-#include <list>
-#include <vector>
-// Boost String
-#include <boost/python.hpp>
-// OpenTREP
-#include <opentrep/OPENTREP_Service.hpp>
-
-namespace OPENTREP {
-
-  struct OpenTrepSearcher {
-  public:
-
-    /** Wrapper around the search use case. */
-    std::string search (const std::string& iTravelQuery) {
-      std::ostringstream oStr;
-
-      if (_opentrepService == NULL) {
-        oStr << "The OpenTREP service has not been initialise, i.e., "
-             << "the init() method has not been called on the "
-             << "OpenTrepSearcher object. Please do so.";
-        return oStr.str();
-      }
-      assert (_opentrepService != NULL);
-
-      // Query the Xapian database (index)
-      WordList_T lNonMatchedWordList;
-      LocationList_T lLocationList;
-      const NbOfMatches_T nbOfMatches =
-        _opentrepService->interpretTravelRequest (iTravelQuery, lLocationList,
-                                                  lNonMatchedWordList);
-
-      if (nbOfMatches != 0) {
-        NbOfMatches_T idx = 0;
-        for (LocationList_T::const_iterator itLocation = lLocationList.begin();
-             itLocation != lLocationList.end(); ++itLocation, ++idx) {
-          const Location& lLocation = *itLocation;
-          if (idx != 0) {
-            oStr << ",";
-          }
-          oStr << lLocation.getLocationCode();
-        }
-      }
-
-      if (lNonMatchedWordList.empty() == false) {
-        oStr << ";";
-        NbOfMatches_T idx = 0;
-        for (WordList_T::const_iterator itWord = lNonMatchedWordList.begin();
-             itWord != lNonMatchedWordList.end(); ++itWord, ++idx) {
-          const Word_T& lWord = *itWord;
-          if (idx != 0) {
-            oStr << ",";
-          }
-          oStr << lWord;
-        }
-      }
-
-      return oStr.str();
-    }
-
-  public:
-    /** Default constructor. */
-    OpenTrepSearcher() : _opentrepService (NULL), _logOutputStream (NULL) {
-    }
-
-    /** Default copy constructor. */
-    OpenTrepSearcher (const OpenTrepSearcher& iOpenTrepSearcher)
-      : _opentrepService (iOpenTrepSearcher._opentrepService),
-        _logOutputStream (iOpenTrepSearcher._logOutputStream) {
-    }
-
-    /** Default constructor. */
-    ~OpenTrepSearcher() {
-      _opentrepService = NULL;
-      _logOutputStream = NULL;
-    }
-
-    /** Wrapper around the search use case. */
-    void init (const std::string& iXapianDatabaseFilepath,
-               const std::string& iDBUser, const std::string& iDBPasswd,
-               const std::string& iDBHost, const std::string& iDBPort,
-               const std::string& iDBDBName) {
-
-      // Output stream (for logs)
-      _logOutputStream = new std::ostringstream();
-
-      // Initialise the context
-      DBParams lDBParams (iDBUser, iDBPasswd, iDBHost, iDBPort, iDBDBName);
-      _opentrepService = new OPENTREP_Service (*_logOutputStream, lDBParams,
-                                               iXapianDatabaseFilepath);
-    }
-
-  private:
-    /** Handle on the OpenTREP services (API). */
-    OPENTREP_Service* _opentrepService;
-    std::ostringstream* _logOutputStream;
-  };
-
-}
-
-// /////////////////////////////////////////////////////////////
-BOOST_PYTHON_MODULE(libpyopentrep) {
-  boost::python::class_<OPENTREP::OpenTrepSearcher> ("OpenTrepSearcher")
-    .def ("search", &OPENTREP::OpenTrepSearcher::search)
-    .def ("init", &OPENTREP::OpenTrepSearcher::init);
-}

Deleted: trunk/opentrep/test/python/pyopentrep.py
===================================================================
--- trunk/opentrep/test/python/pyopentrep.py 2009-07-20 22:27:02 UTC (rev 152)
+++ trunk/opentrep/test/python/pyopentrep.py 2009-07-20 22:41:55 UTC (rev 153)
@@ -1,14 +0,0 @@
-#!/usr/bin/env python
-
-import sys
-sys.path.append('.libs')
-import libpyopentrep
-
-mySearch = libpyopentrep.OpenTrepSearcher()
-mySearch.init('../traveldb', 'opentrep', 'opentrep', 'localhost', '3306', 'opentrep')
-a = mySearch.search('sna francicso rio de janero lso angles reykyavki')
-print a
-
-#print ', '.join(['%s: %s' % (msg, id) for (msg, id) in zip(MsgList, IDList)])
-#a = myWorld.toSimpleString().split(',')
-#print '^'.join(a[::-1])


This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.
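For readers following the API change: the `search()` wrapper removed in this revision flattened the output of `interpretTravelRequest()` into one string, with the matched location codes separated by commas and, when some words could not be matched, a semicolon followed by those words. The sketch below reproduces only that output format. It is not the project's code: `formatSearchResult` and `CodeList_T` are hypothetical names, and plain strings stand in for the OPENTREP `Location` and `Word_T` types.

```cpp
#include <list>
#include <sstream>
#include <string>

// Hypothetical stand-ins: plain strings replace OPENTREP::Location (only its
// location code is needed here) and OPENTREP::Word_T.
typedef std::list<std::string> CodeList_T;
typedef std::list<std::string> WordList_T;

// Build "code,code;word,word". The ";word,..." part is emitted only when
// at least one word was left unmatched.
std::string formatSearchResult (const CodeList_T& iCodeList,
                                const WordList_T& iNonMatchedWordList) {
  std::ostringstream oStr;

  // Matched location codes, comma-separated
  for (CodeList_T::const_iterator itCode = iCodeList.begin();
       itCode != iCodeList.end(); ++itCode) {
    if (itCode != iCodeList.begin()) {
      oStr << ",";
    }
    oStr << *itCode;
  }

  // Non-matched words, introduced by a semicolon
  if (iNonMatchedWordList.empty() == false) {
    oStr << ";";
    for (WordList_T::const_iterator itWord = iNonMatchedWordList.begin();
         itWord != iNonMatchedWordList.end(); ++itWord) {
      if (itWord != iNonMatchedWordList.begin()) {
        oStr << ",";
      }
      oStr << *itWord;
    }
  }

  return oStr.str();
}
```

A caller on the Python side could then split the string on `;` first and on `,` second to recover the two lists.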
From: <den...@us...> - 2009-07-23 05:13:33
Revision: 160
          http://opentrep.svn.sourceforge.net/opentrep/?rev=160&view=rev
Author:   denis_arnaud
Date:     2009-07-23 05:13:30 +0000 (Thu, 23 Jul 2009)

Log Message:
-----------
[Dev] Second attempt to fix a matching bug.

Modified Paths:
--------------
    trunk/opentrep/db/data/ref_place_names.csv
    trunk/opentrep/opentrep/bom/ResultHolder.cpp
    trunk/opentrep/opentrep/bom/ResultHolder.hpp
    trunk/opentrep/opentrep/bom/StringMatcher.cpp
    trunk/opentrep/opentrep/bom/StringMatcher.hpp

Modified: trunk/opentrep/db/data/ref_place_names.csv
===================================================================
--- trunk/opentrep/db/data/ref_place_names.csv 2009-07-23 04:35:16 UTC (rev 159)
+++ trunk/opentrep/db/data/ref_place_names.csv 2009-07-23 05:13:30 UTC (rev 160)
@@ -3550,7 +3550,7 @@
 en,cdd,cauquira,cauquira,cauquira/hn:cauquira airport
 en,cde,caledonia,caledonia,caledonia/pa
 en,cdf,cortina d'ampezzo,cortina d'ampez,cortina d'ampezzo/it:fiames
-en,cdg,paris cdg,paris cdg,paris/fr:charles de gaulle,cdg,cdg
+en,cdg,paris cdg,paris cdg,paris/fr:charles de gaulle,cdg,cdg,charles de gaulle
 en,cdh,camden,camden,camden/ar/us:harrell fld
 en,cdi,cachoeiro,cachoeiro,cachoeiro de i/es/br:cachoeiro
 en,cdj,conceicao do arag,conceicao do ar,conceicao do arag/pa/br

Modified: trunk/opentrep/opentrep/bom/ResultHolder.cpp
===================================================================
--- trunk/opentrep/opentrep/bom/ResultHolder.cpp 2009-07-23 04:35:16 UTC (rev 159)
+++ trunk/opentrep/opentrep/bom/ResultHolder.cpp 2009-07-23 05:13:30 UTC (rev 160)
@@ -68,30 +68,53 @@
   }

   // //////////////////////////////////////////////////////////////////////
-  std::string ResultHolder::
-  searchString (Xapian::MSet& ioMatchingSet,
-                TravelQuery_T& ioPartialQueryString,
-                NbOfErrors_T& ioMaxEditDistance,
-                bool ioHasReachedMaximalAllowableEditDistance,
-                Document& ioMatchingDocument) {
+  std::string ResultHolder::searchString (Xapian::MSet& ioMatchingSet,
+                                          TravelQuery_T& ioPartialQueryString,
+                                          Document& ioMatchingDocument) {
     std::string oMatchedString;

     // Catch any Xapian::Error exceptions thrown
     try {
+      /**
+         The query string must first be checked, without allowing any
+         spelling errors, but by removing the furthest right word at
+         every step.
+         <br>If no match is found, the maximal allowable edit
+         distance/error becomes 1, and the process (trying to match
+         the whole sentence, then by removing the furthest right word,
+         etc.) is re-performed.
+         <br>If no match is found, the maximal allowable edit
+         distance/error becomes 2.
+         <br>And so on until the maximum of the edit distance/error
+         becomes greater than the maximal allowable distance/error.
+         reached.
+      */
+      NbOfErrors_T lMaxEditDistance = 0;
+      bool hasReachedMaximalAllowableEditDistance = false;
       bool shouldStop = false;
       while (shouldStop == false) {
-        // Retrieve the list of documents matching the query string
+
+        // DEBUG
+        OPENTREP_LOG_DEBUG ("--------");
+        OPENTREP_LOG_DEBUG ("Current query string: `"
+                            << ioPartialQueryString
+                            << "', with a maximal edit distance of "
+                            << lMaxEditDistance << ".");
+
+        // Retrieve the list of Xapian documents matching the query string
         oMatchedString =
           StringMatcher::searchString (ioMatchingSet,
                                        ioPartialQueryString,
-                                       ioMaxEditDistance,
-                                       ioHasReachedMaximalAllowableEditDistance,
+                                       lMaxEditDistance,
+                                       hasReachedMaximalAllowableEditDistance,
                                        _database);
         // DEBUG
-        OPENTREP_LOG_DEBUG ("Current initial query string: `"
+        OPENTREP_LOG_DEBUG ("---- Current query string: `"
                             << ioPartialQueryString
                             << "' --- Kept query: `"
-                            << oMatchedString << "' for "
+                            << oMatchedString
+                            << "', with a maximal edit distance of "
+                            << lMaxEditDistance << ", for "
                             << ioMatchingSet.size() << " matches.");

         if (ioMatchingSet.empty() == false) {
@@ -106,22 +129,15 @@
           break;
         }

-        // Since the query, as is, yield no match, the furthest right
-        // word must be removed from the query string.
-        StringMatcher::removeFurthestRightWord (ioPartialQueryString);
-
+        // Allow for one more spelling error
+        ++lMaxEditDistance;
+
         /**
-           Stop when the resulting string gets empty.
-
-           <br>Note that whether maximal allowable edit distance/error
-           has been reached is not checked at that stage. That
-           algorithm is performed independently for each level of
-           maximal allowable edit distance/error. Only the caller
-           (below) retriggers this process by changing the level of
-           maximal allowable edit distance/error, until that latter be
-           reached.
+           Stop when it is no longer necessary to increase the maximal
+           allowable edit distance, as it is already greater than the
+           maximum of the calculated edit distance.
         */
-        if (ioPartialQueryString.empty() == true) {
+        if (hasReachedMaximalAllowableEditDistance == true) {
           shouldStop = true;
         }
       }
@@ -141,39 +157,26 @@

     // Catch any Xapian::Error exceptions thrown
     try {
-      bool shouldStop = false;
-      NbOfErrors_T lMaxEditDistance = 0;
-
       /**
-         The query string must first be checked, without allowing any
-         spelling errors, but by removing the furthest right word at
-         every step.
-         <br>If no match is found, the maximal allowable edit
-         distance/error becomes 1, and the process (trying to match
-         the whole sentence, then by removing the furthest right word,
-         etc.) is re-performed.
-         <br>If no match is found, the maximal allowable edit
-         distance/error becomes 2.
-         <br>And so on until the maximum of the edit distance/error
-         becomes greater than the maximal allowable distance/error.
-         reached.
+         A copy of the query is made, as that copy will be altered by
+         the below process, whereas a clean copy needs to be reprocessed
+         for each level of maximal edit distance/error.
+         <br>However, in case of match, the modifications on the query
+         string (lPartialQueryString) must be replicated on the
+         original one (ioPartialQueryString).
       */
+      TravelQuery_T lPartialQueryString (ioPartialQueryString);
+
+      bool shouldStop = false;
       while (shouldStop == false) {
-        /**
-           A copy of the query is made, as that copy will be altered by
-           the below process, whereas a clean copy needs to be reprocessed
-           for each level of maximal edit distance/error.
-           <br>However, at the end, the modifications on the query
-           string (lPartialQueryString) must be replicated on the
-           original one (ioPartialQueryString).
-        */
-        TravelQuery_T lPartialQueryString (ioPartialQueryString);
+        // DEBUG
+        OPENTREP_LOG_DEBUG ("----------------");
+        OPENTREP_LOG_DEBUG ("Current query string: `"
+                            << lPartialQueryString << "'");
+
         Xapian::MSet lMatchingSet;
-        bool hasReachedMaximalAllowableEditDistance = false;
         oMatchedString = searchString (lMatchingSet, lPartialQueryString,
-                                       lMaxEditDistance,
-                                       hasReachedMaximalAllowableEditDistance,
                                        ioMatchingDocument);

         if (oMatchedString.empty() == false) {
@@ -189,17 +192,28 @@
           break;
         }

-        // Allow for one more spelling error
-        ++lMaxEditDistance;
+        // Since the query, as is, yields no match, the furthest right
+        // word must be removed from the query string.
+        StringMatcher::removeFurthestRightWord (lPartialQueryString);
+
+        /**
+           Stop when the resulting string gets empty.

-        /**
-           Stop when it is no longer necessary to increase the maximal
-           allowable edit distance, as it is already greater than the
-           maximum of the calculated edit distance.
+           <br>Note that whether maximal allowable edit distance/error
+           has been reached is not checked at that stage. That
+           algorithm is performed independently for each level of
+           maximal allowable edit distance/error. Only the caller
+           (below) retriggers this process by changing the level of
+           maximal allowable edit distance/error, until that latter be
+           reached.
         */
-        if (hasReachedMaximalAllowableEditDistance == true) {
-          ioPartialQueryString = lPartialQueryString;
+        if (lPartialQueryString.empty() == true) {
           shouldStop = true;
+
+          // DEBUG
+          OPENTREP_LOG_DEBUG ("----------------");
+          OPENTREP_LOG_DEBUG ("Still no match for current query string: `"
+                              << ioPartialQueryString << "'");
         }
       }

@@ -221,7 +235,7 @@
     bool shouldStop = false;
     while (shouldStop == false) {
       // DEBUG
-      OPENTREP_LOG_DEBUG ("---------------------")
+      OPENTREP_LOG_DEBUG ("+++++++++++++++++++++")
       OPENTREP_LOG_DEBUG ("Remaining part of the query string: `"
                           << lRemainingQueryString << "'");


Modified: trunk/opentrep/opentrep/bom/ResultHolder.hpp
===================================================================
--- trunk/opentrep/opentrep/bom/ResultHolder.hpp 2009-07-23 04:35:16 UTC (rev 159)
+++ trunk/opentrep/opentrep/bom/ResultHolder.hpp 2009-07-23 05:13:30 UTC (rev 160)
@@ -48,16 +48,10 @@
     /** Retrieve the document best matching the query string.
         @param Xapian::MSet& The Xapian matching set. It can be empty.
         @param TravelQuery_T& The partial query string.
-        @param NbOfErrors_T& The maximal allowable edit distance/error.
-        @param bool Whether or not the maximal allowable edit distance/error
-               has become greater than the maximum of the edit distance/errors
-               calculated on the phrase.
         @param MatchingDocument_T& The best matching Xapian document (if found).
         @return bool Whether such a best matching document has been found. */
     std::string searchString (Xapian::MSet& ioMatchingSet,
                               TravelQuery_T& ioPartialQueryString,
-                              NbOfErrors_T& ioMaxEditDistance,
-                              bool ioHasReachedMaximalAllowableEditDistance,
                               Document& ioMatchingDocument);

     /** Retrieve the document best matching the query string.

Modified: trunk/opentrep/opentrep/bom/StringMatcher.cpp
===================================================================
--- trunk/opentrep/opentrep/bom/StringMatcher.cpp 2009-07-23 04:35:16 UTC (rev 159)
+++ trunk/opentrep/opentrep/bom/StringMatcher.cpp 2009-07-23 05:13:30 UTC (rev 160)
@@ -92,7 +92,7 @@
   searchString (Xapian::MSet& ioMatchingSet,
                 const std::string& iSearchString,
                 NbOfErrors_T& ioMaxEditDistance,
-                bool ioHasReachedMaximalAllowableEditDistance,
+                bool& ioHasReachedMaximalAllowableEditDistance,
                 const Xapian::Database& ioDatabase) {
     NbOfErrors_T lMaxEditDistance = std::numeric_limits<EditDistance_T>::min();
@@ -176,9 +176,11 @@
       int nbMatches = ioMatchingSet.size();

       // DEBUG
+      /*
       OPENTREP_LOG_DEBUG ("Original query `" << lOriginalQueryString
                           << "', i.e., `" << lOriginalQuery.get_description()
                           << "' => " << nbMatches << " results found");
+      */

       /**
          When no match is found, we search on the corrected phrase/string
@@ -241,10 +243,12 @@
         nbMatches = ioMatchingSet.size();

         // DEBUG
+        /*
         OPENTREP_LOG_DEBUG ("Corrected query `" << lCorrectedQueryString
                             << "', i.e., `" << lCorrectedQuery.get_description()
                             << "' => " << nbMatches
                             << " results found on corrected string");
+        */

         if (nbMatches != 0) {
           /**
@@ -300,6 +304,7 @@
         nbMatches = ioMatchingSet.size();

         // DEBUG
+        /*
         OPENTREP_LOG_DEBUG ("Query corrected as a full sentence `"
                             << lFullWordCorrectedString
                             << "' with an allowable maximal edit distance of "
@@ -308,6 +313,7 @@
                             << ", i.e., `"<< lFullQueryCorrected.get_description()
                             << "' => " << nbMatches
                             << " results found on corrected full string");
+        */

         if (nbMatches != 0) {
           oMatchedString = lFullWordCorrectedString;
@@ -329,7 +335,7 @@
          of the calculated edit distance, it becomes useless to go on
          increasing the maximal allowable edit distance.
       */
-      if (lMaxEditDistance <= ioMaxEditDistance) {
+      if (ioMaxEditDistance >= lMaxEditDistance) {
         ioHasReachedMaximalAllowableEditDistance = true;
       }


Modified: trunk/opentrep/opentrep/bom/StringMatcher.hpp
===================================================================
--- trunk/opentrep/opentrep/bom/StringMatcher.hpp 2009-07-23 04:35:16 UTC (rev 159)
+++ trunk/opentrep/opentrep/bom/StringMatcher.hpp 2009-07-23 05:13:30 UTC (rev 160)
@@ -29,7 +29,7 @@
         @param Xapian::MSet& The Xapian matching set. It can be empty.
         @param const std::string& The query string.
         @param NbOfErrors_T& The maximal allowable edit distance/error.
-        @param bool Whether or not the maximal allowable edit distance/error
+        @param bool& Whether or not the maximal allowable edit distance/error
                has become greater than the maximum of the edit distance/errors
                calculated on the phrase.
         @param const Xapian::Database& The Xapian index/database.
@@ -38,7 +38,7 @@
     static std::string searchString (Xapian::MSet&,
                                      const std::string& iSearchString,
                                      NbOfErrors_T& ioMaxEditDistance,
-                                     bool ioHasReachedMaximalAllowableEditDistance,
+                                     bool& ioHasReachedMaximalAllowableEditDistance,
                                      const Xapian::Database&);

     /** Extract the best matching Xapian document.


This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.
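The control flow that revision 160 ends up with can be hard to follow from the hunks alone, so here is a minimal standalone sketch of it. After this change, the inner loop keeps the query fixed and raises the allowed edit distance from 0 until a ceiling is reached, while the outer loop drops the furthest right word and starts again. Everything except that loop structure is an assumption made for illustration: a plain list of phrases (`PhraseIndex_T`) stands in for the Xapian index, `matchWithin` and `searchLongestPrefix` are hypothetical names, and the ceiling rule (one error per three characters) is invented. The real rule is computed inside `StringMatcher::searchString`.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the Xapian index: a plain list of known phrases.
typedef std::vector<std::string> PhraseIndex_T;

// Plain Levenshtein distance (insertion, deletion, substitution cost 1 each).
std::size_t editDistance (const std::string& iA, const std::string& iB) {
  std::vector<std::size_t> lPrev (iB.size() + 1), lCurr (iB.size() + 1);
  for (std::size_t j = 0; j <= iB.size(); ++j) { lPrev[j] = j; }
  for (std::size_t i = 1; i <= iA.size(); ++i) {
    lCurr[0] = i;
    for (std::size_t j = 1; j <= iB.size(); ++j) {
      const std::size_t lSubst = lPrev[j-1] + ((iA[i-1] == iB[j-1]) ? 0 : 1);
      lCurr[j] = std::min (lSubst, std::min (lPrev[j], lCurr[j-1]) + 1);
    }
    lPrev.swap (lCurr);
  }
  return lPrev[iB.size()];
}

// One attempt at a fixed error budget. The flag is an out-parameter, so it
// has to be a reference: passed by value, the caller would never see it set.
std::string matchWithin (const PhraseIndex_T& iIndex, const std::string& iQuery,
                         const std::size_t iMaxEditDistance,
                         bool& ioHasReachedCeiling) {
  // Assumption (not the project's rule): one error per three characters.
  const std::size_t lCeiling = iQuery.size() / 3;
  if (iMaxEditDistance >= lCeiling) {
    ioHasReachedCeiling = true;
  }

  // Keep the closest phrase whose distance fits within the budget
  std::string oBest;
  std::size_t lBestDistance = iMaxEditDistance + 1;
  for (PhraseIndex_T::const_iterator itPhrase = iIndex.begin();
       itPhrase != iIndex.end(); ++itPhrase) {
    const std::size_t lDistance = editDistance (iQuery, *itPhrase);
    if (lDistance < lBestDistance) {
      lBestDistance = lDistance;
      oBest = *itPhrase;
    }
  }
  return oBest;
}

// Inner loop: same query, error budget raised from 0 up to the ceiling.
std::string searchString (const PhraseIndex_T& iIndex,
                          const std::string& iQuery) {
  std::size_t lMaxEditDistance = 0;
  bool hasReachedCeiling = false;
  while (hasReachedCeiling == false) {
    const std::string lMatch =
      matchWithin (iIndex, iQuery, lMaxEditDistance, hasReachedCeiling);
    if (lMatch.empty() == false) {
      return lMatch;
    }
    // Allow for one more spelling error
    ++lMaxEditDistance;
  }
  return std::string();
}

// Drop the last word; a single-word query becomes empty.
void removeFurthestRightWord (std::string& ioQuery) {
  const std::string::size_type lPos = ioQuery.rfind (' ');
  if (lPos == std::string::npos) {
    ioQuery.clear();
  } else {
    ioQuery.erase (lPos);
  }
}

// Outer loop: retry on a working copy with the furthest right word removed.
std::string searchLongestPrefix (const PhraseIndex_T& iIndex,
                                 const std::string& iQuery) {
  std::string lPartialQuery (iQuery);
  while (lPartialQuery.empty() == false) {
    const std::string lMatch = searchString (iIndex, lPartialQuery);
    if (lMatch.empty() == false) {
      return lMatch;
    }
    removeFurthestRightWord (lPartialQuery);
  }
  return std::string();
}
```

With an index holding `san francisco` and `rio de janeiro`, `searchLongestPrefix (index, "sna francicso rio")` returns `san francisco`. The reference on `ioHasReachedCeiling` mirrors the `bool` to `bool&` change in `StringMatcher::searchString`: without it, the inner loop in this sketch would never terminate on a query that has no match.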
From: <den...@us...> - 2009-07-24 05:35:01
Revision: 164
          http://opentrep.svn.sourceforge.net/opentrep/?rev=164&view=rev
Author:   denis_arnaud
Date:     2009-07-24 05:34:50 +0000 (Fri, 24 Jul 2009)

Log Message:
-----------
[DB] Suppressed the classical_name2 column of the ref_place_names table.

Modified Paths:
--------------
    trunk/opentrep/db/data/ref_place_names.csv
    trunk/opentrep/db/maintenance/create_and_fill_mysql_db.sh
    trunk/opentrep/db/maintenance/tables/ref_city.sql
    trunk/opentrep/db/maintenance/tables/ref_db.sql
    trunk/opentrep/opentrep/bom/ResultHolder.cpp
    trunk/opentrep/opentrep/command/DBManager.cpp
    trunk/opentrep/opentrep/command/IndexBuilder.cpp
    trunk/opentrep/opentrep/dbadaptor/DbaPlace.cpp
    trunk/opentrep/test/xapian/Makefile

Added Paths:
-----------
    trunk/opentrep/db/maintenance/export_ref_data.sh
    trunk/opentrep/db/maintenance/tables/export_ref_place_details.sql
    trunk/opentrep/db/maintenance/tables/export_ref_place_names.sql
    trunk/opentrep/db/maintenance/uptade_db_01.sh
    trunk/opentrep/test/xapian/string_simple_search.cpp

Property Changed:
----------------
    trunk/opentrep/test/xapian/

Modified: trunk/opentrep/db/data/ref_place_names.csv
===================================================================
--- trunk/opentrep/db/data/ref_place_names.csv 2009-07-23 17:47:31 UTC (rev 163)
+++ trunk/opentrep/db/data/ref_place_names.csv 2009-07-24 05:34:50 UTC (rev 164)
@@ -1,11127 +1,11127 @@
-concat('en,', code, ',', ticketing_name, ',', teleticketing_name, ',', extended_name)
-en,cyz,cauayan,cauayan,cauayan/ph
-en,cza,chichen itza,chichen itza,chichen itza/mx
-en,czb,cruz alta,cruz alta,cruz alta/rs/br:carlos ruhl
-en,czc,copper center,copper center,copper center/ak/us
-en,cze,coro,coro,coro/ve
-en,czf,cape romanzof,cape romanzof,cape romanzof/ak/us
-en,czh,corozal,corozal,corozal/bz
-en,czj,corazon de jesus,corazon de jesu,corazon de jesus/pa
-en,czk,cascade locks,cascade locks,cascade locks/or/us:cascade l
-en,czl,constantine,constantine,constantine/dz:ain el bey
-en,czm,cozumel,cozumel,cozumel/mx
-en,czn,chisana,chisana,chisana/ak/us:chisana field -en,czo,chistochina,chistochina,chistochina/ak/us -en,czp,cape pole,cape pole,cape pole/ak/us -en,czs,cruzeiro do sul,cruzeiro do sul,cruzeiro do sul/ac/br:campo -en,czt,carrizo springs,carrizo springs,carrizo springs/tx/us -en,czu,corozal,corozal,corozal/co -en,czw,czestochowa,czestochowa,czestochowa/pl -en,czx,changzhou,changzhou,changzhou/cn -en,czy,cluny,cluny,cluny/ql/au -en,czz,campo,campo,campo/ca/us -en,daa,fort belvoir,fort belvoir,fort belvoir/va/us:davison aaf -en,dab,daytona beach,daytona beach,daytona beach/fl/us:regional -en,dac,dhaka,dhaka,dhaka/bd:zia international -en,dad,da nang,da nang,da nang/vn -en,dae,daparizo,daparizo,daparizo/in -en,daf,daup,daup,daup/pg -en,dag,daggett,daggett,daggett/ca/us:barstow daggett -en,dah,dathina,dathina,dathina/ye -en,dai,darjeeling,darjeeling,darjeeling/in -en,daj,dauan island,dauan island,dauan island/ql/au -en,dak,dakhla,dakhla,dakhla oasis/eg:dakhla -en,dal,dallas dal,dallas dal,dallas/tx/us:love field -en,dam,damascus,damascus,damascus/sy:intl -en,dan,danville,danville,danville/va/us:municipal -en,dao,dabo,dabo,dabo/pg -en,dap,darchula,darchula,darchula/np -en,dar,dar es salaam,dar es salaam,dar es salaam/tz:intl -en,das,great bear lake,great bear lake,great bear lake/nt/ca -en,dat,datong,datong,datong/cn -en,dau,daru,daru,daru/pg -en,dav,david,david,david/pa:enrique malek -en,dax,daxian,daxian,daxian/cn -en,day,dayton day,dayton day,dayton/oh/us:james cox dayton -en,daz,darwaz,darwaz,darwaz/af -en,dba,dalbandin,dalbandin,dalbandin/pk -en,dbb,dabaa alalamain,dabaa alalamain,dabaa city/eg:alalamain intl -en,dbd,dhanbad,dhanbad,dhanbad/in -en,dbm,debra marcos,debra marcos,debra marcos/et -en,dbn,dublin,dublin,dublin/ga/us:municipal -en,dbo,dubbo,dubbo,dubbo/ns/au -en,dbp,debepare,debepare,debepare/pg -en,dbq,dubuque,dubuque,dubuque/ia/us:dubuque mnpl -en,dbs,dubois,dubois,dubois/id/us -en,dbt,debra tabor,debra tabor,debra tabor/et 
-en,dbu,dambula,dambula,dambula/lk:dambula oya tank -en,dbv,dubrovnik,dubrovnik,dubrovnik/hr -en,dby,dalby,dalby,dalby/ql/au -en,dca,washington dca,washington dca,washington/dc/us:r reagan nat -en,dcf,dominica dcf,dominica dcf,dominica/dm:cane field -en,dci,decimomannu,decimomannu,decimomannu/it:r decimomannu -en,dck,dahl creek,dahl creek,dahl creek/ak/us:dahl creek -en,dcm,castres,castres,castres/fr:mazamet -en,dcn,derby dcn,derby dcn,derby/wa/au:curtin -en,dcr,decatur,decatur,decatur/in/us:decatur hi way -en,dct,duncan town,duncan town,duncan town/bs -en,dcu,decatur,decatur,decatur/al/us:pyor -en,ddc,dodge city,dodge city,dodge city/ks/us:mncpl -en,ddg,dandong,dandong,dandong/cn -en,ddi,daydream is,daydream is,daydream is/ql/au -en,ddm,dodoima,dodoima,dodoima/pg -en,ddn,delta downs,delta downs,delta downs/ql/au -en,ddp,dorado,dorado,dorado/pr:dorado beach -en,ddu,dadu,dadu,dadu/pk -en,bfs,belfast intl,belfast intl,belfast/gb:international -en,bft,beaufort,beaufort,beaufort/sc/us:county -en,bfu,bengbu,bengbu,bengbu/cn -en,bfv,buri ram,buri ram,buri ram/th -en,bfw,sidi belabbes,sidi belabbes,sidi belabbes/dz:sidi belabbes -en,bfx,bafoussam,bafoussam,bafoussam/cm -en,bga,bucaramanga,bucaramanga,bucaramanga/co:palo negro -en,bgb,booue,booue,booue/ga -en,bgc,braganca,braganca,braganca/pt -en,bgd,borger,borger,borger/tx/us -en,bge,bainbridge,bainbridge,bainbridge/ga/us:decatur cnty -en,bgf,bangui,bangui,bangui/cf -en,bgg,bongouanou,bongouanou,bongouanou/ci -en,bgh,boghe,boghe,boghe/mr:abbaye -en,bgi,bridgetown,bridgetown,bridgetown/bb:grantley adams -en,bgj,borgarfjordur eys,borgarfjordur e,borgarfjordur eystri/is -en,bgk,big creek,big creek,big creek/bz -en,bgl,baglung,baglung,baglung/np -en,bgm,binghamton,binghamton,binghamton/ny/us -en,bgn,brueggen,brueggen,brueggen/de:r a f -en,bgo,bergen,bergen,bergen/no:flesland -en,bgp,bongo,bongo,bongo/ga -en,bgq,big lake,big lake,big lake/ak/us -en,bgr,bangor,bangor,bangor/me/us:international -en,bgs,big spring afb,big spring 
afb,big spring/tx/us:webb afb -en,bgt,bagdad,bagdad,bagdad/az/us -en,bgu,bangassou,bangassou,bangassou/cf -en,bgv,bento goncalves,bento goncalves,bento goncalves/rs/br -en,bgw,baghdad bgw,baghdad bgw,baghdad/iq:baghdad intl -en,bgx,bage,bage,bage/rs/br -en,bgy,milan bgy,milan bgy,milan/it:orio al serio -en,bgz,braga,braga,braga/pt -en,bha,bahia de caraquez,bahia de caraqu,bahia de caraquez/ec -en,bhb,bar harbor,bar harbor,bar harbor/me/us -en,bhc,bhurban,bhurban,bhurban/pk:heliport -en,bhd,belfast city,belfast city,belfast/gb:belfast city -en,bhe,blenheim,blenheim,blenheim/nz -en,bhf,bahia cupica,bahia cupica,bahia cupica/co -en,bhg,brus laguna,brus laguna,brus laguna/hn -en,bhh,bisha,bisha,bisha/sa -en,bhi,bahia blanca,bahia blanca,bahia blanca/ba/ar:comandante -en,bhj,bhuj,bhuj,bhuj/in:rudra mata -en,bhk,bukhara,bukhara,bukhara/uz -en,bhl,bahia angeles,bahia angeles,bahia angeles/mx -en,bhm,birmingham,birmingham,birmingham/al/us -en,bhn,beihan,beihan,beihan/ye -en,bho,bhopal,bhopal,bhopal/in -en,bhp,bhojpur,bhojpur,bhojpur/np -en,bhq,broken hill,broken hill,broken hill/ns/au -en,bhr,bharatpur,bharatpur,bharatpur/np -en,bhs,bathurst,bathurst,bathurst/ns/au:raglan -en,bht,brighton downs,brighton downs,brighton downs/ql/au -en,bhu,bhavnagar,bhavnagar,bhavnagar/in -en,bhv,bahawalpur,bahawalpur,bahawalpur/pk -en,bhw,sargodha bhw,sargodha bhw,sargodha/pk:bhagatanwala -en,bhx,birmingham,birmingham,birmingham/gb:international -en,bhy,beihai,beihai,beihai/cn -en,bhz,belo horizonte,belo horizonte,belo horizonte/mg/br -en,bia,bastia,bastia,bastia/fr:poretta -en,bib,baidoa,baidoa,baidoa/so -en,bic,big creek,big creek,big creek/ak/us -en,bid,block island,block island,block island/ri/us -en,bie,beatrice,beatrice,beatrice/ne/us -en,bif,el paso biggs aaf,el paso biggs a,el paso/tx/us:biggs aaf -en,big,big delta,big delta,big delta/ak/us:intermediate -en,bih,bishop,bishop,bishop/ca/us -en,bii,bikini atoll,bikini atoll,bikini atoll/mh:enyu airfield -en,bij,biliau,biliau,biliau/pg 
-en,bik,biak,biak,biak/id:mokmer -en,bil,billings,billings,billings/mt/us -en,bim,bimini int,bimini int,bimini/bs:international -en,bin,bamiyan,bamiyan,bamiyan/af -en,bio,bilbao,bilbao,bilbao/es -en,dea,dera ghazi khan,dera ghazi khan,dera ghazi khan/pk -en,deb,debrecen,debrecen,debrecen/hu -en,isd,iscuande,iscuande,iscuande/co -en,ise,isparta,isparta,isparta/tr -en,isg,ishigaki,ishigaki,ishigaki/jp -en,ish,ischia,ischia,ischia/it -en,isi,isisford,isisford,isisford/ql/au -en,isj,isla mujeres,isla mujeres,isla mujeres/mx -en,isk,nasik,nasik,nasik/in:gandhinagar apt -en,isl,isabel pass,isabel pass,isabel pass/ak/us -en,ism,kissimmee,kissimmee,kissimmee/fl/us:municipal -en,isn,williston,williston,williston/nd/us:sloulin field -en,iso,kinston,kinston,kinston/nc/us:stallings field -en,isp,islip,islip,islip/ny/us:long island macar -en,isq,manistique,manistique,manistique/mi/us:schoolcraft -en,iss,wiscasset,wiscasset,wiscasset/me/us -en,ist,istanbul,istanbul,istanbul/tr:ataturk -en,isu,sulaymaniyah,sulaymaniyah,sulaymaniyah/iq:international -en,isw,wisconsin rapids,wisconsin rapid,wisconsin rapids/wi/us:alexan -en,ita,itacoatiara,itacoatiara,itacoatiara/am/br -en,itb,itaituba,itaituba,itaituba/pa/br -en,ite,itubera,itubera,itubera/ba/br -en,ith,ithaca,ithaca,ithaca/ny/us:tompkins county -en,iti,itambacuri,itambacuri,itambacuri/mg/br -en,itj,itajai,itajai,itajai/sc/br:off- -en,itk,itokama,itokama,itokama/pg -en,itm,osaka itami apt,osaka itami apt,osaka itami apt/jp:itami -en,itn,itabuna,itabuna,itabuna/ba/br -en,ito,hilo,hilo,hilo/hi/us:hilo international -en,itp,itaperuna,itaperuna,itaperuna/rj/br:itaperuna -en,itq,itaqui,itaqui,itaqui/rs/br -en,itr,itumbiara,itumbiara,itumbiara/go/br:itumbiara -en,iue,niue island,niue island,niue island/nu:hanan -en,iul,ilu,ilu,ilu/id -en,ium,summit lake,summit lake,summit lake/bc/ca -en,ius,inus,inus,inus/pg -en,iva,ambanja,ambanja,ambanja/mg -en,ivc,invercargill,invercargill,invercargill/nz -en,ivg,ivangrad,ivangrad,ivangrad/me 
-en,ivh,ivishak,ivishak,ivishak/ak/us -en,ivl,ivalo,ivalo,ivalo/fi -en,ivo,chivolo,chivolo,chivolo/co -en,ivr,inverell,inverell,inverell/ns/au -en,ivw,inverway,inverway,inverway/nt/au -en,iwa,ivanova,ivanova,ivanova/ru -en,iwd,ironwood,ironwood,ironwood/mi/us:gogebic county -en,iwj,iwami,iwami,iwami/jp:iwami aiport -en,iwo,iwo jima vol,iwo jima vol,iwo jima vol/jp:iwo jima base -en,iws,houston west,houston west,houston/tx/us:west houston -en,ixa,agartala,agartala,agartala/in:singerbhil -en,ixb,bagdogra,bagdogra,bagdogra/in -en,ixc,chandigarh,chandigarh,chandigarh/in -en,ixd,allahabad,allahabad,allahabad/in:bamrauli -en,ixe,mangalore,mangalore,mangalore/in:bajpe -en,ixg,belgaum,belgaum,belgaum/in:sambre -en,ixh,kailashahar,kailashahar,kailashahar/in -en,ixi,lilabari,lilabari,lilabari/in -en,ixj,jammu,jammu,jammu/in:satwari -en,ixk,keshod,keshod,keshod/in -en,ixl,leh,leh,leh/in -en,ixm,madurai,madurai,madurai/in -en,ixn,khowai,khowai,khowai/in -en,ixp,pathankot,pathankot,pathankot/in -en,ixq,kamalpur,kamalpur,kamalpur/in -en,ixr,ranchi,ranchi,ranchi/in -en,ixs,silchar,silchar,silchar/in:kumbhirgram -en,ixt,pasighat,pasighat,pasighat/in -en,ixu,aurangabad,aurangabad,aurangabad/in:chikkalthana -en,ixv,along,along,along/in -en,ixw,jamshedpur,jamshedpur,jamshedpur/in:sonari -en,ixy,kandla,kandla,kandla/in -en,ixz,port blair,port blair,port blair/in -en,iyk,inyokern,inyokern,inyokern/ca/us:kern county -en,izm,izmir,izmir,izmir/tr -en,izo,izumo,izumo,izumo/jp -en,izt,ixtepec,ixtepec,ixtepec/mx -en,jaa,jalalabad,jalalabad,jalalabad/af -en,cew,crestview,crestview,crestview/fl/us:bob sikes -en,cex,chena hot springs,chena hot sprin,chena hot springs/ak/us -en,hoy,hoy island,hoy island,hoy island/gb -en,hpa,ha'apai,ha'apai,ha'apai/to:salote pilolevu -en,hpb,hooper bay,hooper bay,hooper bay/ak/us -en,hpe,hope vale,hope vale,hope vale/ql/au -en,hph,haiphong,haiphong,haiphong/vn:catbi -en,hpn,westchester count,westchester cou,westchester county/ny/us:westc 
-en,hpp,poipet,poipet,poipet/kh -en,hpr,pretoria hpr,pretoria hpr,pretoria/za:central hpr -en,hpt,hampton,hampton,hampton/ia/us:municipal -en,hpv,kauai island hpv,kauai island hp,kauai island/hi/us:princeville -en,hpy,baytown,baytown,baytown/tx/us -en,hqm,aberdeen,aberdeen,aberdeen/wa/us -en,hra,mansehra,mansehra,mansehra/pk -en,hrb,harbin,harbin,harbin/cn -en,hrc,zhairem,zhairem,zhairem/kz -en,hrd,harstad,harstad,harstad/no:harstad -en,hre,harare,harare,harare/zw -en,hrg,hurghada,hurghada,hurghada/eg -en,hrj,chaurjhari,chaurjhari,chaurjhari/np -en,hrk,kharkov,kharkov,kharkov/ua -en,hrl,harlingen,harlingen,harlingen/tx/us:valley intl -en,hrm,hassi r'mel,hassi r'mel,hassi r'mel/dz:tilrempt -en,hrn,heron island,heron island,heron island/ql/au:heliport -en,hro,harrison,harrison,harrison/ar/us:boone county -en,hrr,herrera,herrera,herrera/co -en,hrs,harrismith,harrismith,harrismith/za:harrismith -en,hrt,harrogate,harrogate,harrogate/gb:linton on ouse -en,hry,henbury,henbury,henbury/nt/au -en,hrz,horizontina,horizontina,horizontina/rs/br -en,hsb,harrisburg,harrisburg,harrisburg/il/us:raleigh -en,hsc,shaoguan,shaoguan,shaoguan/cn -en,hsg,saga,saga,saga/jp:saga airport -en,hsh,las vegas hsh,las vegas hsh,las vegas/nv/us:henderson sky -en,hsi,hastings,hastings,hastings/ne/us -en,hsk,huesca,huesca,huesca/es -en,hsl,huslia,huslia,huslia/ak/us -en,hsm,horsham,horsham,horsham/vi/au -en,hsn,zhoushan,zhoushan,zhoushan/cn -en,hsp,hot springs,hot springs,hot springs/va/us:ingalls fld -en,hss,hissar,hissar,hissar/in -en,hst,homestead,homestead,homestead/fl/us:afb -en,hsv,huntsvil decatur,huntsvil decatu,huntsville/al/us:intl apt -en,hsz,hsinchu,hsinchu,hsinchu/tw -en,hta,chita,chita,chita/ru -en,htb,terre de bas,terre de bas,terre de bas/gp -en,htf,hatfield,hatfield,hatfield/gb -en,htg,hatanga,hatanga,hatanga/ru -en,hth,hawthorne,hawthorne,hawthorne/nv/us -en,hti,hamilton island,hamilton island,hamilton island/ql/au -en,htl,houghton,houghton,houghton/mi/us:roscommon cnty 
-en,htm,khatgal,khatgal,khatgal/mn -en,htn,hotan,hotan,hotan/cn -en,hto,east hampton,east hampton,east hampton/ny/us -en,htr,hateruma,hateruma,hateruma/jp -en,hts,huntington,huntington,huntington/wv/us:tri state -en,htu,hopetoun,hopetoun,hopetoun/vi/au -en,htv,huntsville,huntsville,huntsville/tx/us -en,htw,chesapeake,chesapeake,huntington/oh/us -en,hty,hatay,hatay,hatay/tr:hatay -en,htz,hato corozal,hato corozal,hato corozal/co -en,hua,huntsville aaf,huntsville aaf,huntsville/al/us:redstone aaf -en,hub,humbert river,humbert river,humbert river/nt/au -en,huc,humacao arpt,humacao arpt,humacao/pr:humacao arpt -en,hud,humboldt,humboldt,humboldt/ia/us -en,hue,humera,humera,humera/et -en,huf,terre haute,terre haute,terre haute/in/us:hulman field -en,hug,huehuetenango,huehuetenango,huehuetenango/gt -en,huh,huahine,huahine,huahine/pf:huahine -en,hui,hue,hue,hue/vn:phu bai -en,huj,hugo,hugo,hugo/ok/us -en,huk,hukuntsi,hukuntsi,hukuntsi/bw -en,hul,houlton,houlton,houlton/me/us:international -en,hum,houma,houma,houma/la/us:terrebonne -en,hun,hualien,hualien,hualien/tw -en,huq,houn,houn,houn/ly -en,gtc,green turtle,green turtle,green turtle/bs -en,gte,groote eylandt,groote eylandt,groote eylandt/nt/au:alyangula -en,gtf,great falls intl,great falls int,great falls/mt/us:intl -en,gtg,grantsburg,grantsburg,grantsburg/wi/us:municipal -en,gti,guettin,guettin,guettin/de -en,gtk,sungei tekai,sungei tekai,sungei tekai/my -en,gtn,mount cook gtn,mount cook gtn,mount cook/nz:glentanner -en,gto,gorontalo,gorontalo,gorontalo/id:tolotio -en,gtp,grants pass,grants pass,grants pass/or/us -en,gtr,columbus gtr,columbus gtr,columbus/ms/us:golden triangle -en,gts,granites,granites,granites/nt/au -en,gtt,georgetown,georgetown,georgetown/ql/au -en,gtw,zlin,zlin,zlin/cz:holesov -en,gty,gettysburg,gettysburg,gettysburg/pa/us -en,gua,guatemala city,guatemala city,guatemala city/gt:la aurora -en,gub,guerrero negro,guerrero negro,guerrero negro/mx -en,guc,gunnison,gunnison,gunnison/co/us 
-en,gud,goundam,goundam,goundam/ml -en,gue,guriaso,guriaso,guriaso/pg -en,guf,gulf shores,gulf shores,gulf shores/al/us:edwards -en,gug,guari,guari,guari/pg -en,guh,gunnedah,gunnedah,gunnedah/ns/au -en,gui,guiria,guiria,guiria/ve -en,guj,guaratingueta,guaratingueta,guaratingueta/sp/br -en,gul,goulburn,goulburn,goulburn/ns/au -en,gum,guam won pat intl,guam won pat in,guam/gu:a.b won pat intl -en,gun,montgomery gu afb,montgomery gu a,montgomery/al/us:gunter afb -en,guo,gualaco,gualaco,gualaco/hn -en,gup,gallup,gallup,gallup/nm/us:senator clark -en,guq,guanare,guanare,guanare/ve -en,gur,alotau,alotau,alotau/pg:gurney -en,gus,peru,peru,peru/in/us:grissom afb -en,gut,guetersloh,guetersloh,guetersloh/de -en,guu,grundarfjordur,grundarfjordur,grundarfjordur/is -en,guv,mougulu,mougulu,mougulu/pg -en,guw,atyrau,atyrau,atyrau/kz -en,gux,guna,guna,guna/in -en,guy,guymon,guymon,guymon/ok/us -en,guz,guarapari,guarapari,guarapari/es/br -en,gva,geneva,geneva,geneva/ch:geneva intl -en,gve,gordonsville,gordonsville,gordonsville/va/us:municipal -en,gvi,green river,green river,green river/pg -en,gvl,gainesville,gainesville,gainesville/ga/us:lee gilmer -en,gvn,sovetskaya gavan,sovetskaya gava,sovetskaya gavan/ru:sovetskaya -en,gvp,greenvale,greenvale,greenvale/ql/au -en,gvr,governador valada,governador vala,governador valada/mg/br -en,gvt,greenville,greenville,greenville/tx/us:majors field -en,gvw,grandview,grandview,grandview/mo/us:richards gebau -en,gvx,gavle,gavle,gavle/se:sandviken -en,gwa,gwa,gwa,gwa/mm -en,gwd,gwadar,gwadar,gwadar/pk -en,gwe,gweru,gweru,gweru/zw -en,gwl,gwalior,gwalior,gwalior/in -en,gwn,gnarowein,gnarowein,gnarowein/pg -en,gwo,greenwood,greenwood,greenwood/ms/us:leflore -en,gws,glenwood springs,glenwood spring,glenwood springs/co/us -en,gwt,westerland,westerland,westerland/de:westerland sylt -en,gwv,glendale,glendale,glendale/wv/us -en,gwy,galway,galway,galway/ie:carnmore -en,gxf,seiyun,seiyun,seiyun/ye -en,gxg,negage,negage,negage/ao -en,gxh,mildenhall 
naf,mildenhall naf,mildenhall/gb:naf -en,gxq,coyhaique,coyhaique,coyhaique/cl:ten vidal -en,gxx,yagoua,yagoua,yagoua/cm -en,gxy,greeley,greeley,greeley/co/us:weld county -en,gya,guayaramerin,guayaramerin,guayaramerin/bo -en,gyd,baku intl,baku intl,baku/az:heydar aliyev intl -en,gye,guayaquil,guayaquil,guayaquil/ec:jose joaquin de o -en,gyi,gisenyi,gisenyi,gisenyi/rw -en,gyl,argyle,argyle,argyle/wa/au -en,gym,guaymas,guaymas,guaymas/mx:gen jose m yanez -en,gyn,goiania,goiania,goiania/go/br:santa genoveva -en,gyp,gympie,gympie,gympie/ql/au -en,dec,decatur,decatur,decatur/il/us:decatur apt -en,ded,dehra dun,dehra dun,dehra dun/in -en,def,dezful,dezful,dezful/ir:dezful -en,deh,decorah,decorah,decorah/ia/us:municipal -en,dei,denis island,denis island,denis island/sc -en,del,delhi,delhi,delhi/in:indira gandhi intl -en,dem,dembidollo,dembidollo,dembidollo/et -en,den,denver,denver,denver/co/us:denver intl -en,deo,dearborn,dearborn,dearborn/mi/us:hyatt regency -en,dep,deparizo,deparizo,deparizo/in -en,der,derim,derim,derim/pg -en,des,desroches,desroches,desroches/sc -en,det,detroit city,detroit city,detroit/mi/us:detroit city -en,dez,deirezzor,deirezzor,deirezzor/sy:al -en,dfi,defiance,defiance,defiance/oh/us:memorial -en,dfp,drumduff,drumduff,drumduff/ql/au -en,dfw,dallas fort worth,dallas fort wor,dallas/tx/us:dallas ft worth -en,dga,dangriga,dangriga,dangriga/bz -en,dgb,danger bay,danger bay,danger bay/ak/us -en,dgc,degahbur,degahbur,degahbur/et -en,dgd,dalgaranga,dalgaranga,dalgaranga/wa/au -en,dge,mudgee,mudgee,mudgee/ns/au -en,dgf,douglas lake,douglas lake,douglas lake/bc/ca -en,dgg,daugo,daugo,daugo/pg -en,dgk,dugong,dugong,dugong/mz -en,dgl,douglas municipal,douglas municip,douglas/az/us:municipal -en,dgm,dongguan,dongguan,dongguan/cn -en,dgn,dahlgren,dahlgren,dahlgren/va/us:naf -en,dgo,durango,durango,durango/mx:guadalupe victoria -en,dgp,daugavpils,daugavpils,daugavpils/lv -en,dgr,dargaville,dargaville,dargaville/nz -en,dgt,dumaguete,dumaguete,dumaguete/ph 
-en,dgu,dedougou,dedougou,dedougou/bf -en,dgw,douglas,douglas,douglas/wy/us:converse county -en,dha,dhahran,dhahran,dhahran/sa -en,dhd,durham downs,durham downs,durham downs/ql/au -en,dhf,abu dhabi al dhaf,abu dhabi al dh,abu dhabi/ae:al dhafra milit -en,dhi,dhangarhi,dhangarhi,dhangarhi/np -en,dhl,dhala,dhala,dhala/ye -en,dhm,dharamsala,dharamsala,dharamsala/in:gaggal airport -en,dhn,dothan,dothan,dothan/al/us:regional apt -en,dhr,den helder,den helder,den helder/nl:de kooy -en,dht,dalhart,dalhart,dalhart/tx/us -en,dib,dibrugarh,dibrugarh,dibrugarh/in:chabua -en,dic,dili,dili,dili/cd -en,die,antsiranana,antsiranana,antsiranana/mg:antsiranana -en,dig,diqing,diqing,diqing/cn:diqing -en,dij,dijon,dijon,dijon/fr -en,dik,dickinson,dickinson,dickinson/nd/us -en,dil,dili,dili,dili/tl:comoro -en,dim,dimbokro,dimbokro,dimbokro/ci -en,din,dien bien phu,dien bien phu,dien bien/vn:dien bien airport -en,dio,diomede island,diomede island,diomede island/ak/us -en,dip,diapaga,diapaga,diapaga/bf -en,diq,divinopolis,divinopolis,divinopolis/mg/br -en,dir,dire dawa,dire dawa,dire dawa/et:aba tenna d yilma -en,dis,loubomo,loubomo,loubomo/cg -en,diu,diu,diu,diu/in -en,div,divo,divo,divo/ci -en,diw,dickwella mawella,dickwella mawel,dickwella/lk:mawella lagoon -en,diy,diyarbakir,diyarbakir,diyarbakir/tr -en,dja,djougou,djougou,djougou/bj -en,djb,jambi,jambi,jambi/id:sultan taha syarifudn -en,dje,djerba,djerba,djerba/tn:melita -en,djg,djanet,djanet,djanet/dz:inedbirenne -en,djj,jayapura,jayapura,jayapura/id:sentani -en,djm,djambala,djambala,djambala/cg -en,djn,delta junction,delta junction,delta junction/ak/us -en,djo,daloa,daloa,daloa/ci -en,dju,djupivogur,djupivogur,djupivogur/is -en,dki,dunk island,dunk island,dunk island/ql/au -en,dkk,dunkirk,dunkirk,dunkirk/ny/us -en,dkr,dakar,dakar,dakar/sn:yoff -en,dks,dikson,dikson,dikson/ru -en,anu,antigua,antigua,antigua/ag:vc bird intl -en,anv,anvik,anvik,anvik/ak/us -en,anw,ainsworth,ainsworth,ainsworth/ne/us 
-en,cey,murray,murray,murray/ky/us:calloway county -en,cez,cortez,cortez,cortez/co/us:montezuma county -en,cfa,coffee point,coffee point,coffee point/ak/us -en,cfb,cabo frio,cabo frio,cabo f/rj/br:cabo frio airport -en,cfc,cacador,cacador,cacador/sc/br -en,cfd,bryan,bryan,bryan/tx/us:coulter field -en,cfe,clermont ferrand,clermont ferran,clermont ferrand/fr:aulnat -en,cff,cafunfo,cafunfo,cafunfo/ao -en,cfg,cienfuegos,cienfuegos,cienfuegos/cu -en,cfh,clifton hills,clifton hills,clifton hills/sa/au -en,cfi,camfield,camfield,camfield/nt/au -en,cfk,chlef,chlef,chlef/dz:abou bakr belkaid -en,cfn,donegal,donegal,donegal/ie -en,cfo,confreza,confreza,confreza/mt/br -en,cfp,carpentaria downs,carpentaria dow,carpentaria downs/ql/au -en,cfq,creston,creston,creston/bc/ca -en,cfr,caen,caen,caen/fr:carpiquet -en,cfs,coffs harbour,coffs harbour,coffs harbour/ns/au -en,cft,clifton,clifton,clifton/az/us:morenci -en,cfu,kerkyra/i kapodi,kerkyra/i kapod,kerkyra/gr:i kapodistrais -en,cfv,coffeyville,coffeyville,coffeyville/ks/us:municipal -en,cga,craig,craig,craig/ak/us:craig spb -en,cgb,cuiaba,cuiaba,cuiaba/mt/br:m rondon -en,cgc,cape gloucester,cape gloucester,cape gloucester/pg -en,cgd,changde,changde,changde/cn -en,cge,cambridge,cambridge,cambridge/md/us -en,cgf,cleveland cgf,cleveland cgf,cleveland/oh/us:cuyahoga -en,cgg,casiguran,casiguran,casiguran/ph -en,cgh,sao paulo cgh,sao paulo cgh,sao paulo/sp/br:congonhas -en,cgi,cape girardeau,cape girardeau,cape girardeau/mo/us -en,cgj,chingola,chingola,chingola/zm -en,cgk,jakarta intl,jakarta intl,jakarta/id:soekarno hatta intl -en,cgm,camiguin,camiguin,camiguin/ph:mambajao -en,cgn,cologne/bonn,cologne/bonn,cologne/de:koeln/bonn -en,cgo,zhengzhou,zhengzhou,zhengzhou/cn -en,cgp,chittagong,chittagong,chittagong/bd:patenga -en,cgq,changchun,changchun,changchun/cn -en,cgr,campo grande,campo grande,campo grande/ms/br:intl -en,cgs,college park,college park,college park/md/us -en,cgt,chinguitti,chinguitti,chinguitti/mr -en,cgu,ciudad 
guayana,ciudad guayana,ciudad guayana/ve -en,cgv,caiguna,caiguna,caiguna/wa/au -en,cgx,chicago cgx,chicago cgx,chicago/il/us:merrill c meigs -en,cgy,cagayan de oro,cagayan de oro,cagayan de oro/ph:lumbia -en,cgz,casa grande,casa grande,casa grande/az/us:municipal -en,cha,chattanooga,chattanooga,chattanooga/tn/us:lovell field -en,chb,chilas,chilas,chilas/pk -en,chc,christchurch,christchurch,christchurch/nz -en,che,caherciveen,caherciveen,caherciveen/ie:reenroe -en,chf,jinhae,jinhae,jinhae/kr -en,chg,chaoyang,chaoyang,chaoyang/cn:chaoyang airport -en,chh,chachapoyas,chachapoyas,chachapoyas/pe -en,chi,chicago,chicago,chicago/il/us -en,chj,chipinge,chipinge,chipinge/zw -en,chk,chickasha,chickasha,chickasha/ok/us:municipal -en,chl,challis,challis,challis/id/us -en,chm,chimbote,chimbote,chimbote/pe -en,chn,jeonju,jeonju,jeonju/kr -en,cho,charlottesville,charlottesville,charlottesville/va/us:albemarl -en,chp,circle hot spring,circle hot spri,circle hot spring/ak/us -en,chq,chania,chania,chania/gr:souda -en,chr,chateauroux,chateauroux,chateauroux/fr -en,chs,charleston,charleston,charleston/sc/us -en,cht,chatham island,chatham island,chatham island/nz:karewa -en,chu,chuathbaluk,chuathbaluk,chuathbaluk/ak/us -en,chv,chaves,chaves,chaves/pt -en,chw,jiuquan,jiuquan,jiuquan/cn -en,chx,changuinola,changuinola,changuinola/pa -en,chy,choiseul bay,choiseul bay,choiseul bay/sb -en,fkl,franklin,franklin,franklin/pa/us:chess lambertin -en,fkn,franklin,franklin,franklin/va/us:municipal -en,efw,jefferson,jefferson,jefferson/ia/us:municipal -en,ega,engati,engati,engati/pg -en,egc,bergerac,bergerac,bergerac/fr:roumanieres -en,ege,vail eagle,vail eagle,vail eagle/co/us:eagle county -en,egi,valparaiso duke f,valparaiso duke,valparaiso/fl/us:duke field -en,egl,neghelli,neghelli,neghelli/et -en,egm,sege,sege,sege/sb -en,egn,geneina,geneina,geneina/sd -en,ego,belgorod,belgorod,belgorod/ru -en,egp,eagle pass,eagle pass,eagle pass/tx/us:maverick co -en,egs,egilsstadir,egilsstadir,egilsstadir/is 
-en,egv,eagle river,eagle river,eagle river/wi/us -en,egx,egegik,egegik,egegik/ak/us -en,ehd,pseudo city code,pseudo city cod,pseudo city code/zz -en,ehl,el bolson,el bolson,el bolson/rn/ar -en,ehm,cape newenham,cape newenham,cape newenham/ak/us -en,eht,east hartford,east hartford,east hartford/ct/us:rentschler -en,eia,eia,eia,eia/pg:popondetta -en,eib,eisenach,eisenach,eisenach/de -en,eie,eniseysk,eniseysk,eniseysk/ru -en,eih,einasleigh,einasleigh,einasleigh/ql/au -en,eik,eisk,eisk,eisk/ru -en,eil,fairbanks afb,fairbanks afb,fairbanks/ak/us:eielson afb -en,ein,eindhoven,eindhoven,eindhoven/nl -en,eis,beef island,beef island,beef island/vg -en,eiy,ein yahav,ein yahav,ein yahav/il -en,eja,barrancabermeja,barrancabermeja,barrancabermeja/co:variguies -en,ejh,wedjh,wedjh,wedjh/sa -en,ejt,mili atoll,mili atoll,mili atoll/mh:enijet airport -en,eka,eureka,eureka,eureka/ca/us:murray field -en,ekb,ekibastuz,ekibastuz,ekibastuz/kz -en,ekd,elkedra,elkedra,elkedra/nt/au -en,eke,ekereku,ekereku,ekereku/gy -en,eki,elkhart,elkhart,elkhart/in/us:elkhart mnpl -en,ekn,elkins,elkins,elkins/wv/us -en,eko,elko,elko,elko/nv/us -en,eks,shakhtersk,shakhtersk,shakhtersk/ru:shakhtersk -en,ekt,eskilstuna,eskilstuna,eskilstuna/se -en,ekx,elizabethtown,elizabethtown,elizabethtown/ky/us -en,ela,eagle lake,eagle lake,eagle lake/tx/us -en,elb,el banco,el banco,el banco/co:san bernado -en,elc,elcho island,elcho island,elcho island/nt/au -en,eld,el dorado,el dorado,el dorado/ar/us:goodwin field -en,ele,el real,el real,el real/pa -en,elf,el fasher,el fasher,el fasher/sd -en,elg,el golea,el golea,el golea/dz -en,elh,north eleuthera,north eleuthera,north eleuthera/bs:intl -en,eli,elim,elim,elim/ak/us -en,elj,el recreo,el recreo,el recreo/co -en,elk,elk city,elk city,elk city/ok/us:municipal -en,ell,ellisras,ellisras,ellisras/za -en,elm,elmira,elmira,elmira/ny/us -en,eln,ellensburg,ellensburg,ellensburg/wa/us:bowers field -en,elo,eldorado,eldorado,eldorado/mi/ar -en,elp,el paso intl,el paso intl,el 
paso/tx/us:el paso intl -en,elq,gassim,gassim,gassim/sa -en,elr,elelim indonesia,elelim,elelim/id -en,els,east london,east london,east london/za:east london -en,elt,tour sinai city,tour sinai city,tour sinai city/eg -en,elu,el oued,el oued,el oued/dz:guemar -en,elv,elfin cove,elfin cove,elfin cove/ak/us:elfin cove s -en,elw,ellamar,ellamar,ellamar/ak/us -en,elx,el tigre,el tigre,el tigre/ve -en,ely,ely,ely,ely/nv/us:yelland -en,elz,wellsville,wellsville,wellsville/ny/us:municipal -en,ema,east midlands,east midlands,nottingham/gb:east midlands -en,emb,san francisco emb,san francisco e,san francisco/ca/us:embarkader -en,emd,emerald,emerald,emerald/ql/au -en,eme,emden,emden,emden/de -en,emg,empangeni,empangeni,empangeni/za -en,emi,emirau,emirau,emirau/pg -en,emk,emmonak,emmonak,emmonak/ak/us -en,eml,emmen,emmen,emmen/ch:emmen -en,emm,kemerer,kemerer,kemerer/wy/us -en,gog,gobabis,gobabis,gobabis/na -en,goh,nuuk,nuuk,nuuk/gl -en,goi,goa,goa,goa/in:dabolim -en,goj,nizhniy novgorod,nizhniy novgoro,nizhniy novgorod/ru -en,gok,guthrie,guthrie,guthrie/ok/us -en,gol,gold beach,gold beach,gold beach/or/us:state -en,gom,goma,goma,goma/cd -en,gon,groton,groton,groton/ct/us:groton new london -en,goo,goondiwindi,goondiwindi,goondiwindi/ql/au -en,gop,gorakhpur,gorakhpur,gorakhpur/in -en,goq,golmud,golmud,golmud/cn -en,gor,gore,gore,gore/et -en,gos,gosford,gosford,gosford/ns/au -en,got,gothenburg got,gothenburg got,gothenburg/se:landvetter -en,gou,garoua,garoua,garoua/cm -en,gov,gove,gove,gove/nt/au:nhulunbuy -en,goy,gal oya,gal oya,gal oya/lk:amparai -en,goz,gorna orechovitsa,gorna orechovit,gorna orechovitsa/bg -en,gpa,patras,patras,patras/gr:araxos airport -en,gpb,guarapuava,guarapuava,guarapuava/pr/br:tancredo -en,gpi,guapi,guapi,guapi/co -en,gpl,guapiles,guapiles,guapiles/cr -en,gpn,garden point,garden point,garden point/nt/au -en,gpo,general pico,general pico,general pico/lp/ar -en,gps,galapagos is,galapagos is,galapagos is/ec:baltra 
-en,gpt,gulfport,gulfport,gulfport/ms/us:biloxi regional -en,gpz,grand rapids,grand rapids,grand rapids/mn/us -en,gqj,machrihanish,machrihanish,machrihanish/gb:raf station -en,gqq,galion,galion,galion/oh/us -en,gra,gamarra,gamarra,gamarra/co -en,grb,green bay,green bay,green bay/wi/us:austin field -en,grc,grand cess,grand cess,grand cess/lr -en,grd,greenwood,greenwood,greenwood/sc/us -en,gre,greenville,greenville,greenville/il/us:municipal -en,grf,tacoma gray aaf,tacoma gray aaf,tacoma/wa/us:gray aaf -en,grg,gardez,gardez,gardez/af -en,grh,garuahi,garuahi,garuahi/pg -en,gri,grand island,grand island,grand island/ne/us -en,grj,george,george,george/za -en,grk,killeen gray aaf,killeen gray aa,killeen/tx/us:gray aaf -en,grl,garasa,garasa,garasa/pg -en,grm,grand marais,grand marais,grand marais/mn/us:devils trck -en,grn,gordon,gordon,gordon/ne/us -en,gro,girona,girona,girona/es:costa brava -en,grp,gurupi,gurupi,gurupi/go/br -en,grq,groningen,groningen,groningen/nl:eelde -en,grr,grand rapids,grand rapids,grand rapids/mi/us:gerald r fo -en,grs,grosseto,grosseto,grosseto/it:baccarini -en,grt,gujrat,gujrat,gujrat/pk -en,gru,sao paulo gru,sao paulo gru,sao paulo/sp/br:guarulhos intl -en,grv,groznyj,groznyj,groznyj/ru -en,grw,graciosa island,graciosa island,graciosa island/pt -en,grx,granada,granada,granada/es -en,gry,grimsey,grimsey,grimsey/is -en,grz,graz,graz,graz/at:thalerhof -en,gsa,long pasia,long pasia,long pasia/my -en,gsb,goldsboro,goldsboro,goldsboro/nc/us:seymour johns -en,gsc,gascoyne junction,gascoyne juncti,gascoyne junction/wa/au -en,gse,gothenburg gse,gothenburg gse,gothenburg gse/se:saeve -en,gsh,goshen,goshen,goshen/in/us -en,gsi,guadalcanal,guadalcanal,guadalcanal/sb -en,gsl,taltheilei,taltheilei,taltheilei/nt/ca -en,gsm,gheshm,gheshm,gheshm/ir -en,gsn,mount gunson,mount gunson,mount gunson/sa/au -en,gso,greensboro,greensboro,greensboro/high point/nc/us:pi -en,gsp,greenvl spartanbg,greenvl spartan,greenville/sc/us:spartanburg -en,gsq,shark alowainat,shark 
alowainat,shark alowainat/eg -en,gsr,gardo,gardo,gardo/so -en,gss,sabi sabi,sabi sabi,sabi sabi/za -en,gst,gustavus airport,gustavus airpor,gustavus/ak/us:gustavus -en,gsu,gedaref,gedaref,gedaref/sd -en,gsy,grimsby,grimsby,grimsby/gb:binbrook -en,gta,gatokae,gatokae,gatokae/sb:aerodrom -en,gtb,genting,genting,genting/my -en,anx,andenes,andenes,andenes/no -en,any,anthony,anthony,anthony/ks/us -en,anz,angus downs,angus downs,angus downs/nt/au -en,aoa,aroa,aroa,aroa/pg -en,aob,annanberg,annanberg,annanberg/pg -en,aoc,altenburg,altenburg,altenburg/de:altenburg nobitz -en,aod,abou deia,abou deia,abou deia/td -en,aoe,anadolu ubiv apt,anadolu ubiv ap,eskisehir/tr:anadolu universit -en,aog,anshan,anshan,anshan/cn -en,aoh,lima,lima,lima/oh/us:allen county -en,aoi,ancona,ancona,ancona/it:falconara -en,aoj,aomori,aomori,aomori/jp -en,aok,karpathos,karpathos,karpathos/gr -en,aol,paso de los libre,paso de los lib,paso de los libres/cr/ar -en,aon,arona,arona,arona/pg -en,aoo,altoona/martinsbg,altoona,altoona/pa/us -en,aor,alor setar,alor setar,alor setar/my -en,aos,amook,amook,amook/ak/us -en,aot,aosta,aosta,aosta/it:corrado gex airport -en,aou,attopeu,attopeu,attopeu/la -en,apa,denver arapahoe,denver arapahoe,denver/co/us:arapahoe co -en,apb,apolo,apolo,apolo/bo -en,apc,napa,napa,napa/ca/us:napa county -en,ape,san juan aposento,san juan aposen,san juan aposento/pe -en,apf,naples,naples,naples/fl/us -en,apg,aberdeen,aberdeen,aberdeen/md/us:phillips aaf -en,aph,bowling green,bowling green,bowling green/va/us:camp a p h -en,api,apiay,apiay,apiay/co -en,apk,apataki,apataki,apataki/pf -en,apl,nampula,nampula,nampula/mz -en,apn,alpena,alpena,alpena/mi/us:county regional -en,apo,apartado,apartado,apartado/co -en,app,asapa,asapa,asapa/pg -en,apq,arapiraca,arapiraca,arapiraca/al/br -en,apr,april river,april river,april river/pg -en,aps,anapolis,anapolis,anapolis/go/br -en,apt,jasper,jasper,jasper/tn/us:marion county -en,apu,apucarana,apucarana,apucarana/pr/br -en,apv,apple valley,apple 
valley,apple valley/ca/us -en,apw,apia faleolo,apia faleolo,apia/ws:faleolo -en,apx,arapongas,arapongas,arapongas/pr/br -en,apy,alto parnaiba,alto parnaiba,alto parnaiba/ma/br -en,apz,zapala,zapala,zapala/ne/ar -en,aqa,araraquara,araraquara,araraquara/sp/br -en,aqb,quiche,quiche,quiche/gt:quiche -en,aqg,anqing,anqing,anqing/cn -en,aqi,qaisumah,qaisumah,qaisumah/sa -en,aqj,aqaba,aqaba,aqaba/jo:king hussein intl -en,aqm,ariquemes,ariquemes,ariquemes/ro/br -en,aqp,arequipa,arequipa,arequipa/pe:rodriguez ballon -en,aqs,saqani,saqani,saqani/fj -en,aqy,alyeska,alyeska,alyeska/ak/us -en,ara,new iberia,new iberia,new iberia/la/us:acadiana rgnl -en,arb,ann arbor,ann arbor,ann arbor/mi/us:municipal -en,arc,arctic village,arctic village,arctic village/ak/us -en,ard,alor island,alor island,alor island/id -en,are,arecibo,arecibo,arecibo/pr -en,arf,acaricuara,acaricuara,acaricuara/co -en,arg,walnut ridge,walnut ridge,walnut ridge/ar/us -en,arh,arkhangelsk,arkhangelsk,arkhangelsk/ru -en,ari,arica,arica,arica/cl:chacalluta -en,arj,arso,arso,arso/id -en,ark,arusha,arusha,arusha/tz -en,arl,arly,arly,arly/bf -en,arm,armidale,armidale,armidale/ns/au -en,arn,stockholm arlanda,stockholm arlan,stockholm/se:arlanda -en,aro,arboletas,arboletas,arboletas/co -en,arp,aragip,aragip,aragip/pg -en,arq,arauquita,arauquita,arauquita/co -en,arr,alto rio senguerr,alto rio sengue,alto rio senguerr/cb/ar -en,ars,aragarcas,aragarcas,aragarcas/go/br -en,art,watertown,watertown,watertown/ny/us -en,aru,aracatuba,aracatuba,aracatuba/sp/br -en,hus,hughes,hughes,hughes/ak/us:municipal -en,hut,hutchinson,hutchinson,hutchinson/ks/us -en,huu,huanuco,huanuco,huanuco/pe -en,fkq,fak fak,fak fak,fak fak/id -en,fks,fukushima,fukushima,fukushima/jp:airport -en,fla,florencia,florencia,florencia/co:capitolio -en,flb,floriano,floriano,floriano/pi/br:cangapara -en,flc,falls creek,falls creek,falls creek/vi/au -en,fld,fond du lac,fond du lac,fond du lac/wi/us -en,fle,petersburg aaf,petersburg aaf,petersburg/va/us:fort lee 
aaf -en,flf,flensburg,flensburg,flensburg/de:schaferhaus -en,flg,flagstaff,flagstaff,flagstaff/az/us:pulliam field -en,flh,flotta,flotta,flotta/gb -en,fli,flateyri,flateyri,flateyri/is -en,flj,falls bay,falls bay,falls bay/ak/us -en,fll,ft lauderdale,ft lauderdale,ft lauderdale/fl/us:fll intl -en,flm,filadelfia,filadelfia,filadelfia/py -en,fln,florianopolis,florianopolis,florianopolis/sc/br:hercilio l -en,flo,florence,florence,florence/sc/us -en,flp,flippin,flippin,flippin/ar/us -en,flr,florence,florence,florence/it:peretola -en,fls,flinders island,flinders island,flinders island/ts/au -en,flt,flat,flat,flat/ak/us -en,flu,new york flushing,new york flushi,new york/ny/us:flushing helipo -en,flv,fort leavenworth,fort leavenwort,fort leavenworth/ks/us:sherman -en,flw,flores island/san,flores island,flores island/pt:santa cruz -en,flx,fallon municipal,fallon municipa,fallon/nv/us:municipal -en,fly,finley,finley,finley/ns/au -en,fma,formosa,formosa,formosa/fo/ar:el pucu -en,fmc,five mile,five mile,five mile/ak/us -en,fme,fort meade,fort meade,fort meade/md/us:tipton aaf -en,fmg,flamingo,flamingo,flamingo/cr -en,fmh,falmouth,falmouth,falmouth/ma/us:otis afb -en,fmi,kalemie,kalemie,kalemie/cd -en,fmm,memmingen,memmingen,memmingen/de:allgaeu -en,fmn,farmington,farmington,farmington/nm/us:mnpl -en,fmo,muenster,muenster,muenster/de -en,fms,fort madison,fort madison,fort madison/ia/us:municipal -en,fmu,florence mnpl,florence mnpl,florence/or/us:municipal -en,fmy,fort myers fmy,fort myers fmy,fort myers/fl/us:page field -en,fna,freetown fna,freetown fna,freetown/sl:lungi intl -en,fnb,neubrandenburg,neubrandenburg,neubrandenburg/de -en,fnc,madeira,madeira,madeira/pt:madeira -en,fne,fane,fane,fane/pg -en,fng,fada ngourma,fada ngourma,fada ngourma/bf -en,fnh,fincha,fincha,fincha/et -en,fni,nimes,nimes,nimes/fr:garons -en,fnj,pyongyang,pyongyang,pyongyang/kp:sunan -en,fnk,fin creek,fin creek,fin creek/ak/us -en,fnl,ft collins love,ft collins love,fort collins love/co/us:mncpl 
-en,fnr,funter bay,funter bay,funter bay/ak/us:spb -en,fnt,flint,flint,flint/mi/us:bishop -en,foa,foula,foula,foula/gb -en,fob,fort bragg,fort bragg,fort bragg/ca/us -en,foc,fuzhou,fuzhou,fuzhou/cn -en,fod,fort dodge,fort dodge,fort dodge/ia/us -en,foe,topeka forbes,topeka forbes,topeka forbes/ks/us:forbes afb -en,fog,foggia,foggia,foggia/it:gino lisa -en,fok,westhampton,westhampton,westhampton/ny/us:suffolk cnty -en,fom,foumban,foumban,foumban/cm -en,fon,fortuna,fortuna,fortuna/cr:fortuna airport -en,foo,numfoor,numfoor,numfoor/id -en,fop,forest park,forest park,forest park/ga/us:morris aaf -en,for,fortaleza,fortaleza,fortaleza/ce/br:pinto martins -en,fos,forrest,forrest,forrest/wa/au -en,fot,forster,forster,forster/ns/au -en,fou,fougamou,fougamou,fougamou/ga -en,fox,fox,fox,fox/ak/us -en,foy,foya,foya,foya/lr -en,fpo,freeport,freeport,freeport/bs:grand bahama intl -en,fpr,fort pierce,fort pierce,fort pierce/fl/us:st lucie cnt -en,fpy,perry,perry,perry/fl/us:perry foley -en,bcw,benguera island,benguera island,benguera island/mz -en,bcx,beloreck,beloreck,beloreck/ru -en,bcy,bulchi,bulchi,bulchi/et -en,lar,laramie,laramie,laramie/wy/us:general brees -en,las,las vegas las,las vegas las,las vegas/nv/us:mccarran intl -en,lat,la uribe,la uribe,la uribe/co -en,lau,lamu,lamu,lamu/ke -en,lav,lalomalava,lalomalava,lalomalava/ws -en,law,lawton,lawton,lawton/ok/us:lawton mnpl -en,lax,los angeles,los angeles,los angeles/ca/us:intl,los angeles -en,lay,ladysmith,ladysmith,ladysmith/za -en,laz,bom jesus da lapa,bom jesus da la,bom jesus da lapa/ba/br -en,lba,leeds,leeds,leeds/gb:leeds bradford -en,lbb,lubbock intl,lubbock intl,lubbock/tx/us:lubbock intl -en,lbc,luebeck,luebeck,hamburg/de:luebeck -en,lbd,khudzhand,khudzhand,khudzhand/tj -en,lbe,latrobe,latrobe,latrobe/pa/us:westmoreland cnt -en,lbf,north platte,north platte,north platte/ne/us:lee bird fl -en,lbg,paris le bourget,paris le bourge,paris le bourget/fr:le bourget -en,lbh,sydney palm beach,sydney palm 
bea,sydney/ns/au:palm beach spb -en,lbi,albi,albi,albi/fr:le sequestre -en,lbj,labuan bajo,labuan bajo,labuan bajo/id:mutiara -en,lbk,liboi,liboi,liboi/ke -en,lbl,liberal,liberal,liberal/ks/us:municipal -en,lbm,luabo,luabo,luabo/mz -en,lbn,lake baringo,lake baringo,lake baringo/ke -en,lbo,lusambo,lusambo,lusambo/cd -en,lbp,long banga,long banga,long banga/my:airfield -en,lbq,lambarene,lambarene,lambarene/ga -en,lbr,labrea,labrea,labrea/am/br -en,lbs,labasa,labasa,labasa/fj -en,lbt,lumberton,lumberton,lumberton/nc/us -en,lbu,labuan,labuan,labuan/my -en,lbv,libreville,libreville,libreville/ga:leon m ba -en,lbw,long bawan,long bawan,long bawan/id -en,lbx,lubang,lubang,lubang/ph -en,lby,la baule,la baule,la baule/fr:montoir -en,lbz,lukapa,lukapa,lukapa/ao -en,lca,larnaca,larnaca,larnaca/cy -en,lcb,pontes e lacerda,pontes e lacerd,pontes e lacerda/mt/br -en,lcc,lecce,lecce,lecce/it:galatina -en,lcd,louis trichardt,louis trichardt,louis trichardt/za -en,lce,la ceiba,la ceiba,la ceiba/hn:goloson intl -en,lcf,rio dulce,rio dulce,rio dulce/gt:las vegas airport -en,lcg,la coruna,la coruna,la coruna/es -en,lch,lake charles,lake charles,lake charles/la/us:mnpl -en,lci,laconia,laconia,laconia/nh/us:municipal -en,lcj,lodz,lodz,lodz/pl:lodz lublinek airport -en,lck,columbus lck,columbus lck,columbus/oh/us:rickenbacker -en,lcl,la coloma,la coloma,la coloma/cu -en,lcm,la cumbre,la cumbre,la cumbre/cd/ar -en,lcn,balcanoona,balcanoona,balcanoona/sa/au -en,lco,lague,lague,lague/cg -en,lcp,loncopue,loncopue,loncopue/ne/ar -en,lcr,la chorrera,la chorrera,la chorrera/co -en,lcs,las canas,las canas,las canas/cr -en,lcv,lucca,lucca,lucca/it -en,lcx,longyan,longyan,longyan/cn:liancheng apt -en,lcy,london city,london city,london/gb:london city -en,lda,malda,malda,malda/in -en,ldb,londrina,londrina,londrina/pr/br -en,ldc,lindeman island,lindeman island,lindeman island/ql/au -en,lde,lourdes tarbes,lourdes tarbes,lourdes/fr:lourdes tarbes intl 
-en,ldg,leshukonskoye,leshukonskoye,leshukonskoye/ru:leshukonskoye -en,ldh,lord howe island,lord howe islan,lord howe island/ns/au -en,ldi,lindi,lindi,lindi/tz:kikwetu -en,ldj,linden,linden,linden/nj/us -en,ldk,lidkoping,lidkoping,lidkoping/se:hovby -en,ldm,ludington,ludington,ludington/mi/us:mason county -en,ldn,lamidanda,lamidanda,lamidanda/np -en,ldo,ladouanie,ladouanie,ladouanie/sr -en,ldr,lodar,lodar,lodar/ye -en,lds,fictitious,fictitious,fictitious/zz -en,ldu,lahad datu,lahad datu,lahad datu/my -en,ldv,landivisiau,landivisiau,landivisiau/fr -en,ldw,lansdowne,lansdowne,lansdowne/wa/au -en,ldx,st laurent du mar,st laurent du m,st laurent du maroni/gf -en,lyt,lady elliot islan,lady elliot isl,lady elliot islan/ql/au -en,lyu,ely,ely,ely/mn/us -en,lyx,lydd ashford,lydd ashford,lydd/gb:london ashford -en,lza,luiza,luiza,luiza/cd -en,lzc,lazaro cardenas,lazaro cardenas,lazaro cardenas/mx -en,lzd,lanzhou lzd,lanzhou lzd,lanzhou/cn:lanzhoudong -en,lzh,liuzhou,liuzhou,liuzhou/cn -en,lzi,luozi,luozi,luozi/cd -en,lzm,luzamba,luzamba,luzamba/ao -en,lzn,nangan,nangan,nangan/cn -en,lzo,luzhou,luzhou,luzhou/cn -en,lzr,lizard island,lizard island,lizard island/ql/au -en,lzy,lin zhi,lin zhi,lin zhi/cn -en,maa,chennai,chennai,chennai/in -en,mab,maraba,maraba,maraba/pa/br -en,mac,macon smart,macon smart,macon/ga/us:smart -en,mad,madrid barajas,madrid barajas,madrid/es:barajas -en,mae,madera,madera,madera/ca/us -en,maf,midland odessa rg,midland odessa,midland/tx/us:odessa -en,mag,madang,madang,madang/pg -en,mah,menorca,menorca,menorca/es -en,mai,mangochi,mangochi,mangochi/mw -en,maj,majuro,majuro,majuro/mh:amata kabua intl -en,mak,malakal,malakal,malakal/sd -en,mal,mangole,mangole,mangole/id -en,mam,matamoros,matamoros,matamoros/mx -en,man,manchester int,manchester int,manchester/gb:manchester intl, manchester, manchester, manchester, man, man -en,mao,manaus,manaus,manaus/am/br:eduardo gomes -en,map,mamai,mamai,mamai/pg -en,maq,mae sot,mae sot,mae sot/th 
-en,mar,maracaibo,maracaibo,maracaibo/ve:la chinita -en,mas,manus island,manus island,manus island/pg:momote -en,mat,matadi,matadi,matadi/cd -en,mau,maupiti,maupiti,maupiti/pf -en,mav,maloelap island,maloelap island,maloelap island/mh -en,maw,malden,malden,malden/mo/us -en,max,matam,matam,matam/sn -en,may,mangrove cay,mangrove cay,mangrove cay/bs -en,maz,mayaguez de hosto,mayaguez de hos,mayaguez/pr:de hostos -en,mba,mombasa,mombasa,mombasa/ke:moi intl -en,mbb,marble bar,marble bar,marble bar/wa/au -en,mbc,mbigou,mbigou,mbigou/ga -en,mbd,mmabatho,mmabatho,mmabatho/za:intl -en,mbe,monbetsu,monbetsu,monbetsu/jp -en,mbf,mount buffalo,mount buffalo,mount buffalo/vi/au -en,mbg,mobridge,mobridge,mobridge/sd/us -en,mbh,maryborough,maryborough,maryborough/ql/au -en,mbi,mbeya,mbeya,mbeya/tz -en,mbj,montego bay,montego bay,montego bay/jm:sangster intl -en,mbk,matupa,matupa,matupa/mt/br -en,mbl,manistee,manistee,manistee/mi/us:blacker -en,mbm,mkambati,mkambati,mkambati/za -en,mbn,mt barnett,mt barnett,mt barnett/wa/au -en,mbo,mamburao,mamburao,mamburao/ph -en,mbp,moyobamba,moyobamba,moyobamba/pe -en,mbq,mbarara,mbarara,mbarara/ug -en,mbr,mbout,mbout,mbout/mr -en,mbs,saginaw baycity,saginaw baycity,saginaw/mi/us:tri city -en,mbt,masbate,masbate,masbate/ph -en,mbu,mbambanakira,mbambanakira,mbambanakira/sb -en,mbv,masa,masa,masa/pg -en,mbw,moorabbin,moorabbin,moorabbin/vi/au -en,mbx,maribor,maribor,maribor/si -en,mby,moberly,moberly,moberly/mo/us -en,mbz,maues,maues,maues/am/br -en,mca,macenta,macenta,macenta/gn -en,mcb,mccomb,mccomb,mccomb/ms/us:pike county -en,mcc,sacramento mcc,sacramento mcc,sacramento/ca/us:mcclellan afb -en,mcd,mackinac island,mackinac island,mackinac island/mi/us -en,mce,macready regional,macready region,merced/ca/us:macready regional -en,mcf,tampa afb,tampa afb,tampa/fl/us:mac dill afb -en,mcg,mcgrath,mcgrath,mcgrath/ak/us -en,mch,machala,machala,machala/ec -en,mci,kansas city int,kansas city int,kansas city/mo/us:intl -en,mcj,maicao,maicao,maicao/co 
-en,mck,mccook,mccook,mccook/ne/us -en,huv,hudiksvall,hudiksvall,hudiksvall/se -en,hux,huatulco,huatulco,huatulco/mx -en,huy,humberside,humberside,humberside/gb:humberside -en,huz,huizhou,huizhou,huizhou/cn -en,hva,analalava,analalava,analalava/mg -en,hvb,hervey bay,hervey bay,hervey bay/ql/au -en,hvd,khovd,khovd,khovd/mn:khovd -en,hve,hanksville,hanksville,hanksville/ut/us:intermediate -en,hvg,honningsvag,honningsvag,honningsvag/no:valan -en,hvk,holmavik,holmavik,holmavik/is -en,hvm,hvammstangi,hvammstangi,hvammstangi/is -en,hvn,new haven,new haven,new haven/ct/us -en,hvr,havre,havre,havre/mt/us:city county -en,hvs,hartsville,hartsville,hartsville/sc/us:municipal -en,hwa,hawabango,hawabango,hawabango/pg -en,hwd,hayward,hayward,hayward/ca/us:air terminal -en,hwi,hawk inlet,hawk inlet,hawk inlet/ak/us:spb -en,hwk,hawker,hawker,hawker/sa/au:wilpena pound -en,hwn,hwange nat park,hwange nat park,hwange nat park/zw -en,hwo,hollywood,hollywood,hollywood/fl/us:north perry -en,hxx,hay,hay,hay/ns/au -en,hya,hyannis,hyannis,hyannis/ma/us:barnstable -en,hyc,high wycombe,high wycombe,high wycombe/gb -en,hyd,shamshabad,shamshabad,shamshabad/in:rajiv gandhi int -en,hyf,hayfields,hayfields,hayfields/pg -en,hyg,hydaburg,hydaburg,hydaburg/ak/us:spb -en,hyl,hollis,hollis,hollis/ak/us:spb -en,hyn,huangyan,huangyan,huangyan/cn -en,hyr,hayward,hayward,hayward/wi/us:municipal -en,hys,hays,hays,hays/ks/us:municipal -en,hyv,hyvinkaa,hyvinkaa,hyvinkaa/fi:hyvinkaa -en,hzb,hazebrouck,hazebrouck,hazebrouck/fr:merville calonne -en,hzg,hanzhong,hanzhong,hanzhong/cn -en,hzh,liping,liping,liping city/cn:liping -en,hzk,husavik,husavik,husavik/is -en,hzl,hazleton,hazleton,hazleton/pa/us -en,hzp,fort mackay,fort mackay,fort mackay/ab/ca:horizon -en,hzv,hazyview,hazyview,hazyview/za -en,iaa,igarka,igarka,igarka/ru -en,iab,wichita afb,wichita afb,wichita/ks/us:mcconnell afb -en,iad,washington dulles,washington dull,washington/dc/us:dulles intl -en,iag,niagara falls,niagara falls,niagara 
falls/ny/us:intl -en,iah,houston-iah,houston-iah,houston/tx/us:g.bush intercont -en,iam,in amenas,in amenas,in amenas/dz -en,ian,kiana,kiana,kiana/ak/us:bob barker mem -en,iaq,bahregan,bahregan,bahregan/ir -en,iar,yaroslavl,yaroslavl,yaroslavl/ru -en,ias,iasi,iasi,iasi/ro -en,iat,iata traffic serv,iata traffic se,iata traffic serv/zz -en,iau,iaura,iaura,iaura/pg -en,iba,ibadan,ibadan,ibadan/ng -en,ibe,ibague,ibague,ibague/co -en,ibi,iboki,iboki,iboki/pg -en,ibl,indigo bay lodge,indigo bay lodg,indigo bay lodge/mz -en,ibo,ibo,ibo,ibo/mz -en,ibp,iberia,iberia,iberia/pe -en,ibz,ibiza,ibiza,ibiza/es -en,ica,icabaru,icabaru,icabaru/ve -en,ici,cicia,cicia,cicia/fj -en,ick,nieuw nickerie,nieuw nickerie,nieuw nickerie/sr -en,icl,clarinda,clarinda,clarinda/ia/us:municipal -en,icn,seoul incheon int,seoul incheon i,seoul/kr:incheon international -en,ico,sicogon island,sicogon island,sicogon island/ph -en,icr,nicaro,nicaro,nicaro/cu -en,ics,cascade,cascade,cascade/id/us -en,ict,wichita mid cont,wichita mid con,wichita/ks/us:mid continent -en,icy,icy bay,icy bay,icy bay/ak/us -en,ida,idaho falls,idaho falls,idaho falls/id/us:fanning fld -en,idb,idre,idre,idre/se -en,idc,iladachilonzuene,iladachilonzuen,ila da chilonzuene/mz -en,idf,idiofa,idiofa,idiofa/cd -en,idg,ida grove,ida grove,ida grove/ia/us:municipal -en,arv,minocqua,minocqua,minocqua/wi/us:noble f lee -en,arw,arad,arad,arad/ro -en,arx,asbury park,asbury park,asbury park/nj/us -en,bcz,bickerton island,bickerton islan,bickerton island/nt/au -en,bda,bermuda,bermuda,bermuda/bm:bermuda intl -en,bdb,bundaberg,bundaberg,bundaberg/ql/au -en,bdc,barra do corda,barra do corda,barra do corda/ma/br -en,bdd,badu island,badu island,badu island/ql/au -en,bde,baudette,baudette,baudette/mn/us -en,bdf,bradford,bradford,bradford/il/us:rinkenberger -en,bdg,blanding,blanding,blanding/ut/us -en,bdh,bandar lengeh,bandar lengeh,bandar lengeh/ir -en,bdi,bird island,bird island,bird island/sc 
-en,bdj,banjarmasin,banjarmasin,banjarmasin/id:sjamsudin noor -en,bdk,bondoukou,bondoukou,bondoukou/ci -en,bdl,hartford/sprngfld,hartford/sprngf,hartford/ct/us:bradley intl -en,bdm,bandirma,bandirma,bandirma/tr -en,bdn,badin,badin,badin/pk:talhar -en,bdo,bandung,bandung,bandung/id:husein sastranegara -en,bdp,bhadrapur,bhadrapur,bhadrapur/np -en,bdq,vadodara,vadodara,vadodara/in -en,bdr,bridgeport,bridgeport,bridgeport/ct/us:i sikorsky -en,bds,brindisi,brindisi,brindisi/it:papola casale -en,bdt,gbadolite,gbadolite,gbadolite/cd -en,bdu,bardufoss,bardufoss,bardufoss/no -en,bdv,moba,moba,moba/cd -en,bdw,bedford downs,bedford downs,bedford downs/wa/au -en,bdx,broadus,broadus,broadus/mt/us -en,bdy,bandon,bandon,bandon/or/us:state -en,bdz,baindoung,baindoung,baindoung/pg -en,bea,bereina,bereina,bereina/pg -en,beb,benbecula,benbecula,benbecula/gb -en,bec,wichita beech,wichita beech,wichita beech/ks/us:beech -en,bed,bedford hanscom,bedford hanscom,bedford ha/ma/us:hanscom field -en,bee,beagle bay,beagle bay,beagle bay/wa/au -en,bef,bluefields,bluefields,bluefields/ni -en,beg,belgrade,belgrade,belgrade/rs:beograd -en,beh,benthnhbr/stjosep,benton,benton harbor/mi/us:ross field -en,bei,beica,beica,beica/et -en,bej,berau,berau,berau/id -en,bek,bareli,bareli,bareli/in -en,bel,belem,belem,belem/pa/br:val de cans -en,bem,bossembele,bossembele,bossembele/cf -en,ben,benghazi,benghazi,benghazi/ly:benina intl -en,beo,newcastle beo,newcastle beo,newcastle/ns/au:belmont -en,bep,bellary,bellary,bellary/in -en,beq,bury st edmunds,bury st edmunds,bury st edmunds/gb:honington -en,ber,berlin,berlin,berlin/de -en,bes,brest,brest,brest/fr:guipavas -en,bet,bethel airport,bethel airport,bethel/ak/us:bethel airport -en,beu,bedourie,bedourie,bedourie/ql/au -en,bev,beer sheba,beer sheba,beer sheba/il -en,bew,beira,beira,beira/mz -en,bex,benson,benson,benson/gb:raf station -en,bey,beirut,beirut,beirut/lb:international -en,bez,beru,beru,beru/ki -en,bfa,bahia negra,bahia negra,bahia negra/py:bahia 
negra -en,bfb,blue fox bay,blue fox bay,blue fox bay/ak/us -en,bfc,bloomfield,bloomfield,bloomfield/ql/au -en,bfd,bradford,bradford,bradford/pa/us -en,bfe,bielefeld,bielefeld,bielefeld/de:bielefeld -en,bff,scottsbluff,scottsbluff,scottsbluff/ne/us:scotts bluff -en,bfg,bullfrog basin,bullfrog basin,bullfrog basin/ut/us -en,bfh,curitiba,curitiba,curitiba/pr/br:bacacheri -en,bfi,seattle bfi,seattle bfi,seattle/wa/us:boeing fld intl -en,bfj,ba,ba,ba/fj:ba -en,bfk,denver,denver,denver/co/us:buckley angb -en,bfl,bakersfield,bakersfield,bakersfield/ca/us:meadows fld -en,bfm,mobile aerospace,mobile aerospac,mobile/al/us:mob aerospace -en,bfn,bloemfontein,bloemfontein,bloemfontein/za:intl -en,bfo,buffalo range,buffalo range,buffalo range/zw -en,bfp,beaver falls,beaver falls,beaver falls/pa/us -en,mqt,sawyer intl,sawyer intl,marquette/mi/us:sawyer intl -en,mqu,mariquita,mariquita,mariquita/co -en,mqv,mostaganem,mostaganem,mostaganem/dz:mostaganem -en,bza,bonanza,bonanza,bonanza/ni:san pedro -en,bzb,bazaruto island,bazaruto island,bazaruto island/mz -en,bzc,buzios,buzios,buzios/rj/br -en,bzd,balranald,balranald,balranald/ns/au -en,bze,belize city,belize city,belize/bz:p.s.w. 
goldson intl -en,bzf,benton field,benton field,redding/ca/us:benton field -en,bzg,bydgoszcz,bydgoszcz,bydgoszcz/pl -en,bzh,bumi hills,bumi hills,bumi hills/zw:airfield -en,bzi,balikesir,balikesir,balikesir/tr -en,bzk,briansk,briansk,briansk/ru -en,bzl,barisal,barisal,barisal/bd -en,bzm,bergen op zoom,bergen op zoom,bergen op zoom/nl:woensdrecht -en,bzn,bozeman,bozeman,bozeman/mt/us:gallatin field -en,bzo,bolzano bozen,bolzano bozen,bolzano bozen/it:bolzano bozen -en,bzp,bizant,bizant,bizant/ql/au -en,bzr,beziers,beziers,beziers/fr:vias -en,bzs,washington bzs,washington bzs,washington/dc/us:buzzards pt s -en,bzt,brazoria,brazoria,brazoria/tx/us:hinkles ferry -en,bzu,buta,buta,buta/cd -en,bzv,brazzaville,brazzaville,brazzaville/cg:maya maya -en,bzy,beltsy,beltsy,beltsy/md -en,bzz,brize norton,brize norton,brize norton/gb:raf station -en,caa,catacamas,catacamas,catacamas/hn -en,cab,cabinda,cabinda,cabinda/ao -en,cac,cascavel,cascavel,cascavel/pr/br -en,cad,cadillac,cadillac,cadillac/mi/us -en,cae,columbia met,columbia met,columbia/sc/us:columbia met -en,caf,carauari,carauari,carauari/am/br -en,cag,cagliari,cagliari,cagliari/it:elmas -en,cah,ca mau,ca mau,ca mau/vn -en,cai,cairo,cairo,cairo/eg:cairo intl -en,caj,canaima,canaima,canaima/ve -en,cak,canton akron,canton akron,canton akron/oh/us:akron -en,cal,campbeltown,campbeltown,campbeltown/gb:machrihanish -en,cam,camiri,camiri,camiri/bo -en,can,guangzhou,guangzhou,guangzhou/cn:baiyun -en,cao,clayton,clayton,clayton/nm/us -en,cap,cap haitien,cap haitien,cap haitien/ht -en,caq,caucasia,caucasia,caucasia/co -en,car,caribou,caribou,caribou/me/us:municipal -en,cas,casablanca,casablanca,casablanca/ma:anfa -en,cat,cat island,cat island,cat island/bs -en,cau,caruaru,caruaru,caruaru/pe/br -en,cav,cazombo,cazombo,cazombo/ao -en,caw,campos,campos,campos/rj/br:bartolomeu lisand -en,cax,carlisle,carlisle,carlisle/gb -en,cay,cayenne,cayenne,cayenne/gf:rochambeau -en,caz,cobar,cobar,cobar/ns/au -en,cba,corner bay,corner bay,corner 
bay/ak/us -en,cbb,cochabamba,cochabamba,cochabamba/bo:j wilsterman -en,cbc,cherrabun,cherrabun,cherrabun/wa/au -en,cbd,car nicobar,car nicobar,car nicobar/in -en,cbe,cumberland,cumberland,cumberland/md/us:wiley ford -en,cbf,council bluffs,council bluffs,council bluffs/ia/us:municipal -en,cbg,cambridge,cambridge,cambridge/gb -en,cbh,bechar,bechar,bechar/dz:leger -en,cbi,cape barren,cape barren,cape barren/ts/au -en,cbj,cabo rojo,cabo rojo,cabo rojo/do -en,cbk,colby,colby,colby/ks/us:municipal -en,cbl,ciudad bolivar,ciudad bolivar,ciudad bolivar/ve -en,cbm,columbus afb,columbus afb,columbus/ms/us:columbus afb -en,cbn,cirebon,cirebon,cirebon/id:penggung -en,cbo,cotabato,cotabato,cotabato/ph:awang -en,cbp,coimbra,coimbra,coimbra/pt -en,cbq,calabar,calabar,calabar/ng -en,cbr,canberra,canberra,canberra/ac/au -en,cbs,cabimas,cabimas,cabimas/ve:oro negro -en,cbt,catumbela,catumbela,catumbela/ao -en,cbu,cottbus,cottbus,cottbus/de:cottbus apt -en,cbv,coban,coban,coban/gt -en,cbw,campo mourao,campo mourao,campo mourao/pr/br -en,cbx,condobolin,condobolin,condobolin/ns/au -en,cby,canobie,canobie,canobie/ql/au -en,cbz,cabin creek,cabin creek,cabin creek/ak/us -en,dkv,docker river,docker river,docker river/nt/au -en,dla,douala,douala,douala/cm -en,dlb,dalbertis,dalbertis,dalbertis/pg -en,dlc,dalian,dalian,dalian/cn -en,dld,geilo,geilo,geilo/no:dagali airport -en,dle,dole,dole,dole/fr:tavaux -en,dlf,del rio afb,del rio afb,del rio/tx/us:laughlin afb -en,dlg,dillingham,dillingham,dillingham/ak/us:municipal -en,dlh,duluth int,duluth int,duluth/mn/us:duluth intl -en,dli,dalat,dalat,dalat/vn:lienkhang -en,dlk,dulkaninna,dulkaninna,dulkaninna/sa/au -en,dll,dillon,dillon,dillon/sc/us -en,dlm,dalaman,dalaman,dalaman/tr -en,dln,dillon,dillon,dillon/mt/us -en,dlo,dolomi,dolomi,dolomi/ak/us -en,dls,the dalles,the dalles,the dalles/or/us -en,dlu,dali,dali,dali/cn:dali -en,dlv,delissaville,delissaville,delissaville/nt/au -en,dly,dillons bay,dillons bay,dillons bay/vu 
-en,dlz,dalanzadgad,dalanzadgad,dalanzadgad/mn -en,dma,tucson afb,tucson afb,tucson/az/us:davis monthan afb -en,dmb,zhambyl,zhambyl,zhambyl/kz -en,dmd,doomadgee,doomadgee,doomadgee/ql/au -en,dme,moscow dme,moscow dme,moscow/ru:domodedovo -en,dmk,bkk don mueang,bkk don mueang,bangkok/th:don mueang intl -en,dmm,dammam,dammam,dammam/sa:king fahad intl -en,dmn,deming,deming,deming/nm/us -en,dmo,sedalia,sedalia,sedalia/mo/us -en,dmr,dhamar,dhamar,dhamar/ye -en,dmt,diamantino,diamantino,diamantino/mt/br -en,dmu,dimapur,dimapur,dimapur/in -en,dna,okinawa kad afb,okinawa kad afb,okinawa/jp:kadena afb -en,dnb,dunbar,dunbar,dunbar/ql/au -en,dnc,danane,danane,danane/ci -en,dnd,dundee,dundee,dundee/gb -en,dne,dallas north arpt,dallas north ar,dallas/tx/us:dallas north -en,dnf,derna,derna,derna/ly:martuba -en,dng,doogan airport,doogan airport,doogan/wa/au:doogan airport -en,dnh,dunhuang,dunhuang,dunhuang/cn -en,dni,wad medani,wad medani,wad medani/sd -en,dnk,dnepropetrovsk,dnepropetrovsk,dnepropetrovsk/ua -en,dnl,augusta daniel fd,augusta daniel,augusta daniel/ga/us:daniel -en,dnm,denham,denham,denham/wa/au -en,dnn,dalton,dalton,dalton/ga/us:municipal -en,dno,dianopolis,dianopolis,dianopolis/to/br -en,dnp,dang,dang,dang/np -en,dnq,deniliquin,deniliquin,deniliquin/ns/au -en,dnr,dinard,dinard,dinard/fr:pleurtuit -en,dns,denison,denison,denison/ia/us:municipal -en,dnt,stana laguna dnt,stana laguna dn,santa ana/ca/us:downtown hpt -en,dnu,dinangat,dinangat,dinangat/pg -en,dnv,danville,danville,danville/il/us:vermilion cnty -en,dnx,dinder,dinder,dinder/sd:galegu -en,dnz,denizli,denizli,denizli/tr:cardak -en,doa,doany,doany,doany/mg -en,dob,dobo aru,dobo aru,dobo/id:airport -en,doc,dornoch,dornoch,dornoch/gb -en,dod,dodoma,dodoma,dodoma/tz -en,doe,djoemoe,djoemoe,djoemoe/sr -en,dof,dora bay,dora bay,dora bay/ak/us -en,dog,dongola,dongola,dongola/sd -en,doh,doha,doha,doha/qa -en,doi,doini,doini,doini/pg -en,dok,donetsk,donetsk,donetsk/ua -en,dol,deauville,deauville,deauville/fr:st 
gatien -en,dom,dominica dom,dominica dom,dominica/dm:melville hall -en,don,dos lagunas,dos lagunas,dos lagunas/gt -en,doo,dorobisoro,dorobisoro,dorobisoro/pg -en,dop,dolpa,dolpa,dolpa/np -en,dor,dori,dori,dori/bf -en,dos,dios,dios,dios/pg -en,dou,dourados,dourados,dourados/ms/br -en,dov,dover cheswold,dover cheswold,dover cheswold/de/us:dover afb -en,dox,dongara,dongara,dongara/wa/au -en,doy,dongying,dongying,dongying/cn:dongying -en,dpa,chicago dpa,chicago dpa,chicago/il/us:dupage county -en,dpe,dieppe,dieppe,dieppe/fr -en,ary,ararat,ararat,ararat/vi/au -en,arz,n'zeto,n'zeto,n'zeto/ao -en,asa,assab,assab,assab/er -en,asb,ashgabad,ashgabad,ashgabad/tm:ashgabad -en,asc,ascension,ascension,ascension/bo -en,asd,andros town,andros town,andros town/bs -en,ase,aspen,aspen,aspen/co/us -en,asf,astrakhan,astrakhan,astrakhan/ru -en,asg,ashburton,ashburton,ashburton/nz -en,ash,nashua,nashua,nashua/nh/us:boire field -en,asi,georgetown,georgetown,georgetown/sh:wideawake field -en,asj,amami o shima,amami o shima,amami o shima/jp -en,ask,yamoussoukro,yamoussoukro,yamoussoukro/ci -en,asl,marshall,marshall,marshall/tx/us:harrison county -en,asm,asmara,asmara,asmara/er:asmara international -en,asn,talladega,talladega,talladega/al/us -en,aso,asosa,asosa,asosa/et -en,asp,alice springs,alice springs,alice springs/nt/au -en,asq,austin,austin,austin/nv/us -en,asr,kayseri,kayseri,kayseri/tr -en,ast,astoria,astoria,astoria/or/us -en,asu,asuncion,asuncion,asuncion/py:silvio pettirossi -en,asv,amboseli,amboseli,amboseli/ke -en,asw,aswan,aswan,aswan/eg -en,asx,ashland,ashland,ashland/wi/us -en,asy,ashley,ashley,ashley/nd/us -en,asz,asirim,asirim,asirim/pg -en,ata,anta,anta,anta/pe -en,atb,atbara,atbara,atbara/sd -en,atc,arthur's town,arthur's town,arthur's town/bs -en,atd,atoifi,atoifi,atoifi/sb -en,ate,antlers,antlers,antlers/ok/us -en,atf,ambato,ambato,ambato/ec:chachoan -e... [truncated message content] |
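The removed rows in the truncated diff above appear to share a five-field, comma-separated layout: language code, three-letter place code, name, a shorter name (seemingly capped at 15 characters), and a `city[/state]/country[:airport]` descriptor. The following is only a minimal sketch of how one such row could be split, assuming that layout holds and that no field contains a comma; the function name `parse_place_row` and the dictionary keys are hypothetical and are not part of OpenTrep.

```python
def parse_place_row(row):
    """Split one five-field reference row into a dictionary.

    Layout assumed from the rows quoted in the diff (not confirmed by
    the OpenTrep sources):
        language,code,name,short_name,city[/state]/country[:airport]
    """
    row = row.strip()
    # Drop the unified-diff marker ('-' or '+') when the row is copied
    # straight out of a diff.
    if row[:1] in ('-', '+'):
        row = row[1:]

    # Assumes exactly five fields and no embedded commas; a row with a
    # different shape raises ValueError here.
    language, code, name, short_name, descriptor = row.split(',')

    # The optional airport name follows the first colon.
    location, _, airport = descriptor.partition(':')

    # The location part reads city/country or city/state/country.
    parts = location.split('/')
    city, country = parts[0], parts[-1]
    state = parts[1] if len(parts) == 3 else None

    return {
        'language': language,
        'code': code,
        'name': name,
        'short_name': short_name,
        'city': city,
        'state': state,
        'country': country,
        'airport': airport or None,
    }


place = parse_place_row(
    '-en,iah,houston-iah,houston-iah,houston/tx/us:g.bush intercont')
```

Here `place['code']` is `'iah'`, `place['state']` is `'tx'` and `place['airport']` is `'g.bush intercont'`. Rows without a state or an airport part, such as `en,ibz,ibiza,ibiza,ibiza/es`, yield `None` for those keys.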
From: <den...@us...> - 2009-08-08 14:55:48
Revision: 171
          http://opentrep.svn.sourceforge.net/opentrep/?rev=171&view=rev
Author:   denis_arnaud
Date:     2009-08-08 14:55:34 +0000 (Sat, 08 Aug 2009)

Log Message:
-----------
1. [Dev] Finished the work on bringing extra and additional Location objects into the API.
2. [DB] In the search batch, looking in the database is now based on the airport/city code, rather than on the Xapian document ID. That way, no database update is necessary when re-indexing, and any search from any Xapian index will find the corresponding result details within the database. The easiest way is to extract the first three letters of the Xapian document data.
3. [Dev] Wrote a (Python-based) PSP page in order to render in HTML the output of the search batch. There is still some work to do in order to adapt it to the new API (with extra and alternate locations).

Modified Paths:
--------------
    trunk/opentrep/Makefile.am
    trunk/opentrep/config/soci.m4
    trunk/opentrep/configure.ac
    trunk/opentrep/db/data/ref_place_names.csv
    trunk/opentrep/opentrep/Location.hpp
    trunk/opentrep/opentrep/OPENTREP_Service.hpp
    trunk/opentrep/opentrep/OPENTREP_Types.hpp
    trunk/opentrep/opentrep/batches/opentrep_indexer.cfg
    trunk/opentrep/opentrep/batches/opentrep_searcher.cfg
    trunk/opentrep/opentrep/batches/searcher.cpp
    trunk/opentrep/opentrep/bom/Place.cpp
    trunk/opentrep/opentrep/bom/Place.hpp
    trunk/opentrep/opentrep/bom/ResultHolder.cpp
    trunk/opentrep/opentrep/bom/StringMatcher.cpp
    trunk/opentrep/opentrep/bom/StringMatcher.hpp
    trunk/opentrep/opentrep/command/DBManager.cpp
    trunk/opentrep/opentrep/command/DBManager.hpp
    trunk/opentrep/opentrep/command/IndexBuilder.cpp
    trunk/opentrep/opentrep/command/RequestInterpreter.cpp
    trunk/opentrep/opentrep/factory/FacPlace.cpp
    trunk/opentrep/opentrep/python/pyopentrep.cpp
    trunk/opentrep/opentrep/python/pyopentrep.py
    trunk/opentrep/test/i18n/Makefile.am

Added Paths:
-----------
    trunk/opentrep/TODO
    trunk/opentrep/config/ax_icu.m4
    trunk/opentrep/gui/
    trunk/opentrep/gui/Makefile.am
trunk/opentrep/gui/icons/ trunk/opentrep/gui/icons/Makefile.am trunk/opentrep/gui/icons/opentrep.png trunk/opentrep/gui/icons/opentrep.xcf trunk/opentrep/gui/icons/sources.mk trunk/opentrep/gui/psp/ trunk/opentrep/gui/psp/Makefile.am trunk/opentrep/gui/psp/index.html trunk/opentrep/gui/psp/libpyopentrep_proxy.py trunk/opentrep/gui/psp/localize.py trunk/opentrep/gui/psp/log_service.py trunk/opentrep/gui/psp/opentrep.psp trunk/opentrep/gui/psp/result_parser.py trunk/opentrep/gui/psp/sources.mk trunk/opentrep/opentrep/LocationList.hpp trunk/opentrep/test/i18n/icufmt.cpp trunk/opentrep/test/i18n/ref/ trunk/opentrep/test/i18n/ref/ref_text_en.txt trunk/opentrep/test/i18n/ref/ref_text_ru.txt trunk/opentrep/test/i18n/ref/ref_text_ru_koi8r.txt trunk/opentrep/test/i18n/ref/ref_text_ru_koi8ru.txt trunk/opentrep/test/i18n/ref/ref_text_ru_windows_1251.txt trunk/opentrep/test/i18n/ref/ref_text_ua.txt trunk/opentrep/test/i18n/ref/ref_text_ua_koi8r.txt trunk/opentrep/test/i18n/ref/ref_text_ua_koi8u.txt trunk/opentrep/test/i18n/ref/ref_text_ua_windows_1251.txt trunk/opentrep/test/i18n/simple_io.cpp Property Changed: ---------------- trunk/opentrep/ trunk/opentrep/test/i18n/ Property changes on: trunk/opentrep ___________________________________________________________________ Modified: svn:ignore - configure config.log config.status autom4te.cache aclocal.m4 ABOUT-NLS INSTALL COPYING libtool Makefile.in Makefile opentrep.spec opentrep-config opentrep.m4 opentrep.pc opentrep-*.*.*.tar.* opentrep-html-doc-*.*.*.tar.* + configure config.log config.status autom4te.cache aclocal.m4 ABOUT-NLS INSTALL COPYING libtool Makefile.in Makefile opentrep.spec opentrep-config opentrep.m4 opentrep.pc opentrep-*.*.*.tar.* opentrep-html-doc-*.*.*.tar.* psp.tar.* Modified: trunk/opentrep/Makefile.am =================================================================== --- trunk/opentrep/Makefile.am 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/Makefile.am 2009-08-08 14:55:34 UTC (rev 171) @@ 
-24,7 +24,8 @@ EXTRA_DIST = @PACKAGE@.spec @PACKAGE@.m4 @PACKAGE@.pc Makefile.common # Build in these directories: -SUBDIRS = opentrep win32 po man $(INFO_DOC_DIR) $(HTML_DOC_DIR) db $(TEST_DIR) +SUBDIRS = @PACKAGE@ win32 po man $(INFO_DOC_DIR) $(HTML_DOC_DIR) db \ + gui $(TEST_DIR) # Configuration helpers @@ -43,8 +44,10 @@ dist-html: $(MAKE) -C doc dist-html +dist-gui: + $(MAKE) -C gui dist-gui -snapshot: snapshot-src snapshot-html +snapshot: snapshot-src snapshot-html snapshot-gui snapshot-src: @@ -53,8 +56,11 @@ snapshot-html: $(MAKE) -C doc dist-html html_tarname=@PACKAGE_TARNAME@-html-doc-`date +"%Y%m%d"` -upload: upload-src upload-html +snapshot-gui: + $(MAKE) -C gui dist-gui +upload: upload-src upload-html upload-gui + upload-src: dist @UPLOAD_COMMAND@ @PACKAGE_TARNAME@-@VERSION@.tar.gz \ @PACKAGE_TARNAME@-@VERSION@.tar.bz2 @@ -63,3 +69,6 @@ @UPLOAD_COMMAND@ @PACKAGE_TARNAME@-html-doc-@VERSION@.tar.gz \ @PACKAGE_TARNAME@-html-doc-@VERSION@.tar.bz2 +upload-gui: dist-gui + @UPLOAD_COMMAND@ @PACKAGE_TARNAME@-gui-@VERSION@.tar.gz \ + @PACKAGE_TARNAME@-gui-@VERSION@.tar.bz2 Added: trunk/opentrep/TODO =================================================================== --- trunk/opentrep/TODO (rev 0) +++ trunk/opentrep/TODO 2009-08-08 14:55:34 UTC (rev 171) @@ -0,0 +1,20 @@ +Todo list for the OpenTrep project +---------------------------------- + +* [01/08/2009] Finish the work on bringing extra and additional + Location objects into the API. +OK + +* [01/08/2009] In the search batch, when looking in the database, do + it based on the airport/city code, rather than on the Xapian + document ID. That way, no database update will be necessary when + re-indexing, and any search from any Xapian index will find the + corresponding result details within the database. The easiest way is + to extract the first three letters of the Xapian document data. +OK + +* [01/08/2009] Write a (Python-based) PSP page, in order to test the + different locales of the browsers. 
+The Python (PSP) page has been created, but there is still some work +to do in order to adapt it to the new API (with extra and alternate +locations). \ No newline at end of file Added: trunk/opentrep/config/ax_icu.m4 =================================================================== --- trunk/opentrep/config/ax_icu.m4 (rev 0) +++ trunk/opentrep/config/ax_icu.m4 2009-08-08 14:55:34 UTC (rev 171) @@ -0,0 +1,160 @@ +dnl @synopsis AX_ICU +dnl +dnl This macro tries to find Icu C API header and library locations. +dnl +dnl We define the following configure script flags: +dnl +dnl --with-icu: Give prefix for both library and headers, and try +dnl to guess subdirectory names for each. (e.g. Tack /lib and +dnl /include onto given dir name, and other common schemes.) +dnl --with-icu-lib: Similar to --with-icu, but for library only. +dnl --with-icu-include: Similar to --with-icu, but for headers +dnl only. +dnl +dnl @version 1.2, 2007/02/20 +dnl @author Warren Young <ic...@et...> + +AC_DEFUN([AX_ICU], +[ + # + # Set up configure script macros + # + AC_ARG_WITH(icu, + [ --with-icu=<path> root directory path of Icu installation], + [ICU_lib_check="$with_icu/lib64/icu $with_icu/lib/icu $with_icu/lib64 $with_icu/lib" + ICU_inc_check="$with_icu/include $with_icu/include/icu" + ICU_bin_check="$with_icu/bin"], + [ICU_lib_check="/usr/lib64 /usr/lib /usr/lib64/icu /usr/lib/icu /usr/local/lib64 /usr/local/lib /usr/local/lib/icu /usr/local/icu/lib /usr/local/icu/lib/icu /opt/icu/lib /opt/icu/lib/icu" + ICU_inc_check="/usr/include /usr/local/include /usr/local/icu/include /opt/icu/include" + ICU_bin_check="/usr/bin /usr/local/bin /usr/local/icu/bin"]) + + AC_ARG_WITH(icu-lib, + [ --with-icu-lib=<path> directory path of Icu library installation], + [ICU_lib_check="$with_icu_lib $with_icu_lib/lib64 $with_icu_lib/lib $with_icu_lib/lib64/icu $with_icu_lib/lib/icu"]) + + AC_ARG_WITH(icu-include, + [ --with-icu-include=<path> directory path of Icu header installation], + 
[ICU_inc_check="$with_icu_include $with_icu_include/include $with_icu_include/include/icu"]) + + + # + # Look for Icu Configuration Script + # + AC_MSG_CHECKING([for Icu configuration script]) + ICU_CONFIG= + ICU_bindir= + for m in $ICU_bin_check + do + if test -d "$m" && test -f "$m/icu-config" + then + ICU_CONFIG=$m/icu-config + ICU_bindir=$m + break + fi + done + + if test -z "$ICU_bindir" + then + AC_MSG_ERROR([Didn't find $ICU_CONFIG binary in '$ICU_bin_check']) + fi + + case "$ICU_bindir" in + /* ) ;; + * ) AC_MSG_ERROR([The Icu binary directory ($ICU_bindir) must be an absolute path.]) ;; + esac + + AC_MSG_RESULT([$ICU_bindir]) + + AC_PATH_PROG(ICU_CONFIG, icu-config, $ICU_bindir) + + if test "x${ICU_CONFIG+set}" != xset + then + ICU_VERSION=`${ICU_CONFIG} --version` + ICU_CFLAGS=`${ICU_CONFIG} --cppflags` + ICU_LIBS=`${ICU_CONFIG} --ldflags` + else + # + # Look for Icu C API library + # + AC_MSG_CHECKING([for Icu library directory]) + ICU_libdir= + ICU_IO_LIB=icuio + for m in $ICU_lib_check + do + if test -d "$m" && \ + (test -f "$m/lib$ICU_IO_LIB.so" \ + || test -f "$m/lib$ICU_IO_LIB.a") + then + ICU_libdir=$m + break + fi + done + + if test -z "$ICU_libdir" + then + AC_MSG_ERROR([Didn't find $ICU_IO_LIB library in '$ICU_lib_check']) + fi + + case "$ICU_libdir" in + /* ) ;; + * ) AC_MSG_ERROR([The Icu library directory ($ICU_libdir) must be an absolute path.]) ;; + esac + + AC_MSG_RESULT([$ICU_libdir]) + + case "$ICU_libdir" in + /usr/lib64) ;; + /usr/lib) ;; + *) LDFLAGS="$LDFLAGS -L${ICU_libdir}" ;; + esac + + # + # Look for Icu C API headers + # + AC_MSG_CHECKING([for Icu include directory]) + ICU_incdir= + for m in $ICU_inc_check + do + if test -d "$m" && test -f "$m/unicode/utf8.h" + then + ICU_incdir=$m + break + fi + done + + if test -z "$ICU_incdir" + then + AC_MSG_ERROR([Didn't find the Icu include dir in '$ICU_inc_check']) + fi + + case "$ICU_incdir" in + /* ) ;; + * ) AC_MSG_ERROR([The Icu include directory ($ICU_incdir) must be an absolute 
path.]) ;; + esac + + AC_MSG_RESULT([$ICU_incdir]) + + ICU_CFLAGS="-D_REENTRANT -I${ICU_incdir}" + ICU_LIBS="-licui18n -licuuc -licudata -lpthread -lm" + + case "$ICU_libdir" in + /usr/lib64) ;; + /usr/lib) ;; + *) ICU_LIBS="-L${ICU_libdir} $ICU_LIBS" ;; + esac + fi + + AC_SUBST(ICU_VERSION) + AC_SUBST(ICU_CFLAGS) + AC_SUBST(ICU_LIBS) + + save_LIBS="$LIBS" + LIBS="$LIBS $ICU_LIBS" +# AC_CHECK_LIB($ICU_IO_LIB, utext_isWritable, +# [], +# [AC_MSG_ERROR([Could not find working Icu client library!])] +# ) + ICU_IO_LIB="-l${ICU_IO_LIB}" + AC_SUBST(ICU_IO_LIB) + LIBS="$save_LIBS" +]) dnl AX_ICU Modified: trunk/opentrep/config/soci.m4 =================================================================== --- trunk/opentrep/config/soci.m4 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/config/soci.m4 2009-08-08 14:55:34 UTC (rev 171) @@ -60,9 +60,9 @@ SOCI_CORE_LIB=${SOCI_CORE_LIB}-${SOCI_LIB_SUFFIX} SOCI_MYSQL_LIB=${SOCI_MYSQL_LIB}-${SOCI_LIB_SUFFIX} SOCI_libdir=$m + break fi done - break fi done Modified: trunk/opentrep/configure.ac =================================================================== --- trunk/opentrep/configure.ac 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/configure.ac 2009-08-08 14:55:34 UTC (rev 171) @@ -147,7 +147,16 @@ AC_SUBST(XAPIAN_CFLAGS) AC_SUBST(XAPIAN_LIBS) +# -------------------------------------------------------------------- +# Support for ICU (i18n C API): http://www.icu-project.org +# -------------------------------------------------------------------- +AX_ICU +AC_SUBST(ICU_VERSION) +AC_SUBST(ICU_CFLAGS) +AC_SUBST(ICU_LIBS) +AC_SUBST(ICU_IO_LIB) + # ------------------------------------------------------------------- # Support for documentation # ------------------------------------------------------------------- @@ -249,6 +258,9 @@ db/maintenance/Makefile db/maintenance/tables/Makefile db/data/Makefile + gui/Makefile + gui/icons/Makefile + gui/psp/Makefile test/com/Makefile test/parsers/Makefile test/i18n/Makefile @@ -327,6 
+339,12 @@ o XAPIAN_CFLAGS ... : ${XAPIAN_CFLAGS} o XAPIAN_LIBS ..... : ${XAPIAN_LIBS} + - ICU ............... : + o ICU_version ..... : ${ICU_VERSION} + o ICU_CFLAGS ...... : ${ICU_CFLAGS} + o ICU_LIBS ........ : ${ICU_LIBS} + o ICU_IO_LIB ...... : ${ICU_IO_LIB} + - CPPUNIT ........... : o CPPUNIT_VERSION . : ${CPPUNIT_VERSION} o CPPUNIT_CFLAGS .. : ${CPPUNIT_CFLAGS} Modified: trunk/opentrep/db/data/ref_place_names.csv =================================================================== --- trunk/opentrep/db/data/ref_place_names.csv 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/db/data/ref_place_names.csv 2009-08-08 14:55:34 UTC (rev 171) @@ -6253,7 +6253,7 @@ en,yvo,val d'or,val d'or/qc/ca en,yvp,kuujjuaq,kuujjuaq/qc/ca en,yvq,norman wells,norman wells/nt/ca -en,yvr,vancouver int,vancouver/bc/ca:intl +en,yvr,vancouver int,vancouver/bc/ca:intl,vancouver en,yvs,ski rail station,ski/no:ski rail station en,yvt,buffalo narrows,buffalo narrows/sk/ca en,yvv,wiarton,wiarton/on/ca @@ -9446,7 +9446,7 @@ en,xdx,sarnia,sarnia/on/ca:railway station en,xdy,sudbury,sudbury/on/ca:junction rail st en,xdz,the pas,the pas/mb/ca:railway station -en,xea,vancouver,vancouver/bc/ca:railway statio +en,xea,vancouver railway,vancouver/bc/ca:railway statio en,xeb,evian les bains,evian les bains/fr:off- en,xec,windsor,windsor/on/ca:railway station en,xed,disneyland paris,paris/fr:disneyland paris Property changes on: trunk/opentrep/gui ___________________________________________________________________ Added: svn:ignore + .libs .deps Makefile Makefile.in Added: trunk/opentrep/gui/Makefile.am =================================================================== --- trunk/opentrep/gui/Makefile.am (rev 0) +++ trunk/opentrep/gui/Makefile.am 2009-08-08 14:55:34 UTC (rev 171) @@ -0,0 +1,27 @@ +# Python Server Pages (PSP) + +SUBDIRS = icons psp + +MAINTAINERCLEANFILES = Makefile.in Makefile + +datadir = @datadir@ +pkgdatadir = $(datadir)/@PACKAGE@ +guidir = $(pkgdatadir)/gui + +psp_sources = 
psp +psp_dests = $(foreach ext,.tar.gz .tar.bz2,$(addsuffix $(ext),$(psp_sources))) + +# Targets +$(top_builddir)/%.tar.gz $(builddir)/%.tar.gz: %/*.html %/*.py %/*.psp %/../icons/*.png + tar chof - $^ | gzip --best -c > $@ + +$(top_builddir)/%.tar.bz2 $(builddir)/%.tar.bz2: %/*.html %/*.py %/*.psp %/../icons/*.png + tar chof - $^ | bzip2 -9 -c > $@ + +dist-gui: $(addprefix $(top_builddir)/,$(psp_dests)) + +clean-local: + rm -f $(addprefix $(top_builddir)/,$(psp_dests)) + +snapshot-gui: + $(MAKE) dist-gui gui_tarname=@PACKAGE_TARNAME@-gui-`date +"%Y%m%d"` Property changes on: trunk/opentrep/gui/icons ___________________________________________________________________ Added: svn:ignore + .libs .deps Makefile Makefile.in Added: trunk/opentrep/gui/icons/Makefile.am =================================================================== --- trunk/opentrep/gui/icons/Makefile.am (rev 0) +++ trunk/opentrep/gui/icons/Makefile.am 2009-08-08 14:55:34 UTC (rev 171) @@ -0,0 +1,25 @@ +# gui/icons sub-directory: Images (.png, .gif, etc) +include $(srcdir)/sources.mk + +datadir = @datadir@ +pkgdatadir = $(datadir)/@PACKAGE@ +imgdir = $(pkgdatadir)/gui/icons + +MAINTAINERCLEANFILES = Makefile.in Makefile + +noinst_DATA = $(img_sources) + +EXTRA_DIST = $(noinst_DATA) + +# Targets +install-data-local: + $(mkinstalldirs) $(DESTDIR)$(imgdir); \ + for f in $(noinst_DATA); do \ + $(INSTALL_DATA) $$f $(DESTDIR)$(imgdir); \ + done + +uninstall-local: + rm -rf $(DESTDIR)$(imgdir) + +clean-local: + rm -rf *.log *.tag Added: trunk/opentrep/gui/icons/opentrep.png =================================================================== (Binary files differ) Property changes on: trunk/opentrep/gui/icons/opentrep.png ___________________________________________________________________ Added: svn:mime-type + application/octet-stream Added: trunk/opentrep/gui/icons/opentrep.xcf =================================================================== (Binary files differ) Property changes on: 
trunk/opentrep/gui/icons/opentrep.xcf ___________________________________________________________________ Added: svn:mime-type + application/octet-stream Added: trunk/opentrep/gui/icons/sources.mk =================================================================== --- trunk/opentrep/gui/icons/sources.mk (rev 0) +++ trunk/opentrep/gui/icons/sources.mk 2009-08-08 14:55:34 UTC (rev 171) @@ -0,0 +1 @@ +img_sources = $(top_srcdir)/gui/icons/opentrep.png Property changes on: trunk/opentrep/gui/psp ___________________________________________________________________ Added: svn:ignore + .libs .deps Makefile Makefile.in Added: trunk/opentrep/gui/psp/Makefile.am =================================================================== --- trunk/opentrep/gui/psp/Makefile.am (rev 0) +++ trunk/opentrep/gui/psp/Makefile.am 2009-08-08 14:55:34 UTC (rev 171) @@ -0,0 +1,25 @@ +# Python Server Pages (PSP) +include $(srcdir)/sources.mk + +datadir = @datadir@ +pkgdatadir = $(datadir)/@PACKAGE@ +pspdir = $(pkgdatadir)/gui/psp + +MAINTAINERCLEANFILES = Makefile.in Makefile + +noinst_DATA = $(html_sources) $(py_sources) $(psp_sources) + +EXTRA_DIST = $(noinst_DATA) + +# Targets +install-data-local: + $(mkinstalldirs) $(DESTDIR)$(pspdir); \ + for f in $(noinst_DATA); do \ + $(INSTALL_DATA) $$f $(DESTDIR)$(pspdir); \ + done + +uninstall-local: + rm -rf $(DESTDIR)$(pspdir) + +clean-local: + rm -rf *.log *.tag Added: trunk/opentrep/gui/psp/index.html =================================================================== --- trunk/opentrep/gui/psp/index.html (rev 0) +++ trunk/opentrep/gui/psp/index.html 2009-08-08 14:55:34 UTC (rev 171) @@ -0,0 +1,14 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> +<html xmlns="http://www.w3.org/1999/xhtml"> +<head> +<meta http-equiv="content-type" content="text/html; charset=UTF-8" /> +<meta http-equiv="refresh" + content="0; 
url=http://localhost/opentrep/opentrep.psp" /> +<title>Redirection</title> +<meta name="robots" content="noindex,follow" /> +</head> + +<body> +</body> +</html> Added: trunk/opentrep/gui/psp/libpyopentrep_proxy.py =================================================================== --- trunk/opentrep/gui/psp/libpyopentrep_proxy.py (rev 0) +++ trunk/opentrep/gui/psp/libpyopentrep_proxy.py 2009-08-08 14:55:34 UTC (rev 171) @@ -0,0 +1,8 @@ +#!/usr/bin/python + +import sys + +def import_libpyopentrep(libpyopentrep_path): + sys.path.append(libpyopentrep_path) + import libpyopentrep + return libpyopentrep Added: trunk/opentrep/gui/psp/localize.py =================================================================== --- trunk/opentrep/gui/psp/localize.py (rev 0) +++ trunk/opentrep/gui/psp/localize.py 2009-08-08 14:55:34 UTC (rev 171) @@ -0,0 +1,14 @@ +#!/usr/bin/python + +import socket + +www_log_filename = '/var/log/opentrep/www.log' +trep_log_filename = '/var/log/opentrep/opentrep.log' +tmp_trep_log_filename = '/var/log/opentrep/tmp_opentrep.log' + +hostname = socket.gethostname() +main_name = hostname.split('.')[0] + +traveldb_path = '/var/www/opentrep/traveldb' +libpyopentrep_path = '/tmp/opentrep/lib' +opentrep_dbparams = {'user': 'opentrep', 'password': 'opentrep', 'host': 'localhost', 'port': '3306', 'db': 'trep_opentrep'} Added: trunk/opentrep/gui/psp/log_service.py =================================================================== --- trunk/opentrep/gui/psp/log_service.py (rev 0) +++ trunk/opentrep/gui/psp/log_service.py 2009-08-08 14:55:34 UTC (rev 171) @@ -0,0 +1,25 @@ +#!/usr/bin/python + +import socket, os, datetime + +def log(filename, req, query, codes, unrecognized): + req.add_common_vars() + # determine ip + remote_client_ip = req.connection.remote_ip + # determine hostname + hostname = req.connection.remote_host + if hostname == None: hostname = 'localhost' + # determine time + str_time = datetime.datetime.now().strftime('%y%m%d%H%M%S') + # determine user 
agent + agent = '' + if req.subprocess_env.has_key("HTTP_USER_AGENT"): agent = req.subprocess_env["HTTP_USER_AGENT"] + # determine user allowed languages + languages = '' + if req.subprocess_env.has_key("HTTP_ACCEPT_LANGUAGE"): languages = req.subprocess_env["HTTP_ACCEPT_LANGUAGE"] + # determine user allowed character sets + charsets = '' + if req.subprocess_env.has_key("HTTP_ACCEPT_CHARSET"): charsets = req.subprocess_env["HTTP_ACCEPT_CHARSET"] + # write to file + str_out = '^'.join([str_time,remote_client_ip, hostname, query, ','.join(codes), unrecognized, agent, languages, charsets]) + os.system('echo "%s" >> %s' % (str_out, filename)) Added: trunk/opentrep/gui/psp/opentrep.psp =================================================================== --- trunk/opentrep/gui/psp/opentrep.psp (rev 0) +++ trunk/opentrep/gui/psp/opentrep.psp 2009-08-08 14:55:34 UTC (rev 171) @@ -0,0 +1,80 @@ +<% +import os +local_path = '/var/www/opentrep' +from mod_python import apache +localize = apache.import_module('localize', path=[local_path]) +log_service = apache.import_module('log_service', path=[local_path]) + +# defaults +msg, head, form_value, unrecognized = '', '', '', '' +#body_declaration = '<body>' +quiet = True + +# parsing: recognize sequence of three-letter codes +codes = [] +alter_locations = [] +queryStringForm = form +if queryStringForm.has_key('data'): + form_value = queryStringForm['data'] + quiet = False + if form_value.rstrip(' ') == '': + pass + else: + # Use opentrep + libpyopentrep_proxy = apache.import_module('libpyopentrep_proxy', path=[local_path]) + libpyopentrep = libpyopentrep_proxy.import_libpyopentrep(localize.libpyopentrep_path) + mySearch = libpyopentrep.OpenTrepSearcher() + mySearch.init(localize.traveldb_path, localize.tmp_trep_log_filename, localize.opentrep_dbparams['user'], localize.opentrep_dbparams['password'], localize.opentrep_dbparams['host'], localize.opentrep_dbparams['port'], localize.opentrep_dbparams['db']) + str_matches = 
mySearch.search(form_value) + if ';' in str_matches: + str_matches, unrecognized = str_matches.split(';') + msg = 'unrecognized: %s. ' % unrecognized + str_value = unrecognized + if str_matches != '': + alter_locations = [x for x in str_matches.split(',')] + for alter_location_list in alter_locations: + alter_location_list = [x for x in alter_location_list.split('-')] + for extra_location_list in alter_location_list: + extra_location_list = [x for x in extra_location_list.split(':')] + + codes = [x[0].upper() for x in alter_locations] + if len(codes)>0: form_value = ' '.join(codes) + if str_value != '': form_value += ' ' + str_value + + # Logging + log_service.log(localize.www_log_filename, req, queryStringForm['data'], codes, unrecognized) + os.system('cat %s >> %s' % (localize.tmp_trep_log_filename, localize.trep_log_filename)) + +%> + +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> +<html xmlns="http://www.w3.org/1999/xhtml"> +<head> +<title>OpenTREP</title> +<%= head %> +</head> + +<body> +<div align="center"> +<a href="opentrep.psp"><img src="/icons/opentrep.png" height="80px" border=0></a> +</div> +<br> + +<div align="center"> +<table border="0"> + <tr> + <td> + <form value="queryStringForm" action="opentrep.psp" method="post"> + <input type="text" size=80% name="data" value="<%= form_value%>"> + <input type="submit" value="Send"> + </form> + </td> + </tr> +</table> +</div> + +<p style="font-size:small;"><%= msg %></p> + +</body> +</html> Added: trunk/opentrep/gui/psp/result_parser.py =================================================================== --- trunk/opentrep/gui/psp/result_parser.py (rev 0) +++ trunk/opentrep/gui/psp/result_parser.py 2009-08-08 14:55:34 UTC (rev 171) @@ -0,0 +1,64 @@ +#!/usr/bin/env python + +import sys + +# Default result string +defaultResultString = 'yvr:xea/98-xtw/87,sfo/100,led:dft:htl/96;niznayou' + +# If no result string 
was supplied as arguments of the command-line, +# ask the user for some +resultString = ','.join(sys.argv[1:]) +if resultString == '' : resultString = defaultResultString + +# Function to parse the result string +def parseResultString(iResultString): + form_value, unrecognized = '', '' + msg = '(parsing successful)' + str_matches = iResultString + alter_locations = [] + + if ';' in str_matches: + str_matches, unrecognized = str_matches.split(';') + msg = '(unrecognized: %s)' % unrecognized + str_value = unrecognized + + if str_matches != '': + alter_locations = str_matches.split(',') + + print 'alter_locations: ', alter_locations + + idx1 = 0 + while idx1 != len(alter_locations): + +# print 'Before - alter_locations['+str(idx1)+']: ', alter_locations[idx1] + alter_locations[idx1] = alter_locations[idx1].split('-') +# print 'After - alter_locations['+str(idx1)+']: ', alter_locations[idx1], alter_locations + + idx2 = 0 + while idx2 != len(alter_locations[idx1]): + + alter_locations[idx1][idx2] = alter_locations[idx1][idx2].split(':') + + idx3 = 0 + while idx3 != len(alter_locations[idx1][idx2]): + + alter_locations[idx1][idx2][idx3] = alter_locations[idx1][idx2][idx3].split('/') + idx3 += 1 + + idx2 += 1 + + idx1 += 1 + +# codes = [x.upper() for x in alter_locations] +# if len(codes) > 0: form_value = ' '.join(codes) + if str_value != '': form_value += ' ' + str_value + + print 'After - alter_locations: ', alter_locations + + print 'Result ' + msg + ':' + return form_value + +# Main +print 'Before: ' + resultString +resultString = parseResultString(resultString) +print 'After: ' + resultString Property changes on: trunk/opentrep/gui/psp/result_parser.py ___________________________________________________________________ Added: svn:executable + * Added: trunk/opentrep/gui/psp/sources.mk =================================================================== --- trunk/opentrep/gui/psp/sources.mk (rev 0) +++ trunk/opentrep/gui/psp/sources.mk 2009-08-08 14:55:34 UTC (rev 
171) @@ -0,0 +1,6 @@ +html_sources = $(top_srcdir)/gui/psp/index.html +psp_sources = $(top_srcdir)/gui/psp/opentrep.psp +py_sources = \ + $(top_srcdir)/gui/psp/localize.py \ + $(top_srcdir)/gui/psp/log_service.py \ + $(top_srcdir)/gui/psp/libpyopentrep_proxy.py Modified: trunk/opentrep/opentrep/Location.hpp =================================================================== --- trunk/opentrep/opentrep/Location.hpp 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/opentrep/Location.hpp 2009-08-08 14:55:34 UTC (rev 171) @@ -11,6 +11,7 @@ #include <list> // OpenTrep #include <opentrep/OPENTREP_Types.hpp> +#include <opentrep/LocationList.hpp> #include <opentrep/OPENTREP_Abstract.hpp> namespace OPENTREP { @@ -73,6 +74,26 @@ return _nameList; } + /** Get the matching percentage. */ + const MatchingPercentage_T& getPercentage() const { + return _percentage; + } + + /** Get the allowed edit distance/error. */ + const NbOfErrors_T& getEditDistance() const { + return _editDistance; + } + + /** Get the list of extra matching (similar) locations. */ + const LocationList_T& getExtraLocationList() const { + return _extraLocationList; + } + + /** Get the list of alternate matching (less similar) locations. */ + const LocationList_T& getAlternateLocationList() const { + return _alternateLocationList; + } + // ///////// Setters ////////// /** Set the Location code. */ @@ -125,7 +146,27 @@ _nameList = iNameList; } + /** Set the Xapian matching percentage. */ + void setPercentage (const MatchingPercentage_T& iPercentage) { + _percentage = iPercentage; + } + + /** Set the allowed edit distance/error. */ + void setEditDistance (const NbOfErrors_T& iEditDistance) { + _editDistance = iEditDistance; + } + + /** Add an extra matching location. */ + void addExtraLocation (const Location& iExtraLocation) { + _extraLocationList.push_back (iExtraLocation); + } + /** Add an alternate matching location. 
*/ + void addAlternateLocation (const Location& iAlternateLocation) { + _alternateLocationList.push_back (iAlternateLocation); + } + + public: // ///////// Display methods //////// /** Dump a structure into an output stream. @@ -145,7 +186,17 @@ oStr << _locationCode << ", " << _cityCode << ", " << _stateCode << ", " << _countryCode << ", " << _regionCode << ", " << _continentCode << ", " << _timeZoneGroup - << ", " << _longitude << ", " << _latitude; + << ", " << _longitude << ", " << _latitude + << ", " << _percentage << ", " << _editDistance; + + if (_extraLocationList.empty() == false) { + oStr << " " << _extraLocationList.size() << " extra match(es)"; + } + + if (_alternateLocationList.empty() == false) { + oStr << " " << _alternateLocationList.size() << " alternate match(es)"; + } + return oStr.str(); } @@ -157,6 +208,36 @@ itName != _nameList.end(); ++itName) { oStr << ", " << *itName; } + + if (_extraLocationList.empty() == false) { + oStr << "; Extra matches: {"; + unsigned short idx = 0; + for (LocationList_T::const_iterator itLoc = _extraLocationList.begin(); + itLoc != _extraLocationList.end(); ++itLoc, ++idx) { + if (idx != 0) { + oStr << ", "; + } + const Location& lExtraLocation = *itLoc; + oStr << lExtraLocation.toShortString(); + } + oStr << "}"; + } + + if (_alternateLocationList.empty() == false) { + oStr << "; Alternate matches: {"; + unsigned short idx = 0; + for (LocationList_T::const_iterator itLoc = + _alternateLocationList.begin(); + itLoc != _alternateLocationList.end(); ++itLoc, ++idx) { + if (idx != 0) { + oStr << ", "; + } + const Location& lAlternateLocation = *itLoc; + oStr << lAlternateLocation.toShortString(); + } + oStr << "}"; + } + return oStr.str(); } @@ -168,12 +249,15 @@ const std::string& iRegionCode, const std::string& iContinentCode, const std::string& iTimeZoneGroup, const double iLongitude, const double iLatitude, - const LocationNameList_T& iNameList) + const LocationNameList_T& iNameList, + const MatchingPercentage_T& 
iPercentage, + const NbOfErrors_T& iEditDistance) : _locationCode (iPlaceCode), _cityCode (iCityCode), _stateCode (iStateCode), _countryCode (iCountryCode), _regionCode (iRegionCode), _continentCode (iContinentCode), _timeZoneGroup (iTimeZoneGroup), _longitude (iLongitude), - _latitude (iLatitude), _nameList (iNameList) { + _latitude (iLatitude), _nameList (iNameList), + _percentage (iPercentage), _editDistance (iEditDistance) { } /** Default Constructor. */ @@ -207,11 +291,19 @@ double _latitude; /** List of (American) English names. */ LocationNameList_T _nameList; - }; + /** Matching percentage. */ + MatchingPercentage_T _percentage; - /** List of (geographical) location structures. */ - typedef std::list<Location> LocationList_T; + /** Allowed edit error/distance. */ + NbOfErrors_T _editDistance; + /** List of extra matching (similar) locations. */ + LocationList_T _extraLocationList; + + /** List of alternate matching (less similar) locations. */ + LocationList_T _alternateLocationList; + }; + } #endif // __OPENTREP_LOCATION_HPP Added: trunk/opentrep/opentrep/LocationList.hpp =================================================================== --- trunk/opentrep/opentrep/LocationList.hpp (rev 0) +++ trunk/opentrep/opentrep/LocationList.hpp 2009-08-08 14:55:34 UTC (rev 171) @@ -0,0 +1,20 @@ +#ifndef __OPENTREP_LOCATIONLIST_HPP +#define __OPENTREP_LOCATIONLIST_HPP + +// ////////////////////////////////////////////////////////////////////// +// Import section +// ////////////////////////////////////////////////////////////////////// +// STL +#include <list> + +namespace OPENTREP { + + // Forward declaration + struct Location; + + /** List of (geographical) location structures. 
*/ + typedef std::list<Location> LocationList_T; + +} +#endif // __OPENTREP_LOCATIONLIST_HPP + Modified: trunk/opentrep/opentrep/OPENTREP_Service.hpp =================================================================== --- trunk/opentrep/opentrep/OPENTREP_Service.hpp 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/opentrep/OPENTREP_Service.hpp 2009-08-08 14:55:34 UTC (rev 171) @@ -10,7 +10,7 @@ // OpenTREP #include <opentrep/OPENTREP_Types.hpp> #include <opentrep/DBParams.hpp> -#include <opentrep/Location.hpp> +#include <opentrep/LocationList.hpp> #include <opentrep/DistanceErrorRule.hpp> namespace OPENTREP { Modified: trunk/opentrep/opentrep/OPENTREP_Types.hpp =================================================================== --- trunk/opentrep/opentrep/OPENTREP_Types.hpp 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/opentrep/OPENTREP_Types.hpp 2009-08-08 14:55:34 UTC (rev 171) @@ -45,6 +45,9 @@ class XapianTravelDatabaseEmptyException : public XapianException { }; + class XapianTravelDatabaseNotInSyncWithSQLDatabaseException : public XapianException { + }; + class SQLDatabaseException : public RootException { }; @@ -87,6 +90,9 @@ /** Xapian document ID. */ typedef int XapianDocID_T; + /** Xapian percentage. */ + typedef unsigned int MatchingPercentage_T; + /** Travel search query. 
*/ typedef std::string TravelQuery_T; Modified: trunk/opentrep/opentrep/batches/opentrep_indexer.cfg =================================================================== --- trunk/opentrep/opentrep/batches/opentrep_indexer.cfg 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/opentrep/batches/opentrep_indexer.cfg 2009-08-08 14:55:34 UTC (rev 171) @@ -1,4 +1,4 @@ -database=../../test/traveldb +database=/tmp/opentrep/share/opentrep/traveldb log=opentrep_indexer.log user=opentrep passwd=opentrep Modified: trunk/opentrep/opentrep/batches/opentrep_searcher.cfg =================================================================== --- trunk/opentrep/opentrep/batches/opentrep_searcher.cfg 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/opentrep/batches/opentrep_searcher.cfg 2009-08-08 14:55:34 UTC (rev 171) @@ -1,4 +1,4 @@ -database=../../test/traveldb +database=/tmp/opentrep/share/opentrep/traveldb log=opentrep_searcher.log user=opentrep passwd=opentrep Modified: trunk/opentrep/opentrep/batches/searcher.cpp =================================================================== --- trunk/opentrep/opentrep/batches/searcher.cpp 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/opentrep/batches/searcher.cpp 2009-08-08 14:55:34 UTC (rev 171) @@ -13,6 +13,7 @@ #include <boost/program_options.hpp> // OpenTREP #include <opentrep/OPENTREP_Service.hpp> +#include <opentrep/Location.hpp> #include <opentrep/DBParams.hpp> #include <opentrep/config/opentrep-paths.hpp> Modified: trunk/opentrep/opentrep/bom/Place.cpp =================================================================== --- trunk/opentrep/opentrep/bom/Place.cpp 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/opentrep/bom/Place.cpp 2009-08-08 14:55:34 UTC (rev 171) @@ -22,7 +22,8 @@ _regionCode (iPlace._regionCode), _continentCode (iPlace._continentCode), _timeZoneGroup (iPlace._timeZoneGroup), _longitude (iPlace._longitude), _latitude (iPlace._latitude), _nameMatrix (iPlace._nameMatrix), - _docID (iPlace._docID) { 
+ _docID (iPlace._docID), _percentage (iPlace._percentage), + _editDistance (iPlace._editDistance) { } // ////////////////////////////////////////////////////////////////////// @@ -77,7 +78,9 @@ oStr << ", " << lCityCode << ", " << _stateCode << ", " << _countryCode << ", " << _regionCode << ", " << _continentCode << ", " << _timeZoneGroup - << ", " << _longitude << ", " << _latitude << ", " << _docID << ". "; + << ", " << _longitude << ", " << _latitude + << ", " << _docID << ", " << _percentage + << ", " << _editDistance << ". "; for (NameMatrix_T::const_iterator itNameList = _nameMatrix.begin(); itNameList != _nameMatrix.end(); ++itNameList) { @@ -85,6 +88,37 @@ oStr << lNameList.toString(); } + if (_extraPlaceList.empty() == false) { + oStr << "; Extra matches: {"; + unsigned short idx = 0; + for (PlaceOrderedList_T::const_iterator itLoc = _extraPlaceList.begin(); + itLoc != _extraPlaceList.end(); ++itLoc, ++idx) { + if (idx != 0) { + oStr << "; "; + } + const Place* lExtraPlace_ptr = *itLoc; + assert (lExtraPlace_ptr != NULL); + oStr << lExtraPlace_ptr->toShortString(); + } + oStr << "}"; + } + + if (_alternatePlaceList.empty() == false) { + oStr << "; Alternate matches: {"; + unsigned short idx = 0; + for (PlaceOrderedList_T::const_iterator itLoc = + _alternatePlaceList.begin(); + itLoc != _alternatePlaceList.end(); ++itLoc, ++idx) { + if (idx != 0) { + oStr << "; "; + } + const Place* lAlternatePlace_ptr = *itLoc; + assert (lAlternatePlace_ptr != NULL); + oStr << lAlternatePlace_ptr->toShortString(); + } + oStr << "}"; + } + return oStr.str(); } @@ -100,7 +134,9 @@ oStr << ", " << lCityCode << ", " << _stateCode << ", " << _countryCode << ", " << _regionCode << ", " << _continentCode << ", " << _timeZoneGroup - << ", " << _longitude << ", " << _latitude << ", " << _docID; + << ", " << _longitude << ", " << _latitude + << ", " << _docID << ", " << _percentage + << ", " << _editDistance; NameMatrix_T::const_iterator itNameHolder = _nameMatrix.begin(); if 
(itNameHolder != _nameMatrix.end()) { @@ -113,6 +149,14 @@ } } + if (_extraPlaceList.empty() == false) { + oStr << " " << _extraPlaceList.size() << " extra match(es)"; + } + + if (_alternatePlaceList.empty() == false) { + oStr << " " << _alternatePlaceList.size() << " alternate match(es)"; + } + return oStr.str(); } @@ -143,6 +187,8 @@ << ", longitude = " << _longitude << ", latitude = " << _latitude << ", docID = " << _docID + << ", percentage = " << _percentage << "%" + << ", edit distance = " << _editDistance << std::endl; return oStr.str(); } @@ -215,7 +261,37 @@ // Copy the parameters from the Place object to the Location structure Location oLocation (_placeCode, lCityCode, _stateCode, _countryCode, _regionCode, _continentCode, _timeZoneGroup, - _longitude, _latitude, lNameList); + _longitude, _latitude, lNameList, + _percentage, _editDistance); + + // Add extra matching locations, whenever they exist + if (_extraPlaceList.empty() == false) { + for (PlaceOrderedList_T::const_iterator itLoc = _extraPlaceList.begin(); + itLoc != _extraPlaceList.end(); ++itLoc) { + const Place* lExtraPlace_ptr = *itLoc; + assert (lExtraPlace_ptr != NULL); + + // Add the extra matching location + const Location& lExtraLocation = lExtraPlace_ptr->createLocation(); + oLocation.addExtraLocation (lExtraLocation); + } + } + + // Add alternate matching locations, whenever they exist + if (_alternatePlaceList.empty() == false) { + for (PlaceOrderedList_T::const_iterator itLoc = + _alternatePlaceList.begin(); + itLoc != _alternatePlaceList.end(); ++itLoc) { + const Place* lAlternatePlace_ptr = *itLoc; + assert (lAlternatePlace_ptr != NULL); + + // Add the alternate matching location + const Location& lAlternateLocation = + lAlternatePlace_ptr->createLocation(); + oLocation.addAlternateLocation (lAlternateLocation); + } + } + return oLocation; } } Modified: trunk/opentrep/opentrep/bom/Place.hpp =================================================================== --- 
trunk/opentrep/opentrep/bom/Place.hpp 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/opentrep/bom/Place.hpp 2009-08-08 14:55:34 UTC (rev 171) @@ -81,6 +81,16 @@ return _docID; } + /** Get the matching percentage. */ + const MatchingPercentage_T& getPercentage() const { + return _percentage; + } + + /** Get the allowed edit distance/error. */ + const NbOfErrors_T& getEditDistance() const { + return _editDistance; + } + /** Get the map of name lists. */ const NameMatrix_T& getNameMatrix () const { return _nameMatrix; @@ -156,6 +166,16 @@ _docID = iDocID; } + /** Set the Xapian matching percentage. */ + void setPercentage (const MatchingPercentage_T& iPercentage) { + _percentage = iPercentage; + } + + /** Set the allowed edit distance/error. */ + void setEditDistance (const NbOfErrors_T& iEditDistance) { + _editDistance = iEditDistance; + } + public: // ////////// Setters in underlying names //////// @@ -247,9 +267,16 @@ double _latitude; /** List of names, for each given language. */ NameMatrix_T _nameMatrix; + /** Xapian document ID. */ XapianDocID_T _docID; + /** Matching percentage. */ + MatchingPercentage_T _percentage; + + /** Allowed edit error/distance. */ + NbOfErrors_T _editDistance; + /** List of extra matching (similar) places. 
*/ PlaceOrderedList_T _extraPlaceList; Modified: trunk/opentrep/opentrep/bom/ResultHolder.cpp =================================================================== --- trunk/opentrep/opentrep/bom/ResultHolder.cpp 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/opentrep/bom/ResultHolder.cpp 2009-08-08 14:55:34 UTC (rev 171) @@ -280,7 +280,7 @@ const NbOfMatches_T lNbOfMatches = lMatchingDocument.notifyIfExtraMatch(); OPENTREP_LOG_DEBUG ("==> " << lNbOfMatches - << " matches for the query string: `" + << " main matches for the query string: `" << lMatchedString << "' (from `" << lQueryString << "')"); Modified: trunk/opentrep/opentrep/bom/StringMatcher.cpp =================================================================== --- trunk/opentrep/opentrep/bom/StringMatcher.cpp 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/opentrep/bom/StringMatcher.cpp 2009-08-08 14:55:34 UTC (rev 171) @@ -621,4 +621,21 @@ WordHolder::createStringFromWordList (lRemainingWordList); } + // ////////////////////////////////////////////////////////////////////// + std::string StringMatcher::getPlaceCode (const Xapian::Document& iDocument) { + // Retrieve the Xapian document data + const std::string& lDocumentData = iDocument.get_data(); + + // Tokenise the string into words + WordList_T lWordList; + WordHolder::tokeniseStringIntoWordList (lDocumentData, lWordList); + assert (lWordList.empty() == false); + + // By convention (within OpenTrep), the first word of the Xapian + // document data string is the place code + const std::string& lPlaceCode = lWordList.front(); + + return lPlaceCode; + } + } Modified: trunk/opentrep/opentrep/bom/StringMatcher.hpp =================================================================== --- trunk/opentrep/opentrep/bom/StringMatcher.hpp 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/opentrep/bom/StringMatcher.hpp 2009-08-08 14:55:34 UTC (rev 171) @@ -15,6 +15,7 @@ namespace Xapian { class MSet; class Database; + class Document; } namespace 
OPENTREP { @@ -24,6 +25,7 @@ for more information. */ class StringMatcher : public BomAbstract { public: + // /////////////////////////////////////////////// /** Search, within the Xapian database, for occurrences of the words of the search string. @param Xapian::MSet& The Xapian matching set. It can be empty. @@ -63,6 +65,15 @@ static void subtractParsedToRemaining (const std::string& iAlreadyParsedQueryString, std::string& ioRemainingQueryString); + + + public: + // /////////////////////////////////////////////// + /** Extract the place code from the document data. + <br>The place code is the first 3-letter string of the Xapian + document data/content. */ + static std::string getPlaceCode (const Xapian::Document&); + }; } Modified: trunk/opentrep/opentrep/command/DBManager.cpp =================================================================== --- trunk/opentrep/opentrep/command/DBManager.cpp 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/opentrep/command/DBManager.cpp 2009-08-08 14:55:34 UTC (rev 171) @@ -60,6 +60,52 @@ // ////////////////////////////////////////////////////////////////////// void DBManager:: + prepareSelectOnPlaceCodeStatement (soci::session& ioSociSession, + soci::statement& ioSelectStatement, + const std::string& iPlaceCode, + Place& ioPlace) { + + try { + + // Instanciate a SQL statement (no request is performed at that stage) + /** + select rpd.code AS code, city_code, xapian_docid, is_airport, is_city, + is_main, is_commercial, state_code, country_code, region_code, + continent_code, time_zone_grp, longitude, latitude, language_code, + classical_name, extended_name, alternate_name1, alternate_name2, + alternate_name3, alternate_name4, alternate_name5, alternate_name6, + alternate_name7, alternate_name8, alternate_name9, alternate_name10 + from ref_place_details rpd, ref_place_names rpn + where rpd.code = iPlaceCode + and rpn.code = rpd.code; + */ + + ioSelectStatement = + (ioSociSession.prepare + << "select rpd.code AS code, 
city_code, xapian_docid, is_airport, " + << "is_city, is_main, is_commercial, state_code, country_code, " + << "region_code, continent_code, time_zone_grp, longitude, latitude, " + << "language_code, classical_name, extended_name, " + << "alternate_name1, alternate_name2, alternate_name3, " + << "alternate_name4, alternate_name5, alternate_name6, " + << "alternate_name7, alternate_name8, alternate_name9, " + << "alternate_name10 " + << "from ref_place_details rpd, ref_place_names rpn " + << "where rpd.code = :place_code " + << "and rpn.code = rpd.code", + soci::into (ioPlace), soci::use (iPlaceCode)); + + // Execute the SQL query + ioSelectStatement.execute(); + + } catch (std::exception const& lException) { + OPENTREP_LOG_ERROR ("Error: " << lException.what()); + throw SQLDatabaseException(); + } + } + + // ////////////////////////////////////////////////////////////////////// + void DBManager:: prepareSelectOnDocIDStatement (soci::session& ioSociSession, soci::statement& ioSelectStatement, const XapianDocID_T& iDocID, @@ -164,7 +210,7 @@ // ////////////////////////////////////////////////////////////////////// bool DBManager::retrievePlace (soci::session& ioSociSession, - const XapianDocID_T& iDocID, + const std::string& iPlaceCode, Place& ioPlace) { bool oHasRetrievedPlace = false; @@ -172,8 +218,9 @@ // Prepare the SQL request corresponding to the select statement soci::statement lSelectStatement (ioSociSession); - DBManager::prepareSelectOnDocIDStatement (ioSociSession, lSelectStatement, - iDocID, ioPlace); + DBManager::prepareSelectOnPlaceCodeStatement (ioSociSession, + lSelectStatement, + iPlaceCode, ioPlace); const bool shouldDoReset = true; bool hasStillData = iterateOnStatement (lSelectStatement, ioPlace, shouldDoReset); Modified: trunk/opentrep/opentrep/command/DBManager.hpp =================================================================== --- trunk/opentrep/opentrep/command/DBManager.hpp 2009-07-27 05:56:43 UTC (rev 170) +++ 
trunk/opentrep/opentrep/command/DBManager.hpp 2009-08-08 14:55:34 UTC (rev 171) @@ -22,28 +22,41 @@ from the database. */ class DBManager { public: + /** Update the Xapian document ID field of the database row + corresponding to the given Place object. */ + static void updatePlaceInDB (soci::session&, const Place&); + + /** Retrieve, from the (MySQL) database, the row corresponding to + the given place code (e.g., 'sfo' for San Francisco Intl + airport), and fill the given Place object with that retrieved + data. */ + static bool retrievePlace (soci::session&, const std::string& iPlaceCode, + Place&); + + + public: /** Prepare (parse and put in cache) the SQL statement. */ static void prepareSelectStatement (soci::session&, soci::statement&, Place&); - /** Prepare (parse and put in cache) the SQL statement. */ - static void prepareSelectOnDocIDStatement (soci::session&, soci::statement&, - const XapianDocID_T&, Place&); - /** Iterate on the SQL statement. <br>The SQL has to be already prepared. @parameter const bool Tells whether the Place object should be reset. */ static bool iterateOnStatement (soci::statement&, Place&, const bool iShouldDoReset); - /** Update the Xapian document ID field of the database row - corresponding to the given Place object. */ - static void updatePlaceInDB (soci::session&, const Place&); + + private: + /** Prepare (parse and put in cache) the SQL statement. */ + static void prepareSelectOnPlaceCodeStatement(soci::session&, + soci::statement&, + const std::string& iPlaceCode, + Place&); + + /** Prepare (parse and put in cache) the SQL statement. */ + static void prepareSelectOnDocIDStatement (soci::session&, soci::statement&, + const XapianDocID_T&, Place&); - /** Retrieve, from the (MySQL) database, the row corresponding to the - given Xapian Document ID, and fill the given Place object with - that retrieved data. */ - static bool retrievePlace (soci::session&, const XapianDocID_T&, Place&); private: /** Constructors. 
*/ Modified: trunk/opentrep/opentrep/command/IndexBuilder.cpp =================================================================== --- trunk/opentrep/opentrep/command/IndexBuilder.cpp 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/opentrep/command/IndexBuilder.cpp 2009-08-08 14:55:34 UTC (rev 171) @@ -170,16 +170,10 @@ lPlace, shouldDoReset); while (hasStillData == true) { - // Add the document corresponding to the Place object to the + // Add the document, corresponding to the Place object, to the // Xapian index IndexBuilder::addDocumentToIndex (lDatabase, lPlace); - // Update the row in (MySQL) database for the given Place object: - // The Xapian document ID is generated by Xapian when inserting - // the document into the index; that document ID has to be updated - // in the (MySQL) database. - DBManager::updatePlaceInDB (ioSociSession, lPlace); - // DEBUG OPENTREP_LOG_DEBUG ("[" << idx << "] " << lPlace); Modified: trunk/opentrep/opentrep/command/RequestInterpreter.cpp =================================================================== --- trunk/opentrep/opentrep/command/RequestInterpreter.cpp 2009-07-27 05:56:43 UTC (rev 170) +++ trunk/opentrep/opentrep/command/RequestInterpreter.cpp 2009-08-08 14:55:34 UTC (rev 171) @@ -14,6 +14,7 @@ #include <opentrep/bom/ResultHolder.hpp> #include <opentrep/bom/Result.hpp> #include <opentrep/bom/PlaceHolder.hpp> +#include <opentrep/bom/StringMatcher.hpp> #include <opentrep/factory/FacPlaceHolder.hpp> #include <opentrep/factory/FacPlace.hpp> #include <opentrep/factory/FacResultHolder.hpp> @@ -59,58 +60,171 @@ << "=========================================" << std::endl << std::endl); } + + /** Helper function. 
*/ + // ////////////////////////////////////////////////////////////////////// + bool retrieveAndFillPlace (const Xapian::Document& iDocument, + const Xapian::percent& iDocPercentage, + soci::session& ioSociSession, Place& ioPlace) { + bool hasRetrievedPlace = false; + + // Set the matching percentage + ioPlace.setPercentage (iDocPercentage); + + // Retrieve the parameters of the best matching document + const std::string& lPlaceCode = StringMatcher::getPlaceCode (iDocument); + + // DEBUG + const Xapian::docid& lDocID = iDocument.get_docid(); + const std::string& lDocData = iDocument.get_data(); + OPENTREP_LOG_DEBUG ("Place code: " << lPlaceCode << " - Document ID " + << lDocID << ", " << iDocPercentage + << "% [" << lDocData << "]"); + + // Fill the Place object with the row retrieved from the + // (MySQL) database and corresponding to the given place code + // (e.g., 'sfo' for the San Francisco Intl airport). + hasRetrievedPlace = DBManager::retrievePlace (ioSociSession, lPlaceCode, + ioPlace); + + if (hasRetrievedPlace == false) { + /** + The Xapian database/index should contain only places + available within the SQL database, as the first is built from + the latter. If that happens, it means that the user gave a + wrong Xapian database. + */ + OPENTREP_LOG_ERROR ("There is no document corresponding to " + << lPlaceCode << " (Xapian document ID" << lDocID + << " [" << lDocData << "]) in the SQL database. " + << "It usually means that the Xapian index/database " + << "is not synchronised with the SQL database. " + << "[Hint] Rebuild the Xapian index/database " + << "from the SQL database."); + throw XapianTravelDatabaseNotInSyncWithSQLDatabaseException(); + } + + return hasRetrievedPlace; + } + /** Helper function. 
*/ // ////////////////////////////////////////////////////////////////////// + bool retrieveAndFillPlace (const Document& iDocument, + soci::session& ioSociSession, Place& ioPlace) { + // Delegate + const Xapian::Document& lXapianDocument = iDocument.getXapianDocument(); + const Xapian::percent& lDocPercentage = iDocument.getXapianPercentage(); + return retrieveAndFillPlace (lXapianDocument, lDocPercentage, + ioSociSession, ioPlace); + } + + // ////////////////////////////////////////////////////////////////////// void createPlaces (const ResultHolder& iResultHolder, soci::session& ioSociSession, PlaceHolder& ioPlaceHolder) { - // Browse the list of result objects - const ResultList_T& lResultList = iResultHolder.getResultList(); - for (ResultList_T::const_iterator itResult = lResultList.begin(); - itResult != lResultList.end(); ++itResult) { - // Retrieve the result object - const Result* lResult_ptr = *itResult; - assert (lResult_ptr != NULL); + // Browse the list of result objects + const ResultList_T& lResultList = iResultHolder.getResultList(); + for (ResultList_T::const_iterator itResult = lResultList.begin(); + itResult != lResultList.end(); ++itResult) { + // Retrieve the result object + const Result* lResult_ptr = *itResult; + assert (lResult_ptr != NULL); - /** - TODO: Add a loop for retrieving both extra and alternate Documents - Use FacPlace::initLinkWithExtraPlace() and - FacPlace::initLinkWithAlternatePlace() - */ + // Retrieve the matching document + const Document& lDocument = lResult_ptr->getMatchingDocument(); + + // Instanciate an empty place object, which will be filled from the + // rows retrieved from the database. + Place& lPlace = FacPlace::instance().create(); + + // Retrieve, in the MySQL database, the place corresponding to + // the place code located as the first word of the Xapian + // document data. 
+ bool hasRetrievedPlace = retrieveAndFillPlace (lDocument, ioSociSession, + lPlace); + // If there was no place corresponding to the place code with + // the SQL database, an exception is thrown. Hence, here, by + // construction, the place has been retrieved from the SQL + // database. + assert (hasRetrievedPlace == true); + + // Insert the Place object within the PlaceHolder object + FacPlaceHolder::initLinkWithPlace (ioPlaceHolder, lPlace); + + // DEBUG + OPENTREP_LOG_DEBUG ("Retrieved Document: " << lPlace.toString()); + + // Retrieve the list of extra matching documents (documents + // matching with the same weight/percentage) + const Xapian::percent& lExtraDocPercentage = + lDocument.getXapianPercentage(); + const XapianDocumentList_T& lExtraDocumentList = + lDocument.getExtraDocumentList(); + for (XapianDocumentList_T::const_iterator itExtraDoc = + lExtraDocumentList.begin(); + itExtraDoc != lExtraDocumentList.end(); ++itExtraDoc) { + // Retrieve the extra matching Xapian document + const Xapian::Document& lExtraDocument = *itExtraDoc; - // Retrieve the parameters of the best matching document - const Xapian::Document& lDocument = lResult_ptr->getXapianDocument(); - const Xapian::percent& lDocPercentage = - lResult_ptr->getXapianPercentage(); - const Xapian::docid& lDocID = lDocument.get_docid(); - const std::string& lDocData = lDocument.get_data(); - + // Instanciate an empty place object, which will be filled from the + // ... [truncated message content] |
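Note on the result string handled by gui/psp/result_parser.py in revision 171: a minimal sketch of the same nested split, written for Python 3 rather than the Python 2 of the commit. The function names parse_result_string and main_codes are hypothetical and not part of OpenTREP. The roles of the delimiters (',' between main matches, '-' between alternates, ':' between extras, '/' before the percentage) are an assumption inferred from the variable names in the diff, not from any documented format. Unlike the committed script, the sketch does not rely on a variable that is only set when the string contains ';'.

```python
def parse_result_string(result_string):
    """Split a result string into nested lists.

    Returns a (locations, unrecognized) tuple. Each entry of `locations`
    is one comma-separated group, split on '-', then ':', then '/'.
    The meaning of each delimiter is an assumption (see the note above).
    """
    # Everything after the first ';' is treated as the unrecognized part.
    matches, _, unrecognized = result_string.partition(';')

    locations = []
    if matches:
        for main_group in matches.split(','):
            alternates = []
            for alternate in main_group.split('-'):
                # Each ':'-separated item may carry a '/percentage' suffix.
                extras = [item.split('/') for item in alternate.split(':')]
                alternates.append(extras)
            locations.append(alternates)
    return locations, unrecognized


def main_codes(locations):
    """Return the first code of every group, upper-cased (hypothetical helper)."""
    return [group[0][0][0].upper() for group in locations]


if __name__ == '__main__':
    sample = 'yvr:xea/98-xtw/87,sfo/100,led:dft:htl/96;niznayou'
    parsed, rest = parse_result_string(sample)
    print(parsed)
    print(main_codes(parsed), rest)
```

Using str.partition instead of split(';') means a query with no unrecognized part, or with more than one ';', still returns a well-defined pair.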
From: <den...@us...> - 2009-08-14 13:56:46
Revision: 174 http://opentrep.svn.sourceforge.net/opentrep/?rev=174&view=rev Author: denis_arnaud Date: 2009-08-14 13:56:37 +0000 (Fri, 14 Aug 2009) Log Message: ----------- [Test] Added a proof-of-concept code sample for STL iterator on BOM classes. Modified Paths: -------------- trunk/opentrep/configure.ac Added Paths: ----------- trunk/opentrep/test/iterator/ trunk/opentrep/test/iterator/Makefile.am trunk/opentrep/test/iterator/pocIterator.cpp Modified: trunk/opentrep/configure.ac =================================================================== --- trunk/opentrep/configure.ac 2009-08-10 16:23:33 UTC (rev 173) +++ trunk/opentrep/configure.ac 2009-08-14 13:56:37 UTC (rev 174) @@ -265,6 +265,7 @@ test/parsers/Makefile test/i18n/Makefile test/python/Makefile + test/iterator/Makefile test/Makefile win32/Makefile) AC_OUTPUT Property changes on: trunk/opentrep/test/iterator ___________________________________________________________________ Added: svn:ignore + .deps .libs Makefile.in Makefile pocIterator Added: trunk/opentrep/test/iterator/Makefile.am =================================================================== --- trunk/opentrep/test/iterator/Makefile.am (rev 0) +++ trunk/opentrep/test/iterator/Makefile.am 2009-08-14 13:56:37 UTC (rev 174) @@ -0,0 +1,12 @@ +## command sub-directory +include $(top_srcdir)/Makefile.common + +MAINTAINERCLEANFILES = Makefile.in + +check_PROGRAMS = pocIterator + +pocIterator_SOURCES = pocIterator.cpp +pocIterator_CXXFLAGS = +pocIterator_LDFLAGS = + +EXTRA_DIST = Added: trunk/opentrep/test/iterator/pocIterator.cpp =================================================================== --- trunk/opentrep/test/iterator/pocIterator.cpp (rev 0) +++ trunk/opentrep/test/iterator/pocIterator.cpp 2009-08-14 13:56:37 UTC (rev 174) @@ -0,0 +1,144 @@ +// ////////////////////////////////////////////////////////////////////////// +// Proof-of-concept for STL iterators on Business Object Model (BOM) objects +// 
////////////////////////////////////////////////////////////////////////// +// STL +#include <cassert> +#include <iostream> +#include <sstream> +#include <iterator> +#include <vector> + +/** Base class. */ +class BaseClass { +public: + /** Constructor. */ + BaseClass (const std::string& iName) : _name (iName) {} + /** Destructor. */ + ~BaseClass () {} + /** Get the serialised version of the Object. */ + virtual void toStream (std::ostream& ioOut) const = 0; +protected: + /** Name. */ + std::string _name; +}; + +/** Standard display function. */ +template <class charT, class traits> +inline +std::basic_ostream<charT, traits>& +operator<< (std::basic_ostream<charT, traits>& ioOut, + const BaseClass& iBaseClass) { + std::basic_ostringstream<charT,traits> ostr; + ostr.copyfmt (ioOut); + ostr.width (0); + // Fill string stream + iBaseClass.toStream (ostr); + // Print string stream + ioOut << ostr.str(); + return ioOut; +} + +/** Child class. */ +class Child : public BaseClass { +public: + /** Constructor. */ + Child (const std::string& iName) : BaseClass (iName) {} + /** Destructor. */ + ~Child () {} + /** Get the serialised version of the Object. */ + void toStream (std::ostream& ioOut) const { ioOut << "Child: " << _name; } +}; + +/** List of pointers on children objects. */ +typedef std::vector<Child*> ChildList_T; + +/** Parent class. */ +class Parent : public BaseClass { +public: + /** STL iterators on the list of (pointers on) children objects. */ + typedef ChildList_T::const_iterator const_iterator; + typedef ChildList_T::iterator iterator; + typedef ChildList_T::reverse_iterator reverse_iterator; + typedef ChildList_T::const_reverse_iterator const_reverse_iterator; + + /** Constructor. */ + Parent (const std::string& iName) : BaseClass (iName) {} + /** Destructor. */ + ~Parent () {} + + /** Get the serialised version of the Object. */ + void toStream (std::ostream& ioOut) const { ioOut << "Parent: " << _name; } + + /** Add a child in the dedicated list. 
*/ + void push_back (Child& ioChild) { _childList.push_back (&ioChild); } + + /** Return the iterator instantiated on the first element of the + list of children objects. */ + const_iterator begin() const { return _childList.begin(); } + + /** Return the iterator instantiated beyond the last element of the + list of children objects. */ + const_iterator end() const { return _childList.end(); } + + /** Return the iterator instantiated on the last element of the + list of children objects. */ + const_reverse_iterator rbegin() const { return _childList.rbegin(); } + + /** Return the iterator instantiated beyond the first element of the + list of children objects. */ + const_reverse_iterator rend() const { return _childList.rend(); } + +private: + /** List of pointers on children objects. */ + ChildList_T _childList; +}; + +// ///////////// M A I N ///////////// +int main (int argc, char* argv[]) { + + // Initialisation + Parent* lParent_ptr = new Parent ("parent"); + + Child* lChild1_ptr = new Child ("child1"); + lParent_ptr->push_back (*lChild1_ptr); + + Child* lChild2_ptr = new Child ("child2"); + lParent_ptr->push_back (*lChild2_ptr); + + // ///////////// Usage (as a proof of concept) ///////////// + // + // Ascending order + std::cout << *lParent_ptr << " in the ascending order:" << std::endl; + unsigned short idx = 1; + for (Parent::const_iterator itChild = lParent_ptr->begin(); + itChild != lParent_ptr->end(); ++itChild, ++idx) { + if (idx != 1) { + std::cout << "; "; + } + + const Child* lChild_ptr = *itChild; + assert (lChild_ptr != NULL); + + std::cout << *lChild_ptr; + } + std::cout << std::endl; + + // + // Descending order + std::cout << *lParent_ptr << " in the descending order:" << std::endl; + idx = 1; + for (Parent::const_reverse_iterator itChild = lParent_ptr->rbegin(); + itChild != lParent_ptr->rend(); ++itChild, ++idx) { + if (idx != 1) { + std::cout << "; "; + } + + const Child* lChild_ptr = *itChild; + assert (lChild_ptr != NULL); + + std::cout << 
*lChild_ptr; + } + std::cout << std::endl; + + return 0; +} This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
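Note on revision 174: the idea in pocIterator.cpp is that Parent re-exports the iterator typedefs of its internal std::vector<Child*> and forwards begin(), end(), rbegin() and rend() to it, so callers can walk the children with plain STL loops. A minimal sketch of that same idea follows. It is not the committed file: the helper joinChildren is a hypothetical name introduced here for illustration, the destructor of BaseClass is made virtual, <string> is included explicitly, and the objects are assumed to live in automatic storage instead of being allocated with new.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Re-statement of the classes from pocIterator.cpp (simplified sketch).
class BaseClass {
public:
  explicit BaseClass (const std::string& iName) : _name (iName) {}
  // Assumption: virtual, so deletion through a BaseClass pointer is safe.
  virtual ~BaseClass () {}
  virtual void toStream (std::ostream& ioOut) const = 0;
protected:
  std::string _name;
};

// Simplified display operator (narrow streams only, unlike the original).
inline std::ostream& operator<< (std::ostream& ioOut,
                                 const BaseClass& iBaseClass) {
  iBaseClass.toStream (ioOut);
  return ioOut;
}

class Child : public BaseClass {
public:
  explicit Child (const std::string& iName) : BaseClass (iName) {}
  void toStream (std::ostream& ioOut) const { ioOut << "Child: " << _name; }
};

typedef std::vector<Child*> ChildList_T;

class Parent : public BaseClass {
public:
  // Re-export the iterator types of the underlying container.
  typedef ChildList_T::const_iterator const_iterator;
  typedef ChildList_T::const_reverse_iterator const_reverse_iterator;

  explicit Parent (const std::string& iName) : BaseClass (iName) {}
  void toStream (std::ostream& ioOut) const { ioOut << "Parent: " << _name; }

  // The parent stores addresses only; it does not own the children.
  void push_back (Child& ioChild) { _childList.push_back (&ioChild); }

  // Forward the range accessors to the underlying container.
  const_iterator begin() const { return _childList.begin(); }
  const_iterator end() const { return _childList.end(); }
  const_reverse_iterator rbegin() const { return _childList.rbegin(); }
  const_reverse_iterator rend() const { return _childList.rend(); }

private:
  ChildList_T _childList;
};

// Hypothetical helper: serialise a range of Child pointers, "; "-separated.
// Works for forward and reverse iterators alike.
template <typename Iterator>
std::string joinChildren (Iterator itBegin, Iterator itEnd) {
  std::ostringstream oStr;
  for (Iterator itChild = itBegin; itChild != itEnd; ++itChild) {
    if (itChild != itBegin) {
      oStr << "; ";
    }
    const Child* lChild_ptr = *itChild;
    assert (lChild_ptr != NULL);
    oStr << *lChild_ptr;
  }
  return oStr.str();
}
```

Called as joinChildren (lParent.begin(), lParent.end()) on a Parent holding child1 and child2, the helper lists the children in insertion order; with rbegin() and rend() it lists them in reverse. Comparing the iterator against the start of the range replaces the separate idx counter used in the committed loops.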
From: <den...@us...> - 2009-08-14 17:51:16
Revision: 176 http://opentrep.svn.sourceforge.net/opentrep/?rev=176&view=rev Author: denis_arnaud Date: 2009-08-14 17:51:06 +0000 (Fri, 14 Aug 2009) Log Message: ----------- [i18n] Added a few examples for the development around the ICU library. Modified Paths: -------------- trunk/opentrep/TODO trunk/opentrep/configure.ac trunk/opentrep/test/IndexBuildingTestSuite.cpp trunk/opentrep/test/i18n/Makefile.am Added Paths: ----------- trunk/opentrep/test/i18n/icu/ trunk/opentrep/test/i18n/icu/Makefile.am trunk/opentrep/test/i18n/icu/icucharsetdetector.cpp trunk/opentrep/test/i18n/icu/icuconv.cpp trunk/opentrep/test/i18n/icu/icuconvref.cpp trunk/opentrep/test/i18n/icu/icufmt.cpp trunk/opentrep/test/i18n/icu/icuustring.cpp trunk/opentrep/test/i18n/icu/icuustringref.cpp Removed Paths: ------------- trunk/opentrep/test/i18n/icufmt.cpp Property Changed: ---------------- trunk/opentrep/test/i18n/ Modified: trunk/opentrep/TODO =================================================================== --- trunk/opentrep/TODO 2009-08-14 15:18:55 UTC (rev 175) +++ trunk/opentrep/TODO 2009-08-14 17:51:06 UTC (rev 176) @@ -1,6 +1,22 @@ Todo list for the OpenTrep project ---------------------------------- +* [01/08/2009] Write a (Python-based) PSP page, in order to test the + different locales of the browsers. +The Python (PSP) page has been created, but there is still some work +to do in order to adapt it to the new API (with extra and alternate +locations). + +* [14/08/2009] With the ICU library, check the encoding of the input, + and convert in Unicode if needed (see the test/i18n/icuustring and + test/i18n/icuconv} for example). First detect and convert hard-coded + strings, then do it on the output of PSP pages. + +* [14/08/2009] Write a transliterator, taking UTF-8 Cyrillic input + (e.g., Russian and/or Ukrainian) and romanising/transliterating + it. Note that, with the ICU library, UTex may be used advantageously + (to take UTF-8 input). 
+ * [01/08/2009] Finish the work on bringing extra and additional Location objects into the API. OK @@ -12,9 +28,3 @@ corresponding result details within the database. The easiest way is to extract the first three letters of the Xapian document data. OK - -* [01/08/2009] Write a (Python-based) PSP page, in order to test the - different locales of the browsers. -The Python (PSP) page has been created, but there is still some work -to do in order to adapt it to the new API (with extra and alternate -locations). \ No newline at end of file Modified: trunk/opentrep/configure.ac =================================================================== --- trunk/opentrep/configure.ac 2009-08-14 15:18:55 UTC (rev 175) +++ trunk/opentrep/configure.ac 2009-08-14 17:51:06 UTC (rev 176) @@ -264,6 +264,7 @@ test/com/Makefile test/parsers/Makefile test/i18n/Makefile + test/i18n/icu/Makefile test/python/Makefile test/iterator/Makefile test/Makefile Modified: trunk/opentrep/test/IndexBuildingTestSuite.cpp =================================================================== --- trunk/opentrep/test/IndexBuildingTestSuite.cpp 2009-08-14 15:18:55 UTC (rev 175) +++ trunk/opentrep/test/IndexBuildingTestSuite.cpp 2009-08-14 17:51:06 UTC (rev 176) @@ -5,6 +5,7 @@ #include <test/com/CppUnitCore.hpp> // OpenTrep #include <opentrep/OPENTREP_Service.hpp> +#include <opentrep/Location.hpp> // OpenTrep Test Suite #include <test/IndexBuildingTestSuite.hpp> @@ -39,7 +40,7 @@ // Query the Xapian database (index) OPENTREP::WordList_T lNonMatchedWordList; OPENTREP::LocationList_T lLocationList; - const OPENTREP::NbOfMatches_T nbOfMatches = + // const OPENTREP::NbOfMatches_T nbOfMatches = opentrepService.interpretTravelRequest (lTravelQuery, lLocationList, lNonMatchedWordList); Property changes on: trunk/opentrep/test/i18n ___________________________________________________________________ Modified: svn:ignore - .libs .deps Makefile.in Makefile boost_string loc2 stdlocru icufmt simple_io + .libs .deps 
Makefile.in Makefile boost_string loc2 stdlocru simple_io Modified: trunk/opentrep/test/i18n/Makefile.am =================================================================== --- trunk/opentrep/test/i18n/Makefile.am 2009-08-14 15:18:55 UTC (rev 175) +++ trunk/opentrep/test/i18n/Makefile.am 2009-08-14 17:51:06 UTC (rev 176) @@ -3,7 +3,7 @@ MAINTAINERCLEANFILES = Makefile.in -check_PROGRAMS = boost_string loc2 stdlocru icufmt simple_io +check_PROGRAMS = boost_string loc2 stdlocru simple_io boost_string_SOURCES = boost_string.cpp boost_string_CXXFLAGS = $(BOOST_CFLAGS) @@ -18,10 +18,6 @@ stdlocru_CXXFLAGS = $(BOOST_CFLAGS) stdlocru_LDFLAGS = $(BOOST_LIBS) -icufmt_SOURCES = icufmt.cpp -icufmt_CXXFLAGS = $(ICU_CFLAGS) -icufmt_LDFLAGS = $(ICU_LIBS) $(ICU_IO_LIB) - simple_io_SOURCES = simple_io.cpp simple_io_CXXFLAGS = $(BOOST_CFLAGS) simple_io_LDFLAGS = $(BOOST_LIBS) Property changes on: trunk/opentrep/test/i18n/icu ___________________________________________________________________ Added: svn:ignore + .libs .deps Makefile.in Makefile icufmt icucharsetdetector icuustring icuconv Added: trunk/opentrep/test/i18n/icu/Makefile.am =================================================================== --- trunk/opentrep/test/i18n/icu/Makefile.am (rev 0) +++ trunk/opentrep/test/i18n/icu/Makefile.am 2009-08-14 17:51:06 UTC (rev 176) @@ -0,0 +1,24 @@ +## command sub-directory +include $(top_srcdir)/Makefile.common + +MAINTAINERCLEANFILES = Makefile.in + +check_PROGRAMS = icufmt icuustring icucharsetdetector icuconv + +icufmt_SOURCES = icufmt.cpp +icufmt_CXXFLAGS = $(ICU_CFLAGS) +icufmt_LDFLAGS = $(ICU_LIBS) $(ICU_IO_LIB) + +icuustring_SOURCES = icuustring.cpp +icuustring_CXXFLAGS = $(ICU_CFLAGS) +icuustring_LDFLAGS = $(ICU_LIBS) $(ICU_IO_LIB) + +icucharsetdetector_SOURCES = icucharsetdetector.cpp +icucharsetdetector_CXXFLAGS = $(ICU_CFLAGS) +icucharsetdetector_LDFLAGS = $(ICU_LIBS) $(ICU_IO_LIB) + +icuconv_SOURCES = icuconv.cpp +icuconv_CXXFLAGS = $(ICU_CFLAGS) +icuconv_LDFLAGS = 
$(ICU_LIBS) $(ICU_IO_LIB) + +EXTRA_DIST = Added: trunk/opentrep/test/i18n/icu/icucharsetdetector.cpp =================================================================== --- trunk/opentrep/test/i18n/icu/icucharsetdetector.cpp (rev 0) +++ trunk/opentrep/test/i18n/icu/icucharsetdetector.cpp 2009-08-14 17:51:06 UTC (rev 176) @@ -0,0 +1,21 @@ +// STL +#include <iostream> +#include <string> +// ICU +#include <unicode/utypes.h> +#include <unicode/ucsdet.h> + +int main (int argc, char* argv[]) { + + UErrorCode status = U_ZERO_ERROR; + UCharsetDetector* csd = ucsdet_open (&status); + static char buffer[11] = "0123456789"; + int32_t inputLength = 10; + ucsdet_setText (csd, buffer, inputLength, &status); + const UCharsetMatch* ucm = ucsdet_detect (csd, &status); + const std::string name = ucsdet_getName (ucm, &status); + + std::cout << "Character set encoding: " << name << std::endl; + + return 0; +} Added: trunk/opentrep/test/i18n/icu/icuconv.cpp =================================================================== --- trunk/opentrep/test/i18n/icu/icuconv.cpp (rev 0) +++ trunk/opentrep/test/i18n/icu/icuconv.cpp 2009-08-14 17:51:06 UTC (rev 176) @@ -0,0 +1,506 @@ + +// STL +#include <cstdio> +#include <ctype.h> /* for isspace, etc. */ +#include <cassert> +#include <cstring> +#include <cstdlib> /* malloc */ + +#define DEBUG_TMI 0 /* define to 1 to enable Too Much Information */ + +#include "unicode/utypes.h" /* Basic ICU data types */ +#include "unicode/ucnv.h" /* C Converter API */ +#include "unicode/ustring.h" /* some more string fcns*/ +#include "unicode/uchar.h" /* char names */ +#include "unicode/uloc.h" +#include "unicode/unistr.h" + +/* Some utility functions */ + +static const UChar kNone[] = { 0x0000 }; + +#define U_ASSERT(x) { if(U_FAILURE(x)) {fflush(stdout);fflush(stderr); fprintf(stderr, #x " == %s\n", u_errorName(x)); assert(U_SUCCESS(x)); }} + +/* Print a UChar if possible, in seven characters. 
*/ +void prettyPrintUChar(UChar c) +{ + if( (c <= 0x007F) && + (isgraph(c)) ) { + printf(" '%c' ", (char)(0x00FF&c)); + } else if ( c > 0x007F ) { + char buf[1000]; + UErrorCode status = U_ZERO_ERROR; + int32_t o; + + o = u_charName(c, U_UNICODE_CHAR_NAME, buf, 1000, &status); + if(U_SUCCESS(status) && (o>0) ) { + buf[80] = 0; + printf("%7s", buf); + } else { + o = u_charName(c, U_UNICODE_10_CHAR_NAME, buf, 1000, &status); + if(U_SUCCESS(status) && (o>0)) { + buf[5] = 0; + printf("~%6s", buf); + } + else { + printf(" ??????"); + } + } + } else { + switch((char)(c & 0x007F)) { + case ' ': + printf(" ' ' "); + break; + case '\t': + printf(" \\t "); + break; + case '\n': + printf(" \\n "); + break; + default: + printf(" _ "); + break; + } + } +} + + +void printUChars(const char *name = "?", + const UChar *uch = kNone, + int32_t len = -1 ) +{ + int32_t i; + + if( (len == -1) && (uch) ) { + len = u_strlen(uch); + } + + printf("%5s: ", name); + for( i = 0; i <len; i++) { + printf("%-6d ", i); + } + printf("\n"); + + printf("%5s: ", "uni"); + for( i = 0; i <len; i++) { + printf("\\u%04X ", (int)uch[i]); + } + printf("\n"); + + printf("%5s:", "ch"); + for( i = 0; i <len; i++) { + prettyPrintUChar(uch[i]); + } + printf("\n"); +} + +void printBytes(const char *name = "?", + const char *uch = "", + int32_t len = -1 ) +{ + int32_t i; + + if( (len == -1) && (uch) ) { + len = strlen(uch); + } + + printf("%5s: ", name); + for( i = 0; i <len; i++) { + printf("%-4d ", i); + } + printf("\n"); + + printf("%5s: ", "uni"); + for( i = 0; i <len; i++) { + printf("\\x%02X ", 0x00FF & (int)uch[i]); + } + printf("\n"); + + printf("%5s:", "ch"); + for( i = 0; i <len; i++) { + if(isgraph(0x00FF & (int)uch[i])) { + printf(" '%c' ", (char)uch[i]); + } else { + printf(" "); + } + } + printf("\n"); +} + +void printUChar(UChar32 ch32) +{ + if(ch32 > 0xFFFF) { + printf("ch: U+%06X\n", ch32); + } + else { + UChar ch = (UChar)ch32; + printUChars("C", &ch, 1); + } +} + 
+/******************************************************************* + Very simple C sample to convert the word 'Moscow' in Russian in Unicode, + followed by an exclamation mark (!) into the KOI8-R Russian code page. + + This example first creates a UChar String out of the Unicode chars. + + targetSize must be set to the amount of space available in the target + buffer. After fromUChars is called, + len will contain the number of bytes in target[] which were + used in the resulting codepage. In this case, there is a 1:1 mapping + between the input and output characters. The exclamation mark has the + same value in both KOI8-R and Unicode. + + src: 0 1 2 3 4 5 6 + uni: \u041C \u043E \u0441 \u043A \u0432 \u0430 \u0021 + ch: CYRILL CYRILL CYRILL CYRILL CYRILL CYRILL '!' + + targ: 0 1 2 3 4 5 6 + uni: \xED \xCF \xD3 \xCB \xD7 \xC1 \x21 + ch: '!' + + +Converting FROM unicode + to koi8-r. + You must call ucnv_close to clean up the memory used by the + converter. + + 'len' returns the number of OUTPUT bytes resulting from the + conversion. 
+ */ + +UErrorCode convsample_02() +{ + printf("\n\n==============================================\n" + "Sample 02: C: simple Unicode -> koi8-r conversion\n"); + + + // **************************** START SAMPLE ******************* + // "cat<cat>OK" + UChar source[] = { 0x041C, 0x043E, 0x0441, 0x043A, 0x0432, + 0x0430, 0x0021, 0x0000 }; + char target[100]; + UErrorCode status = U_ZERO_ERROR; + UConverter *conv; + int32_t len; + + // set up the converter + conv = ucnv_open("koi8-r", &status); + assert(U_SUCCESS(status)); + + // convert to koi8-r + len = ucnv_fromUChars(conv, target, 100, source, -1, &status); + assert(U_SUCCESS(status)); + + // close the converter + ucnv_close(conv); + + // ***************************** END SAMPLE ******************** + + // Print it out + printUChars("src", source); + printf("\n"); + printBytes("targ", target, len); + + return U_ZERO_ERROR; +} + + +UErrorCode convsample_03() +{ + printf("\n\n==============================================\n" + "Sample 03: C: print out all converters\n"); + + int32_t count; + int32_t i; + + // **************************** START SAMPLE ******************* + count = ucnv_countAvailable(); + printf("Available converters: %d\n", count); + + for(i=0;i<count;i++) + { + printf("%s ", ucnv_getAvailableName(i)); + } + + // ***************************** END SAMPLE ******************** + + printf("\n"); + + return U_ZERO_ERROR; +} + + + +#define BUFFERSIZE 17 /* make it interesting :) */ + +/* + Converting from a codepage to Unicode in bulk.. + What is the best way to determine the buffer size? + + The 'buffersize' is in bytes of input. + For a given converter, divinding this by the minimum char size + give you the maximum number of Unicode characters that could be + expected for a given number of input bytes. + see: ucnv_getMinCharSize() + + For example, a single byte codepage like 'Latin-3' has a + minimum char size of 1. (It takes at least 1 byte to represent + each Unicode char.) 
So the unicode buffer has the same number of + UChars as the input buffer has bytes. + + In a strictly double byte codepage such as cp1362 (Windows + Korean), the minimum char size is 2. So, only half as many Unicode + chars as bytes are needed. + + This work to calculate the buffer size is an optimization. Any + size of input and output buffer can be used, as long as the + program handles the following cases: If the input buffer is empty, + the source pointer will be equal to sourceLimit. If the output + buffer has overflowed, U_BUFFER_OVERFLOW_ERROR will be returned. + */ + +UErrorCode convsample_05() +{ + printf("\n\n==============================================\n" + "Sample 05: C: count the number of letters in a UTF-8 document\n"); + + FILE *f; + int32_t count; + char inBuf[BUFFERSIZE]; + const char *source; + const char *sourceLimit; + UChar *uBuf; + UChar *target; + UChar *targetLimit; + UChar *p; + int32_t uBufSize = 0; + UConverter *conv; + UErrorCode status = U_ZERO_ERROR; + uint32_t letters=0, total=0; + + f = fopen("ref/ref_text_ru.txt", "r"); + if(!f) + { + fprintf(stderr, "Couldn't open file 'ref/ref_text_ru.txt' (UTF-8 data file).\n"); + return U_FILE_ACCESS_ERROR; + } + + // **************************** START SAMPLE ******************* + conv = ucnv_open("utf-8", &status); + assert(U_SUCCESS(status)); + + uBufSize = (BUFFERSIZE/ucnv_getMinCharSize(conv)); + printf("input bytes %d / min chars %d = %d UChars\n", + BUFFERSIZE, ucnv_getMinCharSize(conv), uBufSize); + uBuf = (UChar*)malloc(uBufSize * sizeof(UChar)); + assert(uBuf!=NULL); + + // grab another buffer's worth + while((!feof(f)) && + ((count=fread(inBuf, 1, BUFFERSIZE , f)) > 0) ) + { + // Convert bytes to unicode + source = inBuf; + sourceLimit = inBuf + count; + + do + { + target = uBuf; + targetLimit = uBuf + uBufSize; + + ucnv_toUnicode(conv, &target, targetLimit, + &source, sourceLimit, NULL, + feof(f)?TRUE:FALSE, /* pass 'flush' when eof */ + /* is true (when no more data will come) */ 
+ &status); + + if(status == U_BUFFER_OVERFLOW_ERROR) + { + // simply ran out of space - we'll reset the target ptr the next + // time through the loop. + status = U_ZERO_ERROR; + } + else + { + // Check other errors here. + assert(U_SUCCESS(status)); + // Break out of the loop (by force) + } + + // Process the Unicode + // Todo: handle UTF-16/surrogates + + for(p = uBuf; p<target; p++) + { + if(u_isalpha(*p)) + letters++; + total++; + } + } while (source < sourceLimit); // while simply out of space + } + + printf("%d letters out of %d total UChars.\n", letters, total); + + // ***************************** END SAMPLE ******************** + ucnv_close(conv); + + printf("\n"); + + return U_ZERO_ERROR; +} +#undef BUFFERSIZE + +#define BUFFERSIZE 1024 +typedef struct +{ + UChar32 codepoint; + uint32_t frequency; +} CharFreqInfo; + +UErrorCode convsample_06() +{ + printf("\n\n==============================================\n" + "Sample 06: C: frequency distribution of letters in a UTF-8 document\n"); + + FILE *f; + int32_t count; + char inBuf[BUFFERSIZE]; + const char *source; + const char *sourceLimit; + UChar *uBuf; + int32_t uBufSize = 0; + UConverter *conv; + UErrorCode status = U_ZERO_ERROR; + uint32_t letters=0, total=0; + + CharFreqInfo *info; + UChar32 charCount = 0x10000; /* increase this if you want to handle non bmp.. todo: automatically bump it.. 
*/ + UChar32 p; + + uint32_t ie = 0; + uint32_t gh = 0; + UChar32 l = 0; + + f = fopen("ref/ref_text_ru.txt", "r"); + if(!f) + { + fprintf(stderr, "Couldn't open file 'ref/ref_text_ru.txt' (UTF-8 data file).\n"); + return U_FILE_ACCESS_ERROR; + } + + info = (CharFreqInfo*)malloc(sizeof(CharFreqInfo) * charCount); + if(!info) + { + fprintf(stderr, " Couldn't allocate %d bytes for freq counter\n", sizeof(CharFreqInfo)*charCount); + } + + /* reset frequencies */ + for(p=0;p<charCount;p++) + { + info[p].codepoint = p; + info[p].frequency = 0; + } + + // **************************** START SAMPLE ******************* + conv = ucnv_open("utf-8", &status); + assert(U_SUCCESS(status)); + + uBufSize = (BUFFERSIZE/ucnv_getMinCharSize(conv)); + printf("input bytes %d / min chars %d = %d UChars\n", + BUFFERSIZE, ucnv_getMinCharSize(conv), uBufSize); + uBuf = (UChar*)malloc(uBufSize * sizeof(UChar)); + assert(uBuf!=NULL); + + // grab another buffer's worth + while((!feof(f)) && + ((count=fread(inBuf, 1, BUFFERSIZE , f)) > 0) ) + { + // Convert bytes to unicode + source = inBuf; + sourceLimit = inBuf + count; + + while(source < sourceLimit) + { + p = ucnv_getNextUChar(conv, &source, sourceLimit, &status); + if(U_FAILURE(status)) + { + fprintf(stderr, "%s @ %d\n", u_errorName(status), total); + status = U_ZERO_ERROR; + continue; + } + U_ASSERT(status); + total++; + + if(u_isalpha(p)) + letters++; + + if((u_tolower(l) == 'i') && (u_tolower(p) == 'e')) + ie++; + + if((u_tolower(l) == 'g') && (u_tolower(p) == 0x0127)) + gh++; + + if(p>charCount) + { + fprintf(stderr, "U+%06X: oh.., we only handle BMP characters so far.. redesign!\n", p); + return U_UNSUPPORTED_ERROR; + } + info[p].frequency++; + l = p; + } + } + + fclose(f); + ucnv_close(conv); + + printf("%d letters out of %d total UChars.\n", letters, total); + printf("%d ie digraphs, %d gh digraphs.\n", ie, gh); + + // now, we could sort it.. 
+ + // qsort(info, charCount, sizeof(info[0]), charfreq_compare); + + for(p=0;p<charCount;p++) + { + if(info[p].frequency) + { + printf("% 5d U+%06X ", info[p].frequency, p); + if(p <= 0xFFFF) + { + prettyPrintUChar((UChar)p); + } + printf("\n"); + } + } + free(info); + // ***************************** END SAMPLE ******************** + + printf("\n"); + + return U_ZERO_ERROR; +} +#undef BUFFERSIZE + +#define BUFFERSIZE 219 + + +/* main */ + +int main() { + + printf("Default Converter=%s\n", ucnv_getDefaultName() ); + + convsample_02(); // C , u->koi8r, conv + convsample_03(); // C, iterate + + convsample_05(); // C, utf8->u, getNextUChar + convsample_06(); // C freq counter thingy + + printf("End of converter samples.\n"); + + fflush(stdout); + fflush(stderr); + + return 0; +} Added: trunk/opentrep/test/i18n/icu/icuconvref.cpp =================================================================== --- trunk/opentrep/test/i18n/icu/icuconvref.cpp (rev 0) +++ trunk/opentrep/test/i18n/icu/icuconvref.cpp 2009-08-14 17:51:06 UTC (rev 176) @@ -0,0 +1,1102 @@ +/************************************************************************** +* +* Copyright (C) 2000-2003, International Business Machines +* Corporation and others. All Rights Reserved. +* +*************************************************************************** +* file name: convsamp.c +* encoding: ASCII (7-bit) +* +* created on: 2000may30 +* created by: Steven R. Loomis +* +* Sample code for the ICU conversion routines. +* +* Note: Nothing special is needed to build this sample. Link with +* the icu UC and icu I18N libraries. +* +* I use 'assert' for error checking, you probably will want +* something more flexible. '***BEGIN SAMPLE***' and +* '***END SAMPLE***' mark pieces suitable for stand alone +* code snippets. +* +* +* Each test can define it's own BUFFERSIZE +* +*/ + +#define DEBUG_TMI 0 /* define to 1 to enable Too Much Information */ + +#include <stdio.h> +#include <ctype.h> /* for isspace, etc. 
*/ +#include <assert.h> +#include <string.h> +#include <stdlib.h> /* malloc */ + +#include "unicode/utypes.h" /* Basic ICU data types */ +#include "unicode/ucnv.h" /* C Converter API */ +#include "unicode/ustring.h" /* some more string fcns*/ +#include "unicode/uchar.h" /* char names */ +#include "unicode/uloc.h" +#include "unicode/unistr.h" + +#include "flagcb.h" + +/* Some utility functions */ + +static const UChar kNone[] = { 0x0000 }; + +#define U_ASSERT(x) { if(U_FAILURE(x)) {fflush(stdout);fflush(stderr); fprintf(stderr, #x " == %s\n", u_errorName(x)); assert(U_SUCCESS(x)); }} + +/* Print a UChar if possible, in seven characters. */ +void prettyPrintUChar(UChar c) +{ + if( (c <= 0x007F) && + (isgraph(c)) ) { + printf(" '%c' ", (char)(0x00FF&c)); + } else if ( c > 0x007F ) { + char buf[1000]; + UErrorCode status = U_ZERO_ERROR; + int32_t o; + + o = u_charName(c, U_UNICODE_CHAR_NAME, buf, 1000, &status); + if(U_SUCCESS(status) && (o>0) ) { + buf[6] = 0; + printf("%7s", buf); + } else { + o = u_charName(c, U_UNICODE_10_CHAR_NAME, buf, 1000, &status); + if(U_SUCCESS(status) && (o>0)) { + buf[5] = 0; + printf("~%6s", buf); + } + else { + printf(" ??????"); + } + } + } else { + switch((char)(c & 0x007F)) { + case ' ': + printf(" ' ' "); + break; + case '\t': + printf(" \\t "); + break; + case '\n': + printf(" \\n "); + break; + default: + printf(" _ "); + break; + } + } +} + + +void printUChars(const char *name = "?", + const UChar *uch = kNone, + int32_t len = -1 ) +{ + int32_t i; + + if( (len == -1) && (uch) ) { + len = u_strlen(uch); + } + + printf("%5s: ", name); + for( i = 0; i <len; i++) { + printf("%-6d ", i); + } + printf("\n"); + + printf("%5s: ", "uni"); + for( i = 0; i <len; i++) { + printf("\\u%04X ", (int)uch[i]); + } + printf("\n"); + + printf("%5s:", "ch"); + for( i = 0; i <len; i++) { + prettyPrintUChar(uch[i]); + } + printf("\n"); +} + +void printBytes(const char *name = "?", + const char *uch = "", + int32_t len = -1 ) +{ + int32_t i; + + if( (len 
== -1) && (uch) ) { + len = strlen(uch); + } + + printf("%5s: ", name); + for( i = 0; i <len; i++) { + printf("%-4d ", i); + } + printf("\n"); + + printf("%5s: ", "uni"); + for( i = 0; i <len; i++) { + printf("\\x%02X ", 0x00FF & (int)uch[i]); + } + printf("\n"); + + printf("%5s:", "ch"); + for( i = 0; i <len; i++) { + if(isgraph(0x00FF & (int)uch[i])) { + printf(" '%c' ", (char)uch[i]); + } else { + printf(" "); + } + } + printf("\n"); +} + +void printUChar(UChar32 ch32) +{ + if(ch32 > 0xFFFF) { + printf("ch: U+%06X\n", ch32); + } + else { + UChar ch = (UChar)ch32; + printUChars("C", &ch, 1); + } +} + +/******************************************************************* + Very simple C sample to convert the word 'Moscow' in Russian in Unicode, + followed by an exclamation mark (!) into the KOI8-R Russian code page. + + This example first creates a UChar String out of the Unicode chars. + + targetSize must be set to the amount of space available in the target + buffer. After fromUChars is called, + len will contain the number of bytes in target[] which were + used in the resulting codepage. In this case, there is a 1:1 mapping + between the input and output characters. The exclamation mark has the + same value in both KOI8-R and Unicode. + + src: 0 1 2 3 4 5 6 + uni: \u041C \u043E \u0441 \u043A \u0432 \u0430 \u0021 + ch: CYRILL CYRILL CYRILL CYRILL CYRILL CYRILL '!' + + targ: 0 1 2 3 4 5 6 + uni: \xED \xCF \xD3 \xCB \xD7 \xC1 \x21 + ch: '!' + + +Converting FROM unicode + to koi8-r. + You must call ucnv_close to clean up the memory used by the + converter. + + 'len' returns the number of OUTPUT bytes resulting from the + conversion. 
+ */ + +UErrorCode convsample_02() +{ + printf("\n\n==============================================\n" + "Sample 02: C: simple Unicode -> koi8-r conversion\n"); + + + // **************************** START SAMPLE ******************* + // "cat<cat>OK" + UChar source[] = { 0x041C, 0x043E, 0x0441, 0x043A, 0x0432, + 0x0430, 0x0021, 0x0000 }; + char target[100]; + UErrorCode status = U_ZERO_ERROR; + UConverter *conv; + int32_t len; + + // set up the converter + conv = ucnv_open("koi8-r", &status); + assert(U_SUCCESS(status)); + + // convert to koi8-r + len = ucnv_fromUChars(conv, target, 100, source, -1, &status); + assert(U_SUCCESS(status)); + + // close the converter + ucnv_close(conv); + + // ***************************** END SAMPLE ******************** + + // Print it out + printUChars("src", source); + printf("\n"); + printBytes("targ", target, len); + + return U_ZERO_ERROR; +} + + +UErrorCode convsample_03() +{ + printf("\n\n==============================================\n" + "Sample 03: C: print out all converters\n"); + + int32_t count; + int32_t i; + + // **************************** START SAMPLE ******************* + count = ucnv_countAvailable(); + printf("Available converters: %d\n", count); + + for(i=0;i<count;i++) + { + printf("%s ", ucnv_getAvailableName(i)); + } + + // ***************************** END SAMPLE ******************** + + printf("\n"); + + return U_ZERO_ERROR; +} + + + +#define BUFFERSIZE 17 /* make it interesting :) */ + +/* + Converting from a codepage to Unicode in bulk.. + What is the best way to determine the buffer size? + + The 'buffersize' is in bytes of input. + For a given converter, divinding this by the minimum char size + give you the maximum number of Unicode characters that could be + expected for a given number of input bytes. + see: ucnv_getMinCharSize() + + For example, a single byte codepage like 'Latin-3' has a + minimum char size of 1. (It takes at least 1 byte to represent + each Unicode char.) 
So the unicode buffer has the same number of + UChars as the input buffer has bytes. + + In a strictly double byte codepage such as cp1362 (Windows + Korean), the minimum char size is 2. So, only half as many Unicode + chars as bytes are needed. + + This work to calculate the buffer size is an optimization. Any + size of input and output buffer can be used, as long as the + program handles the following cases: If the input buffer is empty, + the source pointer will be equal to sourceLimit. If the output + buffer has overflowed, U_BUFFER_OVERFLOW_ERROR will be returned. + */ + +UErrorCode convsample_05() +{ + printf("\n\n==============================================\n" + "Sample 05: C: count the number of letters in a UTF-8 document\n"); + + FILE *f; + int32_t count; + char inBuf[BUFFERSIZE]; + const char *source; + const char *sourceLimit; + UChar *uBuf; + UChar *target; + UChar *targetLimit; + UChar *p; + int32_t uBufSize = 0; + UConverter *conv; + UErrorCode status = U_ZERO_ERROR; + uint32_t letters=0, total=0; + + f = fopen("data01.txt", "r"); + if(!f) + { + fprintf(stderr, "Couldn't open file 'data01.txt' (UTF-8 data file).\n"); + return U_FILE_ACCESS_ERROR; + } + + // **************************** START SAMPLE ******************* + conv = ucnv_open("utf-8", &status); + assert(U_SUCCESS(status)); + + uBufSize = (BUFFERSIZE/ucnv_getMinCharSize(conv)); + printf("input bytes %d / min chars %d = %d UChars\n", + BUFFERSIZE, ucnv_getMinCharSize(conv), uBufSize); + uBuf = (UChar*)malloc(uBufSize * sizeof(UChar)); + assert(uBuf!=NULL); + + // grab another buffer's worth + while((!feof(f)) && + ((count=fread(inBuf, 1, BUFFERSIZE , f)) > 0) ) + { + // Convert bytes to unicode + source = inBuf; + sourceLimit = inBuf + count; + + do + { + target = uBuf; + targetLimit = uBuf + uBufSize; + + ucnv_toUnicode(conv, &target, targetLimit, + &source, sourceLimit, NULL, + feof(f)?TRUE:FALSE, /* pass 'flush' when eof */ + /* is true (when no more data will come) */ + &status); + + 
if(status == U_BUFFER_OVERFLOW_ERROR) + { + // simply ran out of space - we'll reset the target ptr the next + // time through the loop. + status = U_ZERO_ERROR; + } + else + { + // Check other errors here. + assert(U_SUCCESS(status)); + // Break out of the loop (by force) + } + + // Process the Unicode + // Todo: handle UTF-16/surrogates + + for(p = uBuf; p<target; p++) + { + if(u_isalpha(*p)) + letters++; + total++; + } + } while (source < sourceLimit); // while simply out of space + } + + printf("%d letters out of %d total UChars.\n", letters, total); + + // ***************************** END SAMPLE ******************** + ucnv_close(conv); + + printf("\n"); + + return U_ZERO_ERROR; +} +#undef BUFFERSIZE + +#define BUFFERSIZE 1024 +typedef struct +{ + UChar32 codepoint; + uint32_t frequency; +} CharFreqInfo; + +UErrorCode convsample_06() +{ + printf("\n\n==============================================\n" + "Sample 06: C: frequency distribution of letters in a UTF-8 document\n"); + + FILE *f; + int32_t count; + char inBuf[BUFFERSIZE]; + const char *source; + const char *sourceLimit; + UChar *uBuf; + int32_t uBufSize = 0; + UConverter *conv; + UErrorCode status = U_ZERO_ERROR; + uint32_t letters=0, total=0; + + CharFreqInfo *info; + UChar32 charCount = 0x10000; /* increase this if you want to handle non bmp.. todo: automatically bump it.. 
*/ + UChar32 p; + + uint32_t ie = 0; + uint32_t gh = 0; + UChar32 l = 0; + + f = fopen("data06.txt", "r"); + if(!f) + { + fprintf(stderr, "Couldn't open file 'data06.txt' (UTF-8 data file).\n"); + return U_FILE_ACCESS_ERROR; + } + + info = (CharFreqInfo*)malloc(sizeof(CharFreqInfo) * charCount); + if(!info) + { + fprintf(stderr, " Couldn't allocate %d bytes for freq counter\n", sizeof(CharFreqInfo)*charCount); + } + + /* reset frequencies */ + for(p=0;p<charCount;p++) + { + info[p].codepoint = p; + info[p].frequency = 0; + } + + // **************************** START SAMPLE ******************* + conv = ucnv_open("utf-8", &status); + assert(U_SUCCESS(status)); + + uBufSize = (BUFFERSIZE/ucnv_getMinCharSize(conv)); + printf("input bytes %d / min chars %d = %d UChars\n", + BUFFERSIZE, ucnv_getMinCharSize(conv), uBufSize); + uBuf = (UChar*)malloc(uBufSize * sizeof(UChar)); + assert(uBuf!=NULL); + + // grab another buffer's worth + while((!feof(f)) && + ((count=fread(inBuf, 1, BUFFERSIZE , f)) > 0) ) + { + // Convert bytes to unicode + source = inBuf; + sourceLimit = inBuf + count; + + while(source < sourceLimit) + { + p = ucnv_getNextUChar(conv, &source, sourceLimit, &status); + if(U_FAILURE(status)) + { + fprintf(stderr, "%s @ %d\n", u_errorName(status), total); + status = U_ZERO_ERROR; + continue; + } + U_ASSERT(status); + total++; + + if(u_isalpha(p)) + letters++; + + if((u_tolower(l) == 'i') && (u_tolower(p) == 'e')) + ie++; + + if((u_tolower(l) == 'g') && (u_tolower(p) == 0x0127)) + gh++; + + if(p>charCount) + { + fprintf(stderr, "U+%06X: oh.., we only handle BMP characters so far.. redesign!\n", p); + return U_UNSUPPORTED_ERROR; + } + info[p].frequency++; + l = p; + } + } + + fclose(f); + ucnv_close(conv); + + printf("%d letters out of %d total UChars.\n", letters, total); + printf("%d ie digraphs, %d gh digraphs.\n", ie, gh); + + // now, we could sort it.. 
+ + // qsort(info, charCount, sizeof(info[0]), charfreq_compare); + + for(p=0;p<charCount;p++) + { + if(info[p].frequency) + { + printf("% 5d U+%06X ", info[p].frequency, p); + if(p <= 0xFFFF) + { + prettyPrintUChar((UChar)p); + } + printf("\n"); + } + } + free(info); + // ***************************** END SAMPLE ******************** + + printf("\n"); + + return U_ZERO_ERROR; +} +#undef BUFFERSIZE + + +/****************************************************** + You must call ucnv_close to clean up the memory used by the + converter. + + 'len' returns the number of OUTPUT bytes resulting from the + conversion. + */ + +UErrorCode convsample_12() +{ + printf("\n\n==============================================\n" + "Sample 12: C: simple sjis -> unicode conversion\n"); + + + // **************************** START SAMPLE ******************* + + char source[] = { 0x63, 0x61, 0x74, (char)0x94, 0x4C, (char)0x82, 0x6E, (char)0x82, 0x6A, 0x00 }; + UChar target[100]; + UErrorCode status = U_ZERO_ERROR; + UConverter *conv; + int32_t len; + + // set up the converter + conv = ucnv_open("shift_jis", &status); + assert(U_SUCCESS(status)); + + // convert to Unicode + // Note: we can use strlen, we know it's an 8 bit null terminated codepage + target[6] = 0xFDCA; + len = ucnv_toUChars(conv, target, 100, source, strlen(source), &status); + U_ASSERT(status); + // close the converter + ucnv_close(conv); + + // ***************************** END SAMPLE ******************** + + // Print it out + printBytes("src", source, strlen(source) ); + printf("\n"); + printUChars("targ", target, len); + + return U_ZERO_ERROR; +} + +/****************************************************************** + C: Convert from codepage to Unicode one at a time. 
+*/ + +UErrorCode convsample_13() +{ + printf("\n\n==============================================\n" + "Sample 13: C: simple Big5 -> unicode conversion, char at a time\n"); + + + const char sourceChars[] = { 0x7a, 0x68, 0x3d, (char)0xa4, (char)0xa4, (char)0xa4, (char)0xe5, (char)0x2e }; + // const char sourceChars[] = { 0x7a, 0x68, 0x3d, 0xe4, 0xb8, 0xad, 0xe6, 0x96, 0x87, 0x2e }; + const char *source, *sourceLimit; + UChar32 target; + UErrorCode status = U_ZERO_ERROR; + UConverter *conv = NULL; + int32_t srcCount=0; + int32_t dstCount=0; + + srcCount = sizeof(sourceChars); + + conv = ucnv_open("Big5", &status); + U_ASSERT(status); + + source = sourceChars; + sourceLimit = sourceChars + sizeof(sourceChars); + + // **************************** START SAMPLE ******************* + + + printBytes("src",source,sourceLimit-source); + + while(source < sourceLimit) + { + puts(""); + target = ucnv_getNextUChar (conv, + &source, + sourceLimit, + &status); + + // printBytes("src",source,sourceLimit-source); + U_ASSERT(status); + printUChar(target); + dstCount++; + } + + + // ************************** END SAMPLE ************************* + + printf("src=%d bytes, dst=%d uchars\n", srcCount, dstCount); + ucnv_close(conv); + + return U_ZERO_ERROR; +} + + + + +UBool convsample_20_didSubstitute(const char *source) +{ + UChar uchars[100]; + char bytes[100]; + UConverter *conv = NULL; + UErrorCode status = U_ZERO_ERROR; + uint32_t len, len2; + UBool flagVal; + + FromUFLAGContext * context = NULL; + + printf("\n\n==============================================\n" + "Sample 20: C: Test for substitution using callbacks\n"); + + /* print out the original source */ + printBytes("src", source); + printf("\n"); + + /* First, convert from UTF8 to unicode */ + conv = ucnv_open("utf-8", &status); + U_ASSERT(status); + + len = ucnv_toUChars(conv, uchars, 100, source, strlen(source), &status); + U_ASSERT(status); + + printUChars("uch", uchars, len); + printf("\n"); + + /* Now, close the 
converter */ + ucnv_close(conv); + + /* Now, convert to windows-1252 */ + conv = ucnv_open("windows-1252", &status); + U_ASSERT(status); + + /* Converter starts out with the SUBSTITUTE callback set. */ + + /* initialize our callback */ + context = flagCB_fromU_openContext(); + + /* Set our special callback */ + ucnv_setFromUCallBack(conv, + flagCB_fromU, + context, + &(context->subCallback), + &(context->subContext), + &status); + + U_ASSERT(status); + + len2 = ucnv_fromUChars(conv, bytes, 100, uchars, len, &status); + U_ASSERT(status); + + flagVal = context->flag; /* it's about to go away when we close the cnv */ + + ucnv_close(conv); + + /* print out the original source */ + printBytes("bytes", bytes, len2); + + return flagVal; /* true if callback was called */ +} + +UErrorCode convsample_20() +{ + const char *sample1 = "abc\xdf\xbf"; + const char *sample2 = "abc_def"; + + + if(convsample_20_didSubstitute(sample1)) + { + printf("DID substitute.\n******\n"); + } + else + { + printf("Did NOT substitute.\n*****\n"); + } + + if(convsample_20_didSubstitute(sample2)) + { + printf("DID substitute.\n******\n"); + } + else + { + printf("Did NOT substitute.\n*****\n"); + } + + return U_ZERO_ERROR; +} + +// 21 - C, callback, with clone and debug + + + +UBool convsample_21_didSubstitute(const char *source) +{ + UChar uchars[100]; + char bytes[100]; + UConverter *conv = NULL, *cloneCnv = NULL; + UErrorCode status = U_ZERO_ERROR; + uint32_t len, len2; + int32_t cloneLen; + UBool flagVal = FALSE; + UConverterFromUCallback junkCB; + + FromUFLAGContext *flagCtx = NULL, + *cloneFlagCtx = NULL; + + debugCBContext *debugCtx1 = NULL, + *debugCtx2 = NULL, + *cloneDebugCtx = NULL; + + printf("\n\n==============================================\n" + "Sample 21: C: Test for substitution w/ callbacks & clones \n"); + + /* print out the original source */ + printBytes("src", source); + printf("\n"); + + /* First, convert from UTF8 to unicode */ + conv = ucnv_open("utf-8", &status); + 
U_ASSERT(status); + + len = ucnv_toUChars(conv, uchars, 100, source, strlen(source), &status); + U_ASSERT(status); + + printUChars("uch", uchars, len); + printf("\n"); + + /* Now, close the converter */ + ucnv_close(conv); + + /* Now, convert to windows-1252 */ + conv = ucnv_open("windows-1252", &status); + U_ASSERT(status); + + /* Converter starts out with the SUBSTITUTE callback set. */ + + /* initialize our callback */ + /* from the 'bottom' innermost, out + * CNV -> debugCtx1[debug] -> flagCtx[flag] -> debugCtx2[debug] */ + +#if DEBUG_TMI + printf("flagCB_fromU = %p\n", &flagCB_fromU); + printf("debugCB_fromU = %p\n", &debugCB_fromU); +#endif + + debugCtx1 = debugCB_openContext(); + flagCtx = flagCB_fromU_openContext(); + debugCtx2 = debugCB_openContext(); + + debugCtx1->subCallback = flagCB_fromU; /* debug1 -> flag */ + debugCtx1->subContext = flagCtx; + + flagCtx->subCallback = debugCB_fromU; /* flag -> debug2 */ + flagCtx->subContext = debugCtx2; + + debugCtx2->subCallback = UCNV_FROM_U_CALLBACK_SUBSTITUTE; + debugCtx2->subContext = NULL; + + /* Set our special callback */ + + ucnv_setFromUCallBack(conv, + debugCB_fromU, + debugCtx1, + &(debugCtx2->subCallback), + &(debugCtx2->subContext), + &status); + + U_ASSERT(status); + +#if DEBUG_TMI + printf("Callback chain now: Converter %p -> debug1:%p-> (%p:%p)==flag:%p -> debug2:%p -> cb %p\n", + conv, debugCtx1, debugCtx1->subCallback, + debugCtx1->subContext, flagCtx, debugCtx2, debugCtx2->subCallback); +#endif + + cloneLen = 1; /* but passing in null so it will clone */ + cloneCnv = ucnv_safeClone(conv, NULL, &cloneLen, &status); + + U_ASSERT(status); + +#if DEBUG_TMI + printf("Cloned converter from %p -> %p. 
Closing %p.\n", conv, cloneCnv, conv); +#endif + + ucnv_close(conv); + +#if DEBUG_TMI + printf("%p closed.\n", conv); +#endif + + U_ASSERT(status); + /* Now, we have to extract the context */ + cloneDebugCtx = NULL; + cloneFlagCtx = NULL; + + ucnv_getFromUCallBack(cloneCnv, &junkCB, (const void **)&cloneDebugCtx); + if(cloneDebugCtx != NULL) { + cloneFlagCtx = (FromUFLAGContext*) cloneDebugCtx -> subContext; + } + + printf("Cloned converter chain: %p -> %p[debug1] -> %p[flag] -> %p[debug2] -> substitute\n", + cloneCnv, cloneDebugCtx, cloneFlagCtx, cloneFlagCtx?cloneFlagCtx->subContext:NULL ); + + len2 = ucnv_fromUChars(cloneCnv, bytes, 100, uchars, len, &status); + U_ASSERT(status); + + if(cloneFlagCtx != NULL) { + flagVal = cloneFlagCtx->flag; /* it's about to go away when we close the cnv */ + } else { + printf("** Warning, couldn't get the subcallback \n"); + } + + ucnv_close(cloneCnv); + + /* print out the original source */ + printBytes("bytes", bytes, len2); + + return flagVal; /* true if callback was called */ +} + +UErrorCode convsample_21() +{ + const char *sample1 = "abc\xdf\xbf"; + const char *sample2 = "abc_def"; + + if(convsample_21_didSubstitute(sample1)) + { + printf("DID substitute.\n******\n"); + } + else + { + printf("Did NOT substitute.\n*****\n"); + } + + if(convsample_21_didSubstitute(sample2)) + { + printf("DID substitute.\n******\n"); + } + else + { + printf("Did NOT substitute.\n*****\n"); + } + + return U_ZERO_ERROR; +} + + +// 40- C, cp37 -> UTF16 [data02.bin -> data40.utf16] + +#define BUFFERSIZE 17 /* make it interesting :) */ + +UErrorCode convsample_40() +{ + printf("\n\n==============================================\n" + "Sample 40: C: convert data02.bin from cp37 to UTF16 [data40.utf16]\n"); + + FILE *f; + FILE *out; + int32_t count; + char inBuf[BUFFERSIZE]; + const char *source; + const char *sourceLimit; + UChar *uBuf; + UChar *target; + UChar *targetLimit; + int32_t uBufSize = 0; + UConverter *conv = NULL; + UErrorCode status = 
U_ZERO_ERROR; + uint32_t inbytes=0, total=0; + + f = fopen("data02.bin", "rb"); + if(!f) + { + fprintf(stderr, "Couldn't open file 'data02.bin' (cp37 data file).\n"); + return U_FILE_ACCESS_ERROR; + } + + out = fopen("data40.utf16", "wb"); + if(!out) + { + fprintf(stderr, "Couldn't create file 'data40.utf16'.\n"); + return U_FILE_ACCESS_ERROR; + } + + // **************************** START SAMPLE ******************* + conv = ucnv_openCCSID(37, UCNV_IBM, &status); + assert(U_SUCCESS(status)); + + uBufSize = (BUFFERSIZE/ucnv_getMinCharSize(conv)); + printf("input bytes %d / min chars %d = %d UChars\n", + BUFFERSIZE, ucnv_getMinCharSize(conv), uBufSize); + uBuf = (UChar*)malloc(uBufSize * sizeof(UChar)); + assert(uBuf!=NULL); + + // grab another buffer's worth + while((!feof(f)) && + ((count=fread(inBuf, 1, BUFFERSIZE , f)) > 0) ) + { + inbytes += count; + + // Convert bytes to unicode + source = inBuf; + sourceLimit = inBuf + count; + + do + { + target = uBuf; + targetLimit = uBuf + uBufSize; + + ucnv_toUnicode( conv, &target, targetLimit, + &source, sourceLimit, NULL, + feof(f)?TRUE:FALSE, /* pass 'flush' when eof */ + /* is true (when no more data will come) */ + &status); + + if(status == U_BUFFER_OVERFLOW_ERROR) + { + // simply ran out of space - we'll reset the target ptr the next + // time through the loop. + status = U_ZERO_ERROR; + } + else + { + // Check other errors here. 
+ assert(U_SUCCESS(status)); + // Break out of the loop (by force) + } + + // Process the Unicode + // Todo: handle UTF-16/surrogates + assert(fwrite(uBuf, sizeof(uBuf[0]), (target-uBuf), out) == + (size_t)(target-uBuf)); + total += (target-uBuf); + } while (source < sourceLimit); // while simply out of space + } + + printf("%d bytes in, %d UChars out.\n", inbytes, total); + + // ***************************** END SAMPLE ******************** + ucnv_close(conv); + + fclose(f); + fclose(out); + printf("\n"); + + return U_ZERO_ERROR; +} +#undef BUFFERSIZE + + + +// 46- C, UTF16 -> latin2 [data40.utf16 -> data46.out] + +#define BUFFERSIZE 24 /* make it interesting :) */ + +UErrorCode convsample_46() +{ + printf("\n\n==============================================\n" + "Sample 46: C: convert data40.utf16 from UTF16 to latin2 [data46.out]\n"); + + FILE *f; + FILE *out; + int32_t count; + UChar inBuf[BUFFERSIZE]; + const UChar *source; + const UChar *sourceLimit; + char *buf; + char *target; + char *targetLimit; + + int32_t bufSize = 0; + UConverter *conv = NULL; + UErrorCode status = U_ZERO_ERROR; + uint32_t inchars=0, total=0; + + f = fopen("data40.utf16", "rb"); + if(!f) + { + fprintf(stderr, "Couldn't open file 'data40.utf16' (did you run convsample_40() ?)\n"); + return U_FILE_ACCESS_ERROR; + } + + out = fopen("data46.out", "wb"); + if(!out) + { + fprintf(stderr, "Couldn't create file 'data46.out'.\n"); + return U_FILE_ACCESS_ERROR; + } + + // **************************** START SAMPLE ******************* + conv = ucnv_open( "iso-8859-2", &status); + assert(U_SUCCESS(status)); + + bufSize = (BUFFERSIZE*ucnv_getMaxCharSize(conv)); + printf("input UChars[16] %d * max charsize %d = %d bytes output buffer\n", + BUFFERSIZE, ucnv_getMaxCharSize(conv), bufSize); + buf = (char*)malloc(bufSize * sizeof(char)); + assert(buf!=NULL); + + // grab another buffer's worth + while((!feof(f)) && + ((count=fread(inBuf, sizeof(UChar), BUFFERSIZE , f)) > 0) ) + { + inchars += count; + + // 
Convert Unicode to bytes + source = inBuf; + sourceLimit = inBuf + count; + + do + { + target = buf; + targetLimit = buf + bufSize; + + ucnv_fromUnicode( conv, &target, targetLimit, + &source, sourceLimit, NULL, + feof(f)?TRUE:FALSE, /* pass 'flush' when eof */ + /* is true (when no more data will come) */ + &status); + + if(status == U_BUFFER_OVERFLOW_ERROR) + { + // simply ran out of space - we'll reset the target ptr the next + // time through the loop. + status = U_ZERO_ERROR; + } + else + { + // Check other errors here. + assert(U_SUCCESS(status)); + // Break out of the loop (by force) + } + + // Write out the converted bytes + assert(fwrite(buf, sizeof(buf[0]), (target-buf), out) == + (size_t)(target-buf)); + total += (target-buf); + } while (source < sourceLimit); // while simply out of space + } + + printf("%d Uchars (%d bytes) in, %d chars out.\n", inchars, (int)(inchars * sizeof(UChar)), total); + + // ***************************** END SAMPLE ******************** + ucnv_close(conv); + + fclose(f); + fclose(out); + printf("\n"); + + return U_ZERO_ERROR; +} +#undef BUFFERSIZE + +#define BUFFERSIZE 219 + + +/* main */ + +int main() +{ + + printf("Default Converter=%s\n", ucnv_getDefaultName() ); + + convsample_02(); // C , u->koi8r, conv + convsample_03(); // C, iterate + + convsample_05(); // C, utf8->u, getNextUChar + convsample_06(); // C freq counter thingy + + convsample_12(); // C, sjis->u, conv + convsample_13(); // C, big5->u, getNextU + + convsample_20(); // C, callback + convsample_21(); // C, callback debug + + convsample_40(); // C, cp37 -> UTF16 [data02.bin -> data40.utf16] + + convsample_46(); // C, UTF16 -> latin2 [data40.utf16 -> data46.out] + + printf("End of converter samples.\n"); + + fflush(stdout); + fflush(stderr); + + return 0; +} Copied: trunk/opentrep/test/i18n/icu/icufmt.cpp (from rev 173, trunk/opentrep/test/i18n/icufmt.cpp) =================================================================== --- trunk/opentrep/test/i18n/icu/icufmt.cpp (rev 0) +++ 
trunk/opentrep/test/i18n/icu/icufmt.cpp 2009-08-14 17:51:06 UTC (rev 176) @@ -0,0 +1,27 @@ +// STL +#include <iostream> +// ICU +#include <unicode/choicfmt.h> +#include <unicode/unistr.h> +#include <unicode/ustream.h> + +// //////////// M A I N ///////////// +int main (int argc, char *argv[]) { + double limits[] = {1,2,3,4,5,6,7}; + + UnicodeString weekDayNames[] = { + "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"}; + + ChoiceFormat fmt (limits, weekDayNames, 7); + + UnicodeString str; + for (double x = 1.0; x != 8.0; x += 1.0) { + fmt.format(x, str); + std::cout << x << " -> " << str << std::endl; + } + + std::cout << std::endl; + + return 0; +} + Added: trunk/opentrep/test/i18n/icu/icuustring.cpp =================================================================== --- trunk/opentrep/test/i18n/icu/icuustring.cpp (rev 0) +++ trunk/opentrep/test/i18n/icu/icuustring.cpp 2009-08-14 17:51:06 UTC (rev 176) @@ -0,0 +1,116 @@ +// STL +#include <cstdio> +#include <iostream> +// ICU +#include <unicode/utypes.h> +#include <unicode/uchar.h> +#include <unicode/locid.h> +#include <unicode/ustring.h> +#include <unicode/ucnv.h> +#include <unicode/unistr.h> + +#define LENGTHOF(array) (sizeof(array)/sizeof((array)[0])) + +// helper functions -------------------------------------------------------- *** + +// default converter for the platform encoding +static UConverter* cnv = NULL; + +static void +printUnicodeString(const char *announce, const UnicodeString &s) { + static char out[200]; + int32_t i, length; + + // output the string, converted to the platform encoding + + // Note for Windows: The "platform encoding" defaults to the "ANSI codepage", + // which is different from the "OEM codepage" in the console window. + // However, if you pipe the output into a file and look at it with Notepad + // or similar, then "ANSI" characters will show correctly. 
+ // Production code should be aware of what encoding is required, + // and use a UConverter or at least a charset name explicitly. + out[s.extract(0, 99, out)]=0; + printf("%s%s {", announce, out); + + // output the code units (not code points) + length=s.length(); + for(i=0; i<length; ++i) { + printf(" %04x", s.charAt(i)); + } + printf(" }\n"); +} + +static void demoCaseMapInCPlusPlus() { + /* + * input= + * "<Cyrillic Capital Letter BE>" + * "<Cyrillic Capital Letter GHE>" + */ + static const UChar input[]={ + 0x411, 0x413, 0 + }; + + std::cout << std::endl << "* demoCaseMapInCPlusPlus() --------- ***" + << std::endl << std::endl; + + UnicodeString s(input), t; + const Locale& en = Locale::getEnglish(); + Locale ru ("ru"); + + /* + * Full case mappings as in demoCaseMapInC(), using UnicodeString functions. + * These functions modify the string object itself. + * Since we want to keep the input string around, we copy it each time + * and case-map the copy. + */ + printUnicodeString("input string: ", s); + + /* lowercase/English */ + printUnicodeString("full-lowercased/en: ", (t=s).toLower(en)); + /* lowercase/Russian */ + printUnicodeString("full-lowercased/ru: ", (t=s).toLower(ru)); + /* uppercase/English */ + printUnicodeString("full-uppercased/en: ", (t=s).toUpper(en)); + /* uppercase/Russian */ + printUnicodeString("full-uppercased/ru: ", (t=s).toUpper(ru)); + /* titlecase/English */ + printUnicodeString("full-titlecased/en: ", (t=s).toTitle(NULL, en)); + /* titlecase/Russian */ + printUnicodeString("full-titlecased/ru: ", (t=s).toTitle(NULL, ru)); + /* case-fold/default */ + printUnicodeString("full-case-folded/default: ", (t=s).foldCase(U_FOLD_CASE_DEFAULT)); + /* case-fold, excluding the special (Turkic) I mappings; the "Russian" label below is only a label */ + printUnicodeString("full-case-folded/Russian: ", (t=s).foldCase(U_FOLD_CASE_EXCLUDE_SPECIAL_I)); +} + +extern int +main(int argc, const char *argv[]) { + UErrorCode errorCode=U_ZERO_ERROR; + + // Note: Using a global variable for any object is not exactly + // 
thread-safe... + // You can change this call to e.g. ucnv_open("UTF-8", &errorCode) + // if you pipe the output to a file and look at it with a + // Unicode-capable editor. This will currently affect only the + // printUString() function, see the code above. + // printUnicodeString() could use this, too, by changing to an + // extract() overload that takes a UConverter argument. + // cnv = ucnv_open(NULL, &errorCode); + cnv = ucnv_open("UTF-8", &errorCode); + if(U_FAILURE(errorCode)) { + fprintf(stderr, "error %s opening the default converter\n", u_errorName(errorCode)); + return errorCode; + } + + ucnv_setFromUCallBack(cnv, UCNV_FROM_U_CALLBACK_ESCAPE, UCNV_ESCAPE_C, NULL, NULL, &errorCode); + if(U_FAILURE(errorCode)) { + fprintf(stderr, "error %s setting the escape callback in the default converter\n", u_errorName(errorCode)); + ucnv_close(cnv); + return errorCode; + } + + demoCaseMapInCPlusPlus(); + + ucnv_close(cnv); + return 0; +} Added: trunk/opentrep/test/i18n/icu/icuustringref.cpp =================================================================== --- trunk/opentrep/test/i18n/icu/icuustringref.cpp (rev 0) +++ trunk/opentrep/test/i18n/icu/icuustringref.cpp 2009-08-14 17:51:06 UTC (rev 176) @@ -0,0 +1,609 @@ +/* +******************************************************************************* +* +* Copyright (C) 2000-2002, International Business Machines +* Corporation and others. All Rights Reserved. +* +******************************************************************************* +* file name: ustring.c +* encoding: US-ASCII +* tab size: 8 (not used) +* indentation:4 +* +* created on: 2000aug15 +* created by: Markus W. Scherer +* +* This file contains sample code that illustrates the use of Unicode strings +* with ICU. 
+*/ + +#include <stdio.h> +#include "unicode/utypes.h" +#include "unicode/uchar.h" +#include "unicode/locid.h" +#include "unicode/ustring.h" +#include "unicode/ucnv.h" +#include "unicode/unistr.h" + +#define LENGTHOF(array) (sizeof(array)/sizeof((array)[0])) + +// helper functions -------------------------------------------------------- *** + +// default converter for the platform encoding +static UConverter *cnv=NULL; + +static void +printUString(const char *announce, const UChar *s, int32_t length) { + static char out[200]; + UChar32 c; + int32_t i; + UErrorCode errorCode=U_ZERO_ERROR; + + /* + * Convert to the "platform encoding". See notes in printUnicodeString(). + * ucnv_fromUChars(), like most ICU APIs understands length==-1 + * to mean that the string is NUL-terminated. + */ + ucnv_fromUChars(cnv, out, sizeof(out), s, length, &errorCode); + if(U_FAILURE(errorCode) || errorCode==U_STRING_NOT_TERMINATED_WARNING) { + printf("%sproblem converting string from Unicode: %s\n", announce, u_errorName(errorCode)); + return; + } + + printf("%s%s {", announce, out); + + /* output the code points (not code units) */ + if(length>=0) { + /* s is not NUL-terminated */ + for(i=0; i<length; /* U16_NEXT post-increments */) { + U16_NEXT(s, i, length, c); + printf(" %04x", c); + } + } else { + /* s is NUL-terminated */ + for(i=0; /* condition in loop body */; /* U16_NEXT post-increments */) { + U16_NEXT(s, i, length, c); + if(c==0) { + break; + } + printf(" %04x", c); + } + } + printf(" }\n"); +} + +static void +printUnicodeString(const char *announce, const UnicodeString &s) { + static char out[200]; + int32_t i, length; + + // output the string, converted to the platform encoding + + // Note for Windows: The "platform encoding" defaults to the "ANSI codepage", + // which is different from the "OEM codepage" in the console window. + // However, if you pipe the output into a file and look at it with Notepad + // or similar, then "ANSI" characters will show correctly. 
+ // Production code should be aware of what encoding is required, + // and use a UConverter or at least a charset name explicitly. + out[s.extract(0, 99, out)]=0; + printf("%s%s {", announce, out); + + // output the code units (not code points) + length=s.length(); + for(i=0; i<length; ++i) { + printf(" %04x", s.charAt(i)); + } + printf(" }\n"); +} + +// sample code for utf.h macros -------------------------------------------- *** + +static void +demo_utf_h_macros() { + static UChar input[]={ 0x0061, 0xd800, 0xdc00, 0xdbff, 0xdfff, 0x0062 }; + UChar32 c; + int32_t i; + UBool isError; + + printf("\n* demo_utf_h_macros() -------------- ***\n\n"); + + printUString("iterate forward through: ", input, LENGTHOF(input)); + for(i=0; i<LENGTHOF(input); /* U16_NEXT post-increments */) { + /* Iterating forwards + Codepoint at offset 0: U+0061 + Codepoint at offset 1: U+10000 + Codepoint at offset 3: U+10ffff + Codepoint at offset 5: U+0062 + */ + printf("Codepoint at offset %d: U+", i); + U16_NEXT(input, i, LENGTHOF(input), c); + printf("%04x\n", c); + } + + puts(""); + + isError=FALSE; + i=1; /* write position, gets post-incremented so needs to be in an l-value */ + U16_APPEND(input, i, LENGTHOF(input), 0x0062, isError); + + printUString("iterate backward through: ", input, LENGTHOF(input)); + for(i=LENGTHOF(input); i>0; /* U16_PREV pre-decrements */) { + U16_PREV(input, 0, i, c); + /* Iterating backwards + Codepoint at offset 5: U+0062 + Codepoint at offset 3: U+10ffff + Codepoint at offset 2: U+dc00 -- unpaired surrogate because lead surr. overwritten + Codepoint at offset 1: U+0062 -- by this BMP code point + Codepoint at offset 0: U+0061 + */ + printf("Codepoint at offset %d: U+%04x\n", i, c); + } +} + +// sample code for Unicode strings in C ------------... [truncated message content] |
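Note on the forward iteration in demo_utf_h_macros() above: the following is a minimal standalone sketch, not part of the commit, of what the U16_NEXT loop computes when it walks UTF-16 code units and yields code points. It does not use ICU; `decodeUtf16` is a hypothetical helper name introduced only for illustration, and it assumes the same behaviour the sample's comments describe (a lead/trail surrogate pair becomes one supplementary code point, an unpaired surrogate is passed through as-is).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an ICU function): decode UTF-16 code units
// into code points, combining surrogate pairs and passing unpaired
// surrogates through unchanged.
std::vector<std::uint32_t> decodeUtf16(const std::uint16_t* s, std::size_t length) {
    assert(s != nullptr || length == 0);
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < length) {
        std::uint32_t c = s[i++];
        // A lead surrogate immediately followed by a trail surrogate
        // encodes one code point in the range U+10000..U+10FFFF.
        if (c >= 0xD800 && c <= 0xDBFF && i < length &&
            s[i] >= 0xDC00 && s[i] <= 0xDFFF) {
            c = 0x10000 + ((c - 0xD800) << 10) + (s[i] - 0xDC00);
            ++i;
        }
        out.push_back(c);
    }
    return out;
}
```

Applied to the sample's input array { 0x0061, 0xd800, 0xdc00, 0xdbff, 0xdfff, 0x0062 }, this yields the four code points listed in the sample's own comment: U+0061, U+10000, U+10ffff, U+0062.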
From: <den...@us...> - 2009-08-15 16:04:46
|
Revision: 177 http://opentrep.svn.sourceforge.net/opentrep/?rev=177&view=rev Author: denis_arnaud Date: 2009-08-15 16:04:38 +0000 (Sat, 15 Aug 2009) Log Message: ----------- [i18n] Added a tool on UTF-8 string handling, by Jeff Bezanson (Wikix). Modified Paths: -------------- trunk/opentrep/configure.ac trunk/opentrep/test/i18n/Makefile.am Added Paths: ----------- trunk/opentrep/test/i18n/utf8/ trunk/opentrep/test/i18n/utf8/Makefile.am trunk/opentrep/test/i18n/utf8/utf8.cpp trunk/opentrep/test/i18n/utf8/utf8.hpp Modified: trunk/opentrep/configure.ac =================================================================== --- trunk/opentrep/configure.ac 2009-08-14 17:51:06 UTC (rev 176) +++ trunk/opentrep/configure.ac 2009-08-15 16:04:38 UTC (rev 177) @@ -265,6 +265,7 @@ test/parsers/Makefile test/i18n/Makefile test/i18n/icu/Makefile + test/i18n/utf8/Makefile test/python/Makefile test/iterator/Makefile test/Makefile Modified: trunk/opentrep/test/i18n/Makefile.am =================================================================== --- trunk/opentrep/test/i18n/Makefile.am 2009-08-14 17:51:06 UTC (rev 176) +++ trunk/opentrep/test/i18n/Makefile.am 2009-08-15 16:04:38 UTC (rev 177) @@ -1,6 +1,8 @@ ## command sub-directory include $(top_srcdir)/Makefile.common +SUBDIRS = icu utf8 + MAINTAINERCLEANFILES = Makefile.in check_PROGRAMS = boost_string loc2 stdlocru simple_io Property changes on: trunk/opentrep/test/i18n/utf8 ___________________________________________________________________ Added: svn:ignore + .deps .libs Makefile Makefile.in utf8 Added: trunk/opentrep/test/i18n/utf8/Makefile.am =================================================================== --- trunk/opentrep/test/i18n/utf8/Makefile.am (rev 0) +++ trunk/opentrep/test/i18n/utf8/Makefile.am 2009-08-15 16:04:38 UTC (rev 177) @@ -0,0 +1,12 @@ +## command sub-directory +include $(top_srcdir)/Makefile.common + +MAINTAINERCLEANFILES = Makefile.in + +check_PROGRAMS = utf8 + +utf8_SOURCES = utf8.cpp +utf8_CXXFLAGS = 
+utf8_LDFLAGS = + +EXTRA_DIST = Added: trunk/opentrep/test/i18n/utf8/utf8.cpp =================================================================== --- trunk/opentrep/test/i18n/utf8/utf8.cpp (rev 0) +++ trunk/opentrep/test/i18n/utf8/utf8.cpp 2009-08-15 16:04:38 UTC (rev 177) @@ -0,0 +1,483 @@ +/* + Basic UTF-8 manipulation routines + by Jeff Bezanson + placed in the public domain Fall 2005 + + This code is designed to provide the utilities you need to manipulate + UTF-8 as an internal string encoding. These functions do not perform the + error checking normally needed when handling UTF-8 data, so if you happen + to be from the Unicode Consortium you will want to flay me alive. + I do this because error checking can be performed at the boundaries (I/O), + with these routines reserved for higher performance on data known to be + valid. +*/ +#include <cstdlib> +#include <cstdio> +#include <cstring> +#include <cstdarg> +#ifdef WIN32 +#include <malloc.h> +#else +#include <alloca.h> +#endif + +#include "utf8.hpp" + +static const u_int32_t offsetsFromUTF8[6] = { + 0x00000000UL, 0x00003080UL, 0x000E2080UL, + 0x03C82080UL, 0xFA082080UL, 0x82082080UL +}; + +static const char trailingBytesForUTF8[256] = { + 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, + 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, + 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2, 3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5 +}; + +/* returns length of next utf-8 sequence */ +int u8_seqlen(char *s) +{ + return trailingBytesForUTF8[(unsigned int)(unsigned char)s[0]] + 1; +} + +/* conversions without error checking + only works for valid UTF-8, i.e. 
no 5- or 6-byte sequences + srcsz = source size in bytes, or -1 if 0-terminated + sz = dest size in # of wide characters + + returns # characters converted + dest will always be L'\0'-terminated, even if there isn't enough room + for all the characters. + if sz = srcsz+1 (i.e. 4*srcsz+4 bytes), there will always be enough space. +*/ +int u8_toucs(u_int32_t *dest, int sz, char *src, int srcsz) +{ + u_int32_t ch; + char *src_end = src + srcsz; + int nb; + int i=0; + + while (i < sz-1) { + nb = trailingBytesForUTF8[(unsigned char)*src]; + if (srcsz == -1) { + if (*src == 0) + goto done_toucs; + } + else { + if (src + nb >= src_end) + goto done_toucs; + } + ch = 0; + switch (nb) { + /* these fall through deliberately */ + case 3: ch += (unsigned char)*src++; ch <<= 6; + case 2: ch += (unsigned char)*src++; ch <<= 6; + case 1: ch += (unsigned char)*src++; ch <<= 6; + case 0: ch += (unsigned char)*src++; + } + ch -= offsetsFromUTF8[nb]; + dest[i++] = ch; + } + done_toucs: + dest[i] = 0; + return i; +} + +/* srcsz = number of source characters, or -1 if 0-terminated + sz = size of dest buffer in bytes + + returns # characters converted + dest will only be '\0'-terminated if there is enough space. this is + for consistency; imagine there are 2 bytes of space left, but the next + character requires 3 bytes. in this case we could NUL-terminate, but in + general we can't when there's insufficient space. therefore this function + only NUL-terminates if all the characters fit, and there's space for + the NUL as well. + the destination string will never be bigger than the source string. +*/ +int u8_toutf8(char *dest, int sz, u_int32_t *src, int srcsz) +{ + u_int32_t ch; + int i = 0; + char *dest_end = dest + sz; + + while (srcsz<0 ? 
src[i]!=0 : i < srcsz) { + ch = src[i]; + if (ch < 0x80) { + if (dest >= dest_end) + return i; + *dest++ = (char)ch; + } + else if (ch < 0x800) { + if (dest >= dest_end-1) + return i; + *dest++ = (ch>>6) | 0xC0; + *dest++ = (ch & 0x3F) | 0x80; + } + else if (ch < 0x10000) { + if (dest >= dest_end-2) + return i; + *dest++ = (ch>>12) | 0xE0; + *dest++ = ((ch>>6) & 0x3F) | 0x80; + *dest++ = (ch & 0x3F) | 0x80; + } + else if (ch < 0x110000) { + if (dest >= dest_end-3) + return i; + *dest++ = (ch>>18) | 0xF0; + *dest++ = ((ch>>12) & 0x3F) | 0x80; + *dest++ = ((ch>>6) & 0x3F) | 0x80; + *dest++ = (ch & 0x3F) | 0x80; + } + i++; + } + if (dest < dest_end) + *dest = '\0'; + return i; +} + +int u8_wc_toutf8(char *dest, u_int32_t ch) +{ + if (ch < 0x80) { + dest[0] = (char)ch; + return 1; + } + if (ch < 0x800) { + dest[0] = (ch>>6) | 0xC0; + dest[1] = (ch & 0x3F) | 0x80; + return 2; + } + if (ch < 0x10000) { + dest[0] = (ch>>12) | 0xE0; + dest[1] = ((ch>>6) & 0x3F) | 0x80; + dest[2] = (ch & 0x3F) | 0x80; + return 3; + } + if (ch < 0x110000) { + dest[0] = (ch>>18) | 0xF0; + dest[1] = ((ch>>12) & 0x3F) | 0x80; + dest[2] = ((ch>>6) & 0x3F) | 0x80; + dest[3] = (ch & 0x3F) | 0x80; + return 4; + } + return 0; +} + +/* charnum => byte offset */ +int u8_offset(char *str, int charnum) +{ + int offs=0; + + while (charnum > 0 && str[offs]) { + (void)(isutf(str[++offs]) || isutf(str[++offs]) || + isutf(str[++offs]) || ++offs); + charnum--; + } + return offs; +} + +/* byte offset => charnum */ +int u8_charnum(char *s, int offset) +{ + int charnum = 0, offs=0; + + while (offs < offset && s[offs]) { + (void)(isutf(s[++offs]) || isutf(s[++offs]) || + isutf(s[++offs]) || ++offs); + charnum++; + } + return charnum; +} + +/* number of characters */ +int u8_strlen(char *s) +{ + int count = 0; + int i = 0; + + while (u8_nextchar(s, &i) != 0) + count++; + + return count; +} + +/* reads the next utf-8 sequence out of a string, updating an index */ +u_int32_t u8_nextchar(char *s, int *i) +{ + 
u_int32_t ch = 0; + int sz = 0; + + do { + ch <<= 6; + ch += (unsigned char)s[(*i)++]; + sz++; + } while (s[*i] && !isutf(s[*i])); + ch -= offsetsFromUTF8[sz-1]; + + return ch; +} + +void u8_inc(char *s, int *i) +{ + (void)(isutf(s[++(*i)]) || isutf(s[++(*i)]) || + isutf(s[++(*i)]) || ++(*i)); +} + +void u8_dec(char *s, int *i) +{ + (void)(isutf(s[--(*i)]) || isutf(s[--(*i)]) || + isutf(s[--(*i)]) || --(*i)); +} + +int octal_digit(char c) +{ + return (c >= '0' && c <= '7'); +} + +int hex_digit(char c) +{ + return ((c >= '0' && c <= '9') || + (c >= 'A' && c <= 'F') || + (c >= 'a' && c <= 'f')); +} + +/* assumes that src points to the character after a backslash + returns number of input characters processed */ +int u8_read_escape_sequence(char *str, u_int32_t *dest) +{ + u_int32_t ch; + char digs[10]="\0\0\0\0\0\0\0\0\0"; + int dno=0, i=1; + + ch = (u_int32_t)str[0]; /* take literal character */ + if (str[0] == 'n') + ch = L'\n'; + else if (str[0] == 't') + ch = L'\t'; + else if (str[0] == 'r') + ch = L'\r'; + else if (str[0] == 'b') + ch = L'\b'; + else if (str[0] == 'f') + ch = L'\f'; + else if (str[0] == 'v') + ch = L'\v'; + else if (str[0] == 'a') + ch = L'\a'; + else if (octal_digit(str[0])) { + i = 0; + do { + digs[dno++] = str[i++]; + } while (octal_digit(str[i]) && dno < 3); + ch = strtol(digs, NULL, 8); + } + else if (str[0] == 'x') { + while (hex_digit(str[i]) && dno < 2) { + digs[dno++] = str[i++]; + } + if (dno > 0) + ch = strtol(digs, NULL, 16); + } + else if (str[0] == 'u') { + while (hex_digit(str[i]) && dno < 4) { + digs[dno++] = str[i++]; + } + if (dno > 0) + ch = strtol(digs, NULL, 16); + } + else if (str[0] == 'U') { + while (hex_digit(str[i]) && dno < 8) { + digs[dno++] = str[i++]; + } + if (dno > 0) + ch = strtol(digs, NULL, 16); + } + *dest = ch; + + return i; +} + +/* convert a string with literal \uxxxx or \Uxxxxxxxx characters to UTF-8 + example: u8_unescape(mybuf, 256, "hello\\u220e") + note the double backslash is needed if called on a C 
string literal */ +int u8_unescape(char *buf, int sz, char *src) +{ + int c=0, amt; + u_int32_t ch; + char temp[4]; + + while (*src && c < sz) { + if (*src == '\\') { + src++; + amt = u8_read_escape_sequence(src, &ch); + } + else { + ch = (u_int32_t)*src; + amt = 1; + } + src += amt; + amt = u8_wc_toutf8(temp, ch); + if (amt > sz-c) + break; + memcpy(&buf[c], temp, amt); + c += amt; + } + if (c < sz) + buf[c] = '\0'; + return c; +} + +int u8_escape_wchar(char *buf, int sz, u_int32_t ch) +{ + if (ch == L'\n') + return snprintf(buf, sz, "\\n"); + else if (ch == L'\t') + return snprintf(buf, sz, "\\t"); + else if (ch == L'\r') + return snprintf(buf, sz, "\\r"); + else if (ch == L'\b') + return snprintf(buf, sz, "\\b"); + else if (ch == L'\f') + return snprintf(buf, sz, "\\f"); + else if (ch == L'\v') + return snprintf(buf, sz, "\\v"); + else if (ch == L'\a') + return snprintf(buf, sz, "\\a"); + else if (ch == L'\\') + return snprintf(buf, sz, "\\\\"); + else if (ch < 32 || ch == 0x7f) + return snprintf(buf, sz, "\\x%hhX", (unsigned char)ch); + else if (ch > 0xFFFF) + return snprintf(buf, sz, "\\U%.8X", (u_int32_t)ch); + else if (ch >= 0x80 && ch <= 0xFFFF) + return snprintf(buf, sz, "\\u%.4hX", (unsigned short)ch); + + return snprintf(buf, sz, "%c", (char)ch); +} + +int u8_escape(char *buf, int sz, char *src, int escape_quotes) +{ + int c=0, i=0, amt; + + while (src[i] && c < sz) { + if (escape_quotes && src[i] == '"') { + amt = snprintf(buf, sz - c, "\\\""); + i++; + } + else { + amt = u8_escape_wchar(buf, sz - c, u8_nextchar(src, &i)); + } + c += amt; + buf += amt; + } + if (c < sz) + *buf = '\0'; + return c; +} + +char *u8_strchr(char *s, u_int32_t ch, int *charn) +{ + int i = 0, lasti=0; + u_int32_t c; + + *charn = 0; + while (s[i]) { + c = u8_nextchar(s, &i); + if (c == ch) { + return &s[lasti]; + } + lasti = i; + (*charn)++; + } + return NULL; +} + +char *u8_memchr(char *s, u_int32_t ch, size_t sz, int *charn) +{ + int lasti=0; + size_t i =0; + u_int32_t c; + 
int csz; + + *charn = 0; + while (i < sz) { + c = csz = 0; + do { + c <<= 6; + c += (unsigned char)s[i++]; + csz++; + } while (i < sz && !isutf(s[i])); + c -= offsetsFromUTF8[csz-1]; + + if (c == ch) { + return &s[lasti]; + } + lasti = i; + (*charn)++; + } + return NULL; +} + +int u8_is_locale_utf8(char *locale) +{ + /* this code based on libutf8 */ + const char* cp = locale; + + for (; *cp != '\0' && *cp != '@' && *cp != '+' && *cp != ','; cp++) { + if (*cp == '.') { + const char* encoding = ++cp; + for (; *cp != '\0' && *cp != '@' && *cp != '+' && *cp != ','; cp++) + ; + if ((cp-encoding == 5 && !strncmp(encoding, "UTF-8", 5)) + || (cp-encoding == 4 && !strncmp(encoding, "utf8", 4))) + return 1; /* it's UTF-8 */ + break; + } + } + return 0; +} + +int u8_vprintf(char *fmt, va_list ap) +{ + int cnt, sz=0; + char *buf; + u_int32_t *wcs; + + sz = 512; + buf = (char*)alloca(sz); + try_print: + cnt = vsnprintf(buf, sz, fmt, ap); + if (cnt >= sz) { + buf = (char*)alloca(cnt - sz + 1); + sz = cnt + 1; + goto try_print; + } + wcs = (u_int32_t*)alloca((cnt+1) * sizeof(u_int32_t)); + cnt = u8_toucs(wcs, cnt+1, buf, cnt); + printf("%ls", (wchar_t*)wcs); + return cnt; +} + +int u8_printf(char *fmt, ...) +{ + int cnt; + va_list args; + + va_start(args, fmt); + + cnt = u8_vprintf(fmt, args); + + va_end(args); + return cnt; +} + +// ////////////////// M A I N /////////////////// +int main (int argc, char* argv[]) { + + return 0; +} Added: trunk/opentrep/test/i18n/utf8/utf8.hpp =================================================================== --- trunk/opentrep/test/i18n/utf8/utf8.hpp (rev 0) +++ trunk/opentrep/test/i18n/utf8/utf8.hpp 2009-08-15 16:04:38 UTC (rev 177) @@ -0,0 +1,72 @@ +// +#include <cstdarg> + +/* is c the start of a utf8 sequence? 
*/ +#define isutf(c) (((c)&0xC0)!=0x80) + +/* convert UTF-8 data to wide character */ +int u8_toucs(u_int32_t *dest, int sz, char *src, int srcsz); + +/* the opposite conversion */ +int u8_toutf8(char *dest, int sz, u_int32_t *src, int srcsz); + +/* single character to UTF-8 */ +int u8_wc_toutf8(char *dest, u_int32_t ch); + +/* character number to byte offset */ +int u8_offset(char *str, int charnum); + +/* byte offset to character number */ +int u8_charnum(char *s, int offset); + +/* return next character, updating an index variable */ +u_int32_t u8_nextchar(char *s, int *i); + +/* move to next character */ +void u8_inc(char *s, int *i); + +/* move to previous character */ +void u8_dec(char *s, int *i); + +/* returns length of next utf-8 sequence */ +int u8_seqlen(char *s); + +/* assuming src points to the character after a backslash, read an + escape sequence, storing the result in dest and returning the number of + input characters processed */ +int u8_read_escape_sequence(char *src, u_int32_t *dest); + +/* given a wide character, convert it to an ASCII escape sequence stored in + buf, where buf is "sz" bytes. returns the number of characters output. */ +int u8_escape_wchar(char *buf, int sz, u_int32_t ch); + +/* convert a string "src" containing escape sequences to UTF-8 */ +int u8_unescape(char *buf, int sz, char *src); + +/* convert UTF-8 "src" to ASCII with escape sequences. + if escape_quotes is nonzero, quote characters will be preceded by + backslashes as well. */ +int u8_escape(char *buf, int sz, char *src, int escape_quotes); + +/* utility predicates used by the above */ +int octal_digit(char c); +int hex_digit(char c); + +/* return a pointer to the first occurrence of ch in s, or NULL if not + found. character index of found character returned in *charn. */ +char *u8_strchr(char *s, u_int32_t ch, int *charn); + +/* same as the above, but searches a buffer of a given size instead of + a NUL-terminated string. 
*/ +char *u8_memchr(char *s, u_int32_t ch, size_t sz, int *charn); + +/* count the number of characters in a UTF-8 string */ +int u8_strlen(char *s); + +int u8_is_locale_utf8(char *locale); + +/* printf where the format string and arguments may be in UTF-8. + you can avoid this function and just use ordinary printf() if the current + locale is UTF-8. */ +int u8_vprintf(char *fmt, va_list ap); +int u8_printf(char *fmt, ...); This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |
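Editorial note: the decoding loop in u8_toucs relies on the offsetsFromUTF8 table. The raw bytes of a sequence are accumulated with 6-bit shifts, and the table entry for the sequence length then subtracts the lead-byte and continuation-byte marker bits in a single step. The following is a minimal C++ sketch of that idea, not the committed code. The names kOffsets, trailingBytes and decodeUtf8 are illustrative assumptions, and trailingBytes replaces the 256-entry lookup table with plain comparisons. Like the original, the sketch assumes the input is already valid UTF-8.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Offsets for 1- to 4-byte sequences, same values as the first four
// entries of offsetsFromUTF8 in the commit. Each entry is the sum of the
// marker bits (lead byte and continuation bytes) after shifting.
static const std::uint32_t kOffsets[4] = {
  0x00000000UL, 0x00003080UL, 0x000E2080UL, 0x03C82080UL
};

// Hypothetical helper: number of trailing bytes announced by a lead byte.
// Lead bytes 0xF8..0xFF (the obsolete 5- and 6-byte forms) are not valid
// UTF-8 and are not distinguished here.
static std::size_t trailingBytes (unsigned char lead) {
  if (lead < 0xC0) return 0;  // ASCII (or a stray continuation byte)
  if (lead < 0xE0) return 1;
  if (lead < 0xF0) return 2;
  return 3;
}

// Decode a UTF-8 string into code points. Stops at the first sequence
// that would run past the end of the input; performs no other validation.
std::vector<std::uint32_t> decodeUtf8 (const std::string& in) {
  std::vector<std::uint32_t> out;
  std::size_t idx = 0;
  while (idx < in.size()) {
    const std::size_t nb =
      trailingBytes (static_cast<unsigned char> (in[idx]));
    if (idx + nb >= in.size()) {
      break;  // truncated sequence
    }
    std::uint32_t ch = 0;
    // Accumulate the lead byte and all but the last continuation byte.
    for (std::size_t k = 0; k < nb; ++k) {
      ch += static_cast<unsigned char> (in[idx++]);
      ch <<= 6;
    }
    ch += static_cast<unsigned char> (in[idx++]);
    ch -= kOffsets[nb];  // strip all marker bits at once
    out.push_back (ch);
  }
  return out;
}
```

For the two-byte sequence 0xC3 0xBC, the accumulation gives (0xC3 << 6) + 0xBC = 0x317C, and subtracting 0x3080 leaves 0xFC, the code point of "ü".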
From: <den...@us...> - 2009-08-16 14:08:17
|
Revision: 179 http://opentrep.svn.sourceforge.net/opentrep/?rev=179&view=rev Author: denis_arnaud Date: 2009-08-16 14:08:09 +0000 (Sun, 16 Aug 2009) Log Message: ----------- [i18n] Added a utility class for conversion from/to UTF8 strings to/from wide-character strings. Modified Paths: -------------- trunk/opentrep/opentrep/basic/sources.mk trunk/opentrep/test/i18n/icu/Makefile.am trunk/opentrep/test/i18n/stdlocru.cpp trunk/opentrep/test/i18n/utf8/Makefile.am trunk/opentrep/test/i18n/utf8/utf8.cpp trunk/opentrep/test/i18n/utf8/utf8.hpp trunk/opentrep/test/i18n/utf8/utf8string.cpp Added Paths: ----------- trunk/opentrep/opentrep/basic/UTF8Handler.cpp trunk/opentrep/opentrep/basic/UTF8Handler.hpp Added: trunk/opentrep/opentrep/basic/UTF8Handler.cpp =================================================================== --- trunk/opentrep/opentrep/basic/UTF8Handler.cpp (rev 0) +++ trunk/opentrep/opentrep/basic/UTF8Handler.cpp 2009-08-16 14:08:09 UTC (rev 179) @@ -0,0 +1,183 @@ +// ////////////////////////////////////////////////////////////////////// +// Import section +// ////////////////////////////////////////////////////////////////////// +// STL +#include <cassert> +#include <sstream> +#include <string> +// OpenTrep +#include <opentrep/basic/UTF8Handler.hpp> + +namespace OPENTREP { + + // ////////////////////////////////////////////////////////////////////// + static const wchar_t offsetsFromUTF8[6] = { + 0x00000000UL, 0x00003080UL, 0x000E2080UL, + 0x03C82080UL, 0xFA082080UL, 0x82082080UL + }; + + // ////////////////////////////////////////////////////////////////////// + static const char trailingBytesForUTF8[256] = { + 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, + 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, + 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, + 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2, 3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5 + }; + + // ////////////////////////////////////////////////////////////////////// + std::wstring UTF8Handler::toWideString (const std::string& iSrc) { + std::basic_ostringstream<wchar_t> oStr; + + // Length of the source string + const size_t lStringSize = iSrc.size(); + + // Transform the source string in a regular C-string (char*) + const char* src = iSrc.c_str(); + + // + typedef unsigned char uchar_t; + + size_t idx = 0; + while (idx != lStringSize) { + + uchar_t lCurrentChar = static_cast<uchar_t> (src[idx]); + + // When there are multi-byte characters (e.g., for UTF-8 encoded + // STL strings), the size of the STL string corresponds to the + // total number of bytes. For instance, "München" has a size of 8 + // bytes (and not 7 characters). However, the iteration is made on + // the number of characters (idx); when the end of the string is + // reached, the loop must therefore be exited. 
+ if (lCurrentChar == '\0') { + break; + } + + const int nb = trailingBytesForUTF8[lCurrentChar]; + + wchar_t tmpChar = 0; + switch (nb) { + // These fall through deliberately + case 3: { + lCurrentChar = static_cast<uchar_t> (src[idx]); ++idx; + tmpChar += lCurrentChar; tmpChar <<= 6; + } + case 2: { + lCurrentChar = static_cast<uchar_t> (src[idx]); ++idx; + tmpChar += lCurrentChar; tmpChar <<= 6; + } + case 1: { + lCurrentChar = static_cast<uchar_t> (src[idx]); ++idx; + tmpChar += lCurrentChar; tmpChar <<= 6; + } + case 0: { + lCurrentChar = static_cast<uchar_t> (src[idx]); ++idx; + tmpChar += lCurrentChar; + } + } + + tmpChar -= offsetsFromUTF8[nb]; + oStr << tmpChar; + } + + oStr << '\0'; + return oStr.str(); + } + + // ////////////////////////////////////////////////////////////////////// + std::string UTF8Handler::toSimpleString (const std::wstring& iStr) { + std::ostringstream oStr; + + const wchar_t* src = iStr.c_str(); + size_t idx = 0; + size_t i = 0; + + while (src[i] != 0) { + wchar_t ch = src[i]; + + if (ch < 0x80) { + const char tmpChar = static_cast<const char> (ch); + oStr << tmpChar; ++idx; + + } else if (ch < 0x800) { + char tmpChar = static_cast<const char> ((ch >> 6) | 0xC0); + oStr << tmpChar; ++idx; + + tmpChar = static_cast<const char> ((ch & 0x3F) | 0x80); + oStr << tmpChar; ++idx; + + } else if (ch < 0x10000) { + char tmpChar = static_cast<const char> ((ch>>12) | 0xE0); + oStr << tmpChar; ++idx; + + tmpChar = static_cast<const char> (((ch>>6) & 0x3F) | 0x80); + oStr << tmpChar; ++idx; + + tmpChar = static_cast<const char> ((ch & 0x3F) | 0x80); + oStr << tmpChar; ++idx; + + } else if (ch < 0x110000) { + char tmpChar = static_cast<const char> ((ch>>18) | 0xF0); + oStr << tmpChar; ++idx; + + tmpChar = static_cast<const char> (((ch>>12) & 0x3F) | 0x80); + oStr << tmpChar; ++idx; + + tmpChar = static_cast<const char> (((ch>>6) & 0x3F) | 0x80); + oStr << tmpChar; ++idx; + + tmpChar = static_cast<const char> ((ch & 0x3F) | 0x80); + oStr << 
tmpChar; ++idx; + } + i++; + } + + oStr << '\0'; + + return oStr.str(); + } + + // ////////////////////////////////////////////////////////////////////// + std::string UTF8Handler::displayCharString (const char* iString) { + std::ostringstream oStr; + + bool hasReachedEnd = false; + for (size_t idx = 0; hasReachedEnd == false; ++idx) { + if (idx != 0) { + oStr << "; "; + } + const unsigned char lChar = iString[idx]; + // const wchar_t lChar = iString[idx]; + if (lChar == '\0') { + hasReachedEnd = true; + } + oStr << "[" << idx << "]: " << std::hex << lChar; + } + oStr << std::endl; + + return oStr.str(); + } + + // ////////////////////////////////////////////////////////////////////// + std::string UTF8Handler::displaySTLWString (const std::wstring& iString) { + std::ostringstream oStr; + + size_t idx = 0; + for (std::wstring::const_iterator itChar = iString.begin(); + itChar != iString.end(); ++itChar, ++idx) { + if (idx != 0) { + oStr << "; "; + } + const wchar_t lChar = *itChar; + oStr << "[" << idx << "]: " << std::hex << lChar; + } + oStr << std::endl; + + return oStr.str(); + } + +} + Added: trunk/opentrep/opentrep/basic/UTF8Handler.hpp =================================================================== --- trunk/opentrep/opentrep/basic/UTF8Handler.hpp (rev 0) +++ trunk/opentrep/opentrep/basic/UTF8Handler.hpp 2009-08-16 14:08:09 UTC (rev 179) @@ -0,0 +1,53 @@ +#ifndef __OPENTREP_BAS_UTF8HANDLER_HPP +#define __OPENTREP_BAS_UTF8HANDLER_HPP + +// ////////////////////////////////////////////////////////////////////// +// Import section +// ////////////////////////////////////////////////////////////////////// +// STL +#include <string> + +namespace OPENTREP { + + /** Utility class for basic handling of UTF-8 encoded strings. + <br>Most of the methods have taken their inspiration from Jeff + Bezanson's work in the Wikix project + (see http://meta.wikimedia.org/wiki/Wikix for further details), + and have been "C++-ified". 
*/ + class UTF8Handler { + public: + /* Conversion from a UTF-8-encoded "simple character" (though + potentially multi-byte) STL string into a wide character STL + string. + <br>Note that as there are no checks of appropriate encoding, it + only works for valid UTF-8, i.e. no 5- or 6-byte sequences. + <br>Note that the "simple characters", within an STL string, may be + multi-byte (e.g., if they are UTF-8-encoded). + @param std::string The "simple character" (though potentially + multi-byte) STL string. + @return std::wstring The wide character STL string. + */ + static std::wstring toWideString (const std::string& iSrc); + + /* Conversion from a wide character STL string into a UTF-8-encoded + "simple character" (though potentially multi-byte) STL string. + <br>Note that as there are no checks of appropriate encoding, it + only works for valid UTF-8, i.e. no 5- or 6-byte sequences. + <br>Note that the "simple characters", within an STL string, may be + multi-byte (e.g., if they are UTF-8-encoded). + @param std::wstring The wide character STL string. + @return std::string The "simple character" (though potentially + multi-byte) STL string. + */ + static std::string toSimpleString (const std::wstring& iStr); + + /** Display the sequence of characters for the simple C-string. */ + static std::string displayCharString (const char* iString); + + /** Display the sequence of characters (one by one) for the given + STL wide character string. 
*/ + static std::string displaySTLWString (const std::wstring& iString); + }; + +} +#endif // __OPENTREP_BAS_UTF8HANDLER_HPP Modified: trunk/opentrep/opentrep/basic/sources.mk =================================================================== --- trunk/opentrep/opentrep/basic/sources.mk 2009-08-15 18:24:37 UTC (rev 178) +++ trunk/opentrep/opentrep/basic/sources.mk 2009-08-16 14:08:09 UTC (rev 179) @@ -1,5 +1,7 @@ bas_h_sources = $(top_srcdir)/opentrep/basic/BasConst_General.hpp \ $(top_srcdir)/opentrep/basic/BasConst_OPENTREP_Service.hpp \ - $(top_srcdir)/opentrep/basic/BasChronometer.hpp + $(top_srcdir)/opentrep/basic/BasChronometer.hpp \ + $(top_srcdir)/opentrep/basic/UTF8Handler.hpp bas_cc_sources = $(top_srcdir)/opentrep/basic/BasConst.cpp \ - $(top_srcdir)/opentrep/basic/BasChronometer.cpp + $(top_srcdir)/opentrep/basic/BasChronometer.cpp \ + $(top_srcdir)/opentrep/basic/UTF8Handler.cpp Modified: trunk/opentrep/test/i18n/icu/Makefile.am =================================================================== --- trunk/opentrep/test/i18n/icu/Makefile.am 2009-08-15 18:24:37 UTC (rev 178) +++ trunk/opentrep/test/i18n/icu/Makefile.am 2009-08-16 14:08:09 UTC (rev 179) @@ -3,7 +3,7 @@ MAINTAINERCLEANFILES = Makefile.in -check_PROGRAMS = icufmt icuustring icucharsetdetector icuconv +check_PROGRAMS = icufmt icuustring icucharsetdetector icuconv icuutext icufmt_SOURCES = icufmt.cpp icufmt_CXXFLAGS = $(ICU_CFLAGS) @@ -21,4 +21,8 @@ icuconv_CXXFLAGS = $(ICU_CFLAGS) icuconv_LDFLAGS = $(ICU_LIBS) $(ICU_IO_LIB) +icuutext_SOURCES = icuutext.cpp +icuutext_CXXFLAGS = $(ICU_CFLAGS) +icuutext_LDFLAGS = $(ICU_LIBS) $(ICU_IO_LIB) + EXTRA_DIST = Modified: trunk/opentrep/test/i18n/stdlocru.cpp =================================================================== --- trunk/opentrep/test/i18n/stdlocru.cpp 2009-08-15 18:24:37 UTC (rev 178) +++ trunk/opentrep/test/i18n/stdlocru.cpp 2009-08-16 14:08:09 UTC (rev 179) @@ -59,7 +59,7 @@ std::cout << "de: " << mucDEWCharString << std::endl; 
std::cout << "ru: " << mucRUWCharString << std::endl; - // STL ctypes on char* + // STL ctypes on wchar_t std::use_facet<std::ctype<wchar_t> > (langLocale).toupper(mucDEWCharString, mucDEWCharString+7); std::use_facet<std::ctype<wchar_t> > (langLocale).toupper(mucRUWCharString, Modified: trunk/opentrep/test/i18n/utf8/Makefile.am =================================================================== --- trunk/opentrep/test/i18n/utf8/Makefile.am 2009-08-15 18:24:37 UTC (rev 178) +++ trunk/opentrep/test/i18n/utf8/Makefile.am 2009-08-16 14:08:09 UTC (rev 179) @@ -11,6 +11,8 @@ utf8string_SOURCES = utf8string.cpp utf8string_CXXFLAGS = -utf8string_LDFLAGS = +utf8string_LDFLAGS = \ + $(BOOST_LIBS) $(SOCI_LIBS) $(CPPUNIT_LIBS) \ + $(top_builddir)/@PACKAGE@/lib@PACKAGE@.la EXTRA_DIST = Modified: trunk/opentrep/test/i18n/utf8/utf8.cpp =================================================================== --- trunk/opentrep/test/i18n/utf8/utf8.cpp 2009-08-15 18:24:37 UTC (rev 178) +++ trunk/opentrep/test/i18n/utf8/utf8.cpp 2009-08-16 14:08:09 UTC (rev 179) @@ -55,10 +55,10 @@ for all the characters. if sz = srcsz+1 (i.e. 4*srcsz+4 bytes), there will always be enough space. */ -int u8_toucs(u_int32_t *dest, int sz, char *src, int srcsz) +int u8_toucs(u_int32_t *dest, int sz, const char *src, int srcsz) { u_int32_t ch; - char *src_end = src + srcsz; + const char* src_end = src + srcsz; int nb; int i=0; @@ -100,7 +100,7 @@ the NUL as well. the destination string will never be bigger than the source string. 
*/ -int u8_toutf8(char *dest, int sz, u_int32_t *src, int srcsz) +int u8_toutf8(char *dest, int sz, const u_int32_t *src, int srcsz) { u_int32_t ch; int i = 0; Modified: trunk/opentrep/test/i18n/utf8/utf8.hpp =================================================================== --- trunk/opentrep/test/i18n/utf8/utf8.hpp 2009-08-15 18:24:37 UTC (rev 178) +++ trunk/opentrep/test/i18n/utf8/utf8.hpp 2009-08-16 14:08:09 UTC (rev 179) @@ -5,10 +5,10 @@ #define isutf(c) (((c)&0xC0)!=0x80) /* convert UTF-8 data to wide character */ -int u8_toucs(u_int32_t *dest, int sz, char *src, int srcsz); +int u8_toucs(u_int32_t *dest, int sz, const char *src, int srcsz); /* the opposite conversion */ -int u8_toutf8(char *dest, int sz, u_int32_t *src, int srcsz); +int u8_toutf8(char *dest, int sz, const u_int32_t *src, int srcsz); /* single character to UTF-8 */ int u8_wc_toutf8(char *dest, u_int32_t ch); Modified: trunk/opentrep/test/i18n/utf8/utf8string.cpp =================================================================== --- trunk/opentrep/test/i18n/utf8/utf8string.cpp 2009-08-15 18:24:37 UTC (rev 178) +++ trunk/opentrep/test/i18n/utf8/utf8string.cpp 2009-08-16 14:08:09 UTC (rev 179) @@ -1,113 +1,43 @@ // STL #include <iostream> -#include <locale> -#include <string> -#include <cstring> +// OpenTrep +#include <opentrep/basic/UTF8Handler.hpp> -// /////////////////////////////////////////////// -void displayCharString (const char* iString) { - // Store current formatting flags of std::cout - std::ios::fmtflags oldFlags = std::cout.flags(); - - const size_t lLength = std::strlen (iString); - for (size_t idx = 0; idx != lLength; ++idx) { - if (idx != 0) { - std::cout << "; "; - } - const unsigned short lChar = iString[idx]; - // const wchar_t lChar = iString[idx]; - std::cout << "[" << idx << "]: " << std::hex << lChar; - } - std::cout << std::endl; - - // Reset formatting flags of std::cout - std::cout.flags (oldFlags); -} - -// /////////////////////////////////////////////// -void 
displayWCharString (const wchar_t* iString, const size_t iLength) { - // Store current formatting flags of std::cout - std::ios::fmtflags oldFlags = std::cout.flags(); - - for (size_t idx = 0; idx != iLength; ++idx) { - if (idx != 0) { - std::cout << "; "; - } - const wchar_t lChar = iString[idx]; - std::cout << "[" << idx << "]: " << std::hex << lChar; - } - std::cout << std::endl; - - // Reset formatting flags of std::cout - std::cout.flags (oldFlags); -} - -// /////////////////////////////////////////////// -void displaySTLString (const std::string& iString) { - // Store current formatting flags of std::cout - std::ios::fmtflags oldFlags = std::cout.flags(); - - unsigned short idx = 0; - for (std::string::const_iterator itChar = iString.begin(); - itChar != iString.end(); ++itChar, ++idx) { - if (idx != 0) { - std::cout << "; "; - } - const unsigned short lChar = *itChar; - // const char lChar = *itChar; - // const wchar_t lChar = *itChar; - std::cout << "[" << idx << "]: " << std::hex << lChar; - } - std::cout << std::endl; - - // Reset formatting flags of std::cout - std::cout.flags (oldFlags); -} - // //////////////////////// M A I N ///////////////////////// int main (int argc, char* argv[]) { - // Single char strings - const char mucDECharString[] = ("München"); - const char mucRUCharString[] = ("Мюнхен"); + // STL strings + std::string mucDESTLString ("München"); + std::string mucRUSTLString ("Мюнхен"); - std::cout << "--------" << std::endl << "Single char strings" << std::endl; - std::cout << "Deutsch ('" << mucDECharString << "'): " << std::endl; - displayCharString (mucDECharString); + std::cout << "--------" << std::endl + << "STL strings without processing" << std::endl; + std::cout << "Deutsch: '" << mucDESTLString << "'" << std::endl; + std::cout << "Russian: '" << mucRUSTLString << "'" << std::endl; - std::cout << "Russian ('" << mucRUCharString << "'): " << std::endl; - displayCharString (mucRUCharString); - - // Wide char strings - wchar_t 
mucDEWCharString[7]; - wchar_t mucRUWCharString[6]; - - // Conversion from char* to wchar_t thanks to the STL locale - std::locale lLocale; - std::use_facet<std::ctype<wchar_t> > (lLocale).widen (mucDECharString, - mucDECharString+7, - mucDEWCharString); - std::use_facet<std::ctype<wchar_t> > (lLocale).widen (mucRUCharString, - mucRUCharString+6, - mucRUWCharString); + // + std::wstring mucDESTLWString = + OPENTREP::UTF8Handler::toWideString (mucDESTLString); + std::wstring mucRUSTLWString = + OPENTREP::UTF8Handler::toWideString (mucRUSTLString); - std::cout << "--------" << std::endl << "Wide char strings" << std::endl; - std::cout << "Deutsch ('" << mucDEWCharString << "'): " << std::endl; - displayWCharString (mucDEWCharString, 7); + std::cout << "--------" << std::endl + << "UTF-8 decoded wide char strings" << std::endl; + std::cout << "Deutsch: " << std::endl; + // std::cout << "Deutsch: '" << mucDESTLWString << "'" << std::endl; + std::cout << OPENTREP::UTF8Handler::displaySTLWString (mucDESTLWString); - std::cout << "Russian ('" << mucRUWCharString << "'): " << std::endl; - displayWCharString (mucRUWCharString, 6); + std::cout << "Russian: " << std::endl; + // std::cout << "Russian: '" << mucRUSTLWString << "'" << std::endl; + std::cout << OPENTREP::UTF8Handler::displaySTLWString (mucRUSTLWString); - // STL strings - std::string mucDESTLString ("München"); - std::string mucRUSTLString ("Мюнхен"); - - std::cout << "--------" << std::endl << "STL strings" << std::endl; - std::cout << "Deutsch ('" << mucDESTLString << "'): " << std::endl; - displaySTLString (mucDESTLString); + mucDESTLString = OPENTREP::UTF8Handler::toSimpleString (mucDESTLWString); + mucRUSTLString = OPENTREP::UTF8Handler::toSimpleString (mucRUSTLWString); - std::cout << "Russian ('" << mucRUSTLString << "'): " << std::endl; - displaySTLString (mucRUSTLString); + std::cout << "--------" << std::endl + << "STL strings after processing" << std::endl; + std::cout << "Deutsch: '" << mucDESTLString 
<< "'" << std::endl; + std::cout << "Russian: '" << mucRUSTLString << "'" << std::endl; return 0; }
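Editorial note: the encoding direction used by UTF8Handler::toSimpleString (and by u8_toutf8 in the test sources) picks the sequence length from the code point's magnitude and then emits a lead byte followed by 6-bit continuation bytes. The sketch below shows that branching on its own. The name encodeUtf8 and the use of std::vector<std::uint32_t> as input are assumptions made for illustration and are not part of the commit. Unlike the committed toSimpleString, the sketch does not stream a terminating '\0' into the result, since std::string tracks its own length.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Encode code points as UTF-8. Values of 0x110000 and above are skipped
// silently, mirroring the branch structure in the commit. Surrogate code
// points are not rejected: like the original, this is not a validator.
std::string encodeUtf8 (const std::vector<std::uint32_t>& in) {
  std::string out;
  for (std::uint32_t ch : in) {
    if (ch < 0x80) {
      // 1 byte: 0xxxxxxx
      out += static_cast<char> (ch);
    } else if (ch < 0x800) {
      // 2 bytes: 110xxxxx 10xxxxxx
      out += static_cast<char> ((ch >> 6) | 0xC0);
      out += static_cast<char> ((ch & 0x3F) | 0x80);
    } else if (ch < 0x10000) {
      // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
      out += static_cast<char> ((ch >> 12) | 0xE0);
      out += static_cast<char> (((ch >> 6) & 0x3F) | 0x80);
      out += static_cast<char> ((ch & 0x3F) | 0x80);
    } else if (ch < 0x110000) {
      // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
      out += static_cast<char> ((ch >> 18) | 0xF0);
      out += static_cast<char> (((ch >> 12) & 0x3F) | 0x80);
      out += static_cast<char> (((ch >> 6) & 0x3F) | 0x80);
      out += static_cast<char> ((ch & 0x3F) | 0x80);
    }
  }
  return out;
}
```

With this sketch, the code point 0x41C (Cyrillic "М") falls in the second branch and yields the two bytes 0xD0 0x9C, which is why the Russian string "Мюнхен" occupies 12 bytes for 6 characters.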