Download Latest Version montezuma-2016-10-05.zip (4.1 MB)
Email in envelope

Get an email when there's a new version of montezuma

Home / docs
Name Modified Size InfoDownloads / Week
Parent folder
riao91.pdf 2016-01-08 174.4 kB
DougCutting-Haifa05.pdf 2016-01-08 142.1 kB
CharacterEntities.html 2015-11-18 58.0 kB
MontezumaQueryParserSyntax.html 2015-11-18 20.6 kB
Totals: 4 Items   395.1 kB 0
Copyright 2006 John Wiseman <jjwiseman@yahoo.com> 7/13/2006
Copyright 2015 Roy Anderson <reanz1959@gmail.com> 2015-12-12

Updated 2016-01-08

Welcome to Montezuma! A Common Lisp, open source system for indexing and searching large document collections.

Montezuma provides a comprehensive solution for full-text index and search systems that works in concert with other code systems. It also serves as a code base worthy of study including CLOS and a purpose built query language. Montezuma is capable of large scale applications for corpus management. Unless you have an interest in a Common Lisp full text index library, Montezuma is probably not for you.

Introduction

Montezuma is a text index and search engine library for Common Lisp based on the Ruby Ferret library which is itself based on the Lucene library (http://lucene.apache.org/) for Java.

For a Lisp developer, Montezuma provides a comprehensive solution for full-text index and search systems that works in concert with other code systems. It also serves as a code base worthy of study including CLOS and a purpose built query language. Montezuma is capable large scale applications to corpus management, for example, the OANC (Open American National Corpus).

What's New?

Montezuma 2.0.1 introduces improved handling of invalid queries: instead of substituting a simple analysis of query terms, Montezuma now raises an error condition and does not attempt to interpret queries that cannot be parsed. 

Montezuma now includes typed queries. More specifically, when a field definition in an index identifies a field type, the type guides the query parser's interpretation of targeted values. So far, supported field consist of: date, int, or float.

Montezuma 1.2.0 brings a Listener based shell (Read Eval Print Loop) for exploring Montezuma including the query parser and search features. It also introduces Oropendola: a [LispWorks](http://www.lispworks.com/) GUI alternative to the Listener shell. To start the shell, copy the Montezuma source and dependencies
and enter `(shell)`. To start the Oropendola GUI, you may need to install LispWorks then enter `(oropendola)` in the Listener.

Dependencies

Montezuma requires the following systems, but they are now incorporated into the Montezuma release in the dependencies directory:

[CL-PPCRE](http://weitz.de/files/cl-ppcre.tar.gz)
[ALEXANDRIA](http://common-lisp.net/~loliveira/tarballs/inofficial/alexandria-2008-07-29.tar.gz)
[TRIVIAL-FEATURES](https://github.com/trivial-features/trivial-features)
[BORDEAUX-THREADS](https://gitlab.common-lisp.net/bordeaux-threads/bordeaux-threads)
[CL-FAD](http://weitz.de/files/cl-fad.tar.gz)
[BABEL](https://github.com/cl-babel/babel)
[TRIVIAL-GRAY-STREAMS](https://github.com/trivial-gray-streams/trivial-gray-streams)
[LOCAL-TIME](https://common-lisp.net/project/local-time/")
 
Installation Guide

After downloading the Montezuma release file which includes the packages it depends upon (available from ([Montezuma SourceForge files](https://sourceforge.net/projects/montezuma/files/)). Edit the montezuma.lisp file and change the root directory so that you can load the montezuma. You will also need to set the index path in the shell/indexes file. It's still a little cumbersome but it should work relatively well you have configured the directories. When these changes have been made you may still have to compile and load the montezuma.lisp file when you want to start up Montezuma.

Montezuma has been tested with Lispworks 6.1.1 (Windows 10), SBCL 0.9.12 (OS X/PPC), SBCL 0.9.13 (Linux/x86) OpenMCL 1.0 (OS X/PPC) and ACL 8.0 (OS X/PPC). It has been extended in 2015 using Lispworks 6.1.1.

The only implementation-dependent code in Montezuma is in src/util/mop.lisp. To add support for another implementation may be as simple as adding one line to the definition of the CLASS-SLOTS function and one to SLOT-DEFINITION-NAME.

Installation and Loading

You can use ASDF-INSTALL to install Montezuma:

~~~~
  (asdf-install:install '#:montezuma)
~~~~

And ASDF to load it:

~~~~
  (asdf:oos 'asdf:load-op '#:montezuma)
~~~~

Testing

Once Montezuma has been loaded, you can run the unit tests if you like:

~~~~
  (asdf:oos 'asdf:test-op '#:montezuma)
~~~~

Use

See the TUTORIAL.TXT file for more information on how to use Montezuma.

The Montezuma project page at http://projects.heavymeta.org/montezuma/
should have the latest information about Montezuma.

Acknowledgements

Thanks to Dave Balmain, Gary King, Peter Seibel (for his META-inspired parser), Xach Beane (for the heap implementation from his ([TIMER](http://www.xach.com/lisp/timer/doc.html)) library[1]) and Franz. Inc. (for their ([Porter stemmer](http://www.lispwire.com/entry-text-porter-word-stemmer-des)).

Failures and Successes Adding Montezuma Documents

[REA] While adding documents to Montezuma, every dozen or so additions would raise a Delete File or Rename File exception. I retried adding documents (for Rename exceptions) or restart the load from the last document added. This problem disappeared when I moved the index directory from a DropBox networked drive to a local drive. Not only did the exceptions disappear, but the load times improved from about 5 hours to 5 minutes. I could also remove the checkpoints and repeated index optimization without exceptions.

Onward

For a complete example of using Montezuma to index and retrieve real information, see the file `tests/corpora/pastes-1000/paste-search.lisp`.

Source: README.txt, updated 2016-01-08