Lupy is a full-text indexer for Python. It is a port of Jakarta Lucene 1.2 to Python. Specifically, it reads and writes indexes in the Lucene binary format. Like Lucene, it is sophisticated and scalable.
Release 0.2.1 - 11 May 2004 Minor bugfixes, added tests.
Release 0.2.0 - 20 Feb 2004 This release brings a major reorganization of the code: classes are grouped into larger modules instead of the original Java-style layout, and several classes have been rewritten to be more Pythonic, removing extraneous data structures and so forth. Overall, the code has been reduced by 20%. The public interface, indexer.py, has not changed; other classes have not changed significantly, apart from being moved to new modules. This release also changes the interface for analyzers: they are now iterable objects that take one argument, the string to be tokenized, and produce tokens, replacing the analysis classes ported from Lucene. This improves performance while simplifying the code. If no analyzer is specified, lupy.index.documentwriter.standardTokenizer is used. The regex used by that generator is re.compile("\\w+", re.U), and the tokens are lowercased before being stored. Along with this improvement in tokenization comes better Unicode support; all text is now handled as Unicode strings. There is a simple test for the indexing and retrieval of documents containing non-ASCII data.
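The analyzer interface described in the 0.2.0 notes (take one string, produce tokens) can be sketched roughly as follows. This is a minimal illustration written for Python 3, not Lupy's actual source: the name simple_tokenizer is hypothetical, and only the regex and the lowercasing step are taken from the release notes.

```python
import re

# Pattern quoted in the release notes. In Python 3, str patterns are
# Unicode-aware by default, so re.U is kept only to mirror the notes.
WORD_RE = re.compile(r"\w+", re.U)


def simple_tokenizer(text):
    """Hypothetical stand-in for an analyzer: yield lowercased word tokens."""
    for match in WORD_RE.finditer(text):
        yield match.group().lower()


# The generator takes one argument, the string to tokenize.
tokens = list(simple_tokenizer("Crème Brûlée, 2 servings"))
print(tokens)
```

Writing the analyzer as a generator means tokens are produced lazily, one at a time, so a caller can consume them in a plain for loop without building an intermediate list.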