TML - Text Mining Library for LSA
Description
TML is a text mining library focused on LSA (Latent Semantic Analysis) and tightly integrated with Apache Lucene. It aims to be easy to use for researchers and developers who want to add text mining capabilities to their applications.
Features
- Document indexing and selection using Apache Lucene
- Fast VSM (vector space model) generation, producing a term-document matrix with a choice of several local and global weights
- Dimensionality reduction using SVD or NMF, for LSA and related methods
- Metadata annotators (Penn Treebank-style grammar parsing)
- Operations: Document distances, topic clustering, keyword extraction, and many more!
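To make the feature list more concrete, here is a minimal sketch of a weighted term-document matrix and a document distance. It is not TML's API: the function names `build_term_doc_matrix` and `cosine_distance` are hypothetical, and the weighting shown (log term frequency as the local weight, inverse document frequency as the global weight) is just one common choice assumed for illustration. LSA would additionally apply a truncated SVD to this matrix, which is omitted here.

```python
import math
from collections import Counter


def build_term_doc_matrix(docs):
    """Return (vocab, matrix); matrix[i][j] is the weight of term i in document j.

    Assumed weighting (illustrative only):
      local weight  = log(1 + term frequency in the document)
      global weight = log(number of documents / document frequency of the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    counts = [Counter(toks) for toks in tokenized]
    n_docs = len(docs)

    matrix = []
    for term in vocab:
        doc_freq = sum(1 for c in counts if term in c)
        global_weight = math.log(n_docs / doc_freq)
        # Counter returns 0 for absent terms, so log(1 + 0) = 0 there.
        matrix.append([math.log(1 + c[term]) * global_weight for c in counts])
    return vocab, matrix


def cosine_distance(matrix, j, k):
    """Cosine distance (1 - cosine similarity) between document columns j and k."""
    col_j = [row[j] for row in matrix]
    col_k = [row[k] for row in matrix]
    dot = sum(a * b for a, b in zip(col_j, col_k))
    norm = math.sqrt(sum(a * a for a in col_j)) * math.sqrt(sum(b * b for b in col_k))
    if norm == 0:
        return 1.0  # treat an all-zero document vector as maximally distant
    return 1.0 - dot / norm


if __name__ == "__main__":
    docs = [
        "latent semantic analysis of text",
        "semantic analysis of documents",
        "fast matrix factorization",
    ]
    vocab, matrix = build_term_doc_matrix(docs)
    print(cosine_distance(matrix, 0, 1))  # documents sharing terms: below 1
    print(cosine_distance(matrix, 0, 2))  # documents sharing no terms
```

Documents that share no terms have orthogonal columns in this raw matrix. That limitation is what the SVD step in LSA is meant to soften, by comparing documents in a lower-dimensional space instead.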
User Reviews
- Fast and simple.
- Works perfectly.
- It seems to be good, but there are some errors that don't let the program load the library correctly (the Abstract Annotator constructor receives parameters, but PennTreeAnnotator doesn't).
- Very good library for doing text mining.
- Great.
- Nice and simple.