Gensim is a Python library for topic modeling, document indexing, and similarity retrieval with large corpora. The target audience is the natural language processing (NLP) and information retrieval (IR) community.
Features
- All algorithms are memory-independent with respect to corpus size: input larger than RAM can be streamed and processed out-of-core
- Easy to plug in your own input corpus/datastream (trivial streaming API)
- Easy to extend with other Vector Space algorithms (trivial transformation API)
- Efficient multicore implementations of popular algorithms
- Can run Latent Semantic Analysis and Latent Dirichlet Allocation on a cluster of computers
- Documentation available
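To make the streaming idea in the feature list concrete, here is a minimal sketch of a corpus that is read one document at a time from disk rather than loaded into RAM. It uses only the Python standard library and is not gensim's own code: the names `StreamedCorpus`, `build_vocab`, and `demo`, the one-document-per-line file layout, and the whitespace tokenizer are all assumptions made for illustration.

```python
from collections import Counter
import os
import tempfile


class StreamedCorpus:
    """Hypothetical iterable yielding one bag-of-words vector per line of a file."""

    def __init__(self, path, vocab):
        self.path = path
        self.vocab = vocab  # assumed mapping: token -> integer id

    def __iter__(self):
        # The file is re-opened on every pass and read line by line,
        # so only one document is held in memory at a time.
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                counts = Counter(
                    self.vocab[token]
                    for token in line.lower().split()
                    if token in self.vocab
                )
                # Sparse vector: sorted list of (token_id, count) pairs.
                yield sorted(counts.items())


def build_vocab(path):
    """First streaming pass: assign an integer id to each distinct token."""
    vocab = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            for token in line.lower().split():
                vocab.setdefault(token, len(vocab))
    return vocab


def demo():
    """Write a tiny three-document file, then stream it back as sparse vectors."""
    lines = [
        "human computer interaction",
        "graph of trees",
        "graph minors trees graph",
    ]
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write("\n".join(lines))
        path = tmp.name
    try:
        vocab = build_vocab(path)
        corpus = StreamedCorpus(path, vocab)
        return vocab, [doc for doc in corpus]
    finally:
        os.remove(path)
```

Gensim's documentation describes a corpus as any iterable of sparse vectors of this shape, that is, lists of `(id, value)` pairs, which is why an object like this can in principle stand in for an in-memory list. The demo collects the vectors into a list only to make them easy to inspect; a real out-of-core pipeline would consume the iterator directly.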
Categories
Machine Learning
License
GNU Library or Lesser General Public License version 3.0 (LGPLv3)