Text2vec is a Python toolkit for turning text into vector representations. It supports words, sentences, and paragraphs, making it useful for semantic search, similarity matching, clustering, retrieval, and ranking workflows. The project implements models and methods such as Word2Vec, RankBM25, BERT, Sentence-BERT, and CoSENT. It also compares model behavior on semantic matching and similarity calculation tasks. Developers can use it as an applied NLP library for embedding generation or as a study resource for text representation methods. It is especially useful for Chinese and multilingual NLP projects that need practical sentence embeddings and similarity scoring.
Features
- Word, sentence, and paragraph embeddings
- Text similarity calculation
- Word2Vec and RankBM25 support
- BERT and Sentence-BERT support
- CoSENT model support
- Semantic matching evaluation resources
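The similarity calculation listed above usually comes down to comparing embedding vectors, most commonly with cosine similarity. The following is a minimal sketch of that step in plain Python. It does not use Text2vec's own API. The function names `cosine_similarity` and `rank_by_similarity` and the three-dimensional toy vectors are assumptions made for illustration; real sentence embeddings would come from a model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat its similarity as 0.0.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, corpus_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(corpus_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings standing in for model output.
query = [1.0, 0.0, 1.0]
corpus = [
    [1.0, 0.0, 1.0],  # same direction as the query
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 1.0, 1.0],  # partial overlap with the query
]

ranking = rank_by_similarity(query, corpus)
for index, score in ranking:
    print(index, round(score, 4))
```

In a semantic search or ranking workflow, the same ranking step would run over embeddings generated for the query and for each candidate text.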
Categories
Text Editors
License
Apache License V2.0