Text2vec is a Python toolkit for turning text into vector representations. It supports words, sentences, and paragraphs, making it useful for semantic search, similarity matching, clustering, retrieval, and ranking workflows. The project implements models and methods such as Word2Vec, RankBM25, BERT, Sentence-BERT, and CoSENT. It also compares model behavior on semantic matching and similarity calculation tasks. Developers can use it as an applied NLP library for embedding generation or as a study resource for text representation methods. It is especially useful for Chinese and multilingual NLP projects that need practical sentence embeddings and similarity scoring.
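
To make the similarity-scoring idea concrete, here is a minimal sketch of ranking candidate sentences against a query by cosine similarity. It does not use the Text2vec API: `toy_embed` is a hypothetical stand-in that builds a simple bag-of-words count vector, and `cosine_similarity` and `rank_by_similarity` are illustrative names. A real workflow would presumably replace `toy_embed` with dense vectors from one of the models named above.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a model encoder: a sparse bag-of-words vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return (score, candidate) pairs sorted from most to least similar."""
    query_vec = toy_embed(query)
    scored = [(cosine_similarity(query_vec, toy_embed(c)), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    results = rank_by_similarity(
        "change my bank card",
        ["how do I change my bank card", "the weather is nice today"],
    )
    for score, sentence in results:
        print(f"{score:.3f}  {sentence}")
```

Word-count vectors only capture exact token overlap, so this sketch cannot recognize paraphrases. That limitation is the usual motivation for the learned sentence embeddings the project description refers to.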

Features

  • Word, sentence, and paragraph embeddings
  • Text similarity calculation
  • Word2Vec and RankBM25 support
  • BERT and Sentence-BERT support
  • CoSENT model support
  • Semantic matching evaluation resources
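
As a rough illustration of the lexical ranking that the RankBM25 feature refers to, the sketch below scores tokenized documents against a query with the Okapi BM25 formula. It is a from-scratch example under stated assumptions, not the project's implementation: the function name `bm25_scores`, the defaults `k1=1.5` and `b=0.75`, and the non-negative IDF variant `log(1 + (N - df + 0.5) / (df + 0.5))` are all choices made for this example.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score each tokenized document in corpus_tokens against query_tokens.

    Illustrative Okapi BM25 sketch; parameter defaults are assumptions.
    """
    n_docs = len(corpus_tokens)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in corpus_tokens:
        doc_freq.update(set(doc))

    scores = []
    for doc in corpus_tokens:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_tokens:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue  # terms absent from the document contribute nothing
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


if __name__ == "__main__":
    corpus = [["cat", "sat", "mat"], ["dog", "barked"], ["cat", "cat", "purred"]]
    print(bm25_scores(["cat"], corpus))
```

BM25 matches on exact tokens, so it is typically used as a fast lexical baseline alongside embedding-based similarity rather than as a replacement for it.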

Categories

Text Editors

License

Apache License V2.0

Additional Project Details

Programming Language

Python

Related Categories

Python Text Editors

Registered

17 hours ago