Semantra is an open-source semantic search tool for exploring large collections of documents by meaning rather than by keyword matching. It analyzes text and PDF documents stored locally and computes embeddings for them, so that a query retrieves passages by conceptual similarity. It is intended primarily for people who need to extract insights from large document collections, including researchers, journalists, students, and historians. Semantra runs from the command line and automatically launches a local web interface where users can search interactively and examine the document passages related to a query. Because matching relies on semantic embeddings and contextual analysis, the tool can identify relevant passages even when the query is worded differently from the source documents.
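To make the idea of retrieval by conceptual similarity concrete, the following is a minimal sketch of ranking passages against a query by cosine similarity between embedding vectors. It is not Semantra's code: the function names, the three-dimensional vectors, and the sample passages are all hypothetical and hand-written for illustration, and a real embedding model would produce much larger vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return (passage, score) pairs sorted from most to least similar to the query."""
    scored = [
        (text, cosine_similarity(query_vec, vec))
        for text, vec in passage_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings, invented for this example only.
PASSAGES = {
    "The senator denied accepting the payment.": [0.9, 0.1, 0.2],
    "Rainfall totals were above average this spring.": [0.1, 0.9, 0.1],
    "Officials rejected claims of bribery.": [0.8, 0.2, 0.3],
}

# Hypothetical embedding for the query "corruption allegations".
QUERY = [0.85, 0.15, 0.25]

results = rank_passages(QUERY, PASSAGES)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

In this toy data the query shares no words with any passage, yet the two passages about alleged wrongdoing score far higher than the one about rainfall, because their vectors point in nearly the same direction as the query vector.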
Features
- Semantic search that retrieves results based on conceptual similarity rather than keyword matching
- Command-line tool that analyzes local text and PDF files
- Automatic generation of document embeddings for semantic retrieval
- Local web interface for interactive document exploration
- Caching system that speeds up repeated searches on previously processed documents
- Support for tagging, filtering, and refining semantic queries
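
One common way to build a cache like the one listed above is to key stored embeddings by a hash of the text they were computed from, so unchanged content is never embedded twice. The sketch below illustrates that general technique only; how Semantra actually stores its cache is not described here, and `content_key`, `cached_embedding`, and `fake_embed` are hypothetical names introduced for this example.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def content_key(text):
    """Stable cache key derived from the text itself."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def cached_embedding(text, cache_dir, embed_fn):
    """Return the embedding for text, calling embed_fn only on a cache miss."""
    cache_file = Path(cache_dir) / f"{content_key(text)}.json"
    if cache_file.exists():
        # Cache hit: reuse the vector written by an earlier call.
        return json.loads(cache_file.read_text(encoding="utf-8"))
    vector = embed_fn(text)
    cache_file.write_text(json.dumps(vector), encoding="utf-8")
    return vector


calls = []


def fake_embed(text):
    """Stand-in for a real (slow) embedding model; records each invocation."""
    calls.append(text)
    return [float(len(text)), float(text.count(" "))]


with tempfile.TemporaryDirectory() as tmp:
    first = cached_embedding("a short passage", tmp, fake_embed)
    second = cached_embedding("a short passage", tmp, fake_embed)

print(first == second, len(calls))
```

The second call reads the vector from disk instead of invoking the stand-in model. Hashing the content rather than the file name means an edited document gets a new key and is re-embedded, while a renamed but unchanged one is not.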