Page 2 | Best Open Source Python Semantic Search Tools

Use Vim as IDE

use vim as IDE

Use Vim as IDE is a comprehensive configuration repository (by YangYangWithGnu) that shows how to turn Vim into a full-fledged Integrated Development Environment (IDE). The project is not a single plugin but a curated set of plugins, configuration tips, and workflow suggestions that enable syntax highlighting, smart code completion, project navigation, semantic search, file switching, build integration, undo history, templating, and more. It is geared mainly toward C/C++ development, though many of its ideas apply more broadly. The documentation is long and detailed, walking users from the fundamentals of Vim configuration (.vimrc, plugin management) through higher-level capabilities such as semantic navigation and project toolchain integration. The philosophy: Vim already offers “what you need when you need it; what you want when you want it,” and this repo shows how to tap that potential.

Downloads: 0 This Week

Last Update: 2025-10-14

Vedana

Open source multi-agent RAG over a knowledge graph

Vedana is an open-source multi-agent RAG system built around a typed knowledge graph. It is designed for questions that require structure, completeness, and traceability instead of simple text similarity. The system lets agents navigate data step by step through Cypher queries, vector search, document lookup, and source verification. Its architecture combines a knowledge graph, pgvector-based embeddings, incremental ETL, and a backoffice interface for chat, metrics, prompt tuning, and data loading. It also includes JIMS, a framework for persistent conversational agents with typed events and pluggable pipelines. Overall, Vedana is useful for teams that need reliable answers from real data, especially when relationships, counts, rules, and source-backed reasoning matter.

Downloads: 0 This Week

Last Update: 2026-06-26

finetuner

Task-oriented finetuning for better embeddings on neural search

Fine-tuning is an effective way to improve performance on neural search tasks. However, setting up and performing fine-tuning can be very time-consuming and resource-intensive. Jina AI’s Finetuner makes fine-tuning easier and faster by streamlining the workflow and handling the complexity and infrastructure in the cloud. With Finetuner, you can enhance the performance of pre-trained models and make them production-ready without extensive labeling or expensive hardware. It can create high-quality embeddings for semantic search, visual similarity search, cross-modal text-image search, recommendation systems, clustering, duplicate detection, anomaly detection, and other uses. It aims to deliver considerable improvements in model performance from as little as a few hundred training samples, with fine-tuning finishing in as little as an hour.

Downloads: 0 This Week

Last Update: 2023-08-21

kg-gen

Knowledge Graph Generation from Any Text

kg-gen is an open-source framework developed by the STAIR Lab that automatically generates knowledge graphs from unstructured text using large language models. The system is designed to transform plain text sources such as documents, articles, or conversation transcripts into structured graphs composed of entities and relationships. Instead of relying on traditional rule-based extraction techniques, kg-gen uses language models to identify entities and their relationships, producing higher-quality graph structures from raw text. The framework addresses two common problems in automatic knowledge graph construction, sparsity and duplicated entities, by applying a clustering and entity-resolution process that merges semantically similar nodes. This makes the generated graphs denser, more coherent, and easier to use for downstream tasks such as retrieval-augmented generation, semantic search, and reasoning systems.
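To make the entity-resolution step more concrete, here is a minimal sketch of merging duplicate nodes in a list of extracted triples. It does not use kg-gen or a language model: a simple case- and whitespace-insensitive key stands in for semantic matching, the triples are made up, and `canonical_key`, `resolve_entities`, and `node_count` are hypothetical helper names.

```python
# Made-up triples as an extractor might emit them, with the same
# entity spelled three different ways.
TRIPLES = [
    ("Marie Curie", "discovered", "Polonium"),
    ("marie curie", "won", "Nobel Prize"),
    ("Marie  Curie", "born in", "Warsaw"),
    ("Pierre Curie", "married", "Marie Curie"),
]

def canonical_key(name):
    """Crude stand-in for semantic matching: ignore case and extra spaces."""
    return " ".join(name.casefold().split())

def resolve_entities(triples):
    """Merge spellings that share a canonical key; the first spelling seen wins."""
    representative = {}
    for subject, _, obj in triples:
        for name in (subject, obj):
            representative.setdefault(canonical_key(name), name)
    return [
        (representative[canonical_key(s)], p, representative[canonical_key(o)])
        for s, p, o in triples
    ]

def node_count(triples):
    """Number of distinct entity names appearing as subject or object."""
    return len({name for s, _, o in triples for name in (s, o)})

merged = resolve_entities(TRIPLES)
print(node_count(TRIPLES), "->", node_count(merged))
```

After merging, every fact about the same person hangs off a single node, which is the sense in which the resolved graph is denser than the raw extraction.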

Downloads: 0 This Week

Last Update: 2026-03-09

pgai

A suite of tools to develop RAG, semantic search, and other AI apps

pgai is a suite of PostgreSQL extensions developed by Timescale to empower developers in building AI applications directly within their databases. It integrates tools for vector storage, advanced indexing, and AI model interactions, facilitating the development of applications like semantic search and Retrieval-Augmented Generation (RAG) without leaving the SQL environment.

Downloads: 0 This Week

Last Update: 2025-10-14

rag-search

RAG Search API

rag-search is a lightweight Retrieval-Augmented Generation API service designed to provide structured semantic search and answer generation through a simple FastAPI backend. The project integrates web search, vector embeddings, and reranking logic to retrieve relevant context before passing it to a language model for response generation. It is built to be easily deployable, requiring only environment configuration and dependency installation to run a functional RAG service. The system supports configurable filtering, scoring thresholds, and reranking options, allowing developers to fine-tune retrieval quality. Its architecture is modular, separating handlers, services, and utilities to support customization and extension. Overall, rag-search serves as a practical starter backend for teams building AI search or question-answering applications on their own data.
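The filtering, thresholding, and reranking stages mentioned above can be pictured with a small sketch. This is not rag-search's code or API: the candidate records, the `retrieve` function, its `min_score`, `top_k`, and `rerank` parameters, and the word-overlap reranker are all assumptions chosen for illustration.

```python
def retrieve(candidates, min_score=0.5, top_k=2, rerank=None):
    """Drop low-scoring candidates, optionally rescore them, keep the top_k."""
    kept = [c for c in candidates if c["score"] >= min_score]
    key = rerank if rerank is not None else (lambda c: c["score"])
    return sorted(kept, key=key, reverse=True)[:top_k]

# Made-up retrieval results with first-pass similarity scores.
CANDIDATES = [
    {"text": "FastAPI is a Python web framework.", "score": 0.82},
    {"text": "Bananas are rich in potassium.", "score": 0.12},
    {"text": "FastAPI apps are commonly served with Uvicorn, an ASGI server.", "score": 0.64},
    {"text": "Python was first released in 1991.", "score": 0.55},
]

QUERY = "serve a fastapi app with uvicorn"

def word_overlap(candidate):
    """Toy reranker: count candidate words that also occur in the query."""
    query_words = set(QUERY.lower().split())
    text = candidate["text"].lower().replace(".", "").replace(",", "")
    return sum(1 for word in text.split() if word in query_words)

by_score = retrieve(CANDIDATES)
reranked = retrieve(CANDIDATES, rerank=word_overlap)
print([c["text"] for c in by_score])
print([c["text"] for c in reranked])
```

The threshold removes the off-topic candidate in both runs, while the reranker changes which surviving passage comes first. A real service would typically use a learned reranking model rather than word overlap.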

Downloads: 0 This Week

Last Update: 2026-03-03

txtai

Build AI-powered semantic search applications

txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications. Traditional search systems use keywords to find data. Semantic search applications understand natural language and identify results that have the same meaning, not necessarily the same keywords. Backed by state-of-the-art machine learning models, txtai transforms data into vector representations (also known as embeddings) for search. Innovation is happening at a rapid pace, and models can now understand concepts in documents, audio, images, and more. txtai provides machine-learning pipelines for extractive question answering, zero-shot labeling, transcription, translation, summarization, and text extraction. Its cloud-native architecture scales out with container orchestration systems (e.g. Kubernetes). Applications range from similarity search to complex NLP-driven data extraction that generates structured databases.
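As a rough illustration of the keyword-versus-meaning distinction described above, the sketch below ranks documents by cosine similarity between vectors. It does not use txtai itself: the three-dimensional vectors are hand-assigned stand-ins for model embeddings, and `cosine` and `search` are hypothetical helper names.

```python
import math

# Toy, hand-assigned "embeddings". Assumption: a real system would obtain
# these from a trained model; the numbers here are made up.
DOCS = {
    "How to repair a flat bicycle tire": [0.9, 0.1, 0.0],
    "Stock markets fell sharply today": [0.0, 0.2, 0.9],
    "Best pasta recipes for beginners": [0.1, 0.9, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vector, docs, limit=1):
    """Return the `limit` document texts most similar to the query vector."""
    ranked = sorted(
        docs,
        key=lambda text: cosine(query_vector, docs[text]),
        reverse=True,
    )
    return ranked[:limit]

# Hypothetical embedding for the query "mending a punctured bike wheel".
# The query has few words in common with the bicycle document, but its
# vector points in nearly the same direction.
QUERY = [0.8, 0.2, 0.1]

top = search(QUERY, DOCS)
print(top)
```

A keyword match on "bike" or "punctured" would find nothing in these documents, whereas the vector comparison still ranks the bicycle repair document first.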

Downloads: 0 This Week

Last Update: 2026-07-01

yt-fts

Search all of YouTube from the command line

yt-fts, short for YouTube Full Text Search, is an open-source command-line tool that enables users to search the spoken content of YouTube videos by indexing their subtitles. The program automatically downloads subtitles from a specified YouTube channel using the yt-dlp utility and stores them in a local SQLite database. Once indexed, users can perform full-text searches across all transcripts to quickly locate keywords or phrases mentioned within the videos. The tool returns search results with timestamps and direct links to the exact moment in the video where the phrase occurs. In addition to traditional keyword search, the system supports experimental semantic search capabilities using embeddings from AI services and vector databases. This allows users to search videos by meaning rather than only exact keywords.
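The indexing and search flow described above can be sketched with Python's built-in sqlite3 module. This is not yt-fts's own schema or code: the table layout, the made-up video IDs and subtitle rows, and the `search` helper are assumptions, and the sketch assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3

# In-memory database; an FTS5 virtual table indexes the subtitle text.
# Assumption: the local SQLite build was compiled with FTS5.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE subtitles "
    "USING fts5(video_id UNINDEXED, start_seconds UNINDEXED, text)"
)

# Made-up subtitle rows: (video id, start time in seconds, spoken text).
rows = [
    ("vid001", 12, "welcome back to the channel"),
    ("vid001", 95, "today we talk about gradient descent"),
    ("vid002", 40, "gradient boosting is a different idea"),
]
conn.executemany("INSERT INTO subtitles VALUES (?, ?, ?)", rows)

def search(phrase):
    """Return (video_id, start_seconds, link) for subtitles containing the phrase."""
    # Double quotes make FTS5 treat the input as one phrase; embedded
    # quotes are doubled so they cannot break the query syntax.
    match = '"' + phrase.replace('"', '""') + '"'
    cursor = conn.execute(
        "SELECT video_id, start_seconds FROM subtitles "
        "WHERE subtitles MATCH ? ORDER BY video_id, start_seconds",
        (match,),
    )
    return [
        (vid, int(start), f"https://youtu.be/{vid}?t={int(start)}")
        for vid, start in cursor
    ]

results = search("gradient descent")
print(results)
```

Storing the start time alongside each subtitle line is what lets a hit be turned into a link that jumps to the matching moment in the video.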

Downloads: 0 This Week

Last Update: 2026-03-06
