Semantic Search Tools for Linux

Browse free open source Semantic Search tools and projects for Linux below.

  • 1
    Hands-On Large Language Models

    Official code repo for the O'Reilly Book

    Hands-On-Large-Language-Models is the official GitHub code repository accompanying the practical technical book Hands-On Large Language Models by Jay Alammar and Maarten Grootendorst. It provides a comprehensive collection of example notebooks, code labs, and supporting materials that illustrate the core concepts and real-world applications of large language models. The repository is structured into chapters that follow the progression of the book, from foundational topics like tokens, embeddings, and transformer architecture to advanced techniques such as prompt engineering, semantic search, retrieval-augmented generation (RAG), multimodal LLMs, and fine-tuning. Each chapter contains executable Jupyter notebooks designed to run in environments like Google Colab, making it easy for learners to experiment interactively with models, visualize attention patterns, and implement classification and generation tasks.
    Downloads: 80 This Week
  • 2
    pgvector

    Open-source vector similarity search for Postgres

    pgvector is an open-source PostgreSQL extension that adds vector storage, indexing, and similarity search to PostgreSQL databases, which suits embeddings-based applications like semantic search and recommendations. You can add an index to use approximate nearest neighbor search, which trades some recall for speed. Unlike typical indexes, an approximate index can change the results a query returns. An HNSW index creates a multilayer graph. It has better query performance than IVFFlat (in terms of speed-recall tradeoff), but has slower build times and uses more memory. An HNSW index can also be created before the table holds any data, because, unlike IVFFlat, it has no training step.
    Downloads: 69 This Week
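
The recall-for-speed trade-off described for pgvector's approximate indexes can be illustrated without a database. The following is a minimal sketch in plain Python, not pgvector's implementation: `build_lists` and `approx_search` are hypothetical helpers that mimic an IVF-style index by bucketing vectors under sampled centroids (a real index would train its centroids, for example with k-means), and `probes` is an assumed name for the number of buckets scanned per query.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_search(vectors, query, k):
    """Brute-force scan: always returns the true k nearest neighbors."""
    order = sorted(range(len(vectors)), key=lambda i: l2(vectors[i], query))
    return order[:k]


def build_lists(vectors, n_lists, seed=0):
    """Toy IVF-style index: sample centroids, then bucket each vector
    under its nearest centroid (assumption: no k-means training here)."""
    rng = random.Random(seed)
    centroids = [vectors[i] for i in rng.sample(range(len(vectors)), n_lists)]
    lists = [[] for _ in range(n_lists)]
    for i, v in enumerate(vectors):
        nearest = min(range(n_lists), key=lambda c: l2(v, centroids[c]))
        lists[nearest].append(i)
    return centroids, lists


def approx_search(vectors, centroids, lists, query, k, probes):
    """Scan only the `probes` buckets whose centroids are closest to the query."""
    nearest = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [i for c in nearest[:probes] for i in lists[c]]
    candidates.sort(key=lambda i: l2(vectors[i], query))
    return candidates[:k]


def recall(approx, exact):
    """Fraction of the true neighbors that the approximate search found."""
    return len(set(approx) & set(exact)) / len(exact)


rng = random.Random(42)
vectors = [[rng.random() for _ in range(8)] for _ in range(500)]
query = [rng.random() for _ in range(8)]

centroids, lists = build_lists(vectors, n_lists=20)
truth = exact_search(vectors, query, k=10)

# Fewer probes means less work per query but possibly lower recall.
for probes in (1, 5, 20):
    found = approx_search(vectors, centroids, lists, query, k=10, probes=probes)
    print(f"probes={probes:2d} recall={recall(found, truth):.2f}")
```

Scanning every bucket reproduces the exact result, while scanning fewer buckets does less work and may miss some true neighbors. That is why results can change once an approximate index is in place.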
  • 3
    Memvid

    Video-based AI memory library. Store millions of text chunks in MP4

    Memvid encodes text chunks as QR codes within MP4 frames to build a portable “video memory” for AI systems. This innovative approach uses standard video containers and offers millisecond-level semantic search across large corpora with dramatically less storage than vector DBs. It's self-contained—no DB needed—and supports features like PDF indexing, chat integration, and cloud dashboards.
    Downloads: 9 This Week
  • 4
    Forge Code

    AI enabled pair programmer for Claude, GPT, O Series, Grok, Deepseek

    Forge is a modern, open-source tool that brings AI-powered code assistance directly into your terminal workflow, effectively turning your shell into a “pair programmer”, without ever leaving your development environment. Written in Rust (with a command-line interface), Forge integrates with your existing shell (bash, zsh, fish, etc.) or IDE-agnostic workflows, allowing you to interact with your codebase, command-line tools, and version control as usual, but with the added support of large language models (LLMs) to help with code generation, refactoring, bug fixing, code review, and even design advice. Rather than requiring a separate UI or web-based IDE, Forge respects the developer’s existing habits and setups, and keeps all operations local, ensuring your code doesn’t get sent to unknown external services — a strong point for privacy and security. It supports many model providers (e.g. GPT, Claude, Grok, and others) via API keys.
    Downloads: 6 This Week
  • 5
    OpenViking

    Context database designed specifically for AI Agents

    OpenViking is an open-source context database engineered for efficient indexing and retrieval of large amounts of unstructured or semi-structured context data used by AI applications. It’s primarily designed to serve as a high-performance, scalable backend for storing app context, embeddings, conversational histories, and other textual artifacts that need rapid lookup and semantic search, which makes it especially useful for systems like chatbots or memory-augmented agents. The project is implemented with performance in mind, often leveraging optimized data structures that balance fast reads and writes with minimal resource consumption. Developers can integrate OpenViking into modern AI stacks to unify context storage across services, enabling consistent session history, personalized responses, and richer search experiences.
    Downloads: 6 This Week
  • 6
    FlagEmbedding

    Retrieval and Retrieval-augmented LLMs

    FlagEmbedding is an open-source toolkit for building and deploying high-performance text embedding models used in information retrieval and retrieval-augmented generation systems. The project is part of the BAAI FlagOpen ecosystem and focuses on creating embedding models that transform text into dense vector representations suitable for semantic search and large language model pipelines. FlagEmbedding includes a family of models known as BGE (BAAI General Embedding), which are designed to achieve strong performance across multilingual and cross-lingual retrieval benchmarks. The toolkit provides infrastructure for inference, fine-tuning, evaluation, and dataset preparation, enabling developers to train custom embedding models for specific domains or applications. It also includes reranker models that refine search results by re-evaluating candidate documents using cross-encoder architectures, improving retrieval accuracy in complex queries.
    Downloads: 3 This Week
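
The two-stage pattern in the FlagEmbedding description (embedding-based retrieval followed by a reranker) can be sketched as below. This is an illustration only and does not use the FlagEmbedding API: `embed` is a toy bag-of-words stand-in for an embedding model, and `joint_score` is a toy stand-in for a cross-encoder that reads the query and document together. Both names are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k):
    """Stage 1: embed query and documents independently, keep the top k."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def joint_score(query, doc):
    """Toy stand-in for a cross-encoder: scores the pair jointly by counting
    adjacent query word pairs that appear in the same order in the document."""
    q = query.lower().split()
    d = doc.lower().split()
    doc_bigrams = set(zip(d, d[1:]))
    return sum(1 for pair in zip(q, q[1:]) if pair in doc_bigrams)


def rerank(query, candidates):
    """Stage 2: re-score only the shortlist with the costlier joint scorer."""
    return sorted(candidates, key=lambda d: joint_score(query, d), reverse=True)


docs = [
    "search engine for semantic vectors",
    "semantic search with dense vectors",
    "cooking pasta at home",
]
query = "semantic search"

shortlist = retrieve(query, docs, k=2)
final = rerank(query, shortlist)
print(final)
```

The first stage cannot tell the first two documents apart because it ignores word order. The second stage separates them, but it only runs on the shortlist, which keeps the expensive scoring bounded.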
  • 7
    LibrePhotos

    A self-hosted open source photo management service

    LibrePhotos is an open-source self-hosted photo management platform designed to organize, browse, and analyze personal media libraries while preserving user privacy. The system allows individuals to store and manage their photos and videos locally rather than relying on commercial cloud services. It provides features similar to services like Google Photos but runs on a private server controlled by the user. The application includes AI-powered tools that automatically analyze images to detect faces, objects, and locations, allowing photos to be grouped and searched more efficiently. LibrePhotos supports a wide variety of media formats and provides a web interface that can be accessed from different devices and operating systems. The platform is built using a Django backend and a React frontend, forming a full-stack web application architecture.
    Downloads: 3 This Week
  • 8
    ModernBERT

    Bringing BERT into modernity via both architecture changes and scaling

    ModernBERT is an open-source research project that modernizes the classic BERT encoder architecture by incorporating recent advances in transformer design, training techniques, and efficiency improvements. The goal of the project is to bring BERT-style models up to date with the capabilities of modern large language models while preserving the strengths of bidirectional encoder architectures used for tasks such as classification, retrieval, and semantic search. ModernBERT introduces architectural improvements that enhance both training efficiency and inference performance, making the model more suitable for modern large-scale machine learning pipelines. The repository also includes FlexBERT, a modular framework that allows developers to experiment with different encoder building blocks and configurations when constructing new models.
    Downloads: 3 This Week
  • 9
    Open Semantic Search

    Open source semantic search and text analytics for large document sets

    Open Semantic Search is an open source research and analytics platform designed for searching, analyzing, and exploring large collections of documents using semantic search technologies. It provides an integrated search server combined with a document processing pipeline that supports crawling, text extraction, and automated analysis of content from many different sources. Open Semantic Search includes an ETL framework that can ingest documents, process them through analysis steps, and enrich the data with extracted information such as named entities and metadata. It also supports optical character recognition to extract text from images and scanned documents, including images embedded inside PDF files. It integrates text mining and analytics capabilities that allow users to examine relationships, topics, and structured data within document collections.
    Downloads: 3 This Week
  • 10
    Reor Project

    Private & local AI personal knowledge management app

    Reor is an AI-powered desktop note-taking app: it automatically links related notes, answers questions on your notes, provides semantic search and can generate AI flashcards. Everything is stored locally and you can edit your notes with an Obsidian-like markdown editor. The hypothesis of the project is that AI tools for thought should run models locally by default. Reor stands on the shoulders of the giants Ollama, Transformers.js & LanceDB to enable both LLMs and embedding models to run locally.
    Downloads: 3 This Week
  • 11
    Hugging Face Transformer

    CPU/GPU inference server for Hugging Face transformer models

    Optimize and deploy Hugging Face Transformer models in production with a single command line. At Lefebvre Dalloz we run semantic search engines in production in the legal domain. In non-marketing language, such an engine is a re-ranker, and ours is based on a Transformer. In that setup, latency is key to providing a good user experience, and relevancy inference is done online for hundreds of snippets per user query. Most tutorials on Transformer deployment in production are built on PyTorch and FastAPI. Both are great tools, but neither is very performant at inference. With some extra effort, you can build something on ONNX Runtime and Triton Inference Server. You will usually get 2X to 4X faster inference compared to vanilla PyTorch. It's cool! However, if you want best-in-class performance on GPU, there is only one possible combination: NVIDIA TensorRT and Triton. You will usually get 5X faster inference compared to vanilla PyTorch.
    Downloads: 2 This Week
  • 12
    OpenAI Cookbook

    Examples and guides for using the OpenAI API

    openai-cookbook is a repository containing example code, tutorials, and guidance for how to build real applications on top of the OpenAI API. It covers a wide range of use cases: prompt engineering, embeddings and semantic search, fine-tuning, agent architectures, function calling, working with images, chat workflows, and more. The content is primarily in Python (notebooks, scripts), but the conceptual guidance is applicable across languages. The repository is kept up to date and often expanded, and its examples are intended to serve both beginners and intermediate users of the API. It also includes deployment recipes, integration snippets (e.g. with GitHub Actions), and production considerations. Because OpenAI’s API evolves rapidly, the Cookbook acts as a living, community-curated reference to show “how to do X with the API” rather than only reprinting documentation.
    Downloads: 2 This Week
  • 13
    QMD

    mini cli search engine for your docs, knowledge bases, etc.

    QMD is a powerful and lightweight command-line tool that acts as an on-device search engine for your personal knowledge base, allowing you to index and search files like Markdown notes, meeting transcripts, technical documentation, and other text collections without depending on cloud services. Designed to keep all search activity local, it combines classic full-text search techniques with modern semantic features such as vector similarity and hybrid ranking so that queries return not just literal matches but conceptually relevant results. Users can organize content into named collections, embed documents for semantic retrieval, and then perform keyword searches, semantic searches, or hybrid natural-language queries to quickly surface the most useful information across all indexed sources. Because the entire system runs on the user’s machine, privacy is preserved and there’s no risk of exposing sensitive content to outside providers.
    Downloads: 2 This Week
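
Hybrid ranking, as mentioned in the QMD description, means merging a keyword result list with a semantic result list. The listing does not say which fusion method QMD uses. The sketch below assumes reciprocal rank fusion, a common choice, and the file paths and the constant `k=60` are illustrative assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by several searches rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical result lists from a keyword search and a semantic search.
keyword_hits = ["notes/meeting.md", "docs/api.md", "notes/todo.md"]
semantic_hits = ["docs/design.md", "notes/meeting.md", "docs/api.md"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Rank-based fusion avoids having to normalize keyword scores and vector similarities onto a common scale, since only the positions in each list matter.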
  • 14
    RAG API

    ID-based RAG FastAPI: Integration with Langchain and PostgreSQL

    rag_api is an open-source REST API for building Retrieval-Augmented Generation (RAG) systems using LLMs like GPT. It lets users index documents, search semantically, and retrieve relevant content for use in generative AI workflows. Designed for rapid prototyping, it is ideal for chatbot development, document assistants, and knowledge-based LLM apps.
    Downloads: 2 This Week
  • 15
    Weaviate

    Weaviate is a cloud-native, modular, real-time vector search engine

    Weaviate in a nutshell: Weaviate is a vector search engine and vector database. Weaviate uses machine learning to vectorize and store data, and to find answers to natural language queries. With Weaviate you can also bring your custom ML models to production scale. Weaviate in detail: Weaviate is a low-latency vector search engine with out-of-the-box support for different media types (text, images, etc.). It offers Semantic Search, Question-Answer-Extraction, Classification, Customizable Models (PyTorch/TensorFlow/Keras), and more. Built from scratch in Go, Weaviate stores both objects and vectors, so vector search can be combined with structured filtering and the fault tolerance of a cloud-native database, all accessible through GraphQL, REST, and various language clients.
    Downloads: 2 This Week
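
Storing objects and vectors together is what makes it possible to combine structured filtering with vector search in one query. The sketch below shows the idea in plain Python and does not use the Weaviate client or its query language: the `objects` records, the `where` argument, and `filtered_search` are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Each record keeps its structured properties next to its vector.
objects = [
    {"title": "Intro to vectors", "lang": "en", "vector": [0.9, 0.1, 0.0]},
    {"title": "Vektoren Grundlagen", "lang": "de", "vector": [0.95, 0.05, 0.0]},
    {"title": "Baking bread", "lang": "en", "vector": [0.0, 0.2, 0.9]},
]


def filtered_search(objects, query_vector, where, k):
    """Keep objects whose properties match `where`, then rank by similarity."""
    matching = [
        obj for obj in objects
        if all(obj.get(field) == value for field, value in where.items())
    ]
    matching.sort(key=lambda obj: cosine(obj["vector"], query_vector), reverse=True)
    return matching[:k]


query_vector = [1.0, 0.0, 0.0]
unfiltered = filtered_search(objects, query_vector, where={}, k=1)
english_only = filtered_search(objects, query_vector, where={"lang": "en"}, k=1)
print(unfiltered[0]["title"], "|", english_only[0]["title"])
```

Without the filter the closest vector wins regardless of language. With the filter, the best match is chosen only among objects that satisfy the structured condition.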
  • 16
    CocoIndex

    ETL framework to index data for AI, such as RAG

    CocoIndex is an open-source framework designed for building powerful, local-first semantic search systems. It lets users index and retrieve content based on meaning rather than keywords, making it ideal for modern AI-based search applications. CocoIndex leverages vector embeddings and integrates with various models and frameworks, including OpenAI and Hugging Face, to provide high-quality semantic understanding. It’s built for transparency, ease of use, and local control over your search data, distinguishing itself from closed, black-box systems. The tool is suitable for developers working on personal knowledge bases, AI search interfaces, or private LLM applications.
    Downloads: 1 This Week
  • 17
    Kernel Memory

    Research project. A Memory solution for users, teams, and applications

    Kernel Memory is an open-source reference architecture developed by Microsoft to help developers build memory systems for AI applications powered by large language models. The project focuses on enabling applications to store, index, and retrieve information so that AI systems can incorporate external knowledge when generating responses. It supports scenarios such as document ingestion, semantic search, and retrieval-augmented generation, allowing language models to answer questions using contextual information from private or enterprise datasets. Kernel Memory can ingest documents in multiple formats, process them into embeddings, and store them in searchable indexes. Applications can then query these indexed data sources to retrieve relevant information and include it as context for AI responses.
    Downloads: 1 This Week
  • 18
    Language Models

    Explore large language models in 512MB of RAM

    languagemodels is a lightweight Python library designed to simplify experimentation with large language models while maintaining extremely low hardware requirements. The project focuses on enabling developers and students to explore language model capabilities without needing expensive GPUs or large cloud infrastructures. By using small and optimized models, the library allows LLM inference to run in environments with limited resources, sometimes requiring only a few hundred megabytes of memory. The package provides simple APIs that allow developers to generate text, perform semantic search, classify text, and answer questions using local models. It is particularly useful for educational purposes, as it demonstrates the fundamental mechanics of language model inference and prompt-based applications. The repository includes multiple example applications such as chatbots, document question answering systems, and information retrieval tools.
    Downloads: 1 This Week
  • 19
    Microsoft Learn MCP Server

    Official Microsoft Learn MCP Server, powering LLMs and AI agents

    Microsoft Learn MCP Server is the official GitHub repository for the Microsoft Learn MCP (Model Context Protocol) Server, a service that implements the Model Context Protocol to provide AI assistants and tools with reliable, real-time access to Microsoft’s official documentation. Rather than relying on training data that may be outdated or incomplete, MCP servers let agents like GitHub Copilot, Claude, or other LLM-based tools search and pull context directly from up-to-date Microsoft Learn content, including Azure, .NET, and other tech docs. By connecting to the MCP endpoint, coding agents can answer questions, retrieve code examples, and offer best practices grounded in authoritative sources without requiring API keys or manual browser searches. This capability helps eliminate hallucinations, improve accuracy, and streamline developer workflows by keeping relevant tech guidance close at hand.
    Downloads: 1 This Week
  • 20
    OceanBase seekdb

    The AI-Native Search Database

    seekdb is an AI-native search database from OceanBase that unifies vector, full-text, relational, JSON, and GIS data into a single query engine. The system is designed to support hybrid search workloads and in-database AI workflows without requiring multiple specialized databases. It enables developers to perform semantic search, keyword search, and structured SQL queries within the same platform, simplifying modern AI application stacks. seekdb also embeds AI capabilities directly in the database layer, including embedding generation, reranking, and LLM inference for end-to-end RAG pipelines. Built on the OceanBase engine, it maintains ACID compliance and MySQL compatibility while delivering real-time analytical performance. Overall, seekdb positions itself as a unified data foundation for next-generation AI applications that require both transactional and semantic retrieval capabilities.
    Downloads: 1 This Week
  • 21
    Semantra

    Multi-tool for semantic search

    Semantra is an open-source semantic search tool designed to help users explore large collections of documents by meaning rather than simple keyword matching. The software analyzes text and PDF documents stored locally and creates embeddings that allow queries to retrieve results based on conceptual similarity. It is primarily intended for individuals who need to extract insights from large document collections, including researchers, journalists, students, and historians. The system runs from the command line and automatically launches a local web interface where users can perform interactive searches and examine document passages related to a query. By relying on semantic embeddings and contextual analysis, the tool can identify passages that are relevant even when the query uses different wording than the source documents.
    Downloads: 1 This Week
  • 22
    Vector AI

    A platform for building vector based applications

    Vector AI is a framework designed to make building production-grade vector-based applications as quick and easy as possible. Create, store, manipulate, search, and analyze vectors alongside JSON documents to power applications such as neural search, semantic search, and personalized recommendations. Any data can be turned into vectors through machine learning (Image2Vec, Audio2Vec, etc.). Store your vectors alongside documents without having to do a database lookup for metadata about the vectors. Enable searching of vectors and rich multimedia with vector similarity search, the backbone of many popular AI use cases like reverse image search, recommendations, and personalization. There are scenarios where vector search is not as effective as traditional search, e.g. searching for SKUs. Vector AI lets you combine vector search with all the features of traditional search, such as filtering, fuzzy search, and keyword matching, to create an even more powerful search.
    Downloads: 1 This Week
  • 23
    ViMax

    Director, Screenwriter, Producer, and Video Generator All-in-One

    ViMax is an open-source framework for performing large-scale multi-modal vision-language modeling and reasoning by combining powerful image encoders with advanced language models to solve complex visual tasks. It integrates components like visual encoders, cross-modal fusion techniques, and reasoning modules so that users can go beyond simple captioning or classification to perform tasks such as visual question answering, multi-image inference, and structured scene understanding. ViMax’s design accommodates large image sets and supports retrieval augmentation, enabling it to work with external image databases, supplementary metadata, and semantic search to enhance context awareness. The system aims to bridge foundational vision backbones and generative language models through adapters and fusion layers that maximize both signal integration and reasoning depth, and includes utility pipelines for training, evaluation, and deployment.
    Downloads: 1 This Week
  • 24
    pgai

    A suite of tools to develop RAG, semantic search, and other AI apps

    pgai is a suite of PostgreSQL extensions developed by Timescale to empower developers in building AI applications directly within their databases. It integrates tools for vector storage, advanced indexing, and AI model interactions, facilitating the development of applications like semantic search and Retrieval-Augmented Generation (RAG) without leaving the SQL environment.
    Downloads: 1 This Week
  • 25
    QTE Technologies-Industrial-Scientific

    1M+ Industrial & Scientific MRO Metadata for AI and Research

    This is the official open-data repository for QTE Technologies, providing a comprehensive archive of over 1,000,000 industrial and scientific MRO (Maintenance, Repair, and Operations) records. Optimized for Industrial AI training, RAG applications, and semantic search, this dataset includes technical specifications, global standards, and manufacturer metadata. Verification & Authority: Managed via DVC on DagsHub. Archived on Zenodo, Harvard Dataverse, and Figshare. Linked Data via Wikidata (Q138411149). Built for engineers, data scientists, and procurement professionals worldwide.
    Downloads: 1 This Week