Showing 80 open source projects for "indexing"

  • 1
    Vald

    Vald. A Highly Scalable Distributed Vector Search Engine

    ...Vald is designed and implemented based on a Cloud-Native architecture. It uses the fastest ANN algorithm, NGT, to search for neighbors. Vald has automatic vector indexing, index backup, and horizontal scaling, which makes it suitable for searching across billions of feature vectors. Vald is easy to use, feature-rich, and highly customizable as needed. Usually, a graph index requires locking during indexing, which causes a stop-the-world pause. Vald instead uses distributed index graphs, so it continues to serve searches during indexing. ...
    Downloads: 0 This Week
  • 2
    Search-Index

    A persistent, network resilient, full text search library

    Search-Index is a lightweight and fast JavaScript-based search engine that enables full-text search indexing and retrieval for web applications.
    Downloads: 1 This Week
  • 3
    Engram

    A New Axis of Sparsity for Large Language Models

    ...It provides utilities to generate embeddings from text or other structured data, index them using efficient approximate nearest neighbor algorithms, and perform real-time similarity queries even on large corpora. Engineered with speed and memory efficiency in mind, Engram supports batched indexing, incremental updates, and custom distance metrics so developers can tailor search behaviors to their domain’s needs. In addition to raw similarity search, the project includes tools for clustering, ranking, and filtering results, enabling richer user experiences like “related content”, semantic auto-completion, and contextual filtering.
    Downloads: 0 This Week
  • 4
    pgai

    A suite of tools to develop RAG, semantic search, and other AI apps

    pgai is a suite of PostgreSQL extensions developed by Timescale to empower developers in building AI applications directly within their databases. It integrates tools for vector storage, advanced indexing, and AI model interactions, facilitating the development of applications like semantic search and Retrieval-Augmented Generation (RAG) without leaving the SQL environment.
    Downloads: 0 This Week
  • 5
    Memvid

    Video-based AI memory library. Store millions of text chunks in MP4

    ...This innovative approach uses standard video containers and offers millisecond-level semantic search across large corpora with dramatically less storage than vector DBs. It's self-contained—no DB needed—and supports features like PDF indexing, chat integration, and cloud dashboards.
    Downloads: 97 This Week
  • 6
    bloop

    bloop is a fast code search engine written in Rust

    Bloop is an AI-powered code search tool designed to help developers quickly find relevant code snippets, documentation, and usage examples within large repositories. It provides natural language search capabilities and AI-enhanced recommendations for improving code discovery.
    Downloads: 0 This Week
  • 7
    SemTools

    Semantic search and document parsing tools for the command line

    SemTools is an open-source command-line toolkit designed for document parsing, semantic indexing, and semantic search workflows. The project focuses on enabling developers and AI agents to process large document collections and extract meaningful semantic representations that can be searched efficiently. Built with Rust for performance and reliability, the toolchain provides fast processing of text and structured documents while maintaining low system overhead.
    Downloads: 3 This Week
  • 8
    EverMemOS

    Long-term memory OS for AI with structured recall and context awareness

    ...Instead of treating each prompt independently, it builds evolving user profiles, tracks preferences, and connects related events into coherent narratives. Its architecture combines memory storage, indexing, and retrieval with agent-level reasoning, allowing AI systems to make informed decisions based on prior interactions. EverMemOS goes beyond simple retrieval by actively applying stored knowledge to current tasks, improving personalization and consistency. EverMemOS uses a multi-stage memory lifecycle to convert raw dialogue into structured semantic data, supporting long-horizon reasoning and adaptive behavior across sessions.
    Downloads: 8 This Week
  • 9
    OpenViking

    Context database designed specifically for AI Agents

    OpenViking is an open-source context database engineered for efficient indexing and retrieval of large amounts of unstructured or semi-structured context data used by AI applications. It’s primarily designed to serve as a high-performance, scalable backend for storing app context, embeddings, conversational histories, and other textual artifacts that need rapid lookup and semantic search, which makes it especially useful for systems like chatbots or memory-augmented agents.
    Downloads: 2 This Week
  • 10
    UltraRAG

    Less Code, Lower Barrier, Faster Deployment

    UltraRAG 2.0 is a low-code, MCP-enabled RAG framework that aims to lower the barrier to building complex retrieval pipelines for research and production. It provides end-to-end recipes—from encoding and indexing corpora to deploying retrievers and LLMs—so users can reproduce baselines and iterate rapidly. The toolkit comes with built-in support for popular RAG datasets, large corpora, and canonical baselines, plus documentation that walks from “quick start” to debugging and case analysis. It encourages pipeline composition via configuration, enabling researchers to swap retrievers, rerankers, and generators without heavy refactoring. ...
    Downloads: 2 This Week
  • 11
    GraphRAG

    A modular graph-based Retrieval-Augmented Generation (RAG) system

    The GraphRAG project is a data pipeline and transformation suite that is designed to extract meaningful, structured data from unstructured text using the power of LLMs.
    Downloads: 4 This Week
  • 12
    Basic Memory

    Persistent AI memory using local Markdown knowledge graphs

    ...Basic Memory creates a semantic knowledge graph by linking related ideas, making it easier to retrieve, expand, and connect information over time. With a local-first design, your data stays private and portable, while optional cloud sync enables cross-device access. It combines simplicity with powerful indexing and search, giving you a flexible way to build long-term memory for projects, research, and workflows.
    Downloads: 4 This Week
  • 13
    Cognita

    Open source RAG framework for building scalable modular AI apps

    ...It includes both a backend service and a frontend interface, enabling users to upload documents, experiment with configurations, and perform question-answering tasks interactively. Cognita supports incremental indexing, meaning it processes only new or updated data to reduce computational overhead and improve efficiency.
    Downloads: 1 This Week
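The incremental indexing idea in the Cognita entry (process only new or updated data) can be illustrated with a minimal sketch. This is not Cognita's code or API: the function names `content_hash` and `incremental_index` and the in-memory `seen` store are hypothetical, and content hashing is just one common way to detect changed documents.

```python
import hashlib


def content_hash(text: str) -> str:
    """Fingerprint a document's content (hypothetical helper)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def incremental_index(documents: dict[str, str], seen: dict[str, str]) -> list[str]:
    """Return the ids that needed (re)indexing and record their hashes in `seen`.

    `documents` maps document id -> text; `seen` maps document id -> hash
    from earlier runs. Unchanged documents are skipped.
    """
    changed = []
    for doc_id, text in documents.items():
        digest = content_hash(text)
        if seen.get(doc_id) != digest:
            seen[doc_id] = digest      # a real system would embed and store here
            changed.append(doc_id)
    return changed


seen: dict[str, str] = {}
first = incremental_index({"a.md": "hello", "b.md": "world"}, seen)
second = incremental_index({"a.md": "hello", "b.md": "world!", "c.md": "new"}, seen)
print(first)   # both documents are new on the first run
print(second)  # only the edited and the newly added document
```

On the second run only `b.md` (edited) and `c.md` (new) are reprocessed, which is where the reduced computational overhead comes from.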
  • 14
    Advanced + Agentic RAG Cookbooks

    Advanced RAG cookbooks for building accurate LLM applications

    ...It provides ready-to-use notebooks, implementations, and explanations that help developers move from basic RAG setups to more sophisticated workflows. Athina AI’s RAG Cookbooks cover the full RAG pipeline, including indexing, retrieval, augmentation, and generation, while also addressing evaluation to measure accuracy and relevance. They include multiple approaches such as hybrid search, contextual compression, and agent-based retrieval strategies, allowing users to experiment and compare methods. They are designed to reduce development time by offering practical examples and references to research papers, making them useful for both learning and production use. ...
    Downloads: 2 This Week
  • 15
    VideoRAG

    "VideoRAG: Chat with Your Videos"

    VideoRAG is a retrieval-augmented generation (RAG) framework tailored for video content that enables AI systems to answer questions, summarize, and reason over long videos by combining visual embeddings with contextual search. The system works by first breaking video into clips, extracting visual and audio-textual features, and indexing them into embeddings, then using an LLM with a retriever to pull relevant segments on demand. When a user query is received, VideoRAG locates semantically relevant moments in the video using the embedding index, retrieves associated clips or transcripts, and feeds them to a generative model to produce accurate, grounded answers or summaries. ...
    Downloads: 0 This Week
  • 16
    HeavyDB

    HeavyDB (formerly MapD/OmniSciDB)

    ...The system is built as a SQL-based relational columnar database engine that leverages modern hardware parallelism, including GPUs and multicore CPUs. Its architecture allows users to query datasets containing billions of rows in milliseconds without requiring traditional indexing, pre-aggregation, or sampling techniques. HeavyDB was originally developed as part of the OmniSci platform (formerly MapD) and is commonly used for large-scale analytics and geospatial data processing. The database compiles queries into optimized machine code that executes efficiently on GPU hardware, significantly accelerating analytical workloads. ...
    Downloads: 1 This Week
  • 17
    Pixeltable

    Data Infrastructure providing an approach to multimodal AI workloads

    Pixeltable is an open-source Python data infrastructure framework designed to support the development of multimodal AI applications. The system provides a declarative interface for managing the entire lifecycle of AI data pipelines, including storage, transformation, indexing, retrieval, and orchestration of datasets. Unlike traditional architectures that require multiple tools such as databases, vector stores, and workflow orchestrators, Pixeltable unifies these functions within a table-based abstraction. Developers define data transformations and AI operations using computed columns on tables, allowing pipelines to evolve incrementally as new data or models are added. ...
    Downloads: 1 This Week
  • 18
    clip-retrieval

    Easily compute CLIP embeddings and build a CLIP retrieval system

    ...The system is optimized for performance and scalability, capable of processing tens or even hundreds of millions of embeddings using GPU acceleration. It includes components for inference, indexing, filtering, and serving results through APIs, making it a complete pipeline for building production-ready retrieval systems. The framework also supports querying by image, text, or embedding, enabling flexible use cases such as reverse image search or multimodal content discovery. Additionally, it provides a simple frontend interface and backend services that can be deployed to expose search functionality to users.
    Downloads: 0 This Week
  • 19
    Pathway AI Pipelines

    Ready-to-run cloud templates for RAG

    ...It supports numerous connectors including local files, Google Drive, SharePoint, Kafka, PostgreSQL, and real-time APIs, making it suitable for enterprise data environments. The templates include built-in indexing, vector search, hybrid search, and caching capabilities that remove the need to assemble separate infrastructure components. Developers can run the applications locally or deploy them to cloud platforms using Docker with minimal setup. Overall, llm-app functions as a practical accelerator for teams building real-time, production-ready AI knowledge systems.
    Downloads: 0 This Week
  • 20
    MiniRAG

    Making RAG Simpler with Small and Open-Sourced Language Models

    MiniRAG is a lightweight retrieval-augmented generation tool designed to bring the benefits of RAG workflows to smaller datasets, edge environments, and constrained compute settings by simplifying embedding, indexing, and retrieval. It extracts text from documents, code, or other structured inputs and converts it into embeddings using efficient models, then stores these vectors for fast nearest-neighbor search without requiring huge databases or separate vector servers. When a query is issued, MiniRAG retrieves the most relevant contexts and feeds them into a generative model to produce an answer that is grounded in the source material rather than hallucinated. ...
    Downloads: 0 This Week
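The embed, store, retrieve, and ground flow described in the MiniRAG entry can be sketched in a few lines. This is an illustration only, not MiniRAG's API: the word-count "embedding" stands in for a real embedding model, and `embed`, `retrieve`, and `build_prompt` are hypothetical names.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(index: list, query: str, k: int = 2) -> list:
    """Return the k stored chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query: str, contexts: list) -> str:
    """Assemble the grounded prompt that would be sent to a generative model."""
    joined = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


chunks = [
    "pgvector adds vector search to postgres",
    "vald is a distributed vector search engine",
    "bloop searches code in large repositories",
]
index = [(c, embed(c)) for c in chunks]          # the stored vectors
top = retrieve(index, "distributed vector search", k=1)
prompt = build_prompt("distributed vector search", top)
print(prompt)
```

The generation step is left out on purpose; the point is that the model only sees the retrieved chunks, which is what keeps the answer tied to the source material.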
  • 21
    pgvector

    Open-source vector similarity search for Postgres

    pgvector is an open-source PostgreSQL extension that equips PostgreSQL databases with vector data storage, indexing, and similarity search capabilities—ideal for embeddings-based applications like semantic search and recommendations. You can add an index to use approximate nearest neighbor search, which trades some recall for speed. Unlike typical indexes, you will see different results for queries after adding an approximate index. An HNSW index creates a multilayer graph.
    Downloads: 58 This Week
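The recall-for-speed trade-off mentioned in the pgvector entry can be illustrated with a small pure-Python sketch. It is not pgvector's implementation and it does not model HNSW: it assumes a simpler partition-based scheme (vectors grouped by nearest centroid, with only a few groups probed per query), and all function names are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_search(vectors, query, k):
    """Brute force: compare the query against every vector."""
    return sorted(range(len(vectors)), key=lambda i: dist(vectors[i], query))[:k]


def build_lists(vectors, centroids):
    """Assign each vector id to the list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: dist(v, centroids[c]))
        lists[nearest].append(i)
    return lists


def approx_search(vectors, centroids, lists, query, k, probes):
    """Approximate: scan only the `probes` lists whose centroids are closest."""
    order = sorted(range(len(centroids)), key=lambda c: dist(query, centroids[c]))
    candidates = [i for c in order[:probes] for i in lists[c]]
    return sorted(candidates, key=lambda i: dist(vectors[i], query))[:k]


def recall(approx, exact):
    """Fraction of the true nearest neighbors that the approximate search found."""
    return len(set(approx) & set(exact)) / len(exact)


rng = random.Random(0)
vectors = [[rng.random() for _ in range(8)] for _ in range(500)]
centroids = rng.sample(vectors, 10)   # assumption: sampled points, no training step
lists = build_lists(vectors, centroids)
query = [rng.random() for _ in range(8)]

exact = exact_search(vectors, query, 5)
fast = approx_search(vectors, centroids, lists, query, 5, probes=2)
full = approx_search(vectors, centroids, lists, query, 5, probes=10)
print("recall with 2 probes:", recall(fast, exact))
print("recall with all probes:", recall(full, exact))
```

Probing every list scans all vectors and recovers the exact answer, while probing fewer lists does less work and may miss some true neighbors. That is the sense in which query results can differ once an approximate index is added.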
  • 22
    LightRAG

    "LightRAG: Simple and Fast Retrieval-Augmented Generation"

    LightRAG is a lightweight Retrieval-Augmented Generation (RAG) framework designed for efficient document retrieval and response generation. It is optimized for speed and lower resource consumption, making it ideal for real-time applications.
    Downloads: 1 This Week
  • 23
    LEANN

    Local RAG engine for private multimodal knowledge search on devices

    ...LEANN introduces a storage-efficient approximate nearest neighbor index combined with on-the-fly embedding recomputation to avoid storing large embedding vectors. By recomputing embeddings during queries and using compact graph-based indexing structures, LEANN can maintain high search accuracy while minimizing disk usage. It aims to act as a unified personal knowledge layer that connects different types of data such as documents, code, images, and other local files into a searchable context for language models.
    Downloads: 0 This Week
  • 24
    Sage Chat

    Chat with any codebase in under two minutes | Fully local

    ...Developers can ask natural language questions about a project, and the system responds with explanations supported by references to the relevant code, documentation, or external technical resources. The project aims to act as a contextual knowledge layer for software teams by combining language models with repository indexing and documentation retrieval. Sage can operate locally or connect to external AI services, depending on the configuration, providing flexibility for privacy-sensitive environments.
    Downloads: 0 This Week
  • 25
    mgrep

    A calm, CLI-native way to semantically grep everything, like code

    ...Built with a focus on calm CLI experiences, it lets you index and query your local files with semantic understanding, delivering results that are relevant to your intent rather than simple pattern matches, which is especially powerful in large or diverse projects. It also includes features such as background indexing to keep your search index up to date without interrupting your workflow and web search integration to expand the scope of queries beyond local files. Designed for both programmers and agents, it integrates naturally into development and research workflows while offering thoughtful defaults that keep output clean and informative.
    Downloads: 0 This Week