Search Results for "information retrieval"

Showing 290 open source projects for "information retrieval"

1

BEIR

A Heterogeneous Benchmark for Information Retrieval

BEIR is a benchmark framework for evaluating information retrieval models across various datasets and tasks, including document ranking and question answering.

Downloads: 0 This Week

Last Update: 2025-06-04
2

Claude Context

Code search MCP for Claude Code

Claude Context is a tool designed to enhance the contextual understanding of large language models by managing and injecting relevant information into prompts. It focuses on improving response quality by ensuring that models have access to the most relevant data when generating outputs. The system integrates with vector databases and retrieval systems, enabling efficient storage and retrieval of contextual information. It supports workflows such as retrieval-augmented generation, where external knowledge is dynamically incorporated into model responses. ...

Downloads: 3 This Week

Last Update: 3 hours ago
3

RAPTOR

The official implementation of RAPTOR

RAPTOR is a retrieval architecture designed to improve retrieval-augmented generation systems by organizing documents into hierarchical structures that enable more effective context retrieval. Traditional RAG systems typically retrieve small text chunks independently, which can limit a model’s ability to understand broader document context. RAPTOR addresses this limitation by recursively embedding, clustering, and summarizing documents to create a tree-structured hierarchy of information. ...

Downloads: 0 This Week

Last Update: 2026-03-06
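
The recursive embed, cluster, and summarize loop described for RAPTOR can be illustrated with a minimal sketch. This is not the project's code: the bag-of-words `embed`, the greedy `cluster`, and the truncating `summarize` are hypothetical stand-ins for a real embedding model, clustering algorithm, and LLM summarizer, and `build_tree` is an illustrative name.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a term-count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cluster(texts, threshold=0.3):
    """Greedy clustering: join the first cluster whose seed text is similar enough."""
    clusters = []
    for text in texts:
        vec = embed(text)
        for group in clusters:
            if cosine(vec, embed(group[0])) >= threshold:
                group.append(text)
                break
        else:
            clusters.append([text])
    return clusters

def summarize(texts, max_words=12):
    """Placeholder summarizer: keep the first few words of the joined texts."""
    return " ".join(" ".join(texts).split()[:max_words])

def build_tree(chunks, threshold=0.3):
    """Return the tree as a list of levels: level 0 holds raw chunks, the last level one root."""
    levels = [list(chunks)]
    while len(levels[-1]) > 1:
        groups = cluster(levels[-1], threshold)
        if len(groups) == len(levels[-1]):
            # Nothing merged, so force a single root to guarantee termination.
            groups = [levels[-1]]
        levels.append([summarize(group) for group in groups])
    return levels

chunks = [
    "cats purr and sleep",
    "cats purr loudly",
    "stocks fell sharply today",
    "stocks rose today",
]
levels = build_tree(chunks)
for depth, nodes in enumerate(levels):
    print(depth, nodes)
```

At query time, a retriever over such a tree could match against nodes at any level, so a broad question can hit a summary while a narrow one hits a leaf chunk.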
4

LightRAG

"LightRAG: Simple and Fast Retrieval-Augmented Generation"

LightRAG is a lightweight Retrieval-Augmented Generation (RAG) framework designed for efficient document retrieval and response generation. It is optimized for speed and lower resource consumption, making it ideal for real-time applications.

Downloads: 2 This Week

Last Update: 6 days ago
5

FastRAG

Efficient Retrieval Augmentation and Generation Framework

fastRAG is a research framework for efficient and optimized retrieval augmented generative pipelines, incorporating state-of-the-art LLMs and Information Retrieval. fastRAG is designed to empower researchers and developers with a comprehensive tool set for advancing retrieval augmented generation.

Downloads: 0 This Week

Last Update: 2025-01-24
6

FlagEmbedding

Retrieval and Retrieval-augmented LLMs

FlagEmbedding is an open-source toolkit for building and deploying high-performance text embedding models used in information retrieval and retrieval-augmented generation systems. The project is part of the BAAI FlagOpen ecosystem and focuses on creating embedding models that transform text into dense vector representations suitable for semantic search and large language model pipelines. FlagEmbedding includes a family of models known as BGE (BAAI General Embedding), which are designed to achieve strong performance across multilingual and cross-lingual retrieval benchmarks. ...

Downloads: 3 This Week

Last Update: 3 days ago
7

Youtu-GraphRAG

Vertically Unified Agents for Graph Retrieval-Augmented Reasoning

Youtu-GraphRAG is a research framework developed by Tencent for performing complex reasoning using graph-based retrieval-augmented generation. The system combines knowledge graphs, retrieval mechanisms, and agent-based reasoning into a unified architecture designed to handle knowledge-intensive tasks. Instead of relying solely on text retrieval, the framework organizes information into structured graph schemas that represent entities, relationships, and attributes. ...

Downloads: 0 This Week

Last Update: 2026-03-09
8

Agentic RAG for Dummies

A modular Agentic RAG built with LangGraph

Agentic RAG for Dummies is an educational repository that demonstrates how to build retrieval-augmented generation systems combined with autonomous AI agents. The project explains the principles behind agentic retrieval pipelines where language models can dynamically decide when to retrieve information, analyze results, and plan further actions. Instead of relying on static retrieval pipelines, the system shows how agents can orchestrate retrieval, reasoning, and tool usage in a more flexible decision loop. ...

Downloads: 2 This Week

Last Update: 2026-04-01
9

WeKnora

LLM framework for document understanding and semantic retrieval

WeKnora is an open source framework developed for deep document understanding and semantic information retrieval using large language models. It focuses on analyzing complex and heterogeneous documents by combining multiple processing stages such as multimodal document parsing, vector indexing, and intelligent retrieval. It follows the Retrieval-Augmented Generation (RAG) paradigm, where relevant document segments are retrieved and used by language models to generate accurate, context-aware responses. ...

Downloads: 0 This Week

Last Update: 2026-04-15
10

memsearch

A Markdown-first memory system, a standalone library for any AI agent

memsearch is a markdown-first memory system designed to provide long-term memory capabilities for AI agents through structured storage and semantic retrieval. It enables agents to store, organize, and retrieve information using embeddings and hybrid search techniques, ensuring that relevant context is always available. The system supports advanced features such as reranking and progressive disclosure, which help prioritize the most useful information for a given query. It integrates with vector databases like Milvus, enabling scalable storage and retrieval of large datasets. ...

Downloads: 1 This Week

Last Update: 2 days ago
11

Erethos Downloader

Save Erothots content including leaked videos and premium collections

Erethos Downloader is an automation tool designed to download media content from supported adult content platforms, enabling users to archive videos and images locally for offline access. The project focuses on simplifying the retrieval process by allowing users to input URLs or account information and automatically fetch associated content in bulk. It includes mechanisms for handling authentication and session data, ensuring access to content that may require login credentials. The downloader is designed to manage large batches efficiently, reducing manual effort when saving multiple posts or galleries. ...

1 Review

Downloads: 86 This Week

Last Update: 2026-03-18
12

GBrain

Garry's Opinionated OpenClaw/Hermes Agent Brain

...It also organizes knowledge into structured documents with summaries and timelines, helping agents maintain context and track changes in information.

Downloads: 13 This Week

Last Update: 5 days ago
13

Hindsight

Hindsight: Agent Memory That Learns

Hindsight is an advanced, open-source memory system for AI agents designed to enable long-term learning, reasoning, and consistency across interactions by treating memory as a first-class component of intelligence rather than a simple retrieval layer. It addresses one of the core limitations of modern AI agents, which is their inability to retain and meaningfully use past experiences over time, by introducing a structured, biomimetic memory architecture inspired by how human memory works. Instead of relying solely on vector similarity or basic retrieval techniques, Hindsight organizes information into distinct categories such as facts, experiences, beliefs, and observations, allowing agents to differentiate between raw data and inferred knowledge. ...

Downloads: 3 This Week

Last Update: 3 days ago
14

SAG

SQL-Driven RAG Engine

...These vectors allow the system to identify relationships between concepts and construct a graph representation of knowledge at runtime. The architecture also includes a three-stage retrieval pipeline consisting of recall, expansion, and reranking steps to improve search accuracy. The engine integrates semantic vector similarity with traditional full-text search to improve both recall and precision. Because the knowledge graph is generated dynamically, the system can adapt to new information without requiring manual graph maintenance.

Downloads: 0 This Week

Last Update: 2026-03-09
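
The recall, expansion, and reranking stages mentioned in the SAG description can be sketched as follows. This is an assumption-laden illustration, not SAG's implementation: `DOCS`, `hybrid_score`, `recall`, `expand`, `rerank`, and `search` are hypothetical names, the "vector" score is a bag-of-words cosine standing in for real embeddings, and shared-term expansion stands in for the runtime knowledge graph.

```python
import math
from collections import Counter

# Hypothetical in-memory corpus; a real engine would query SQL tables instead.
DOCS = {
    "d1": "postgres full text search index",
    "d2": "vector similarity search with embeddings",
    "d3": "embeddings for graph construction",
    "d4": "cooking pasta at home",
}

def tokens(text):
    return text.lower().split()

def vector_score(query, doc):
    """Cosine similarity over term counts (stand-in for embedding similarity)."""
    q, d = Counter(tokens(query)), Counter(tokens(doc))
    dot = sum(q[t] * d[t] for t in q if t in d)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0

def keyword_score(query, doc):
    """Full-text stand-in: fraction of query terms that appear in the document."""
    q_terms, d_terms = set(tokens(query)), set(tokens(doc))
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def hybrid_score(query, doc):
    """Equal-weight blend of the semantic and full-text scores."""
    return 0.5 * vector_score(query, doc) + 0.5 * keyword_score(query, doc)

def recall(query, k=2):
    """Stage 1: take the top-k documents with a non-zero hybrid score."""
    scored = [(hybrid_score(query, text), doc_id) for doc_id, text in DOCS.items()]
    scored = [pair for pair in scored if pair[0] > 0]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)[:k]]

def expand(doc_ids):
    """Stage 2: add documents that share a term with any recalled document."""
    seen = set()
    for doc_id in doc_ids:
        seen |= set(tokens(DOCS[doc_id]))
    extra = [d for d, text in DOCS.items() if d not in doc_ids and seen & set(tokens(text))]
    return list(doc_ids) + extra

def rerank(query, doc_ids):
    """Stage 3: order the expanded candidates by hybrid score, best first."""
    return sorted(doc_ids, key=lambda d: hybrid_score(query, DOCS[d]), reverse=True)

def search(query, k=2):
    return rerank(query, expand(recall(query, k)))

result = search("vector search")
print(result)
```

In this toy corpus the expansion stage pulls in a document that shares no term with the query itself, which is the kind of candidate a recall-only pipeline would miss.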
15

QuivrHQ

Opinionated RAG for integrating GenAI in your apps

Quivr is an open-source platform that leverages Retrieval-Augmented Generation (RAG) to integrate Generative AI into applications. It serves as a "second brain," enabling users to build powerful AI-driven assistants that can process and retrieve information efficiently. Quivr supports various large language models and vector stores, providing flexibility and customization for developers.

Downloads: 0 This Week

Last Update: 2025-05-30
16

Dynamiq

An orchestration framework for agentic AI and LLM applications

Dynamiq is an open-source orchestration framework designed to streamline the development of generative AI applications that rely on large language models and autonomous agents. The framework focuses on simplifying the creation of complex AI workflows that involve multiple agents, retrieval systems, and reasoning steps. Instead of building each component manually, developers can use Dynamiq’s structured APIs and modular architecture to connect language models, vector databases, and external tools into cohesive pipelines. The framework supports the creation of multi-agent systems where different AI agents collaborate to solve tasks such as information retrieval, document analysis, or automated decision making. ...

Downloads: 1 This Week

Last Update: 2 days ago
17

MCP Server RAG Web Browser

An MCP Server for the RAG Web Browser Actor

The MCP Server for the RAG Web Browser Actor allows AI assistants and LLMs to perform web searches and extract information from web pages. It facilitates interaction with the web, enabling up-to-date context retrieval for AI applications.

Downloads: 0 This Week

Last Update: 2025-08-21
18

Qwen3-VL-Embedding

Multimodal embedding and reranking models built on Qwen3-VL

...The core embedding model maps such inputs into semantically rich vectors in a unified representation space, enabling similarity search, clustering, and cross-modal retrieval. The reranking model then precisely scores relevance between a given query and candidate documents, enhancing retrieval accuracy in complex multimodal tasks. Together, they support advanced information retrieval workflows such as image-text search, visual question answering (VQA), and video-text matching, while providing out-of-the-box support for more than 30 languages.

Downloads: 0 This Week

Last Update: 2026-04-08
19

MemoryOS

MemoryOS is designed to provide a memory operating system

MemoryOS is an open-source framework designed to provide a structured memory management system for AI agents and large language model applications. The project addresses one of the major limitations of modern language models: their inability to maintain long-term context beyond the limits of their prompt window. MemoryOS introduces a hierarchical memory architecture inspired by operating system memory management principles, allowing agents to store, update, retrieve, and generate information...

Downloads: 3 This Week

Last Update: 2026-03-09
20

gensim

Topic Modelling for Humans

Gensim is a Python library for topic modeling, document indexing, and similarity retrieval with large corpora. The target audience is the natural language processing (NLP) and information retrieval (IR) community.

Downloads: 2 This Week

Last Update: 2025-10-16
21

WebGLM

An Efficient Web-enhanced Question Answering System

WebGLM is a web-enhanced question-answering system that combines a large language model with web search and retrieval capabilities to produce more accurate answers. The system is based on the General Language Model architecture and was designed to enable language models to interact directly with web information during the question-answering process. Instead of relying solely on knowledge stored in the model’s training data, the system retrieves relevant web content and integrates it into the reasoning process. ...

Downloads: 0 This Week

Last Update: 2026-03-06
22

Supermemory

Memory engine and app that is extremely fast, scalable

Supermemory is an ambitious and extensible AI-powered personal knowledge management system that aims to help users capture, organize, retrieve, and reason over information in a manner that mimics human memory structures. The platform allows individuals to ingest text, documents, and other content forms, then uses advanced retrieval and embedding techniques to index and relate information intelligently so that users can recall relevant knowledge in context rather than just by keyword match. It often incorporates clustering, semantic search, and summarization modules to reduce cognitive load and surface key ideas, which makes it useful for research, study, writing, and long-term project tracking. ...

Downloads: 1 This Week

Last Update: 3 days ago
23

pyAudioAnalysis

Python Audio Analysis Library: Feature Extraction, Classification

pyAudioAnalysis is an open-source Python library designed for audio signal analysis, machine learning, and music information retrieval tasks. The project provides a collection of tools that allow developers to extract meaningful features from audio files and use those features for classification, segmentation, and analysis. The library supports multiple audio processing workflows, including feature extraction from raw audio signals, training of machine learning models, and automatic audio segmentation. ...

Downloads: 2 This Week

Last Update: 2026-03-10
24

Kernel Memory

Research project. A Memory solution for users, teams, and applications

Kernel Memory is an open-source reference architecture developed by Microsoft to help developers build memory systems for AI applications powered by large language models. The project focuses on enabling applications to store, index, and retrieve information so that AI systems can incorporate external knowledge when generating responses. It supports scenarios such as document ingestion, semantic search, and retrieval-augmented generation, allowing language models to answer questions using contextual information from private or enterprise datasets. Kernel Memory can ingest documents in multiple formats, process them into embeddings, and store them in searchable indexes. ...

Downloads: 0 This Week

Last Update: 2026-03-06
25

ChatWiki

ChatWiki WeChat official account's AI knowledge base workflow agent

ChatWiki is an open-source AI knowledge base and workflow automation platform designed to help organizations build intelligent question-answering systems using large language models and retrieval-augmented generation techniques. The system enables companies to transform internal documents and data into searchable knowledge bases that can power AI assistants capable of answering domain-specific questions. It provides a complete pipeline for ingesting documents, preprocessing and segmenting content, generating vector embeddings, and retrieving relevant information during conversations. ...

Downloads: 1 This Week

Last Update: 1 day ago