MiniRAG is a lightweight retrieval-augmented generation (RAG) tool that brings RAG workflows to smaller datasets, edge environments, and constrained compute settings by simplifying embedding, indexing, and retrieval. It extracts text from documents, code, or other structured inputs, converts it into embeddings with efficient models, and stores the vectors for fast nearest-neighbor search, with no heavyweight database or separate vector server required. When a query is issued, MiniRAG retrieves the most relevant contexts and feeds them to a generative model, producing an answer grounded in the source material rather than hallucinated.

Its minimal footprint makes it suitable for local research assistants, chatbots, help desks, and knowledge bases embedded in resource-constrained applications. Despite its simplicity, it includes chunking logic, configurable embedding models, and optional caching to balance performance and accuracy.
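The embed-index-retrieve-generate flow described above can be sketched in miniature. Everything here is illustrative, not MiniRAG's actual API: a toy bag-of-words embedding and a linear list scan stand in for a real embedding model and vector index.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; a real setup would use a compact model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class MiniIndex:
    """In-memory vector store: a plain list scan, no separate vector server."""
    def __init__(self):
        self.entries = []  # (embedding, source text) pairs

    def add(self, text):
        self.entries.append((embed(text), text))

    def retrieve(self, query, k=2):
        """Return the k source texts closest to the query embedding."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [text for _, text in ranked[:k]]

index = MiniIndex()
index.add("Retrieval feeds the top-ranked contexts to a generative model.")
index.add("Embeddings are stored in a local in-memory index.")
index.add("Unrelated filler text about something else entirely.")

contexts = index.retrieve("how does retrieval feed the model", k=2)
# The retrieved contexts become the grounding prompt for the generator.
prompt = "Context:\n" + "\n".join(contexts) + "\nQuestion: how does retrieval feed the model"
```

The design point this illustrates is the one the paragraph above makes: for small corpora, a flat in-memory scan is fast enough that no external vector database is needed.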
Features
- Lightweight embedding and indexing
- Fast nearest-neighbor retrieval
- Query-driven generative output grounded in source text
- Configurable chunking and context limits
- Minimal compute and dependency footprint
- Easy to integrate into local apps and bots
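As an illustration of the configurable-chunking idea listed above, a minimal overlapping chunker might look like this. The function name and defaults are hypothetical, not MiniRAG's API; overlap between adjacent chunks keeps context from being lost at chunk boundaries.

```python
def chunk(text, size=40, overlap=10):
    """Split text into fixed-size character chunks overlapping by `overlap`
    characters. Illustrative defaults; assumes size > overlap."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk("x" * 100)  # three 40-char chunks, each overlapping the next by 10
```

In a real configuration, chunk size would typically be tuned in tokens rather than characters, trading retrieval precision (small chunks) against context completeness (large chunks).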