AutoLLM is an open-source Python framework for building retrieval-augmented generation (RAG) applications quickly and exposing them as services with minimal setup. It folds the usual stack of model selection, document ingestion, vector storage, querying, and API deployment into a single developer experience. The core idea is that a developer can create a query engine from a document set in a few lines of code and then turn that same engine into a FastAPI application with one more line. AutoLLM supports a broad range of language models and vector databases, so teams can switch providers without rewriting their application architecture. The framework also includes built-in readers for content sources such as PDFs, DOCX files, notebooks, and websites, which shortens the time between raw data and a working knowledge application.
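The documents-to-engine-to-API flow described above might look roughly like the sketch below. It assumes the `autollm` package exposes `read_files_as_documents`, `AutoQueryEngine.from_defaults`, and `AutoFastAPI.from_query_engine` as in the project's published examples; check these names against the installed version. The helper names (`build_query_engine`, `build_app`, `main`), the directory path, and the sample question are placeholders introduced here for illustration.

```python
def build_query_engine(input_dir: str):
    """Read every supported file under input_dir and build a RAG query engine."""
    # Imported lazily so this module still loads where autollm is not installed.
    # Assumption: these names match the installed autollm release.
    from autollm import AutoQueryEngine, read_files_as_documents

    documents = read_files_as_documents(input_dir=input_dir)
    return AutoQueryEngine.from_defaults(documents)


def build_app(query_engine):
    """Wrap an existing query engine in a FastAPI application."""
    from autollm import AutoFastAPI  # assumption: same caveat as above

    return AutoFastAPI.from_query_engine(query_engine)


def main() -> None:
    """Build the engine, ask one question, then serve the engine over HTTP."""
    import uvicorn

    engine = build_query_engine("path/to/documents")  # placeholder path
    print(engine.query("What is this document set about?"))  # placeholder question
    uvicorn.run(build_app(engine), host="0.0.0.0", port=8000)
```

Calling `main()` would need `autollm` and `uvicorn` installed, plus credentials for whichever model provider is configured. Keeping the engine and the app in separate helpers reflects the point that the same engine object serves both direct queries and the HTTP service.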
Features
- Unified API across 100+ language models
- Support for 20+ vector databases
- One-line creation of RAG query engines
- One-line conversion to FastAPI apps
- Built-in readers for files, notebooks, websites, and documents
- Automated token and cost calculation across supported models
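To make the token and cost calculation feature concrete, here is a minimal standalone sketch of the arithmetic such a feature typically performs. It is not AutoLLM's implementation: the model names, the per-1,000-token prices, and the `estimate_cost` helper are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPricing:
    """Price in USD per 1,000 tokens, split by prompt and completion."""
    prompt_usd_per_1k: float
    completion_usd_per_1k: float


# Hypothetical models and prices, for illustration only.
PRICING = {
    "example-small": ModelPricing(prompt_usd_per_1k=0.0005, completion_usd_per_1k=0.0015),
    "example-large": ModelPricing(prompt_usd_per_1k=0.01, completion_usd_per_1k=0.03),
}


def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost of one request for the given model."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    pricing = PRICING[model]  # raises KeyError for an unknown model
    prompt_cost = prompt_tokens / 1000 * pricing.prompt_usd_per_1k
    completion_cost = completion_tokens / 1000 * pricing.completion_usd_per_1k
    return prompt_cost + completion_cost
```

Keeping prices in a single table keyed by model name is one way to make the same calculation work across providers: supporting another model means adding a table entry, not changing the function.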