PixelRAG is a visual retrieval-augmented generation (RAG) system that searches documents by how they look, not only by the text they contain. It renders web pages, PDFs, and images into screenshot tiles, then runs retrieval over those tiles rather than over extracted text. This preserves layout, tables, charts, diagrams, infographics, and other visual structure that HTML or text parsing can miss. The project includes tools for rendering, chunking, embedding, indexing, and serving visual search indexes. It also provides a hosted API with a prebuilt Wikipedia index, plus local pipelines for building indexes from custom documents. PixelRAG can be used with Claude Code through the pixelbrowse skill, which lets agents inspect pages visually instead of relying only on raw markup.
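To make the "screenshot tiles" idea concrete, here is a minimal sketch of how a tall rendered page might be cut into overlapping vertical strips before embedding. It is an illustration only: the `Tile` class, the `tile_page` function, and the tile height and overlap values are hypothetical and are not taken from PixelRAG's code or configuration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Tile:
    """One horizontal strip of a rendered page, in pixel coordinates."""
    index: int
    top: int      # inclusive
    bottom: int   # exclusive


def tile_page(page_height: int, tile_height: int = 1024, overlap: int = 128) -> List[Tile]:
    """Split a page of `page_height` pixels into overlapping tiles.

    The defaults are illustrative assumptions, not PixelRAG's settings.
    Overlap keeps content that straddles a cut visible in both neighbours.
    """
    if page_height <= 0:
        return []
    if not 0 <= overlap < tile_height:
        raise ValueError("overlap must satisfy 0 <= overlap < tile_height")

    step = tile_height - overlap
    tiles: List[Tile] = []
    top = 0
    while True:
        bottom = min(top + tile_height, page_height)
        tiles.append(Tile(index=len(tiles), top=top, bottom=bottom))
        if bottom == page_height:  # last tile reaches the end of the page
            break
        top += step
    return tiles


if __name__ == "__main__":
    for tile in tile_page(2500, tile_height=1000, overlap=100):
        print(tile)
```

Each `(top, bottom)` pair could then be used to crop the full-page screenshot with an image library; the cropping and rendering steps are left out because they depend on tooling the description above does not specify.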
Features
- Screenshot-based document retrieval
- Support for web pages, PDFs, and images
- Visual indexing with FAISS
- Hosted Wikipedia search API
- Local indexing pipeline for custom documents
- Claude Code pixelbrowse integration
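
As a rough picture of what retrieval over an index of tile embeddings involves, the sketch below ranks tiles by cosine similarity to a query vector. It is a brute-force stand-in written in plain Python, not FAISS and not PixelRAG's pipeline: the `normalize` and `search` functions are hypothetical, and the toy 2-dimensional vectors stand in for embeddings that a real system would get from a vision model.

```python
import math
from typing import List, Sequence, Tuple


def normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned as zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0] * len(vec)
    return [x / norm for x in vec]


def search(
    query: Sequence[float],
    tile_embeddings: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return the k (tile_id, cosine_similarity) pairs closest to the query.

    A vector index library would replace this linear scan at scale;
    the ranking idea is the same.
    """
    q = normalize(query)
    scored = []
    for tile_id, embedding in enumerate(tile_embeddings):
        e = normalize(embedding)
        score = sum(a * b for a, b in zip(q, e))  # dot product of unit vectors
        scored.append((tile_id, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy embeddings for three tiles (assumed values for illustration).
    tiles = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for tile_id, score in search([1.0, 0.1], tiles, k=2):
        print(tile_id, round(score, 3))
```

The returned tile ids would map back to screenshot tiles, which is what allows a downstream model to look at the matching region of a page rather than at extracted text.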