Edit PDF files with Nano Banana
OCRmyPDF adds an OCR text layer to scanned PDF files
A Python tool to help extract information from structured PDFs
Document (PDF, Word, PPTX ...) extraction and parsing API
CLI tool to extract (meta)data from PDF and manipulate PDF files
Python bindings for MuPDF's rendering library.
A pure-Python PDF library capable of splitting, merging, cropping
Zero-copy PDF text extraction library written in Zig
borb is a library for reading, creating and manipulating PDF files
Video-based AI memory library. Store millions of text chunks in MP4
OCR software, free and offline
A simple tool for reading in poorly redacted documents
A library for converting HTML into PDFs using ReportLab
A high-quality PDF to Markdown tool based on large language model
Open-Source Python3 tool for recognizing layouts, tables, and math
A community-supported supercharged version of paperless
Generate audiobooks from EPUBs, PDFs and text with captions
Reading book source
The best free open source website change detection and restock service
The Markdown Editor for Linux
Open Source Document Management System for Digital Archives
High accuracy RAG for answering questions from scientific documents
PDF to Markdown with vision models
A minimalist command line knowledge base manager
Chinese version of Google open source project style guide