A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
Machine learning software for extracting information
Extract and convert data from any document, images, PDFs, Word doc
Structured data extraction and instruction calling with ML, LLM
No-code LLM Platform to launch APIs and ETL Pipelines
Zero-copy PDF text extraction library written in Zig
CLI tool to extract (meta)data from PDF and manipulate PDF files
Fast and efficient unstructured data extraction
ExtractThinker is a Document Intelligence library for LLMs
Fast, local-first web content extraction for LLMs
MD/.JSON Document OCR and structured data extraction API
Command-line XML and HTML beautifier and content extractor
Turn entire websites into LLM-ready markdown or structured data
PDF Parser for AI-ready data. Automate PDF accessibility
Extract internal monitoring data from application logs
Crawl a website starting from a URL, find relevant pages
A super-easy, composable, web server framework for warp speeds
A beautiful, cross-platform downloader for YouTube, TikTok, Instagram
JavaScript OCR and text extraction for images and PDFs
Library for extracting streaming site data without official APIs
ContextGem: Effortless LLM extraction from documents
Open source NLP guide with models, methods, and real use cases
Document (PDF, Word, PPTX ...) extraction and parsing API
A fast, helpful, and open-source document parser
Turn any technical book PDF into a Claude Code skill