A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
Extract and convert data from any document: images, PDFs, Word docs
Machine learning software for extracting information
Structured data extraction and instruction calling with ML, LLM
No-code LLM Platform to launch APIs and ETL Pipelines
CLI tool to extract (meta)data from PDF and manipulate PDF files
Fast and efficient unstructured data extraction
ExtractThinker is a Document Intelligence library for LLMs
Fast, local-first web content extraction for LLMs
Zero-copy PDF text extraction library written in Zig
MD/.JSON Document OCR and structured data extraction API
Command-line XML and HTML beautifier and content extractor
Extract internal monitoring data from application logs
Turn entire websites into LLM-ready markdown or structured data
Document content and metadata extraction microservice
PDF Parser for AI-ready data. Automate PDF accessibility
A beautiful, cross-platform downloader for YouTube, TikTok, Instagram
Open source NLP guide with models, methods, and real use cases
ContextGem: Effortless LLM extraction from documents
JavaScript OCR and text extraction for images and PDFs
Crawl a website starting from a URL, find relevant pages
Library for extracting streaming site data without official APIs
Document (PDF, Word, PPTX ...) extraction and parsing API
A super-easy, composable, web server framework for warp speeds
Turn any technical book PDF into a Claude Code skill