Extract and convert data from any document: images, PDFs, Word docs
MD/.JSON Document OCR and structured data extraction API
Machine learning software for extracting information
Structured data extraction and instruction calling with ML, LLM
No-code LLM Platform to launch APIs and ETL Pipelines
CLI tool to extract (meta)data from PDF and manipulate PDF files
Fast and efficient unstructured data extraction
ExtractThinker is a Document Intelligence library for LLMs
Fast, local-first web content extraction for LLMs
Zero-copy PDF text extraction library written in Zig
Command-line XML and HTML beautifier and content extractor
Turn entire websites into LLM-ready markdown or structured data
Document content and metadata extraction microservice
Extract internal monitoring data from application logs
A beautiful, cross-platform downloader for YouTube, TikTok, Instagram
Open source NLP guide with models, methods, and real use cases
ContextGem: Effortless LLM extraction from documents
JavaScript OCR and text extraction for images and PDFs
Crawl a website starting from a URL, find relevant pages
Document (PDF, Word, PPTX ...) extraction and parsing API
A super-easy, composable, web server framework for warp speeds
Turn any technical book PDF into a Claude Code skill
Official Vectorize MCP Server
Flexible Node.js AI-assisted crawler library
Unreal Engine Archives Explorer