dude uncomplicated data extraction: A simple framework
ExtractThinker is a Document Intelligence library for LLMs
CLI tool to extract (meta)data from PDF and manipulate PDF files
Turn entire websites into LLM-ready markdown or structured data
MD/.JSON Document OCR and structured data extraction API
Structured data extraction and instruction calling with ML, LLM
Crawl a website starting from a URL, find relevant pages
Fast, local-first web content extraction for LLMs
Unreal Engine Archives Explorer
PDF Parser for AI-ready data. Automate PDF accessibility
No-code LLM Platform to launch APIs and ETL Pipelines
Clean network diagrams, One-time setup, zero upkeep
Fast and efficient unstructured data extraction
Model Context Protocol server that integrates AgentQL's data
Automatic extraction of relevant features from time series
Extract and convert data from any document, images, PDFs, Word doc
Enhance any agent's browser use skill
Flexible Node.js AI-assisted crawler library
ContextGem: Effortless LLM extraction from documents
AI-first Ruby framework for building fast, flexible web scraping spide
Claude Code skill for generating production-quality SVG+PNG technical
Document content and metadata extraction microservice
AI-ready web crawler that extracts and structures website content
Declarative web scraping
Open source web scraping system for automated data collection tasks