dude uncomplicated data extraction: A simple framework
ExtractThinker is a Document Intelligence library for LLMs
CLI tool to extract (meta)data from PDF and manipulate PDF files
Turn entire websites into LLM-ready markdown or structured data
MD/.JSON Document OCR and structured data extraction API
Structured data extraction and instruction calling with ML, LLM
Crawl a website starting from a URL, find relevant pages
Fast, local-first web content extraction for LLMs
Unreal Engine Archives Explorer
AI-first Ruby framework for building fast, flexible web scraping spide
PDF Parser for AI-ready data. Automate PDF accessibility
No-code LLM Platform to launch APIs and ETL Pipelines
Fast and efficient unstructured data extraction
Model Context Protocol server that integrates AgentQL's data
Automatic extraction of relevant features from time series
Make websites accessible for AI agents
Extract and convert data from any document, images, PDFs, word doc
Clean network diagrams. One-time setup, zero upkeep
Enhance any agent's browser use skill
Claude Code skill for generating production-quality SVG+PNG technical
Flexible Node.js AI-assisted crawler library
Open source web scraping system for automated data collection tasks
ContextGem: Effortless LLM extraction from documents
Library for extracting streaming site data without official APIs
Document content and metadata extraction microservice