dude uncomplicated data extraction: A simple framework
ExtractThinker is a Document Intelligence library for LLMs
CLI tool to extract (meta)data from PDF and manipulate PDF files
Clean network diagrams, One-time setup, zero upkeep
Turn entire websites into LLM-ready markdown or structured data
Structured data extraction and instruction calling with ML, LLM
MD/.JSON Document OCR and structured data extraction API
Did you say you like data?
AI-ready web crawler that extracts and structures website content
Unreal Engine Archives Explorer
No-code LLM Platform to launch APIs and ETL Pipelines
Crawl a website starting from a URL, find relevant pages
Fast and efficient unstructured data extraction
Model Context Protocol server that integrates AgentQL's data
Flexible Node.js AI-assisted crawler library
ContextGem: Effortless LLM extraction from documents
Make websites accessible for AI agents
A library for audio and music analysis, feature extraction
BlockArrays for Julia
AI-first Ruby framework for building fast, flexible web scraping spiders
Open source web scraping system for automated data collection tasks
Automatic extraction of relevant features from time series
Zero-copy PDF text extraction library written in Zig
Eases DOM navigation for HTML and XML documents
Extract and convert data from any document, images, PDFs, Word doc