Machine learning software for extracting information
CLI tool to extract (meta)data from PDF and manipulate PDF files
ExtractThinker is a Document Intelligence library for LLMs
Command-line XML and HTML beautifier and content extractor
Structured data extraction and instruction calling with ML, LLM
A library for audio and music analysis, feature extraction
Document (PDF, Word, PPTX ...) extraction and parse API
Zero-copy PDF text extraction library written in Zig
Turn entire websites into LLM-ready markdown or structured data
ContextGem: Effortless LLM extraction from documents
Flexible Node.js AI-assisted crawler library
Extract and convert data from any document, images, PDFs, Word doc
A high-quality tool for converting PDF to Markdown and JSON
Crawl a website starting from a URL, find relevant pages
Fast and efficient unstructured data extraction
Open source NLP guide with models, methods, and real use cases
No-code LLM Platform to launch APIs and ETL Pipelines
MD/.JSON Document OCR and structured data extraction API
JavaScript OCR and text extraction for images and PDFs
Automatic extraction of relevant features from time series
Make websites accessible for AI agents
Document content and metadata extraction microservice
Extract internal monitoring data from application logs
A Simple and Universal Swarm Intelligence Engine
ZipArchive is a simple utility class for zipping and unzipping files