Did you say you like data?
Document (PDF, Word, PPTX, ...) extraction and parsing API
Extract structured data from webpages using LLM-powered scraping
OCR model for complex documents with layout-aware structured outputs
On-premises, OCR-free unstructured data extraction
JavaScript OCR and text extraction for images and PDFs
Create prompt-friendly codebase digests from any Git repository URL
Discover pretrained models for deep learning in MATLAB
Self-hosted AI audio transcription
Contexts Optical Compression
Memory Management Kit for Agents
Crawl a website starting from a URL, find relevant pages
Self-hosted AI accounting app. LLM analyzer for receipts
Semantic search and document parsing tools for the command line
Open source semantic search and text analytics for large document sets
Photorealistic Synthetic Dataset for Holistic Indoor Scene
Automated translation solution for visual novels
Handwritten Text Recognition (HTR) system implemented with TensorFlow
Python Audio Analysis Library: Feature Extraction, Classification
Open source and self-hostable browser automation library for AI agents
Fast and efficient unstructured data extraction
An AI-powered research assistant that performs iterative research
Quick illustration of how one can easily read books together with LLMs
AI-Powered Wiki Generator for GitHub/GitLab/Bitbucket Repositories
Allow LLMs to control a browser with Browserbase and Stagehand