A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
Structured data extraction and instruction calling with ML, LLM
No-code LLM Platform to launch APIs and ETL Pipelines
ExtractThinker is a Document Intelligence library for LLMs
Document content and metadata extraction microservice
Open source NLP guide with models, methods, and real use cases
ContextGem: Effortless LLM extraction from documents
Document (PDF, Word, PPTX ...) extraction and parse API
A high-quality tool for converting PDF to Markdown and JSON
Python Audio Analysis Library: Feature Extraction, Classification
End-to-end pipeline converting generative videos
A Simple and Universal Swarm Intelligence Engine
Claude Code skill for generating production-quality SVG+PNG technical
OCR software, free and offline
PyTorch code and models for the DINOv2 self-supervised learning
AI video generator optimized for use on low-VRAM and older GPUs
kaldi-asr/kaldi is the official location of the Kaldi project
Your Fully-Automated Personal AI Assistant
From Paper to Presentation in One Click
An on-premises, OCR-free unstructured data extraction
Synthetic data curation for post-training and data extraction
Open-source evaluation toolkit of large multi-modality models (LMMs)
NLP Cloud serves high performance pre-trained or custom models for NER
Reference PyTorch implementation and models for DINOv3
Get your documents ready for gen AI