Awesome multilingual OCR toolkits based on PaddlePaddle
Contexts Optical Compression
OCR software, free and offline
Crowdsourcing platform for full text transcription and tagging
Audio foundation model excelling in audio understanding
A framework to enable multimodal models to operate a computer
OCRmyPDF adds an OCR text layer to scanned PDF files
Enhances Tesseract OCR output using LLMs (local or API)
Accurate × Fast × Comprehensive
Visual Causal Flow
A simple tool for reading in poorly redacted documents
OCR expert VLM powered by Hunyuan's native multimodal architecture
Python Audio Analysis Library: Feature Extraction, Classification
On-premises, OCR-free unstructured data extraction
A ranked list of awesome machine learning Python libraries
A web UI for easy subtitle generation using the Whisper model
A Python application to add watermarks (text or image) to PDF files
Img2Txt - Extract text from images using AI
The ultimate tool to automate custom Telegram message forwarding
tom_core - a tool for automating events on a computer
Convolutional neural network model for video classification
Optical Music Recognition for Tablature Notations
A pygame music library.