Awesome multilingual OCR toolkits based on PaddlePaddle
Robust Speech Recognition via Large-Scale Weak Supervision
Handwritten Text Recognition (HTR) system implemented with TensorFlow
Speech recognition module for Python
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
OCR software, free and offline
Contexts Optical Compression
Library for OCR-related tasks powered by Deep Learning
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
A full spaCy pipeline and models for scientific/biomedical documents
Voice Recognition to Text Tool
Automatic Speech Recognition with Word-level Timestamps
Toolkit for conversational AI
Audio foundation model excelling in audio understanding
Open-source industrial-grade ASR models
Open source annotation tool for machine learning practitioners
Underthesea - Vietnamese NLP Toolkit
OCRmyPDF adds an OCR text layer to scanned PDF files
Accurate × Fast × Comprehensive
Faster Whisper transcription with CTranslate2
NLP Cloud serves high-performance pre-trained or custom models for NER
Enhances Tesseract OCR output using LLMs (local or API)
The behavior guidance framework for customer-facing LLM agents
Han Language Processing
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD