Awesome multilingual OCR toolkits based on PaddlePaddle
Robust Speech Recognition via Large-Scale Weak Supervision
State-of-the-art 2D and 3D Face Analysis Project
Lightweight Face Recognition and Facial Attribute Analysis
OCR software, free and offline
Speech recognition module for Python
Multilingual speech recognition and audio understanding model
Contexts Optical Compression
High-Performance Face Recognition Library on PaddlePaddle & PyTorch
Handwritten Text Recognition (HTR) system implemented with TensorFlow
Open-source industrial-grade ASR models
kaldi-asr/kaldi is the official location of the Kaldi project
A PyTorch-based Speech Toolkit
Audio foundation model excelling in audio understanding
Robust Speech Recognition Across Languages, Dialects
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
Faster Whisper transcription with CTranslate2
Automatic Speech Recognition with Word-level Timestamps
Library for OCR-related tasks powered by Deep Learning
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Multilingual Automatic Speech Recognition with word-level timestamps
A full spaCy pipeline and models for scientific/biomedical documents
Enhances Tesseract OCR output using LLMs (local or API)
Replace OpenAI GPT with another LLM in your app
Training data (data labeling, annotation, workflow) for all data types