Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Faster Whisper transcription with CTranslate2
Crowdsourcing platform for full text transcription and tagging
Open Source Computer Vision Library
Library for OCR-related tasks powered by Deep Learning
Voice Recognition to Text Tool
Book_4_Matrix Power | The Iris Book: From Addition, Subtraction
Han Language Processing
Training data (data labeling, annotation, workflow) for all data types
Enhances Tesseract OCR output using LLMs (local or API)
Toolkit for conversational AI
Formula recognition based on LaTeX-OCR and ONNXRuntime
CLI tool to extract (meta)data from PDF and manipulate PDF files
Replace OpenAI GPT with another LLM in your app
Framework for building real-time voice and multimodal AI agents
Omnilingual ASR: Open-Source Multilingual Speech Recognition
Open source AI VTuber platform with voice chat and Live2D avatars
Accurate × Fast × Comprehensive
OCRmyPDF adds an OCR text layer to scanned PDF files
The no-nonsense RAG chunking library
Persian NLP Toolkit
From Addition, Subtraction, Multiplication, and Division to ML
Open source annotation tool for machine learning practitioners
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
Fast multimodal LLM for real-time voice interaction and AI apps