The Iris Book: Addition, Subtraction, Multiplication, and Division
Robust Speech Recognition via Large-Scale Weak Supervision
Multilingual speech recognition and audio understanding model
Contexts Optical Compression
High-Performance Face Recognition Library on PaddlePaddle & PyTorch
Handwritten Text Recognition (HTR) system implemented with TensorFlow
Audio foundation model excelling in audio understanding
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Book_4_Matrix Power | The Iris Book: From Addition, Subtraction
Framework for building real-time voice and multimodal AI agents
Open source AI VTuber platform with voice chat and Live2D avatars
Accurate × Fast × Comprehensive
The no-nonsense RAG chunking library
From Addition, Subtraction, Multiplication, and Division to ML
Fast multimodal LLM for real-time voice interaction and AI apps
Large language model for live-stream sales hosts
A proof-of-concept Jupyter extension which converts English queries
Open speech-to-speech models and pipelines by Hugging Face
Large Audio Language Model built for natural interactions
StreamSpeech is a seamless model for offline speech recognition
Video understanding codebase from FAIR for reproducing video models
A simple tool for reading in poorly redacted documents
Visual Causal Flow
Real-time voice interactive digital human
Advanced NLP with spaCy: A free online course