Robust Speech Recognition via Large-Scale Weak Supervision
Multilingual speech recognition and audio understanding model
Contexts Optical Compression
High-Performance Face Recognition Library on PaddlePaddle & PyTorch
Handwritten Text Recognition (HTR) system implemented with TensorFlow
Audio foundation model excelling in audio understanding
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Open source AI VTuber platform with voice chat and Live2D avatars
Open source semantic search and text analytics for large document sets
Accurate × Fast × Comprehensive
Run local LLMs like Llama, DeepSeek, Kokoro, etc. inside your browser
Fast multimodal LLM for real-time voice interaction and AI apps
Large language model for livestream sales hosts
Open speech-to-speech models and pipelines by Hugging Face
Flock is a workflow-based low-code platform for building chatbots
Interactive Machine Learning experiments
Framework for building real-time voice and multimodal AI agents
The media player for language learning, with dual subtitles
Research and application of technologies such as natural language processing
Video understanding codebase from FAIR for reproducing video models
A proof-of-concept Jupyter extension which converts English queries
Advanced NLP with spaCy: A free online course
Large Audio Language Model built for natural interactions
StreamSpeech is a seamless model for offline speech recognition
Foundational Models for State-of-the-Art Speech and Text Translation