Face recognition with deep neural networks
State-of-the-art 2D and 3D Face Analysis Project
A Lightweight Face Recognition and Facial Attribute Analysis
Awesome multilingual OCR toolkits based on PaddlePaddle
Repo of Qwen2-Audio chat & pretrained large audio language model
Speech-to-text, text-to-speech, and speaker recognition
Leading free and open-source liveness check & face recognition system
Contexts Optical Compression
Audio foundation model excelling in audio understanding
Integrating LLMs into structured NLP pipelines
kaldi-asr/kaldi is the official location of the Kaldi project
Graphical User Interface Face Anonymization Tool
Synchronized Translation for Videos
A PyTorch-based Speech Toolkit
StreamSpeech is a seamless model for offline speech recognition
A chatbot built on a large model
Capable of understanding text, audio, vision, video
Large Audio Language Model built for natural interactions
Visual Causal Flow
Real-time voice interactive digital human
A framework to enable multimodal models to operate a computer
Models for the spaCy Natural Language Processing (NLP) library
Qwen3-ASR is an open-source series of ASR models
Data manipulation and transformation for audio signal processing
An Open Source text-to-speech system built by inverting Whisper