Repo of Qwen2-Audio chat & pretrained large audio language model
Detect faces in an image
Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Code for running inference with the SAM 3D Body model (3DB)
Capable of understanding text, audio, vision, and video
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Open-source large language model family from Tencent Hunyuan
Qwen3-Coder is the code version of Qwen3
A Conversational Speech Generation Model
Portuguese ASR model fine-tuned on XLSR-53 for 16kHz audio input
Russian ASR model fine-tuned on Common Voice and CSS10 datasets
ClinicalBERT model trained on MIMIC notes for clinical NLP tasks
CTC-based forced aligner for audio-text in 158 languages