Diversity-driven optimization and large-model reasoning ability
Tiny vision language model
Bidirectional token-classification model for identifiable info
Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
Large language model and vision-language model based on linear attention
Qwen-Image is a powerful image generation foundation model
Foundation model for image generation
Tool for exploring and debugging transformer model behaviors
Project Lyra: Open Generative 3D World Models
Open-source deep-learning framework
A Customizable Image-to-Video Model based on HunyuanVideo
Unified Multimodal Understanding and Generation Models
The official PyTorch implementation of Google's Gemma models
RGBD video generation model conditioned on camera input
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
NetEase Youdao's open-source embedding and reranker models
Audio foundation model excelling in audio understanding
1B text generation model based on the HRM architecture
Foundation Models for Time Series
Open-source image generative foundation model
Convert Google Gemini web into an OpenAI-compatible API
A 0.1B Omni model trained from scratch
26M function-calling model that runs on incredibly small devices
Open Source Speech Language Model
Qwen3-ASR is an open-source series of ASR models