GLM-4-Voice | End-to-End Chinese-English Conversational Model
Open-source multi-speaker long-form text-to-speech model
Official inference repo for FLUX.2 models
High-Resolution 3D Asset Generation with Large-Scale Diffusion Models
Block Diffusion for Ultra-Fast Speculative Decoding
A Family of Open-Source Music Foundation Models
Qwen-Image is a powerful image generation foundation model
Native and Compact Structured Latents for 3D Generation
Open-weight, large-scale hybrid-attention reasoning model
Audio foundation model excelling in audio understanding
Implementation of the Surya Foundation Model for Heliophysics
A Pragmatic VLA Foundation Model
Towards Real-World Vision-Language Understanding
Collection of Gemma 3 variants that are trained for performance
HY-Motion model for 3D character animation generation
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
Hunyuan Translation Model Version 1.5
Foundation model for image generation
Multimodal Diffusion with Representation Alignment
Z80-μLM is a 2-bit quantized language model
From Images to High-Fidelity 3D Assets
Tool for exploring and debugging transformer model behaviors
Pretrained time-series foundation model developed by Google Research
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Code for running inference with the SAM 3D Body (3DB) model