Industrial-level controllable zero-shot text-to-speech system
Qwen3-TTS is an open-source series of TTS models
Official inference repo for FLUX.2 models
Hunyuan Translation Model Version 1.5
Multimodal Diffusion with Representation Alignment
Official inference repo for FLUX.1 models
Agentic, Reasoning, and Coding (ARC) foundation models
Powerful AI language model (MoE) optimized for efficiency/performance
Official repository for LTX-Video
Advanced language and coding AI model
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
Code for running inference and finetuning with the SAM 3 model
Image generation model with single-stream diffusion transformer
Reference PyTorch implementation and models for DINOv3
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
Controllable & emotion-expressive zero-shot TTS
MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation Model
An experimental version of the DeepSeek model
Achieving 3x+ generation speedup on reasoning tasks
PyTorch code and models for the DINOv2 self-supervised learning method
Video Object and Interaction Deletion
Foundation model for image generation
A Unified Framework for Text-to-3D and Image-to-3D Generation
Block Diffusion for Ultra-Fast Speculative Decoding
Pretrained time-series foundation model developed by Google Research