Wan2.2: Open and Advanced Large-Scale Video Generative Model
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Awesome multilingual OCR toolkits based on PaddlePaddle
State-of-the-art TTS model under 25MB
Industrial-level controllable zero-shot text-to-speech system
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Generating Immersive, Explorable, and Interactive 3D Worlds
Qwen3-TTS is an open-source series of TTS models
Official inference repo for FLUX.2 models
Hunyuan Translation Model Version 1.5
Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
The most powerful local music generation model
Multimodal Diffusion with Representation Alignment
Official inference repo for FLUX.1 models
Agentic, Reasoning, and Coding (ARC) foundation models
HY-Motion model for 3D character animation generation
Open image model at the forefront of design
Powerful AI language model (MoE) optimized for efficiency/performance
Official repository for LTX-Video
Advanced language and coding AI model
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
Code for running inference and finetuning with the SAM 3 model
Image generation model with single-stream diffusion transformer
Qwen3 is the large language model series developed by the Qwen team
Reference PyTorch implementation and models for DINOv3