Wan2.2: Open and Advanced Large-Scale Video Generative Model
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Awesome multilingual OCR toolkits based on PaddlePaddle
Industrial-level controllable zero-shot text-to-speech system
State-of-the-art TTS model under 25MB
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Generating Immersive, Explorable, and Interactive 3D Worlds
Qwen3-TTS is an open-source series of TTS models
Hunyuan Translation Model Version 1.5
Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
Official inference repo for FLUX.2 models
The most powerful local music generation model
Multimodal Diffusion with Representation Alignment
Official inference repo for FLUX.1 models
Agentic, Reasoning, and Coding (ARC) foundation models
HY-Motion model for 3D character animation generation
Powerful mixture-of-experts (MoE) AI language model optimized for efficiency and performance
Official repository for LTX-Video
Open image model at the forefront of design
Advanced language and coding AI model
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
Tooling for the Common Objects In 3D dataset
Reference PyTorch implementation and models for DINOv3
Code for running inference and finetuning with SAM 3 model
Image generation model with single-stream diffusion transformer