Wan2.2: Open and Advanced Large-Scale Video Generative Model
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Awesome multilingual OCR toolkits based on PaddlePaddle
Industrial-level controllable zero-shot text-to-speech system
State-of-the-art TTS model under 25MB
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Generating Immersive, Explorable, and Interactive 3D Worlds
Qwen3-TTS is an open-source series of TTS models
Hunyuan Translation Model Version 1.5
Multimodal Diffusion with Representation Alignment
Official inference repo for FLUX.2 models
Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
HY-Motion model for 3D character animation generation
The most powerful local music generation model
Official inference repo for FLUX.1 models
Agentic, Reasoning, and Coding (ARC) foundation models
Powerful Mixture-of-Experts (MoE) language model optimized for efficiency and performance
Official repository for LTX-Video
Repo for SeedVR2 & SeedVR
Tooling for the Common Objects In 3D dataset
Advanced language and coding AI model
High-Fidelity and Controllable Generation of Textured 3D Assets
Image generation model with single-stream diffusion transformer
Reference PyTorch implementation and models for DINOv3
Code for running inference and finetuning with the SAM 3 model