High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Official Python inference and LoRA trainer package
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Native and Compact Structured Latents for 3D Generation
Open-source, high-performance AI model with advanced reasoning
Awesome multilingual OCR toolkits based on PaddlePaddle
Official repository for LTX-Video
State-of-the-art TTS model under 25MB
Recovering the Visual Space from Any Views
MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Repo for SeedVR2 & SeedVR
Accurate × Fast × Comprehensive
A Family of Open-Source Music Foundation Models
Diversity-driven optimization and large-model reasoning ability
Official inference repo for FLUX.2 models
From Images to High-Fidelity 3D Assets
Open-source large language model family from Tencent Hunyuan
Reference PyTorch implementation and models for DINOv3
Multimodal Diffusion with Representation Alignment
Qwen3-TTS is an open-source series of TTS models
Industrial-level controllable zero-shot text-to-speech system
Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
Fast, Sharp & Reliable Agentic Intelligence
The most powerful local music generation model