Open-source multi-speaker long-form text-to-speech model
A Family of Open-Source Music Foundation Models
Reference PyTorch implementation and models for DINOv3
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
State-of-the-art TTS model under 25MB
Let's make video diffusion practical
Qwen-Image is a powerful image generation foundation model
Revolutionizing Database Interactions with Private LLM Technology
Qwen2.5-VL is the multimodal large language model series
Provides convenient access to the Anthropic REST API from any Python 3
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Industrial-level controllable zero-shot text-to-speech system
DeepSeek Coder: Let the Code Write Itself
Visual Causal Flow
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Easy Docker setup for Stable Diffusion with user-friendly UI
Open-Source Financial Large Language Models
A Systematic Framework for Interactive World Modeling
Models for object and human mesh reconstruction
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
RGBD video generation model conditioned on camera input
Official repository for LTX-Video
A Powerful Native Multimodal Model for Image Generation
A Customizable Image-to-Video Model based on HunyuanVideo
Advancing Open-source World Models