Industrial-level controllable zero-shot text-to-speech system
The official repo of Qwen chat & pretrained large language model
LTX-Video Support for ComfyUI
Official inference repo for FLUX.1 models
Open-source multi-speaker long-form text-to-speech model
Contexts Optical Compression
Generating Immersive, Explorable, and Interactive 3D Worlds
Code for running inference with the SAM 3D Body Model (3DB)
Renderer for the harmony response format to be used with gpt-oss
Diffusion Transformer with Fine-Grained Chinese Understanding
Python bindings for llama.cpp
HY-Motion model for 3D character animation generation
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Global weather forecasting model using graph neural networks and JAX
Python SDK for Claude Agent
A Systematic Framework for Interactive World Modeling
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Unified Multimodal Understanding and Generation Models
Sharp Monocular Metric Depth in Less Than a Second
DeepSeek Coder: Let the Code Write Itself
High-Resolution Image Synthesis with Latent Diffusion Models
CodeGeeX2: A More Powerful Multilingual Code Generation Model
Programmatic access to the AlphaGenome model
OCR expert VLM powered by Hunyuan's native multimodal architecture