Agentic, Reasoning, and Coding (ARC) foundation models
MOSS‑TTS Family: open‑source speech and sound generation models
Long-form streaming TTS system for multi-speaker dialogue generation
Large Multimodal Models for Video Understanding and Editing
OCR expert VLM powered by Hunyuan's native multimodal architecture
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Open Multilingual Multimodal Chat LMs
Code for "Image Generation from Scene Graphs", Johnson et al., CVPR 2018