Video Object and Interaction Deletion
Code for running inference with the SAM 3D Body Model (3DB)
Native and Compact Structured Latents for 3D Generation
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Implementation of the Surya Foundation Model for Heliophysics
Qwen3-ASR is an open-source series of ASR models
A Family of Open-Source Music Foundation Models
A 0.1B Omni model trained from scratch
Open Source Speech Language Model
High-resolution models for human tasks
A Unified Framework for Text-to-3D and Image-to-3D Generation
Multimodal-Driven Architecture for Customized Video Generation
Multimodal Diffusion with Representation Alignment
A theoretical reconstruction of the Claude Mythos architecture
Uncommon Objects in 3D dataset
A Production-ready Reinforcement Learning AI Agent Library
Diversity-driven optimization and large-model reasoning ability
Qwen3-Coder is the code version of Qwen3
Netease Youdao's open-source embedding and reranker models
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Advancing Open-source World Models
Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
Generate Any 3D Scene in Seconds