Powerful mixture-of-experts (MoE) language model optimized for efficiency and performance
A Multi-Modal World Model for Reconstructing, Generating, and Simulating
From Vibe Coding to Agentic Engineering
Qwen3.5 is the large language model series developed by the Qwen team
Models for object and human mesh reconstruction
Official repository for LTX-Video
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Official inference repo for FLUX.2 models
Open-source multi-speaker long-form text-to-speech model
FlashMLA: Efficient Multi-head Latent Attention Kernels
MOSS‑TTS Family: open‑source speech and sound generation models
MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation
MiniMax-M2, a model built for Max coding & agentic workflows
Long-form streaming TTS system for multi-speaker dialogue generation
Diffusion Transformer with Fine-Grained Chinese Understanding
New family of code large language models (LLMs)
Advanced language and coding AI model
Multimodal-Driven Architecture for Customized Video Generation
Qwen3.6 is the large language model series developed by the Qwen team
A Customizable Image-to-Video Model based on HunyuanVideo
Reference PyTorch implementation and models for DINOv3
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Phi-3.5 for Mac: Locally-run Vision and Language Models
Industrial-level controllable zero-shot text-to-speech system