Diffusion Transformer with Fine-Grained Chinese Understanding
A Customizable Image-to-Video Model based on HunyuanVideo
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Stable Diffusion WebUI Forge is a platform built on top of Stable Diffusion
Global weather forecasting model using graph neural networks and JAX
Uncommon Objects in 3D dataset
Math-specific large language models in our Qwen2 series
CodeGeeX2: A More Powerful Multilingual Code Generation Model
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
High-Fidelity and Controllable Generation of Textured 3D Assets
State-of-the-art (SoTA) text-to-video pre-trained model
OCR expert VLM powered by Hunyuan's native multimodal architecture
Capable of understanding text, audio, vision, and video
Pushing the Limits of Mathematical Reasoning in Open Language Models
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
Tiny vision language model
The official PyTorch implementation of Google's Gemma models
Open Source Speech Language Model
Open-source industrial-grade ASR models
Fast-stable-diffusion + DreamBooth
A Pragmatic VLA Foundation Model
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Hunyuan Translation Model Version 1.5
Multimodal embedding and reranking models built on Qwen3-VL
Z80-μLM is a 2-bit quantized language model