GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image
Recovering the Visual Space from Any Views
MiniMax-M2, a model built for Max coding & agentic workflows
PyTorch implementation of JiT
Models for object and human mesh reconstruction
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
A Powerful Native Multimodal Model for Image Generation
Pokee Deep Research Model Open Source Repo
Advancing Open-source World Models
DeepSeek Coder: Let the Code Write Itself
Easy Docker setup for Stable Diffusion with user-friendly UI
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Qwen3.6 is the large language model series developed by the Qwen team
Model export recipes, Python primitives, and Swift runtime utilities
Programmatic access to the AlphaGenome model
An experimental version of DeepSeek model
Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
Qwen-Image is a powerful image generation foundation model
Open-source image generative foundation model
Inference script for Oasis 500M
Multimodal Diffusion with Representation Alignment
GPT4V-level open-source multi-modal model based on Llama3-8B
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
A series of math-specific large language models built on our Qwen2 series
A Systematic Framework for Interactive World Modeling