High-Resolution Image Synthesis with Latent Diffusion Models
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models
A theoretical reconstruction of the Claude Mythos architecture
A latent text-to-image diffusion model
Text and image to video generation: CogVideoX and CogVideo
State-of-the-art TTS model under 25MB
MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation
Qwen-Image is a powerful image generation foundation model
Qwen3.6 is the large language model series developed by the Qwen team
From Images to High-Fidelity 3D Assets
Qwen3-Coder is the code version of Qwen3
Qwen3 is the large language model series developed by the Qwen team
Qwen3.5 is the large language model series developed by the Qwen team
Qwen2.5-VL is the multimodal large language model series
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Open-source multi-speaker long-form text-to-speech model
GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image
A Multi-Modal World Model for Reconstructing, Generating, Simulation
RGBD video generation model conditioned on camera input
Pure C inference for the Flux 2 image generation model
A Family of Open Sourced Music Foundation Models
Advancing Open-source World Models
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Reference PyTorch implementation and models for DINOv3
DeepSeek Coder: Let the Code Write Itself