Qwen-Image is a powerful image generation foundation model
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
Easy Docker setup for Stable Diffusion with user-friendly UI
FAIR Sequence Modeling Toolkit 2
PyTorch code and models for the DINOv2 self-supervised learning
Foundation model for image generation
Block Diffusion for Ultra-Fast Speculative Decoding
Official implementation of Watermark Anything with Localized Messages
Qwen2.5-VL is the multimodal large language model series
Audio foundation model excelling in audio understanding
Controllable & emotion-expressive zero-shot TTS
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference
Inference code for scalable emulation of protein equilibrium ensembles
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Generate Any 3D Scene in Seconds
Long-form streaming TTS system for multi-speaker dialogue generation
Qwen3-ASR is an open-source series of ASR models
VMZ: Model Zoo for Video Modeling
Video understanding codebase from FAIR for reproducing video models
CLIP: Predict the most relevant text snippet given an image
A Unified Framework for Text-to-3D and Image-to-3D Generation
Open-source framework for intelligent speech interaction
GLM-4 series: Open Multilingual Multimodal Chat LMs
Project Lyra: Open Generative 3D World Models