Ultra-Efficient LLMs on End Devices
Qwen3-Coder is the code version of Qwen3
A Powerful Native Multimodal Model for Image Generation
Phi-3.5 for Mac: Locally-run Vision and Language Models
RGBD video generation model conditioned on camera input
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Z80-μLM is a 2-bit quantized language model
Industrial-level controllable zero-shot text-to-speech system
Video Object and Interaction Deletion
Achieving 3x+ generation speedup on reasoning tasks
Open-Source Financial Large Language Models
Generating Immersive, Explorable, and Interactive 3D Worlds
Easy Docker setup for Stable Diffusion with user-friendly UI
FAIR Sequence Modeling Toolkit 2
Advancing Open-source World Models
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Sharp Monocular Metric Depth in Less Than a Second
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline
Qwen2.5-VL is the multimodal large language model series
ChatGPT interface with better UI
PyTorch code and models for the DINOv2 self-supervised learning method
Qwen3-ASR is an open-source series of ASR models
Foundation model for image generation
Block Diffusion for Ultra-Fast Speculative Decoding
Multimodal Diffusion with Representation Alignment