NetEase Youdao's open-source embedding and reranker models
Visual Causal Flow
Open-source multi-speaker long-form text-to-speech model
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Wan2.2: Open and Advanced Large-Scale Video Generative Model
An AI-powered security review GitHub Action using Claude
Open-source framework for intelligent speech interaction
A Powerful Native Multimodal Model for Image Generation
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Generating Immersive, Explorable, and Interactive 3D Worlds
Audio foundation model excelling in audio understanding
Video understanding codebase from FAIR for reproducing video models
Multimodal Diffusion with Representation Alignment
New family of code large language models (LLMs)
Qwen-Image is a powerful image generation foundation model
Multi-modal large language model designed for audio understanding
LLM-based Reinforcement Learning audio edit model
The ChatGPT Retrieval Plugin lets you easily find personal documents
Official code for Style Aligned Image Generation via Shared Attention
Let us control diffusion models
Chinese LLaMA & Alpaca large language model + local CPU/GPU training
Code release for "Masked-attention Mask Transformer
PyTorch implementation of MAE
Per-Pixel Classification is Not All You Need for Semantic Segmentation