NetEase Youdao's open-source embedding and reranker models
Visual Causal Flow
GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image
Open-source multi-speaker long-form text-to-speech model
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Wan2.2: Open and Advanced Large-Scale Video Generative Model
An AI-powered security review GitHub Action using Claude
Open-source framework for intelligent speech interaction
A Powerful Native Multimodal Model for Image Generation
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Generating Immersive, Explorable, and Interactive 3D Worlds
Audio foundation model excelling in audio understanding
Video understanding codebase from FAIR for reproducing video models
Multimodal Diffusion with Representation Alignment
Foundational Models for State-of-the-Art Speech and Text Translation
New family of code large language models (LLMs)
Qwen-Image is a powerful image generation foundation model
Multi-modal large language model designed for audio understanding
LLM-based Reinforcement Learning audio edit model
The ChatGPT Retrieval Plugin lets you easily find personal documents
Encoder of greater-than-word length text trained on a variety of data
Official code for Style Aligned Image Generation via Shared Attention
Latent Diffusion and Stable Diffusion Implementation
Let us control diffusion models
Chinese LLaMA & Alpaca large language model + local CPU/GPU training