GLM-4-Voice | End-to-End Chinese-English Conversational Model
Towards Real-World Vision-Language Understanding
FAIR Sequence Modeling Toolkit 2
Chat & pretrained large vision language model
High-resolution models for human tasks
Open-Source Financial Large Language Models
Qwen3-TTS is an open-source series of TTS models
4M: Massively Multimodal Masked Modeling
A Production-ready Reinforcement Learning AI Agent Library
Tooling for the Common Objects In 3D dataset
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Designed for text embedding and ranking tasks
A SOTA open-source image editing model
Chinese and English multimodal conversational language model
Multi-modal large language model designed for audio understanding
Large Multimodal Models for Video Understanding and Editing
Chat & pretrained large audio language model proposed by Alibaba Cloud
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
LLM-based reinforcement learning audio editing model
Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Official code for Style Aligned Image Generation via Shared Attention
Dataset of GPT-2 outputs for research in detection, biases, and more
High-Resolution Image Synthesis with Latent Diffusion Models
A Conversational Speech Generation Model
Qwen2.5-Coder is the code version of Qwen2.5, the large language model