Repo for SeedVR2 & SeedVR
Foundation model for image generation
Block Diffusion for Ultra-Fast Speculative Decoding
Official implementation of Watermark Anything with Localized Messages
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
LLM-based reinforcement learning audio editing model
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Inference code for scalable emulation of protein equilibrium ensembles
The Clay Foundation Model - An open-source AI model and interface
Audio foundation model excelling in audio understanding
Long-form streaming TTS system for multi-speaker dialogue generation
Qwen3-ASR is an open-source series of ASR models
VMZ: Model Zoo for Video Modeling
Video understanding codebase from FAIR for reproducing video models
CLIP: predict the most relevant text snippet given an image
A Unified Framework for Text-to-3D and Image-to-3D Generation
Project Lyra: Open Generative 3D World Models
Inference script for Oasis 500M
HY-Motion model for 3D character animation generation
Foundation Models for Time Series
A Production-ready Reinforcement Learning AI Agent Library
Hackable and optimized Transformers building blocks
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Official implementation of DreamCraft3D