Audio foundation model excelling in audio understanding
Hackable and optimized Transformers building blocks
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Tool for exploring and debugging transformer model behaviors
Multimodal Diffusion with Representation Alignment
Official implementation of DreamCraft3D
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Open Source Speech Language Model
Powerful AI language model (MoE) optimized for efficiency and performance
Agentic, Reasoning, and Coding (ARC) foundation models
HY-Motion model for 3D character animation generation
Advanced language and coding AI model
Open-source, high-performance AI model with advanced reasoning
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Video understanding codebase from FAIR for reproducing video models
Industrial-level controllable zero-shot text-to-speech system
A theoretical reconstruction of the Claude Mythos architecture
Towards Real-World Vision-Language Understanding
Inference code for scalable emulation of protein equilibrium ensembles
Official inference repo for FLUX.2 models
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Language modeling in a sentence representation space
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Code for running inference and finetuning with the SAM 3 model