ChatGLM-6B: An Open Bilingual Dialogue Language Model
Hackable and optimized Transformers building blocks
Industrial-level controllable zero-shot text-to-speech system
Tool for exploring and debugging transformer model behaviors
Multimodal Diffusion with Representation Alignment
HY-Motion model for 3D character animation generation
Official implementation of DreamCraft3D
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Video understanding codebase from FAIR for reproducing video models
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Language modeling in a sentence representation space
Inference code for scalable emulation of protein equilibrium ensembles
Open Source Speech Language Model
General-purpose image editing model that delivers high-fidelity
ICLR2024 Spotlight: curation/training code, metadata, distribution
Towards Real-World Vision-Language Understanding
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
A SOTA open-source image editing model
OCR expert VLM powered by Hunyuan's native multimodal architecture
Pushing the Limits of Mathematical Reasoning in Open Language Models
The ChatGPT Retrieval Plugin lets you easily find personal documents
Release for Improved Denoising Diffusion Probabilistic Models