An experimental version of the DeepSeek model
Hackable and optimized Transformers building blocks
Qwen-Image is a powerful image generation foundation model
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
Inference code for scalable emulation of protein equilibrium ensembles
MOSS‑TTS Family: open‑source speech and sound generation model
Multimodal Diffusion with Representation Alignment
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
HY-Motion model for 3D character animation generation
Audio foundation model excelling in audio understanding
Open Source Speech Language Model
Video understanding codebase from FAIR for reproducing video models
Tool for exploring and debugging transformer model behaviors
General-purpose image editing model that delivers high-fidelity
ICLR2024 Spotlight: curation/training code, metadata, distribution
Official implementation of DreamCraft3D
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Language modeling in a sentence representation space
A SOTA open-source image editing model
Tongyi Deep Research, the Leading Open-source Deep Research Agent
OCR expert VLM powered by Hunyuan's native multimodal architecture
ChatGPT interface with a better UI
High-Resolution Image Synthesis with Latent Diffusion Models