Project Lyra: Open Generative 3D World Models
Video Object and Interaction Deletion
Repo for SeedVR2 & SeedVR
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
CLIP: Predict the most relevant text snippet given an image
A Customizable Image-to-Video Model based on HunyuanVideo
Inference code for scalable emulation of protein equilibrium ensembles
Chinese and English multimodal conversational language model
A trainable PyTorch reproduction of AlphaFold 3
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Controllable & emotion-expressive zero-shot TTS
An Efficient Agentic Model for Computer Use
Designed for text embedding and ranking tasks
Achieving 3x+ generation speedup on reasoning tasks
Foundation model for image generation
Fast-stable-diffusion + DreamBooth
VMZ: Model Zoo for Video Modeling
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Provides convenient access to the Anthropic REST API from any Python 3
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Robust Speech Recognition Across Languages, Dialects
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
Tiny vision language model
Open Source Speech Language Model