Personalize Any Characters with a Scalable Diffusion Transformer
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
Video Object and Interaction Deletion
CLIP: Predict the most relevant text snippet given an image
Repo for SeedVR2 & SeedVR
Inference code for scalable emulation of protein equilibrium ensembles
Project Lyra: Open Generative 3D World Models
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
A Customizable Image-to-Video Model based on HunyuanVideo
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Chinese and English multimodal conversational language model
Long-form streaming TTS system for multi-speaker dialogue generation
Research code artifacts for Code World Model (CWM)
Achieving 3x+ generation speedup on reasoning tasks
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Provides convenient access to the Anthropic REST API from any Python 3
Advancing Formal Mathematical Reasoning via Reinforcement Learning
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Scaling Reinforcement Learning with LLMs
An Efficient Agentic Model for Computer Use
Foundation model for image generation
Fast-stable-diffusion + DreamBooth
Extension index for stable-diffusion-webui
VMZ: Model Zoo for Video Modeling
Controllable & emotion-expressive zero-shot TTS