Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Industrial-level controllable zero-shot text-to-speech system
Achieving 3x+ generation speedup on reasoning tasks
Easy Docker setup for Stable Diffusion with user-friendly UI
Inference script for Oasis 500M
A Customizable Image-to-Video Model based on HunyuanVideo
Designed for text embedding and ranking tasks
Multimodal-Driven Architecture for Customized Video Generation
Generate Any 3D Scene in Seconds
PyTorch code and models for the DINOv2 self-supervised learning method
Tooling for the Common Objects In 3D dataset
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Memory-efficient and performant finetuning of Mistral's models
New family of code large language models (LLMs)
Pokee Deep Research Model Open Source Repo
An AI-powered security review GitHub Action using Claude
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Chinese and English multimodal conversational language model
Official code for Style Aligned Image Generation via Shared Attention
Official repo for consistency models
Environment generation code for the paper "Emergent Tool Use"
High-compute ultra-reasoning model surpassing GPT-5