Industrial-level controllable zero-shot text-to-speech system
Easy Docker setup for Stable Diffusion with user-friendly UI
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
PyTorch code and models for the DINOv2 self-supervised learning method
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Chinese and English multimodal conversational language model
Tooling for the Common Objects In 3D dataset
Multimodal-Driven Architecture for Customized Video Generation
Memory-efficient and performant finetuning of Mistral's models
Designed for text embedding and ranking tasks
Official code for Style Aligned Image Generation via Shared Attention
Official repo for consistency models
Environment generation code for the paper "Emergent Tool Use"
High-compute ultra-reasoning model surpassing GPT-5