Qwen3-TTS is an open-source series of TTS models
The most powerful local music generation model
Fast and Universal 3D reconstruction model for versatile tasks
Let's make video diffusion practical
Industrial-level controllable zero-shot text-to-speech system
Sharp Monocular Metric Depth in Less Than a Second
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Hunyuan Translation Model Version 1.5
A Unified Framework for Text-to-3D and Image-to-3D Generation
tiktoken is a fast BPE tokeniser for use with OpenAI's models
DeepMind model for tracking arbitrary points across videos & robotics
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Block Diffusion for Ultra-Fast Speculative Decoding
PyTorch code and models for the DINOv2 self-supervised learning
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
A trainable PyTorch reproduction of AlphaFold 3
Open-source framework for intelligent speech interaction
Official DeiT repository
Powerful open-source image generation model
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
A method to increase the speed and lower the memory footprint
Reference implementation of the Transformer architecture optimized