Tool for exploring and debugging transformer model behaviors
Contexts Optical Compression
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Qwen2.5-VL is the multimodal large language model series
Global weather forecasting model using graph neural networks and JAX
Sharp Monocular Metric Depth in Less Than a Second
A Unified Framework for Text-to-3D and Image-to-3D Generation
VMZ: Model Zoo for Video Modeling
GLM-4-Voice | End-to-End Chinese-English Conversational Model
A state-of-the-art open visual language model
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
CodeGeeX2: A More Powerful Multilingual Code Generation Model
Revolutionizing Database Interactions with Private LLM Technology
Pushing the Limits of Mathematical Reasoning in Open Language Models
A Systematic Framework for Interactive World Modeling
Industrial-level controllable zero-shot text-to-speech system
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Block Diffusion for Ultra-Fast Speculative Decoding
State-of-the-art (SoTA) text-to-video pre-trained model
General-purpose image editing model that delivers high-fidelity
PyTorch code and models for the DINOv2 self-supervised learning
Official implementation of DreamCraft3D
Math-specific large language models built on our Qwen2 series
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
The ChatGPT Retrieval Plugin lets you easily find personal documents