Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Open-source large language model family from Tencent Hunyuan
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
GLM-4 series: Open Multilingual Multimodal Chat LMs
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
DeepMind model for tracking arbitrary points across videos & robotics
Sharp Monocular Metric Depth in Less Than a Second
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Language modeling in a sentence representation space
An AI-powered security review GitHub Action using Claude
GPT4V-level open-source multi-modal model based on Llama3-8B
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
The ChatGPT Retrieval Plugin lets you easily find personal documents
A series of math-specific large language models from the Qwen2 family
Implementation of the Surya Foundation Model for Heliophysics
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
A state-of-the-art open visual language model
Chinese and English multimodal conversational language model
Qwen3-Omni is a natively end-to-end, omni-modal LLM
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
FlashMLA: Efficient Multi-head Latent Attention Kernels
High-Resolution Image Synthesis with Latent Diffusion Models
Code for the paper Hybrid Spectrogram and Waveform Source Separation
A Conversational Speech Generation Model