Ling-V2 is a MoE LLM open-sourced by InclusionAI
A SOTA open-source image editing model
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model for generalized humanoid robot reasoning and skills
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
GPT-4V-level open-source multi-modal model based on Llama3-8B
A Pragmatic VLA Foundation Model
Collection of Gemma 3 variants trained for performance
Open-weight, large-scale hybrid-attention reasoning model
Diversity-driven optimization and large-model reasoning ability
Open-source deep-learning framework
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Open-source framework for intelligent speech interaction
Phi-3.5 for Mac: Locally-run Vision and Language Models
Repo for SeedVR2 & SeedVR
The official PyTorch implementation of Google's Gemma models
Open-source repository for the Pokee Deep Research model
Implementation of the Surya Foundation Model for Heliophysics
Long-form streaming TTS system for multi-speaker dialogue generation
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Multimodal embedding and reranking models built on Qwen3-VL
Audio foundation model excelling in audio understanding
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
A trainable PyTorch reproduction of AlphaFold 3
Multi-modal large language model designed for audio understanding
Large Multimodal Models for Video Understanding and Editing