Language modeling in a sentence representation space
Towards Real-World Vision-Language Understanding
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
Code for running inference and finetuning with SAM 3 model
Z80-μLM is a 2-bit quantized language model
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
Advanced language and coding AI model
Kimi K2 is the large language model series developed by Moonshot AI
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Large-language-model & vision-language-model based on Linear Attention
Official inference repo for FLUX.2 models
MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation Model
GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image Generation
Unified Multimodal Understanding and Generation Models
LLM-based Reinforcement Learning audio edit model
Qwen3-Omni is a natively end-to-end, omni-modal LLM
High-resolution models for human tasks
A PyTorch library for implementing flow matching algorithms
Open-source framework for intelligent speech interaction
Video understanding codebase from FAIR for reproducing video models
Real-time behaviour synthesis with MuJoCo, using Predictive Control
4M: Massively Multimodal Masked Modeling
This repository contains the official implementation of FastVLM
Renderer for the harmony response format to be used with gpt-oss