GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
A series of math-specific large language models built on the Qwen2 series
Programmatic access to the AlphaGenome model
Memory-efficient and performant finetuning of Mistral's models
Video understanding codebase from FAIR for reproducing video models
A SOTA open-source image editing model
Fast and universal 3D reconstruction model for versatile tasks
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model for generalized humanoid robot reasoning and skills
Ling is an MoE LLM provided and open-sourced by InclusionAI
Open-source framework for intelligent speech interaction
Diversity-driven optimization and large-model reasoning ability
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Open-source repository for the Pokee Deep Research model
Implementation of the Surya Foundation Model for Heliophysics
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
MedicalGPT: Training Your Own Medical GPT Model with a ChatGPT-style Training Pipeline
High-Fidelity and Controllable Generation of Textured 3D Assets
Multi-modal large language model designed for audio understanding
The official PyTorch implementation of Google's Gemma models
GLM-4-Voice | End-to-End Chinese-English Conversational Model
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
An LLM-based audio editing model trained with reinforcement learning
Inference code for scalable emulation of protein equilibrium ensembles
Inference script for Oasis 500M