Real-time behaviour synthesis with MuJoCo, using Predictive Control
Memory-efficient and performant finetuning of Mistral's models
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Math-specific large language models from our Qwen2 series
Video understanding codebase from FAIR for reproducing video models
Fast and Universal 3D reconstruction model for versatile tasks
A PyTorch library for implementing flow matching algorithms
PyTorch code and models for DINOv2 self-supervised learning
Official implementation of DreamCraft3D
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Controllable & emotion-expressive zero-shot TTS
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
The ChatGPT Retrieval Plugin lets you easily find personal documents
Inference framework for 1-bit LLMs
The Clay Foundation Model - An open source AI model and interface
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
Ling is a MoE LLM provided and open-sourced by InclusionAI
A SOTA open-source image editing model
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Open-weight, large-scale hybrid-attention reasoning model
Open-source framework for intelligent speech interaction
Revolutionizing Database Interactions with Private LLM Technology
Pokee Deep Research Model Open Source Repo