Guiding Instruction-based Image Editing via Multimodal Large Language
This repository contains the official implementation of FastVLM
Refer and Ground Anything Anywhere at Any Granularity
A Model Context Protocol (MCP) Gateway & Registry
Utilities intended for use with Llama models
FAIR Sequence Modeling Toolkit 2
ICLR 2024 Spotlight: curation/training code, metadata, distribution
PyTorch code and models for V-JEPA self-supervised learning from video
A PyTorch library for implementing flow matching algorithms
An implementation of a deep learning recommendation model (DLRM)
Official DeiT repository
ImageBind: One Embedding Space to Bind Them All
PyTorch3D is FAIR's library of reusable components for deep learning
[CVPR 2025 Best Paper Award] VGGT
PyTorch code and models for the DINOv2 self-supervised learning
Provides code for running inference with the Segment Anything Model
Anthropic's Interactive Prompt Engineering Tutorial
Memory-efficient and performant finetuning of Mistral's models
Towards Ultimate Expert Specialization in Mixture-of-Experts Language
Official implementation of DreamCraft3D
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
Open-source large language model family from Tencent Hunyuan
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
A simple screen parsing tool towards a pure-vision-based GUI agent
Transformers4Rec is a flexible and efficient library