Official code for Style Aligned Image Generation via Shared Attention
A Model Context Protocol server for searching and analyzing arXiv
4M: Massively Multimodal Masked Modeling
Guiding Instruction-based Image Editing via Multimodal Large Language
Refer and Ground Anything Anywhere at Any Granularity
The official Meta Llama 3 GitHub site
Utilities intended for use with Llama models
Open-source platform for building enterprise-grade agents
ICLR 2024 Spotlight: curation/training code, metadata, distribution
PyTorch code and models for V-JEPA self-supervised learning from video
A PyTorch library for implementing flow matching algorithms
An implementation of a deep learning recommendation model (DLRM)
Official DeiT repository
[CVPR 2025 Best Paper Award] VGGT
Code to accompany "A Method for Animating Children's Drawings"
Anthropic's Interactive Prompt Engineering Tutorial
Anthropic's educational courses
Memory-efficient and performant finetuning of Mistral's models
Diffusion Transformer with Fine-Grained Chinese Understanding
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
An advanced paper search agent powered by large language models
Large language models and vision-language models based on linear attention
Generate blog articles from video or audio
Provider-agnostic, open-source evaluation infrastructure
Pokee Deep Research Model Open Source Repo