Video understanding codebase from FAIR for reproducing video models
CLIP: predict the most relevant text snippet given an image
Research code artifacts for Code World Model (CWM)
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
A Unified Framework for Text-to-3D and Image-to-3D Generation
Multimodal Diffusion with Representation Alignment
The NVIDIA AgentIQ toolkit is an open-source library
Powerful and highly extensible command-line based document
CasADi is a symbolic framework for numeric optimization
Automatic SSRF fuzzer and exploitation tool
A best practices guide for day 2 operations
Mini website for testing general CS knowledge and enforcing coding
Blazing-fast vector DB with similarity search and metadata filtering
Official code for Style Aligned Image Generation via Shared Attention
A Model Context Protocol server for searching and analyzing arXiv
4M: Massively Multimodal Masked Modeling
Guiding Instruction-based Image Editing via Multimodal Large Language
This repository contains the official implementation of FastVLM
Refer and Ground Anything Anywhere at Any Granularity
The official Meta Llama 3 GitHub site
Utilities intended for use with Llama models
Tool for generating high-quality synthetic datasets
TorchMultimodal is a PyTorch library
ICLR 2024 Spotlight: curation/training code, metadata, distribution
A library for differentiable nonlinear optimization