Global weather forecasting model using graph neural networks and JAX
CLIP: predict the most relevant text snippet given an image
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
A Production-ready Reinforcement Learning AI Agent Library
Video understanding codebase from FAIR for reproducing video models
Multimodal-Driven Architecture for Customized Video Generation
Let us control diffusion models
Code for the paper "Hybrid Spectrogram and Waveform Source Separation"
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Code release for the ConvNeXt V2 model
Reference implementation of the Transformer architecture optimized
Reproduces results of "Fixing the train-test resolution discrepancy"
Learning Continuous Signed Distance Functions for Shape Representation
Code for reproducing key results in the paper