Global weather forecasting model using graph neural networks and JAX
CLIP: Predict the most relevant text snippet given an image
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
A Production-ready Reinforcement Learning AI Agent Library
Multimodal-Driven Architecture for Customized Video Generation
Video understanding codebase from FAIR for reproducing video models
Let us control diffusion models
Code for the paper Hybrid Spectrogram and Waveform Source Separation
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Code release for ConvNeXt V2 model
Reference implementation of the Transformer architecture optimized
Reproduces results of "Fixing the train-test resolution discrepancy"
Learning Continuous Signed Distance Functions for Shape Representation
Code for reproducing key results in the paper