Global weather forecasting model using graph neural networks and JAX
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Multimodal-Driven Architecture for Customized Video Generation
A Production-ready Reinforcement Learning AI Agent Library
CLIP: Predict the most relevant text snippet given an image
Video understanding codebase from FAIR for reproducing video models
Code for the paper Hybrid Spectrogram and Waveform Source Separation
Let us control diffusion models
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Code release for ConvNeXt V2 model
Reference implementation of the Transformer architecture optimized
Reproduces results of "Fixing the train-test resolution discrepancy"
Learning Continuous Signed Distance Functions for Shape Representation
Code for reproducing key results in the paper