26M function-calling model that runs on incredibly small devices
Multimodal-Driven Architecture for Customized Video Generation
Video understanding codebase from FAIR for reproducing video models
Let us control diffusion models
Code release for ConvNeXt V2 model
Code for reproducing key results in the paper