The codebase was designed to help researchers and practitioners reproduce results from FAIR (Facebook AI Research) and reuse its pre-trained backbones for downstream tasks. It also includes Gradient Blending, a method for fusing audio and visual modalities, which is available only in the Caffe2 implementation. VMZ is now archived and no longer maintained, but it remains a useful reference for early large-scale video model training, transfer learning, and multimodal integration, strategies that influenced later architectures such as SlowFast and X3D.

Features

  • Implements R(2+1)D and MCx models for efficient spatiotemporal video representation learning
  • Enables reproducibility of FAIR’s published video understanding research
  • Built with both Caffe2 and PyTorch backends for flexibility
  • Supports Gradient Blending for audio-visual fusion (Caffe2 only)
  • Provides models pre-trained on IG-65M, one of the largest weakly-supervised video datasets
  • Includes CSN (Channel-Separated Networks) for computationally efficient video recognition
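The R(2+1)D models listed above replace each full 3D convolution with a 2D spatial convolution followed by a 1D temporal convolution. The sketch below is an illustration only, based on the sizing rule described in the R(2+1)D paper rather than on the VMZ source: it shows how the intermediate channel count can be chosen so that the factorized block has roughly the same number of weights as the 3D convolution it replaces. The function names are hypothetical, and bias terms are ignored.

```python
def midplanes(n_in: int, n_out: int, t: int = 3, d: int = 3) -> int:
    """Intermediate channels M for a (2+1)D block (hypothetical helper).

    M = floor(t*d*d*n_in*n_out / (d*d*n_in + t*n_out)), chosen so the
    factorized block roughly matches a full t x d x d 3D convolution
    in weight count.
    """
    return (t * d * d * n_in * n_out) // (d * d * n_in + t * n_out)


def params_3d(n_in: int, n_out: int, t: int = 3, d: int = 3) -> int:
    """Weights in a full t x d x d 3D convolution (bias ignored)."""
    return n_in * n_out * t * d * d


def params_2plus1d(n_in: int, n_out: int, t: int = 3, d: int = 3) -> int:
    """Weights in the factorized block: 1 x d x d conv, then t x 1 x 1 conv."""
    m = midplanes(n_in, n_out, t, d)
    spatial = n_in * m * d * d   # 2D spatial convolution
    temporal = m * n_out * t     # 1D temporal convolution
    return spatial + temporal


if __name__ == "__main__":
    for n_in, n_out in [(64, 64), (64, 128)]:
        print(n_in, n_out, midplanes(n_in, n_out),
              params_3d(n_in, n_out), params_2plus1d(n_in, n_out))
```

Because the channel count is rounded down, the factorized block never exceeds the 3D convolution's weight count, while the extra nonlinearity between the two convolutions is where the paper locates the benefit.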

Categories

Video, AI Models

License

Apache License V2.0

Additional Project Details

Operating Systems

Linux

Programming Language

C++, Python, Unix Shell

Registered

2025-10-08