Video understanding codebase from FAIR for reproducing video models
Multimodal-Driven Architecture for Customized Video Generation
Personalize Any Characters with a Scalable Diffusion Transformer
OpenMMLab Model Deployment Framework
Modular quant framework
Unified Multimodal Understanding and Generation Models
kaldi-asr/kaldi is the official location of the Kaldi project
Scalable machine learning for time series forecasting
Chat & pretrained large audio language model proposed by Alibaba Cloud
Benchmarking synthetic data generation methods
The best ChatGPT that $100 can buy
4M: Massively Multimodal Masked Modeling
ICLR 2024 Spotlight: curation/training code, metadata, distribution
[CVPR 2025 Best Paper Award] VGGT
Code to accompany "A Method for Animating Children's Drawings"
Ray Aviary - evaluate multiple LLMs easily
The open-source data curation platform for LLMs
Build portable, production-ready MLOps pipelines
Environments and algorithms for research in general reinforcement
Powering Amazon custom machine learning chips
PyTorch code and models for VJEPA2 self-supervised learning from video
Easy-to-use and powerful NLP library with an awesome model zoo
Simplest working implementation of StyleGAN2
A lightweight vision library for performing large object detection
An Autonomous LLM Agent for Complex Task Solving