Medical imaging toolkit for deep learning
State-of-the-art (SoTA) text-to-video pre-trained model
A text-to-speech, speech-to-text and speech-to-speech library
DeepMind model for tracking arbitrary points across videos & robotics
Benchmarking Multimodal Agents for Open-Ended Tasks
A trainable PyTorch reproduction of AlphaFold 3
Implementation of Make-A-Video, a new SOTA text-to-video generator
Implementation of Video Diffusion Models
A Systematic Framework for Interactive World Modeling
State-of-the-art diffusion models for image and audio generation
The machine learning toolkit for time series analysis in Python
The data structure for multimodal data
SAPIEN Manipulation Skill Framework
Deep learning optimization library making distributed training easy
Graph Neural Network Library for PyTorch
An open-source package that allows video game creators
Simple and easily configurable grid world environments
Framework for building neural networks
Geometric deep learning extension library for PyTorch
Deep learning optimization library: makes distributed training easy
Open Source Differentiable Computer Vision Library
Build cross-modal and multimodal applications on the cloud
Generate 3D objects conditioned on text or images
Framework that is dedicated to making neural data processing
CLIP + FFT/DWT/RGB = text to image/video