Wan2.2: Open and Advanced Large-Scale Video Generative Model
RGBD video generation model conditioned on camera input
Open-Sora: Democratizing Efficient Video Production for All
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Generate short videos with one click using an LLM
Implementation of Make-A-Video, new SOTA text-to-video generator
HunyuanVideo: A Systematic Framework For Large Video Generation Models
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Text- and image-to-video generation: CogVideoX (2024) and CogVideo
Multimodal-Driven Architecture for Customized Video Generation
A Customizable Image-to-Video Model based on HunyuanVideo
Implementation of Video Diffusion Models
Implementation of Phenaki Video, which uses MaskGIT
Implementation of Recurrent Interface Network (RIN)
CLIP + FFT/DWT/RGB = text to image/video
Multimodal AI Story Teller, built with Stable Diffusion, GPT, etc.
AI-powered tool to quickly remove watermarks from videos flawlessly
Overcoming Data Limitations for High-Quality Video Diffusion Models
A walk down memory lane
Implementation of NÜWA, an attention network for text-to-video synthesis
Implementation of NWT, audio-to-video generation, in PyTorch
Software tool that converts text to video for a more engaging experience
The leading software for creating deepfakes
DCVGAN: Depth Conditional Video Generation, ICIP 2019.