ComfyUI wrapper nodes for HunyuanVideo
Open-Sora: Democratizing Efficient Video Production for All
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
RGBD video generation model conditioned on camera input
Generate high-definition story short videos with one click using AI
Director, Screenwriter, Producer, and Video Generator All-in-One
GPT-4V-level open-source multi-modal model based on Llama3-8B
Capable of understanding text, audio, vision, video
Motion-controllable Video Generation via Latent Trajectory Guidance
Implementation of Phenaki Video, which uses MaskGIT
Let's make video diffusion practical
An unsupervised and free tool for image and video dataset analysis
HunyuanVideo: A Systematic Framework For Large Video Generation Model
ComfyUI wrapper nodes for WanVideo and related models
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
A general fine-tuning kit geared toward image/video/audio diffusion
A Multi-Modal World Model for Reconstructing, Generating, Simulation
Label Studio is a multi-type data labeling and annotation tool
Implementation of a U-Net complete with efficient attention
Recovering the Visual Space from Any Views
Powerful open source team chat application
The most powerful and modular diffusion model GUI, API and backend
InvokeAI is a leading creative engine for Stable Diffusion models
Python data, Leaflet.js maps
We write your reusable computer vision tools