Generate high-definition short story videos with one click using AI
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Director, Screenwriter, Producer, and Video Generator All-in-One
Capable of understanding text, audio, vision, and video
Motion-controllable Video Generation via Latent Trajectory Guidance
Let's make video diffusion practical
Implementation of a U-net complete with efficient attention
HunyuanVideo: A Systematic Framework For Large Video Generation Model
GPT4V-level open-source multi-modal model based on Llama3-8B
An unsupervised and free tool for image and video dataset analysis
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
A general fine-tuning kit geared toward image/video/audio diffusion
Label Studio is a multi-type data labeling and annotation tool
Recovering the Visual Space from Any Views
Powerful open source team chat application
Python data, Leaflet.js maps
InvokeAI is a leading creative engine for Stable Diffusion models
We write your reusable computer vision tools
The most powerful and modular diffusion model GUI, API, and backend
Generating Immersive, Explorable, and Interactive 3D Worlds
Dealing with all unstructured data, such as reverse image search
Official Repo For "Sa2VA: Marrying SAM2 with LLaVA"
Sharp Monocular Metric Depth in Less Than a Second
Convert AI papers to GUI
A Telegram bot that integrates with OpenAI's official ChatGPT APIs