Make videos programmatically with React
Inference script for Oasis 500M
A Customizable Image-to-Video Model based on HunyuanVideo
Official Python inference and LoRA trainer package
Multimodal Diffusion with Representation Alignment
NVR with realtime local object detection for IP cameras
Let's make video diffusion practical
Text mining using tidy tools
AI tool that removes hardcoded subtitles and text from videos locally
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
Implementation of a U-net complete with efficient attention
Official repository for LTX-Video
Open-source multi-speaker long-form text-to-speech model
ESP32 Camera motion capture application to record JPEGs to SD card
Code for running inference with the SAM 3D Body Model 3DB
Convert AI papers to GUI
Behavior tree AI for Godot Engine
AI-assisted storyboard and video generation tool
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Video understanding codebase from FAIR for reproducing video models
Qwen2.5-VL is a multimodal large language model series
A reactive notebook for Python
MCP server enabling AI coding tools to access Figma design data
Doom-based AI research platform for reinforcement learning
Large Multimodal Models for Video Understanding and Editing