Official repository for LTX-Video
LTX-Video Support for ComfyUI
Sora AI Video Generator by Sora.FM
A suite of advanced multi-modal LLMs
Video understanding codebase from FAIR for reproducing video models
Large Multimodal Models for Video Understanding and Editing
Build Vision Agents quickly with any model or video provider
The python library for real-time communication
NVR with realtime local object detection for IP cameras
A react-based starter app for using the Live API over websockets
Sharp Monocular Metric Depth in Less Than a Second
Use Microsoft Edge's online text-to-speech service from Python
Document Image Parsing via Heterogeneous Anchor Prompting”
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
A Customizable Image-to-Video Model based on HunyuanVideo
Detect faces in an image
A Strong and Easy-to-use Single View 3D Hand+Body Pose Estimator
Computer vision and image processing library for Qt.
Sphere surface layers of visual cortex approach maximum info density
simple algorithm for a realtime interactive visual cortex for painting