Official repository for LTX-Video
LTX-Video Support for ComfyUI
Sora AI Video Generator by Sora.FM
Video understanding codebase from FAIR for reproducing video models
A suite of advanced multi-modal LLMs
Large Multimodal Models for Video Understanding and Editing
Build Vision Agents quickly with any model or video provider
The Python library for real-time communication
NVR with realtime local object detection for IP cameras
A React-based starter app for using the Live API over websockets
Sharp Monocular Metric Depth in Less Than a Second
Use Microsoft Edge's online text-to-speech service from Python
Document Image Parsing via Heterogeneous Anchor Prompting
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Open Source Computer Vision Library
A Customizable Image-to-Video Model based on HunyuanVideo
Detect faces in an image
A Strong and Easy-to-use Single View 3D Hand+Body Pose Estimator
Computer vision and image processing library for Qt
Sphere surface layers of visual cortex approach maximum info density
Simple algorithm for a realtime interactive visual cortex for painting