Document Image Parsing via Heterogeneous Anchor Prompting
Convert AI papers to GUI
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
An extensive node suite that enables ComfyUI to process 3D inputs
Project Lyra: Open Generative 3D World Models
PyTorch code and models for V-JEPA self-supervised learning from video
Data Lake for Deep Learning. Build, manage, and query datasets
Build cross-modal and multimodal applications on the cloud
A Customizable Image-to-Video Model based on HunyuanVideo
Convert various image, audio and video formats from your context menu.
Kemono Downloader - A cross-platform Python app built with PyQt6
Overcoming Data Limitations for High-Quality Video Diffusion Models
AI Suite for upscaling, interpolating & restoring images/videos
xSTUDIO is a high-performance playback and review tool.
Automate the creation of many trailer-like videos with a single click!
Stream low latency video from your desktop or webcam over TCP/IP
PyExe: YouTube thumbnail downloader (type-b) [I.S.A]
A fast, powerful, and simple hierarchical vision transformer
It's possible for machines to become self-aware.
A portable web server suite for 64-bit Windows, for web development.
An advanced file manager with QSS themes and ISO and folder previews
Computer vision projects | Fun AI projects related to computer vision
CLIP + FFT/DWT/RGB = text to image/video
YOLOv3 implemented in TensorFlow 2.0