An extensive node suite that enables ComfyUI to process 3D inputs
Autoregressive Model Beats Diffusion
StarVector is a foundation model for SVG generation
Unified Multimodal Understanding and Generation Models
This repository contains the official implementation of FastVLM
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
VideoRAG: Chat with Your Videos
Official implementation of Watermark Anything with Localized Messages
Wan2.1: Open and Advanced Large-Scale Video Generative Model
VMZ: Model Zoo for Video Modeling
Taming Stable Diffusion for Lip Sync
Visual intelligence for your home
Edit videos with Claude Code
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Parse files for optimal RAG
Python inference and LoRA trainer package for the LTX-2 audio–video
Video Object and Interaction Deletion
Driving with Graph Visual Question Answering
LISA: Reasoning Segmentation via Large Language Model
Full-stack AI Red Teaming platform
Refer and Ground Anything Anywhere at Any Granularity
Self-supervised visual learning using momentum contrast in PyTorch
AI tool that converts GitHub repositories into interactive diagrams
Effortless data labeling with AI support from Segment Anything
Weaving the Digital Agent Galaxy