This repo contains the code for a 1D tokenizer and generator
Visual Causal Flow
LTX-Video Support for ComfyUI
Code for running inference and finetuning with the SAM 3 model
Automated translation solution for visual novels
AI coding assistant skill (Claude Code, Codex, OpenCode, OpenClaw)
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System
Open source demo platform where you can easily showcase your AI models
Official Python inference and LoRA trainer package
Skywork-R1V is an advanced multimodal AI model series
Visual intelligence for your home.
Tiny vision language model
An extensive node suite that enables ComfyUI to process 3D inputs
StarVector is a foundation model for SVG generation
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
AI-Powered Wiki Generator for GitHub/GitLab/Bitbucket Repositories
Recovering the Visual Space from Any Views
LISA: Reasoning Segmentation via Large Language Model
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Unified Multimodal Understanding and Generation Models
Python inference and LoRA trainer package for the LTX-2 audio–video
Machine learning image inpainting task that removes watermarks
A neural network that transforms a design mock-up into a static website
SAPIEN Manipulation Skill Framework
Reference PyTorch implementation and models for DINOv3