This repo contains the code for a 1D tokenizer and generator
Visual Causal Flow
LTX-Video Support for ComfyUI
Code for running inference and finetuning with the SAM 3 model
AI coding assistant skill (Claude Code, Codex, OpenCode, OpenClaw)
Automated translation solution for visual novels
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System
Open source demo platform where you can easily showcase your AI models
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Official Python inference and LoRA trainer package
Skywork-R1V is an advanced multimodal AI model series
Visual intelligence for your home.
Tiny vision language model
An extensive node suite that enables ComfyUI to process 3D inputs
LISA: Reasoning Segmentation via Large Language Model
Recovering the Visual Space from Any Views
AI-Powered Wiki Generator for GitHub/GitLab/Bitbucket Repositories
StarVector is a foundation model for SVG generation
Unified Multimodal Understanding and Generation Models
Python inference and LoRA trainer package for the LTX-2 audio–video
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Machine learning image inpainting task that removes watermarks
Video Object and Interaction Deletion
Multimodal Diffusion with Representation Alignment
A neural network that transforms a design mock-up into static websites