Unsupervised Learning for Image Registration
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System
Oobabooga - The definitive Web UI for local AI, with powerful features
Synthesizing and manipulating 2048x1024 images with conditional GANs
A self-hosted open source photo management service
Structured data extraction and instruction calling with ML and LLMs
Medical imaging toolkit for deep learning
Stable Diffusion web UI
An open source implementation of CLIP
Effortless data labeling with AI support from Segment Anything
Models for object and human mesh reconstruction
Stable Diffusion built-in to Blender
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
An AI personal assistant for your digital brain
Kaggle Python Docker image
Easily compute CLIP embeddings and build a CLIP retrieval system
An on-premises, OCR-free unstructured data extraction
Unified web UI for training and running open models locally
Claude Code for everything except coding
A set of Docker images for training and serving models in TensorFlow
The official Python library for the OpenAI API
Skywork-R1V is an advanced multimodal AI model series
Code and models for the ICML 2024 paper NExT-GPT
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model