A Powerful Native Multimodal Model for Image Generation
A SOTA open-source image editing model
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Grab the color palette from an image using just JavaScript
An image processing library written entirely in JavaScript for Node
Interactive video and image annotation tool for computer vision
The most powerful and modular diffusion model GUI, API, and backend
Label Studio is a multi-type data labeling and annotation tool
A fast image processing library with low memory needs
Text- and image-to-video generation: CogVideoX (2024) and CogVideo
Guiding Instruction-based Image Editing via Multimodal Large Language
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences
Wan2.1: Open and Advanced Large-Scale Video Generative Model
State-of-the-art diffusion models for image and audio generation
Chat and pretrained large vision-language model
Easily turn large sets of image URLs into an image dataset
A neural network that transforms a design mock-up into a static website
Stable Diffusion with Core ML on Apple Silicon
Generating Immersive, Explorable, and Interactive 3D Worlds
Open Source Differentiable Computer Vision Library
Code for running inference with the SAM 3D Body Model 3DB
Cross-platform .NET wrapper for the OpenCV image processing library
CogView4, CogView3-Plus, and CogView3 (ECCV 2024)
Awesome multilingual OCR toolkits based on PaddlePaddle
Open Source Computer Vision Library