Tooling for the Common Objects In 3D dataset
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Image Upscaling in JavaScript. Increase image resolution up to 4x
Generating Immersive, Explorable, and Interactive 3D Worlds
A Photo Editor library with simple, easy support for image editing
Capable of understanding text, audio, vision, and video
Build blazing fast, modern apps and websites with React
PS2 Covers Collection
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Language modeling in a sentence representation space
Easily compute clip embeddings and build a clip retrieval system
A tool to automatically resolve Git conflicts
An unsupervised and free tool for image and video dataset analysis
RGBD video generation model conditioned on camera input
Tensor search for humans
A distributed system for embedding-based vector retrieval
A lightweight vision library for performing large object detection
Pure C inference for the Flux 2 image generation model
Implementation of "MobileCLIP" CVPR 2024
A Pioneering Open-Source Alternative to GPT-4o
Minimal scripts to run the emulator in a container for various systems
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
An app that upscales anime-styled images, gifs, and videos
Refine and quantize messy AI pixel art into clean, perfect pixels
DomainBed is a suite to test domain generalization algorithms