Usable Implementation of "Bootstrap Your Own Latent" self-supervised
Chinese and English multimodal conversational language model
Tensor search for humans
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Multilingual sentence & image embeddings with BERT
Phi-3.5 for Mac: Locally-run Vision and Language Models
Turn your website into a GIF
Implementation of 'lightweight' GAN, proposed in ICLR 2021
An MCP server that autonomously evaluates web applications
Embed images and sentences into fixed-length vectors
Make drawing and labeling bounding boxes easy as cake
A lightweight vision library for performing large object detection
AutoGluon: AutoML for Image, Text, and Tabular Data
AI Toolkit for Healthcare Imaging
A Telegram RSS bot that cares about your reading experience
Jittor is a high-performance deep learning framework
AI-data warehouse to enrich, transform and analyze unstructured data
Pretrained model hub for Keras 3
We write your reusable computer vision tools
Inference code for CodeLlama models
Implementation of "MobileCLIP" CVPR 2024
Implementation of Phenaki Video, which uses MaskGIT
A Python library for self-supervised learning on images
Datasets, transforms and models specific to Computer Vision
Code for running inference and finetuning with the SAM 3 model