Official repo for "Sa2VA: Marrying SAM2 with LLaVA"
Let's make video diffusion practical
Code for Cicero, an AI agent that plays the game of Diplomacy
ChatGPT extension for scientific research work
"Big Model" trains a visual multimodal VLM with 26M parameters
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Elyra extends JupyterLab with an AI-centric approach
OCR expert VLM powered by Hunyuan's native multimodal architecture
Inference script for Oasis 500M
Handwritten Text Recognition (HTR) system implemented with TensorFlow
Python package for AutoML on Tabular Data with Feature Engineering
LLM training in simple, raw C/CUDA
AI bridge enabling assistants to control and automate Unity Editor
Fast, powerful, git-native ticket tracking in a single bash script
Flexible Photo Recrafting While Preserving Your Identity
Pre-trained Deep Learning models and demos
The Unified Machine Learning Framework
A library for accelerating Transformer models on NVIDIA GPUs
Containerized automation engine for programmable CI/CD workflows
Jittor is a high-performance deep learning framework
Open Source Computer Vision Library
Guiding Instruction-based Image Editing via Multimodal Large Language
Fun AI projects related to computer vision
Uranie is CEA's uncertainty analysis platform, based on ROOT
AI-powered PC monitoring that explains rather than just showing numbers and spikes