Lightning fast C++/CUDA neural network framework
Towards Real-World Vision-Language Understanding
OCR expert VLM powered by Hunyuan's native multimodal architecture
Large-language-model & vision-language-model based on Linear Attention
Interactive Machine Learning experiments
Chat & pretrained large vision language model
airda (Air Data Agent)
Virtual AI anchor that combines state-of-the-art technology
Visual Automation IDE — automate anything you see on screen
Plug-n-play module turning text-to-image models into animation
dashAI: an interactive platform for training, evaluating and deploying
Visual Instruction Tuning: Large Language-and-Vision Assistant
computer vision projects | Fun AI projects related to computer vision
Guiding Instruction-based Image Editing via Multimodal Large Language
Open-source tool to visualise your RAG
CS2, Valorant, Fortnite, APEX, every game
Library of self-supervised methods for visual representation
Creation of a Taylorplot for several machine learning models
Official code for Style Aligned Image Generation via Shared Attention
Consistency Distilled Diff VAE
Visual localization made easy with hloc
Task-oriented finetuning for better embeddings on neural search
A reactive runtime for building durable AI agents
Enable sending and receiving images during chatting
A latent text-to-image diffusion model