Browse the web directly from Cursor and similar tools
Automate native Android apps with AI using accessibility APIs
Elyra extends JupyterLab with an AI-centric approach
"Big Model" trains a multimodal vision-language model (VLM) with 26M parameters
Virtual AI anchor that combines state-of-the-art technology
Phi-3.5 for Mac: Locally-run Vision and Language Models
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
Inference script for Oasis 500M
Guiding Instruction-based Image Editing via Multimodal Large Language
Open-source platform for building enterprise-grade agents
ICLR2024 Spotlight: curation/training code, metadata, distribution
[CVPR 2025 Best Paper Award] VGGT
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences
Large language model and vision-language model based on linear attention
Gemma open-weight LLM library, from Google DeepMind
airda (Air Data Agent)
Flexible Photo Recrafting While Preserving Your Identity
Python package for AutoML on Tabular Data with Feature Engineering
Library of self-supervised methods for visual representation
Official code for Style Aligned Image Generation via Shared Attention
Plug-n-play module turning text-to-image models into animation
Creation of a Taylor plot for several machine learning models
Consistency Distilled Diff VAE
Open-source observability and analytics for LLM apps