Evaluate and compare LLM outputs, catch regressions, improve prompts
A gallery that showcases on-device ML/GenAI use cases
Local AI coding agent CLI with multi-agent orchestration tools
The smallest, simplest JavaScript pixel-level image comparison library
Evaluation and Tracking for LLM Experiments
An easy-to-use & supercharged open-source experiment tracker
An open-source, modern-design AI training tracking and visualization
Framework and no-code GUI for fine-tuning LLMs
A reinforcement learning package for Julia
https://github.com/iterative/vscode-dvc
A Claude skill that writes accurate prompts for any AI tool
Open source codebase for Scale Agentex
Test and evaluate LLMs and model configurations
Lightweight Python library for adding real-time multi-object tracking
Interactively analyze ML models to understand their behavior
Open source platform for the machine learning lifecycle
Debug, evaluate, and monitor your LLM apps, RAG systems, and agentic AI
A powerful Zotero AI and MCP plugin with ChatGPT, Gemini 3.1, Claude
Ensure consistency and alignment between different codebases
The repository provides code for running inference with SAM 2
A Gym environment for web task automation
Tool for visualizing and tracking your machine learning experiments
An open-source visual programming environment
Deploy and share agents with open infrastructure
Advanced RAG cookbooks for building accurate LLM applications