Requirement-driven evaluation harness for AI agents and LLMs
SWE-agent takes a GitHub issue and tries to automatically fix it
The AI toolkit for the AI developer
Arcade Tool Development Kit (TDK), Worker, Evals, and CLI
ComfyUI wrapper nodes for WanVideo and related models
The Library for LLM-based multi-agent applications
Python SDK for agent monitoring, LLM cost tracking, benchmarking, etc.
Autonomous harness engineering
An AI agent that automatically builds AI models
Outcome-driven agent development framework that evolves
When LLM Meets Domain Experts
Open source codebase for Scale Agentex
An alignment auditing agent capable of exploring alignment hypotheses
Code for Cicero, an AI agent that plays the game of Diplomacy
A minimal yet professional single-agent demo project
A Python app/script that automatically adds important events to a calendar.
AI agent that streamlines the entire process of data analysis