Evaluation suite designed to assess the performance of LLMs
DoWhy is a Python library for causal inference
Ministack: Free, open-source local AWS emulator
Multi-agent autonomous startup system for Claude Code
Unites the best signal intelligence tools
Concurrent Python made simple
Comprehensive paid advertising audit & optimization skill
The AI toolkit for the AI developer
The first open-source agentic AI physicist
The modern API client that lives in your terminal
Lighter web automation with Python
A Python library for automating interaction with websites
DeepCode: Open Agentic Coding
Multi-Joint dynamics with Contact. A general-purpose physics simulator
Aider is AI pair programming in your terminal
A collaboration-friendly studio for NeRFs
Lightweight Markdown-only skills for autonomous ML research
The open-source post-building layer for agents
Framework for jumpstarting production-ready Django projects quickly
One-stop solution for creating your digital avatar from chat history
Framework for automatic construction of vulnerable infrastructures
Lightweight framework for evaluating large language model performance
The comprehensive WSGI web application library
Evaluate your LLM's response with Prometheus and GPT-4