source testing unit testing free download

LangCheck

Simple, Pythonic building blocks to evaluate LLM applications

Downloads: 5 This Week

Last Update: 2024-12-12

Strix

Open-source AI hackers to find and fix your app’s vulnerabilities

Strix is an open source agent-driven security platform that uses autonomous AI agents to identify, investigate, and validate vulnerabilities in software applications. The system is designed to mimic the behavior of real attackers by executing dynamic testing and verifying findings through proof-of-concept exploitation. Unlike traditional vulnerability scanners that rely heavily on static analysis, Strix agents actively run code, probe systems, and attempt exploitation to confirm whether vulnerabilities are genuinely exploitable. ...

Downloads: 10 This Week

Last Update: 2026-03-23

FuzzyAI Fuzzer

A powerful tool for automated LLM fuzzing

...The framework can be integrated into development pipelines to continuously test AI APIs and detect weaknesses before deployment. FuzzyAI provides testing tools, datasets, and evaluation workflows that help researchers measure how well models resist harmful instructions or attempts to bypass safety mechanisms.

Downloads: 0 This Week

Last Update: 2026-03-09

Rogue

AI Agent Evaluator & Red Team Platform

...The system allows developers to define specific scenarios, expected outcomes, and business rules so that the framework can verify whether an agent behaves according to required policies. During testing, Rogue records conversations and produces detailed reports that explain whether the agent passed or failed each scenario. These reports include reasoning and evidence, helping developers understand why a particular failure occurred.

Downloads: 4 This Week

Last Update: 2026-04-29

Easy DataSet

A powerful tool for creating datasets for LLM fine-tuning

...The system includes automated question-generation capabilities, hierarchical label trees, and answer generation pipelines that use LLM APIs to produce coherent paired data with customizable templates. Beyond dataset creation, Easy-dataset also provides a built-in evaluation system with model testing and blind-test features, helping teams validate model performance using curated test sets.

Downloads: 9 This Week

Last Update: 2026-04-10

Synthetic Data Generator

SDG is a specialized framework

...This makes the generated data suitable for tasks such as machine learning model training, testing software systems, sharing datasets across organizations, and conducting research without violating privacy regulations. The system supports multiple generation methods including statistical models, generative adversarial networks, and large language model–based synthesis. It also includes a data processing module capable of handling different data types, preprocessing columns, managing missing values, and converting formats automatically before model training.

Downloads: 8 This Week

Last Update: 2026-03-06

Claude Code Skills & Plugins Hub

270+ Claude Code plugins with 739 agent skills

Claude Code Plugins Plus Skills is a large open-source ecosystem of plugins and AI “skills” designed to extend the capabilities of Claude Code development agents. The repository functions as a marketplace-style collection of hundreds of plugins and specialized skills that enable Claude Code to perform complex development, automation, and operational tasks. These plugins cover a wide range of domains including DevOps automation, security testing, API debugging, infrastructure management, and AI workflow orchestration. ...

Downloads: 6 This Week

Last Update: 2 days ago

BruteForceAI

Advanced LLM-powered brute-force tool combining AI intelligence

BruteForceAI is an open-source security testing tool that applies large language models to the analysis of login forms and authentication flows in web applications. At a high level, the project uses AI to inspect HTML content, identify the relevant form elements, and automate selector discovery so that a tester does not need to hand-map every field before evaluation.

Downloads: 5 This Week

Last Update: 2026-03-09

LangWatch

The platform for LLM evaluations and AI agent testing

LangWatch is an open-source observability and monitoring platform designed to help developers evaluate and improve applications built with large language models. The platform provides tools for tracking model interactions, analyzing prompt behavior, and identifying issues such as hallucinations, latency problems, or unexpected responses. By collecting telemetry data from AI applications, LangWatch allows developers to understand how their systems perform in real-world usage scenarios. The...

Downloads: 2 This Week

Last Update: 5 days ago

mistral.rs

Fast, flexible LLM inference

mistral.rs is a fast and flexible LLM inference engine implemented in Rust, designed to run and serve modern language models with an emphasis on performance and practical deployment. It provides multiple entry points for developers, including a CLI for running models locally and an HTTP server that exposes an OpenAI-compatible API surface for easy integration with existing clients. The project includes hardware-aware tooling that can benchmark a system and choose sensible quantization and...

Downloads: 4 This Week

Last Update: 2026-04-02

AgentBench

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

...These environments require agents to interpret instructions, take actions, and adapt their strategies based on feedback from the environment. AgentBench also includes an evaluation framework that measures success rates, rewards, and task completion performance across different agent implementations. By testing models across diverse scenarios, the benchmark highlights strengths and weaknesses in reasoning, long-term planning, and tool usage.

Downloads: 0 This Week

Last Update: 2026-03-05

langrocks

Tools like web browser, computer access and code runner for LLMs

Langrocks is a collection of tools for LLMs, including a web browser, computer access, and a code runner.

Downloads: 1 This Week

Last Update: 2024-11-21

promptmap2

A security scanner for custom LLM applications

promptmap is an automated security scanner for custom LLM applications that focuses on prompt injection and related attack classes. The project supports both white-box and black-box testing, which means it can either run tests directly against a known model and system prompt configuration or attack an external HTTP endpoint without internal access. Its scanning workflow uses a dual-LLM architecture in which one model acts as the target being tested and another acts as a controller that evaluates whether an attack succeeded. ...

Downloads: 0 This Week

Last Update: 2026-03-10

Langflow

Low-code app builder for RAG and multi-agent AI applications

Langflow is a low-code app builder for RAG and multi-agent AI applications. It’s Python-based and agnostic to any model, API, or database.

Downloads: 11 This Week

Last Update: 2026-05-01

Deta Surf

Personal AI Notebooks. Organize files & webpages and generate notes

Surf is an open-source personal AI notebook for organizing files and webpages and generating notes from them.

Downloads: 9 This Week

Last Update: 2026-04-29

super-agent-party

All-in-one AI companion! Desktop girlfriend + virtual streamer

Super Agent Party is an open-source experimental framework designed to demonstrate collaborative multi-agent AI systems interacting within a shared environment. The project explores how multiple specialized AI agents can coordinate to solve complex tasks by communicating with each other and sharing information. Instead of relying on a single monolithic model, the framework organizes agents with different roles or capabilities that cooperate to achieve goals. Each agent may handle different...

Downloads: 11 This Week

Last Update: 2026-05-01

Agent Development Kit (ADK) for Java

An open-source, code-first Java toolkit

Google’s Agent Development Kit for Java is an open-source toolkit that helps developers design, evaluate, and deploy advanced AI agents using the Java programming language. The framework follows a code-first approach that treats agent development as a structured software engineering task rather than a collection of prompt scripts. It provides abstractions and tools that allow developers to create agents capable of executing complex workflows, calling tools, and interacting with external...

Downloads: 6 This Week

Last Update: 2026-04-27

Agent Behavior Monitoring

The open source post-building layer for agents

Agent Behavior Monitoring is an open-source framework designed to monitor, evaluate, and improve the behavior of AI agents operating in real or simulated environments. The system focuses on agent behavior monitoring by collecting interaction data and analyzing how agents perform across different scenarios and tasks. Developers can use the framework to observe agent actions in both online production environments and offline evaluation settings, making it useful for debugging and performance...

Downloads: 5 This Week

Last Update: 2026-04-09

WeClone

One-stop solution for creating your digital avatar from chat history

...By processing large volumes of conversation data, WeClone can build a profile of an individual’s writing tone, vocabulary preferences, and conversational tendencies. Developers can use the resulting model to create chatbots that simulate a specific user’s communication patterns for testing or research purposes. Overall, WeClone explores the idea of digital identity replication through machine learning and conversational modeling.

Downloads: 3 This Week

Last Update: 2026-03-04

Opik

Debug, evaluate, and monitor your LLM apps, RAG systems, and agentic AI

Confidently evaluate, test, and monitor LLM applications. Opik is an open-source platform for evaluating, testing, and monitoring LLM applications. Built by Comet. Record, sort, search, and understand each step your LLM app takes to generate a response. Manually annotate, view, and compare LLM responses in a user-friendly table. Log traces during development and in production. Run experiments with different prompts and evaluate against a test set.

Downloads: 4 This Week

Last Update: 2 days ago

Prometheus-Eval

Evaluate your LLM's responses with Prometheus and GPT-4

Prometheus-Eval is an open-source framework designed to evaluate the outputs of large language models using specialized evaluator models known as Prometheus. The project provides tools, datasets, and scripts that allow developers and researchers to measure the quality of LLM responses through automated scoring rather than relying solely on human evaluators. It implements an “LLM-as-a-judge” approach in which a dedicated language model analyzes instruction–response pairs and assigns scores or...

Downloads: 3 This Week

Last Update: 2026-03-09

Bedrock Chat

AWS-native chatbot using Bedrock

Bedrock Chat is a mirrored version of an open-source project that provides a conversational interface for interacting with large language models and AI services through a chat-style application. The project typically focuses on delivering a user interface that allows individuals or teams to communicate with AI models, manage conversations, and experiment with prompts and responses. Implementations like Bedrock Chat often integrate with model hosting platforms or APIs that provide access to...

Downloads: 3 This Week

Last Update: 2026-04-09

Unity MCP

AI-powered bridge connecting LLMs and advanced AI agents

Unity-MCP is an open-source integration that connects artificial intelligence assistants with the Unity game development environment through the Model Context Protocol. The project enables AI tools such as coding assistants and autonomous agents to interact directly with Unity projects, allowing them to analyze scenes, modify assets, and generate code within the development environment. By exposing Unity editor functionality through MCP tools, the plugin allows external AI systems to...

Downloads: 6 This Week

Last Update: 1 day ago

Mosec

A high-performance ML model serving framework that offers dynamic batching

Mosec is a high-performance, flexible model-serving framework for building ML-enabled backends and microservices. It bridges the gap between the machine learning models you have just trained and an efficient online service API.

Downloads: 1 This Week

Last Update: 2026-04-15

Hallucination Leaderboard

Leaderboard Comparing LLM Performance at Producing Hallucinations

Hallucination Leaderboard is an open research project that tracks and compares the tendency of large language models to produce hallucinated or inaccurate information when generating summaries. The project provides a standardized benchmark that evaluates different models using a dedicated hallucination detection system known as the Hallucination Evaluation Model. Each model is tested on document summarization tasks to measure how often generated responses introduce information that is not...

Downloads: 1 This Week

Last Update: 2026-04-29

Search Results for "source testing unit testing"

Showing 36 open source projects for "source testing unit testing"
