Search Results for "model based testing tool"

Showing 216 open source projects for "model based testing tool"

View related business solutions
  • Build Agents and Models on One Platform Icon
    Build Agents and Models on One Platform

    Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

    Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
    Try It Free
  • $300 Free Credits to Build on Google Cloud Icon
    $300 Free Credits to Build on Google Cloud

    New to Google Cloud? Get $300 in credits to explore Compute Engine, BigQuery, Cloud Run, Gemini Enterprise Agent Platform, and more.

    Start your next project with $300 in free Google Cloud credit. Spin up VMs, run containers, query petabytes in BigQuery, or build agents with Gemini Enterprise Agent Platform. Once your credits are used, keep building with 20+ always-free tier products including Compute Engine, Cloud Storage, GKE, and Cloud Run functions. No commitment required—just sign up and start building.
    Claim $300 Free
  • 1
    Cactus Needle

    Cactus Needle

    26m function call model that runs on incredibly small devices

    Needle is an experimental 26-million-parameter function-calling model designed to run on extremely small devices such as phones, watches, glasses, and low-power personal AI hardware. It is based on a Simple Attention Network architecture and was distilled from a much larger model to focus on fast, compact tool-use behavior. The project provides open weights, training details, dataset generation resources, and a playground for testing the model with custom tools. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 2
    FuzzyAI Fuzzer

    FuzzyAI Fuzzer

    A powerful tool for automated LLM fuzzing

    FuzzyAI is an open-source fuzzing framework designed to test the security and reliability of large language model applications. The tool automates the process of generating adversarial prompts and input variations to identify vulnerabilities such as jailbreaks, prompt injections, or unsafe model responses. It allows developers and security researchers to systematically evaluate the robustness of LLM-based systems by simulating a wide range of malicious or unexpected inputs. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Locust

    Locust

    Scalable open source load testing tool

    Locust is an open source user load testing tool written in Python. The idea behind Locust is to swarm your web site or other systems with attacks from simulated users during a test, with each user behavior defined by you using Python code. This swarming process is then monitored from a web UI in real-time, and will help identify any bottlenecks in your code before real users can come in.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 4
    Giskard

    Giskard

    Collaborative & Open-Source Quality Assurance for all AI models

    The testing framework dedicated to ML models, from tabular to LLMs. Giskard is an open-source testing framework dedicated to ML models, from tabular models to LLMs. Testing Machine Learning applications can be tedious. Since ML models depend on data, testing scenarios depend on the domain specificities and are often infinite. At Giskard, we believe that Machine Learning needs its own testing framework.
    Downloads: 1 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 5
    Deepchecks

    Deepchecks

    Test Suites for validating ML models & data

    Deepchecks is the leading tool for testing and for validating your machine learning models and data, and it enables doing so with minimal effort. Deepchecks accompany you through various validation and testing needs such as verifying your data’s integrity, inspecting its distributions, validating data splits, evaluating your model and comparing between different models.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    SERA CLI

    SERA CLI

    A tool to use the Ai2 Open Coding Agents Soft-Verified Agents

    SERA CLI is a command-line tool created by AllenAI to enable developers to interact with the SERA (Soft-Verified Efficient Repository Agents) model family using Claude Code as the execution front end. It provides a convenient interface for deploying, testing, and using SERA models without needing to write scaffold code from scratch, acting as both a proxy and utility wrapper to simplify workflows that involve large agent models.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    BruteForceAI

    BruteForceAI

    Advanced LLM-powered brute-force tool combining AI intelligence

    BruteForceAI is an open-source security testing tool that applies large language models to the analysis of login forms and authentication flows in web applications. At a high level, the project uses AI to inspect HTML content, identify the relevant form elements, and automate selector discovery so that a tester does not need to hand-map every field before evaluation. It combines that analysis layer with automated credential testing workflows, framing itself as a more adaptive alternative to...
    Downloads: 133 This Week
    Last Update:
    See Project
  • 8
    VulnClaw

    VulnClaw

    Based on AI Agent + MCP toolchain + penetration Skill orchestration

    VulnClaw is an AI-powered penetration testing agent that turns natural language security goals into structured testing workflows. It combines LLM agents, MCP toolchains, penetration testing skills, and command-line automation to support authorized security assessments. The project can guide information gathering, vulnerability discovery, validation, and report generation while keeping the workflow organized through sessions and tools. Its newer architecture uses a goal-driven solving engine...
    Downloads: 14 This Week
    Last Update:
    See Project
  • 9
    Arcade AI

    Arcade AI

    Arcade Tool Development Kit (TDK), Worker, Evals, and CLI

    Arcade AI Platform is a developer-oriented toolkit for building, deploying, and managing tools tailored to AI agents, structured as modular Python packages for flexibility and extensibility. Core platform functionality and schemas. This repository contains the core Arcade libraries, organized as separate packages for maximum flexibility and modularity. Evaluation framework for testing tool performance. Test your MCP server's tools, resources, prompts, elicitation, and OAuth 2. MCPJam is...
    Downloads: 5 This Week
    Last Update:
    See Project
  • Save Up to 91% on Cloud Compute With Spot VMs Icon
    Save Up to 91% on Cloud Compute With Spot VMs

    Automatic sustained-use discounts. One free VM per month. No negotiation needed.

    Run batch jobs at 60-91% off with Spot VMs. Long-running workloads get automatic discounts with sustained use.
    Try Free
  • 10
    Heretic

    Heretic

    Fully automatic censorship removal for language models

    Heretic is an open-source Python tool that automatically removes the built-in censorship or “safety alignment” from transformer-based language models so they respond to a broader range of prompts with fewer refusals. It works by applying directional ablation techniques and a parameter optimization strategy to adjust internal model behaviors without expensive post-training or altering the core capabilities.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 11
    Transformer Debugger

    Transformer Debugger

    Tool for exploring and debugging transformer model behaviors

    ...It automatically identifies and explains the most influential components, highlights activation patterns, and maps relationships across circuits within the model. The tool includes both a React-based neuron viewer for exploring model components and a backend activation server for running inferences and serving data.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    AWS SAM CLI

    AWS SAM CLI

    CLI tool to build, test, debug, and deploy Serverless applications

    The AWS Serverless Application Model (SAM) CLI is an open-source CLI tool that helps you develop serverless applications containing Lambda functions, Step Functions, API Gateway, EventBridge, SQS, SNS and more. The AWS Serverless Application Model (SAM) is an open-source framework for building serverless applications. It provides shorthand syntax to express functions, APIs, databases, and event source mappings.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    AgentBench

    AgentBench

    A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

    ...AgentBench also includes an evaluation framework that measures success rates, rewards, and task completion performance across different agent implementations. By testing models across diverse scenarios, the benchmark highlights strengths and weaknesses in reasoning, long-term planning, and tool usage.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    Synthetic Data Generator

    Synthetic Data Generator

    SDG is a specialized framework

    ...The platform enables developers and data scientists to create artificial datasets that preserve important relationships between variables without containing sensitive personal information. This makes the generated data suitable for tasks such as machine learning model training, testing software systems, sharing datasets across organizations, and conducting research without violating privacy regulations. The system supports multiple generation methods including statistical models, generative adversarial networks, and large language modelbased synthesis. It also includes a data processing module capable of handling different data types, preprocessing columns, managing missing values, and converting formats automatically before model training.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    RedAmon

    RedAmon

    AI-powered framework for automated penetration testing and red teaming

    RedAmon is an AI-powered red team framework designed to automate offensive cybersecurity operations from reconnaissance to exploitation and post-exploitation. It combines artificial intelligence with traditional penetration testing tools to create a fully autonomous pipeline capable of discovering vulnerabilities and executing security assessments without human intervention. It begins with a multi-phase reconnaissance engine that maps the entire attack surface of a target, collecting...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 16
    firerpa LAMDA

    firerpa LAMDA

    The most powerful Android RPA agent framework

    lamda is an Android RPA agent framework that provides visual remote desktop control and automation at scale, geared toward testing, automation validation, and device management. It exposes a clean UI to monitor and interact with connected devices and includes tooling to script actions reliably across apps and OS versions. The project emphasizes low-friction setup and powerful control primitives so teams can move from interactive validation to repeatable automation. A public wiki, releases,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    Pix2Text

    Pix2Text

    Open-Source Python3 tool for recognizing layouts, tables, and math

    An Open-Source Python3 tool for recognizing layouts, tables, math formulas, and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported. Pix2Text (P2T) aims to be a free and open-source Python alternative to Mathpix, and it can already accomplish Mathpix's core functionality. Pix2Text (P2T) can recognize layouts, tables, images, text, and mathematical...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 18
    reg-factory

    reg-factory

    Google Mail registration

    reg-factory is an automation-oriented account registration pipeline aimed at research, testing, and controlled platform workflow experiments. It combines browser automation, environment-based configuration, and a local web control panel to run multi-step registration scripts. The project includes modules for email registration, platform sign-up flows, token handling, cookie export, and account workflow management. It uses local scripts and external tool integrations to coordinate browser profiles, network settings, Android-based Gmail workflows, and runtime logs. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    ComfyUI

    ComfyUI

    The most powerful and modular diffusion model GUI, api and backend

    The most powerful and modular diffusion model is GUI and backend. This UI will let you design and execute advanced stable diffusion pipelines using a graph/nodes/flowchart-based interface. We are a team dedicated to iterating and improving ComfyUI, supporting the ComfyUI ecosystem with tools like node manager, node registry, cli, automated testing, and public documentation.
    Downloads: 418 This Week
    Last Update:
    See Project
  • 20
    alphageometry

    alphageometry

    AI-driven neuro-symbolic solver for high-school geometry problems

    AlphaGeometry, developed by Google DeepMind, is a theorem-proving system that combines symbolic reasoning with deep learning to solve challenging geometry problems, such as those found in mathematical Olympiads. The repository provides the full implementation of DDAR (Deductive Difference and Abductive Reasoning) and AlphaGeometry, two automated geometry solvers described in the 2024 Nature paper “Solving Olympiad Geometry without Human Demonstrations.” AlphaGeometry integrates a symbolic...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 21
    GLM-4.7

    GLM-4.7

    Advanced language and coding AI model

    GLM-4.7 is an advanced agent-oriented large language model designed as a high-performance coding and reasoning partner. It delivers significant gains over GLM-4.6 in multilingual agentic coding, terminal-based workflows, and real-world developer benchmarks such as SWE-bench and Terminal Bench 2.0. The model introduces stronger “thinking before acting” behavior, improving stability and accuracy in complex agent frameworks like Claude Code, Cline, and Roo Code. ...
    Downloads: 66 This Week
    Last Update:
    See Project
  • 22
    mitmproxy

    mitmproxy

    A free and open source interactive HTTPS proxy

    mitmproxy is an open source, interactive SSL/TLS-capable intercepting HTTP proxy, with a console interface fit for HTTP/1, HTTP/2, and WebSockets. It's the ideal tool for penetration testers and software developers, able to debug, test, and make privacy measurements. It can intercept, inspect, modify and replay web traffic, and can even prettify and decode a variety of message types. Its web-based interface mitmweb gives you a similar experience as Chrome's DevTools, with the addition of...
    Downloads: 16 This Week
    Last Update:
    See Project
  • 23
    Rapid LaTeX OCR

    Rapid LaTeX OCR

    Formula recognition based on LaTeX-OCR and ONNXRuntime

    Formula recognition based on LaTeX-OCR and ONNXRuntime. rapid_latex_ocr is a tool to convert formula images to latex format. The reasoning code in the repo is modified from LaTeX-OCR, the model has all been converted to ONNX format, and the reasoning code has been simplified, Inference is faster and easier to deploy. The repo only has codes based on ONNXRuntime or OpenVINO inference in onnx format and does not contain training model codes.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24
    GLM-4.6

    GLM-4.6

    Agentic, Reasoning, and Coding (ARC) foundation models

    GLM-4.6 is the latest iteration of Zhipu AI’s foundation model, delivering significant advancements over GLM-4.5. It introduces an extended 200K token context window, enabling more sophisticated long-context reasoning and agentic workflows. The model achieves superior coding performance, excelling in benchmarks and practical coding assistants such as Claude Code, Cline, Roo Code, and Kilo Code. Its reasoning capabilities have been strengthened, including improved tool usage during inference...
    Downloads: 68 This Week
    Last Update:
    See Project
  • 25
    ACI.dev

    ACI.dev

    Open platform connecting AI agents to tools via unified MCP server

    ACI is an open source platform designed to enable AI agents to interact with external tools through a unified and structured interface. It focuses on simplifying tool integration by connecting hundreds of pre-built services into agentic environments, allowing developers to avoid building custom API clients and authentication flows for each service. ACI provides intent-aware tool access, meaning agents can dynamically discover and use tools based on context rather than rigid configurations. It supports both direct function calling and a unified Model Context Protocol (MCP) server, offering flexibility in how integrations are exposed to AI systems. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
Auth0 Logo