43 projects for "tests" with 2 filters applied:

  • 1
    Qodo Cover

    AI tool that generates tests to improve code coverage quickly

    Qodo Cover is an open source developer tool designed to automate the creation of unit tests using generative AI, helping teams improve code coverage with minimal manual effort. It operates as a command-line interface and can also be integrated into continuous integration workflows, making it adaptable to different development environments. It analyzes an existing codebase, identifies gaps in test coverage, and generates new tests that target uncovered or weakly tested areas. ...
    Downloads: 0 This Week
  • 2
    Micro Agent

    AI CLI agent that writes code by iterating until tests pass

    Micro Agent is a command-line tool designed to generate and refine code using a test-driven approach powered by large language models. Instead of producing one-shot code outputs, it creates or uses test cases and repeatedly iterates on the generated code until those tests pass successfully. This workflow emphasizes reliability by using structured feedback from failing tests to guide improvements, reducing the need for manual debugging and iteration. Micro Agent intentionally limits its scope to a focused task, avoiding complex multi-file operations or full project automation in order to minimize compounding errors. ...
    Downloads: 3 This Week
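The iterate-until-tests-pass workflow described for Micro Agent can be sketched roughly as below. This is a minimal illustration, not the tool's actual implementation: `propose`, `run_tests`, and `iterate_until_green` are hypothetical names, and the language-model call is replaced by a canned list of candidates so the loop runs offline.

```python
from typing import Callable, Optional, Tuple

# Hypothetical stand-in for a model call. A real agent would send the
# failure feedback to an LLM; here we just return canned candidates.
def propose(attempt: int, feedback: Optional[str]) -> str:
    canned = [
        "def add(a, b):\n    return a - b\n",  # deliberately wrong
        "def add(a, b):\n    return a + b\n",  # passes the tests
    ]
    return canned[min(attempt, len(canned) - 1)]

def run_tests(source: str) -> Optional[str]:
    """Return None when every test passes, else a failure message."""
    namespace: dict = {}
    exec(source, namespace)  # load the candidate implementation
    add = namespace["add"]
    for args, expected in [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]:
        got = add(*args)
        if got != expected:
            return f"add{args} returned {got}, expected {expected}"
    return None

def iterate_until_green(
    propose_fn: Callable[[int, Optional[str]], str],
    max_attempts: int = 5,
) -> Tuple[str, int]:
    """Ask for candidates until the tests pass or attempts run out."""
    feedback: Optional[str] = None
    for attempt in range(max_attempts):
        source = propose_fn(attempt, feedback)
        feedback = run_tests(source)  # structured feedback for the next round
        if feedback is None:
            return source, attempt + 1
    raise RuntimeError(f"tests still failing: {feedback}")

if __name__ == "__main__":
    final_source, attempts = iterate_until_green(propose)
    print(f"passed after {attempts} attempt(s)")
```

The attempt cap is an assumption added here so the loop always terminates.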
  • 3
    AI Runner

    Offline inference engine for art, real-time voice conversations

    ...The project has a strong focus on developer ergonomics, with thorough development guidelines, environment configuration using .env variables, and a clear structure for tests, tools and agents.
    Downloads: 7 This Week
  • 4
    AutoBE

    AI Vibe Coding Agent of TS backend server

    ...Its main value is giving developers and non-programmers a structured way to generate backend systems from requirements while still producing documentation and tests.
    Downloads: 0 This Week
  • 5
    Anthropic's Original Performance

    Anthropic's original performance take-home, now open for you to try

    ...The project sets up a baseline performance problem where participants work to reduce simulated “clock cycles” required to run a given workload, effectively challenging them to engineer faster code under constraints. This take-home includes starter code, tests, and tools to debug performance, aiming to measure how effectively one can apply algorithmic improvements and optimizations. Because it’s framed around beating baseline scores — and even outperforming previous automated systems — it encourages both deep knowledge of Python and creative problem-solving.
    Downloads: 0 This Week
  • 6
    Happy Coder

    Mobile and Web client for Codex and Claude Code, with realtime voice

    ...You can start a coding session locally through the Happy CLI or connect from a phone or browser, allowing developers to inspect, interact with, and guide the AI as it generates, tests, or explains code. The project includes components like a dedicated backend server for encrypted sync, a rich front-end experience across web and native apps, and support for push notifications when your coding agent encounters permission requests or errors. Happy prioritizes security with end-to-end encryption so your code and interactions remain private and auditable.
    Downloads: 39 This Week
  • 7
    agents.md

    A simple, open format for guiding coding agents

    ...Instead of putting everything in README or doc files (which are more human-oriented and might mix high-level narrative), AGENTS.md is intended to surface agent-relevant details that help them “do the right thing” (tests, style, project structure, tooling).
    Downloads: 1 This Week
  • 8
    Agentless

    An agentless approach to automatically solve software development

    ...It then generates multiple candidate patches for the identified locations using language model reasoning and diff-style edits. In the final stage, the framework validates potential patches by running regression tests and additional reproduction tests to confirm whether the fix resolves the original error. Based on these results, the system ranks the candidate patches and selects the most reliable solution to submit.
    Downloads: 0 This Week
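The final validate-and-rank stage described for Agentless might look something like the sketch below. The listing does not say how the two kinds of test results are weighted, so the ordering here (reproduction test first, then regression pass rate) is an assumption, and `PatchResult` and `rank_patches` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class PatchResult:
    patch_id: str
    regression_passed: int   # regression tests that still pass
    regression_total: int    # regression tests that were run
    reproduction_fixed: bool # reproduction test no longer triggers the bug

def regression_rate(result: PatchResult) -> float:
    """Fraction of regression tests passing (0.0 if none were run)."""
    if result.regression_total == 0:
        return 0.0
    return result.regression_passed / result.regression_total

def rank_patches(results: List[PatchResult]) -> List[PatchResult]:
    # Assumed priority: fixing the original error outranks regression rate.
    return sorted(
        results,
        key=lambda r: (r.reproduction_fixed, regression_rate(r)),
        reverse=True,
    )

if __name__ == "__main__":
    candidates = [
        PatchResult("patch-a", 10, 10, False),
        PatchResult("patch-b", 9, 10, True),
        PatchResult("patch-c", 10, 10, True),
    ]
    best = rank_patches(candidates)[0]
    print("selected:", best.patch_id)
```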
  • 9
    ArXiv MCP Server

    A Model Context Protocol server for searching and analyzing arXiv

    ...With simple tools like “search” and “fetch,” an agent can find papers, pull abstracts, and download PDFs for downstream summarization or analysis. The project includes packaging and CI to publish to PyPI, plus tests and linting for reliability. Issue threads show feature requests such as extracting embedded LaTeX and improving markdown conversion, reflecting active community use in research flows. It’s designed to be drop-in for MCP clients, giving them typed inputs/outputs and predictable errors around a well-known academic corpus. For developers building research copilots, it removes the glue work of wiring arXiv APIs into an agent toolchain.
    Downloads: 0 This Week
  • 10
    rep+

    Burp-style HTTP Repeater for Chrome DevTools with built‑in AI

    rep+ is a lightweight browser extension for Chrome DevTools that brings a Burp Suite-style HTTP repeater directly into the developer console, enhanced with built-in AI to help explain requests and suggest tests. It captures HTTP traffic from the inspected page without needing a proxy, allowing users to replay, modify, and analyze individual requests with fine-grained control over headers, bodies, and methods. The tool offers hierarchical grouping, tagging, and filtering of captured requests so that developers and security testers can manage complex traffic flows efficiently. ...
    Downloads: 8 This Week
  • 11
    darwin-skill

    Autoresearch-inspired autonomous skill optimization for Claude Code

    darwin-skill is an experimental framework designed to automatically improve AI agent “skills” through iterative evaluation and optimization loops inspired by machine learning training processes. Instead of treating prompts or skill definitions as static assets, the system applies a continuous improvement cycle that evaluates performance, proposes changes, tests outcomes, and either retains or reverts modifications. The framework introduces a scoring system across multiple dimensions, enabling quantitative assessment of skill quality and ensuring that only improvements are preserved over time. It incorporates a “ratchet mechanism” similar to version control workflows, guaranteeing that performance never degrades as iterations progress. ...
    Downloads: 4 This Week
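The "ratchet" idea described for darwin-skill, where a change is kept only if it scores better and is reverted otherwise, can be illustrated with a toy loop. This is a sketch under stated assumptions: the skill is reduced to a single number, and `score`, `mutate`, and `ratchet_optimize` are hypothetical names, not the project's API.

```python
import random
from typing import Callable, List, Tuple

def ratchet_optimize(
    initial: float,
    score: Callable[[float], float],
    mutate: Callable[[float, random.Random], float],
    steps: int,
    seed: int = 0,
) -> Tuple[float, List[float]]:
    """Keep a candidate only when it beats the current best score."""
    rng = random.Random(seed)
    best, best_score = initial, score(initial)
    history = [best_score]
    for _ in range(steps):
        candidate = mutate(best, rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score  # retain
        # otherwise the change is dropped, i.e. reverted
        history.append(best_score)
    return best, history

# Toy objective: the "skill" is best when its value is 3.0.
def toy_score(x: float) -> float:
    return -(x - 3.0) ** 2

def toy_mutate(x: float, rng: random.Random) -> float:
    return x + rng.uniform(-1.0, 1.0)

if __name__ == "__main__":
    best, history = ratchet_optimize(0.0, toy_score, toy_mutate, steps=50)
    print(f"best value {best:.3f}, score {history[-1]:.4f}")
```

Because a candidate replaces the current best only on a strict improvement, the recorded best score never decreases from one iteration to the next.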
  • 12
    AIPex

    AI browser automation assistant, no migration and privacy first

    AIPex is an AI-augmented development toolkit and workflow platform that aims to accelerate software productivity by integrating intelligent assistants, code generation tools, and customizable automation patterns directly into developer workflows. Rather than treating AI as a separate helper, AIPex embeds AI capabilities into common tasks like scaffolding components, generating tests, analyzing code quality, and performing refactors, allowing developers to stay in flow while benefiting from model-assisted insights. It supports modular plugin architecture so teams can extend or customize how assistants behave based on project conventions, code standards, or tooling preferences. AIPex also includes orchestration pipelines that let teams define multi-step AI-driven transformations — for example, generating code then running validation, producing documentation, and opening change requests — all within a unified pattern.
    Downloads: 7 This Week
  • 13
    LLM Colosseum

    Benchmark LLMs by fighting in Street Fighter 3

    ...The system places language models inside the environment of the classic video game Street Fighter III, where they must interpret the game state and decide which actions to perform during combat. This setup creates a dynamic environment that tests reasoning, situational awareness, and decision-making abilities in real time. Instead of relying purely on reward signals as in reinforcement learning agents, the models analyze contextual information and generate strategic actions based on the game environment. Performance is evaluated using a competitive ranking system that assigns models an Elo rating based on their results across matches against other models.
    Downloads: 6 This Week
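For readers unfamiliar with Elo ratings, the standard update rule is sketched below. The K-factor of 32 and the starting rating of 1500 are common defaults assumed here for illustration; the listing does not say which values LLM Colosseum uses, and `expected_score` and `update_elo` are hypothetical names.

```python
from typing import Tuple

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability-like expected score of A against B (standard Elo)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def update_elo(
    rating_a: float,
    rating_b: float,
    score_a: float,   # 1.0 = A wins, 0.5 = draw, 0.0 = A loses
    k: float = 32.0,  # assumed K-factor
) -> Tuple[float, float]:
    """Return the new ratings for A and B after one match."""
    expected_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

if __name__ == "__main__":
    # Two models start level; model A wins one match.
    new_a, new_b = update_elo(1500.0, 1500.0, score_a=1.0)
    print(new_a, new_b)  # 1516.0 1484.0
```

The update is zero-sum: whatever one model gains, the other loses, so the total rating in the pool stays constant.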
  • 14
    GLM-4.5

    GLM-4.5: Open-source LLM for intelligent agents by Z.ai

    GLM-4.5 is a cutting-edge open-source large language model designed by Z.ai for intelligent agent applications. The flagship GLM-4.5 model has 355 billion total parameters with 32 billion active parameters, while the compact GLM-4.5-Air version offers 106 billion total parameters and 12 billion active parameters. Both models unify reasoning, coding, and intelligent agent capabilities, providing two modes: a thinking mode for complex reasoning and tool usage, and a non-thinking mode for...
    Downloads: 37 This Week
  • 15
    gTTS

    Python library and CLI tool to interface with Google Translate

    gTTS (Google Text-to-Speech) is a Python library and command-line tool that wraps the speech functionality of Google Translate. It lets you send text to the Google Translate TTS endpoint and receive spoken audio back as MP3 data, either written to a file, a file-like object, or standard output. The library is designed to handle long texts, using a speech-specific sentence tokenizer that keeps intonation and punctuation natural while splitting requests into acceptable chunks. It supports...
    Downloads: 10 This Week
  • 16
    Bifrost

    The Fastest LLM Gateway with built-in OTel observability

    ...It abstracts away the complexity of working directly with multiple backend providers (OpenAI, Anthropic, AWS Bedrock, Google Vertex, etc.), enabling you to plug in providers and switch between them without touching your client code. It is built to be high performance: in benchmark tests at 5,000 requests per second, it reportedly adds only microseconds of overhead and achieves perfect success rates with no failed requests. Bifrost supports features such as automatic fallback (failover between providers), load balancing across API keys/providers, and semantic caching to reduce latency and cost. It also includes observability with built-in metrics, tracing, logging, and supports governance features like rate limiting, access control, and cost budgeting. ...
    Downloads: 5 This Week
  • 17
    Expect

    Let agents test your code in a real browser

    ...It may support chaining conditions, enabling complex validation logic without introducing unnecessary verbosity. The design suggests a focus on productivity, reducing cognitive load when writing and reviewing tests or validation scripts. It is likely adaptable across multiple contexts, including unit testing, integration testing, and runtime assertions. By abstracting repetitive validation logic, expect helps developers focus on behavior rather than implementation details. Overall, it serves as a lightweight but powerful tool for improving software reliability and clarity in testing workflows.
    Downloads: 1 This Week
  • 18
    OpenReview

    An open-source, self-hosted AI code review bot powered by Vercel

    ...Built by Vercel Labs, it integrates directly with GitHub workflows, allowing developers to trigger intelligent code reviews by simply mentioning a bot in a pull request. The system operates in a sandboxed environment with access to the repository, enabling it to run linters, tests, and formatting tools as part of its review process. It provides detailed, line-by-line feedback and can suggest or even apply fixes directly to the codebase. OpenReview is designed for extensibility, supporting custom review skills that can be tailored to specific development needs or coding standards. Its architecture leverages Vercel’s infrastructure for scalable and reliable execution, ensuring that reviews can be resumed or retried if interrupted.
    Downloads: 1 This Week
  • 19
    NVIDIA Earth2Studio

    Open-source deep-learning framework

    ...Users can extend Earth2Studio with optional model packs, advanced data interfaces, statistical operators, and backend integrations that support flexible workflows from simple tests to large-scale operational inference.
    Downloads: 3 This Week
  • 20
    AI Marketing Skills

    Open-source AI marketing skills for Claude Code

    ...The system is organized into multiple domains such as growth experimentation, sales pipeline generation, content production, outbound marketing, SEO optimization, and financial analysis, effectively covering the entire revenue lifecycle of a business. Each skill functions as an executable capability that can be invoked on demand, enabling users to perform tasks like running A/B tests, generating high-quality content, or analyzing conversion funnels with minimal manual effort.
    Downloads: 3 This Week
  • 21
    ASSERT

    Requirement-driven evaluation harness for AI agents and LLM

    ASSERT is a requirement-driven evaluation harness for AI agents and LLM applications. It turns natural-language specifications, policies, product requirements, and launch criteria into structured tests that can be reviewed, executed, scored, and improved. The pipeline derives behavior categories, generates single-turn and multi-turn test cases, runs them against a target system, and uses an LLM judge to score conversations against the stated policies. It can evaluate hosted models, custom agents, multi-agent systems, REST clients, and frameworks such as LangGraph, CrewAI, AutoGen, DSPy, LlamaIndex, and OpenAI Agents SDK. ...
    Downloads: 0 This Week
  • 22
    MiroThinker

    MiroThinker is an open source deep research agent

    ...The platform is optimized for research tasks such as financial forecasting, knowledge discovery, and large-scale information synthesis. MiroThinker has been evaluated on several agent benchmarks and has demonstrated strong performance on tests designed to measure deep research capabilities.
    Downloads: 0 This Week
  • 23
    VibeTensor

    Our first fully AI generated deep learning system

    ...What makes VibeTensor remarkable is that every major component, from core libraries and dispatch systems to CUDA runtime support, caching allocators, and language bindings, was created and validated by coding agents using automated builds and tests rather than manual line-by-line human coding. The system includes both a Python frontend via a torch-like API and an experimental Node.js/TypeScript interface.
    Downloads: 0 This Week
  • 24
    Claude Code Subagents Command Collection

    Claude Code Subagents & Commands Collection + CLI Tool

    This repository aggregates a large set of specialized subagents and slash commands designed for Claude Code, giving developers domain-focused “teammates” they can summon on demand. Each subagent is defined by a concise role, tools, and behaviors, and ships as Markdown you can drop into your .claude/agents/ directory. The collection targets common developer workflows such as scaffolding, refactoring, test writing, documentation, security checks, and project management. It includes a CLI...
    Downloads: 0 This Week
  • 25
    promptmap2

    A security scanner for custom LLM applications

    promptmap is an automated security scanner for custom LLM applications that focuses on prompt injection and related attack classes. The project supports both white-box and black-box testing, which means it can either run tests directly against a known model and system prompt configuration or attack an external HTTP endpoint without internal access. Its scanning workflow uses a dual-LLM architecture in which one model acts as the target being tested and another acts as a controller that evaluates whether an attack succeeded. The repository emphasizes broad coverage, including test rules for prompt stealing, jailbreaks, harmful content generation, hate-related outputs, social bias, and distraction attacks. ...
    Downloads: 0 This Week