Showing 324 open source projects for "test"

  • 1
    FireRedASR

    Open-source industrial-grade ASR models

    FireRedASR is an industrial-grade family of open-source automatic speech recognition models designed to provide high-precision speech-to-text performance across languages including Mandarin, English, and various Chinese dialects, achieving new state-of-the-art benchmarks on public test sets. The project includes multiple model variants to meet different application needs, such as high-accuracy end-to-end interaction using an encoder-adapter-LLM framework and efficient real-time recognition using attention-based encoder-decoder architectures, giving developers flexibility in balancing performance and resource constraints. ...
    Downloads: 0 This Week
  • 2
    Sled

    Teleport Claude Code, Codex or Gemini CLI to your phone

    ...Although specific details in the repository are limited without direct project documentation, context and related online mentions indicate it functions as a local interface layer that abstracts development agent workflows and Teleport-style interactions, bringing parts of modern assistant capabilities to phone or web UIs. This project resembles modern agent front ends where developers can test, iterate, and prompt their local models or backends without complex setup. The interface is light and integrates into broader development stacks, and the repository’s activity suggests ongoing maintenance with an MIT license and community engagement.
    Downloads: 0 This Week
  • 3
    Transformer Debugger

    Tool for exploring and debugging transformer model behaviors

    Transformer Debugger (TDB) is a research tool developed by OpenAI’s Superalignment team to investigate and interpret the behaviors of small language models. It combines automated interpretability methods with sparse autoencoders, enabling researchers to analyze how specific neurons, attention heads, and latent features contribute to a model’s outputs. TDB allows users to intervene directly in the forward pass of a model and observe how such interventions change predictions, making it...
    Downloads: 0 This Week
  • 4
    aisuite

    Simple, unified interface to multiple Generative AI providers

    ...Using an interface similar to OpenAI's, aisuite makes it easy to interact with the most popular LLMs and compare the results. It is a thin wrapper around Python client libraries that lets developers swap out and test responses from different LLM providers without changing their code. Today, the library is primarily focused on chat completions, with more use cases planned for the near future. Currently supported providers are OpenAI, Anthropic, Azure, Google, AWS, Groq, Mistral, HuggingFace, and Ollama. To maximize stability, aisuite uses either the HTTP endpoint or the SDK for making calls to the provider.
    Downloads: 0 This Week
  • 5
    ADK Go

    Code-first Go toolkit for building, evaluating, and deploying AI agents

    ...It is part of the Agent Development Kit ecosystem and follows a code-first approach that allows developers to define agent behavior, tools, and orchestration logic directly in Go code. ADK-Go applies traditional software engineering principles to agent development, making it easier to structure, test, and maintain complex agent-based systems. It supports building both simple task-oriented agents and more advanced multi-agent architectures that collaborate to perform workflows. It is designed to be modular and flexible, allowing developers to integrate custom tools, external services, or existing functionality into agent workflows. ...
    Downloads: 0 This Week
  • 6
    TensorFlow Quantum

    Open-source Python framework for hybrid quantum-classical machine learning

    ...TensorFlow Quantum integrates with the Cirq quantum computing framework to define and manipulate quantum circuits, while leveraging TensorFlow’s infrastructure for optimization, automatic differentiation, and large-scale computation. The library also supports high-performance simulation of quantum circuits, enabling researchers to test and evaluate quantum models even without direct access to quantum hardware.
    Downloads: 0 This Week
  • 7
    LongBench

    LongBench v2 and LongBench (ACL '25 & '24)

    ...It supports bilingual evaluation in English and Chinese to assess multilingual capabilities across extended contexts. Newer versions of the benchmark introduce extremely long context windows ranging from thousands to millions of tokens, enabling researchers to test the limits of modern long-context models.
    Downloads: 0 This Week
  • 8
    AICGSecEval

    A.S.E (AICGSecEval) is a repository-level AI-generated code security

    ...AICGSecEval combines static and dynamic evaluation techniques to analyze generated code for vulnerabilities and functional correctness. The framework includes datasets, test cases, and evaluation metrics that measure how AI programming tools perform across multiple programming languages and vulnerability categories.
    Downloads: 0 This Week
  • 9
    LLM Colosseum

    Benchmark LLMs by fighting in Street Fighter 3

    LLM-Colosseum is an experimental benchmarking framework designed to evaluate the capabilities of large language models through gameplay interactions rather than traditional text-based benchmarks. The system places language models inside the environment of the classic video game Street Fighter III, where they must interpret the game state and decide which actions to perform during combat. This setup creates a dynamic environment that tests reasoning, situational awareness, and decision-making...
    Downloads: 0 This Week
  • 10
    Paddler

    Open-source LLM load balancer and serving platform for hosting LLMs

    ...Paddler also includes tools for monitoring, request buffering, and autoscaling integration so that deployments can adapt dynamically to changing workloads. A built-in administrative interface allows developers and operations teams to manage models, observe system performance, and test inference endpoints.
    Downloads: 0 This Week
  • 11
    Agentless

    An agentless approach to automatically solve software development

    Agentless is an open-source framework that applies large language models to automatically resolve software development issues without relying on complex autonomous agent systems. The project proposes an alternative approach to AI-driven code repair that avoids the overhead of multi-agent orchestration by using a structured pipeline for identifying and fixing bugs. When solving a problem, the system first performs localization to determine which files, functions, or code segments are most...
    Downloads: 0 This Week
  • 12
    GitHub Agentic Workflows

    GitHub Agentic Workflows

    GitHub Agentic Workflows is an experimental CLI extension and framework for the gh GitHub CLI that lets developers author automation driven by natural language specifications instead of hand-written code, compiling those descriptions into GitHub Actions workflows that run AI agents (like Copilot, Claude Code, or Codex) on schedule or in response to repository events. By writing intent in markdown files, a developer can quickly generate .yml Actions workflows that perform tasks such as...
    Downloads: 0 This Week
  • 13
    AIPex

    AI browser automation assistant, no migration and privacy first

    AIPex is an AI-augmented development toolkit and workflow platform that aims to accelerate software productivity by integrating intelligent assistants, code generation tools, and customizable automation patterns directly into developer workflows. Rather than treating AI as a separate helper, AIPex embeds AI capabilities into common tasks like scaffolding components, generating tests, analyzing code quality, and performing refactors, allowing developers to stay in flow while benefiting from...
    Downloads: 0 This Week
  • 14
    HunyuanWorld-Mirror

    Fast and Universal 3D reconstruction model for versatile tasks

    ...The project sits within a broader family of Hunyuan models that explore world generation and 3D-consistent understanding, and this mirror variant makes the reconstruction stack easier to test. It’s attractive for rapid prototyping of scenes, environment scans, or reference assets when you need repeatable 3D results from ordinary media.
    Downloads: 0 This Week
  • 15
    Claude Code Subagents Command Collection

    Claude Code Subagents & Commands Collection + CLI Tool

    ...Each subagent is defined by a concise role, tools, and behaviors, and ships as Markdown you can drop into your .claude/agents/ directory. The collection targets common developer workflows such as scaffolding, refactoring, test writing, documentation, security checks, and project management. It includes a CLI helper and documentation site that streamline installation, customization, and authoring of your own agents. The project’s framing mirrors modern software teams—delegate tasks to experts that can run in parallel under Claude’s subagent support. Frequent updates and community contributions keep the catalog current with new roles and best practices.
    Downloads: 0 This Week
  • 16
    MetaCLIP

    ICLR2024 Spotlight: curation/training code, metadata, distribution

    MetaCLIP is a research codebase that opens up the data side of CLIP: rather than changing the model, it curates image-text training data by matching pairs against metadata and balancing the resulting distribution. The repository provides the curation and training code, the metadata, and the training data distribution...
    Downloads: 0 This Week
  • 17
    AWS CodeDeploy Agent

    Host Agent for AWS CodeDeploy

    ...AWS CodeDeploy fully automates your software deployments, allowing you to deploy reliably and rapidly. You can consistently deploy your application across your development, test, and production environments whether deploying to Amazon EC2, AWS Fargate, AWS Lambda, or your on-premises servers. The service scales with your infrastructure.
    Downloads: 0 This Week
  • 18
    Magnitude

    Vision AI browser agent for automation, testing, and extraction

    Browser Agent by Magnitude is an open source, vision-first browser automation framework that enables users to control web interfaces using natural language instructions. It leverages visually grounded AI models to interpret and interact with web pages based on what is seen on the screen rather than relying solely on the DOM structure. This approach allows the agent to generalize better across complex and modern websites, making it more robust than traditional selector-based automation tools....
    Downloads: 0 This Week
  • 19
    promptmap2

    A security scanner for custom LLM applications

    ...Its scanning workflow uses a dual-LLM architecture in which one model acts as the target being tested and another acts as a controller that evaluates whether an attack succeeded. The repository emphasizes broad coverage, including test rules for prompt stealing, jailbreaks, harmful content generation, hate-related outputs, social bias, and distraction attacks. It also supports multiple providers such as OpenAI, Anthropic, Google, xAI, and open-source models through Ollama, making it flexible for both commercial and local deployments.
    Downloads: 0 This Week
  • 20
    Microsoft Agent Skills

    Skills, MCP servers, Custom Agents, Agents.md for SDKs

    Microsoft Agent Skills is an actively maintained repository of skills, custom agents, templates, and MCP configuration files designed to extend AI coding assistants with deep knowledge about Azure SDKs and Microsoft AI Foundry services. The project bundles over a hundred domain-specific skills that teach AI agents how to perform tasks like Azure resource provisioning, SDK usage patterns, infrastructure setup, and common DevOps workflows, bridging the gap between agent reasoning and...
    Downloads: 0 This Week
  • 21
    Aden Hive

    Outcome driven agent development framework that evolves

    Hive is an open-source agent development framework that helps developers build autonomous, reliable, self-improving AI agents by letting them describe goals in ordinary natural language instead of hand-coding detailed workflows. Rather than manually defining execution graphs, Hive’s coding agent generates the agent graph, connection code, and test cases based on your high-level objectives, enabling outcome-driven agent creation that fits real business processes. Once deployed, agents can capture failure data, evolve automatically to meet their success criteria, and redeploy without constant manual intervention, delivering continual improvement over time. The framework also includes human-in-the-loop nodes, credential management, cost and budget controls, and real-time observability so teams can monitor execution and intervene as needed. ...
    Downloads: 0 This Week
  • 22
    Cake

    Distributed LLM and StableDiffusion inference

    ...Unlike many simple proxies, Cake can act as a full connection broker: it can bind to arbitrary interfaces, handle simultaneous upstream/downstream sessions, and apply traffic rules on the fly. This makes it suitable for troubleshooting tricky network behavior, simulating network conditions, or chaining services in a modular test environment.
    Downloads: 0 This Week
  • 23
    OpenAGI

    When LLM Meets Domain Experts

    OpenAGI is a package for AI agent creation designed to connect large language models with domain-specific tools and workflows in the AIOS (AI Operating System) ecosystem. It provides a structured Python framework, pyopenagi, for defining agents as modular units that encapsulate execution logic, configuration, and dependency metadata. Agents are organized in a well-defined folder structure that includes code (agent.py), configuration (config.json), and extra requirements...
    Downloads: 0 This Week
  • 24
    Agentex

    Open source codebase for Scale Agentex

    AgentEX is an open framework from Scale for building, running, and evaluating agentic workflows, with an emphasis on reproducibility and measurable outcomes rather than ad-hoc demos. It treats an “agent” as a composition of a policy (the LLM), tools, memory, and an execution runtime so you can test the whole loop, not just prompting. The repo focuses on structured experiments: standardized tasks, canonical tool interfaces, and logs that make it possible to compare models, prompts, and tool sets fairly. It also includes evaluation harnesses that capture success criteria and partial credit, plus traces you can inspect to understand where reasoning or tool use failed. ...
    Downloads: 0 This Week
  • 25
    Inspect Petri

    An alignment auditing agent capable of exploring alignment hypotheses

    Inspect Petri is an open-source alignment auditing agent that lets researchers rapidly test concrete safety hypotheses against target models using realistic, multi-turn scenarios. Instead of building bespoke evals, Inspect Petri automatically generates audit environments from seed “special instructions,” orchestrates an auditor model to probe a target model, and simulates tool use and rollbacks to surface risky behaviors.
    Downloads: 0 This Week