Showing 324 open source projects for "test"

  • 1
    Codeflash

    Optimize your code automatically with AI

    Codeflash is a general-purpose optimizer for Python that uses advanced large language models (LLMs) to automatically generate, test, and benchmark multiple optimization ideas, then creates merge-ready pull requests with the best improvements for your code. Optimize an entire existing codebase by running codeflash --all. Automate optimizing all future code you will write by installing Codeflash as a GitHub action. Optimize a Python workflow python myscript.py end-to-end by running codeflash optimize myscript.py. ...
    Downloads: 1 This Week
  • 2
    MiniMax-M2

    MiniMax-M2, a model built for Max coding & agentic workflows

    ...It uses a Mixture-of-Experts (MoE) architecture with 230 billion total parameters but only 10 billion activated per token, giving it the behavior of a very large model at a fraction of the runtime cost. The model is tuned for end-to-end developer flows such as multi-file edits, compile–run–fix loops, and test-validated repairs across real repositories and diverse programming languages. It is also optimized for multi-step agent tasks, planning and executing long toolchains that span shell commands, browsers, retrieval systems, and code runners. Benchmarks show that it achieves highly competitive scores on a wide range of intelligence and agent benchmarks, including SWE-Bench variants, Terminal-Bench, BrowseComp, GAIA, and several long-context reasoning suites.
    Downloads: 0 This Week
  • 3
    PyTorch3D

    PyTorch3D is FAIR's library of reusable components for deep learning

    ...Its modular design allows easy extension—components like differentiable rasterizers, mesh blending, or signed distance field (SDF) modules can be swapped or combined to test new architectures quickly.
    Downloads: 4 This Week
  • 4
    OctoMind MCP

    An MCP server for octomind tools, resources and prompts

    The Octomind MCP Server is designed to integrate Octomind's end-to-end testing tools and resources into local development environments. It enables AI-powered interfaces to create, execute, and manage e2e tests, enhancing the testing workflow.
    Downloads: 0 This Week
  • 5
    MiniMax-M1

    Open-weight, large-scale hybrid-attention reasoning model

    MiniMax-M1 is presented as the world’s first open-weight, large-scale hybrid-attention reasoning model, designed to push the frontier of long-context, tool-using, and deeply “thinking” language models. It is built on the MiniMax-Text-01 foundation and keeps the same massive parameter budget, but reworks the attention and training setup for better reasoning and test-time compute scaling. Architecturally, it combines Mixture-of-Experts layers with lightning attention, enabling the model to support a native context length of 1 million tokens while using far fewer FLOPs than comparable reasoning models for very long generations. The team emphasizes efficient scaling of test-time compute: at 100K-token generation lengths, M1 reportedly uses only about 25 percent of the FLOPs of some competing models, making extended “think step” traces more feasible. ...
    Downloads: 0 This Week
  • 6
    FastKoko

    Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model

    ...The server also offers per-word timestamped captions, which makes it useful for creating subtitles or aligning audio with text. A built-in web UI, API documentation, and debug endpoints for monitoring system status help users explore voices, test requests, and integrate the service into larger systems.
    Downloads: 4 This Week
  • 7
    Ninjabot

    A fast cryptocurrency platform for trading bot in Go

    A fast cryptocurrency trading bot framework implemented in Go. Ninjabot lets users create and test custom strategies for spot markets. It is an open-source platform that provides tools to implement custom strategies and backtests for trading cryptocurrencies in Go. The Ninjabot CLI provides utility commands to support backtesting and bot development. Currently, we only support the Binance exchange. If you want to include support for other exchanges, you need to implement a new struct that implements the Exchange interface. ...
    Downloads: 3 This Week
  • 8
    CodeBurn

    See where your AI coding tokens go

    CodeBurn is a security-focused tool designed to evaluate and stress-test codebases using adversarial techniques, often leveraging AI to identify vulnerabilities and weaknesses. It simulates attack scenarios against code to uncover potential security risks, helping developers proactively identify issues before they reach production. The system is designed to integrate into development workflows, allowing continuous testing as code evolves.
    Downloads: 1 This Week
  • 9
    LabClaw

    Operating Layer for LabOS (Stanford-Princeton AI Co-Scientists)

    LabClaw is an open-source AI experimentation and agent orchestration platform designed to help developers build, test, and iterate on complex autonomous workflows in a controlled and modular environment. It provides a framework for composing multiple tools, prompts, and execution steps into structured pipelines that can be reused and evaluated across different scenarios. The system emphasizes experimentation, allowing users to run multiple variations of agent workflows, compare outputs, and refine performance over time. ...
    Downloads: 1 This Week
  • 10
    DeepSeekMath-V2

    Towards self-verifiable mathematical reasoning

    DeepSeekMath-V2 is a large-scale open-source AI model designed specifically for advanced mathematical reasoning, theorem proving, and rigorous proof verification. It’s built by DeepSeek as a successor to their earlier math-specialist models. Unlike general-purpose LLMs that might generate plausible-looking math but sometimes hallucinate or mishandle rigorous logic, Math-V2 is engineered to not only generate solutions but also self-verify them, meaning it examines the derivations, checks...
    Downloads: 5 This Week
  • 11
    RL Baselines3 Zoo

    Training framework for Stable Baselines3 reinforcement learning agents

    rl-baselines3-zoo is a collection of pre-trained models, benchmarks, and hyperparameter tuning tools built on top of Stable Baselines3, a reinforcement learning library. It provides an easy way to test, evaluate, and train RL agents across a wide variety of environments.
    Downloads: 0 This Week
  • 12
    LiteMultiAgent

    The Library for LLM-based multi-agent applications

    LiteMultiAgent is a lightweight and extensible multi-agent reinforcement learning (MARL) platform designed for rapid experimentation. It allows researchers to design and test coordination, competition, and collaboration scenarios in simulated environments.
    Downloads: 0 This Week
  • 13
    AgentOps

    Python SDK for agent monitoring, LLM cost tracking, benchmarking, etc.

    Industry-leading developer platform to test and debug AI agents. We built the tools so you don't have to. Visually track events such as LLM calls, tools, and multi-agent interactions. Rewind and replay agent runs with point-in-time precision. Keep a full data trail of logs, errors, and prompt injection attacks from prototype to production. Native integrations with the top agent frameworks.
    Downloads: 0 This Week
  • 14
    Harbor LLM

    Run a full local LLM stack with one command using Docker

    ...Built on Docker, Harbor allows services to run in isolated containers while communicating over a local network. It is intended for local development and experimentation rather than production deployment, giving developers a flexible way to explore AI systems, test configurations, and manage complex LLM stacks without manual wiring or setup overhead.
    Downloads: 3 This Week
  • 15
    Skills For Real Engineers

    Skills for Real Engineers. Straight from my .claude directory

    Skills For Real Engineers is a curated collection of modular AI “skills” designed to improve how developers interact with coding agents by enforcing structured engineering workflows. Each skill is a small, focused instruction set that guides an AI through tasks such as planning, refactoring, testing, or architectural analysis. Instead of relying on vague prompts, the system encodes repeatable processes that ensure consistent and higher-quality outputs. The repository includes tools for...
    Downloads: 2 This Week
  • 16
    MTEB

    MTEB: Massive Text Embedding Benchmark

    Text embeddings are commonly evaluated on a small set of datasets from a single task not covering their possible applications to other tasks. It is unclear whether state-of-the-art embeddings on semantic textual similarity (STS) can be equally well applied to other tasks like clustering or reranking. This makes progress in the field difficult to track, as various models are constantly being proposed without proper evaluation. To solve this problem, we introduce the Massive Text Embedding...
    Downloads: 2 This Week
  • 17
    FlowLens MCP

    Open-source MCP server that gives your coding agent

    FlowLens MCP Server is an open-source tool designed to give AI-powered coding agents (like Claude Code, Cursor, GitHub Copilot / Codex, and others) full, replayable browser context to dramatically improve debugging, bug reporting, and regression testing for web applications. It works together with a companion browser extension: when a user reproduces a bug or a complicated UI interaction, the extension captures a rich session log, including screen/video recording, network traffic, console...
    Downloads: 0 This Week
  • 18
    Mistral Inference

    Official inference library for Mistral models

    Open and portable generative AI for devs and businesses. We release open-weight models for everyone to customize and deploy where they want. Our super-efficient model Mistral Nemo is available under Apache 2.0, while Mistral Large 2 is available through both a free non-commercial license and a commercial license.
    Downloads: 0 This Week
  • 19
    imodelsX

    Interpretable prompting and models for NLP

    ...Fit better decision trees using an LLM to expand features. Finetune a single linear layer on top of LLM embeddings. Use these just like a scikit-learn model. During training, they fit better features via LLMs, but at test time, they are extremely fast and completely transparent.
    Downloads: 0 This Week
  • 20
    CoStrict

    Strict AI coder for enterprises, quality first

    ...Unlike typical AI coding tools that prioritize speed over rigor, CoStrict introduces a “strict mode” methodology that enforces disciplined processes such as requirements analysis, architecture planning, task decomposition, and test generation before producing code. This makes it particularly suitable for organizations that require consistency, auditability, and reliability in AI-assisted development. The system integrates repository-wide analysis using retrieval-augmented generation, allowing it to understand large codebases and provide context-aware suggestions, reviews, and modifications. ...
    Downloads: 1 This Week
  • 21
    Kiln

    Open source platform for managing, testing, and deploying AI apps

    Kiln is an open source platform designed to help developers build, evaluate, and deploy AI-powered applications with greater structure and reliability. It provides a unified environment for managing prompts, datasets, and evaluation workflows, allowing teams to iterate on AI behavior in a controlled and measurable way. Kiln emphasizes reproducibility, enabling users to track changes to prompts and models while comparing outputs across different configurations. Kiln also supports systematic...
    Downloads: 1 This Week
  • 22
    Deta Surf

    Personal AI Notebooks. Organize files & webpages and generate notes

    Surf is an open-source AI-driven development tool designed to simplify the process of building and experimenting with artificial intelligence applications. The platform provides a streamlined development environment where developers can test models, run experiments, and deploy small AI services with minimal infrastructure overhead. It focuses on simplicity and speed, allowing developers to prototype ideas quickly without managing complex cloud configurations. Surf integrates modern AI workflows such as prompt-based applications, lightweight APIs, and automated deployment pipelines. ...
    Downloads: 1 This Week
  • 23
    Janus

    Unified Multimodal Understanding and Generation Models

    Janus is a sophisticated open-source project from DeepSeek AI that aims to unify both visual understanding and image generation in a single model architecture. Rather than having separate systems for “look and describe” and “prompt and generate”, Janus uses an autoregressive transformer framework with a decoupled visual encoder—allowing it to ingest images for comprehension and to produce images from text prompts with shared internal representations. The design tackles long-standing...
    Downloads: 2 This Week
  • 24
    The Minimalist Entrepreneur

    Claude Code skills based on The Minimalist Entrepreneur

    ...The repository reflects a broader shift toward treating AI behavior as programmable and modular rather than monolithic. It also supports experimentation, enabling users to test how different skill combinations affect performance and output quality.
    Downloads: 0 This Week
  • 25
    Sandbox Agent

    Run Coding Agents in Sandboxes

    Sandbox Agent by Rivet is an experimental framework for running AI agents in controlled, isolated environments where they can safely execute code, interact with tools, and perform autonomous tasks without risking system integrity. It is designed to provide a secure sandbox that allows agents to test actions, manipulate files, and run commands while enforcing strict boundaries and monitoring capabilities. The project focuses on enabling more reliable and auditable agent behavior by separating execution from the host environment, which is especially important for applications involving automation, code generation, or system-level operations. ...
    Downloads: 1 This Week