Showing 45 open source projects for "production test"

  • 1
    Ragas

    Supercharge Your LLM Application Evaluations

    Objective metrics, intelligent test generation, and data-driven insights for LLM apps. Ragas is your ultimate toolkit for evaluating and optimizing Large Language Model (LLM) applications. Say goodbye to time-consuming, subjective assessments and hello to data-driven, efficient evaluation workflows. Don't have a test dataset ready? We also do production-aligned test set generation.
    Downloads: 0 This Week
  • 2
    Opik

    Debug, evaluate, and monitor your LLM apps, RAG systems, and agentic AI

    Confidently evaluate, test, and monitor LLM applications. Opik is an open-source platform for evaluating, testing, and monitoring LLM applications. Built by Comet. Record, sort, search, and understand each step your LLM app takes to generate a response. Manually annotate, view, and compare LLM responses in a user-friendly table. Log traces during development and in production.
    Downloads: 2 This Week
  • 3
    Evidently

    Evaluate and monitor ML models from validation to production

    Evidently is an open-source Python library for data scientists and ML engineers. It helps evaluate, test, and monitor ML models from validation to production. It works with tabular data, text data, and embeddings.
    Downloads: 1 This Week
  • 4
    PySpur

    Visual tool for building, testing, and deploying AI agent workflows

    PySpur is a visual development environment designed to help AI engineers build, test, and iterate on agent-based workflows more efficiently. It provides a structured playground where users can define test cases, construct agents either through Python code or a graphical interface, and continuously refine their behavior. It addresses common challenges in AI agent development such as prompt tuning difficulties and lack of visibility into workflow execution.
    Downloads: 0 This Week
  • 5
    TONL

    TONL (Token-Optimized Notation Language)

    ...The platform comes with a complete command-line interface that supports interactive dashboards and cross-platform usage in browsers and server environments, and its high test coverage gives developers confidence in stability.
    Downloads: 0 This Week
  • 6
    Jovo Framework

    The React for Voice and Chat, build apps for Alexa, Google Assistant

    The multimodal experience platform enables professional teams to build and run apps that work across smart speakers, the web, mobile, and more. Fully customizable and open source. The Jovo product ecosystem allows you to build, test, and run powerful experiences for voice, chat, and web platforms. From local development to production, Jovo allows you to build robust experiences, faster. Build across devices and platforms and use all supported modalities thanks to the Jovo output template engine. Our component and plugin architecture makes it possible to make Jovo work for your specific use case, across projects. ...
    Downloads: 0 This Week
  • 7
    AgentOps

    Python SDK for agent monitoring, LLM cost tracking, benchmarking, etc.

    Industry-leading developer platform to test and debug AI agents. We built the tools so you don't have to. Visually track events such as LLM calls, tools, and multi-agent interactions. Rewind and replay agent runs with point-in-time precision. Keep a full data trail of logs, errors, and prompt injection attacks from prototype to production. Native integrations with the top agent frameworks.
    Downloads: 0 This Week
  • 8
    Coze Loop

    Next-generation AI Agent Optimization Platform

    Coze Loop is a developer-oriented platform that provides full lifecycle management for AI agents, covering everything from prompt engineering to production monitoring. The project aims to simplify the increasingly complex workflow of building reliable AI agents by offering integrated tools for debugging, evaluation, observability, and optimization. Through its visual playground, developers can test prompts interactively and compare outputs across different language models. The platform also includes automated evaluation capabilities that assess agent performance across multiple quality dimensions such as accuracy and compliance. ...
    Downloads: 0 This Week
  • 9
    Sandbox Agent

    Run Coding Agents in Sandboxes

    ...Developers can use Sandbox Agent to simulate real-world workflows, debug agent decisions, and evaluate outcomes in a contained setting before deploying to production. It also supports extensibility, allowing integration with custom tools, APIs, and workflows tailored to specific use cases.
    Downloads: 2 This Week
  • 10
    Helicone

    Open source LLM-Observability Platform for Developers

    Open source LLM-Observability Platform for Developers. One-line integration for monitoring, metrics, evals, agent tracing, prompt management, playground, etc. Supports OpenAI SDK, Vercel AI SDK, Anthropic SDK, LiteLLM, LlamaIndex, LangChain, and more.
    Downloads: 0 This Week
  • 11
    Pezzo

    Open-source, developer-first LLMOps platform

    Pezzo enables you to build, test, monitor and instantly ship AI all in one platform, while constantly optimizing for cost and performance. Packed with powerful features to streamline your workflow, so you can focus on what matters. Pezzo is a fully cloud-native and open-source LLMOps platform. Seamlessly observe and monitor your AI operations, troubleshoot issues, save up to 90% on costs and latency, collaborate and manage your prompts in one place, and instantly deliver AI changes.
    Downloads: 0 This Week
  • 12
    Agent Chat UI

    Web app for interacting with any LangGraph agent (PY & TS) via a chat

    Agent Chat UI is an open-source web application that provides a graphical interface for interacting with AI agents built using LangGraph and related frameworks. The project is implemented as a modern Next.js application and allows users to chat with agent workflows running on remote or local LangGraph servers. Through a simple configuration process, developers can connect the interface to a deployed agent by specifying the server URL, assistant identifier, and authentication credentials....
    Downloads: 1 This Week
  • 13
    Harbor LLM

    Run a full local LLM stack with one command using Docker

    ...Built on Docker, Harbor allows services to run in isolated containers while communicating over a local network. It is intended for local development and experimentation rather than production deployment, giving developers a flexible way to explore AI systems, test configurations, and manage complex LLM stacks without manual wiring or setup overhead.
    Downloads: 1 This Week
  • 14
    Kiln

    Open source platform for managing, testing, and deploying AI apps

    Kiln is an open source platform designed to help developers build, evaluate, and deploy AI-powered applications with greater structure and reliability. It provides a unified environment for managing prompts, datasets, and evaluation workflows, allowing teams to iterate on AI behavior in a controlled and measurable way. Kiln emphasizes reproducibility, enabling users to track changes to prompts and models while comparing outputs across different configurations. Kiln also supports systematic...
    Downloads: 0 This Week
  • 15
    Deta Surf

    Personal AI Notebooks. Organize files & webpages and generate notes

    ...The platform is particularly useful for developers who want to experiment with AI models locally while maintaining the option to deploy them in production environments later. Its architecture is designed to minimize setup complexity while still supporting scalable application structures.
    Downloads: 0 This Week
  • 16
    Recommenders

    Best practices on recommendation systems

    The Recommenders repository provides examples and best practices for building recommendation systems, provided as Jupyter notebooks. The module reco_utils contains functions to simplify common tasks used when developing and evaluating recommender systems. Several utilities are provided in reco_utils to support common tasks such as loading datasets in the format expected by different algorithms, evaluating model outputs, and splitting training/test data. Implementations of several...
    Downloads: 1 This Week
  • 17
    AWS CodeDeploy Agent

    Host Agent for AWS CodeDeploy

    ...AWS CodeDeploy fully automates your software deployments, allowing you to deploy reliably and rapidly. You can consistently deploy your application across your development, test, and production environments whether deploying to Amazon EC2, AWS Fargate, AWS Lambda, or your on-premises servers. The service scales with your infrastructure.
    Downloads: 0 This Week
  • 18
    Aden Hive

    Outcome driven agent development framework that evolves

    ...Hive is designed for production environments and supports a wide range of large language models, local models, and business system connectivity.
    Downloads: 0 This Week
  • 19
    Agentex

    Open source codebase for Scale Agentex

    ...The design encourages clean separation between experiment configuration and code, which makes sharing results or re-running baselines straightforward. Teams use it to progress from prototypes to production-ready agent behaviors by iterating on prompts, adding tools, and validating improvements with consistent metrics.
    Downloads: 0 This Week
  • 20
    PyTorch Geometric Temporal

    Spatiotemporal Signal Processing with Neural Machine Learning Models

    The library consists of various dynamic and temporal geometric deep learning, embedding, and spatio-temporal regression methods from a variety of published research papers. Moreover, it comes with an easy-to-use dataset loader, train-test splitter, and temporal snapshot iterator for dynamic and temporal graphs. The framework naturally provides GPU support. It also comes with a number of benchmark datasets from the epidemiological forecasting, sharing economy, energy production, and web traffic management domains. Finally, you can also create your own datasets. The package interfaces well with PyTorch Lightning, which allows training on CPUs and on single or multiple GPUs out of the box. ...
    Downloads: 0 This Week
  • 21
    Mini Agent

    A minimal yet professional single agent demo project

    Mini-Agent is a minimal yet production-minded demo project that shows how to build a serious command-line AI agent around the MiniMax-M2 model. It is designed both as a reference implementation and as a usable agent, demonstrating a full execution loop that includes planning, tool calls, and iterative refinement. The project exposes an Anthropic-compatible API interface and fully supports interleaved thinking, letting the agent alternate between reasoning steps and tool invocations during...
    Downloads: 0 This Week
  • 22
    Arthur Bench

    Bench is a tool for evaluating LLMs for production use cases

    Bench is a tool for evaluating LLMs for production use cases. Whether you are comparing different LLMs, considering different prompts, or testing generation hyperparameters like temperature and number of tokens, Bench provides one touch point for all your LLM performance evaluation.
    Downloads: 0 This Week
  • 23
    AI File Sorter

    Local AI file organization with categorization and rename suggestions

    AI File Sorter is a cross-platform desktop application that uses AI (local LLMs run on your computer) to organize files and suggest meaningful file names based on real content, not just filenames or extensions. The app can analyze images locally and propose descriptive rename suggestions (for example, IMG_2048.jpg → clouds_over_lake.jpg). It can also analyze document text to improve categorization and renaming. Supported formats include PDF, DOCX, XLSX, PPTX, ODT, ODS, ODP, and common...
    Downloads: 236 This Week
  • 24
    Email to Event - ETE

    A Python app/script that automatically adds important events to your calendar.

    It uses AI running locally, with a model you can choose. The script includes a tool for automatic addition to the scheduler. It does not currently work with Microsoft Outlook or Google Gmail, for certification and API policy reasons. Fully tested on Seznam.cz* services; if you have a different provider with the same type of security, it should work. *Email uses standard IMAP; the calendar uses the iCalendar API and authentication method. Fast setup: 1. Download and unpack 2. Install LM Studio - recommended for GPU...
    Downloads: 0 This Week
  • 25
    YiVal

    Your Automatic Prompt Engineering Assistant for GenAI Applications

    YiVal is an open-source framework designed to automate prompt engineering and evaluation workflows for generative AI applications, enabling developers to systematically improve the performance of large language models. It focuses on experimentation and optimization by allowing users to test multiple prompt variations, configurations, and model parameters in parallel, then evaluate their outputs using structured metrics and scoring systems. The platform is particularly useful in production environments where prompt quality directly impacts user experience, as it provides a repeatable and data-driven approach to refining prompts rather than relying on manual trial and error. ...
    Downloads: 0 This Week