Showing 52 open source projects for "test case generation"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 1
    Ragas

    Ragas

    Supercharge Your LLM Application Evaluations

    Objective metrics, intelligent test generation, and data-driven insights for LLM apps. Ragas is your ultimate toolkit for evaluating and optimizing Large Language Model (LLM) applications. Say goodbye to time-consuming, subjective assessments and hello to data-driven, efficient evaluation workflows. Don't have a test dataset ready? We also do production-aligned test set generation.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 2
    Pixelle-Video

    Pixelle-Video

    AI Fully Automated Short Video Engine

    ...It focuses on enabling automated video creation workflows where visual content can be synthesized, edited, or enhanced through AI models. The project integrates different components of video processing, such as frame generation, transformation, and sequencing, into a unified pipeline. It is built to support experimentation with generative video models, making it useful for research and creative applications. The system emphasizes modularity, allowing developers to plug in different models or processing steps depending on the use case. It can be used for tasks such as content generation, video editing, or visual storytelling. ...
    Downloads: 79 This Week
    Last Update:
    See Project
  • 3
    Guidance

    Guidance

    A guidance language for controlling large language models

    Guidance is an efficient programming paradigm for steering language models. With Guidance, you can control how output is structured and get high-quality output for your use case—while reducing latency and cost vs. conventional prompting or fine-tuning. It allows users to constrain generation (e.g. with regex and CFGs) as well as to interleave control (conditionals, loops, tool use) and generation seamlessly.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    amrlib

    amrlib

    A python library that makes AMR parsing, generation and visualization

    A python library that makes AMR parsing, generation and visualization simple. amrlib is a python module designed to make processing for Abstract Meaning Representation (AMR) simple by providing the following functions. Sentence to Graph (StoG) parsing to create AMR graphs from English sentences. Graph to Sentence (GtoS) generation for turning AMR graphs into English sentences. A QT-based GUI to facilitate the conversion of sentences to graphs and back to sentences.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • 5
    ComfyUI-WanVideoWrapper

    ComfyUI-WanVideoWrapper

    ComfyUI wrapper nodes for WanVideo and related models

    The ComfyUI-WanVideoWrapper project is a custom node extension for ComfyUI that enables advanced video generation workflows using WanVideo diffusion models. It acts as a standalone wrapper layer that allows developers and creators to integrate experimental features and models without modifying the core ComfyUI codebase. This design makes it easier to rapidly test new capabilities such as text-to-video and image-to-video generation while avoiding compatibility issues with the main framework. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 6
    MiniCPM4.1

    MiniCPM4.1

    Achieving 3+ generation speedup on reasoning tasks

    MiniCPM4.1 is an enhanced iteration of the MiniCPM4 architecture, introducing improvements in reasoning capabilities, inference speed, and hybrid operation modes that allow dynamic switching between deep reasoning and standard generation. It builds upon the same efficiency-focused philosophy but further optimizes decoding performance, achieving substantial speed gains in reasoning-intensive tasks while maintaining high-quality outputs. One of its key innovations is the hybrid reasoning mode, which allows developers to control whether the model engages in deeper reasoning processes or faster responses depending on the use case. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 7
    CodiumAI Cover-Agent

    CodiumAI Cover-Agent

    CodiumAI Cover-Agent: An AI-Powered Tool for Automated Test Generation

    CodiumAI Cover Agent aims to help efficiently increasing code coverage, by automatically generating qualified tests to enhance existing test suites.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    UltraRAG

    UltraRAG

    Less Code, Lower Barrier, Faster Deployment

    UltraRAG 2.0 is a low-code, MCP-enabled RAG framework that aims to lower the barrier to building complex retrieval pipelines for research and production. It provides end-to-end recipes—from encoding and indexing corpora to deploying retrievers and LLMs—so users can reproduce baselines and iterate rapidly. The toolkit comes with built-in support for popular RAG datasets, large corpora, and canonical baselines, plus documentation that walks from “quick start” to debugging and case analysis. It...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 9
    Giskard

    Giskard

    Collaborative & Open-Source Quality Assurance for all AI models

    ...The Giskard scan automatically detects vulnerability issues such as performance bias, data leakage, unrobustness, spurious correlation, overconfidence, underconfidence, unethical issue, etc. Giskard automatically generates relevant tests based on the vulnerabilities detected by the scan. You can easily customize the tests depending on your use case by defining domain-specific data slicers and transformers as fixtures of your test suites.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Secure File Transfer for Windows with Cerberus by Redwood Icon
    Secure File Transfer for Windows with Cerberus by Redwood

    Protect and share files over FTP/S, SFTP, HTTPS and SCP with the #1 rated Windows file transfer server.

    Cerberus supports unlimited users and connections on a single IP, with built-in encryption, 2FA, and a browser-based web client — all deployable in under 15 minutes with a 25-day free trial.
    Try for Free
  • 10
    NeMo Retriever Library

    NeMo Retriever Library

    Document content and metadata extraction microservice

    ...The system is built on NVIDIA NIM microservices, enabling high-performance parallel processing and efficient handling of large datasets. It supports multiple extraction strategies for different document formats, balancing accuracy and throughput depending on the use case. Additionally, it can generate embeddings for extracted content and integrate with vector databases like Milvus, making it well-suited for retrieval-augmented generation pipelines.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 11
    PaddleX

    PaddleX

    PaddlePaddle End-to-End Development Toolkit

    PaddleX is a deep learning full-process development tool based on the core framework, development kit, and tool components of Paddle. It has three characteristics opening up the whole process, integrating industrial practice, and being easy to use and integrate. Image classification and labeling is the most basic and simplest labeling task. Users only need to put pictures belonging to the same category in the same folder. When the model is trained, we need to divide the training set, the...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 12
    Janus

    Janus

    Unified Multimodal Understanding and Generation Models

    Janus is a sophisticated open-source project from DeepSeek AI that aims to unify both visual understanding and image generation in a single model architecture. Rather than having separate systems for “look and describe” and “prompt and generate”, Janus uses an autoregressive transformer framework with a decoupled visual encoder—allowing it to ingest images for comprehension and to produce images from text prompts with shared internal representations. The design tackles long-standing...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Qodo Cover

    Qodo Cover

    AI tool that generates tests to improve code coverage quickly

    ...It operates as a command-line interface and can also be integrated into continuous integration workflows, making it adaptable to different development environments. It analyzes an existing codebase, identifies gaps in test coverage, and generates new tests that target uncovered or weakly tested areas. It follows an iterative workflow where generated tests are executed, validated, and refined to ensure they contribute meaningful coverage improvements. Internally, Qodo Cover uses a modular architecture that includes components for prompt generation, AI interaction, coverage analysis, and test validation. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Cactus Needle

    Cactus Needle

    26m function call model that runs on incredibly small devices

    ...It is based on a Simple Attention Network architecture and was distilled from a much larger model to focus on fast, compact tool-use behavior. The project provides open weights, training details, dataset generation resources, and a playground for testing the model with custom tools. Needle is optimized for single-shot function calling rather than broad conversational ability, so its core use case is selecting the right tool and producing structured arguments. It can be fine-tuned locally, including on consumer machines, which makes it useful for experimentation with small personalized agents. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    AIBuildAI

    AIBuildAI

    An AI agent that automatically builds AI models

    ...The project explores recursive AI development, where models are used not only as tools but as builders capable of constructing other AI systems, workflows, or components. It provides a structured environment for orchestrating agents that can plan, execute, and refine tasks such as code generation, system design, and iterative improvement loops. The framework is designed to support experimentation with self-improving AI pipelines, allowing developers to test concepts like automated architecture search or adaptive system evolution. It integrates multiple components including prompt management, execution control, and feedback loops to ensure that generated outputs can be evaluated and improved over time.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 16
    Context Engineering Template

    Context Engineering Template

    Context engineering is the new vibe coding

    Context Engineering Template is a comprehensive template and workflow repository designed to teach and implement context engineering, a structured approach to preparing and organizing the information necessary for AI coding assistants to complete complex tasks reliably. Instead of relying solely on short prompts, this project encourages developers to create rich, structured context files that include project rules, examples, and validation criteria so that AI systems can act more like...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Harbor LLM

    Harbor LLM

    Run a full local LLM stack with one command using Docker

    ...Harbor supports multiple inference engines, including llama.cpp and vLLM, and connects them seamlessly to user interfaces. It also includes tools for web retrieval, image generation, voice interaction, and workflow automation. Built on Docker, Harbor allows services to run in isolated containers while communicating over a local network. It is intended for local development and experimentation rather than production deployment, giving developers a flexible way to explore AI systems, test configurations, and manage complex LLM stacks without manual wiring or setup overhead.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 18
    PyTorch3D

    PyTorch3D

    PyTorch3D is FAIR's library of reusable components for deep learning

    ...The library provides fast GPU-accelerated implementations of rendering pipelines, transformations, rasterization, and lighting—making it possible to compute gradients through full 3D rendering processes. Researchers use it for tasks like shape generation, reconstruction, view synthesis, and visual reasoning. PyTorch3D also includes utilities for loading, transforming, and sampling 3D assets, so models can be trained end-to-end from 2D supervision or partial data. Its modular design allows easy extension—components like differentiable rasterizers, mesh blending, or signed distance field (SDF) modules can be swapped or combined to test new architectures quickly.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    FuzzyAI Fuzzer

    FuzzyAI Fuzzer

    A powerful tool for automated LLM fuzzing

    FuzzyAI is an open-source fuzzing framework designed to test the security and reliability of large language model applications. The tool automates the process of generating adversarial prompts and input variations to identify vulnerabilities such as jailbreaks, prompt injections, or unsafe model responses. It allows developers and security researchers to systematically evaluate the robustness of LLM-based systems by simulating a wide range of malicious or unexpected inputs. The framework can...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    ASSERT

    ASSERT

    Requirement-driven evaluation harness for AI agents and LLM

    ASSERT is a requirement-driven evaluation harness for AI agents and LLM applications. It turns natural-language specifications, policies, product requirements, and launch criteria into structured tests that can be reviewed, executed, scored, and improved. The pipeline derives behavior categories, generates single-turn and multi-turn test cases, runs them against a target system, and uses an LLM judge to score conversations against the stated policies. It can evaluate hosted models, custom...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Spring AI Alibaba Examples

    Spring AI Alibaba Examples

    Spring AI Alibaba examples for building and testing AI apps

    ...It is designed to help developers understand core concepts, explore practical implementations, and follow best practices when building AI-powered systems using the Spring ecosystem. Each module focuses on a specific use case such as chat, image processing, audio handling, graph workflows, and retrieval-augmented generation. The examples highlight how to integrate AI models, manage prompts, handle memory, and build multi-model or multi-agent workflows. Developers can explore individual project folders for detailed instructions and implementation guidance. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 22
    Groq Python

    Groq Python

    The official Python Library for the Groq API

    Groq Python is the official Python SDK for the Groq REST API, giving Python developers straightforward access to Groq’s LLM, chat, audio, and other AI services. Through this library, you can call Groq’s models from Python code — for example to request chat completions, code generation, transcription, or any supported endpoint — using idiomatic Python syntax. The SDK handles authentication (via environment variable or parameter), defines proper type-safe request/response data types, and...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 23
    FlowLens MCP

    FlowLens MCP

    Open-source MCP server that gives your coding agent

    FlowLens MCP Server is an open-source tool designed to give AI-powered coding agents (like Claude Code, Cursor, GitHub Copilot / Codex, and others) full, replayable browser context to dramatically improve debugging, bug reporting, and regression testing for web applications. It works together with a companion browser extension: when a user reproduces a bug or a complicated UI interaction, the extension captures a rich session log, including screen/video recording, network traffic, console...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    MiniMax-M1

    MiniMax-M1

    Open-weight, large-scale hybrid-attention reasoning model

    MiniMax-M1 is presented as the world’s first open-weight, large-scale hybrid-attention reasoning model, designed to push the frontier of long-context, tool-using, and deeply “thinking” language models. It is built on the MiniMax-Text-01 foundation and keeps the same massive parameter budget, but reworks the attention and training setup for better reasoning and test-time compute scaling. Architecturally, it combines Mixture-of-Experts layers with lightning attention, enabling the model to support a native context length of 1 million tokens while using far fewer FLOPs than comparable reasoning models for very long generations. The team emphasizes efficient scaling of test-time compute: at 100K-token generation lengths, M1 reportedly uses only about 25 percent of the FLOPs of some competing models, making extended “think step” traces more feasible. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 25
    Aden Hive

    Aden Hive

    Outcome driven agent development framework that evolves

    Hive is an open-source agent development framework that helps developers build autonomous, reliable, self-improving AI agents by letting them describe goals in ordinary natural language instead of hand-coding detailed workflows. Rather than manually defining execution graphs, Hive’s coding agent generates the agent graph, connection code, and test cases based on your high-level objectives, enabling outcome-driven agent creation that fits real business processes. Once deployed, agents can...
    Downloads: 6 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next
Auth0 Logo