tests free download - SourceForge

Showing 52 open source projects for "tests"

Filters: Artificial Intelligence, Python

1

CodiumAI Cover-Agent

CodiumAI Cover-Agent: An AI-Powered Tool for Automated Test Generation

CodiumAI Cover Agent aims to increase code coverage efficiently by automatically generating qualified tests that enhance existing test suites.

Downloads: 3 This Week

Last Update: 2025-05-21
See Project
2

Qodo Cover

AI tool that generates tests to improve code coverage quickly

Qodo Cover is an open source developer tool designed to automate the creation of unit tests using generative AI, helping teams improve code coverage with minimal manual effort. It operates as a command-line interface and can also be integrated into continuous integration workflows, making it adaptable to different development environments. It analyzes an existing codebase, identifies gaps in test coverage, and generates new tests that target uncovered or weakly tested areas. ...

Downloads: 0 This Week

Last Update: 2026-03-17
See Project
3

HumanEval

Code for the paper "Evaluating Large Language Models Trained on Code"

human-eval is a benchmark dataset and evaluation framework created by OpenAI for measuring the ability of language models to generate correct code. It consists of hand-written programming problems with unit tests, designed to assess functional correctness rather than superficial metrics like text similarity. Each task includes a natural language prompt and a function signature, requiring the model to generate an implementation that passes all provided tests. The benchmark has become a standard for evaluating code generation models, including those in the Codex and GPT families. ...

Downloads: 4 This Week

Last Update: 2 days ago
See Project
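
The functional-correctness idea in the HumanEval entry above can be sketched roughly as follows: a task's prompt (function signature plus docstring) is joined with a model completion and then run against the task's unit tests. This is a minimal illustration, not the benchmark's own harness. The helper name `passes_tests` and the toy `add` task are hypothetical, and a real evaluation would run untrusted code in a sandbox with timeouts.

```python
def passes_tests(prompt: str, completion: str, test: str, entry_point: str) -> bool:
    """Return True if prompt + completion passes every assertion in `test`.

    Assumes `test` defines a function `check(candidate)` that raises
    AssertionError on failure (an assumption of this sketch).
    """
    program = f"{prompt}{completion}\n\n{test}\n\ncheck({entry_point})\n"
    namespace: dict = {}
    try:
        # WARNING: exec runs arbitrary code; only safe here because the
        # inputs are hard-coded. Use a sandbox for model-generated code.
        exec(program, namespace)
    except Exception:
        # Any failure (assertion, syntax error, runtime error) counts as a miss.
        return False
    return True


# Hypothetical toy task, not taken from the benchmark.
PROMPT = 'def add(a, b):\n    """Return the sum of a and b."""\n'
TEST = (
    "def check(candidate):\n"
    "    assert candidate(1, 2) == 3\n"
    "    assert candidate(-1, 1) == 0\n"
)

good_completion = "    return a + b\n"
bad_completion = "    return a - b\n"

if __name__ == "__main__":
    print(passes_tests(PROMPT, good_completion, TEST, "add"))
    print(passes_tests(PROMPT, bad_completion, TEST, "add"))
```

A completion counts as correct only if every assertion passes, which is why text similarity to a reference solution plays no role in the score.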
4

HunyuanVideo

HunyuanVideo: A Systematic Framework For Large Video Generation Model

...The framework aims to push the boundaries of video generation quality, incorporating multiple innovative approaches to improve the realism and coherence of the generated content. The release includes FP8 model weights that reduce GPU memory usage and improve efficiency, parallel inference code that speeds up sampling, and utilities and tests.

1 Review

Downloads: 7 This Week

Last Update: 2025-09-23
See Project
5

Giskard

Collaborative & Open-Source Quality Assurance for all AI models

...The Giskard scan automatically detects vulnerabilities such as performance bias, data leakage, lack of robustness, spurious correlation, overconfidence, underconfidence, and ethical issues. Giskard then automatically generates relevant tests based on the vulnerabilities the scan detects. You can customize the tests for your use case by defining domain-specific data slicers and transformers as fixtures of your test suites.

Downloads: 0 This Week

Last Update: 2026-07-13
See Project
6

Fara-7B

An Efficient Agentic Model for Computer Use

...It provides stakeholders with a way to benchmark and evaluate models across dimensions such as fairness, robustness, security, privacy, and ethical considerations. Rather than relying on ad-hoc or manual review processes, FARA enables organizations to profile AI behavior using standardized tests, metrics, and reporting templates, making evaluations reproducible and comparable over time. The framework supports plugin-based modules that can be tailored to industry-specific concerns or regulatory requirements, helping compliance teams, auditors, and engineers collaborate on shared assessment goals.

Downloads: 2 This Week

Last Update: 2026-07-22
See Project
7

LMCache

Supercharge Your LLM with the Fastest KV Cache Layer

...Its design supports reuse beyond strict prefix matching and enables sharing across serving instances, improving efficiency under real multi-tenant traffic. The broader project includes examples, tests, a server component, and public posts describing cross-engine sharing and inter-GPU KV transfers. These capabilities aim to lower latency, cut GPU cycles, and stabilize performance for production workloads with overlapping prompts or retrieval-augmented contexts. The end result is a cache fabric for LLMs that complements engines rather than replacing them.

Downloads: 2 This Week

Last Update: 2 days ago
See Project
8

Anthropic's Original Performance

Anthropic's original performance take-home, now open for you to try

...The project sets up a baseline performance problem where participants work to reduce simulated “clock cycles” required to run a given workload, effectively challenging them to engineer faster code under constraints. This take-home includes starter code, tests, and tools to debug performance, aiming to measure how effectively one can apply algorithmic improvements and optimizations. Because it’s framed around beating baseline scores — and even outperforming previous automated systems — it encourages both deep knowledge of Python and creative problem-solving.

Downloads: 0 This Week

Last Update: 2026-01-27
See Project
9

AI Runner

Offline inference engine for art, real-time voice conversations

...The project has a strong focus on developer ergonomics, with thorough development guidelines, environment configuration using .env variables, and a clear structure for tests, tools and agents.

Downloads: 1 This Week

Last Update: 2025-12-11
See Project
10

Agentless

An agentless approach to automatically solve software development

...It then generates multiple candidate patches for the identified locations using language model reasoning and diff-style edits. In the final stage, the framework validates potential patches by running regression tests and additional reproduction tests to confirm whether the fix resolves the original error. Based on these results, the system ranks the candidate patches and selects the most reliable solution to submit.

Downloads: 0 This Week

Last Update: 2026-03-06
See Project
11

ArXiv MCP Server

A Model Context Protocol server for searching and analyzing arXiv

...With simple tools like “search” and “fetch,” an agent can find papers, pull abstracts, and download PDFs for downstream summarization or analysis. The project includes packaging and CI to publish to PyPI, plus tests and linting for reliability. Issue threads show feature requests such as extracting embedded LaTeX and improving markdown conversion, reflecting active community use in research flows. It’s designed to be drop-in for MCP clients, giving them typed inputs/outputs and predictable errors around a well-known academic corpus. For developers building research copilots, it removes the glue work of wiring arXiv APIs into an agent toolchain.

Downloads: 0 This Week

Last Update: 4 days ago
See Project
12

Windows Copilot API

Reverse engineered Windows Copilot into an OpenAI-compatible API

...Because compatible applications can target the local endpoint, existing OpenAI SDK workflows require few changes. The project runs on Windows, macOS, and Linux and also includes Docker configuration, tests, examples, and command-line tools. It relies on automation of the signed-in Copilot website and is not affiliated with Microsoft.

Downloads: 5 This Week

Last Update: 2026-07-16
See Project
13

Evidently

Evaluate and monitor ML models from validation to production

Evidently is an open-source Python library for data scientists and ML engineers. It helps evaluate, test, and monitor ML models from validation to production. It works with tabular data, text data, and embeddings.

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
14

Claude Blog

Claude Code blog skill suite

Claude Blog is a Claude Code blog skill suite for planning, writing, optimizing, and auditing long-form content. It is built as a full-lifecycle blog engine with 30 sub-skills, 5 agents, 12 content templates, on-demand references, Python scripts, and automated tests. The workflow is designed to produce production-ready content rather than one-shot AI drafts. It uses a five-gate delivery contract covering capability, format, visual quality, content review, and asset integrity. Commands support new blog posts, rewrites, content audits, briefs, editorial calendars, strategy, outlines, SEO checks, personas, taxonomy, multilingual workflows, research, audio narration, and Google data. ...

Downloads: 6 This Week

Last Update: 6 days ago
See Project
15

pmdarima

Statistical library designed to fill the void in Python's time series

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.

Downloads: 0 This Week

Last Update: 2025-11-17
See Project
16

AI Chatbot Framework

Python chatbot framework with Natural Language Understanding

Building a chatbot can sound daunting, but it’s totally doable. AI Chatbot Framework is an AI-powered conversational dialog interface built in Python. With this tool, it’s easy to create natural-language conversational scenarios with no coding effort. The smooth UI makes it effortless to create conversations and train the bot on them, and the bot gets continuously smarter as it learns from its conversations with people. AI Chatbot Framework can live on any channel of your choice (such as...

Downloads: 2 This Week

Last Update: 2025-02-03
See Project
17

Devon

Open source AI pair programmer for coding, debugging, automation

...Devon integrates with multiple large language models, allowing users to choose between different providers for performance, cost, and latency considerations. It is capable of performing tasks such as debugging, writing tests, analyzing code structure, and navigating complex repositories. Devon also includes features for session management, enabling users to start, pause, and revert actions while maintaining context.

Downloads: 2 This Week

Last Update: 3 days ago
See Project
18

GLM-4.5

GLM-4.5: Open-source LLM for intelligent agents by Z.ai

GLM-4.5 is a cutting-edge open-source large language model designed by Z.ai for intelligent agent applications. The flagship GLM-4.5 model has 355 billion total parameters with 32 billion active parameters, while the compact GLM-4.5-Air version offers 106 billion total parameters and 12 billion active parameters. Both models unify reasoning, coding, and intelligent agent capabilities, providing two modes: a thinking mode for complex reasoning and tool usage, and a non-thinking mode for...

1 Review

Downloads: 16 This Week

Last Update: 2026-02-01
See Project
19

Serena

Agent toolkit providing semantic retrieval and editing capabilities

Serena is a coding-focused agent toolkit that turns an LLM into a practical software-engineering agent with semantic retrieval and editing over real repositories. It operates as an MCP server (and other integrations), exposing IDE-like tools so agents can locate symbols, reason about code structure, make targeted edits, and validate changes. The toolkit is LLM-agnostic and framework-agnostic, positioning itself as a drop-in capability for different chat UIs, orchestrators, or custom agent...

Downloads: 2 This Week

Last Update: 2026-07-21
See Project
20

AWS Deep Learning Containers

A set of Docker images for training and serving models in TensorFlow

AWS Deep Learning Containers (DLCs) are a set of Docker images for training and serving models in TensorFlow, TensorFlow 2, PyTorch, and MXNet. Deep Learning Containers provide optimized environments with TensorFlow and MXNet, Nvidia CUDA (for GPU instances), and Intel MKL (for CPU instances) libraries and are available in the Amazon Elastic Container Registry (Amazon ECR). The AWS DLCs are used in Amazon SageMaker as the default vehicles for your SageMaker jobs such as training, inference,...

Downloads: 3 This Week

Last Update: 1 day ago
See Project
21

NVIDIA Earth2Studio

Open-source deep-learning framework

...Users can extend Earth2Studio with optional model packs, advanced data interfaces, statistical operators, and backend integrations that support flexible workflows from simple tests to large-scale operational inference.

Downloads: 1 This Week

Last Update: 2026-06-29
See Project
22

CodiumAI PR-Agent

AI-Powered tool for automated pull request analysis

CodiumAI PR-Agent is an open-source tool that aims to help developers review pull requests faster and more efficiently. It automatically analyzes the pull request and can provide several types of commands. See the Usage Guide for instructions on how to run the different tools from the CLI, use them online, or trigger them automatically when a new PR is opened. You can try the GPT-4 powered PR-Agent on your public GitHub repository instantly. Just mention @CodiumAI-Agent and add the desired command in...

Downloads: 0 This Week

Last Update: 3 days ago
See Project
23

ASSERT

Requirement-driven evaluation harness for AI agents and LLM

ASSERT is a requirement-driven evaluation harness for AI agents and LLM applications. It turns natural-language specifications, policies, product requirements, and launch criteria into structured tests that can be reviewed, executed, scored, and improved. The pipeline derives behavior categories, generates single-turn and multi-turn test cases, runs them against a target system, and uses an LLM judge to score conversations against the stated policies. It can evaluate hosted models, custom agents, multi-agent systems, REST clients, and frameworks such as LangGraph, CrewAI, AutoGen, DSPy, LlamaIndex, and OpenAI Agents SDK. ...

Downloads: 0 This Week

Last Update: 2026-06-04
See Project
24

MiroThinker

MiroThinker is an open source deep research agent

...The platform is optimized for research tasks such as financial forecasting, knowledge discovery, and large-scale information synthesis. MiroThinker has been evaluated on several agent benchmarks and has demonstrated strong performance on tests designed to measure deep research capabilities.

Downloads: 0 This Week

Last Update: 2026-04-13
See Project
25

LLM Colosseum

Benchmark LLMs by fighting in Street Fighter 3

...The system places language models inside the environment of the classic video game Street Fighter III, where they must interpret the game state and decide which actions to perform during combat. This setup creates a dynamic environment that tests reasoning, situational awareness, and decision-making abilities in real time. Instead of relying purely on reward signals as in reinforcement learning agents, the models analyze contextual information and generate strategic actions based on the game environment. Performance is evaluated using a competitive ranking system that assigns models an Elo rating based on their results across matches against other models.

Downloads: 0 This Week

Last Update: 2026-03-07
See Project
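
For readers unfamiliar with the rating scheme named in the LLM Colosseum entry above, the standard Elo update can be sketched as below. The listing does not say which constants the project uses, so the K-factor of 32, the scale of 400, and the function names `expected_score` and `update_elo` are assumptions made for illustration only.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_elo(
    rating_a: float, rating_b: float, score_a: float, k: float = 32.0
) -> tuple[float, float]:
    """Return the new (rating_a, rating_b) after one match.

    score_a is 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    k is the update step size (assumed value; projects choose their own).
    """
    exp_a = expected_score(rating_a, rating_b)
    exp_b = 1.0 - exp_a
    score_b = 1.0 - score_a
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * (score_b - exp_b)
    return new_a, new_b


if __name__ == "__main__":
    # Two models start level at 1500; model A wins the match.
    a, b = update_elo(1500.0, 1500.0, score_a=1.0)
    print(a, b)  # 1516.0 1484.0
```

Because the winner's gain equals the loser's loss, the total rating across the pool stays constant, and beating a higher-rated opponent moves a rating more than beating a lower-rated one.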