A microbenchmark support library
HardeningKitty and Windows Hardening Settings
IPerf3 startup program for Windows with a GUI
RandomX, KawPow, CryptoNight, AstroBWT and GhostRider unified miner
GPU benchmark testing graphics performance with realistic 3D scenes.
A command-line benchmarking tool
Agentic, Reasoning, and Coding (ARC) foundation models
A benchmarking framework for the Julia language
A Heterogeneous Benchmark for Information Retrieval
Simple disk benchmark software
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
CrystalMark Retro is comprehensive benchmarking software.
Benchmarking synthetic data generation methods
Code for the paper "Evaluating Large Language Models Trained on Code"
MTEB: Massive Text Embedding Benchmark
A fast and easy-to-use microframework for the web
LongBench v2 and LongBench (ACL '25 & '24)
A.S.E (AICGSecEval) is a repository-level AI-generated code security
Visual Causal Flow
Reference implementations of MLPerf™ training benchmarks
Code for running inference and finetuning with SAM 3 model
Checks whether Kubernetes is deployed
Integrates the JMH benchmarking framework with Gradle
Meta Agents Research Environments is a comprehensive platform
Leaderboard Comparing LLM Performance at Producing Hallucinations