A microbenchmark support library
RandomX, KawPow, CryptoNight, AstroBWT and GhostRider unified miner
Reference implementations of MLPerf™ training benchmarks
A command-line benchmarking tool
GPU benchmark testing graphics performance with realistic 3D scenes.
Agentic, Reasoning, and Coding (ARC) foundation models
A benchmarking framework for the Julia language
A Heterogeneous Benchmark for Information Retrieval
Simple disk benchmark software
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
Code for the paper "Evaluating Large Language Models Trained on Code"
Code for running inference and finetuning with SAM 3 model
LongBench v2 and LongBench (ACL '25 & '24)
A.S.E (AICGSecEval) is a repository-level AI-generated code security
Visual Causal Flow
A fast and easy-to-use microframework for the web
Checks whether Kubernetes is deployed
Benchmarking synthetic data generation methods
Meta Agents Research Environments is a comprehensive platform
CrystalMark Retro is comprehensive benchmarking software.
Integrates the JMH benchmarking framework with Gradle
Leaderboard Comparing LLM Performance at Producing Hallucinations
MTEB: Massive Text Embedding Benchmark
A GPU overclock & undervolt tool for various Snapdragon chips
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)