PRM800K is a process supervision dataset accompanying the paper "Let's Verify Step by Step". It provides 800,000 step-level correctness labels on model-generated solutions to problems from the MATH dataset. The repository releases the raw labels and the labeler instructions used in the project's two phases, so researchers can study how human raters graded intermediate reasoning. Data are stored as JSONL (newline-delimited JSON) files tracked with Git LFS. Each line is one full solution sample, which can contain many step-level labels along with metadata such as labeler UUIDs, timestamps, generation identifiers, and quality-control flags. Each labeled step can include multiple candidate completions rated -1, 0, or +1, an optional human-written correction (phase 1 only), and the index of the chosen completion. Each sample also records a final finish reason: found_error, solution, bad_problem, or give_up.
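To make the record layout concrete, here is a minimal sketch of how one might tally step ratings and finish reasons from such a file. The key names used here ("label", "steps", "completions", "rating", "finish_reason") are assumptions inferred from the description above, and the sample record is invented for illustration; check the repository's schema documentation before relying on them.

```python
import json
from collections import Counter


def tally_ratings(lines):
    """Count step ratings and finish reasons across JSONL lines.

    Key names are assumed from the dataset description, not verified
    against the released files.
    """
    ratings = Counter()
    finish_reasons = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        sample = json.loads(line)
        label = sample.get("label") or {}
        finish_reasons[label.get("finish_reason")] += 1
        for step in label.get("steps") or []:
            # A step may hold several candidate completions, each rated -1, 0, or +1.
            for completion in step.get("completions") or []:
                ratings[completion.get("rating")] += 1
    return ratings, finish_reasons


# Invented record, shaped like the description above.
example_line = json.dumps({
    "label": {
        "steps": [
            {"completions": [{"text": "a", "rating": 1},
                             {"text": "b", "rating": -1}],
             "human_completion": None,
             "chosen_completion": 0},
            {"completions": [{"text": "c", "rating": 0}],
             "human_completion": None,
             "chosen_completion": 0},
        ],
        "finish_reason": "found_error",
    }
})

ratings, finish_reasons = tally_ratings([example_line])
print(dict(ratings), dict(finish_reasons))
```

For a real file, the same function can be called on an open file handle, since iterating a text file yields one line at a time.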

Features

  • 800,000 step-level correctness labels for MATH problems, distributed as JSONL
  • Detailed schema with labeler IDs, timestamps, generations, QC flags, and finish reasons
  • Multi-candidate step ratings of -1, 0, +1 with optional human-completion entries
  • Labeler instruction docs for both phase 1 and phase 2
  • Python grading logic using math normalization and sympy equivalence checks
  • Nonstandard MATH train/test split and large-scale scored samples with PRM/ORM eval scripts
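
The grading feature listed above pairs string normalization with a symbolic equivalence check. The sketch below illustrates that general idea only; it is not the repository's grader. The function names normalize and is_equiv and the specific normalization rules are hypothetical. If sympy is not installed, the sketch falls back to plain string comparison.

```python
def normalize(answer: str) -> str:
    """Illustrative normalization: drop spaces, $ delimiters, and a trailing period."""
    text = answer.strip().replace("$", "").replace(" ", "")
    return text.rstrip(".")


def is_equiv(predicted: str, reference: str) -> bool:
    """Return True if two answers match as strings or simplify to the same value."""
    a, b = normalize(predicted), normalize(reference)
    if a == b:
        return True
    try:
        import sympy
    except ImportError:
        return False  # no symbolic check available; string match already failed
    try:
        # sympify evaluates the string, so only use it on trusted input.
        difference = sympy.simplify(sympy.sympify(a) - sympy.sympify(b))
    except (sympy.SympifyError, TypeError, SyntaxError, AttributeError):
        return False  # unparseable answers are treated as not equivalent
    return difference == 0


print(is_equiv("$1/2$", "1/2"))
print(is_equiv("1", "2"))
```

Checking the cheap string match first avoids symbolic parsing for the common case of identical answers, and treating parse failures as "not equivalent" keeps the grader conservative.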

Categories

AI Models

License

MIT License

Additional Project Details

Programming Language

Python

Related Categories

Python AI Models

Registered

2025-10-04