Reasoning Models

Process Reward Models and Verifiable Rewards

Why scoring every step of a reasoning trace beats scoring only the final answer, and how Ai2 and DeepSeek replaced PRMs entirely with programmatic correctness checks.

advanced · 9 min read · Premium
