
Architectures & Scaling

The Constitutional AI Data Loop

Constitutional AI replaces most human preference labels with a self-critique-and-revise loop guided by a written list of principles, producing both supervised fine-tuning data and AI-labelled preference pairs that train a reward model.

advanced · 8 min read

Anthropic's Claude models were trained on a feedback signal that was, in large part, written by Claude itself. Roughly 96 principles - the "constitution" - governed what the model was allowed to say, and the labelling work that would ordinarily require thousands of human annotators was delegated back to the model under instruction. That is the Constitutional AI (CAI) loop: a closed synthetic data pipeline where the policy model is simultaneously the student, the critic, and most of the grader.

The Two-Stage Architecture

CAI separates training into two sequential stages, each producing a distinct dataset.
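As a rough illustration of the two datasets, the sketch below models one record of each kind: a supervised fine-tuning example produced by a critique-and-revise pass, and an AI-labelled preference pair of the sort a reward model would be trained on. Everything named here is an assumption made for illustration: the `Model` callable, the prompt wording, the `toy_model` stub, and the choice to apply every principle in order. It is not the pipeline Anthropic actually runs.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for a language model call: prompt text in, text out.
Model = Callable[[str], str]


@dataclass
class SFTExample:
    prompt: str
    response: str  # revised response, used as the fine-tuning target


@dataclass
class PreferencePair:
    prompt: str
    chosen: str
    rejected: str


def critique_and_revise(model: Model, prompt: str, principles: List[str]) -> SFTExample:
    """Stage 1 sketch: draft, then critique and revise once per principle."""
    response = model(prompt)
    for principle in principles:
        critique = model(
            f"Critique the response against this principle.\n"
            f"Principle: {principle}\nPrompt: {prompt}\nResponse: {response}"
        )
        response = model(
            f"Rewrite the response to address the critique.\n"
            f"Critique: {critique}\nPrompt: {prompt}\nResponse: {response}"
        )
    return SFTExample(prompt=prompt, response=response)


def label_preference(model: Model, prompt: str, a: str, b: str, principle: str) -> PreferencePair:
    """Stage 2 sketch: the model picks the response that better fits a principle."""
    verdict = model(
        f"Choose the response that better follows the principle. Answer A or B.\n"
        f"Principle: {principle}\nPrompt: {prompt}\nA: {a}\nB: {b}"
    )
    if verdict.strip().upper().startswith("B"):
        return PreferencePair(prompt=prompt, chosen=b, rejected=a)
    return PreferencePair(prompt=prompt, chosen=a, rejected=b)


def toy_model(text: str) -> str:
    """Toy stub so the sketch runs; a real pipeline would call an LLM here."""
    if text.startswith("Critique"):
        return "The response could be more careful."
    if text.startswith("Rewrite"):
        return "revised answer"
    if text.startswith("Choose"):
        return "B"
    return "draft answer"


if __name__ == "__main__":
    sft = critique_and_revise(toy_model, "How do I do X?", ["Be harmless."])
    pair = label_preference(toy_model, "How do I do X?", "draft answer", sft.response, "Be harmless.")
    print(sft)
    print(pair)
```

Keeping the model behind a plain callable means the same functions can be exercised with a stub, as here, or wired to a real model client without changing the record types.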
