Interactive Thought Piece

The Rating Game

An interactive lesson in Goodhart's Law & RLHF

You'll rate pairs of AI responses — just pick the better one. Your preferences will train a reward model. Then you'll see what happens when an AI optimizes for your ratings.

Phase 1

Rate 8 pairs

Phase 2

Watch deployment

Phase 3

See the divergence

jakelawrence.xyz · AI Concepts Series

Theme

Trust

Both demonstrate Goodhart's Law: how measurement systems corrupt authentic behavior

Theme

The Invisible Architecture

Both show how measurement systems become invisible infrastructure that shapes behavior

Theory

The New Sorting Hat

Both examine how AI classification systems create systematic biases through optimization pressure

Theme

Subgenre Survival

Both explore the tension between authentic expression and external validation or success metrics

Related