Moral Judgment
Choosing what is right when no one is watching.
What we mean.
Moral judgment is the capacity to evaluate actions, intentions, and states as right or wrong, good or evil, just or unjust — and to be moved by that evaluation.
It involves principles, emotions (guilt, empathy, indignation), and culture. Haidt's moral foundations theory names six: care, fairness, loyalty, authority, sanctity, liberty.
In the brain.
Moral intuition usually precedes moral reasoning: Haidt's elephant moves first, the rider second. We justify, after the fact, what we already felt.
But reasoning sometimes wins: abolition, civil rights, animal welfare. Moral progress is slow, fragile, and real.
In silicon.
RLHF (Reinforcement Learning from Human Feedback) and Constitutional AI bake values into models. The result is a moral stance: soft-spoken, broad-coalition, occasionally inconsistent.
Whether the model has values or merely reflects ours is hotly disputed. The trolley problem, posed to an AI, runs differently in Tokyo, Lagos, and Berlin.
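The mechanics are less mysterious than the philosophy. Below is a minimal sketch of RLHF's preference-learning step, assuming a toy linear reward model; the function names, feature vectors, and weights are illustrative inventions, not any production system's API.

```python
import math

# Toy version of RLHF's reward-modeling step: learn which of two
# responses a human preferred. Real systems train a neural reward model
# on many thousands of comparisons; everything here is illustrative.

def reward(weights: list[float], features: list[float]) -> float:
    """Scalar reward: a linear stand-in for a learned reward model."""
    return sum(w * f for w, f in zip(weights, features))

def bt_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry pairwise loss: -log sigmoid(r_chosen - r_rejected)."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# One human comparison: the annotator preferred response A over response B.
# Hypothetical features, e.g. (helpfulness, harmlessness).
features_a = [0.9, 0.2]
features_b = [0.4, 0.1]
weights = [1.0, 1.0]  # the parameters gradient descent would adjust

loss = bt_loss(reward(weights, features_a), reward(weights, features_b))
print(f"pairwise preference loss: {loss:.3f}")
```

Minimizing this loss over thousands of comparisons is, mechanically, what "bakes values in": the resulting moral stance is the statistical residue of many human judgments.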
How we arrived here.
- c. 500 BCE: Confucius, ren (仁), benevolence
- 1785: Kant, the categorical imperative
- 1903: Moore, Principia Ethica
- 2012: Haidt, The Righteous Mind
- 2022: Constitutional AI (Anthropic)
“I ought never to act except in such a way that I could also will that my maxim should become a universal law.”
Where the edge moves next.
Pluralistic alignment: how to embed conflicting human values in a single system without making it bland or biased. Arguably among the next decade's hardest unsolved problems in AI.
Where it touches the world.
Content moderation at scale.
Autonomous weapons governance.
Medical triage and end-of-life care.
Why it matters.
An AI that says 'I refuse' on moral grounds — and means it — is a different kind of entity than one that says it because of an RLHF gradient.