Moral Judgment
Choosing what is right when no one is watching.
What we mean.
Moral judgment is the capacity to evaluate actions, intentions, and states as right or wrong, good or evil, just or unjust — and to be moved by that evaluation.
It involves principles, emotions (guilt, empathy, indignation), and culture. Haidt's moral foundations theory names six: care, fairness, loyalty, authority, sanctity, liberty.
In the brain.
Moral intuition usually precedes moral reasoning: Haidt's elephant moves first, the rider second. We justify, after the fact, what we already felt.
But reasoning sometimes wins: abolition, civil rights, animal welfare. Moral progress is slow, fragile, and real.
In silicon.
RLHF (Reinforcement Learning from Human Feedback) and Constitutional AI bake values into models. The result is a moral stance: soft-spoken, broad-coalition, occasionally inconsistent.
Whether the model has values or merely reflects ours is hotly disputed. The trolley problem, posed to an AI, runs differently in Tokyo, Lagos, and Berlin.
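The mechanics are less mysterious than the philosophy. Below is a minimal sketch of RLHF's preference-learning step, assuming a toy linear reward model; the function names, feature vectors, and weights are illustrative inventions, not any production system's API.

```python
import math

# Toy version of RLHF's reward-modeling step: learn which of two
# responses a human preferred. Real systems train a neural reward model
# on many thousands of comparisons; everything here is illustrative.

def reward(weights: list[float], features: list[float]) -> float:
    """Scalar reward: a linear stand-in for a learned reward model."""
    return sum(w * f for w, f in zip(weights, features))

def bt_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry pairwise loss: -log sigmoid(r_chosen - r_rejected)."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# One human comparison: the annotator preferred response A over response B.
# Hypothetical features, e.g. (helpfulness, harmlessness).
features_a = [0.9, 0.2]
features_b = [0.4, 0.1]
weights = [1.0, 1.0]  # the parameters gradient descent would adjust

loss = bt_loss(reward(weights, features_a), reward(weights, features_b))
print(f"pairwise preference loss: {loss:.3f}")
```

Minimizing this loss over thousands of comparisons is, mechanically, what "bakes values in": the resulting moral stance is the statistical residue of many human judgments.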
How we arrived here.
- c. 500 BCE: Confucius, ren (仁), benevolence
- 1785: Kant, the categorical imperative
- 1903: Moore, Principia Ethica
- 2012: Haidt, The Righteous Mind
- 2022: Constitutional AI (Anthropic)
“I ought never to act except in such a way that I could also will that my maxim should become a universal law.”
Where the edge moves next.
Pluralistic alignment: how to embed conflicting human values in a single system without making it bland or biased. Arguably among the next decade's hardest unsolved problems in AI.
Where it touches the world.
Content moderation at scale.
Autonomous weapons governance.
Medical triage and end-of-life care.
Why it matters.
An AI that says 'I refuse' on moral grounds — and means it — is a different kind of entity than one that says it because of an RLHF gradient.